Is there a psychometric tool for measuring the difficulty of mental activities?

It is often stated that Alzheimer's disease can be prevented to some extent when a person engages more often in challenging mental activities.

Although I can imagine that solving mathematical problems is more difficult than watching reality shows on TV, the criteria for this classification are vague to me (OK, not entirely vague, because mathematics requires more cognitive load, but that may not be the only factor). I am interested in how a psychometric scale that quantifies the level of difficulty across a broad range of activities could be constructed, as it would be useful in research on the prevention and rehabilitation of certain neurocognitive disorders.

Has such a scale been developed?

There is a workload measure called the NASA-TLX (Task Load Index), one component of which measures mental workload. It is self-reported workload, however, and also requires interrupting most tasks, so it has some theoretical as well as practical weaknesses.
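For concreteness, the standard NASA-TLX scoring procedure can be sketched in code: six subscales are rated 0-100, and each is weighted by how often it was chosen across the 15 pairwise comparisons. The ratings and weights below are invented purely for illustration.

```python
# Sketch of NASA-TLX weighted scoring. Six dimensions are rated 0-100;
# each weight is the number of times that dimension was picked in the
# 15 pairwise comparisons, so the weights sum to 15.

DIMENSIONS = ["mental", "physical", "temporal",
              "performance", "effort", "frustration"]

def nasa_tlx_score(ratings, weights):
    """Return the overall weighted workload on a 0-100 scale."""
    if sum(weights.values()) != 15:
        raise ValueError("weights must sum to 15 (pairwise comparisons)")
    return sum(ratings[d] * weights[d] for d in DIMENSIONS) / 15.0

# Invented example: a task experienced as mentally demanding and effortful
ratings = {"mental": 80, "physical": 20, "temporal": 55,
           "performance": 40, "effort": 70, "frustration": 35}
weights = {"mental": 5, "physical": 0, "temporal": 3,
           "performance": 2, "effort": 4, "frustration": 1}
print(nasa_tlx_score(ratings, weights))  # 64.0
```

The mental-demand subscale alone is sometimes used as a quick proxy for task difficulty, though the full weighted score is the published procedure.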

There is also decently good data that working memory load can be extracted from theta-band activity in the frontal regions, possibly due to the parieto-frontal circuits between the anterior cingulate, inferior frontal, and posterior parietal cortices.

(All of this, however, assumes that working memory load and "difficulty" are the same thing.)

Hart, S. G. (2006). NASA-Task Load Index (NASA-TLX); 20 years later. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 50(9), 904-908. doi: 10.1177/154193120605000909

Brouwer, A.-M., et al. (2014). Evidence for effects of task difficulty but not learning on neurophysiological variables associated with effort. International Journal of Psychophysiology, 93(2), 242-252. doi: 10.1016/j.ijpsycho.2014.05.004

Ma, L., Steinberg, J. L., Hasan, K. M., Narayana, P. A., Kramer, L. A., et al. (2012). Working memory load modulation of parieto-frontal connections: Evidence from dynamic causal modeling. Human Brain Mapping, 33(8), 1850-1867. doi: 10.1002/hbm.21329


The introduction of the International Classification of Functioning, Disability and Health (ICF) by the WHO in 2001 led to a paradigm shift in rehabilitative processes and welfare policy in Germany.1 Rehabilitative processes and welfare policy moved from an excluding care approach to an integrative process with, ideally, unrestricted participation of people with disabilities and chronic diseases.2 The biopsychosocial model of the ICF plays an important role in rehabilitation in achieving significant improvement in functioning, especially at the level of activities and participation, as well as in addressing changes in contextual and environmental factors/barriers when a person's participation is endangered or limited.3 The importance of participation as the goal of rehabilitative processes seems undisputed.4–7 The concept of social participation (hereafter simply called participation) has increasingly become the focus of science and practice.8–10 However, this poses a challenge for science and practice: to develop and apply appropriate assessment tools and evaluation instruments. The German law on strengthening the participation and self-determination of persons with disabilities (short form: Federal Participation Law) requires instruments based on the ICF for the assessment of individual needs. The instruments should be able to capture restrictions in activities and participation in different areas of life. Participation is particularly important for the development of adolescents. It affects competence experience (eg, skills), social experience (eg, relationship experience) and social-emotional development (eg, self-efficacy, self-concept).11–13 However, especially for the adolescent group, there are no high-quality assessment tools available in Germany for measuring social participation.14 For the conception, development and comparability of assessment tools, a transparent theoretical framework and a consistent understanding of terms are elementary requirements.

The term 'social participation'

In the ICF for Children and Youth (ICF-CY), participation is described as ‘involvement in a life situation’,15 which is affected by activities, the personality of the adolescent (eg, motivation) and environmental factors (eg, family, environmental conditions, legislation).16 Participation thus captures the social perspective of functioning. At the same time, activity is understood as the ‘execution of a task or action by an individual’.15 Even though participation and activity are conceptually differentiated in the ICF and the ICF-CY, they are ultimately summarised in one component, consisting of nine domains.14 15 17 18

In distinguishing between participation and activity, different approaches exist in the literature. One presumption is that an activity primarily involves the functional aspect of an action, one that can be performed without a role performance at the societal level.4 Using ‘a role performance at the societal level’ as a distinguishing criterion should be analysed critically for adolescents with disabilities or chronic diseases, because some activities, such as food consumption, frequently take place in interaction with others (eg, caregivers). A strict boundary marking where an activity remains primarily individual is difficult to delineate.19 Another proposed distinction rests on the complexity of the life situation.20 The hypothesis that participation differs from activity in terms of complexity seems reasonable,20 but is not distinct enough. It has therefore been proposed to differentiate between a spatial (eg, school) and a temporal (eg, recurrent daily) component.21 In addition to complexity, participation may also differ from activity in its meaning, and it may be understood as ‘sets of organized sequences of activities directed towards a personally or socially meaningful goal’.21 Activities are therefore to be understood as smaller ‘action units’ out of which sequences of participation are composed. Importantly, participation can be assigned to a rather higher-level goal of action.21

Even though four different qualifier options are proposed in the ICF-CY to differentiate between activity and participation,22 there has been no preference or homogeneity so far.17 Imms et al state that there are contemporary descriptions of how participation can be measured with the help of qualifiers, but in effect these amount to measuring activity competence, not participation.18

Theoretical foundation of social participation

In rehabilitation science, the concept of participation is predominantly determined by the ICF-CY. However, this raises a problem: the ICF-CY is based on the framework concept of the ICF and shares a common language with it, but the ICF itself emerged from a consensus procedure and lacks a theoretical foundation.23 Although the ICF-CY model is based on a biopsychosocial understanding of health, it is not sufficiently elaborated. Theoretically grounding the concept of participation used here is therefore relatively difficult.

Research suggests that participation is not only the number of activities a child takes part in, or how often the child attends those activities (attendance). In addition, a feeling of involvement is regarded as a prerequisite, implying that participation should be personally meaningful.7 24 Even if attendance and involvement are considered constituents of the concept of participation, their relationship to each other has not yet been completely clarified.18 To gain a more holistic view of the construct of participation in the ICF-CY, the introduction of a third qualifier capturing the subjective aspects of participation within the activity and participation domain is being discussed.7 24 25

Participation is considered as a ‘multidimensional and evolving phenomena with the interaction of personal and environmental factors occurring over time’.7 It is seen as a process and as a result. For this reason, participation can be considered as both an independent and a dependent variable in research.9 10 18

In recent research, Imms et al have presented a conceptual framework, the family of participation-related constructs,18 26 which are closely related but not identical to participation. There are intrinsic person-related concepts that include activity competence, sense of self and preferences. These concepts influence future participation and are influenced by past and present participation. In addition, there are extrinsic environment-related concepts that influence and are influenced by participation. Among these factors, a distinction is drawn between environment and context. Context is considered to ‘be personal, considered from the perspective of the person participating, and relates to the people, place, activity, objects, and time in which participation is set’,18 whereas ‘environment is external, and refers to the broader, objective social and physical structures in which we live.’18 The processes of interaction between these concepts and further distinctions can be found in Imms et al.18

Overall, beyond the simple definition of the term participation in the ICF-CY, its theoretical foundation requires profound consideration, and the scientific process of understanding participation must be continued.

Measurement of participation

Some reviews have been published on the analysis of participation assessment tools for children and adolescents.14 17 27–29 In summary, although a large number of assessment instruments are available, an unqualified recommendation is difficult.14 27 This is because many instruments mix items of activity and participation,14 17 no single instrument measures the whole extent of participation in all life areas,14 28 and the quality criteria (on content validity, internal consistency, reliability and construct validity) are not convincing.14 29

To date, three participation assessment tools have been translated into German.30–32 Two of these instruments (‘Participation and Environment Measure for Children and Youth’33 and ‘Children and Adolescent Scale of Participation’)32 34 are used as external assessments in which legal guardians (parents or caregivers) rate the participation of the child or adolescent. This can lead to distortions, in particular because of the subjective components of participation (meaningfulness). The third and very frequently used instrument, the ‘Children’s Assessment of Participation and Enjoyment/Preferences for Activities of Children’,35 refers to leisure activities only, does not distinguish between participation and activity, and achieves only mediocre quality ratings.31 Owing to the legal conditions, the German version is not available for scientific or practical use. As a consequence, there is no reliable and valid instrument for the self-assessment of participation by adolescents in German-speaking countries.

Aim of this study

This study aims to close part of the existing gap in participation measurement among adolescents in research and practice. Instruments for the assessment of participation should be used more often in the planning and evaluation of rehabilitation processes but are hardly available in German. As part of a sequential mixed-methods study, a participation assessment instrument will be developed for surveying adolescents aged between 12 and 17 years.

The HEADS-ED: a rapid mental health screening tool for pediatric patients in the emergency department

Background and objective: The American Academy of Pediatrics called for action for improved screening of mental health issues in the emergency department (ED). We developed the rapid screening tool home, education, activities/peers, drugs/alcohol, suicidality, emotions/behavior, discharge resources (HEADS-ED), which is a modification of "HEADS," a mnemonic widely used to obtain a psychosocial history in adolescents. The reliability and validity of the tool and its potential for use as a screening measure are presented.

Methods: ED patients presenting with mental health concerns from March 1 to May 30, 2011, were included. Crisis intervention workers completed the HEADS-ED and the Child and Adolescent Needs and Strengths-Mental Health tool (CANS MH), and patients completed the Children's Depression Inventory (CDI). Interrater reliability was assessed by using a second HEADS-ED rater for 20% of the sample.

Results: A total of 313 patients were included; mean age was 14.3 years (SD 2.63), and 182 (58.1%) were female. Interrater reliability was 0.785 (P < .001). Correlations were computed for each HEADS-ED category and items from the CANS MH and the CDI; they ranged from r = 0.17 (P < .05) to r = 0.89 (P < .001). The HEADS-ED also predicted psychiatric consult and admission to inpatient psychiatry (sensitivity of 82% and specificity of 87%; area under the receiver operating characteristic curve of 0.82, P < .01).

Conclusions: The results provide evidence to support the psychometric properties of the HEADS-ED. The study shows promising results for use in ED decision-making for pediatric patients with mental health concerns.
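The sensitivity and specificity reported in the abstract are simple ratios over the 2x2 outcome table. A minimal sketch, using made-up counts (not the study's actual data) chosen only to reproduce figures in the same range:

```python
# Sensitivity = true positives / all actual positives;
# specificity = true negatives / all actual negatives.
# The counts below are fabricated for illustration.

def sensitivity_specificity(tp, fn, tn, fp):
    sensitivity = tp / (tp + fn)  # admitted patients correctly flagged
    specificity = tn / (tn + fp)  # non-admitted patients correctly cleared
    return sensitivity, specificity

sens, spec = sensitivity_specificity(tp=41, fn=9, tn=229, fp=34)
print(round(sens, 2), round(spec, 2))  # 0.82 0.87
```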

McClendon, D., Warren, J., Green, K., Burlingame, G., Eggett, D., & McClendon, R. (2011). Sensitivity to change of youth treatment outcome measures: A comparison of the CBCL, BASC-2, and Y-OQ. Journal of Clinical Psychology, 67(1), 111-125.

Evaluation Methodology

“The publisher's website indicates that the measure is used to assess an individual's emotional/behavioral issues that may negatively affect their school or home functioning, to differentiate between hyperactivity and attention problems, and to assess treatment programs and their outcomes” (Measure Profile, 2012)

Measurement Characteristics

“This instrument is designed to measure a wide variety of constructs in the areas of behavior and emotions in order to identify potential issues or problem areas for children, adolescents, and young adults” (Measure Profile, 2012)

Up to 160 self-reported items (depending on the form and the respondent), with 4-point Likert scale responses ranging from “Never” to “Almost Always” (Measure Profile, 2012)

5 composite scores: Adaptive skills, Behavioral symptoms index, Externalizing problems, Internalizing problems, School problems (Measure Profile, 2012)

“The BASC-2 consists of 16 primary scales and 7 optional scales, along with 5 composite scales. Since some scales are not applicable to certain age groups (i.e. the Activities of Daily Living scale does not apply to adolescents), the number of subscales used depends upon the age of the child being measured. Furthermore, some subscales are also not applicable to certain forms - for example the Learning Problems scale does not apply to the Parent Rating Scales form. The BASC-2 composite scales consist of: Adaptive Skills, Behavioral Symptoms Index, Externalizing Problems, Internalizing Problems, and School Problems. Depending upon the age groups and rating forms used, the individual will have 4-5 composite scores comprised of 11-16 primary scales. Depending upon the individual’s behavioural circumstances, 1-7 optional content scales can be examined at the discretion of the clinician” (Behavior Assessment System for Children- Second Edition, 2011).

Parents/ guardians or teachers of children ages 2-21 (Measure Profile, 2012)

“The tool consists of 5 different kinds of reports/forms that are completed by an array of parents, teachers, caregivers, the examinee, and the clinician:

· The Teacher Rating Scales (TRS) are separated into 3 forms: preschool (ages 2 to 5), child (ages 6 to 11), and adolescent (ages 12 to 21). The TRS has between 100 and 139 items that are rated on a 4-point scale of behavioral frequency from “Never” to “Almost Always.”

· The Parent Rating Scales (PRS) share the same age group forms, and have between 134 and 160 items using the same 4-point rating system.

· The Self-Report of Personality (SRP) uses 3 separate forms: child (ages 8 to 11), adolescent (ages 12 to 21), and college (ages 18 to 25). An oral interview-based form is also available (SRP-I) for children aged 6 to 7 years. Children are orally presented questions and asked to respond with a simple “yes” or “no” response” (Behavior Assessment System for Children- Second Edition, 2011)

“4,800 TRS reports, 4,800 PRS reports, and 3,400 SRP reports. The sample consisted of both general population and clinical individuals from the United States. The children ranged in age from 2 to 18. The gender and ethnic composition of the general population portion of the sample closely resembled that of the U.S. general population. There was also a college sample of 706 American students ages 18-25” (Measure Profile, 2012)

“The BASC-2 was normed using two populations: (1) a general population sample of American children and adolescents from various settings: public/private schools, mental health clinics/hospitals and preschools/daycares, and (2) a clinical norm sample of American children and adolescents (ages 4-18) who were diagnosed with emotional, behavioural, or physical problems.

“The general population sample had a total of 4,650 Teacher Rating Scales reports, 4,800 Parent Rating Scales reports, and 3,400 Self-Report of Personality reports. In terms of age groups, there were 2,250 TRS and PRS for preschool ages 2-5, 3,600 TRS, PRS and SRP for children ages 6-11, and 5,500 TRS, PRS and SRP for adolescents (ages 12-18). It should be noted that limited sample sizes are available for ages 2 and 18, although the authors claim that this has had a negligible effect on the norms. In terms of gender and ethnic representation, the general sample is extremely close to U.S. population estimates. Data were also collected for the SRP-COL (college-level SRP) for ages 18-25 using an American sample of 706 students.

“The clinical sample had a total of 5,281 reports across the TRS, PRS and SRP scales. The sample was comprised of 317 preschool-aged (2-5) children, 673 children ages 6-11, and 789 adolescents ages 12-18. The children in this sample had a variety of diagnoses, including specific learning disabilities, speech/language impairments, emotional/behavioural disturbances, hearing impairment and ADD/ADHD. Additional demographic information for both samples is detailed in the manual” (Behavior Assessment System for Children- Second Edition, 2011)

“The authors report internal consistency coefficient alphas in the 0.90s for all composite scores, test-retest correlation coefficients in the 0.80s for the composite scores, and inter-rater reliability coefficients ranging from 0.57 to 0.74 for the composite scores” (Measure Profile, 2012)

“The BASC-2 manual discusses three kinds of reliability measures that were established based on the two aforementioned samples.

Internal Consistency: An analysis of internal consistency yielded coefficient alpha reliabilities generally in the .90s for the composite scales, and reliabilities generally in the .80s for individual scales across all forms (TRS, PRS, SRP) in both the general sample and the clinical sample.

Test-retest Reliability: Samples of individuals distributed across the three age groups, drawn from both the general and clinical samples, were retested with the BASC-2 one to eight weeks after the first administration. Test-retest reliabilities were calculated for the TRS, PRS and SRP, and yielded average correlations in the .80s for composite scores and between the .70s and .80s for individual scales across all age groups.

Interrater Reliability: Interrater reliability analysis was performed for the Teacher and Parent reports for a significant proportion of the scores. Two samples (for TRS and PRS) were tested and rated by 2 teachers and 2 parents respectively. Median reliabilities for composite scores ranged from .57 to .74, and median reliabilities ranged from .53 to .65 across individual scales for the TRS sample. The PRS sample had median reliabilities for both composite scores and individual scales in the .70s” (Behavior Assessment System for Children- Second Edition, 2011)
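The coefficient alpha figures quoted above come from the standard Cronbach's alpha formula, α = k/(k−1) · (1 − Σ item variances / variance of the total score). A minimal sketch with fabricated Likert-item data:

```python
# Cronbach's coefficient alpha, the internal-consistency statistic
# reported for the BASC-2. Data below are fabricated for illustration.

def cronbach_alpha(items):
    """items: list of equal-length lists, one list per item
    (each position across lists is one respondent)."""
    k = len(items)           # number of items
    n = len(items[0])        # number of respondents

    def var(xs):             # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Four hypothetical Likert items answered by six hypothetical respondents
items = [[3, 4, 4, 2, 3, 4],
         [3, 4, 3, 2, 3, 4],
         [2, 4, 4, 1, 3, 3],
         [3, 3, 4, 2, 2, 4]]
print(round(cronbach_alpha(items), 2))  # 0.91
```

The same ratio results whether population or sample variances are used, since the scaling factor cancels.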

“The authors report exemplary concurrent validity and extensive convergent validity” (Measure Profile, 2012)

“The BASC-2 includes an extensive section on test validity. According to the authors, the BASC-2 was developed using content that came from teachers, parents, children and psychologists. The tool was also developed using diagnostic criteria from the DSM-IV and DSM-IV-TR, as well as other behavioural instruments.

Construct, Convergent, and Discriminative Validity: The manual discusses several validation measures that were designed to assess the BASC-2’s similarity to other kinds of behavioural scales. For the Teacher Rating Scales form, the tool was compared to several related behavioural assessment tools, such as the Achenbach System of Empirically Based Assessment Caregiver-Teacher Report Form (ASEBA), Conners’ Teacher Rating Scale-Revised (CTRS-R), and the previous version of the Behavior Assessment System for Children (BASC). In general, correlations between subscales were high (in the .70s and .80s) when they addressed similar content; as expected, the BASC-2 was highly correlated (in the .90s) with the previous BASC.

“Similarly, the PRS scale was compared with other behavioural measures such as the ASEBA Child Behavior Checklist for Ages 1-5, the Conners’ Parent Rating Scale-Revised, the Behavior Rating Inventory of Executive Functioning (BRIEF), and the BASC. Generally, the BASC-2 correlated in the .70s and .80s with the first three scales, and in the .90s with the previous BASC.

“Finally, adolescent and college scores on the SRP scale were correlated with related self-report measures: the ASEBA Youth Self-Report, the Conners-Wells’ Adolescent Self-Report Scale (CASS), the Children’s Depression Inventory (CDI), the Revised Children’s Manifest Anxiety Scale (RCMAS), the Brief Symptom Inventory (BSI), the Beck Depression Inventory-II (BDI-II), the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) and the BASC. Correlations between the tools varied widely depending upon the specific subscales that were compared; however, in general the correlations fell in the .50s and .60s. The manual also includes profiles for clinical samples for the TRS, PRS and SRP scales; however, this information is beyond the scope of this review” (Behavior Assessment System for Children- Second Edition, 2011)

10-20 minutes for parents/teachers (Measure Profile, 2012)

“Scoring and interpretation requires a doctorate in psychology, education, or a related field with relevant training and experience in assessment, or a license to practice in a health or allied health care field (e.g. doctors, nurse practitioners, social workers, etc.) Scoring can be done manually or electronically with the ASSIST/ASSIST Plus software” (Measure Profile, 2012)

“Administration and scoring for the TRS, PRS, and SRP should be completed by professionals or paraprofessionals who are familiar with testing procedures and have appropriate supervision. As with other ‘Level C’ instruments, score interpretation must be completed by professionals with formal graduate-level training or clinicians with training in psychological assessment. The test is easy to administer, moderately easy to score, and moderately difficult to interpret based on the clinician’s experience and interpretation guidelines in the manual” (Behavior Assessment System for Children- Second Edition, 2011).

English, French, Chinese and Spanish (Measure Profile, 2012)

English, Spanish (Behavior Assessment System for Children- Second Edition, 2011)


“Practicing psychologists have the professional training and clinical skills to help people learn to cope more effectively with life issues.”

In Malta, and possibly everywhere else, Educational Psychologists (EPs) focus on applying psychological science to improve the learning process and promote educational and developmental success for all students. In an effort to maximize efficiency, validity, reliability, and objectivity, many psychologists find themselves including psychometric tools in their practice.

Whilst no one seems to question the usefulness of these tools, “over-use” of, or “over-dependence” on, psychometric tools could lead to EPs being seen exclusively as psychometricians.

My argument here is that this definition, of an EP as a psychometrician, is very limiting.

What are psychometrics?

The word “psyche” comes from a Greek word which refers to the mind. The term “metric” refers to the process of measurement. Psychometrics are testing tools which are used to measure various aspects of mental function. A psychometric test is an objective resource for identifying and measuring qualities in individuals in order to make informed decisions. The tests available range from intelligence tests (IQ tests), personality tests, tests of motivation, learning skills, attitudes, etc. Psychologists use these tests because they consider them to be an objective and standardized measure, enabling them to obtain valid and reliable measures of particular skills or abilities.

Is an Educational Psychologist (EP) a Psychometrician?

The short answer is: not exactly!

The psychometric aspect of an EP’s work is only a small part of the role a psychologist fulfils. EPs are highly trained professionals who use psychometric tests to inform their practice: that is, their assessments and their decisions and recommendations about intervention.

Farrell et al. (2006) suggest that the core focus of the Educational Psychologist is assessment and intervention pertaining to children’s cognitive, linguistic, sensory, physical and/or social and emotional development. In this definition, Farrell et al. (2006) point towards the intermingled nature of assessment and intervention. Nevertheless, these ‘procedures’ are often regarded as two separate courses of action pertaining to two separate dimensions with no apparent connection. Moreover, assessment is sometimes perceived to be the main, and sometimes the only, professional service an educational or school psychologist can offer (Harrison, 2009). The irony of this supposition is that it is as correct as it is incorrect. Its accuracy lies in the fact that assessment is one of the key tasks that an Educational Psychologist performs. As mentioned earlier, it is also one of the tasks in which EPs are highly specialized! EPs have received a great deal of training in how to use psychometrics, and in how to use them in the process of identifying a child’s strengths and difficulties.

Nonetheless, the fault lies in the notion of assessment held within “popular culture”: that assessment is a test, one and the same as psychometrics.

Psychometrics tests & psycho-educational assessment

This notion of the psychologist as a psychometrician is owed to the fact that standardized norm-referenced tests have represented conventional practice for a long time, especially when it comes to obtaining measures of intellectual functioning, academic achievement and social-emotional functioning (Dykeman, 2006). Norm-referenced assessment strategies compare a child’s performance on a standardized test with the typical performance of other children of the same age (Dykeman, 2008).

Assessment, as defined by the Oxford Dictionary, refers to the act of judging or deciding the amount, value, quality or importance of something. While this is a precise definition of the word ‘assessment’, it is an incomplete description of what the process of psychological assessment (or psycho-educational assessment) entails.

Frederickson, Webster and Wright (1991) describe the process of psychological assessment as an in-depth investigation of a broad range of hypotheses that builds on research from all areas of psychology. They assert that this is at the heart of the distinctive contribution that educational psychologists can make. Sayeed and Guerin (2000) delve deeper and depict assessment as the identification of a child’s level of functioning through observation and/or interaction, with the objective of understanding the child’s needs and potential.

In their book, “Frameworks for practice in Educational Psychology”, Kelly, Woolfson and Boyle (2008) strive to provide clarity regarding the process of psychological assessment and its place in the practice of Educational Psychology. Their line of reasoning recognizes assessment as being an integral component of the course of action leading towards change, in which the Educational Psychologist is the catalyst. Assessment is thus anything but a test that a child is obliged to pass. Rather, it resembles more the start of a relationship that sets the tone for all future interventions.

Professional guidelines for Educational and Child Psychology practice, set by the British Psychological Society (2002), acknowledge the usefulness of psychometrics. However, they also encourage psychologists to take a wider view of assessment.

In my opinion, an overly strict focus on positivist ontologies and epistemologies, whilst indeed providing objectivity, could actually subjugate the EP to a psychometrician’s role. Psychometrics can indeed provide information on an array of qualities. However, these tools alone do not always provide sufficient depth and insight into the complexity of a child’s reality.

Sayeed & Guerin (2000) encourage EPs to consider alternatives to ‘psychometric models’ of practice. In their opinion, psychometric models provide too static an impression of a child. Whilst also acknowledging that such tools are useful, they argue that the “psychometric model” contributes very little towards planning future involvement and interventions.

Frederickson et al. (1991) argue that static assessment frameworks limit outcomes to mere descriptors of behaviors and abilities. They propose that psychological assessment should move beyond describing what a child can and cannot do. They argue that the process of psychological assessment should “understand why particular patterns of strength and difficulty are being experienced” (Frederickson et al., 1991, p. 20).

Are psychometric tests for everyone?

Bagnato, Neisworth, Paget and Kovaleski (1987) argue that even though psychometric tests do in fact provide adequate reliability and validity, the unnatural testing situation, the complex language demands, and the question-and-answer format are for the most part foreign to children in the early-years demographic. This early-years bracket has been given more importance in Educational Psychology practice ever since research emerged supporting the notion of intervening as early as possible in a child’s life (Kenny & Culbertson, 1993). Hence, the issue arises of which assessment an EP should use with such a potentially delicate age group. This issue is amplified by the child’s spontaneity and lack of self-consciousness, characteristics frequently associated with the early years.

As a matter of fact, very young children often will not display ‘typical’ school-age behavior during the administration of tests. This is primarily because normal developmental transitions affect the motivation, interests, and cooperation of young children (Culbertson & Willis, 1993). Their behavior may be irregular or unpredictable, especially if they have never been in a structured school setting before. This can make the idea of sitting down at a table and doing activities an examiner requests implausible for the child. Furthermore, children with little interpersonal experience with adults might react with fright or aggression to the demands of their examiner (Kamphaus, Dresden, & Kaufman, 1992).

Whilst standardized tests may be adequate for school-aged children, it is doubtful, if not improbable, that preschool children can perform to the best of their potential under such unnatural conditions. Preschool children are more prone to encounter difficulties with physically accessing activities, comprehending standardized directions, producing verbal responses, and coping with unfamiliar materials (Sayeed & Guerin, 2000).

In view of these facts, Blaker Sayer (2003) maintains that Educational Psychologists should become increasingly familiar with tests that are more suitable for this young demographic. Thus, whilst it is imperative to always keep in mind the purpose of the assessment being undertaken, Educational Psychologists should strive to employ assessment tools which are sensitive to the population being assessed (Frederickson et al., 1991; Pellegrini, 2001).

Conclusion: if there could ever be a conclusion to this debate.

Psychometric tests are very useful tools. They present EPs with the opportunity to obtain information on a wide range of skills and abilities. However, psychometric tests are one amongst a variety of tools which can be used to inform one's practice. Furthermore, one has to be cautious about the conclusions and/or assertions made on the basis of such summative and static assessment instruments.

Also, to answer my first question: is an EP a psychometrician?

Maybe the better answer would be: “yes…EPs are psychometricians…amongst many other things”.

EPs are psychometricians…and so much more!

Bagnato, S. J., Neisworth, J. T., Paget, K., & Kovaleski, J. (1987). The developmental school psychologist: Professional profile of an emerging early childhood specialist. Topics in Early Childhood Special Education, 7, 3, 75-89.

Bagnato, S. J., & Neisworth, J. T. (1994). A national study of the social and treatment "invalidity" of intelligence testing for early intervention. School Psychology Quarterly, 9, 2, 81-102.

Blaker Sayer, K. (2003). Preschool intellectual assessment. In Reynolds, C. R. & Kamphaus, W. R. (Eds.), Handbook of Psychological & Educational Assessment of Children: Intelligence, Aptitude, and Achievement (pp. 187-203). London: The Guilford Press.

British Psychological Society (2002). Professional Practice Guidelines: Division of Educational and Child Psychology. Leicester: BPS.

Culbertson, J. L., & Willis, D. J. (1993). Introduction to testing young children. In Culbertson, J. L., & Willis, D. J. (Eds.), Testing young children: A reference guide for developmental, psychoeducational, and psychosocial assessments (pp. 1-10). Austin: Pro-Ed.

Dykeman, B. (2006). Alternative strategies in assessing special education needs. Education, 127, 2, 265-273.

Dykeman, B. (2008). Play-Based Neuropsychological Assessment of Toddlers, Journal of Instructional Psychology, 35, 4, 405-408.

Farrell, P., Woods, K., Lewis, S., Rooney, S., Squires, G., & O’Connor, M. (2006). A review of the functions and contribution of educational psychologists in England and Wales in light of “Every Child Matters: Change for Children”. London: DfES Publications.

Frederickson, N., Webster, A. & Wright, A. (1991). Psychological assessment: A change of emphasis. Educational Psychology in Practice, 7, 1, 20-29.

Harrison, P. L. (2009). Preschool Assessment. In Gutkin, T. B., & Reynolds, C. R. (Eds.), The Handbook of School Psychology (pp. 247-268). New Jersey: John Wiley & Sons, Inc.

Kamphaus, R. W., Dresden, J., & Kaufman, A. S. (1992). Clinical and psychometric considerations in the assessment of preschool children. In D. J. Willis & J. L. Culbertson (Eds.), Testing young children: A reference guide for developmental, psychoeducational, and psychosocial assessments. Austin, TX: pro-ed.

Kelly, B., Woolfson, L., & Boyle, J. (Eds). (2008). Frameworks for Practice in Educational Psychology: A textbook for trainers and Practitioners. London: Jessica Kingsley Publishers.

Kenny, T. K., & Culbertson, J. L. (1993). Developmental screening for preschoolers. In J. L. Culbertson & D. J. Willis (Eds.), Testing young children: A reference guide for developmental, psychoeducational, and psychosocial assessments (pp. 73-100). Austin, TX: Pro-Ed.

Pellegrini, A. D. (2001). Practitioner review: The role of direct observation in the assessment of young children. Child Psychology and Psychiatry, 42, 7, 861-869.

Sayeed, Z., & Guerin, E. (2000). Early Years Play: A happy medium for assessment and intervention. London: David Fulton Publishers.


Descriptive Statistics

Descriptive statistics were calculated for the PAT 2.0 as well as the validating measures. PAT2.0 Total and subscale statistics are reported in Table I. Descriptive statistics for the validating measures are in Table II.


Internal consistency for the Total PAT2.0 score was strong (α = .81, Table I). For six of the seven PAT2.0 subscales, an alpha coefficient of .60 or above was obtained through removal of items. Due to the multidimensional nature of the Family Beliefs subscale, only four items were analyzed. The internal consistency of these four items was α = .59, but the subscale was retained in analyses for theoretical reasons (Kazak et al., 2004).

These items were chosen because they represent a subset of cancer-related beliefs, competence (see Fig. 2, items 15a, 15f, and 15h) and positive growth (item 15c), theoretically measure similar beliefs, and demonstrated adequate internal consistency. Pearson product-moment correlations indicated very good test–retest reliability for the PAT2.0 Total score for mothers (r = .78, p < .001) and fathers (r = .87, p < .001).
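The alpha coefficients reported above follow Cronbach's standard formula. As an illustrative sketch (with made-up item scores, not the study's data), alpha can be computed directly from raw item responses:

```python
def cronbach_alpha(items):
    """Cronbach's alpha from raw item scores.

    `items` is a list of equal-length lists, one list per item:
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(totals))
    """
    def var(xs):
        # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    k = len(items)
    n = len(items[0])
    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(item) for item in items) / var(totals))

# Three hypothetical items answered by five respondents.
scores = [
    [2, 4, 3, 5, 1],
    [2, 5, 3, 4, 2],
    [3, 4, 4, 5, 1],
]
alpha = cronbach_alpha(scores)  # roughly .94 for these made-up data
```

Dropping an item and recomputing alpha is how "an alpha coefficient of .60 or above was obtained through removal of items" would typically be pursued.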

Mothers’ and fathers’ scores were compared using paired-sample t-tests on the PAT2.0 subscale and Total scores. There were no significant differences, with one exception: mothers reported significantly fewer sibling problems than fathers [t(1, 63) = −3.74, p < .01].
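The paired-samples t statistic used above divides the mean within-pair difference by its standard error. A minimal sketch, with hypothetical ratings rather than the study's data:

```python
import math

def paired_t(x, y):
    """Paired-samples t statistic: t = mean(d) / (sd(d) / sqrt(n)),
    where d is the list of within-pair differences."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((di - mean_d) ** 2 for di in d) / (n - 1)
    return mean_d / math.sqrt(var_d / n)

# Hypothetical mother vs. father ratings for six families.
mothers = [1, 2, 1, 3, 2, 1]
fathers = [2, 3, 2, 3, 3, 2]
t = paired_t(mothers, fathers)  # negative: mothers rate lower here
```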

Content Validity

The content validity of the PAT2.0 subscales was examined by correlating PAT2.0 subscale scores with measures that theoretically assess the same content. All correlations were conducted independently for mothers and fathers (Table III). Specifically, the Family Structure and Resources, Family Problems, and Sibling Problems subscales were correlated with the FES Cohesion and Conflict scales; the Family Beliefs and Stress Reaction subscales were correlated with the ASDS; and the Child Problems subscale was correlated with the BASC-2. For mothers, correlations were in the expected directions between the Structure and Resources subscale and FES-Cohesion (p < .05); the Family Problems subscale and the FES-Cohesion (p < .05) and FES-Conflict (p < .05) scales; the Stress Reaction subscale and the ASDS and State anxiety scores; the Family Beliefs subscale and the ASDS (p's < .05); the Sibling Problems subscale and FES-Conflict (p < .05); and the Child Problems subscale and the BASC-2 (p's < .001). For fathers, significant correlations were observed between Family Structure and Resources and FES-Cohesion (p < .05); between Family Problems and FES-Conflict (p < .05); between the Stress Reaction subscale and the ASDS and State anxiety scale; and between Child Problems and the BASC-2.

Criterion-Related Validity

To assess criterion-related validity, PAT2.0 Total scores were correlated with outcome variables indicative of, or associated with, psychosocial risk. Maternal PAT2.0 scores were significantly correlated in the predicted directions with maternal ASDS, State anxiety, FES-Conflict, and the BASC-2 (all p's < .01). Likewise, higher father PAT2.0 scores were significantly correlated with higher paternal ASDS, FES-Conflict, and BASC-2 scores, as well as lower FES-Cohesion (Table IV). The ITR-2 was not significantly associated with PAT2.0 Total scores.

Next, PAT2.0 Total score cutoffs were established to classify families into PPPHM categories. Score cutoffs were determined a priori based on the PPPHM theory (Kazak, 2006) and our previous empirical evidence using the original PAT (Kazak et al., 2001). The cutoffs were then examined to determine where they fell in the distribution of the sample scores. PAT2.0 Total scores less than 1 SD above the mean were placed in the Universal category, scores between 1 and 2 SD above the mean were classified in the Targeted category, and scores more than 2 SD above the mean were classified in the Clinical category. Consistent with what would be predicted by the PPPHM, based upon maternal reports, 55% of the families fell into the Universal category, 32% into Targeted, and 13% into the Clinical category. For fathers' reports, 67% of the families fell into Universal, 32% into Targeted, and 1% into the Clinical range.
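The SD-based classification rule described above can be sketched as follows (the sample mean and SD here are hypothetical, not the study's values):

```python
def ppphm_category(score, mean, sd):
    """Assign a PPPHM tier from a PAT2.0 Total score using the
    SD-based cutoffs described in the text: below +1 SD is Universal,
    +1 SD to +2 SD is Targeted, above +2 SD is Clinical."""
    z = (score - mean) / sd
    if z < 1:
        return "Universal"
    if z <= 2:
        return "Targeted"
    return "Clinical"

# Hypothetical sample statistics, not the study's actual mean and SD.
labels = [ppphm_category(s, mean=1.0, sd=0.5) for s in (0.8, 1.6, 2.3)]
```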

To further validate these cutoffs, scores on the ASDS, BASC-2, FES-Conflict scale, and the STAI-Y State scale were compared among the PPPHM categories. Analyses were conducted separately for mothers and fathers. For mothers, omnibus ANOVAs were significant for all measures (p's < .05). Follow-up comparisons were conducted using Bonferroni corrections to determine which levels of the PPPHM differed significantly from one another. Mean differences and effect sizes for each of the comparisons are listed in Table V. ASDS scores were significantly lower for mothers in the Universal group versus the Targeted and Clinical groups (p's < .05), and those in the Targeted group were significantly lower than the Clinical group. With regard to BASC-2 scores, the Universal and Targeted groups defined by mothers did not differ significantly; however, both of these groups had more favorable BASC-2 scores than the Clinical group. Finally, STAI-Y State scores were significantly lower for mothers in the Universal group compared to mothers in the Targeted or Clinical groups. Mothers in the Targeted and Clinical groups did not differ in anxiety scores (p > .05). For fathers, comparisons were only conducted between the Universal and Targeted groups; the Clinical group included only one family, precluding post hoc analyses with this group. No significant differences were observed between the Universal and Targeted groups on the ASDS, BASC-2, FES-Conflict, or STAI-Y State for fathers.
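Cohen's d, the effect-size measure used in the comparisons above, is the difference in group means divided by the pooled standard deviation. A minimal sketch with hypothetical group scores (not the study's data):

```python
import math

def cohens_d(x, y):
    """Cohen's d: difference in group means divided by the pooled SD."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    pooled_sd = math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    return (mx - my) / pooled_sd

# Hypothetical ASDS scores for two risk groups.
targeted = [48, 50, 46, 52, 49]
universal = [40, 42, 38, 44, 41]
d = cohens_d(targeted, universal)
```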

Mean Differences on Outcome Measures Between PPPHM Risk Categories for Mothers and Fathers

Variable / comparison            Mothers: Mean Δ   d      Fathers: Mean Δ   d
ASDS                             F(2, 127) = 25.68, p < .001    F(1, 69) = 2.92, p = .06
  Universal vs. Targeted         6.66*             .49    7.06              .47
  Universal vs. Clinical         26.86***          1.41
  Targeted vs. Clinical          20.00***          2.11
BASC-2                           F(2, 99) = 20.09, p < .001     F(1, 48) = 4.73, p = .01
  Universal vs. Targeted         2.99              .02    3.23              .29
  Universal vs. Clinical         15.02***          1.25
  Targeted vs. Clinical          12.02***          1.43
FES-Conflict                     F(2, 125) = 9.33, p < .001     F(1, 69) = 2.70, p = .07
  Universal vs. Targeted         .46               .31    .55               .48
  Universal vs. Clinical         2.00***           .90
  Targeted vs. Clinical          1.54**            1.15
State Anxiety                    F(2, 127) = 14.87, p < .001    F(1, 69) = 1.86, p = .16
  Universal vs. Targeted         8.07**            .74    5.62              .45
  Universal vs. Clinical         14.16***          .53
  Targeted vs. Clinical          6.09              1.41

Note: Cohen's d was used as a measure of effect size. Effect sizes were not calculated between the Universal and Clinical and the Targeted and Clinical for fathers because only one father was in the Clinical category based on his PAT2.0 Total Score. Number of participants varies depending on the valid data available for each measure.


Convergent Validity

Convergent validity, the relationship between two measures purported to measure the same domain, was assessed by calculating correlations between PAT2.0 Total scores and Staff PAT scores from nurses and physicians. Maternal PAT2.0 scores were significantly associated in the expected directions with both physician (r = .45, p < .01) and nurse (r = .38, p < .01) Staff PAT reports. For fathers, PAT2.0 scores were significantly correlated with nurse-reported Staff PAT scores (r = .36, p < .05) but not physician reports (r = .17, p > .05).

Discriminant Validity

As predicted, the PAT2.0 was not correlated with physician-rated treatment intensity for mothers (r = −.10, p > .05) or fathers (r = −.05, p > .05). The sensitivity and specificity of the PAT2.0 to detect clinically significant outcomes were also examined using ROC curves (McFall & Treat, 1999; Zweig & Campbell, 1993). The ASDS and the BASC-2 were chosen for the ROC analyses as both measures have established clinical or "at-risk" cutoffs (for the ASDS, a Total score of 56 or greater; for the BASC-2, a Behavioral Symptom Index T-score of 60 or greater). The ROC curves resulted in areas under the curve (AUC) significantly better than .50 (the value when diagnostic performance of a measure is equal to chance) for mothers on both the BASC-2 and the ASDS, and for fathers on the BASC-2. For mothers, the AUC for the BASC-2 was .94 (p < .001) and for the ASDS the AUC was .80 (p < .001). Using a PAT2.0 Total score of 1.0 as the cutoff, the PAT2.0 correctly classified 35/47 (75%) of the mothers with scores above the clinical cutoff on the ASDS (sensitivity) and 60/81 (74%) of the mothers who did not have ASDS scores above the clinical cutoff (specificity). Maternal PAT2.0 scores of 1.0 or above correctly classified 8/8 (100%) of the participants with BASC-2 scores over the "at-risk" cutoff and 55/92 (60%) of those scoring below the "at-risk" range. We then examined whether a PAT2.0 Total score of 2.0 or greater discriminated those who scored in the "clinically significant" range on the BASC-2 (score > 70). Indeed, PAT2.0 scores above 2.0 identified 3/3 children with BASC-2 scores above 70 and correctly classified 83/95 (87%) of the children who did not score in the "clinically significant" range.

For fathers the AUC for the BASC-2 was significant (AUC = .96, p < .001) but not for the ASDS (AUC = .60, p > .05). Further analyses of paternal data on the BASC-2 indicated that a PAT2.0 Total score of 1.0 or greater correctly classified 5/5 (100%) of the participants with BASC-2 scores over the “at-risk” cutoff and 30/45 (67%) of those under the cutoff. There was insufficient data to examine whether a PAT2.0 Total score of 2.0 or greater discriminated children in the “clinically significant” range on the BASC-2 (no fathers reported their child as in the BASC-2 clinically significant range).
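The sensitivity and specificity figures reported above come from a 2×2 classification table at a given cutoff. A minimal sketch with hypothetical screener totals and caseness flags (not the study's data):

```python
def sensitivity_specificity(scores, is_case, cutoff):
    """Sensitivity and specificity of a screener score at a cutoff.

    `is_case` flags whether each person exceeds the criterion
    measure's clinical cutoff (the "true" status in the ROC sense).
    """
    tp = sum(1 for s, c in zip(scores, is_case) if c and s >= cutoff)
    fn = sum(1 for s, c in zip(scores, is_case) if c and s < cutoff)
    tn = sum(1 for s, c in zip(scores, is_case) if not c and s < cutoff)
    fp = sum(1 for s, c in zip(scores, is_case) if not c and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical PAT-style totals and criterion caseness flags.
totals = [0.4, 0.8, 1.2, 1.5, 2.1, 0.6, 1.1, 0.9]
cases = [False, False, True, True, True, False, False, False]
sens, spec = sensitivity_specificity(totals, cases, cutoff=1.0)
```

Sweeping the cutoff over its range and plotting sensitivity against 1 − specificity is what produces the ROC curve whose AUC is reported above.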

Psychometric evaluation of a patient-reported symptom assessment tool for adults with haemophilia (the HAEMO-SYM)

In patients with haemophilia, repeated bleeding events result in significant comorbid conditions that can degrade health-related quality of life. Clinician-reported symptom measures are available for use in patients with haemophilia A or B; however, no validated patient-reported symptom evaluation instrument has been available for haemophilia to date. The objective of this study was to develop and evaluate a self-report instrument, the HAEMO-SYM, for measuring symptom severity in patients with haemophilia. Eighty-four haemophilic subjects from Canada and the USA were enrolled and completed the HAEMO-SYM, SF-36, and Health Assessment Questionnaire-Functional Disability Index (HAQ-FDI). Four-week reproducibility was evaluated in 72 stable subjects. Construct validity was assessed by correlating subscale scores with the SF-36, HAQ-FDI, a coping questionnaire, and clinical scores. The final 17-item HAEMO-SYM has two subscales: pain and bleeds. Internal consistency reliability was good (Cronbach's alphas, 0.86-0.94), as was test-retest reliability (intraclass correlation coefficients, 0.75-0.94). HAEMO-SYM subscale scores were significantly correlated with SF-36 scores (P < 0.05 for all except HAEMO-SYM Pain and SF-36 Mental Health), HAQ-FDI scores (P < 0.05 for all but HAEMO-SYM Bleeds with HAQ-FDI Hygiene and Reach), the Gilbert scale (P < 0.01), coping (P < 0.05) and global pain (P < 0.001). Mean HAEMO-SYM scores varied significantly in groups defined by severity, HIV status and treatment regimen: greater symptom severity was associated with more severe disease, HIV-positive status and prophylaxis treatment. The results of this study suggest that the HAEMO-SYM, a haemophilia-specific symptom severity instrument, has good reliability, and they provide evidence supporting its construct validity in patients with haemophilia.

Behaviour Assessments

Behavioural assessments examine whether a child’s challenging behaviour (e.g. hyperactivity, aggression, impulsivity etc.) is age-appropriate. They are typically administered alongside several interviews with caregivers/parents/teachers (e.g. ABAS-3 is a questionnaire given to caregivers covering three domains – conceptual, practical and social). This will provide you with more accurate results because a child behavioural psychologist can then evaluate your child’s behaviour in context.

Behavioural psychometric assessments are most often used to diagnose the following mental health disorders:

  • Attention deficit/hyperactivity disorder (ADHD).
  • Oppositional defiant disorder (ODD).
  • Conduct disorder (CD).

Psychological assessment and diagnosis are crucial. From here, your child’s psychologist can follow up with you to develop an effective, individualised treatment plan.

As highlighted throughout this guide – the best way to help your child with any issues they may be experiencing is to understand their specific needs. Psychometric testing is one of the most effective ways to do this and will allow the psychologist to help your child reach their full potential. Hopefully this guide has helped you understand the purpose of any testing recommended by your child’s psychologist and will allow you to better support your child.



Strengths and Difficulties Questionnaire (SDQ)

The Strengths and Difficulties Questionnaire (SDQ) is a mental health screening tool for use with children and adolescents.

The SDQ is a brief behavioural screening questionnaire about 2-17 year olds. It exists in several versions to meet the needs of researchers, clinicians and educationalists. Each version includes between one and three of the following components:

A) 25 items on psychological attributes

All versions of the SDQ ask about 25 attributes, some positive and others negative. These 25 items are divided between 5 scales: emotional symptoms (5 items), conduct problems (5 items), hyperactivity/inattention (5 items), peer relationship problems (5 items), prosocial behaviour (5 items).
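Assuming the conventional SDQ scoring (each item scored 0-2, so each 5-item scale ranges 0-10, with the four problem scales summed into a total difficulties score and prosocial behaviour reported separately), the scale structure can be sketched as:

```python
# Hypothetical per-scale scores: each SDQ scale has 5 items scored
# 0-2, so each scale ranges 0-10.
scale_scores = {
    "emotional": 4,
    "conduct": 3,
    "hyperactivity": 6,
    "peer": 2,
    "prosocial": 8,
}

# The conventional total difficulties score sums the four problem
# scales (range 0-40); the prosocial scale is reported separately.
problem_scales = ("emotional", "conduct", "hyperactivity", "peer")
total_difficulties = sum(scale_scores[s] for s in problem_scales)
```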

Several two-sided versions of the SDQ are available with the 25 items on strengths and difficulties on the front of the page and an impact supplement on the back. These extended versions of the SDQ ask whether the respondent thinks the young person has a problem, and if so, inquire further about chronicity, distress, social impairment, and burden to others.

The follow-up versions of the SDQ include not only the 25 basic items and the impact question, but also two additional follow-up questions for use after an intervention.

Target Population: Children between the ages of 2 to 17

Time to Administer: One sided version with 25 items, administration time approximately 5 minutes

Completed By: Parents and teachers. There is also a self-report version for 11-17 year olds

Modalities Available: Although the SDQ is free to download and can be manually scored, use of the online scoring version is recommended, for a fee of .25 per use, due to the level of errors that occur when scoring it manually. Users are not permitted to create or distribute electronic versions for any purpose without prior authorization from YouthinMind. If you are interested in making translations or creating electronic versions, you must first contact YouthinMind via email at [email protected]

Scoring Information: The SDQ scoring site offers fast online scoring and report generation; hand scoring is also possible.

Languages Available: Afrikaans, Amharic (Ethiopia), Arabic, Bulgarian, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Farsi (Iran), Finnish, French, German, Greek, Gujarati (India), Haitian Creole, Hebrew, Hindi, Hungarian, Icelandic, Italian, Japanese, Kannada (India), Khmer, Korean, Kurdish, Lithuanian, Malay, Malayalam (India), Norwegian, Polish, Portuguese, Punjabi (India), Romanian, Russian, Serbian, Spanish, Swedish, Tamil, Thai, Turkish, Ukrainian, Urdu (India & Pakistan), and Xhosa. Note that the rating for the measure is based solely on the English version.

Training Requirements for Intended Users: There is no minimum licensing requirement to use the tool. People from many different professions and levels of training have been able to collect, use, and interpret the SDQ. Individual organizations may, however, choose to restrict the use or interpretation of the SDQ to particular professions or levels of experience.

Availability: The SDQ is free to download and can be manually scored; see Modalities Available above for scoring fees and restrictions on electronic versions.

Contact Information

Summary of Relevant Psychometric Research

This tool has received the Measurement Tools Rating of "A – Psychometrics Well-Demonstrated" based on the published, peer-reviewed research available. The tool must have 2 or more published, peer-reviewed studies that have established the measure's psychometrics (e.g., reliability and validity, sensitivity and specificity, etc.). Please see the Measurement Tools Rating Scale for more information.


Goodman, R., & Scott, S. (1999). Comparing the Strengths and Difficulties Questionnaire and the Child Behavior Checklist: Is small beautiful? Journal of Abnormal Child Psychology, 27(1), 17-24.


Participants — Mothers of 132 children aged 4 to 7 attending a children's dental clinic (low-risk) or child psychiatric clinic (high-risk).

Race/Ethnicity — Not Specified


This study compared scores on the SDQ and the Child Behavior Checklist (CBCL), another widely used measure of child behavior problems, to assess convergent validity. Two groups of children were identified: a low-risk group recruited from a dental clinic and a high-risk group recruited from children referred to a psychiatric clinic for externalizing behavior problems. Mothers filled out both the SDQ and the CBCL. Mothers in the psychiatric sample were also administered an interview, the Parental Account of Child Symptoms. Both questionnaires distinguished high-risk from low-risk children well, and correlations between comparable subscales on the two questionnaires were high. Mothers indicated that they preferred the SDQ. This study is limited by a small sample size.

Goodman, R. (2001). Psychometric properties of the Strengths and Difficulties Questionnaire. Journal of the American Academy of Child and Adolescent Psychiatry, 40(11), 1337-1345.


Participants — 9,998 parents, 7,313 teachers, 3,983 11-15-year-olds

Race/Ethnicity — Not Specified


A national sample of parents, teachers, and youth in the United Kingdom completed a survey and interviews on mental health. This study examines completed SDQs that were included as part of the survey packets. A sub-sample repeated the questionnaire 4 to 6 months after the first administration. The interview portion of the study included the Development and Well-Being Assessment, which allowed children to receive a DSM-IV diagnosis. Reliability analyses showed that the internal consistency of items within SDQ subscales was satisfactory. Correlations measuring inter-rater agreement were statistically significant, though not high; the authors note that inter-rater agreement for the SDQ was higher than is typically reported for similar measures. Test-retest reliabilities were moderate, with youth self-ratings lowest and teacher ratings highest. The SDQ also successfully differentiated between those who were identified as having a psychiatric disorder and those who were not.
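The internal-consistency figures cited in these reliability studies are typically Cronbach's alpha, computed from the item variances and the variance of the subscale total. A minimal sketch of that computation, using hypothetical item responses rather than any study's actual data:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a set of questionnaire items.

    items: list of equal-length lists, one list of respondent scores per item.
    """
    k = len(items)
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_var = sum(pvariance(col) for col in items)
    return (k / (k - 1)) * (1 - sum_item_var / pvariance(totals))

# Hypothetical 0-2 responses for a five-item subscale (four respondents).
responses = [
    [2, 1, 2, 0],
    [2, 1, 1, 0],
    [1, 2, 2, 0],
    [2, 1, 2, 1],
    [2, 2, 2, 0],
]
print(round(cronbach_alpha(responses), 2))  # ≈ 0.92
```

Values around 0.7 or above are conventionally read as satisfactory internal consistency, which is the standard against which the weaker Peer Problems subscale results below are judged.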

Mellor, D. (2004). Furthering the use of the Strengths and Difficulties Questionnaire: Reliability with younger child respondents. Psychological Assessment, 16(4), 396-401.


Participants — 917 7-17-year-old children recruited from schools in Australia, and their parents and teachers

Race/Ethnicity — Not Specified


SDQs were completed by children, parents, and teachers. In addition, a sub-sample of older children completed a second SDQ two weeks later. Internal reliabilities were moderate for all versions of the measure, with the highest reliabilities for the parent and teacher versions and the lowest for the Peer Problems subscale of the children's version. The inter-correlations between parent, teacher, and child SDQs were statistically significant, as were test-retest correlations. The authors note that there was more congruence between parent and child reports for older children than for younger children. Older and younger children had comparable agreement with teacher reports. Older and younger children also showed comparable test-retest reliabilities, with the exception of younger children's peer relation reports, which were more inconsistent.

Sharp, C., Croudace, T. J., Goodyer, I. M., & Amtmann, D. (2005). The Strengths and Difficulties Questionnaire: Predictive validity of parent and teacher ratings for help-seeking behaviour over one year. Educational & Child Psychology, 22(3), 28-44.


Participants — Parents and teachers of 659 7-11-year-old children recruited from schools in the United Kingdom

Race/Ethnicity — Approximately 97% White, 2% Asian, 0.5% Black, 0.5% Oriental.


This study examined the ability of SDQ scores to predict the likelihood of seeking help for behavior problems at one year. Three levels of help-seeking were defined: informal (discussion with friends or family), front-line (discussion with a general practitioner or teacher), and specialist (contact with Child and Adolescent Mental Health Services (CAMHS)). Parents, teachers, and children completed the SDQ at baseline, six months, and 12 months. In addition, information was collected on mental health service use, level of parental concern, socio-economic status, and the child's IQ, based on the Wechsler Intelligence Scale for Children. Results showed that parents' SDQ scores tended to become more positive over time and were not related to help-seeking behavior. In contrast, teacher scores became more negative, and the more difficulties teachers reported on the SDQ, the more likely it was that parents would seek help for their children. The SDQ was also related to parents' self-reports of their level of concern over their children's behavior. Limitations include a low survey response rate.

Hill, C., & Hughes, J. N. (2007). An examination of the convergent and discriminant validity of the Strengths and Difficulties Questionnaire. School Psychology Quarterly, 22(3), 380-406.


Participants — 374 parents, teachers, and peers (complete data), recruited from schools in Texas. Children were 6 years old on average.

Race/Ethnicity — 34% White, 23% African American, 37% Hispanic, and 6% Other.


Parents and teachers completed the SDQ. Children were asked to name peers who fit descriptions similar to SDQ items and to rate their liking for peers. Internal reliability was found to be acceptable for all SDQ subscales, with Peer Relationship Problems having the lowest internal consistency. Statistical analysis indicated that the subscales of the SDQ provide valid measures of each construct, but that there is overlap, possibly due to the fact that problems can co-occur. Interpretation of peer data was limited by the use of an alternate data-collection method. The authors conclude that the SDQ appears to be a valid overall screening measure, but that it does not reliably distinguish among different types of problems and should not be used for diagnosis.

Palmieri, P. A., & Smith, G. C. (2007). Examining the structural validity of the Strengths and Difficulties Questionnaire (SDQ) in a U.S. sample of custodial grandmothers. Psychological Assessment, 19(2), 189-198.


Participants — 733 custodial grandmothers in Ohio providing care to children 4-16 years of age for at least 3 months.

Race/Ethnicity — 50% Black, 50% White


Grandmothers completed the SDQ as part of a telephone interview. Results showed good internal reliability for the subscales of the SDQ, with Peer Problems having the weakest reliability. The authors note an ongoing controversy over the structure of the SDQ, particularly that positively worded items tend to be correlated even when they are not part of the same subscale.

Ruchkin, V., Jones, S., Vermeiren, R., & Schwab-Stone, M. (2008). The Strengths and Difficulties Questionnaire: The self-report version in American urban and suburban youth. Psychological Assessment, 20(2), 175-182.


Participants — 4,661 urban youth (13.0 years on average); 937 suburban youth (14.0 years on average)

Race/Ethnicity — Urban: 57.5% African American, 26.6% Hispanic, 12.8% Caucasian, 0.8% Asian American, and 2.3% Other. Suburban: 83.8% Caucasian, 6.4% Asian, 1.8% Hispanic, 3.1% African American, and 4.9% Other.


Students filled out copies of the SDQ as it was read to them by survey administrators. Analysis of the scale with this sample indicated a three-factor structure: Emotional Distress/Withdrawal, Behavioral Reactivity/Conduct Problems, and Prosocial Behavior/Peer Competence. The authors note that the original five factors also represented a satisfactory structure for the measure. Low internal reliability was noted for some subscales, particularly Peer Relationship and Conduct Problems.

Date Reviewed: February 2015 (Originally reviewed in June 2009)

3. Cognitive Abilities

Cognitive abilities are brain-based skills needed to carry out any activity, simple or complex: the ability to comprehend, reason, visualize, solve problems, and so on. How things are perceived, strategic thinking, and decision-making abilities also influence people's behavior in a given situation. Cognitive ability can be divided into two categories:

Fluid intelligence is the ability to perceive, absorb, and retain new information in order to tackle novel circumstances.

Crystallized intelligence is the capacity to retrieve and use knowledge accumulated over a lifetime, leveraging that acquired knowledge to perform certain tasks.

You can read our Guide to Cognitive Assessments for a detailed explanation.

Apart from the bright side, the dark side, and cognitive abilities, other factors also affect human behavior. These are grouped under X-factors.