Clinical Research and Outreach Services: Assessment Tools

Cognitive Assessment


Wechsler Series

The Wechsler tests are all norm-referenced and individually administered. Each consists of a series of subtests (about 12) divided into Verbal and Non-Verbal categories. One obtains a Full Scale IQ score, as well as a Verbal and a Non-Verbal IQ score. Additionally, each subscale is scored separately and can be used for research purposes. Subtests tap a variety of cognitive skills such as nonverbal sequential reasoning, verbal abstraction, nonverbal holistic (visual spatial) problem solving, general knowledge acquisition, arithmetic reasoning, grapho-motor abilities/planning, attention/memory, and social knowledge.

Administration Time: The whole battery for a Wechsler test takes about an hour and a half; any given subtest might take, on average, 10 minutes.


This test is a comprehensive achievement test that has the same psychometrics as the Wechsler intelligence measures above, so it allows a natural comparison between a student’s IQ and achievement. It has 8 subtests; 3 of these (Basic Reading, Mathematics Reasoning, and Spelling) can be used as a brief Screener.

Administration Time: The Comprehensive battery takes about 30-50 minutes for grades K-2 and 55-60 minutes for grades 3-12, excluding the time for the written expression subtest. The Brief Screener takes about a third of this time.

The Wechsler series tests have a long track record and have very high reliability and validity.


The Stanford-Binet IV is another comprehensive general IQ/problem solving test that has a lower floor and a higher ceiling than the Wechsler series and is thus better suited for gifted children or youth with developmental delays. It has excellent reliability and validity (especially for composite scores); it takes about 60 minutes to administer. It consists of 15 subtests grouped into 4 general areas: Verbal reasoning, Quantitative reasoning, Abstract/visual reasoning, and Short-term memory. Screening batteries for giftedness and learning disabilities are subsumed under the SB-IV.

There is little evidence that the Kaufman can be substituted for more traditional measures of intelligence or achievement; however, the test is intended to be a general intelligence and achievement test drawn from the information processing theories (e.g., Das). Subscales include Simultaneous Processing, Sequential Processing, and Nonverbal Processing. It purports to do a better job assessing exceptional children (particularly those with LD) as well as minorities (esp. African American, Hispanic, and bilingual children). The K-ABC can also be geared to assess preschoolers. It is limited, of course, by the ceiling age of 12. It is reliable for composite scores but not for subtests. The validity evidence for achievement is unimpressive; validity for the intelligence portion is moderate to good.

This test incorporates developmental theory, and its subtests are predictive of skills that are necessary for later school success. Tasks are interesting and attractive for young children, and they are presented in a "rapport-establishing" sequence. The MSCA is divided into 5 categories: Verbal, Perceptual-Performance, Quantitative, Motor, and Memory. The motor scale (gross and fine) is a unique feature of this test, as most other intelligence scales for children do not measure this directly. Although well-standardized and psychometrically sound, it is potentially outmoded because it has not been restandardized for 20 years.

This individually administered, norm-referenced achievement test is a very short, quick achievement screener. It covers the basic academic skills necessary in reading, spelling, and arithmetic. It has adequate reliability and good construct validity, but questionable content validity.

The K-TEA comes in a brief form (BF) or comprehensive form (CF), taking 10-40 or 20-75 minutes, respectively, depending on grade. It is individually administered and norm-referenced. The content validity of the BF is less well defined than that of the CF, which is good. The BF assesses Reading, Mathematics, and Spelling. As with any achievement test, the most critical concern is content validity. Users must be sensitive to the correspondence of the K-TEA’s content with a student’s curriculum. Validity, in general, appears adequate otherwise. The composite scores have good internal consistency, but there is insufficient evidence to judge whether their stability is adequate. The K-TEA has a low ceiling for gifted children and does not have kindergarten norms. It has a limited floor for younger children with academic delays.

This norm-referenced, individually administered achievement test covers 6 areas: Mathematics, Reading Recognition, Reading Comprehension, Spelling, General Information, and Written Expression. It has good reliability but little meaningful evidence on validity.

Individually administered; norm-referenced. Intended to assess intellectual and academic development. 21 cognitive subtests based on the Horn-Cattell theory of Fluid & Crystallized intelligence. Measures 8 of the 9 broad categories of this theory. 7 subtests comprise the standard battery: Memory for Sentences, Memory for Names, Visual Matching, Incomplete Words, Visual Closure, Picture Vocabulary, and Analysis-Synthesis. There are 14 Supplementary subtests and 14 achievement subtests (e.g., Passage Comprehension, Calculation, Applied Problems). Good reliability for the broad cognitive and broad achievement clusters. Adequate concurrent, but not construct, validity.

Individually administered; norm-referenced; administration time about 30 minutes (untimed test). There are 2 parallel forms of this test. This math achievement test covers 3 main areas: Basic Concepts (e.g., numeration, rational numbers, geometry); Operations (e.g., addition, subtraction, multiplication, division, mental computation); Applications (e.g., measurement, time & money, estimation, interpretation of data, problem solving). Reliability for Total Scores is excellent; area subtests fluctuate, esp. in the lower grades. Little validity data available for construct or concurrent validity; more evidence for adequate content validity, which is the most important for an achievement test.


Non-Verbal Intelligence/Cognitive/General Problem Solving Tests



Designed for those who require a language-free, motor-reduced, or culture-reduced test of abstract/figural problem solving. It tests "a small sliver" of intelligence. Administered individually in pantomime; the participant points to answers. This test is untimed and takes about 15 minutes to administer. Good reliability and validity, although many studies are from the TONI and extrapolated to the TONI-2.

This new revision of the Leiter is a nonverbal measure of intelligence requiring no speaking or writing from examiner or examinee. It claims to be "culture free". It is untimed. There are 20 subtests in these broad areas: Reasoning, Visualization (spatial), Memory, and Attention. Scores derived include a Full Scale IQ score, a brief IQ screen, a brief ADHD screen, and a brief gifted screen. Overall good reliability with some variation among subtests; good validity.


Infant Assessment


This norm-referenced, individually administered test assesses developmental functioning. There are 3 subscales: Mental, Motor, and Behavior. Norms appear representative in terms of race/ethnicity, geographic region, parental education, and sex. Internal consistency of the Mental Scale is adequate for making important decisions for children at about half of the ages; the consistency of the Motor Scale and the total Behavior Rating Scale is adequate for about one third of the ages. As should be expected for scales intended for use with this population, scores are moderately stable. Although the content of the BSID-2 appears comprehensive and appropriate, only limited evidence for criterion-related and construct validity is reported. However, the Bayley Scales is purported to be "by far the best measure of infant development and provides valuable information about patterns of early mental development" (Sattler, 1992).

This scale, designed by Brazelton (1973), is another useful procedure for evaluation of infants during the early months of development. Items on this scale range from the evaluation of neurological reflexes to the evaluation of alertness. Sameroff (1978) provides a detailed review of this scale.

No available information on this test at this time.

Neuropsychological Measures



Individual or group administration. The VMI is a set of 27 increasingly difficult geometric forms that the test taker copies with paper and pencil. The short VMI is intended for use with children ages 3-7. Two supplemental subtests are available: Visual Perception and Motor Coordination. The VMI has relatively high reliability and validity in comparison with other measures of perceptual-motor skills.

Part of the Halstead-Reitan battery for Children, which has limited norms at this time. Little is known about reliability and validity. Trailmaking is a test that taps into visual motor planning and sequencing. It takes only a few minutes to give.

A manual dexterity task that measures fine motor coordination. Gardner (1979) and Wilson et al. (1982) provide percentile norms. Takes about 2 minutes to administer. Norms are limited. Reliability and validity are satisfactory. A useful measure of fine motor coordination and laterality.

Employs auditory commands that require the child to manipulate tokens varying along 3 dimensions: color, shape, and size. Takes about 10-15 minutes to administer. Norms are limited. More information is needed about its reliability and validity. Helpful in identifying mild receptive disturbances in aphasic individuals.

Measures gross and fine motor skills with 8 subtests. Takes approximately 45-60 minutes to give. Norm group is excellent. Reliability is satisfactory for Battery Composite score, less satisfactory for Fine and Gross Composites, and unsatisfactory for individual subtests. More information is needed about its validity.

The WRAML is a new assessment instrument developed to assess the child’s ability to learn and memorize a variety of information. Useful in assessing learning disabilities, effects of head injury, and attention problems. The WRAML has 3 components: Verbal Memory; Visual Memory; and Learning. There is a Delayed Recall component so that both short and longer term memory are assessed across domains (e.g., visual, verbal). The WRAML is helpful in detecting less-apparent learning problems such as a dysfunction in auditory processing or difficulty in dual conceptual tracking. Adequate reliability and validity for a relatively new test.

Measures children’s verbal memory on both recognition and recall tasks; both immediate and delayed recall.

Measures visual-motor organization and visual memory; gives indication of various aspects of hemispheric functioning.

Behavioral Rating Scales/Childhood Symptomatology/Personality Measures


The Achenbach scales are the most widely used measures in research on child and adolescent emotional/behavioral functioning. The CBCL is a parental report measure that taps 2 main areas: Competence and Problem items. Within the Problem items, there are 2 subsections: Internalizing and Externalizing problems. The psychometrics of this test are excellent. The forms can be completed by parents in approximately 15 minutes.

Same as the CBCL but the teacher reports on the targeted child.

The Youth Self Report version of the CBCL, covering the same areas as the CBCL. Takes about 15 minutes; youth must have a 5th grade reading level or above. Good reliability; low to moderate construct validity.

The PIC is an objective, multiscale instrument completed by an adult informant, usually the child’s or adolescent’s mother. The PIC has 600 items and is similar in structure to the Minnesota Multiphasic Personality Inventory in that there are both validity scales and clinical scales, and it yields actuarially based interpretive systems for individual scales, scale patterns, and profile types. Not all 600 items are necessary to generate a profile; sets of 420, 280, or 131 items can be used. The longer version can require 1 to 2 hours for a parent to complete; the parent must have at least a 6th grade reading level. The PIC is useful in differentiating hyperactive, learning-disabled, and normal children as well as learning-disabled vs. behaviorally disordered students. Reliability and validity are adequate; however, the norms may be outdated.

Multirater, multicontext instrument with 3 Student Rating Scales: Home, School, and Peers. Peer rating form uses the Sociogram (peer perceptions of the target student). Good reliability, although low for Grades 1 & 2. Reasonable validity.

Narrow-band behavior checklist with 48 items that has five factors covering behavior problems: conduct problem, learning problem, psychosomatic, impulsive-hyperactive, and anxiety. Parent report; takes about 20 minutes. Norm group, reliability, and validity are adequate.

Teacher version of the above-referenced Conners instrument. 39 items with 6 factors covered. Takes about 15 minutes to complete. Norm group, reliability, and validity are adequate.

Measures depressive symptomatology in children. Good psychometrics; used often in research.

37-item, self-report measure covering four domains: Physiological Anxiety; Worry/Oversensitivity; Social Concerns/Concentration; and a Lie scale (validity scale). Good reliability and validity.

Projective technique using highly structured pictorial stimuli and requiring a child’s response in verbal form. The interpretation is based on content analysis of a rather qualitative nature. The TAT was originally designed for adults; however, the technique is used with children and teens. The TAT has 19 cards in black and white. Several different theory-driven scoring techniques exist; psychometrics have been difficult to obtain; clinical utility, however, continues.

The RAT-C comes closer to meeting psychometric standards of test construction and evaluation than do other techniques of this type. The RAT-C provides 2 overlapping sets of 16 stimulus cards, one for boys and one for girls. A supplementary set with pictures of Black children is also available but has not been normed. The pictures depict familiar interpersonal situations involving children in their relations with adults or other children. Clear and explicit guidelines permit fairly objective scoring of responses; norms are based on the responses of 200 teacher-named well-adjusted children.

TEMAS is the Spanish word for "themes" and the acronym for "tell me a story", an instrument specifically designed for the assessment of cognitive, affective, and personality characteristics of children. TEMAS uses 2 parallel sets of stimulus cards in full color, one for ethnic minority children and one for White children. The stimulus materials were carefully developed to facilitate verbal production and stimulate stories dealing with choices among conflicting goals, such as immediate versus delayed gratification. Although the ethnicity represented in the cards is highly praised, the psychometrics of the instrument, especially its test-retest reliability and internal consistency, have been repeatedly called into question. It does, however, have a clearly guided formal scoring system.

An individually administered, norm-referenced device intended to assess the adaptive and maladaptive behaviors of individuals. Reported by a caregiver. Norming is good. Reliability of the scales varies considerably. Validity data are adequate. Domains assessed include Communication, Daily Living Skills, Socialization, Motor Skills, and Maladaptive Behavior. Most often used for persons with Developmental Disabilities.
 

 * Tests with * are held in the Psychology and Human Development collection (accessible)