Although most children receive mental health services because some concern has been raised regarding their emotional and behavioral adjustment, these mental health services are provided in a variety of settings by a variety of professionals. Core evaluation and treatment services may be provided by educational or health care organizations through outpatient clinics, inpatient facilities, or residential care agencies. Such services may be supported by an annual institutional budget from which resources are allocated on some rational basis, or each service may be available only to the extent to which associated expenses can be reimbursed. The latter consideration is always of central importance in private practice settings that provide the majority of fee-based or insurance-reimbursed mental health care. Psychological assessment services may be routinely integrated into intake evaluation, treatment planning, and subsequent outcome review, or may be obtained on a referral basis.
Routine psychological assessment in child mental health settings focuses on the identification and quantification of symptom dimensions and problem behaviors and the collection of information relevant to the development of treatment strategies. In contrast, psychological assessment provided in response to referral may incorporate any of the varied testing methodologies appropriate for the understanding of youth. Of necessity, routine assessment is designed to be cost and time efficient, requiring relatively narrowly defined skills that are easily acquired. The information provided in such routine assessments must be easily understood and applied by the variety of mental health professionals who evaluate and treat children, adolescents, and their families. This research paper provides a detailed discussion of the forms of psychological assessment that can be either applied routinely or integrated into assessments designed to answer specific diagnostic inquiries.
Psychological assessment services requested by referral are usually provided by, or under the supervision of, doctoral-level psychologists with specialized training who are certified or licensed to provide mental health services independently. For example, the training necessary to provide assessment with projective techniques or neuropsychological assessment requires specific graduate or postgraduate coursework and considerable supervised clinical experience delivered within structured practica, internships, and postdoctoral fellowships. Referral for psychological assessment is often requested to achieve an effective differential diagnosis. Such referrals are made following the collection of a complicated and contradictory history obtained from parent and child interview or subsequent to completion of an ineffective course of treatment.
Surveys of senior doctoral psychologists who maintain specific professional memberships often associated with the use of psychological testing or who conduct research using psychological tests provide some insight regarding valued tests and test-related procedures. Two of these surveys, conducted in 1990 and 1998, have focused on the provision of assessment services to adolescents (Archer, Maruish, Imhof, & Piotrowski, 1991; Archer & Newsom, 2000). The first of these surveys noted the prominence of the Wechsler intelligence scales, Rorschach Inkblot Method, Bender-Gestalt Test, Thematic Apperception Test, Sentence Completion Test, and Minnesota Multiphasic Personality Inventory, often (84%) also administered in a standard battery. The most recent of these surveys suggests the continuing prominence of all but the Bender-Gestalt, the growing use of parent and teacher rating scales, and the influence of managed care in discouraging the use of the most labor-intensive procedures in psychological testing. Unfortunately, such surveys identify neither the degree to which youth receiving mental health services are evaluated using psychological tests, nor the context of such applications (e.g., differential diagnosis, treatment planning, outcome assessment).
Mental Health Evaluation of Youth
This research paper focuses on the ways in which the evaluation of the adjustment of children and adolescents benefits from the use of objective rating scales and questionnaires. The routine use of such procedures within mental health settings supports the primary mission of evaluation and treatment, although other assessment techniques make a positive contribution in this regard. The evaluation of child and adolescent adjustment may benefit from the additional application of projective techniques (cf. Exner & Weiner, 1982; McArthur & Roberts, 1982), the evaluation of cognitive and academic dysfunction, and the assessment of family status. Such efforts are applied to gain a fuller understanding of a child’s adjustment, to arrive at an accurate differential diagnosis, to support treatment planning, and to monitor ongoing efforts. The case examples in Lachar and Gruber (2001) demonstrate the considerable contribution that projective techniques, psychoeducational evaluation, and neuropsychological assessment may make to the understanding of child adjustment, although any examination of the issues involved in such applications would merit a separate research paper. The overall goal of this research paper is to examine the routine application of objective methods in youth evaluation and treatment, and to discuss in some depth the issues related to such application.
Characteristics of Children and Adolescents
The evaluation of youth is substantially different from the comparable evaluation of adults by mental health professionals. Children function in uniform social contexts and consistently perform in standard contexts. That is, they are routinely observed by parents and other guardians, and once they reach the age of 5 years, spend a substantial amount of their lives in the classroom and pursuing school-related activities. Many behavioral expectations are related to a child’s specific age, and childhood is characterized by the attainment of a succession of developmental, academic, and social goals. Children and adolescents typically are not self-referred for mental health services, but are referred by parents and teachers. Problems in child adjustment are usually defined and identified by adults, not by the child. These adults are routinely involved in assessment and treatment, because treatment efforts routinely incorporate modification of the home and classroom environments (cf. LaGreca, Kuttler, & Stone, 2001).
Developmental and Motivational Issues
The Dimensions and Content of Adjustment Problems
The same or quite similar core presenting symptoms and problems may be associated with different diagnoses. Presenting problems such as inattention may suggest the presence of attention-deficit/hyperactivity disorder (ADHD), depression, anxiety, defective reality testing, a learning disability, or an acquired cognitive deficit. The same core disability may be demonstrated by quite different symptoms and behaviors at different ages, problem behaviors may change substantially with maturation, and problems may appear as a consequence of a prior untreated condition.
Young children are routinely characterized as unable to contribute meaningfully to the assessment process through the completion of self-report questionnaires (Ammerman & Hersen, 1993). Children under the age of 10 have not been reliable reporters of their own behaviors (Edelbrock, Costello, Dulcan, Kalas, & Conover, 1985). Relevant challenges to test construction and test application most certainly include normative limitations of a child’s age-appropriate language comprehension and reading skills. The task of self-evaluation and self-description may also be compromised by a fundamental developmental immaturity in the understanding of the principles of psychosocial adjustment. Hence, developmental immaturity represents a challenge to test validity because of the presence of inadequately developed introspective skills, such as a lack of appreciation for the relation between thoughts and feelings (cf. Flavell, Flavell, & Green, 2001).
In contrast to adults who request services from mental health professionals, children and adolescents seldom request such assistance. Children are unlikely to find the completion of a self-description of adjustment consistent with their expectations and are unlikely to experience completion of such a questionnaire as positive. It is quite reasonable to anticipate that most youth will not be motivated to contribute information that is useful in the diagnostic process. In the mental health setting, youth contribution to a formal test-based assessment process may be even more problematic. Youth are frequently referred to mental health professionals because they have been unwilling or unable to comply with the requests of adults. Such youth frequently also present with cognitive or academic disabilities that represent additional obstacles to the use of formal assessment techniques.
Unique Challenges of the Mental Health Setting
Assessment of Comorbid Conditions
Comorbidity, the simultaneous occurrence of two or more unrelated conditions, is very commonly observed in youth evaluated in mental health settings. This expectation of comorbid conditions should be seriously considered in the conduct of initial evaluations. Comprehensive multidimensional evaluations of adjustment, and therefore the use of tests that simultaneously assess multiple dimensions of problematic adjustment (or multiple unidimensional tests that provide comparable information) are employed by mental health professionals because of the nature of the problems of the youth they evaluate. Children and adolescents troubled by multiple disorders are most likely to be assessed by a mental health professional because the probability of referral of such a child is determined by the combined likelihood of the referral for each separate disorder (Caron & Rutter, 1991). This referral bias has been demonstrated by clinical interviews and in the separate application of standardized questionnaires completed by parents, teachers, and students (McConaughy & Achenbach, 1994). It is therefore always reasonable to assume that conditions other than the one presented as the primary problem are contributing to the referral process and influencing current adjustment; this possibility must be considered in the selection of assessment procedures. In addition, it is important to consider the developmental implications of current conditions, in that the presence of specific unresolved problems may increase the subsequent likelihood of other specific conditions.
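The referral-bias argument can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each disorder triggers a referral independently with its own probability; the function name and the probabilities are hypothetical and are not taken from Caron and Rutter (1991).

```python
def referral_probability(per_disorder_probs):
    """Probability that a child is referred at least once, assuming each
    disorder triggers referral independently (an illustrative assumption)."""
    p_not_referred = 1.0
    for p in per_disorder_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        # The child stays unreferred only if no disorder prompts a referral.
        p_not_referred *= 1.0 - p
    return 1.0 - p_not_referred


# Hypothetical referral rates for two separate disorders.
single = referral_probability([0.20])          # one disorder only
comorbid = referral_probability([0.20, 0.30])  # both disorders present
```

Under these made-up rates, a child with both conditions is more likely to reach a clinic than a child with either condition alone, which is one way comorbid cases can become overrepresented in referred samples.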
It is important to be alert to the possible presence of the various conditions that are frequently comorbid in youth seen by mental health professionals. Considerable effort has been applied in identifying frequent patterns of comorbidity. For example, as many as two thirds of elementary school children with ADHD referred for clinical evaluation have been found to have at least one other diagnosable psychiatric disorder. Measurement and treatment of these other disorders are often of comparable importance to the assessment and treatment of ADHD itself (Cantwell, 1996). Such comorbid conditions may delineate meaningful subgroups of children with ADHD (Biederman, Newcorn, & Sprich, 1991). Even studies of nonreferred samples demonstrate that the majority of children with ADHD also qualify for an additional disruptive behavior disorder (e.g., oppositional defiant disorder [ODD], conduct disorder [CD]). These patterns of comorbidity are more common in boys than girls, are associated with increased severity and persistence of symptoms, and have negative implications for future family and societal adjustment (Jensen, Martin, & Cantwell, 1997). Internalizing disorders (anxiety, depression) are frequently diagnosed in children with ADHD; this pattern of problems appears to have important implications for treatment effectiveness. The presence of comorbid internalizing symptoms decreases the likelihood of positive response to stimulant medications (cf. Voelker, Lachar, & Gdowski, 1983) and suggests the need to consider alternative treatment with antidepressants. Jensen et al. (1997) noted that underachievement, Tourette’s syndrome, bipolar disorder, and a variety of medical conditions should also be considered as possibly comorbid when ADHD has been established as a current diagnosis (see also Pliszka, 1998).
Conduct disorder and ODD demonstrate substantial comorbidity in epidemiological studies and obtain rates of comorbidity in excess of 90% in referred samples. Some authors have even considered these two diagnostic categories not to be independent phenomena, but points on a continuum, perhaps representing variation in developmental stage and symptom severity (cf. Loeber, Lahey, & Thomas, 1991; Nottelmann & Jensen, 1995). The majority of referred children with CD or ODD also meet the diagnostic criteria for ADHD. Comorbid internalizing conditions are less frequent, although gender may play a role. Girls are more likely than boys to have a comorbid internalizing condition at any age. The co-occurrence of depression is more likely in preadolescence for boys, and such comorbidity in girls increases significantly with age. Indeed, the majority of referred youth (except for adolescent boys) with CD have one or more additional diagnoses (Offord, Boyle, & Racine, 1991). Comorbid anxiety may be associated with fewer CD symptoms of aggression, whereas comorbid depression is associated with increased risk for suicidal behavior (Loeber & Keenan, 1994). Conduct disorder is also often associated with substantial academic underachievement (Hinshaw, Lahey, & Hart, 1993). The conjoint presence of CD and depression may represent an even greater risk for a variety of problems than is represented by each condition alone. These problems may include substance dependence, academic problems, problematic social competence and peer relationships, a predisposition not to experience positive emotions, treatment seeking, treatment resistance, and increased long-term negative implications (Marmorstein & Iacono, 2001).
The comorbidity of depression and anxiety is substantial in clinically referred youth (Brady & Kendall, 1992). Indeed, substantial evidence exists that anxiety and depression are part of a broader construct of emotional distress in children and adolescents (Finch, Lipovsky, & Casat, 1989; King, Ollendick, & Gullone, 1991). Anxiety may more frequently appear before depression, and their joint occurrence suggests a higher degree of disability. Many of these youth have a comorbid externalizing disorder. ADHD occurs with anxiety or depression 25 to 33% of the time, whereas CD or ODD is present at least 50% of the time (Nottelmann & Jensen, 1995).
Multidimensional inventories may be especially valuable in the assessment of children with a known disability. For example, comorbid conditions in youth classified as mentally retarded are typically underdiagnosed (Nanson & Gordon, 1999). This phenomenon is so prevalent that unique descriptive labels have been proposed. Diagnostic overshadowing is the tendency for clinicians to overlook additional psychiatric diagnoses once the presence of mental retardation has been established (Spengler, Strohmer, & Prout, 1990); masking is the process by which the clinical characteristics of a mental disorder are assumed instead to be features of developmental delay (cf. Pearson et al., 2000). Studies suggest comorbidity for various behavioral or psychiatric disorders of 30 to 60% for children with mental retardation (McLaren & Bryson, 1987).
Problem Intensity and Chronicity
Referral to a mental health professional, whether for hospitalization or residential care or for outpatient evaluation at a clinic or other tertiary referral center, assures the presence of a high proportion of difficult and complicated cases that will include high levels of comorbidity (Caron & Rutter, 1991). Such referrals often represent a pattern of maladjustment that does not remit over time and that is also resistant to primary corrective efforts in the home or the school, or through consultation with a pediatrician or family physician. An extended symptomatic course suggests the presence of conditions that are secondary to primary chronic problems (e.g., primary disruptive behavior contributes to peer rejection that results in social isolation that leads to dysphoria). Chronicity and intensity of current adjustment problems represent an assessment challenge to establish the historical sequence of problem emergence and the consequences of previous intervention efforts. Such a history may seduce a clinician into making significant diagnostic leaps of inference that may not be warranted. Such errors may be avoided through systematic use of a multidimensional instrument during the intake process. When current problems have a significant history, the significant adults who will participate in the assessment process (parents, teachers) are likely to bring with them a high degree of emotional commitment to problem resolution. Such informant intensity may decrease the clarity of their contribution as a questionnaire informant to the assessment.
The Referral Process
Youth generally come to mental health settings only because they are referred for specific services, although other evaluations may be conducted at school or in the physician’s office secondary to some routine, setting-specific observation or other data-gathering process. Some consideration of the referral process provides insight into the challenges inherent in assessments conducted by mental health professionals. Requests for mental health evaluation often originate with a request by a professional or from a setting that is distant from the mental health professional, allowing less than complete communication. The mental health professional or mental health service delivery agency cannot assume that the detail that accompanies the request for service is either sufficient or accurate. Rather, at least one adult has been motivated to initiate this referral and at least one or more focused concerns may be communicated to some degree.
In other instances the referral for an evaluation may come from a behavioral health managed care company and may represent only the information provided by a parent who has called the number on the back of an insurance card. In such instances the referral assures the clinician of some financial reimbursement for services rendered, but provides no independent meaningful clinical information. That is, the clinician must first document problem presence, then type, pattern, and severity. In such cases, the clinician must be especially vigilant regarding potential errors in focus, that is, assuming that specific behaviors represent one problem when they actually represent another (i.e., similar behaviors reflect dissimilar problems).
The Challenges of Managed Care
Maruish (2002), although focusing on mental health services for adults, provides a balanced discussion of the changes in psychometric practice that have accompanied behavioral health benefits management. Requests for psychological testing must be preauthorized if these services will be reimbursed. Approval of such requests will be most successful when psychological testing is proposed to support the development of a differential diagnosis and a plan of treatment. Collection of this information is consistent with an emphasis on the application of treatments with proven effectiveness (Roberts & Hurley, 1997). Psychological testing routinely applied without focus, or supporting such goals as the development of an understanding of “the underlying personality structure,” as well as administration of collections of tests that incorporate duplicative or overlapping procedures, are inconsistent with the goals and philosophy of managed care. This review process may reduce the use of psychological testing and limit more time-consuming procedures, while supporting the use of brief, easily scored measures and checklists (Piotrowski, Belter, & Keller, 1998).
In contrast, the objectives and general philosophy of managed care are consistent with the application of objective multidimensional measures in the evaluation and treatment of children and adolescents. These goals include the efficient and rapid definition of current problems, the development of an effective treatment program, the monitoring of such intervention, and the evaluation of treatment effectiveness. Of greatest efficiency will be the application of procedures that generate information that supports all of these goals. Objective ratings and questionnaires are time and cost effective, and provide information that can be easily assimilated by the significant adults in a child’s life, therapists with various training backgrounds, and the organizations that ultimately monitor and control mental health resources.
It is useful to contrast contemporary descriptions of effective diagnostic and psychological assessment procedures with the expectation of managed mental health care that the information necessary for accurate diagnosis and treatment planning can be obtained in a 1-hr clinical interview. Cantwell (1996) outlined the necessary diagnostic components in the evaluation of ADHD. These components are as follows: (a) a comprehensive interview with all parenting figures to review current symptoms and developmental, medical, school, family, social, and mental health history; (b) a developmentally appropriate interview with the child that incorporates screening for comorbid disorders; (c) a medical evaluation; (d) assessment of cognitive ability and academic achievement; (e) application of both broad-spectrum and more narrowly focused parent and teacher rating scales; and (f) other adjunct assessments such as speech and language assessment. Cordell (1998) described the range of psychological assessment services often requested by a child psychiatry service. She provided outlines of assessment protocols for preschoolers, preteens, and adolescents. Each of these protocols requires three to five sessions for a total of up to 6 hrs of patient contact.
The assessment methods that are the focus of this research paper may be applied to meet the goals of managed care. In routine (not crisis) evaluation, parents may be mailed a teacher rating form to be completed and returned before the intake interview. Parents may be asked to complete a questionnaire in a similar fashion. Completion of such rating forms not only provides valuable independent assessment of the child, but also represents a sample of positive parent behavior. This compliant behavior may predict an increased likelihood of parent attendance at the first scheduled appointment. This is an important consideration, because an acutely distressed parent may make an appointment for mental health services, yet not appear if the specific conditions that generated the distress resolve before the scheduled appointment.
When a parent completes a questionnaire to describe the child before the intake interview, this additional information can add an efficient focus to the topics subsequently discussed. Because of the central role of family and school in child treatment, the feedback to parents and teachers from these measures is usually accepted with little if any resistance. When these profiles are inconsistent with the global opinions that have motivated the mental health consultation, the presentation and discussion of such results may facilitate realignment of parent or teacher opinion and the development of an alliance with the therapist.
The Conduct of Assessment by Questionnaire and Rating Scale
Contemporary models of the assessment of psychiatric disorders in youth are, in contrast to the models proposed by managed care, likely to be comprehensive. For example, although early approaches to the behavioral assessment of CD focused on identifying the parenting skills deficits conceptualized as causative and therefore in need of primary remediation, the increased understanding of the developmental aspects of this disorder has substantially influenced assessment. McMahon (1987) noted:
As our knowledge of the multiple factors influencing the development, manifestation, and maintenance of conduct disorders has grown, it has become apparent that a proper assessment of the conduct disordered child must make use of the multiple methods (e.g., behavioral rating scales, direct observation, interviews) completed by multiple informants (parents, teachers, the children themselves) concerning the child’s behavior in multiple settings (e.g., home, school). Furthermore, it is essential that the familial and extra-familial contexts in which the conduct disordered child functions be assessed as well. (p. 246)
This assessment process is often described as sequential (Mash & Lee, 1993). The presence of specific problems is first established. Multidimensional inventories can make an efficient and effective contribution to this process. Each problem must be placed in its developmental and historical context, and assessed in relation to recent experiences. Such information is most efficiently gathered by a focused and tailored interview. Once a treatment plan is developed, its effectiveness should be monitored through additional assessment. Repetition of baseline assessment procedures or the use of more focused or narrowly defined questionnaires during and at the completion of treatment can be applied in the support of this process. Such efforts can support modification of ongoing treatment, quantify change at termination, and estimate stability of such improvement by follow-up survey.
Introducing a Family of Multidimensional, Multisource Measures
Personality Inventory for Children, Second Edition
First published in 1977, this questionnaire completed by parent or other guardian was completely revised in 2001. The Personality Inventory for Children has been described as “one of the earliest and remains among the most well known of parent rating scales. . . . the grandparent of many modern rating scales” (Kamphaus & Frick, 1996). The Personality Inventory for Children, Second Edition (PIC-2) is provided in two formats. The first format consists of a reusable 275-statement administration booklet and a separate answer sheet for the recording of parent responses to booklet statements. Various answer sheets can be scored by hand with templates, or the recorded responses (True-False) can be entered for processing into a personal computer; answer sheets may also be mailed or faxed to the test publisher for processing. A multiscale profile (the PIC-2 Behavioral Summary) interpreted using guidelines presented in the test manual (Lachar & Gruber, 2001) is obtained by completion of the first 96 items, which takes about 15 min. A second, similarly interpreted comprehensive profile (the Standard Format) and responses to a critical item list may be obtained by completing the entire administration booklet, which takes about 40 min or less. The second published format provides the 96 statements of the PIC-2 Behavioral Summary and a simple efficient method to generate and profile its 12 scores. The PIC-2 gender-specific T-score values are derived from a contemporary national sample of parent descriptions of youth 5 to 18 years of age. (A preschool version of the PIC for children 3 to 5 years of age is currently being developed.)
Table 11.1 lists the components of these two profiles and some of their associated psychometric characteristics. PIC-2 statements are written at a low- to mid-fourth-grade reading level and represent current and previous behaviors, feelings, accomplishments, and interactions, both common to and relatively infrequent among youth evaluated by mental health professionals. These statements reflect variations in both problem frequency and severity. PIC-2 adjustment scales were constructed using an iterative procedure. Potential scale items were first assigned to initial dimensions on the basis of previous scale structure or manifest statement content, whereas final item-scale assignment reflected a demonstrated strong and primary correlation with the dimension to which the item was finally assigned. The nine scales of the standard profile were then further refined with the assistance of factor analysis to construct 21 subscales of greater content homogeneity applied to facilitate scale interpretation. The PIC-2 Behavioral Summary profile presents eight core scales and four composites or combinations of these values designed to measure change in symptomatic status. Each of these core scales consists of 12 statements selected from the full-length standard form to support treatment planning and to measure behavioral change. Each short scale correlates .92 to .96 with its full-length equivalent.
A significant element of the PIC-2 Standard Format profile is the provision of three response validity scales. The first of these scales (Inconsistency) consists of 35 pairs of statements. Because each pair of statements is highly correlated, two of the four possible pairs of responses (True-True and False-False, or True-False and False-True) are classified as inconsistent and their presence adds a unit weight to the Inconsistency scale raw score that can range from 0 to 35. Review of several examples of inconsistent response pairs clarifies this concept; for example, “My child has many friends. (True)/My child has very few friends. (True)”; “My child often disobeys me. (True)/My child often breaks the rules. (False).” An elevated T score on this scale (T > 69) suggests that the parent who completed the PIC-2 failed to attend sufficiently to, or to achieve adequate comprehension of, PIC-2 statement content.
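The unit-weight scoring rule for an inconsistency scale of this kind can be sketched in a few lines. This is an illustration only, not the published PIC-2 scoring key: the item identifiers, the `same_direction` flag, and the two example pairs (paraphrased from the statements quoted above) are assumptions made for the sketch.

```python
from typing import Dict, List, Tuple

# Each pair is (item_a, item_b, same_direction). Hypothetical structure:
#   same_direction=True  -> the items say much the same thing, so
#                           differing answers count as inconsistent
#   same_direction=False -> the items are opposites, so identical
#                           answers count as inconsistent
ItemPair = Tuple[str, str, bool]


def inconsistency_raw_score(responses: Dict[str, bool],
                            pairs: List[ItemPair]) -> int:
    """Add one point (unit weight) for every inconsistently answered pair."""
    score = 0
    for item_a, item_b, same_direction in pairs:
        answers_match = responses[item_a] == responses[item_b]
        if same_direction != answers_match:
            score += 1
    return score


# Hypothetical item ids standing in for the two quoted example pairs.
example_pairs = [
    ("many_friends", "few_friends", False),          # opposite content
    ("disobeys_parent", "breaks_rules", True),       # similar content
]
example_responses = {
    "many_friends": True, "few_friends": True,       # True/True on opposites
    "disobeys_parent": True, "breaks_rules": False,  # True/False on similar
}
raw = inconsistency_raw_score(example_responses, example_pairs)
```

With 35 such pairs the raw score would range from 0 to 35, matching the range described in the text; conversion of that raw score to a T score requires the published norms and is not attempted here.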
The second validity scale, Dissimulation, evaluates the likelihood that the responses to PIC-2 statements represent an exaggeration of current problems in adjustment, or the description of nonexistent problems and symptoms. These scale items were identified through an analytic process in which three samples were compared: a normative sample, a referred sample, and a sample in which parents were asked to describe their asymptomatic children as if they were in need of mental health services (i.e., a malingering sample). The average endorsement rate for these 35 items was 6.3% in normative, 15.3% in referred, and 54.5% in directed malingered protocols. Elevation of Dissimulation may reflect the presence of informant distress that may distort youth description.
The third validity scale, Defensiveness, includes 12 descriptions of infrequent or highly improbable positive attributes (“My child always does his/her homework on time. [True]”) and 12 statements that represent the denial of common child behaviors and problems (“My child has some bad habits. [False]”). Scale values above 59T suggest that significant problems may be minimized or denied on the PIC-2 profile. The PIC-2 manual provides interpretive guidelines for seven patterns of these three scales that classified virtually all cases (99.8%) in a study of 6,370 protocols.
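As a rough illustration of how the two cutoffs stated above might be applied at intake, the sketch below flags a profile from its validity-scale T scores. The function name and message wording are hypothetical; no Dissimulation cutoff is given in this text, so that scale is deliberately left out, and the seven interpretive patterns in the PIC-2 manual are not reproduced.

```python
def validity_flags(inconsistency_t: float, defensiveness_t: float) -> list:
    """Return cautionary flags based only on the cutoffs quoted in the text:
    Inconsistency T > 69 and Defensiveness above 59T."""
    flags = []
    if inconsistency_t > 69:
        flags.append("Inconsistency elevated: possible inattention to or "
                     "poor comprehension of statement content")
    if defensiveness_t > 59:
        flags.append("Defensiveness elevated: significant problems may be "
                     "minimized or denied")
    return flags


# Hypothetical T scores for one completed protocol.
flags = validity_flags(inconsistency_t=72, defensiveness_t=48)
```

An empty list means only that neither quoted cutoff was exceeded; it does not establish that the protocol is valid.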
Personality Inventory for Youth
The Personality Inventory for Youth (PIY) and the PIC-2 are closely related in that the majority of PIY items were derived from rewriting content-appropriate PIC items into a first-person format. As demonstrated in Table 11.2, the PIY profile is very similar to the PIC-2 Standard Format profile. PIY scales were derived in an iterative fashion with 270 statements assigned to one of nine clinical scales and to three validity response scales (Inconsistency, Dissimulation, Defensiveness). As in the PIC-2, each scale is further divided into two or three more homogenous subscales to facilitate interpretation. PIY materials include a reusable administration booklet and a separate answer sheet that can be scored by hand with templates, processed by personal computer, or mailed to the test publisher to obtain a narrative interpretive report, profile, and responses to a critical item list. PIY items were intentionally written at a low readability level, and a low- to mid-fourth-grade reading comprehension level is adequate for understanding and responding to the PIY statements. When students have at least an age-9 working vocabulary, but do not have a comparable level of reading ability, or when younger students have limited ability to attend and concentrate, an audiotape recording of the PIY items is available and can be completed in less than 1 hr. Scale raw scores are converted to T scores using contemporary gender-specific norms from students in Grades 4 through 12, representing ages 9 through 19 (Lachar & Gruber, 1995).
Student Behavior Survey
This teacher rating form was developed through reviewing established teacher rating scales and by writing new statements that focused on content appropriate to teacher observation (Lachar, Wingenfeld, Kline, & Gruber, 2000). Unlike ratings that can be scored on parent or teacher norms (Naglieri, LeBuffe, & Pfeiffer, 1994), the Student Behavior Survey (SBS) items demonstrate a specific school focus. Fifty-eight of its 102 items specifically refer to in-class or in-school behaviors and judgments that can be rated only by school staff (Wingenfeld, Lachar, Gruber, & Kline, 1998). SBS items provide a profile of 14 scales that assess student academic status and work habits, social skills, parental participation in the educational process, and problems such as aggressive or atypical behavior and emotional stress (see Table 11.3). Norms that generate linear T scores are gender specific and derived from two age groups: 5 to 11 and 12 to 18 years.
SBS items are presented on one two-sided form. The rating process takes 15 min or less. Scoring of scales and completion of a profile are straightforward clerical processes that take only a couple of minutes. The SBS consists of two major sections. The first section, Academic Resources, includes four scales that address positive aspects of school adjustment, whereas the second section, Adjustment Problems, generates seven scales that measure various dimensions of problematic adjustment. Unlike the PIC-2 and PIY statements, which are completed with a True or False response, SBS items are mainly rated on a 4-point frequency scale. Three additional disruptive behavior scales each consist of 16 items nominated as representing phenomena consistent with the characteristics associated with one of three major Diagnostic and Statistical Manual, Fourth Edition (DSM-IV) disruptive disorder diagnoses: ADHD, combined type; ODD; and CD (Pisecco et al., 1999).
This author continues to champion the application of objective multidimensional questionnaires (Lachar, 1993, 1998) because there is no reasonable alternative to their use for baseline evaluation of children seen in mental health settings. Such questionnaires employ consistent stimulus and response demands, measure a variety of useful dimensions, and generate a profile of scores standardized using the same normative reference. The clinician may therefore reasonably assume that differences obtained among dimensions reflect variation in content rather than some difference in technical or stylistic characteristic between independently constructed unidimensional measures (e.g., true-false vs. multiple-choice format, application of regional vs. national norms, or statement sets that require different minimum reading requirements). In addition, it is more likely that interpretive materials will be provided in an integrated fashion and the clinician need not select or accumulate information from a variety of sources for each profile dimension.
Selection of a multidimensional instrument that documents problem presence and absence demonstrates that the clinician is sensitive to the challenges inherent in the referral process and the likelihood of comorbid conditions, as previously discussed. This action also demonstrates that the clinician understands that the accurate assessment of a variety of child and family characteristics that are independent of diagnosis may yet be relevant to treatment design and implementation. For example, the PIY FAM1 subscale (Parent-Child Conflict) may be applied to determine whether a child’s parents should be considered a treatment resource or a source of current conflict. Similarly, the PIC-2 and PIY WDL1 subscale (Social Introversion) may be applied to predict whether an adolescent will easily develop rapport with his or her therapist, or whether this process will be the first therapeutic objective.
The collection of standardized observations from different informants is quite natural in the evaluation of children and adolescents. Application of such an approach has inherent strengths, yet presents the clinician with several challenges. Considering parents or other guardians, teachers or school counselors, and the students themselves as three distinct classes of informant, each brings unique strengths to the assessment process. Significant adults in a child’s life are in a unique position to report on behaviors that they, not the child, find problematic. On the other hand, youth are in a unique position to report on their thoughts and feelings. Adult ratings on these dimensions must of necessity reflect, or be inferred from, child language and behavior. Parents are in a unique position to describe a child’s development and history as well as observations that are unique to the home. Teachers observe students in an environment that allows for direct comparisons with same-age classmates as well as a focus on cognitive and behavioral characteristics prerequisite for success in the classroom and the acquisition of knowledge. Collection of independent parent and teacher ratings also contributes to comprehensive assessment by determining classes of behaviors that are unique to a given setting or that generalize across settings (Mash & Terdal, 1997).
Studies suggest that parents and teachers may be the most attuned to a child’s behaviors that they find to be disruptive (cf. Loeber & Schmaling, 1985), but may underreport the presence of internalizing disorders (Cantwell, 1996). Symptoms and behaviors that reflect the presence of depression may be more frequently endorsed in questionnaire responses and in standardized interviews by children than by their mothers (cf. Barrett et al., 1991; Moretti, Fine, Haley, & Marriage, 1985). In normative studies, mothers endorse more problems than their spouses or the child’s teacher (cf. Abidin, 1995; Duhig, Renk, Epstein, & Phares, 2000; Goyette, Conners, & Ulrich, 1978). Perhaps measured parent agreement reflects the amount of time that a father spends with his child (Fitzgerald, Zucker, Maguin, & Reider, 1994). Teacher ratings have separated ADHD subgroups in some studies (Burns, Walsh, Owen, & Snell, 1997) but not in others (Crystal, Ostrander, Chen, & August, 2001). Perhaps this inconsistency demonstrates the complexity of drawing generalizations from one or even a series of studies. The ultimate evaluation of this diagnostic process must consider the dimension assessed, the observer or informant, the specific measure applied, the patient studied, and the setting of the evaluation.
An influential meta-analysis by Achenbach, McConaughy, and Howell (1987) demonstrated that poor agreement has been historically obtained on questionnaires or rating scales among parents, teachers, and students, although relatively greater agreement among sources was obtained for descriptions of externalizing behaviors. One source of informant disagreement between comparably labeled questionnaire dimensions may be revealed by the direct comparison of scale content. Scales similarly named may not incorporate the same content, whereas scales with different titles may correlate because of parallel content. The application of standardized interviews often resolves this issue when the questions asked and the criteria for evaluating responses obtained are consistent across informants. When standardized interviews are independently conducted with parents and with children, more agreement is obtained for visible behaviors and when the interviewed children are older (Lachar & Gruber, 1993).
Informant agreement and the investigation of comparative utility of classes of informants continue to be a focus of considerable effort (cf. Youngstrom, Loeber, & Stouthamer-Loeber, 2000). The opinions of mental health professionals and parents as to the relative merits of these sources of information have been surveyed (Loeber, Green, & Lahey, 1990; Phares, 1997). Indeed, even parents and their adolescent children have been asked to suggest the reasons for their disagreements. One identified causative factor was the deliberate concealment of specific behaviors by youth from their parents (Bidaut-Russell et al., 1995). Considering that youth seldom refer themselves for mental health services, routine assessment of their motivation to provide full disclosure would seem prudent.
The parent-completed Child Behavior Checklist (CBCL; Achenbach, 1991a) and student-completed Youth Self-Report (YSR; Achenbach, 1991b), as symptom checklists with parallel content and derived dimensions, have facilitated the direct comparison of these two sources of diagnostic information. The study by Handwerk, Larzelere, Soper, and Friman (1999) is at least the twenty-first such published comparison, joining 10 other studies of samples of children referred for evaluation or treatment. These studies of referred youth have consistently demonstrated that the CBCL provides more evidence of student maladjustment than does the YSR. In contrast, 9 of the 10 comparable studies of nonreferred children (classroom-based or epidemiological surveys) demonstrated the opposite relationship: The YSR documented more problems in adjustment than did the CBCL. One possible explanation for these findings is that children referred for evaluation often demonstrate a defensive response set, whereas nonreferred children do not (Lachar, 1998).
Because the YSR does not incorporate response validity scales, a recent study of the effect of defensiveness on YSR profiles of inpatients applied the PIY Defensiveness scale to assign YSR profiles to defensive and nondefensive groups (see Wrobel et al., 1999, for studies of this scale). The substantial influence of measured defensiveness was demonstrated for five of eight narrow-band and all three summary measures of the YSR. For example, only 10% of defensive YSR protocols obtained an elevated (>63T) Total Problems score, whereas 45% of nondefensive YSR protocols obtained a similarly elevated Total Problems score (Lachar, Morgan, Espadas, & Schomer, 2000). The magnitude of this difference was comparable to the YSR versus CBCL discrepancy obtained by Handwerk et al. (1999; i.e., 28% of YSR vs. 74% of CBCL Total Problems scores were comparably elevated). On the other hand, youth may reveal specific problems on a questionnaire that they denied during a clinical or structured interview.
Clinical Issues in Application
Priority of Informant Selection
When different informants are available, who should participate in the assessment process, and what priority should be assigned to each potential informant? It makes a great deal of sense first to call upon the person who expresses initial or primary concern regarding child adjustment, whether this be a guardian, a teacher, or the student. This person will be the most eager to participate in the systematic quantification of problem behaviors and other symptoms of poor adjustment. The nature of the problems and the unique dimensions assessed by certain informant-specific scales may also influence the selection process. If the teacher has not referred the child, report of classroom adjustment should also be obtained when the presence of disruptive behavior is of concern, or when academic achievement is one focus of assessment. In these cases, such information may document the degree to which problematic behavior is situation specific and the degree to which academic problems either accompany other problems or may result from inadequate motivation. When an intervention is to be planned, all proposed participants should be involved in the assessment process.
Disagreements Among Informants
Even estimates of considerable informant agreement derived from study samples are not easily applied as the clinician processes the results of one evaluation at a time. Although the clinician may be reassured when all sources of information converge and are consistent in the conclusions drawn, resolving inconsistencies among informants often provides information that is important to the diagnostic process or to treatment planning. Certain behaviors may be situation specific or certain informants may provide inaccurate descriptions that have been compromised by denial, exaggeration, or some other inadequate response. Disagreements among family members can be especially important in the planning and conduct of treatment. Parents may not agree about the presence or the nature of the problems that affect their child, and a youth may be unaware of the effect that his or her behavior has on others or may be unwilling to admit to having problems. In such cases, early therapeutic efforts must focus on such discrepancies in order to facilitate progress.
Multidimensional Versus Focused Assessment
Adjustment questionnaires vary in format from those that focus on the elements of one symptom dimension or diagnosis (e.g., depression, ADHD) to more comprehensive questionnaires. The most articulated of these instruments rate current and past phenomena to measure a broad variety of symptoms and behaviors, such as externalizing symptoms or disruptive behaviors, internalizing symptoms of depression and anxiety, and dimensions of social and peer adjustment. These questionnaires may also provide estimates of cognitive, academic, and adaptive adjustment as well as dimensions of family function that may be associated with problems in child adjustment and treatment efficacy. Considering the unique challenges characteristic of evaluation in mental health settings discussed earlier, it is thoroughly justified that every intake or baseline assessment should employ a multidimensional instrument.
Questionnaires selected to support the planning and monitoring of interventions and to assess treatment effectiveness must take into account a different set of considerations. Response to scale content must be able to represent behavioral change, and scale format should facilitate application to the individual and summary to groups of comparable children similarly treated. Completion of such a scale should represent an effort that allows repeated administration, and the scale selected must measure the specific behaviors and symptoms that are the focus of treatment. Treatment of a child with a single focal problem may require the assessment of only this one dimension. In such cases, a brief depression or articulated ADHD questionnaire may be appropriate. If applied within a specialty clinic, similar cases can be accumulated and summarized with the same measure. Application of such scales to the typical child treated by mental health professionals is unlikely to capture all dimensions relevant to treatment.
Selection of Psychological Tests
Evaluating Scale Performance
Consult Published Resources
Although clearly articulated guidelines have been offered (cf. Newman, Ciarlo, & Carpenter, 1999), selection of optimal objective measures for either a specific or a routine assessment application may not be an easy process. An expanded variety of choices has become available in recent years and the demonstration of their value is an ongoing effort. Manuals for published tests vary in the amount of detail that they provide. The reader cannot assume that test manuals provide comprehensive reviews of test performance, or even offer adequate guidelines for application. Because of the growing use of such questionnaires, guidance may be gained from graduate-level textbooks (cf. Kamphaus & Frick, 2002; Merrell, 1994) and from monographs designed to review a variety of specific measures (cf. Maruish, 1999). An introduction to more established measures, such as the Minnesota Multiphasic Personality Inventory (MMPI) adapted for adolescents (MMPI-A; Butcher et al., 1992), can be obtained by reference to chapters and books (e.g., Archer, 1992, 1999; Graham, 2000).
Estimate of Technical Performance: Reliability
Test performance is judged by the adequacy of demonstrated reliability and validity. It should be emphasized from the outset that reliability and validity are not characteristics that reside in a test, but describe a specific test application (e.g., assessment of depression in hospitalized adolescents). A number of statistical techniques are applied in the evaluation of scales of adjustment that were first developed in the study of cognitive ability and academic achievement. The generalizability of these technical characteristics may be less than ideal in the evaluation of psychopathology because the underlying assumptions made may not be achieved.
The core of the concept of reliability is performance consistency; the classical model estimates the degree to which an obtained scale score represents the true phenomenon, rather than some source of error (Gliner, Morgan, & Harmon, 2001). At the item level, reliability measures internal consistency of a scale, that is, the degree to which scale item responses agree. Because the calculation of internal consistency requires only one set of responses from any sample, this estimate is easily obtained. Unlike an achievement subscale in which all items correlate with each other because they are supposed to represent a homogenous dimension, the internal consistency of adjustment measures will vary by the method used to assign items to scales. Scales developed by the identification of items that meet a nontest standard (external approach) will demonstrate less internal consistency than will scales developed in a manner that takes the content or the relation between items into account (inductive or deductive approach; Burisch, 1984). An example is provided by comparison of the two major sets of scales for the MMPI-A (Butcher et al., 1992). Of the 10 profile scales constructed by empirical keying, 6 obtained estimates of internal consistency below 0.70 in a sample of referred adolescent boys. In a second set of 15 scales constructed with primary concern for manifest content, only one scale obtained an estimate below 0.70 using the same sample. Internal consistency may also vary with the homogeneity of the adjustment dimension being measured, the items assigned to the dimension, and the scale length or range of scores studied, including the influence of multiple scoring formats.
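As an illustration of how an internal consistency estimate is obtained from a single set of responses, the following minimal sketch computes coefficient alpha, one widely used estimate. The choice of coefficient alpha is an assumption made here for illustration; the text does not state which estimate was applied to the MMPI-A scales. The function name and the response matrix are hypothetical.

```python
from statistics import pvariance


def coefficient_alpha(responses):
    """Coefficient alpha for a scale.

    responses: list of respondents, each a list of item scores
    (all respondents answer the same items in the same order).
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Variance of each item across respondents.
    item_vars = [pvariance([r[i] for r in responses]) for i in range(k)]
    # Variance of the total scale score across respondents.
    total_var = pvariance([sum(r) for r in responses])
    if total_var == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical True (1) / False (0) responses of five respondents to four items.
example = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]
print(f"alpha = {coefficient_alpha(example):.2f}")
```

Because the estimate depends only on how strongly the item responses covary, a scale assembled by empirical keying from heterogeneous items would be expected to yield a lower value than a scale assembled for homogeneous content, which is the pattern described above.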
Scale reliability is usually estimated by comparison of repeated administrations. It is important to demonstrate stability of scales if they will be applied in the study of an intervention. Most investigators use a brief interval (e.g., 7–14 days) between measure administrations. The assumption is made that no change will occur in such time. It has been our experience, however, with both the PIY and PIC-2 that small reductions are obtained on several scales at the retest, whereas the Defensiveness scale T score increases by a comparable degree on retest. In some clinical settings, such as an acute inpatient unit, it would be impossible to calculate test-retest reliability estimates in which an underlying change would not be expected. In such situations, interrater comparisons, when feasible, may be more appropriate. In this design it is assumed that each rater has had comparable experience with the youth to be rated and that any differences obtained would therefore represent a source of error across raters. Two clinicians could easily participate in the conduct of the same interview and then independently complete a symptom rating (cf. Lachar et al., 2001). However, interrater comparisons of mothers to fathers, or of pairs of teachers, assume that each rater has had comparable experience with the youth—such an assumption is seldom met.
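The distinction between score stability and a systematic shift on retest can be made concrete with a short sketch. It assumes that stability is summarized by a Pearson correlation between the two administrations, which is a common convention but is not specified in the text; the function names and T-score values are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def retest_summary(first, second):
    """Stability coefficient plus the average change from test to retest."""
    return {
        "r": pearson_r(first, second),
        "mean_change": mean(second) - mean(first),
    }


# Hypothetical T scores for four youths on one scale, about a week apart.
first_admin = [60, 70, 55, 65]
second_admin = [58, 68, 53, 63]
print(retest_summary(first_admin, second_admin))
```

In this constructed example every score drops by the same small amount, so the correlation remains high even though the mean has shifted. A stability coefficient alone would therefore not reveal the kind of small retest reduction described for the PIY and PIC-2.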
Estimate of Technical Performance: Validity
Of major importance is the demonstration of scale validity for a specific purpose. A valid scale measures what it was intended to measure (Morgan, Gliner, & Harmon, 2001). Validity may be demonstrated when a scale’s performance is consistent with expectations (construct validity) or predicts external ratings or scores (criterion validity). The foundation for any scale is content validity, that is, the extent to which the scale represents the relevant content universe for each dimension. Test manuals should demonstrate that items belong on the scales on which they have been placed and that scales correlate with each other in an expected fashion. In addition, substantial correlations should be obtained between the scales on a given questionnaire and similar measures of demonstrated validity completed by the same and different raters. Valid scales of adjustment should separate meaningful groups (discriminant validity) and demonstrate an ability to assign cases into meaningful categories.
Examples of such demonstrations of scale validity are provided in the SBS, PIY, and PIC-2 manuals. When normative and clinically and educationally referred samples were compared on the 14 SBS scales, 10 obtained a difference that represented a large effect, whereas 3 obtained a medium effect. When the SBS items were correlated with the 11 primary academic resources and adjustment problems scales in a sample of 1,315 referred students, 99 of 102 items obtained a substantial and primary correlation with the scale on which it was placed. These 11 nonoverlapping scales formed three clearly interpretable factors that represented 71% of the common variance: externalization, internalization, and academic performance. The SBS scales were correlated with six clinical rating dimensions (n = 129), with the scales and subscales of the PIC-2 in referred (n = 521) and normative (n = 1,199) samples, and with the scales and subscales of the PIY in a referred (n = 182) sample. The SBS scales were also correlated with the four scales of the Conners’ Teacher Ratings Scale, Short Form, in 226 learning disabled students and in 66 students nominated by their elementary school teachers as having most challenged their teaching skills over the previous school year. SBS scale discriminant validity was also demonstrated by comparison of samples defined by the Conners’ Hyperactivity Index. Similar comparisons were also conducted across student samples that had been classified as intellectually impaired (n = 69), emotionally impaired (n = 170), or learning disabled (n = 281; Lachar, Wingenfeld, et al., 2000).
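The terms large effect and medium effect refer to the size of a standardized difference between groups. The sketch below shows one way such a difference might be computed and labeled. It assumes Cohen's d with a pooled standard deviation and the conventional cutoffs of 0.2, 0.5, and 0.8; the manuals may have used a different index or different cutoffs, and the scores shown are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def label_effect(d):
    """Conventional verbal label (assumed cutoffs: 0.2, 0.5, 0.8)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Hypothetical T scores on one scale for referred and normative students.
referred = [60, 70, 80]
normative = [40, 50, 60]
d = cohens_d(referred, normative)
print(f"d = {d:.2f} ({label_effect(d)})")
```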
Estimates of PIY validity were obtained through the correlations of PIY scales and subscales with MMPI clinical and content scales (n = 152). The scales of 79 PIY protocols completed during clinical evaluation were correlated with several other self-report scales and questionnaires: Social Support, Adolescent Hassles, State-Trait Anxiety, Reynolds Adolescent Depression, Sensation-Seeking scales, State-Trait Anger scales, and the scales of the Personal Experience Inventory. PIY scores were also correlated with adjective checklist items in 71 college freshmen and chart-derived symptom dimensions in 86 adolescents hospitalized for psychiatric evaluation and treatment (Lachar & Gruber, 1995).
When 2,306 normative and 1,551 referred PIC-2 protocols were compared, the differences on the nine adjustment scales represented a large effect for six scales and a moderate effect for the remaining scales. For the PIC-2 subscales, these differences represented at least a moderate effect for 19 of these 21 subscales. Comparable analysis for the PIC-2 Behavioral Summary demonstrated that these differences were similarly robust for all of its 12 dimensions. Factor analysis of the PIC-2 subscales resulted in five dimensions that accounted for 71% of the common variance: Externalizing Symptoms, Internalizing Symptoms, Cognitive Status, Social Adjustment, and Family Dysfunction. Comparable analysis of the eight narrow-band scales of the PIC-2 Behavioral Summary extracted two dimensions in both referred and standardization protocols: Externalizing and Internalizing. Criterion validity was demonstrated by correlations between PIC-2 values and six clinician rating dimensions (n = 888), the 14 scales of the teacher-rated SBS (n = 520), and the 24 subscales of the self-report PIY (n = 588). In addition, the PIC-2 manual provides evidence of discriminant validity by comparing PIC-2 values across 11 DSM-IV diagnosis-based groups (n = 754; Lachar & Gruber, 2001).
Interpretive Guidelines: The Actuarial Process
The effective application of a profile of standardized adjustment scale scores can be a daunting challenge for a clinician. The standardization of a measure of general cognitive ability or academic achievement provides the foundation for score interpretation. In such cases, a score’s comparison to its standardization sample generates the IQ for the test of general cognitive ability and the grade equivalent for the test of academic achievement. In contrast, the same standardization process that provides T-score values for the raw scores of scales of depression, withdrawal, or noncompliance does not similarly provide interpretive guidelines. Although this standardization process facilitates direct comparison of scores from scales that vary in length and rate of item endorsement, there is not an underlying theoretical distribution of, for example, depression to guide scale interpretation in the way that the normal distribution supports the interpretation of an IQ estimate. Standard scores for adjustment scales represent the likelihood of a raw score within a specific standardization sample. A depression scale T score of 70 can be interpreted with certainty as an infrequent event in the standardization sample. Although a specific score is infrequent, the prediction of significant clinical information, such as likely symptoms and behaviors, degree of associated disability, seriousness of distress, and the selection of a promising intervention cannot be derived from the standardization process that generates a standard score of 70T.
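What a standard score does and does not convey can be shown with a brief sketch. It assumes a linear T-score transformation (mean of 50 and standard deviation of 10 in the standardization sample) and uses a small hypothetical set of raw scores in place of a real standardization sample; actual norms are gender and age specific and are taken from the test manual.

```python
from statistics import mean, pstdev


def linear_t(raw, norm_raw_scores):
    """Linear T score of a raw score relative to a standardization sample."""
    m = mean(norm_raw_scores)
    sd = pstdev(norm_raw_scores)  # assumption: population SD of the norm sample
    return 50 + 10 * (raw - m) / sd


def proportion_at_or_above(raw, norm_raw_scores):
    """Share of the standardization sample scoring at or above this raw score."""
    return sum(s >= raw for s in norm_raw_scores) / len(norm_raw_scores)


# Hypothetical raw scores of a (very small) standardization sample.
norm_sample = [2, 4, 4, 4, 5, 5, 7, 9]
raw = 9
print(f"T = {linear_t(raw, norm_sample):.0f}")
print(f"proportion at or above = {proportion_at_or_above(raw, norm_sample):.3f}")
```

Both outputs describe only how unusual the raw score is within the reference sample. Nothing in the computation links the score to symptoms, disability, or treatment response, which is why separate empirical evidence is needed for interpretation.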
Comprehensive data that demonstrate criterion validity can also be analyzed to develop actuarial, or empirically based, scale interpretations. Such analyses first identify the fine detail of the correlations between a specific scale and nonscale clinical information, and then determine the range of scale standard scores for which this detail is most descriptive. The content so identified can be integrated directly into narrative text or provide support for associated text (cf. Lachar & Gdowski, 1979). Table 11.4 provides an example of this analytic process for each of the 21 PIC-2 subscales. The PIC-2, PIY, and SBS manuals present actuarially based narrative interpretations for these inventory scales and the rules for their application.
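The second step of this process, determining the range of standard scores for which a correlate is most descriptive, can be sketched in simplified form. The example below is not the procedure used in the manuals or by Lachar and Gdowski (1979); it only tabulates, for hypothetical cases, how often an external descriptor was present within each T-score band. The band boundaries and the case data are assumptions chosen for illustration.

```python
def rate_by_band(cases, bands):
    """Rate of an external descriptor within each T-score band.

    cases: list of (t_score, descriptor_present) pairs.
    bands: list of (low, high) T-score ranges, inclusive on both ends.
    Returns a dict mapping each band to a rate, or None if the band is empty.
    """
    rates = {}
    for low, high in bands:
        in_band = [present for t, present in cases if low <= t <= high]
        rates[(low, high)] = sum(in_band) / len(in_band) if in_band else None
    return rates


# Hypothetical cases: scale T score and whether a clinician noted the descriptor.
cases = [
    (45, False), (55, False), (58, True),
    (62, True), (65, False),
    (72, True), (75, True),
]
bands = [(0, 59), (60, 69), (70, 120)]
for band, rate in rate_by_band(cases, bands).items():
    print(band, rate)
```

A band in which the descriptor is clearly more frequent than in the sample as a whole is a candidate range for attaching that descriptor to the scale's narrative interpretation.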
Review for Clinical Utility
A clinician’s careful consideration of the content of an assessment measure is an important exercise. As this author has previously discussed (Lachar, 1993), item content, statement and response format, and scale length facilitate or limit scale application. Content validity as a concept reflects the adequacy of the match between questionnaire elements and the phenomena to be assessed. It is quite reasonable for the potential user of a measure to first gain an appreciation of the specific manifestations of a designated delinquency or psychological discomfort dimension. Test manuals should facilitate this process by listing scale content and relevant item endorsement rates. Questionnaire content should be representative and include frequent and infrequent manifestations that reflect mild, moderate, and severe levels of maladjustment. A careful review of scales constructed solely by factor analysis will identify manifest item content that is inconsistent with expectation; review across scales may identify unexpected scale overlap when items are assigned to more than one dimension. Important dimensions of instrument utility associated with content are instrument readability and the ease of scale administration, completion, scoring, and interpretation.
It is useful to identify the typical raw scores for normative and clinical evaluations and to explore the amount and variety of content represented by scores that are indicative of significant problems. It will then be useful to determine the shift in content when such raw scores representing significant maladjustment are reduced to the equivalents of standard scores within the normal range. Questionnaire application can be problematic when its scales are especially brief, are composed of statements that are rarely endorsed in clinical populations, or apply response formats that distort the true raw-score distribution. Many of these issues can be examined by looking at a typical profile form. For example, CBCL standard scores of 50T often represent raw scores of only 0 or 1. When clinically elevated baseline CBCL scale values are reduced to values within normal limits upon retest, treatment effectiveness and the absence of problems would appear to have been demonstrated. Actually, the shift from baseline to posttreatment assessment may represent the process in which as few as three items that were first rated as a 2 (very true or often true) at baseline remain endorsed, but are rated as a 1 (somewhat or sometimes true) on retest (cf. Lachar, 1993).
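The arithmetic behind this example can be laid out in a few lines. The sketch assumes a hypothetical five-item scale rated 0, 1, or 2 and scored by summing the ratings; it does not reproduce any actual CBCL scale or its norms.

```python
def summarize(ratings):
    """Raw score (sum of ratings) and number of items endorsed at any level."""
    return {
        "raw_score": sum(ratings),
        "items_endorsed": sum(1 for r in ratings if r > 0),
    }


# Hypothetical ratings on a five-item scale (0, 1, or 2 per item).
baseline = [2, 2, 2, 0, 0]  # three items rated 2 at baseline
retest = [1, 1, 1, 0, 0]    # the same three items rated 1 on retest

print("baseline:", summarize(baseline))
print("retest:  ", summarize(retest))
```

The raw score is halved while the number of endorsed problems is unchanged. On a brief scale, a raw-score change of this size may be enough to move a standard score from the clinical to the normal range even though every problem is still reported.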
Selected Adjustment Measures for Youth Assessment
An ever-increasing number of assessment instruments may be applied in the assessment of youth adjustment. This research paper concludes by providing a survey of some of these instruments. Because of the importance of considering different informants, all four families of parent-, teacher-, and self-report measures are described in some detail. In addition, several multidimensional, single-informant measures, both the well established and the recently published, are described. Each entry has been included to demonstrate the variety of measures that are available. Although each of these objective questionnaires is available from a commercial test publisher, no other specific inclusion or exclusion criteria have been applied. This section concludes with an even more selective description of a few of the many published measures that restrict their assessment of adjustment or may be specifically useful to supplement an otherwise broadly based evaluation of the child. Such measures may contribute to the assessment of youth seen in a specialty clinic, or support treatment planning or outcome assessment. Again, the selection of these measures did not systematically apply inclusion or exclusion criteria.
Other Families of Multidimensional, Multisource Measures
Considering their potential contribution to the assessment process, a clinician would benefit from gaining sufficient familiarity with at least one parent-report questionnaire, one teacher rating form, and one self-report inventory. Four integrated families of these measures have been developed over the past decade. Some efficiency is gained from becoming familiar with one of these sets of measures rather than selecting three independent measures. Manuals describe the relations between measures and provide case studies that apply two or all three measures. Competence in each class of measures is also useful because it provides an additional degree of flexibility for the clinician. The conduct of a complete multiinformant assessment may not be feasible at times (e.g., teachers may not be available during summer vacation), or may prove difficult for a particular mental health service (e.g., the youth may be under the custody of an agency, or a hospital may distance the clinician from parent informants). In addition, the use of self-report measures may be systematically restricted by child age or some specific cognitive or motivational characteristics that could compromise the collection of competent questionnaire responses. Because of such difficulties, it is also useful to consider the relationship between the individual components of these questionnaire families. Some measures are complementary and focus on informant-specific content, whereas others make a specific effort to apply duplicate content and therefore represent parallel forms. One of these measure families, consisting of the PIC-2, the PIY, and the SBS, has already been described in some detail. The PIC-2, PIY, and SBS are independent comprehensive measures that both emphasize informant-appropriate and informant-specific observations and provide the opportunity to compare similar dimensions across informants.
Behavior Assessment System for Children
The Behavior Assessment System for Children (BASC) family of multidimensional scales includes the Parent Rating Scales (PRS), Teacher Rating Scales (TRS), and Self-Report of Personality (SRP), which are conveniently described in one integrated manual (Reynolds & Kamphaus, 1992). BASC ratings are marked directly on self-scoring pamphlets or on one-page forms that allow the recording of responses for subsequent computer entry. Each of these forms is relatively brief (126–186 items) and can be completed in 10 to 30 min. The PRS and TRS items, mainly short descriptive phrases, are rated on a 4-point frequency scale (never, sometimes, often, and almost always), while SRP items, short declarative statements, are rated as either True or False. Final BASC items were assigned through multistage iterative item analyses to only one narrow-band scale measuring clinical dimensions or adaptive behaviors; these scales are combined to form composites. The PRS and TRS forms cover ages 6 to 18 years and emphasize across-informant similarities; the SRP is provided for ages 8 to 18 years and has been designed to complement parent and teacher reports as a measure focused on mild to moderate emotional problems and clinically relevant self-perceptions, rather than overt behaviors and externalizing problems.
The PRS composites and component scales are Internalizing Problems (Anxiety, Depression, Somatization), Externalizing Problems (Hyperactivity, Aggression, and Conduct Problems), and Adaptive Skills (Adaptability, Social Skills, Leadership). Additional profile scales include Atypicality, Withdrawal, and Attention Problems. The TRS Internalizing and Externalizing Problems composites and their component scales parallel the PRS structure. The TRS presents 22 items that are unique to the classroom by including a Study Skills scale in the Adaptive Skills composite and a Learning Problems scale in the School Problems composite. The BASC manual suggests that clinical scale elevations are potentially significant over 59T and that adaptive scores gain importance under 40T. The SRP does not incorporate externalization dimensions and therefore cannot be considered a fully independent measure. The SRP composites and their component scales are School Maladjustment (Attitude to School, Attitude to Teachers, Sensation Seeking), Clinical Maladjustment (Atypicality, Locus of Control, Social Stress, Anxiety, Somatization), and Personal Adjustment (Relations with Parents, Interpersonal Relations, Self-Esteem, Self-Reliance). Two additional scales, Depression and Sense of Inadequacy, are not incorporated into a composite. The SRP includes three validity response scales, although their psychometric characteristics are not presented in the manual.
Conners’ Rating Scales–Revised
The Conners’ parent and teacher scales were first used in the 1960s in the study of pharmacological treatment of disruptive behaviors. The current published Conners’ Rating Scales–Revised (CRS-R; Conners, 1997) require selection of one of four response alternatives to brief phrases (parent, teacher) or short sentences (adolescent): 0 = Not True at All (Never, Seldom), 1 = Just a Little True (Occasionally), 2 = Pretty Much True (Often, Quite a Bit), and 3 = Very Much True (Very Often, Very Frequent). These revised scales continue their original focus on disruptive behaviors (especially ADHD) and strengthen their assessment of related or comorbid disorders. The Conners’ Parent Rating Scale–Revised (CPRS-R) derives from 80 items seven factor-derived, nonoverlapping scales that were apparently generated from the ratings of regular-education students (i.e., the normative sample): Oppositional, Cognitive Problems, Hyperactivity, Anxious-Shy, Perfectionism, Social Problems, and Psychosomatic. A review of the considerable literature generated using the original CPRS did not demonstrate its ability to discriminate among psychiatric populations, although it was able to separate psychiatric patients from normal youth. Gianarris, Golden, and Greene (2001) concluded that the literature had identified three primary uses for the CPRS: as a general screen for psychopathology, as an ancillary diagnostic aid, and as a general treatment outcome measure. Perhaps future reviews of the CPRS-R will demonstrate additional discriminant validity.
The Conners’ Teacher Rating Scale–Revised (CTRS-R) consists of only 59 items and generates shorter versions of all CPRS-R scales (Psychosomatic is excluded). Because Conners emphasizes teacher observation in assessment, the lack of equivalence in scale length and (in some instances) item content for the CPRS-R and CTRS-R makes the interpretation of parent-teacher inconsistencies difficult. For parent and teacher ratings the normative sample ranges from 3 to 17 years, whereas the self-report scale is normed for ages 12 to 17. The CRS-R provides standard linear T scores for raw scores that are derived from contiguous 3-year segments of the normative sample. This particular norm conversion format contributes unnecessary complexity to the interpretation of repeated scales because several of these scales demonstrate a large age effect. For example, a 14-year-old boy who obtains a raw score of 6 on CPRS-R Social Problems obtains a standard score of 68T; if this lad turns 15 the following week, the same raw score now represents 74T, an increase of more than half of a standard deviation. Conners (1999) also describes a serious administration artifact, in that the parent and teacher scores typically drop on their second administration. Pretreatment baseline therefore should always consist of a second administration to avoid this artifact. T values of at least 60 are suggestive, and values of at least 65T are indicative of a clinically significant problem. General guidance provided as to scale application is quite limited: “Each factor can be interpreted according to the predominant conceptual unity implied by the item content” (Conners, 1999, p. 475).
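The age-band effect described above can be illustrated with a brief sketch. The linear T transformation (mean 50, standard deviation 10 in the reference group) is standard; the band labels, means, and standard deviations below are hypothetical values chosen only so that a raw score of 6 reproduces the 68T and 74T figures cited in the text. They are not taken from the CRS-R manual.

```python
def linear_t(raw, norm_mean, norm_sd):
    """Linear T score: 50 plus 10 times the z score in the reference group."""
    return 50 + 10 * (raw - norm_mean) / norm_sd


# HYPOTHETICAL norms for illustration only (not from the CRS-R manual).
# Band labels assume contiguous 3-year segments, as the text describes.
HYPOTHETICAL_NORMS = {
    "12-14": {"mean": 2.4, "sd": 2.0},
    "15-17": {"mean": 1.2, "sd": 2.0},
}


def t_for_band(raw, band):
    """Convert a raw score to T using the norms of one age band."""
    norms = HYPOTHETICAL_NORMS[band]
    return linear_t(raw, norms["mean"], norms["sd"])


def shift_in_sd_units(t_before, t_after):
    """Express a change in T score as a fraction of a standard deviation."""
    return (t_after - t_before) / 10


t_at_14 = t_for_band(6, "12-14")
t_at_15 = t_for_band(6, "15-17")
print(round(t_at_14), round(t_at_15), round(shift_in_sd_units(t_at_14, t_at_15), 2))
```

Because the raw score is unchanged, the entire shift comes from the change of reference band, which is why an apparent worsening across a birthday should not be read as clinical change.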
The Conners-Wells’ Adolescent Self-Report Scale consists of 87 items, written at a sixth-grade reading level, that generate six nonoverlapping factor-derived scales, each consisting of 8 or 12 items (Anger Control Problems, Hyperactivity, Family Problems, Emotional Problems, Conduct Problems, Cognitive Problems). Shorter versions and several indices have been derived from these three questionnaires. These additional forms contribute to the focused evaluation of ADHD treatment and would merit separate listing under the later section “Selected Focused (Narrow) or Ancillary Objective Measures.” Although Conners (1999) discussed in some detail the influence that response sets and other inadequate responses may have on these scales, no guidance or psychometric measures are provided to support this effort.
Child Behavior Checklist; Teacher’s Report Form; Youth Self-Report
The popularity of the CBCL and related instruments in research application since the CBCL’s initial publication in 1983 has influenced thousands of research projects; the magnitude of this research application has had a significant influence on the study of child and adolescent psychopathology. The 1991 revision, documented in five monographs totaling more than 1,000 pages, emphasizes consistencies in scale dimensions and scale content across child age (4–18 years for the CBCL/4–18), gender, and respondent or setting (Achenbach, 1991a, 1991b, 1991c, 1991d, 1993). A series of within-instrument item analyses was conducted using substantial samples of protocols for each form obtained from clinical and special-education settings. The major component of parent, teacher, and self-report forms is a common set of 89 behavior problems described in one to eight words (“Overtired,” “Argues a lot,” “Feels others are out to get him/her”). Items are rated as 0 = Not True, 1 = Somewhat or Sometimes True, or 2 = Very True or Often True, although several items require individual elaboration when these items are positively endorsed. These 89 items generate eight narrow-band and three composite scale scores similarly labeled for each informant, although some item content varies. Composite Internalizing Problems consists of Withdrawn, Somatic Complaints, and Anxious/Depressed and composite Externalizing Problems consists of Delinquent Behavior and Aggressive Behavior; Social Problems, Thought Problems, and Attention Problems contribute to a summary Total scale along with the other five narrow-band scales.
The 1991 forms provide standard scores based on national samples. Although the CBCL and the Youth Self-Report (YSR) are routinely self-administered in clinical application, the CBCL normative data and some undefined proportion of the YSR norms were obtained through interview of the informants. This process may have inhibited affirmative response to checklist items. For example, six of eight parent informant scales obtained average normative raw scores of less than 2, with restricted scale score variance. It is important to note that increased problem behavior scale elevation reflects increased problems, although these scales do not consistently extend below 50T. Because of the idiosyncratic manner in which T scores are assigned to scale raw scores, it is difficult to determine the interpretive meaning of checklist T scores, the derivation of which has been of concern (Kamphaus & Frick, 1996; Lachar, 1993, 1998). The gender-specific CBCL norms are provided for two age ranges (4–11 and 12–18). The Teacher’s Report Form (TRF) norms are also gender-specific and provided for two age ranges (5–11 and 12–18). The YSR norms are gender-specific and incorporate the entire age range of 11 to 18 years, and require a fifth-grade reading ability. Narrow-band scores 67 to 70T are designated as borderline; values above 70T represent the clinical range. Composite scores of 60 to 63T are designated as borderline, whereas values above 63T represent the clinical range.
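The interpretive ranges just described can be written as a small lookup. This is a minimal sketch that encodes only the cutoffs stated above; the function name and the label "normal" for scores below the borderline range are assumptions made for illustration and are not terms drawn from the CBCL manuals.

```python
def classify_problem_scale(t_score, composite=False):
    """Classify a CBCL/TRF/YSR problem-behavior T score.

    Narrow-band scales: 67-70T borderline, above 70T clinical.
    Composite scales:   60-63T borderline, above 63T clinical.
    The label "normal" for lower scores is an illustrative assumption.
    """
    borderline_low, borderline_high = (60, 63) if composite else (67, 70)
    if t_score > borderline_high:
        return "clinical"
    if t_score >= borderline_low:
        return "borderline"
    return "normal"


print(classify_problem_scale(68))                  # narrow-band scale
print(classify_problem_scale(68, composite=True))  # composite scale
```

The same T score can therefore fall in different ranges depending on whether it comes from a narrow-band or a composite scale, which is worth keeping in mind when profiles are compared across scales.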
The other main component of these forms measures adaptive competence using a less structured approach. The CBCL competence items are organized by manifest content into three narrow scales (Activities, Social, and School), which are then summed into a total score. Parents are asked to list and then rate (frequency, performance level) child participation in sports, hobbies, organizations, and chores. Parents also describe the child’s friendships, social interactions, performance in academic subjects, need for special assistance in school, and history of retention in grade. As standard scores for these scales increase with demonstrated ability, a borderline range is suggested at 30 to 33T and the clinical range is designated as less than 30T. Youth ethnicity and social and economic opportunities may affect CBCL competence scale values (Drotar, Stein, & Perrin, 1995). Some evidence for validity, however, has been provided in their comparison to the PIC in ability to predict adaptive level as defined by the Vineland Adaptive Behavior Scales (Pearson & Lachar, 1994).
In comparison to the CBCL, the TRF measures of competence are derived from very limited data: an average rating of academic performance based on as many as six academic subjects identified by the teacher, individual 7-point ratings on four topics (how hard working, behaving appropriately, amount learning, and how happy), and a summary score derived from these four items. The TRF designates a borderline interpretive range for the mean academic performance and the summary score of 37 to 40T, with the clinical range less than 37T. The TRF avoids the measurement of a range of meaningful classroom observations to maintain structural equivalence with the CBCL. The YSR provides seven adaptive competency items scored for Activities, Social, and a Total Competence scale. Reference to the YSR manual is necessary to score these multipart items, which tap competence and levels of involvement in sports, activities, organizations, jobs, and chores. Items also provide self-report of academic achievement, interpersonal adjustment, and level of socialization. Scales Activities and Social are classified as borderline at 30 to 33T with the clinical range less than 30T. The YSR Total Competence scale is classified as borderline at 37 to 40T with the clinical range at less than 37T. The strengths and weaknesses of these forms have been presented in some detail elsewhere (Lachar, 1998). The CBCL, TRF, and YSR provide quickly administered and easily scored parallel problem-behavior measures that facilitate direct comparison. The forms do not provide validity scales and the test manuals provide neither evidence of scale validity nor interpretive guidelines.
Selected Single-Source Multidimensional Measures
Minnesota Multiphasic Personality Inventory–Adolescent
The Minnesota Multiphasic Personality Inventory (MMPI) has been found to be useful in the evaluation of adolescents for more than 50 years (cf. Hathaway & Monachesi, 1953), although many questions have been raised as to the adequacy of this inventory’s content, scales, and the application of adult norms (cf. Lachar, Klinge, & Grisell, 1976). In 1992 a fully revised version of the MMPI custom designed for adolescents, the MMPI-A, was published (Butcher et al., 1992). Although the traditional empirically constructed validity and profile scales have been retained, scale item content has been somewhat modified to reflect contemporary and developmentally appropriate content (for example, the F scale was modified to meet statistical inclusion criteria for adolescents). In addition, a series of 15 content scales have been constructed that take advantage of new items that reflect peer interaction, school adjustment, and common adolescent concerns: Anxiety, Obsessiveness, Depression, Health Concerns, Alienation, Bizarre Mentation, Anger, Cynicism, Conduct Problems, Low Self-Esteem, Low Aspirations, Social Discomfort, Family Problems, School Problems, and Negative Treatment Indicators (Williams, Butcher, Ben-Porath, & Graham, 1992).
The MMPI-A normative sample for this 478-statement true-false questionnaire consists of 14- to 18-year-old students collected in eight U.S. states. Inventory items and directions are written at the sixth-grade level. The MMPI-A has also incorporated a variety of test improvements associated with the revision of the MMPI for adults: the development of uniform T scores and validity measures of response inconsistency that are independent of specific dimensions of psychopathology. Substantive scales are interpreted as clinically significant at values above 65T, while scores of 60 to 65T may be suggestive of clinical concerns. Archer (1999) concluded that the MMPI-A continues to represent a challenge for many of the adolescents who are requested to complete it and requires extensive training and expertise to ensure accurate application. These opinions are voiced in a recent survey (Archer & Newsom, 2000).
Adolescent Psychopathology Scale
This 346-item inventory was designed to be a comprehensive assessment of the presence and severity of psychopathology in adolescents aged 12 to 19. The Adolescent Psychopathology Scale (APS; Reynolds, 1998) incorporates 25 scales modeled after Axis I and Axis II DSM-IV criteria. The APS is unique in the use of different response formats depending on the nature of the symptom or problem evaluated (e.g., True-False; Never or almost never, Sometimes, Nearly all the time) and across different time periods depending on the dimension assessed (e.g., past 2 weeks, past month, past 3 months, in general). One computer-generated profile presents 20 Clinical Disorder scales (such as Conduct Disorder, Major Depression), whereas a second profile presents 5 Personality Disorder scales (such as Borderline Personality Disorder), 11 Psychosocial Problem Content scales (such as Interpersonal Problem, Suicide), and four Response Style Indicators.
Linear T scores are derived from a mixed-gender representative standardization sample of seventh- to twelfth-grade students (n = 1,827), although gender-specific and age-specific score conversions can be selected. The 12-page administration booklet requires a third-grade reading level and is completed in 1 hr or less. APS scales obtained substantial estimates of internal consistency and test-retest reliability (median values in the .80s); mean scale score differences between APS administrations separated by a 14-day interval were small (median 1.8T). The detailed organized manuals provide a sensible discussion of scale interpretation and preliminary evidence of scale validity. Additional study will be necessary to determine the relationship between scale T-score elevation and diagnosis and clinical description for this innovative measure. Reynolds (2000) also developed a 20-min, 115-item APS short form that generates 12 clinical scales and 2 validity scales. These shortened and combined versions of full-length scales were selected because they were judged to be the most useful in practice.
Beck Youth Inventories of Emotional and Social Impairment
Recently published and characterized by the ultimate of simplicity, the Beck Youth Inventories of Emotional and Social Impairment (BYI; Beck, Beck, & Jolly, 2001) consist of five separately printed 20-item scales that can be completed individually or in any combination. The child selects one of four frequency responses to statements written at the second-grade level: Never, Sometimes, Often, Always. Raw scores are converted to gender-specific linear T-scores for ages 7 to 10 and 11 to 14. The manual notes that 7-year-olds and students in second grade may need to have the scale items read to them. For scales Depression (BDI: “I feel sorry for myself”), Anxiety (BAI: “I worry about the future”), Anger (BANI: “People make me mad”), Disruptive Behavior (BDBI: “I break the rules”), and Self-Concept (BSCI: “I feel proud of the things I do”), the manual provides estimates of internal consistency (α = .86–.92, median = .895) and 1-week temporal stability (rtt = .63–.89, median = .80). Three studies of scale validity are also described: Substantial correlations were obtained between each BYI scale and a parallel established scale (BDI and Children’s Depression Inventory, r = .72; BAI and Revised Children’s Manifest Anxiety Scale, r = .70; BSCI and Piers-Harris Children’s Self-Concept Scale, r = .61; BDBI and Conners-Wells’ Self-Report Conduct Problems, r = .69; BANI and Conners-Wells’ Self-Report AD/HD Index, r = .73). Each BYI scale significantly separated matched samples of special-education and normative children, with the special-education sample obtaining higher ratings on Depression, Anxiety, Anger, and Disruptive Behavior and lower ratings on Self-Concept. In a comparable analysis with an outpatient sample, four of five scales obtained a significant difference from matched controls. A secondary analysis demonstrated that outpatients who obtained a diagnosis of a mood disorder rated themselves substantially lower on Self-Concept and substantially higher on Depression in comparison to other outpatients. Additional study will be necessary to establish BYI diagnostic utility and sensitivity to symptomatic change.
Comprehensive Behavior Rating Scale for Children
The Comprehensive Behavior Rating Scale for Children (CBRSC; Neeper, Lahey, & Frick, 1990) is a 70-item teacher rating scale that may be scored for nine scales that focus on learning problems and cognitive processing (Reading Problems, Cognitive Deficits, Sluggish Tempo), attention and hyperactivity (Inattention-Disorganization, Motor Hyperactivity, Daydreaming), conduct problems (Oppositional-Conduct Disorders), anxiety (Anxiety), and peer relations (Social Competence). Teachers select one of five frequency descriptors for each item in 10 to 15 min. Scales are profiled as linear T values based on a mixed-gender national sample of students between the ages of 6 and 14, although the manual provides age- and gender-specific conversions. Scale values above 65T are designated clinically significant.
Millon Adolescent Clinical Inventory
The Millon Adolescent Clinical Inventory (MACI; Millon, 1993), a 160-item true-false questionnaire, may be scored for 12 Personality Patterns, 8 Expressed Concerns, and 7 Clinical Syndromes dimensions, as well as three validity measures (modifying indices). Gender-specific raw score conversions, or Base Rate scores, are provided for age ranges 13 to 15 and 16 to 19 years. Scales were developed in multiple stages, with item composition reflecting theory, DSM-IV structure, and item-to-scale performance. The 27 substantive scales require 888 scored items and therefore demonstrate considerable item overlap, even within scale categories. For example, the most frequently placed item among the Personality Patterns scales is “I’ve never done anything for which I could have been arrested,” an awkward double-negative as a scored statement. The structures of these scales and the effect of this characteristic are basically unknown because scales, or classes of scales, were not submitted to factor analysis. Additional complexity is contributed by the weighting of items (3, 2, or 1) to reflect assigned theoretical or demonstrated empirical importance.
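The degree of item overlap implied by these counts can be made concrete with simple arithmetic. The sketch below uses only the figures given above (160 items, 27 substantive scales, 888 scored item placements); the variable and function names are illustrative, and the per-item average assumes that placements are spread over all 160 items, which the text does not state.

```python
TOTAL_ITEMS = 160
SUBSTANTIVE_SCALES = 12 + 8 + 7   # Personality Patterns + Expressed Concerns + Clinical Syndromes
SCORED_PLACEMENTS = 888           # item-to-scale assignments across the substantive scales


def average_placements_per_item(placements=SCORED_PLACEMENTS, items=TOTAL_ITEMS):
    """Average number of substantive scales on which one item is scored.

    Assumes (for illustration) that all items contribute to at least one scale.
    """
    return placements / items


def average_items_per_scale(placements=SCORED_PLACEMENTS, scales=SUBSTANTIVE_SCALES):
    """Average number of scored items per substantive scale."""
    return placements / scales


print(SUBSTANTIVE_SCALES)
print(round(average_placements_per_item(), 2))
print(round(average_items_per_scale(), 1))
```

On these figures the typical item is scored on between five and six scales, which is the sense in which the scales cannot be treated as independent measurements.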
Given the additional complexity of validity adjustment processes, it is accurate to state that it is possible to hand-score the MACI, although any reasonable application requires computer processing. Base rate scores range from 1 to 115, with specific importance given to values 75 to 84 and above 84. These values are tied to “target prevalence rates” derived from clinical consensus and anchor points that are discussed in this manual without the use of clarifying examples. These scores are supposed to relate in some fashion to performance in clinical samples; no representative standardization sample of nonreferred youth was collected for analysis. Base rate scores are designed to identify the pattern of problems, not to demonstrate the presence of adjustment problems. Clearly the MACI should not be used for screening or in settings in which some referred youth may not subsequently demonstrate significant problems.
MACI scores demonstrate adequate internal consistency and temporal stability. Except for some minimal correlational evidence purported to support validity, no evidence of scale performance is provided, although dimensions of psychopathology and scale intent are discussed in detail. Manual readers reasonably expect test authors to demonstrate the wisdom of their psychometric decisions. No evidence is provided to establish the value of item weighting, the utility of correction procedures, or the unique contribution of scale dimensions. For example, a cursory review of the composition of the 12 Personality Patterns scales revealed that the majority of the 22 Forceful items are also placed on the dimension labeled Unruly. These dimensions correlate .75 and may not represent unique dimensions. Analyses should demonstrate whether a 13-year-old’s self-description is best represented by 27 independent (vs. nested) dimensions. A manual should facilitate the review of scale content by assigned value and demonstrate the prevalence of specific scale elevations and their interpretive meaning.
Selected Focused (Narrow) or Ancillary Objective Measures
Attention Deficit Hyperactivity
BASC Monitor for ADHD (Kamphaus & Reynolds, 1998). Parent (46-item) and teacher (47-item) forms were designed to evaluate the effectiveness of treatments used with ADHD. Both forms provide standard scores (ages 4–18) for Attention Problems, Hyperactivity, Internalizing Problems, and Adaptive Skills, and a listing of DSM-IV items.
Brown Attention-Deficit Disorder Scales for Children and Adolescents (BADDS; Brown, 2001). This series of brief parent-, teacher-, and self-report questionnaires evaluates dimensions of ADHD that reflect cognitive impairments and symptoms beyond current DSM-IV criteria. As many as six subscales may be calculated from each form: Activation (“Seems to have exceptional difficulty getting started on tasks or routines [e.g., getting dressed, picking up toys]”); Focus/Attention (“Is easily sidetracked; starts one task and then switches to a less important task”); Effort (“Do your parents or teachers tell you that you could do better by trying harder?”); Emotion/Affect (“Seems easily irritated or impatient in response to apparently minor frustrations”); Memory (“Learns something one day, but doesn’t remember it the next day”); and Action (“When you’re supposed to sit still and be quiet, is it really hard for you to do that?”). Three item formats and varying gender-specific age-normative references are provided: 44-item parent and teacher forms normed by gender for ages 3 to 5 and 6 to 7; 50-item parent, teacher, and self-report forms normed by gender for ages 8 to 9 and 10 to 12; and a 40-item self-report form (also used to collect collateral responses) for ages 12 to 18. All forms generate an ADD Inattention Total score and the multiinformant questionnaires also provide an ADD Combined Total score.
The BADDS manual provides an informative discussion of ADHD and a variety of psychometric studies. Subscales and composites obtained from adult informants demonstrated excellent internal consistency and temporal stability, although estimates derived from self-report data were less robust. Children with ADHD obtained substantially higher scores when compared to controls. Robust correlations were obtained for BADDS dimensions both across informants (parent-teacher, parent-child, teacher-child) and between BADDS dimensions and other same-informant measures of ADHD (CBCL, TRF, BASC Parent and Teacher Monitors, CPRS-R Short Form, CTRS-R Short Form). This manual does not provide evidence that BADDS dimensions can separate different clinical groups and quantify treatment effects.
Children’s Depression Inventory (CDI; Kovacs, 1992). This focused self-report measure may be used in the early identification of symptoms and the monitoring of treatment effectiveness, as well as contributing to the diagnostic process. The CDI represents a unique format because children are required to select one statement from each of 27 statement triads to describe their past 2 weeks. The first option is scored a 0 (symptom absence), the second a 1 (mild symptom), and the third a 2 (definite symptom). It may therefore be more accurate to characterize the CDI as a task requiring the child to read 81 short statements presented at a third-grade reading level and make a selection from statement triplets. The Total score is the summary of five factor-derived subscales: Negative Mood, Interpersonal Problems, Ineffectiveness, Anhedonia, and Negative Self-esteem. An Inconsistency Index is provided to exclude protocols that may reflect inadequate attention to CDI statements or comprehension of the required task response. Also available is a 10-item short form that correlates .89 to the Total score. Regional norms generate a profile of gender- and age-specific (7–12/13–17 years) T scores, in which values in the 60s (especially those above 65T) in children referred for evaluation are clinically significant (Sitarenios & Kovacs, 1999). Although considerable emphasis has been placed on the accurate description of the CDI as a good indicator of self-reported distress and not a diagnostic instrument, the manual and considerable literature focus on classification based on a Total raw score cutoff (Fristad, Emery, & Beck, 1997).
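The triad format described above implies a simple total-score computation, sketched here under stated assumptions. Each response is taken to be already coded by severity as 0, 1, or 2, following the scoring described in the text; the function name is hypothetical, and subscale assignments, the Inconsistency Index, and the T-score conversion are omitted because the text does not specify them.

```python
CDI_ITEM_COUNT = 27          # one selection from each of 27 statement triads
VALID_CODES = (0, 1, 2)      # 0 = symptom absence, 1 = mild, 2 = definite


def score_cdi_total(responses):
    """Sum severity codes across the 27 triads (possible range 0 to 54).

    `responses` is a sequence of 27 integers, each 0, 1, or 2.
    Raises ValueError for incomplete or miscoded protocols.
    """
    if len(responses) != CDI_ITEM_COUNT:
        raise ValueError(f"expected {CDI_ITEM_COUNT} responses, got {len(responses)}")
    if any(code not in VALID_CODES for code in responses):
        raise ValueError("each response must be coded 0, 1, or 2")
    return sum(responses)


# Hypothetical protocol: mild symptoms on 5 triads, a definite symptom on 2.
example = [0] * 20 + [1] * 5 + [2] * 2
print(score_cdi_total(example))
```

A raw total of this kind is what the cutoff-based classification literature operates on, whereas the T-score profile requires the gender- and age-specific norms.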
Revised Children’s Manifest Anxiety Scale (RCMAS; Reynolds & Richmond, 1985). Yes–No responses to 37 statements generate a focused Total Anxiety score that incorporates three subscales (Physiological Anxiety, Worry/Oversensitivity, Social Concerns/Concentration); the other nine items provide a validity scale (Lie). Standard scores derived from a normative sample of approximately 5,000 protocols are gender and age specific (6–17+ years). Independent response to scale statements requires a third-grade reading level; each anxiety item obtained an endorsement rate between .30 and .70 and correlated at least .40 with the total score. Anxiety as a disorder is suggested with a total score that exceeds 69T; symptoms of anxiety are suggested by subscale elevations when Total Anxiety remains below 70T (Gerard & Reynolds, 1999).
Marital Satisfaction Inventory–Revised (MSI-R; Snyder, 1997). When the marital relationship becomes a potential focus of treatment, it often becomes useful to define areas of conflict and the differences manifest by comparison of parent descriptions. The MSI-R includes 150 true-false items comprising two validity scales (Inconsistency, Conventionalization), one global scale (Global Distress), and 10 scales that assess specific areas of relationship stress (Affective Communication, Problem-Solving Communication, Aggression, Time Together, Disagreement About Finances, Sexual Dissatisfaction, Role Orientation, Family History of Distress, Dissatisfaction With Children, Conflict Over Child Rearing). Items are presented on a self-scoring form or by personal computer, and one profile facilitates direct comparison of paired sets of gender-specific normalized T scores that are subsequently applied in evaluation, treatment planning, and outcome assessment. Empirically established T-score ranges suggesting adjustment problems are designated on the profile (usually scores above 59T). The geographically diverse, representative standardization sample included more than 2,000 married adults. Because of substantial scale internal consistency (median = .82) and temporal stability (median 6-week rtt = .79), a difference between spouse profiles or a shift on retest of as little as 6 T-points represents a meaningful and stable phenomenon. Evidence of scale discriminant and actuarial validity has been summarized in detail (Snyder & Aikman, 1999).
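The 6 T-point guideline lends itself to a simple profile comparison, sketched below. The threshold comes from the text; the function name, the dictionary layout, and the example T scores are hypothetical and serve only to illustrate the comparison. The same function could be applied to one respondent's test and retest profiles.

```python
MEANINGFUL_T_DIFFERENCE = 6  # minimum T-point difference treated as meaningful (from the text)


def meaningful_differences(profile_a, profile_b, threshold=MEANINGFUL_T_DIFFERENCE):
    """Return scales whose T scores differ by at least `threshold` points.

    Profiles are dicts mapping scale name to T score. Only scales present
    in both profiles are compared. Values are signed (b minus a).
    """
    shared_scales = sorted(profile_a.keys() & profile_b.keys())
    differences = {}
    for scale in shared_scales:
        delta = profile_b[scale] - profile_a[scale]
        if abs(delta) >= threshold:
            differences[scale] = delta
    return differences


# Hypothetical spouse profiles (illustrative values only).
spouse_1 = {"Global Distress": 62, "Time Together": 55, "Conflict Over Child Rearing": 58}
spouse_2 = {"Global Distress": 65, "Time Together": 63, "Conflict Over Child Rearing": 52}
print(meaningful_differences(spouse_1, spouse_2))
```

Keeping the sign of each difference shows which respondent reports more distress on a flagged scale, which is the information a clinician would carry into treatment planning.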
Parenting Stress Index (PSI), Third Edition (Abidin, 1995). This unique 120-item questionnaire measures excessive stressors and stress within families of children aged 1 to 12 years. Description is obtained by parent selection from five response options to statements often presented in the form of strongly agree, agree, not sure, disagree, strongly disagree. A profile of percentiles from maternal response to the total mixed-gender normative sample includes a Child Domain score (subscales Distractibility/Hyperactivity, Adaptability, Reinforces Parent, Demandingness, Mood, Acceptability) and a Parent Domain score (subscales Competence, Isolation, Attachment, Health, Role Restriction, Depression, Spouse), which are combined into a Total Stress composite. Additional measures include a Life Stress scale of 19 Yes–No items and a Defensive Responding scale. Interpretive guidelines are provided for substantive dimensions at 1 standard deviation above and for Defensiveness values at 1 standard deviation below the mean. A 36-item short form provides three subscales: Parental Distress, Parent-Child Dysfunctional Interaction, and Difficult Child. These subscales are summed into a Total Stress score; a Defensive Responding scale is also scored.
Current Status and Future Directions
Multidimensional, multiinformant objective assessment makes a unique contribution to the assessment of youth adjustment. This research paper presents the argument that this form of assessment is especially responsive to the evaluation of the evolving child and compatible with the current way in which mental health services are provided to youth. The growing popularity of these instruments in clinical practice (cf. Archer & Newsom, 2000), however, has not stimulated comparable efforts in research that focuses on instrument application. Objective measures of youth adjustment would benefit from the development of a research culture that promotes the study and demonstration of measure validity. Current child clinical literature predominantly applies objective measures in the study of psychopathology and does not focus on the study of test performance as an important endeavor. The journals that routinely publish studies on test validity (e.g., Psychological Assessment, Journal of Personality Assessment, Assessment) seldom present articles that focus on instruments that measure child or adolescent adjustment. An exception to this observation is the MMPI-A, for which research efforts have been influenced by the substantial research culture of the MMPI and MMPI-2 (cf. Archer, 1997).
Considerable effort will be required to establish the construct and actuarial validity of popular child and adolescent adjustment measures. It is not sufficient to demonstrate that a distribution of scale scores separates regular-education students from those referred for mental health services to establish scale validity. Indeed, the absence of such evidence may not exclude a scale from consideration, because it is possible that the measurement of some normally distributed personality characteristic, such as social introversion, may contribute to the development of a more effective treatment plan. Once a child is referred for mental health services, application of a screening measure is seldom of value. The actuarial interpretive guidelines of the PIC-2, PIY, and SBS have established one standard of the significant scale score by identifying the minimum T-score elevation from which useful clinical information may be reliably predicted. Although other paradigms might establish such a minimum scale score standard as it predicts the likelihood of significant disability or caseness, scale validity will be truly demonstrated only when a measure contributes to the accuracy of routine decision making that occurs in clinical practice. Such decisions include the successful solution of a representative differential diagnosis (cf. Forbes, 1985), or the selection of an optimal plan of treatment (cf. Voelker et al., 1983).
Similarly, traditional evidence of scale reliability is an inadequate standard of scale performance as applied to clinical situations in which a scale is sequentially administered over time. To be applied in the evaluation of treatment effectiveness, degree of scale score change must be found to accurately track some independent estimate of treatment effectiveness (cf. Sheldrick, Kendall, & Heimberg, 2001). Of relevance here will be the consideration of scale score range and the degree to which a ceiling or floor effect restricts scale performance.
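One way to picture this requirement is a minimal sketch that correlates scale score change with an independent improvement rating and checks how many intake scores sit at the scale maximum. All values below are hypothetical, the maximum T-score of 90 is an assumption, and the Pearson correlation is only one of several possible indices of tracking.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def proportion_at_bound(scores, bound):
    """Share of scores sitting exactly at a scale's floor or ceiling."""
    return sum(1 for s in scores if s == bound) / len(scores)


# Hypothetical T-scores at intake and after treatment (illustrative only).
pre = [78, 72, 85, 66, 90]
post = [60, 65, 70, 64, 88]
# Hypothetical independent clinician rating of improvement (0 = none, 3 = marked).
clinician_improvement = [3, 1, 3, 0, 0]

# A positive change means the scale score dropped between administrations.
change = [before - after for before, after in zip(pre, post)]

tracking = pearson_r(change, clinician_improvement)
ceiling_share = proportion_at_bound(pre, bound=90)  # assumed maximum T-score
```

A high value of `tracking` would suggest that score change follows the independent estimate, whereas a large `ceiling_share` would warn that the most severe cases cannot register further deterioration on the scale.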
Considering that questionnaire-derived information may be obtained from parents, teachers, and the child, it is not unusual that the study of agreement among informants continues to be of interest. In this regard, it will be more useful to determine the clinical implications of the results obtained from each informant rather than the magnitude of correlations that are so easily derived from samples of convenience (cf. Hulbert, Gdowski, & Lachar, 1986). Rather than attributing obtained differences solely to situation specificity, other explanations should be explored. For example, evidence suggests that considerable differences between informants may be attributed to the effects of response sets, such as respondent defensiveness. Perhaps the study of informant agreement has little value in increasing the contribution of objective assessment to clinical application. Rather, it may be more useful for research to apply paradigms that focus on the incremental validity of applications of objective assessment. Beginning with the information obtained from an intake interview, a parent-derived profile could be collected and its additional clinical value determined. In a similar fashion, one could evaluate the relative individual and combined contribution of parent and teacher description in making a meaningful differential diagnosis, say, between ADHD and ODD. The feasibility of such psychometric research should increase as routine use of objective assessment facilitates the development of clinical databases at clinics and inpatient units.
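The incremental validity paradigm described above can be sketched as a simple comparison of decision accuracy with and without the added source of information. The criterion diagnoses and clinician decisions below are invented for illustration, and overall accuracy is only one possible index; the function names are hypothetical.

```python
def accuracy(predicted, criterion):
    """Proportion of cases in which the decision matches the criterion diagnosis."""
    hits = sum(1 for p, c in zip(predicted, criterion) if p == c)
    return hits / len(criterion)


def incremental_gain(baseline, augmented, criterion):
    """Accuracy gained by adding a new information source to the baseline."""
    return accuracy(augmented, criterion) - accuracy(baseline, criterion)


# Hypothetical criterion diagnoses for eight referred children.
criterion = ["ADHD", "ODD", "ADHD", "ODD", "ADHD", "ODD", "ADHD", "ODD"]
# Hypothetical decisions based on the intake interview alone.
interview_only = ["ADHD", "ADHD", "ADHD", "ODD", "ODD", "ODD", "ADHD", "ADHD"]
# Hypothetical decisions after a parent-derived profile is added.
interview_plus_parent = ["ADHD", "ODD", "ADHD", "ODD", "ODD", "ODD", "ADHD", "ODD"]

gain = incremental_gain(interview_only, interview_plus_parent, criterion)
print(gain)  # 0.25
```

The same comparison could be repeated with teacher description, alone and combined with parent description, to estimate the relative contribution of each informant.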
- Abidin, R. R. (1995). Parenting Stress Index, third edition, professional manual. Odessa, FL: Psychological Assessment Resources.
- Achenbach, T. M. (1991a). Integrative guide for the 1991 CBCL/4-18, YSR, and TRF profiles. Burlington: University of Vermont, Department of Psychiatry.
- Achenbach, T. M. (1991b). Manual for the Child Behavior Checklist/4-18 and 1991 Profile. Burlington: University of Vermont, Department of Psychiatry.
- Achenbach, T. M. (1991c). Manual for the Teacher’s Report Form and 1991 Profile. Burlington: University of Vermont, Department of Psychiatry.
- Achenbach, T. M. (1991d). Manual for the Youth Self-Report and 1991 Profile. Burlington: University of Vermont, Department of Psychiatry.
- Achenbach, T. M. (1993). Empirically based taxonomy: How to use syndromes and profile types derived from the CBCL/4-18, TRF, and YSR. Burlington: University of Vermont, Department of Psychiatry.
- Achenbach, T. M., McConaughy, S. H., & Howell, C. T. (1987). Child/adolescent behavioral and emotional problems: Implications of cross-informant correlations for situational specificity. Psychological Bulletin, 101, 213–232.
- Ammerman, R. T., & Hersen, M. (1993). Developmental and longitudinal perspectives on behavior therapy. In R. T. Ammerman & M. Hersen (Eds.), Handbook of behavior therapy with children and adults (pp. 3–9). Boston: Allyn and Bacon.
- Archer, R. P. (1992). MMPI-A: Assessing adolescent psychopathology. Hillsdale, NJ: Erlbaum.
- Archer, R. P. (1997). Future directions for the MMPI-A: Research and clinical issues. Journal of Personality Assessment, 68, 95–109.
- Archer, R. P. (1999). Overview of the Minnesota Multiphasic Personality Inventory–Adolescent (MMPI-A). In M. E. Maruish (Ed.), The use of psychological testing for treatment planning and outcomes assessment (2nd ed., pp. 341–380). Mahwah, NJ: Erlbaum.
- Archer, R. P., Maruish, M., Imhof, E. A., & Piotrowski, C. (1991). Psychological test usage with adolescent clients: 1990 survey findings. Professional Psychology: Research and Practice, 22, 247–252.
- Archer, R. P., & Newsom, C. R. (2000). Psychological test usage with adolescent clients: Survey update. Assessment, 7, 227–235.
- Barrett, M. L., Berney, T. P., Bhate, S., Famuyiwa, O. O., Fundudis, T., Kolvin, I., & Tyrer, S. (1991). Diagnosing childhood depression. Who should be interviewed—parent or child? The Newcastle child depression project. British Journal of Psychiatry, 159 (Suppl. 11), 22–27.
- Beck, J. S., Beck, A. T., & Jolly, J. B. (2001). Beck Youth Inventories of Emotional and Social Impairment manual. San Antonio, TX: The Psychological Corporation.
- Bidaut-Russell, M., Reich, W., Cottler, L. B., Robins, L. N., Compton, W. M., & Mattison, R. E. (1995). The Diagnostic Interview Schedule for Children (PC-DISC v.3.0): Parents and adolescents suggest reasons for expecting discrepant answers. Journal of Abnormal Child Psychology, 23, 641–659.
- Biederman, J., Newcorn, J., & Sprich, S. (1991). Comorbidity of attention deficit hyperactivity disorder with conduct, depressive, anxiety, and other disorders. American Journal of Psychiatry, 148, 564–577.
- Brady, E. U., & Kendall, P. C. (1992). Comorbidity of anxiety and depression in children and adolescents. Psychological Bulletin, 111, 244–255.
- Brown, T. E. (2001). Brown Attention-Deficit Disorder Scales for Children and Adolescents manual. San Antonio, TX: The Psychological Corporation.
- Burisch, M. (1984). Approaches to personality inventory construction. American Psychologist, 39, 214–227.
- Burns, G. L., Walsh, J. A., Owen, S. M., & Snell, J. (1997). Internal validity of attention deficit hyperactivity disorder, oppositional defiant disorder, and overt conduct disorder symptoms in young children: Implications from teacher ratings for a dimensional approach to symptom validity. Journal of Clinical Child Psychology, 26, 266–275.
- Butcher, J. N., Williams, C. L., Graham, J. R., Archer, R. P., Tellegen, A., Ben-Porath, Y. S., & Kaemmer, B. (1992). Minnesota Multiphasic Personality Inventory–Adolescent: Manual for administration, scoring, and interpretation. Minneapolis: University of Minnesota Press.
- Cantwell, D. P. (1996). Attention deficit disorder: A review of the past 10 years. Journal of the American Academy of Child and Adolescent Psychiatry, 35, 978–987.
- Caron, C., & Rutter, M. (1991). Comorbidity in child psychopathology: Concepts, issues, and research strategies. Journal of Child Psychology and Psychiatry, 32, 1063–1080.
- Conners, C. K. (1997). Conners’ Rating Scales–Revised technical manual. North Tonawanda, NY: Multi-Health Systems.
- Conners, C. K. (1999). Conners’ Rating Scales–Revised. In M. E. Maruish (Ed.), The use of psychological testing for treatment planning and outcome assessment (2nd ed., pp. 467–495). Mahwah, NJ: Erlbaum.
- Cordell, A. (1998). Psychological assessment of children. In W. M. Klykylo, J. Kay, & D. Rube (Eds.), Clinical child psychiatry (pp. 12–41). Philadelphia: W. B. Saunders.
- Crystal, D. S., Ostrander, R., Chen, R. S., & August, G. J. (2001). Multimethod assessment of psychopathology among DSM-IV subtypes of children with attention-deficit/hyperactivity disorder: Self-, parent, and teacher reports. Journal of Abnormal Child Psychology, 29, 189–205.
- Drotar, D., Stein, R. E., & Perrin, E. C. (1995). Methodological issues in using the Child Behavior Checklist and its related instruments in clinical child psychology research. Journal of Clinical Child Psychology, 24, 184–192.
- Duhig, A. M., Renk, K., Epstein, M. K., & Phares, V. (2000). Interparental agreement on internalizing, externalizing, and total behavior problems: A meta-analysis. Clinical Psychology: Science and Practice, 7, 435–453.
- Edelbrock, C., Costello, A. J., Dulcan, M. K., Kalas, D., & Conover, N. (1985). Age differences in the reliability of the psychiatric interview of the child. Child Development, 56, 265–275.
- Exner, J. E., Jr., & Weiner, I. B. (1982). The Rorschach: A comprehensive system: Vol. 3. Assessment of children and adolescents. New York: Wiley.
- Finch, A. J., Lipovsky, J. A., & Casat, C. D. (1989). Anxiety and depression in children and adolescents: Negative affectivity or separate constructs. In P. C. Kendall & D. Watson (Eds.), Anxiety and depression: Distinctive and overlapping features (pp. 171–202). New York: Academic Press.
- Fitzgerald, H. E., Zucker, R. A., Maguin, E. T., & Reider, E. E. (1994). Time spent with child and parental agreement about preschool children’s behavior. Perceptual and Motor Skills, 79, 336–338.
- Flavell, J. H., Flavell, E. R., & Green, F. L. (2001). Development of children’s understanding of connections between thinking and feeling. Psychological Science, 12, 430–432.
- Forbes, G. B. (1985). The Personality Inventory for Children (PIC) and hyperactivity: Clinical utility and problems of generalizability. Journal of Pediatric Psychology, 10, 141–149.
- Fristad, M. A., Emery, B. L., & Beck, S. J. (1997). Use and abuse of the Children’s Depression Inventory. Journal of Consulting and Clinical Psychology, 65, 699–702.
- Gerard, A. B., & Reynolds, C. R. (1999). Characteristics and applications of the Revised Children’s Manifest Anxiety Scale (RCMAS). In M. E. Maruish (Ed.), The use of psychological testing for treatment planning and outcomes assessment (2nd ed., pp. 323–340). Mahwah, NJ: Erlbaum.
- Gianarris, W. J., Golden, C. J., & Greene, L. (2001). The Conners’ Parent Rating Scales: A critical review of the literature. Clinical Psychology Review, 21, 1061–1093.
- Gliner, J. A., Morgan, G. A., & Harmon, R. J. (2001). Measurement reliability. Journal of the American Academy of Child and Adolescent Psychiatry, 40, 486–488.
- Goyette, C. H., Conners, C. K., & Ulrich, R. F. (1978). Normative data on Revised Conners’ Parent and Teacher Rating Scales. Journal of Abnormal Child Psychology, 6, 221–236.
- Graham, J. R. (2000). MMPI-2: Assessing personality and psychopathology. New York: Oxford University Press.
- Handwerk, M. L., Larzelere, R. E., Soper, S. H., & Friman, P. C. (1999). Parent and child discrepancies in reporting severity of problem behaviors in three out-of-home settings. Psychological Assessment, 11, 14–23.
- Hathaway, S. R., & Monachesi, E. D. (1953). Analyzing and predicting juvenile delinquency with the MMPI. Minneapolis: University of Minnesota Press.
- Hinshaw, S. P., Lahey, B. B., & Hart, E. L. (1993). Issues of taxonomy and comorbidity in the development of conduct disorder. Development and Psychopathology, 5, 31–49.
- Hulbert, T. A., Gdowski, C. L., & Lachar, D. (1986). Interparent agreement on the Personality Inventory for Children: Are substantial correlations sufficient? Journal of Abnormal Child Psychology, 14, 115–122.
- Jensen, P. S., Martin, D., & Cantwell, D. P. (1997). Comorbidity in ADHD: Implications for research, practice, and DSM-IV. Journal of the American Academy of Child and Adolescent Psychiatry, 36, 1065–1079.
- Kamphaus, R. W., & Frick, P. J. (1996). Clinical assessment of child and adolescent personality and behavior. Boston: Allyn and Bacon.
- Kamphaus, R. W., & Frick, P. J. (2002). Clinical assessment of child and adolescent personality and behavior (2nd ed.). Boston: Allyn and Bacon.
- Kamphaus, R. W., & Reynolds, C. R. (1998). BASC Monitor for ADHD manual. Circle Pines, MN: American Guidance Service.
- King, N. J., Ollendick, T. H., & Gullone, E. (1991). Negative affectivity in children and adolescents: Relations between anxiety and depression. Clinical Psychology Review, 11, 441–459.
- Kovacs, M. (1992). Children’s Depression Inventory (CDI) manual. North Tonawanda, NY: Multi-Health Systems.
- Lachar, D. (1993). Symptom checklists and personality inventories. In T. R. Kratochwill & R. J. Morris (Eds.), Handbook of psychotherapy for children and adolescents (pp. 38–57). New York: Allyn and Bacon.
- Lachar, D. (1998). Observations of parents, teachers, and children: Contributions to the objective multidimensional assessment of youth. In A. S. Bellack, M. Hersen (Series Eds.), & C. R. Reynolds (Vol. Ed.), Comprehensive clinical psychology: Vol. 4. Assessment (pp. 371–401). New York: Pergamon Press.
- Lachar, D., & Gdowski, C. L. (1979). Actuarial assessment of child and adolescent personality: An interpretive guide for the Personality Inventory for Children profile. Los Angeles: Western Psychological Services.
- Lachar, D., & Gruber, C. P. (1993). Development of the Personality Inventory for Youth: A self-report companion to the Personality Inventory for Children. Journal of Personality Assessment, 61, 81–98.
- Lachar, D., & Gruber, C. P. (1995). Personality Inventory for Youth (PIY) manual: Administration and interpretation guide. Technical guide. Los Angeles: Western Psychological Services.
- Lachar, D., & Gruber, C. P. (2001). Personality Inventory for Children, Second Edition (PIC-2). Standard Form and Behavioral Summary manual. Los Angeles: Western Psychological Services.
- Lachar, D., Klinge, V., & Grisell, J. L. (1976). Relative accuracy of automated MMPI narratives generated from adult-norm and adolescent-norm profiles. Journal of Consulting and Clinical Psychology, 46, 1403–1408.
- Lachar, D., Morgan, S. T., Espadas, A., & Schomer, O. (2000, August). Effect of defensiveness on two self-report child adjustment inventories. Paper presented at the 108th annual meeting of the American Psychological Association, Washington, DC.
- Lachar, D., Randle, S. L., Harper, R. A., Scott-Gurnell, K. C., Lewis, K. R., Santos, C. W., Saunders, A. E., Pearson, D. A., Loveland, K. A., & Morgan, S. T. (2001). The Brief Psychiatric Rating Scale for Children (BPRS-C): Validity and reliability of an anchored version. Journal of the American Academy of Child and Adolescent Psychiatry, 40, 333–340.
- Lachar, D., Wingenfeld, S. A., Kline, R. B., & Gruber, C. P. (2000). Student Behavior Survey manual. Los Angeles: Western Psychological Services.
- LaGreca, A. M., Kuttler, A. F., & Stone, W. L. (2001). Assessing children through interviews and behavioral observations. In C. E. Walker & M. C. Roberts (Eds.), Handbook of clinical child psychology (3rd ed., pp. 90–110). New York: Wiley.
- Loeber, R., Green, S. M., & Lahey, B. B. (1990). Mental health professionals’ perception of the utility of children, mothers, and teachers as informants on childhood psychopathology. Journal of Clinical Child Psychology, 19, 136–143.
- Loeber, R., & Keenan, K. (1994). Interaction between conduct disorder and its comorbid conditions: Effects of age and gender. Clinical Psychology Review, 14, 497–523.
- Loeber, R., Lahey, B. B., & Thomas, C. (1991). Diagnostic conundrum of oppositional defiant disorder and conduct disorder. Journal of Abnormal Psychology, 100, 379–390.
- Loeber, R., & Schmaling, K. B. (1985). The utility of differentiating between mixed and pure forms of antisocial child behavior. Journal of Abnormal Child Psychology, 13, 315–336.
- Marmorstein, N. R., & Iacono, W. G. (2001). An investigation of female adolescent twins with both major depression and conduct disorder. Journal of the American Academy of Child and Adolescent Psychiatry, 40, 299–306.
- Maruish, M. E. (1999). The use of psychological testing for treatment planning and outcomes assessment (2nd ed.). Mahwah, NJ: Erlbaum.
- Maruish, M. E. (2002). Psychological testing in the age of managed behavioral health care. Mahwah, NJ: Erlbaum.
- Mash, E. J., & Lee, C. M. (1993). Behavioral assessment with children. In R. T. Ammerman & M. Hersen (Eds.), Handbook of behavior therapy with children and adults (pp. 13–31). Boston: Allyn and Bacon.
- Mash, E. J., & Terdal, L. G. (1997). Assessment of child and family disturbance: A behavioral-systems approach. In E. J. Mash & L. G. Terdal (Eds.), Assessment of childhood disorders (3rd ed., pp. 3–69). New York: Guilford Press.
- McArthur, D. S., & Roberts, G. E. (1982). Roberts Apperception Test for Children manual. Los Angeles: Western Psychological Services.
- McConaughy, S. H., & Achenbach, T. M. (1994). Comorbidity of empirically based syndromes in matched general population and clinical samples. Journal of Child Psychology and Psychiatry, 35, 1141–1157.
- McLaren, J., & Bryson, S. E. (1987). Review of recent epidemiological studies of mental retardation: Prevalence, associated disorders, and etiology. American Journal of Mental Retardation, 92, 243–254.
- McMahon, R. J. (1987). Some current issues in the behavioral assessment of conduct disordered children and their families. Behavioral Assessment, 9, 235–252.
- Merrell, K. W. (1994). Assessment of behavioral, social, and emotional problems: Direct and objective methods for use with children and adolescents. New York: Longman.
- Millon, T. (1993). Millon Adolescent Clinical Inventory (MACI) manual. Minneapolis: National Computer Systems.
- Moretti, M. M., Fine, S., Haley, G., & Marriage, K. (1985). Childhood and adolescent depression: Child-report versus parent-report information. Journal of the American Academy of Child Psychiatry, 24, 298–302.
- Morgan, G. A., Gliner, J. A., & Harmon, R. J. (2001). Measurement validity. Journal of the American Academy of Child and Adolescent Psychiatry, 40, 729–731.
- Naglieri, J. A., LeBuffe, P. A., & Pfeiffer, S. I. (1994). Devereux Scales of Mental Disorders manual. San Antonio, TX: The Psychological Corporation.
- Nanson, J. L., & Gordon, B. (1999). Psychosocial correlates of mental retardation. In V. L. Schwean & D. H. Saklofske (Eds.), Handbook of psychosocial characteristics of exceptional children (pp. 377–400). New York: Kluwer Academic/Plenum Publishers.
- Neeper, R., Lahey, B. B., & Frick, P. J. (1990). Comprehensive behavior rating scale for children. San Antonio, TX: The Psychological Corporation.
- Newman, F. L., Ciarlo, J. A., & Carpenter, D. (1999). Guidelines for selecting psychological instruments for treatment planning and outcome assessment. In M. E. Maruish (Ed.), The use of psychological testing for treatment planning and outcomes assessment (2nd ed., pp. 153–170). Mahwah, NJ: Erlbaum.
- Nottelmann, E. D., & Jensen, P. S. (1995). Comorbidity of disorders in children and adolescents: Developmental perspectives. In T. H. Ollendick & R. J. Prinz (Eds.), Advances in clinical child psychology (Vol. 17, pp. 109–155). New York: Plenum Press.
- Offord, D. R., Boyle, M. H., & Racine, Y. A. (1991). The epidemiology of antisocial behavior in childhood and adolescence. In D. J. Pepler & K. H. Rubin (Eds.), The development and treatment of childhood aggression (pp. 31–54). Hillsdale, NJ: Erlbaum.
- Pearson, D. A., & Lachar, D. (1994). Using behavioral questionnaires to identify adaptive deficits in elementary school children. Journal of School Psychology, 32, 33–52.
- Pearson, D. A., Lachar, D., Loveland, K. A., Santos, C. W., Faria, L. P., Azzam, P. N., Hentges, B. A., & Cleveland, L. A. (2000). Patterns of behavioral adjustment and maladjustment in mental retardation: Comparison of children with and without ADHD. American Journal on Mental Retardation, 105, 236–251.
- Phares, V. (1997). Accuracy of informants: Do parents think that mother knows best? Journal of Abnormal Child Psychology, 25, 165–171.
- Piotrowski, C., Belter, R. W., & Keller, J. W. (1998). The impact of “managed care” on the practice of psychological testing: Preliminary findings. Journal of Personality Assessment, 70, 441–447.
- Pisecco, S., Lachar, D., Gruber, C. P., Gallen, R. T., Kline, R. B., & Huzinec, C. (1999). Development and validation of disruptive behavior DSM-IV scales for the Student Behavior Survey (SBS). Journal of Psychoeducational Assessment, 17, 314–331.
- Pliszka, S. R. (1998). Comorbidity of attention-deficit/hyperactivity disorder with psychiatric disorder: An overview. Journal of Clinical Psychiatry, 59(Suppl. 7), 50–58.
- Reynolds, C. R., & Kamphaus, R. W. (1992). Behavior Assessment System for Children manual. Circle Pines, MN: American Guidance Service.
- Reynolds, C. R., & Richmond, B. O. (1985). Revised Children’s Manifest Anxiety Scale manual. Los Angeles: Western Psychological Services.
- Reynolds, W. M. (1998). Adolescent Psychopathology Scale (APS): Administration and interpretation manual. Psychometric and technical manual. Odessa, FL: Psychological Assessment Resources.
- Reynolds, W. M. (2000). Adolescent Psychopathology Scale–Short Form (APS-SF) professional manual. Odessa, FL: Psychological Assessment Resources.
- Roberts, M. C., & Hurley, L. (1997). Managing managed care. New York: Plenum Press.
- Sheldrick, R. C., Kendall, P. C., & Heimberg, R. G. (2001). The clinical significance of treatments: A comparison of three treatments for conduct disordered children. Clinical Psychology: Science and Practice, 8, 418–430.
- Sitarenios, G., & Kovacs, M. (1999). Use of the Children’s Depression Inventory. In M. E. Maruish (Ed.), The use of psychological testing for treatment planning and outcomes assessment (2nd ed., pp. 267–298). Mahwah, NJ: Erlbaum.
- Snyder, D. K. (1997). Manual for the Marital Satisfaction Inventory–Revised. Los Angeles: Western Psychological Services.
- Snyder, D. K., & Aikman, G. G. (1999). Marital Satisfaction Inventory–Revised. In M. E. Maruish (Ed.), The use of psychological testing for treatment planning and outcomes assessment (2nd ed., pp. 1173–1210). Mahwah, NJ: Erlbaum.
- Spengler, P. M., Strohmer, D. C., & Prout, H. T. (1990). Testing the robustness of the diagnostic overshadowing bias. American Journal on Mental Retardation, 95, 204–214.
- Voelker, S., Lachar, D., & Gdowski, C. L. (1983). The Personality Inventory for Children and response to methylphenidate: Preliminary evidence for predictive utility. Journal of Pediatric Psychology, 8, 161–169.
- Williams, C. L., Butcher, J. N., Ben-Porath, Y. S., & Graham, J. R. (1992). MMPI-A content scales: Assessing psychopathology in adolescents. Minneapolis: University of Minnesota Press.
- Wingenfeld, S. A., Lachar, D., Gruber, C. P., & Kline, R. B. (1998). Development of the teacher-informant Student Behavior Survey. Journal of Psychoeducational Assessment, 16, 226–249.
- Wrobel, T. A., Lachar, D., Wrobel, N. H., Morgan, S. T., Gruber, C. P., & Neher, J. A. (1999). Performance of the Personality Inventory for Youth validity scales. Assessment, 6, 367–376.
- Youngstrom, E., Loeber, R., & Stouthamer-Loeber, M. (2000). Patterns and correlates of agreement between parent, teacher, and male adolescent ratings of externalizing and internalizing problems. Journal of Consulting and Clinical Psychology, 68, 1038–1050.