Meeting of May 16, 2007 - on Employment Testing and Screening
Thank you for the invitation to speak to the Commission about systemic enforcement issues involving employee selection procedures. For twenty years, I have counseled employers across the United States regarding compliance with federal anti-discrimination laws, and represented employers in litigation in federal court. A substantial part of my practice involves advice and litigation with respect to the development, validation, and use of tests and other selection procedures, consistent with Title VII and the Uniform Guidelines on Employee Selection Procedures. Recently, I represented Ford Motor Company in a case brought by the EEOC and private plaintiffs, challenging a written test battery Ford used to select apprentices.
The Commission’s systemic enforcement programs serve an important role in enforcing compliance with federal anti-discrimination laws, by allocating the Commission’s limited resources to areas in which unlawful practices affect large numbers of applicants or employees. Voluntary or court-ordered resolution and remediation of such practices can benefit not only the immediately affected individuals, but also applicants and employees of other companies which may revise their practices in light of a systemic enforcement effort.
In some instances, employers’ use of tests may be an appropriate target of systemic investigation and enforcement efforts. Some tests are used in hiring processes resulting in hundreds or thousands of selections. An employer may be ignorant of the legal requirements, or turn a blind eye to whether a test is supported by evidence of job-relatedness. But in my experience, the vast majority of employers use testing for legitimate and nondiscriminatory business reasons. For instance, they use tests of ability to “screen out” candidates who are the most likely not to succeed in a job because they lack a particular skill or knowledge (such as mechanical or mathematical skills) that is critical or important to successful job performance. Sometimes, they use biodata or personality tests to “screen in” candidates who are the most likely to possess attributes or skills associated with successful job performance. It is in employers’ financial and operational interests to use selection tools that are truly job-related.
Many employers implement standardized tests and other selection procedures in an effort to promote equal employment opportunity. They are mindful that unfettered subjective decision-making can lead to allegations of a pattern or practice of employment discrimination, so they search for objective measures. Many employers spend very large sums of money to employ independent consultants or in-house industrial-organizational psychologists and other selection staff to conduct or supervise job analysis and validation projects, develop or purchase selection instruments, and analyze the results of their selection practices.
When these efforts come under agency or federal court scrutiny, it sometimes seems to these employers that “no good deed goes unpunished” – that is, that their efforts to eradicate subjective practices by adopting uniform and objective selection criteria may lead them, ironically, into a potential liability trap. Sometimes, they perceive that cases are prosecuted simply because adverse impact is alleged, and that their good-faith (and expensive) efforts to validate selection programs as job-related and consistent with business necessity are ignored or challenged based on tenuous arguments of technical non-compliance with complex and outdated standards for test validation.
While litigation enforcement proceedings in some instances will be necessary and appropriate to ensure compliance with the prohibition against discrimination, I respectfully suggest that the greatest good can be achieved for the greatest number of applicants and employees – as well as for conscientious employers and courts considering adverse impact claims – by concentrating the Commission’s efforts in two areas:
First, development of updated guidelines by which employers, their experts, and courts can gauge the effectiveness of test validation. Now approaching thirty years old, the Uniform Guidelines no longer reflect the current state of professional research and experience in the field of test development and validation, and they neither adequately assist the courts or parties in litigating selection issues nor adequately inform employers in designing validation strategies.
Second, development of better training of agency investigators and other personnel in proof of adverse impact claims, including the evidence sufficient to show the job-relatedness of a selection procedure, making it more likely that the employer practices targeted for enforcement proceedings are the ones that actually are unjustified.
Each of these areas is addressed below:
The Uniform Guidelines on Employee Selection Procedures (“Uniform Guidelines” or the “Guidelines”),2 and their accompanying Questions and Answers, provide guidance for Title VII compliance and for judicial resolution of testing-based claims. By incorporating a “single set of principles,” the Guidelines were designed “to provide a framework for determining the proper use of tests and other selection procedures.”3 Courts frequently cite to the Uniform Guidelines in deciding testing-based adverse impact claims.4 However, in my experience, the Uniform Guidelines now often confuse litigation of testing issues, because they are outdated, highly technical, and not clearly written. As a result, parties and courts face a Hobson’s choice: fail to follow the Guidelines, fail to apply current professional standards, or, even worse, apply them incorrectly.
Since their adoption in 1978, the Uniform Guidelines have neither been submitted for public comment nor been updated by the EEOC and other relevant federal agencies.5 In the same period, in contrast, the American Psychological Association’s Standards for Educational and Psychological Tests (the “Standards”) have been revised substantially several times, as have the Principles for the Validation and Use of Personnel Selection Procedures (the “Principles”), published by Division 14 of the APA, the Society for Industrial and Organizational Psychology (“SIOP”). While the Uniform Guidelines cite the APA Standards and other “generally accepted professional standards” as providing relevant guidance in administrative proceedings involving challenges to selection procedures,6 the Guidelines themselves no longer reflect current professional opinion regarding development and validation of employee selection procedures. Discussed below are a few examples of specific areas in which the Uniform Guidelines would benefit from an update.
The Guidelines (and related investigation and enforcement procedures) do not take into account post-1978 judicial guidance regarding the evidence an employer may offer to prove that a selection procedure is job-related. In Watson v. Fort Worth Bank & Trust,7 for instance, decided a decade after the Guidelines were adopted, the Supreme Court held that the adverse impact theory could be applied to subjective decision-making systems. Both the plurality and concurring opinions pointed out, however, that an employer’s burden under Title VII of proving that a selection procedure is job-related does not in every instance require evidence of a formal validation study as set forth in the Uniform Guidelines. The plurality opinion states, “In the context of subjective or discretionary employment decisions, the employer will often find it easier than in the case of standardized tests to produce evidence of a ‘manifest relationship to the employment in question.’”8 In a concurring opinion, three Justices agreed that the Uniform Guidelines “may sometimes not be effective in measuring the job-relatedness of subjective-selection processes,” and that the “proper means of establishing business necessity will vary with the type and size of the business in question, as well as the particular job for which the selection process is employed.”9 I respectfully suggest that the Uniform Guidelines should be revised to reflect the flexibility in validation methods and the limitations on the need for formal validation studies, particularly with respect to subjective selection criteria, that are described in Watson.
The Uniform Guidelines generally state that employers may use one (or more) of three principal validation strategies to examine the job-relatedness of a test or other selection procedure: criterion-related validation, content validation, or construct validation.10 The Uniform Guidelines state that construct validation is a new approach that is not yet fully developed, and therefore not readily accepted by EEOC on its own merits.11
Since the Guidelines were issued in 1978, industrial-organizational psychologists have developed substantial research into construct validity and associated approaches, and the limitations and cautions in the Uniform Guidelines are no longer necessary. Yet employers may be reluctant to use these validation approaches because those limitations remain in place. I respectfully suggest that the provisions of the Uniform Guidelines relating to construct validation should be updated to reflect the professional experience and research in this area since 1978.
Whenever a test or other selection procedure results in adverse impact, the Guidelines require evidence of validation,12 and express a general expectation that the required validation study will be conducted locally, that is, with respect to the specific employer’s workforce and/or applicant pool. For certain types of tests, however, particularly tests of cognitive ability, a substantial body of professional research now has concluded that evidence of validity can be generalized across a broad range of jobs. Using meta-analysis13 and other statistical techniques, research conducted by industrial-organizational psychologists demonstrates that, in some circumstances, evidence of the validity of a test as a predictive tool can be generalized from one employment setting to another without need for a local validation study to demonstrate job-relatedness. Two leading researchers in the field of validity generalization for cognitive ability testing are Frank Schmidt and John Hunter.14 Schmidt and Hunter have concluded that evidence of the validity of cognitive ability testing is not “situation-specific,” and can be generalized across employment settings, because cognitive ability is nearly universally relevant to and useful in predicting job performance.
While research regarding generalization of validity results for cognitive ability and other testing is not unanimous in its conclusions, it merits careful review and attention by the EEOC (and other enforcement agencies). The concept of validity generalization is discussed at length in the 2003 Principles, which include detailed consideration of a number of validity generalization approaches, including transportability,15 synthetic validity/job component validity,16 and meta-analysis.17
I respectfully suggest that the EEOC revise the Uniform Guidelines to reflect the current consensus of professional opinion and research with respect to validity generalization. Doing so would provide test users and courts with better and more current guidance about the situations in which the generalization of validity results is sufficient to establish the job-relatedness of selection tools. The Guidelines (and EEOC investigators) should not insist on local validation studies; as the Principles note, due to sample size and other psychometric considerations, generalized validity evidence can be more informative and relevant than the results of a local validity study.18 Particular attention should be given to research regarding the generalizability of the validity of cognitive ability tests and to the growing body of professional experience and research regarding synthetic validity/job component validity.
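To make the meta-analytic argument above concrete, the following is a minimal sketch in Python of a "bare-bones" calculation of the kind commonly associated with the validity generalization literature. It pools validity coefficients from several studies, weights them by sample size, and asks how much of the variation between studies could be explained by sampling error alone. The study data are hypothetical and invented purely for illustration; the function and variable names are assumptions, and the sketch omits the corrections for criterion unreliability and range restriction that a professional analysis would normally consider.

```python
def bare_bones_meta_analysis(studies):
    """Pool validity coefficients from (sample_size, correlation) pairs.

    Illustrative only: no corrections for unreliability or range restriction.
    """
    k = len(studies)
    total_n = sum(n for n, _ in studies)

    # Sample-size-weighted mean validity coefficient.
    mean_r = sum(n * r for n, r in studies) / total_n

    # Sample-size-weighted variance of the observed coefficients.
    var_observed = sum(n * (r - mean_r) ** 2 for n, r in studies) / total_n

    # Variance expected from sampling error alone, using the average sample size.
    mean_n = total_n / k
    var_sampling_error = (1 - mean_r ** 2) ** 2 / (mean_n - 1)

    # Variance left over after removing sampling error (floored at zero).
    var_residual = max(var_observed - var_sampling_error, 0.0)

    return {
        "mean_r": mean_r,
        "var_observed": var_observed,
        "var_sampling_error": var_sampling_error,
        "var_residual": var_residual,
    }


# Hypothetical local validation studies: (sample size, observed validity).
hypothetical_studies = [(68, 0.31), (120, 0.22), (45, 0.40), (210, 0.27), (95, 0.18)]

result = bare_bones_meta_analysis(hypothetical_studies)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

In this invented data set, the variance expected from sampling error exceeds the observed variance, so no residual variation is left to attribute to differences between settings. That is the pattern on which arguments against "situation-specific" validity rely; real data may of course behave differently.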
In connection with criterion-related validation studies, the Uniform Guidelines require test users to conduct tests of fairness where adverse impact exists and where performing a fairness study is “technically feasible.”19 “Unfairness” in a selection procedure is defined as occurring “[w]hen members of one race, sex, or ethnic group characteristically obtain lower scores on a selection procedure than members of another group, and the differences in scores are not reflected in differences in a measure of job performance.”20 The concept of “unfairness” is sometimes referred to as “differential validity” or “single group validity” – that is, whether scores on a test predict performance equally for both majority and minority subgroups.
When the Uniform Guidelines were written in 1978, the promulgating agencies noted that “the concept of fairness or unfairness of selection procedures is a developing concept.”21 Substantial professional research and experience since then has called into question this concept of “differential validity.” Some researchers view “differential validity” as another expression of the rejected situational validity theory. Particularly with regard to cognitive ability testing, these researchers have concluded that differences in validity between sub-groups (such as Caucasians and African-Americans) very rarely exist, and when they do they are very small and usually favor minority groups.22
Particularly for cognitive ability testing, the EEOC should consider the existing research regarding differential validity. As some courts have noted, given the considerable professional research in this area, the absence of a “fairness study” should not be deemed fatal to the validity evidence that an employer presents.23
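The fairness question described above can be sketched as a comparison of regression lines. The Python example below is an illustration under stated assumptions, not a prescribed method: it fits a separate line for each of two groups to compare slopes and intercepts, then fits a single common line and checks whether that line over- or under-predicts each group's performance on average. All scores, ratings, group labels, and function names are hypothetical. A professional analysis would typically use moderated regression with significance tests and much larger samples.

```python
def fit_line(scores, performance):
    """Ordinary least-squares fit of performance on test score.

    Returns (slope, intercept).
    """
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(performance) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, performance))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_residual(scores, performance, slope, intercept):
    """Average of (actual - predicted); a negative value means overprediction."""
    residuals = [y - (intercept + slope * x) for x, y in zip(scores, performance)]
    return sum(residuals) / len(residuals)


# Hypothetical test scores and job performance ratings for two groups.
group_a_scores = [50, 60, 70, 80, 90]
group_a_perf = [3.0, 3.5, 4.0, 4.5, 5.0]
group_b_scores = [40, 50, 60, 70, 80]
group_b_perf = [2.3, 2.8, 3.3, 3.8, 4.3]

# Separate lines per group: compare slopes and intercepts.
slope_a, intercept_a = fit_line(group_a_scores, group_a_perf)
slope_b, intercept_b = fit_line(group_b_scores, group_b_perf)

# One common line for everyone, as when a single cutoff or ranking is used.
common_slope, common_intercept = fit_line(
    group_a_scores + group_b_scores, group_a_perf + group_b_perf
)
resid_a = mean_residual(group_a_scores, group_a_perf, common_slope, common_intercept)
resid_b = mean_residual(group_b_scores, group_b_perf, common_slope, common_intercept)

print(f"group A: slope={slope_a:.3f}, intercept={intercept_a:.3f}, mean residual={resid_a:+.3f}")
print(f"group B: slope={slope_b:.3f}, intercept={intercept_b:.3f}, mean residual={resid_b:+.3f}")
```

In this invented example the two slopes are equal and only the intercepts differ, so the common line overpredicts performance for group B (a negative mean residual). That is the general pattern the research cited in this section describes; the numbers themselves carry no evidentiary weight.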
Although the Supreme Court recently clarified in Smith v. City of Jackson,24 that a claim for adverse impact on the basis of age is available under the Age Discrimination in Employment Act of 1967, the Uniform Guidelines are expressly inapplicable to age discrimination issues.25 The extension of adverse impact liability to age discrimination claims raises the questions of whether and how the adverse impact and validation principles in the Uniform Guidelines may be applied to age discrimination. These questions are complicated by the “reasonableness” burden of proof applicable to age discrimination adverse impact claims under Smith, which differs from the business necessity standard. These questions are further compounded by difficulties in measuring adverse impact based on age, which arise from the fact that age, unlike race, is not immutable. Additional enforcement guidance with respect to the extent, if any, to which the adverse impact and validation standards set forth in the Uniform Guidelines pertain to the age discrimination “reasonable factor other than age” inquiry would be beneficial.
My experience has been that EEOC has not pursued selection issues in cases that might have warranted further investigation, yet has pursued other cases in spite of extensive validity evidence. Often it seems that initial characterization of tests as measuring “cognitive ability” or “math skills” leads to assumptions about adverse impact that drive the result, rather than evaluation of actual adverse impact data or validity evidence. It appears that this could be addressed by additional or focused training.
Even where evidence of adverse impact is found at substantial levels, it alone is an insufficient basis on which to issue a cause determination or to bring or threaten enforcement proceedings.26 Under the applicable law, evidence of adverse impact should trigger an investigation of job-relatedness, not a cause determination.
I respectfully suggest that investigators should be trained to identify the various forms in which validation and other evidence of job-relatedness is often found, and to call for expert assistance from EEOC’s psychologists when testing issues arise. When adverse impact exists but substantial evidence of validation or job-relatedness is adduced, it is a waste of the Commission’s resources to proceed with enforcement proceedings. Just as investigators will consider an employer’s legitimate nondiscriminatory reasons for a decision that are offered in response to a charge alleging disparate treatment, investigators should be trained to request and review job validation evidence, with appropriate professional guidance.
In receiving and reviewing testing-related materials, investigators also should be trained to acknowledge their inherently confidential and proprietary nature. Investigator review of a test instrument itself (i.e., the actual test questions) is rarely necessary or helpful to an investigation of adverse impact or validity. Yet, disclosure of the test form to a charging party or member of the public (e.g., via FOIA) could threaten the test’s validity and usefulness and put at risk the tens or hundreds of thousands of dollars that employers typically invest in the development and validation of selection tools. Investigators therefore should be trained to request not a copy of the test itself, but technical reports of validation studies or other evidence of job-relatedness. Additionally, investigators should be trained to cooperate in the negotiation of appropriate confidentiality agreements and procedures for the review of job analysis and validation materials that often contain highly confidential and proprietary employer information.
The undertakings described here are certainly substantial, but they would be well worth the effort. Clarification of validation standards will not only assist employers in preparing selection procedures that are non-discriminatory, but will at the same time assist EEOC in focusing its efforts on selection procedures – affecting large numbers of people – that can reliably be shown not to be valid, consistent with current professional standards.
1. Partner, Paul, Hastings, Janofsky & Walker, LLP, 875 15th Street, N.W. Washington, D.C. 20005. Paul Hastings is an international law firm with over 1100 attorneys in offices throughout the United States, Europe, and Asia. Mr. Willner resides in Paul Hastings’ 130-attorney Washington, D.C. office. He completed both undergraduate and law school studies at the University of Virginia, where he was a member of the Order of the Coif and the Editorial Board of the Virginia Law Review. Mr. Willner, who serves as the Firm’s Professional Personnel Partner, represents employers in all aspects of employment law and litigation, including adverse impact and disparate treatment individual cases and class actions in federal and state courts, as well as enforcement proceedings before the Equal Employment Opportunity Commission and Office of Federal Contract Compliance Programs. Mr. Willner frequently counsels employers regarding employee selection procedures and other compliance and regulatory matters. Mr. Willner has spoken and written on a variety of employment law topics. Mr. Willner’s comments in this paper are his alone, and not attributable to any client or to the firm of Paul, Hastings, Janofsky & Walker, LLP.
2. 29 C.F.R. part 1607 (2006).
3. 29 C.F.R. § 1607.1(B).
4. E.g., Griggs v. Duke Power Co., 401 U.S. 424, 433-34 (1971) (Guidelines “entitled to great deference”); Albemarle Paper Co. v. Moody, 422 U.S. 405, 431 (1975) (same).
5. The Uniform Guidelines were promulgated by the EEOC, the Civil Service Commission, the Department of Justice, and the Department of Labor. 29 C.F.R. § 1607.1(A).
6. The Uniform Guidelines state that provisions “relating to validation of selection procedures are intended to be consistent with generally accepted professional standards for evaluating standardized tests and other selection procedures, such as those described in the [Standards] . . . and standard textbooks and journals in the field of personnel selection.” 29 C.F.R. § 1607.5(C). The Guidelines further state that “[n]ew strategies for showing the validity of selection procedures will be evaluated as they become accepted by the psychological profession.” Id. § 1607.5(A).
7. 487 U.S. 977 (1988).
8. 487 U.S. at 999.
9. Id. at 1006-07.
10. 29 C.F.R. § 1607.5(B).
11. “Construct validation is a relatively new and developing procedure in the employment field, and there is at present a lack of substantial literature extending the concept to employment practices. . . . Until such time as professional literature provides more guidance on the use of construct validity in employment situations, the Federal agencies will accept a claim of construct validity without a criterion-related study . . . only when the selection procedure has been used elsewhere in a situation in which a criterion-related study has been conducted and the use of a criterion-related validity study in this context meets the standards for transportability of criterion-related validity studies as set forth [in 29 C.F.R. § 1607.7].” 29 C.F.R. § 1607.14(D)(1), (4). Question and Answer 81 accompanying the Uniform Guidelines similarly refers to the “developing nature of construct validation for employment selection procedures.” After discussing the circumstances in which evidence of construct validity may be generalized, the Answer to Question 81 notes, “As further research and professional guidance on construct validity in employment situations emerge, additional extensions of construct validity for employee selection may become generally accepted in the profession. The agencies encourage further research and professional guidance with respect to the appropriate use of construct validity.”
12. 29 C.F.R. § 1607.3(A).
13. Meta-analysis “requires the accumulation of findings from a number of validity studies to determine the best estimates of the predictor-criterion relationship for the kinds of work domains and settings included in the studies. . . . Meta-analytic methods for demonstrating generalized validity are still evolving. Researchers should be aware of continuing research and critiques that may provide further refinement of the techniques as well as a broader range of predictor-criterion relationships to which meta-analysis has been applied.” Principles at 29-30.
14. See Frank L. Schmidt & John E. Hunter, Development of a General Solution to the Problem of Validity Generalization, 62 Journal of Applied Psychology 643-61 (1977).
15. The concept of “transportability” relates to “the use of a specific selection procedure in a new situation based on results of a validation research study conducted elsewhere.” Key considerations in determining the appropriateness of transportability “are, most prominently, job comparability in terms of content or requirements, as well as, possibly, similarity of job context and candidate group.” Principles at 27.
16. According to the Principles, synthetic validity, or job component validity, entails “the justification of the use of a selection procedure based upon the demonstrated validity of inferences from scores on the selection procedure with respect to one or more domains of work (job components). Thus, establishing synthetic validity/job component validity requires documentation of the relationship between the selection procedure and one or more specific domains of work (job components) within a single job or across different jobs. If the relationship between the selection procedure and the job component(s) is established, then the validity of the selection procedure for that job component may be generalizable to other situations in which the job components are comparable. The validity of a selection procedure may be established with respect to different domains (components) of work, then ‘synthesized’ (combined) for use based on the domains (or components) of work relevant for a given job or job family. . . . [D]etailed analysis of the work is required for use of this strategy of generalizing validity evidence.” Principles at 27-30.
17. Principles at 27-30.
18. Principles at 39, 43.
19. 29 C.F.R. § 1607.14(B)(8). The “technical feasibility” clause generally limits the requirement to conduct fairness studies to those employers with sufficiently large numbers of persons in each of the relevant sub-groups to perform the required statistical analyses. See 29 C.F.R. § 1607.14(B)(1).
22. See, e.g., Frank J. Landy, Psychology of Work Behavior 68-69 (“Tests predict job performance of a minority and the majority in the same way. The small departures from perfect fairness that exist actually favor minority groups.”). The 2003 Principles refer to the notion of differential validity reflected in the Guidelines under the rubric of “predictive bias.” Principles at 31-34. According to the Principles, “Predictive bias has been examined extensively in the cognitive ability domain. For White-African American and White-Hispanic comparisons, slope differences are rarely found; while intercept differences are not uncommon, they typically take the form of overprediction of minority group performance. . .” Id. at 32-33 (citations omitted).
23. See, e.g., Clady v. County of Los Angeles, 770 F.2d 1421, 1431 (9th Cir. 1985).
24. 544 U.S. 228 (2005).
25. 29 C.F.R. § 1607.2(D).
26. Investigators should receive training not only regarding the Commission’s “4/5 rule of thumb,” 29 C.F.R. § 1607.4(D), but also regarding sample size issues and alternative methods of statistical analysis that may be appropriate in various circumstances, especially where populations are large. Investigators also should be conversant with the Commission’s enforcement position that, where adverse impact in overall selections is not found, enforcement proceedings will not be initiated with respect to a subpart of the process. Id. § 1607.4(C).
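As a rough illustration of why footnote 26 pairs the “4/5 rule of thumb” with sample size issues, the Python sketch below computes the ratio of two selection rates alongside a simple two-proportion z-test. The applicant and selection counts are hypothetical, the function names are assumptions made for this example, and the z-test is only one of several statistical methods that might be appropriate in a given case.

```python
from math import erfc, sqrt

FOUR_FIFTHS = 0.8  # threshold used by the "4/5 rule of thumb"


def impact_ratio(sel_focal, n_focal, sel_ref, n_ref):
    """Selection rate of the focal group divided by that of the reference group."""
    return (sel_focal / n_focal) / (sel_ref / n_ref)


def two_proportion_z(sel_focal, n_focal, sel_ref, n_ref):
    """Pooled two-proportion z statistic and its two-sided p-value."""
    p_focal = sel_focal / n_focal
    p_ref = sel_ref / n_ref
    pooled = (sel_focal + sel_ref) / (n_focal + n_ref)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_focal + 1 / n_ref))
    z = (p_focal - p_ref) / std_err
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return z, p_value


# Hypothetical counts: (focal selected, focal applicants, reference selected, reference applicants)
small_pool = (12, 30, 48, 80)
large_pool = (4500, 10000, 5000, 10000)

for label, counts in (("small pool", small_pool), ("large pool", large_pool)):
    ratio = impact_ratio(*counts)
    z, p_value = two_proportion_z(*counts)
    flag = "below" if ratio < FOUR_FIFTHS else "at or above"
    print(f"{label}: ratio={ratio:.3f} ({flag} 4/5), z={z:.2f}, p={p_value:.4f}")
```

In these invented figures, the small pool falls below the four-fifths threshold although the difference is not statistically significant at the conventional 0.05 level, while the large pool satisfies the four-fifths threshold although the difference is highly significant. Neither measure alone tells the whole story, which is the point of the training recommendation.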
This page was last modified on May 11, 2007.