Education for Adult English Language Learners in the United States: Trends, Research, and Promising Practices

Part VI: Assessment and Accountability

Learner assessment is a priority in adult education. Many adult education programs use a variety of assessment tools to place learners in classes, inform instruction, evaluate learner progress, and report outcomes of instruction. These assessment tools include standardized tests, materials-based and teacher-made tests, portfolios, projects, and demonstrations. Needs assessment and goal-setting activities also play an important role in determining in what areas (e.g., language skills, content areas, functional life skills, literacy) the learner needs the most work.

State of the Field

The Workforce Investment Act of 1998 (WIA; Public Law 105–220) funds adult English as a second language (ESL) instruction through the U.S. Department of Education. WIA requires states to evaluate the performance of every local program according to outcome measures established under the National Reporting System (NRS) (U.S. Department of Education, 2007b). These outcomes include educational level advancement and follow-up goal achievement. States have the flexibility to choose which assessments and procedures they will use to measure these outcomes.

Upon enrollment in an adult ESL program, students place into one of six ESL educational functioning levels based on their pretest scores on an approved standardized assessment. Their progress through these levels is reported each year by state departments of education to the U.S. Department of Education, Office of Vocational and Adult Education (OVAE). Each state negotiates a target percentage of students at each educational functioning level that will advance at least one level (educational level gain) each year. A state can set different standards for different service providers or for different levels of proficiency. For example, the percentage of learners expected to move from the lowest proficiency level could be lower than the percentage expected to move from higher proficiency levels. This recognizes that a learner who enters a program with no literacy skills may require a great deal of instruction before achieving level gain.

Following the NRS state assessment policy guidelines, states identify standardized assessments and procedures that programs can use to determine learners’ functioning levels, establish timeframes for assessments to be given (either at specific times during the year or after a given number of hours of instruction), and train program staff to administer the assessments. Educational level gain in language and literacy is measured by pretesting students with an approved standardized assessment, then posttesting them with an equivalent form of the same assessment after a predetermined number of instructional hours or at the end of an instructional cycle. The minimum number of instructional hours recommended between pretesting and posttesting for NRS-approved assessments ranges from 40 to 120 hours. For reporting purposes, adult ESL programs must pretest and posttest all students who attend 12 or more hours of class annually.
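To make the arithmetic behind educational level gain concrete, the following sketch computes the percentage of a cohort advancing at least one educational functioning level between pretest and posttest. It is purely illustrative: the function name, the numeric level coding (1 = Beginning ESL Literacy through 6 = the highest level), and the sample cohort are hypothetical, not part of any official NRS reporting software.

```python
def level_gain_rate(records):
    """Percentage of learners who advance at least one educational
    functioning level.

    records: list of (pretest_level, posttest_level) tuples for learners
    who attended 12+ hours of class and completed both tests.
    """
    if not records:
        return 0.0
    advanced = sum(1 for pre, post in records if post > pre)
    return 100.0 * advanced / len(records)

# Hypothetical cohort of five learners: three advance at least one level.
cohort = [(1, 2), (2, 2), (3, 5), (4, 4), (5, 6)]
print(level_gain_rate(cohort))  # 60.0
```

A state's negotiated performance target for a given level would then be compared against this percentage for the learners who entered at that level.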

The six NRS ESL educational functioning levels are used to place adult learners based on their scores on an approved standardized assessment. The NRS ESL educational functioning level descriptors describe what students know and can do in speaking, listening, reading, writing, and functional and workplace skills at each level (see Appendix). These level descriptors focus on what students can do with the language in daily life outside the classroom. They are intended to provide examples that guide assessment and instruction but are not complete descriptions of all of the skills a student may possess at any given level. The descriptors were revised in 2006 to reflect the larger number of adult ESL learners at the lower levels and the need to show the progress of learners at these levels (Table 4).

Table 4. Original and Revised NRS Levels

Original (Program Year 1999–2000 to 2005–2006)    Revised (Program Year 2006–2007 to present)

Beginning ESL Literacy                            Beginning ESL Literacy
Beginning ESL                                     Beginning ESL Low
                                                  Beginning ESL High
Intermediate Low                                  Intermediate Low
Intermediate High                                 Intermediate High
Advanced Low                                      Advanced
Advanced High

Source. Adapted from U.S. Department of Education, 2007b.

The focus of the NRS is language proficiency, “the ability to use the language effectively and appropriately in real-life situations” (see Buck, Byrnes, & Thompson, 1989, p. 11). Unlike the assessment of achievement, the assessment of proficiency is not necessarily confined to measuring content knowledge that is taught in the classroom (Kenyon & Van Duzer, 2003).

The U.S. Department of Education (2007b) requires states and local adult education programs to “measure educational gain with standardized assessments that are appropriate within the NRS framework and conform to accepted psychometric standards for validity and reliability (e.g., Mislevy & Knowles, 2002). Assessments that measure educational gain should be designed to measure the development of basic English literacy and language skills through pre- and posttesting” (p. 23). Validity is the degree to which the information gained from an assessment supports the inferences or decisions that programs make about learners or the actions that they take as a result of that information. Reliability is the consistency of a measurement when the testing procedure is repeated on a population of individuals or groups (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999; Messick, 1989; for further discussion, see Kenyon & Van Duzer, 2003).

Assessments that are currently approved for use in one or more states for NRS reporting include BEST (Basic English Skills Test) Literacy, BEST Plus, CASAS (Comprehensive Adult Student Assessment Systems), CELSA (Combined English Language Skills Assessment), Compass ESL, REEP (Arlington Education and Employment Program) Writing Assessment, and TABE CLAS-E (TABE Complete Language Assessment System—English). New OVAE regulations require that adult ESL assessments be submitted and approved each year prior to being used for accountability reporting in the NRS.

Although educational gain is measured by the percentage of learners who move from level to level during the funding year, no research establishes how long it takes to advance from one NRS level to the next. Because it takes several years to learn a language well (Thomas & Collier, 1997), the time it takes to show level gain on a proficiency scale depends on both program and learner factors. Because of these factors, it has not been possible to determine the exact conditions (i.e., which combinations of learner and program factors) under which NRS level gains are achievable (Young, 2007).

The adult ESL field faces a number of challenges in the selection, use, and development of assessments for accountability reporting. Adult ESL staffing concerns, such as inexperienced instructors and volunteers, high teacher turnover rates, part-time and temporary employment, and limited professional development, may affect practitioners’ knowledge of assessment, its purposes, and its alignment with instruction. Program administrators may not know how to use assessment and NRS data to make decisions about instruction, program needs, and professional development. The students themselves may attend class sporadically, making it difficult for teachers to align instruction and assessment and to show educational gain for accountability.

The growing emphasis on alignment of assessments with course content adds another layer of complexity to test selection. The results of standardized assessments will have meaning to learners and teachers only if the test content is related to the goals and content of the instruction (Van Duzer & Berdán, 1999). If the items in a standardized test reflect the actual curriculum, then the test may accurately assess the achievement of the learners. However, if the items do not reflect what is covered in the classroom, the test may not adequately assess what learners know and can do.

There is also concern that standardized tests may not capture the incremental changes in learning that occur over short periods of instructional time. Test administration manuals usually recommend the minimum number of hours of instruction that should occur between pre- and posttesting, yet the learning that takes place within that time frame is dependent on the program and learner factors discussed above. In an effort to ensure that learners are tested and counted before they leave, program staff may be posttesting before adequate instruction has been given. In such cases, learners may not show enough progress to advance a level unless they pretested near the high end of the score ranges for a particular NRS level.


In response to the issues described above, staff at the Center for Applied Linguistics (CAL) conducted an exploratory study to examine the status of adult ESL assessment in the United States, particularly as it is implemented in federally funded adult ESL programs. The goals of the project were to identify the limitations that exist in available testing instruments for use in adult ESL programs and to provide recommendations regarding the need for assessments that measure adult English language learners’ growth in speaking, listening, reading, and writing in English. CAL staff worked with a panel of seven external advisors over a period of 18 months to meet these goals (Kenyon, Van Duzer, & Young, 2006).

Nineteen existing assessments and their accompanying materials were examined to evaluate the test characteristics as related to test construct, psychometric properties, usefulness, and logistics of implementation. This evaluation of the assessments (many of which were not widely used or standardized) pointed to the following needs to be addressed by test publishers so as to improve the available adult ESL assessment offerings in the United States:

  • Better and more explicit connections between test constructs and theories of second language acquisition
  • Test purposes, uses, and language constructs that are clearer and easier to operationalize
  • Demonstrated evidence of psychometric rigor in the test development process
  • Availability of equivalent parallel test forms and research to support the equivalence of existing forms
  • Consideration of logistical factors that may impede or invalidate test implementation or assessment results
  • Consideration of the potential role of technology in administering and scoring assessments

Overall, the review identified the need for more adult ESL assessments that cover a greater range of proficiency levels and language skills and that provide complete and well-researched links to the six NRS ESL educational functioning levels. However, NRS reporting is not the only purpose for adult ESL assessments. Adult English language learners want to know how they are progressing, teachers want feedback on the effectiveness of their instruction, program administrators need proof of success in meeting the goals of the program and the needs of the learners, and funding agencies must determine if their money is being well spent. A single assessment may not meet all of these needs. For example, an assessment that relates scores to broadly defined NRS proficiency levels and is useful for determining level gain may not provide diagnostic information related to mastery of specific knowledge and skills outlined in ESL content standards.

Promising Practices

The findings of the review and study described above were ultimately incorporated into a design plan (Kenyon, Van Duzer, & Young, 2006) with recommendations for the development of new adult ESL assessments and the revision of existing ones to bring them in line with the needs of the adult ESL field. The recommended promising practices related to the development and use of assessments are as follows. Adult ESL assessments must:

  • Meet standard psychometric requirements related to appropriateness, reliability, validity, standardization, bias review, and test development procedures, and meet OVAE requirements for test approval (see, e.g., U.S. Department of Education, 2006, p. 3).
  • Have a clear purpose and a defined construct, or “definitions of abilities that permit us to state specific hypotheses about how these abilities are or are not related to other abilities, and about the relationship between these abilities and observed behavior” (Bachman, 1990, p. 255), for the knowledge or language skill being assessed, within the context of the NRS. Tests used in this context and for this purpose are able to reliably show learner gains over a certain period of time when learners are pretested and posttested with an appropriate, valid, and reliable standardized assessment (Kenyon & Van Duzer, 2003).
  • Evaluate language proficiency in a performance-oriented, standardized way. Proficiency descriptors, such as the NRS ESL educational functioning levels, should provide information about content, structure, and quality for language use performance tasks to be developed, indicating a learner’s progress through or mastery of these levels. For each of the NRS functioning levels, tasks need to be developed and validated that would represent completion of each proficiency level. Scoring rubrics and guidelines for evaluating performance need to be in place.
  • Be useful for all stakeholders involved in teaching and learning through timely, clear, and accessible scoring, interpretation, and reporting of assessment results. Adult ESL program administrators and teachers should be able to read, understand, and make sound educational decisions based on assessment scores; provide useful feedback to learners about their progress that will allow them to identify their own strengths and weaknesses; and formulate goals and strategies for improvement.
  • Include documentation supporting the recommended number and intensity of instructional hours necessary to show learner progress, in order to inform state assessment policies, better prepare teachers for effective instruction, and ultimately provide better feedback to learners regarding their progress. If the assessment is used for NRS purposes, evidence must also be provided that the instrument can validly place students into one of the federally designated adult ESL educational functioning levels.
  • Be cost effective and incorporate an understanding of ESL program limitations in terms of funding, personnel, time, materials, logistics, and support.
  • Follow procedures to be carried out within the context of a comprehensive program evaluation plan. State and program staff, learners, and external stakeholders should work together to set goals for the program, develop measures to assess progress toward those goals, and identify how progress will be determined. A comprehensive plan allows learners to know how they are progressing, teachers to assess the effectiveness of instruction, administrators to monitor progress toward program goals and to gain feedback for program improvement, and external stakeholders to see the results of their investment (Holt & Van Duzer, 2000).
  • Consider the roles that technology can play in assessing learners. Such roles may include allowing content to be tailored to the learner’s background; item difficulty to be tailored to the learner’s skill level (e.g., an adaptive test); scoring to be automated (and thus reduce the risk of human error); and low-level literacy or visually impaired learners to be accommodated by alternative response mechanisms, such as touch-screen systems or larger fonts. Multimedia technology can make multiple input formats available to allow for more extensive assessment of all four language skills. Technology has the potential to assess knowledge and skills that cannot be measured by traditional paper and pencil tests. In addition, the use of technology may reduce the risk that construct-irrelevant factors, such as the size of printed words or unfamiliar response mechanisms like bubbling in response sheets, affect student performance on the assessment. Technology also allows for more flexibility in scheduling tests, web-based scoring, and new item assessment formats by influencing how results and relevant data are scored, transported, converted, and kept within an instructional program.
  • Be informed by a variety of perspectives, including new research into language learning processes, psychometrics, educational measurement, and revised or expanded curricular frameworks and instructional content areas.


As the field of adult ESL education continues to implement higher standards, assessment frameworks look not only at what students know about the language, but also at what they can do with it in everyday life. The United States has made progress since 1999 in creating a cohesive system of adult education through legislation, such as WIA, and frameworks, such as the NRS. At the same time, accountability requirements reflect the challenges of building such a system. For there to be a link among classroom instruction, adult learner proficiency in English, and NRS educational gain, standardized assessments that meet both learner and program needs and NRS accountability requirements must be developed and used.
