Resources

Key Assessment Terms

Accountability	Responsibility for educational outcomes; these outcomes are often measured through standardized testing.
Achievement test	A test that measures how well a student has reached the objectives of a specific course or program
ACTFL proficiency levels	Guidelines developed by the American Council on the Teaching of Foreign Languages (ACTFL) that describe language performance
Alternative assessment	Non-traditional forms of assessment; may include portfolios, observations, work samples, or group projects
Analytic scoring	Method of scoring or rating that assigns separate scores for different aspects of a student’s performance
Aptitude test	Test which measures a student’s talent for learning language; predicts future performance
Assessment	An ongoing process of setting clear goals for student learning and measuring progress towards these goals
Assessment literacy	Knowledge about and a thorough understanding of myriad assessment practices, especially by educators
Authenticity	How well a test reflects real-life situations
Cloze test	Test that measures comprehension by asking students to fill in missing words from a passage
Computer-adaptive test	Computer-based test that adapts to the test-taker’s performance and presents easier or more difficult tasks based on previous answers
Construct	What a test measures
Construct validity	How well a test measures what it is supposed to measure
Content validity	How well the content of a test reflects the construct that the test is measuring
Criterion-referenced	Scores interpreted with respect to standards or a theory of language; everyone can get a high score.
Cutoff score	On a criterion-referenced test, the minimum score a student must receive to demonstrate a determined level
Direct testing	Testing method that closely matches the construct being measured
Diagnostic test	Test that identifies a student’s strengths and weaknesses
Discrete test	Test focused on specific language skills
Evaluation	Making decisions based on the results of assessment
Face validity	Non-technical term that refers to how fair, reasonable, and authentic people perceive a test to be
Formative assessment	An assessment used during the course of instruction to provide feedback to the teacher and learner about the learner’s progress toward desired educational outcomes; the results of formative assessments are often used in planning subsequent instruction.
High-stakes test	Assessment that is used to make critical decisions with consequences for one or more stakeholders in the assessment process; an admissions test that determines the course of a student’s academic future and a test used for accountability and linked to funding are both examples of high-stakes tests.
Holistic scoring	Method of rating an assessment based on general descriptions of performance at specified levels; while a holistic scoring rubric may take into account performance along several dimensions (e.g., fluency, grammatical accuracy, and word choice for oral language), one overall score which best represents the examinee’s performance is assigned.
Impact	The positive or negative effects of testing
Indirect testing	A method of testing that measures abilities related to the construct being tested, rather than the construct itself
Input	The materials (presented aurally and visually) that an examinee receives as part of the test tasks
Integrative test	Test that addresses multiple language skills, sometimes in the same task
Multiple choice test	Test in which examinees demonstrate knowledge, skill, or ability by selecting a response from a list of possible answers
Needs assessment	Inquiry into the current state of knowledge, resources, or practice with the intent of taking action, making a decision, or providing a service with the results
Norm-referenced	Scores interpreted with respect to other examinees; some must score high, some low.
Off-the-shelf	Commercially-available test which can be purchased by an educational institution or individual user and administered at the discretion of the individual user
Parallel forms	Two or more tests with different questions that measure the same underlying skill and whose difficulty levels have been determined to be equivalent; scores from parallel versions of a test can be compared with one another.
Percentile	Range of measures from 1-99 used to compare examinees with one another; an examinee who scored in the 80th percentile placed higher than 80% of test takers.
Performance assessment	Assessment which requires the examinee to demonstrate knowledge or skill through activities that are often direct, active, and hands-on, such as giving a speech, performing a skit, or producing an artistic product
Placement test	Test whose results are used to assign students to classes designed for learners at a particular level
Practicality	Feasibility of a test given the available materials, funding, time, expertise, and staff
Proficiency test	Test of ability in a defined area of language; the area may be narrowly-defined (e.g., English for airline pilots) or more broad (e.g., social and academic language). Proficiency tests are not tied to a specific curriculum or course and are often contrasted with achievement tests.
Program evaluation	Process of collecting data from multiple sources about an instructional program or intervention and making a decision about the success of the program based on this information; the evaluation could target both the process and outcomes of the program.
Raw score	Student’s total number of correct responses on a test
Reliability	Consistency of scores/results
Scale score	Score that allows test results to be compared across students; in standardized testing, raw scores are often converted to scale scores.
Scoring method	Describes how scoring is accomplished (e.g., machine-scored, hand-scored, centrally scored, locally scored)
Scoring process	Describes the procedures used to obtain a test score, e.g., counting the number correct, scoring holistically or analytically according to established guidelines, a scale, or a rubric
Self-assessment	Personal rating of language ability according to specified criteria
Skills test	Test focusing on a specific domain of language use, e.g., listening, reading, writing or speaking (interactive or presentational)
Stakeholders	Persons involved with or invested in the testing process, e.g., test-takers, administrators, parents, and teachers/instructors
Standardized test	Test with fixed content, equivalent parallel forms, and standard administration and scoring that has been field-tested and shown to be valid and reliable
Subscore	Score that represents student performance in a particular domain or part of a test
Summative assessment	Outcome-based use of assessments, often for decisions such as grading, program evaluation, tracking, or accountability
Test accommodation	“Any change to a test or testing situation that addresses a unique need of the student but does not alter the construct being measured” (Center for Equity and Excellence in Education, 2006)
Test administration	Delivery of the test items/directions to the test-takers
Test development	Process of creating a test; steps of test development (Hughes, 2003): 1. State the goals of the test. 2. Write test specifications. 3. Write and revise items. 4. Try items with native speakers and accept/reject items. 5. Pilot with non-native speakers whose backgrounds are similar to those of the intended test-takers. 6. Analyze the trials and make necessary revisions. 7. Calibrate scales. 8. Validate. 9. Write the test administrator handbook and test materials. 10. Train staff as appropriate.
Test format	Mode and organization of test, test structure (e.g., multiple choice, short answer)
Test items	Tasks, questions, or prompts to which test-takers respond
Test materials	Items used for the test administration/taking
Test purpose	What you want to learn from the test results
Testing	Valid and reliable practice of language measurement for context-specific purposes
Validity	A judgment about whether a test is appropriate for a specific group and purpose; it includes considerations such as whether the test really measures what you think it is measuring, whether the results are similar to examinees’ performance on other tests or in class or real-world activities, and whether the use of test results has the intended effects.
Washback	Effects of test on teachers’ and students’ actions; washback can be positive (expected) or negative (unexpected, harmful).
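
Several of the scoring terms above (raw score, cutoff score, percentile) describe simple calculations. The following Python sketch illustrates one way they could be computed. It is only an illustration: the function names, the answer key, and the sample scores are hypothetical and do not come from any of the resources listed here, and operational testing programs may define percentiles differently.

```python
from bisect import bisect_left


def raw_score(responses, answer_key):
    """Raw score: the number of responses that match the answer key."""
    return sum(1 for given, correct in zip(responses, answer_key) if given == correct)


def meets_cutoff(score, cutoff):
    """Cutoff score: True if the score reaches the minimum required score."""
    return score >= cutoff


def percentile_rank(score, all_scores):
    """Percentage of examinees who scored strictly below `score`.

    Assumption: the result is rounded and clamped to the 1-99 range
    mentioned in the glossary entry for "Percentile".
    """
    ordered = sorted(all_scores)
    below = bisect_left(ordered, score)  # count of scores lower than `score`
    pct = round(100 * below / len(ordered))
    return max(1, min(99, pct))


# Hypothetical five-item multiple choice test and one student's answers.
answer_key = ["b", "a", "d", "c", "a"]
responses = ["b", "a", "c", "c", "a"]

# Hypothetical raw scores for a group of ten examinees.
group_scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

student_raw = raw_score(responses, answer_key)               # 4 correct
student_passes = meets_cutoff(student_raw, cutoff=3)         # True
student_pct = percentile_rank(student_raw, group_scores)     # 60
```

In this sketch the student answers 4 of 5 items correctly, meets a cutoff of 3, and places higher than 60% of the hypothetical group. The cutoff check is a criterion-referenced interpretation of the score, while the percentile is a norm-referenced one.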

References

Needs Assessment
Test Methods Worksheet
Test Selection Worksheet

Online & Print Resources

Online Resources

CAL Digests: A collection of brief reports on assessment and other relevant topics
Language Testing Resources: An online reference guide to language testing resources, open to all
International Language Testing Association (ILTA) Guidelines for Practice
Virtual Assessment Center: Learning modules about language assessment, from CARLA
Classroom Assessment Literacy Inventory: Adapted questionnaire to measure level of competence in testing and assessment
National Clearinghouse for English Language Acquisition & Language Instruction Educational Programs (NCELA): Information and resources related to support for English learners
Assessment books and training resources, by SERVE Center at UNC – Greensboro
Assessment Literacy: Video by Rick Stiggins
Michigan Assessment Consortium – Free webinar on assessment by Rick Stiggins

Print Resources

Book with in-depth information on measurement, language test uses and methods, reliability, and validity

Bachman, L. & Palmer, A. (2010). Language Assessment in Practice. Oxford: Oxford University Press.

A practical guide to developing your own classroom assessments

Brown, H. D., & Abeywickrama, P. (2010). Language assessment: Principles and classroom practices. White Plains, NY: Pearson Education.

A book which provides a thorough but accessible overview of foundational concepts in language testing

Hughes, A. (2003). Testing for language teachers (2nd ed.). Cambridge: Cambridge University Press.

Handbook which explains the principles of backward design for classroom assessment

Wiggins, G., & McTighe, J. (2005). Understanding by design (2nd ed.). Alexandria, VA: Association for Supervision and Curriculum Development.
