Standards-based assessment
The construction of such standards makes possible a concordance between
standardized test specifications and the goals and objectives of educational
programs. And so, in the broad domain of language arts, teachers and
educational administrators began the painstaking process of carefully examining
existing curricular goals, conducting needs assessments among students, and
designing appropriate assessments of those standards. A subfield of language
arts that is of increasing importance in the United States, with its millions
of non-native users of English, is English as a second language (ESL), also
known as English for speakers of other languages (ESOL), English language
learners (ELLs), and English language development (ELD).
ELD STANDARDS
The process of
designing and conducting appropriate periodic reviews of ELD standards involves
dozens of curriculum and assessment specialists, teachers, and researchers (Fields,
2000; Kuhlman, 2001). In creating such "benchmarks for accountability" (O'Malley
& Valdez Pierce, 1996), there is a tremendous responsibility to carry out a
comprehensive study of a number of domains:
·
Literally thousands of categories of
language ranging from phonology at one end of a continuum to discourse,
pragmatic, functional, and sociolinguistic elements at the other end;
·
Specification of what ELD students’
needs are, at thirteen different grade levels, for succeeding in their academic
and social development;
·
A consideration of what is a realistic
number and scope of standards to be included within a given curriculum;
·
A separate set of standards
(qualifications, expertise, training) for teachers to teach ELD students
successfully in their classrooms; and
·
A thorough analysis of the means
available to assess student attainment of those standards.
The
Listening and Speaking standards for English-language learners (ELLs) identify
a student's competency to understand the English language and to produce the
language orally. Students must be prepared to use English effectively in social
and academic settings. Listening and speaking skills provide one of the most
important building blocks for the foundation of second language acquisition.
These skills are essential for developing reading and writing skills in
English; however, to ensure that ELLs acquire proficiency in English listening,
speaking, reading, and writing, it is important that
students receive reading and writing instruction in English while they are
developing fluency in oral English.
To ensure that ELLs develop the skills and concepts needed to demonstrate
proficiency on the English-Language Arts (ELA) Listening and Speaking standards,
teachers must concurrently use both the ELD and the ELA standards. ELLs achieving
at the Advanced ELD proficiency level should demonstrate proficiency on the
ELA standards for their own and all prior grade levels. This means that all
prerequisite skills needed to achieve the ELA standards must be learned by the
Early Advanced ELD proficiency level. ELLs must develop both fluency in English
and proficiency on the ELA standards. Teachers must ensure that ELLs receive
instruction in listening and speaking that will enable them to demonstrate
proficiency on the ELA Speaking Applications standards.
ELD ASSESSMENT
The process of administering a comprehensive, valid, and fair assessment of ELD
students continues to be perfected. Stringent budgets within departments of
education worldwide predispose many in decision-making positions to rely on
traditional standardized tests for ELD assessment, but rays of hope lie in the
exploration of more student-centered approaches to learner assessment. Stack,
Stack, and Fern (2002), for example, reported on a portfolio assessment system
in the San Francisco Unified School District called the Language and Literacy
Assessment Rubric (LALAR), in which multiple forms of evidence of students'
work are collected. Teachers observe students year-round and record their
observations on scannable forms. The use of the LALAR system provides useful
data on students' performance at all grade levels for oral production, and for
reading and writing performance in elementary and middle school grades (1-8).
Further research is ongoing for high school levels (grades 9-12).
CASAS AND SCANS
At the higher levels of education (colleges, community colleges, adult schools,
language schools, and workplace settings), standards-based assessment systems
have also had an enormous impact. The Comprehensive Adult Student Assessment
System (CASAS), for example, is a program designed to provide broadly based
assessments of ESL curricula across the United States. The system includes more
than 80 standardized assessment instruments used to place learners in programs,
diagnose learners' needs, monitor progress, and certify mastery of functional
basic skills. CASAS assessment instruments are used to measure functional
reading, writing, listening, and speaking skills, and higher-order thinking
skills. CASAS scaled scores report learners' language ability levels in
employment and adult life skills contexts.
A similar set of standards compiled
by the U.S. Department of Labor, now known as the Secretary's Commission on
Achieving Necessary Skills (SCANS), outlines competencies necessary for
language in the workplace. The competencies cover
language functions in terms of
·
resources
(allocating time, materials, staff, etc.),
·
interpersonal
skills, teamwork, customer service, etc.,
·
information
processing, evaluating data, organizing files, etc.,
·
systems
(e.g., understanding social and organizational systems), and
·
technology
use and application
TEACHER STANDARDS
In
addition to the movement to create standards for learning, an equally strong
movement has emerged to design standards for teaching. Cloud (2001, p. 3)
noted that a student's "performance [on an assessment] depends on the
quality of the instructional program provided, . . . which depends on the
quality of professional development." Kuhlman (2001) emphasized the
importance of teacher standards in three domains:
1. linguistics and language development
2. culture and the interrelationship
between language and culture
3. planning and managing instruction.
Professional
teaching standards have also been the focus of several committees in the
international association of Teachers of English to Speakers of Other Languages
(TESOL).
How
to assess whether teachers have met standards remains a complex issue. Can
pedagogical expertise be assessed through a traditional standardized test? In
the first of Kuhlman's domains—linguistics and language development—knowledge
can perhaps be so evaluated, but the cultural and interactive characteristics
of effective teaching are less able to be appropriately assessed in such a
test. TESOL's standards committee advocates performance-based assessment of
teachers for the following reasons:
·
Teachers
can demonstrate the standards in their teaching.
·
Teaching
can be assessed through what teachers do with their learners in their
classrooms or virtual classrooms (their performance).
·
This
performance can be detailed in what are called "indicators": examples
of evidence that the teacher can meet a part of a standard.
·
The
processes used to assess teachers need to draw on complex evidence of
performance. In other words, indicators are more than simple "how to"
statements.
·
Performance-based
assessment of the standards is an integrated system. It is neither a checklist nor a series of discrete assessments.
·
Each
assessment within the system has performance criteria against which the
performance can be measured.
·
Performance
criteria identify to what extent the teacher meets the standard.
·
Student
learning is at the heart of the teacher's performance.
THE CONSEQUENCES OF STANDARDS-BASED AND STANDARDIZED TESTING
University admissions offices around the world have relied on the results of
tests such as the Scholastic Aptitude Test (SAT), the Graduate Record Exam (GRE),
and the TOEFL to screen applicants. The respectably moderate correlations
between these tests and academic performance are used to justify determining the
future of students' lives on the basis of one relatively inexpensive sit-down
multiple-choice test. Thus has emerged the term high-stakes testing, based on
the gate-keeping function that standardized tests perform.
Are the institutions that produce
and utilize high-stakes standardized tests justified in their decisions? An
impressive array of research would seem to say yes. Consider the fact that
correlations between TOEFL scores and academic performance in the first year of
college are impressively high (Henning & Cascallar, 1992). Are tests that
lack a high level of content validity appropriate assessments of ability? A
good deal of research says yes to this question as well. A study of the
correlation of TOEFL results with oral and written production, for example,
showed that years before TOEFL's current use of an essay and oral production
section, significant positive correlations were obtained between all
subsections of the TOEFL and independent direct measures of oral and written
production (Henning & Cascallar, 1992). Test promoters commonly use such
findings to support their claims for the efficacy of their tests.
But several nagging, persistent
issues emerge from the arguments about the consequences of standardized
testing. Consider the following interrelated questions:
·
Should
the educational and business world be satisfied with high but not perfect
probabilities of accurately assessing test-takers on standardized instruments?
In other words, what about the small minority who are not fairly assessed?
·
Regardless
of construct validation studies and correlation statistics, should further types
of performance be elicited in order to get a more comprehensive picture of the
test-taker?
·
Does
the proliferation of standardized tests throughout a young person's life give
rise to test-driven curricula, diverting the attention of students from
creative or personal interests and in-depth pursuits?
·
Is
the standardized test industry in effect promoting a cultural, social, and
political agenda that maintains existing power structures by assuring
opportunity to an elite (wealthy) class of people?
TEST BIAS
It is no secret that standardized tests involve a number of types of test
bias. That bias comes in many forms: language, culture, race, gender, and
learning styles (Medina & Neill, 1990). The National Center for Fair and
Open Testing, in its bimonthly newsletter Fair Test, every year offers dozens
of instances of claims of test bias from teachers, parents, students, and legal
consultants.
In an era when we seek to recognize the multiple intelligences present
within every student (Gardner, 1983, 1999), is it not likely that standardized
tests promote logical-mathematical and verbal-linguistic intelligences to the
virtual exclusion of the other contextualized, integrative intelligences? Only
very recently have traditionally receptive tests begun to include written and
oral production in their test battery—a positive sign. But is it enough? It is
also clear that many otherwise "smart" people do not perform well on
standardized tests. They may excel in cognitive styles that are not amenable to a
standardized format. Perhaps they need to be assessed by such performance-based
evaluation as interviews, portfolios, samples of work, demonstrations, and observation
reports? Perhaps, as Weir (2001, p. 122) suggested, learners and teachers
need to be given the freedom to choose more formative assessment rather than
the summative assessment inherent in standardized tests.
Expanding test batteries to include
such measures would help to solve the problem of test bias (which is extremely
difficult to control for in standardized items) and to account for the small
but significant number of test-takers who are not accurately assessed by standardized
tests. Those who are using the tests for gate-keeping purposes, with few if any
other assessments, would do well to consider multiple measures before
attributing infallible predictive power to standardized tests.
TEST-DRIVEN LEARNING AND TEACHING
Yet another consequence of standardized testing is the danger of test-driven
learning and teaching. When students and other test-takers know that one single
measure of performance will determine their lives, they are less likely to take
a positive attitude toward learning. The motives in such a context are almost
exclusively extrinsic, with little likelihood of stirring intrinsic interests.
Test-driven learning is a worldwide
issue. In Japan, Korea, and Taiwan, to name just a few countries,
students approaching their last year of
secondary school focus obsessively on passing the year-end college entrance
examination, a major section of which is English (Kuku, 2002). Little attention
is given to any topic or task that does not directly contribute to passing that
one exam. In the United States, high school seniors are forced to give almost
as much attention to SAT scores.
ETHICAL ISSUES: CRITICAL LANGUAGE TESTING
Shohamy (1997) and others (such as
Spolsky, 1997; Hamp-Lyons, 2001) see the ethics of testing as an extension of
what educators call critical pedagogy, or more precisely in this case, critical
language testing (see TBR Chapter 23, for some comments on critical language
pedagogy in general). Proponents of a critical approach to language testing
claim that large-scale standardized testing is not an unbiased process, but
rather is the "agent of cultural, social, political, educational, and
ideological agendas that shape the lives of individual participants, teachers,
and learners" (Shohamy, 1997, p. 3). The issues of critical
language testing are numerous:
·
Psychometric
traditions are challenged by interpretive, individualized procedures for
predicting success and evaluating ability.
·
Test
designers have a responsibility to offer multiple modes of performance to
account for varying styles and abilities among test-takers.
·
Tests
are deeply embedded in culture and ideology.
·
Test-takers
are political subjects in a political context.
These
issues are not new. More than a century ago, British educator F. Y. Edgeworth
(1888) challenged the potential inaccuracy of contemporary qualifying
examinations for university entrance. In recent years, the debate has heated
up. In 1997, an entire issue of the journal Language Testing was devoted to
questions about ethics in language testing.
One of the problems highlighted by the push for critical language testing is the
widespread conviction, already alluded to above, that carefully constructed
standardized tests designed by reputable test manufacturers are infallible in
their predictive validity. One standardized test is deemed to be sufficient;
follow-up measures are considered to be too costly.
A further problem with our test-oriented culture lies in the agendas of those who
design and those who utilize the tests. Tests are used in
some countries to deny citizenship (Shohamy, 1997, p. 10). Tests may by nature
be culture-biased and therefore may disenfranchise members of a nonmainstream
value system. Test givers are always in
a position of power over test-takers and therefore can impose social and
political ideologies on test-takers through
standards of acceptable and unacceptable items. Tests promote the notion that
answers to real-world problems have unambiguous right and wrong answers with no
shades of gray. A corollary to the latter is that tests presume to reflect an
appropriate core of common knowledge, such as the competencies reflected in the
standards discussed earlier in this chapter. Logic would therefore dictate that
the test-taker must buy in
to such a system of beliefs in order to make the cut.
Language tests, some may argue, are less susceptible than general-knowledge
tests to such sociopolitical overtones. The research process that undergirds the
TOEFL goes to great lengths to screen out Western cultural bias, monocultural
belief systems, and other potential agendas. Nevertheless, even the process of
the selection of content alone for the TOEFL involves certain standards that may
not be universal, and the fact that the TOEFL is used as an absolute standard of
English proficiency by most universities does not exonerate this particular
standardized test.
As a language teacher, you might be
able to exercise some influence in the ways tests are used and interpreted in your
own milieu. If you are offered a variety of choices in standardized tests, you
could choose a test that offers the least degree of cultural bias. Better yet, you
might encourage the use of multiple measures of performance (varying item types,
oral and written production, other alternatives to traditional assessment) even
though this might cost more money. Further, you and your co-teachers might help
establish an institutional system of evaluation that places less emphasis on
standardized tests and more emphasis on an ongoing process of formative evaluation.
In so doing, you might be offering educational opportunity to a few more people who
would otherwise be eliminated from contention.