Assessing Listening
OBSERVING THE PERFORMANCE OF THE FOUR SKILLS
Before focusing on listening itself, think about the two interacting concepts of
performance and observation. All language users perform the acts of listening,
speaking, reading, and writing. They of course rely on their underlying
competence in order to accomplish these performances. When you propose to assess
someone's ability in one or a combination of the four skills, you assess that
person's competence, but you observe the person's performance. Sometimes the
performance does not indicate true competence: a bad night's rest, illness, an
emotional distraction, test anxiety, a memory block, or other student-related
reliability factors could affect performance, thereby providing an unreliable
measure of actual competence.
THE IMPORTANCE OF LISTENING
Every teacher of language knows that one's oral production ability (other than
monologues, speeches, reading aloud, and the like) is only as good as one's
listening comprehension ability. But of even further impact is the likelihood
that input in the aural-oral mode accounts for a large proportion of successful
language acquisition. In a typical day, we do measurably more listening than
speaking (with the exception of one or two of your friends who may be nonstop
chatterboxes!). Whether in the workplace, educational, or home contexts, aural
comprehension far outstrips oral production in quantifiable terms of time,
number of words, effort, and attention.
BASIC TYPES OF LISTENING
From these stages we can derive four commonly identified types of listening
performance, each of which comprises a category within which to consider
assessment tasks and procedures.
1. Intensive.
Listening for perception of the components (phonemes, words, intonation,
discourse markers, etc.) of a larger stretch of language.
2. Responsive.
Listening to a relatively short stretch of language (a greeting, question,
command, comprehension check, etc.) in order to make an equally short response.
3. Selective.
Processing stretches of discourse such as short monologues for several minutes
in order to "scan" for certain information. The purpose of such
performance is not necessarily to look for global or general meanings, but to
be able to comprehend designated information in a context of longer stretches
of spoken language (such as classroom directions from a teacher, TV or radio
news items, or stories). Assessment tasks in selective listening could ask
students, for example, to listen for names, numbers, a grammatical category,
directions (in a map exercise), or certain facts and events.
4. Extensive.
Listening to develop a top-down, global understanding of spoken language.
Extensive performance ranges from listening to lengthy lectures to listening to
a conversation and deriving a comprehensive message or purpose. Listening for
the gist, for the main idea, and making inferences are all part of extensive
listening.
MICRO- AND MACROSKILLS OF LISTENING
A useful way of synthesizing the above two lists is to consider a finite number
of micro- and macroskills implied in the performance of listening
comprehension. Richards' (1983) list of microskills has proven useful in the
domain of specifying objectives for learning and may be even more useful in
forcing test makers to carefully identify specific assessment objectives.
Micro- and macroskills of listening (adapted from Richards, 1983)
Microskills
1. Discriminate among the distinctive sounds of English.
2. Retain chunks of language of different lengths in short-term memory.
3. Recognize English stress patterns, words in stressed and unstressed
positions, rhythmic structure, intonation contours, and their role in signaling
information.
4. Recognize reduced forms of words.
5. Distinguish word boundaries, recognize a core of words, and interpret word
order patterns and their significance.
6. Process speech at different rates of delivery.
7. Process speech containing pauses, errors, corrections, and other performance
variables.
8. Recognize grammatical word classes (nouns, verbs, etc.), systems (e.g.,
tense, agreement, pluralization), patterns, rules, and elliptical forms.
9. Detect sentence constituents and distinguish between major and minor
constituents.
10. Recognize that a particular meaning may be expressed in different
grammatical forms.
11. Recognize cohesive devices in spoken discourse.
Macroskills
12. Recognize the communicative functions of utterances, according to
situations, participants, goals.
13. Infer situations, participants, goals using real-world knowledge.
14. From events, ideas, and so on, described, predict outcomes, infer links and
connections between events, deduce causes and effects, and detect such
relations as main idea, supporting idea, new information, given information,
generalization, and exemplification.
15. Distinguish between literal and implied meanings.
16. Use facial, kinesic, body language, and other nonverbal clues to decipher
meanings.
17. Develop and use a battery of listening strategies, such as detecting key
words, guessing the meaning of words from context, appealing for help, and
signaling comprehension or lack thereof.
DESIGNING ASSESSMENT TASKS: INTENSIVE LISTENING
RECOGNIZING PHONOLOGICAL AND MORPHOLOGICAL ELEMENTS
A typical form of intensive listening at this level is the assessment of
recognition of phonological and morphological elements of language. A classic
test task gives a spoken stimulus and asks test-takers to identify the stimulus
from two or more choices.
Paraphrase Recognition
The next step up on the scale of listening comprehension microskills is words,
phrases, and sentences, which are frequently assessed by providing a stimulus
sentence and asking the test-taker to choose the correct paraphrase from a
number of choices.
DESIGNING ASSESSMENT TASKS: RESPONSIVE LISTENING
A question-and-answer format can provide some interactivity in these lower-end
listening tasks. The test-taker's response is the appropriate answer to a
question. The objective of this item is recognition of the wh-question how much
and its appropriate response. Distractors are chosen to represent common
learner errors: (a) responding to how much vs. how much longer; (c) confusing
how much in reference to time vs. the more frequent reference to money; (d)
confusing a wh-question with a yes/no question.
None of the tasks so far discussed have to be framed in a multiple-choice
format. They can be offered in a more open-ended framework in which test-takers
write or speak the response.
DESIGNING ASSESSMENT TASKS: SELECTIVE LISTENING
Listening Cloze
Listening cloze tasks (sometimes called cloze dictations or partial dictations)
require the test-taker to listen to a story, monologue, or conversation and
simultaneously read the written text in which selected words or phrases have
been deleted. Cloze procedure is most commonly associated with reading only.
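As a rough illustration of how the written side of a listening cloze might be prepared, the sketch below blanks out chosen words or phrases in a transcript and keeps an answer key. The function name make_listening_cloze, the blank format, and the sample transcript are assumptions made for this example only; they do not come from the source text.

```python
import re


def make_listening_cloze(transcript, targets):
    """Blank out each target word or phrase in the transcript.

    Returns the gapped text plus an answer key, both in listening order.
    This is an illustrative helper, not a published procedure.
    """
    # Find every occurrence of every target, ignoring case, on word boundaries.
    matches = []
    for target in targets:
        pattern = re.compile(r"\b" + re.escape(target) + r"\b", re.IGNORECASE)
        matches.extend(pattern.finditer(transcript))
    # Sort by position so the blanks are numbered in the order they are heard.
    matches.sort(key=lambda m: m.start())

    pieces, answer_key, cursor = [], [], 0
    for m in matches:
        if m.start() < cursor:
            # Skip a target that overlaps a blank already created.
            continue
        pieces.append(transcript[cursor:m.start()])
        answer_key.append(m.group(0))
        pieces.append(f"({len(answer_key)}) ______")
        cursor = m.end()
    pieces.append(transcript[cursor:])
    return "".join(pieces), answer_key


if __name__ == "__main__":
    # Hypothetical transcript; the deleted items here are a time and a gate number.
    gapped, key = make_listening_cloze(
        "The flight leaves at nine thirty from gate twelve.",
        ["gate twelve", "nine thirty"],
    )
    print(gapped)
    print(key)
```

Choosing the targets by hand, rather than deleting every nth word, keeps the deletions tied to the information the task is meant to test (names, numbers, facts, and so on).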
Information Transfer
The objective of this task is to test prepositions and prepositional phrases of
location (at the bottom, on top of, around, along with larger, smaller), so
other words and phrases such as back yard, yesterday, last few seeds, and scare
away are supplied only as context and need not be tested.
Sentence Repetition
The task of simply repeating a sentence or a partial sentence, or sentence
repetition, is also used as an assessment of listening comprehension. As in a
dictation (discussed below), the test-taker must retain a stretch of language
long enough to reproduce it, and then must respond with an oral repetition of
that stimulus. Incorrect listening comprehension, whether at the phonemic or
discourse level, may be manifested in the correctness of the repetition.
DESIGNING ASSESSMENT TASKS: EXTENSIVE LISTENING
Dictation
Dictation is a widely researched genre of assessing listening comprehension. In
a dictation, test-takers hear a passage, typically of 50 to 100 words, recited
three times: first, at normal speed; then, with long pauses between phrases or
natural word groups, during which time test-takers write down what they have
just heard; and finally, at normal speed once more so they can check their work
and proofread. Here is a sample dictation at the intermediate level of English.
The difficulty of a dictation task can be easily manipulated by the length of
the word groups (or bursts, as they are technically called), the length of the
pauses, the speed at which the text is read, and the complexity of the
discourse, grammar, and vocabulary used in the passage.
Scoring is another matter. Depending on your context and purpose in
administering a dictation, you will need to decide on scoring criteria for
several possible kinds of errors:
· spelling error only, but the word appears to have been heard correctly
· spelling and/or obvious misrepresentation of a word, illegible word
· grammatical error (for example, test-taker hears I can't do it, writes I can
do it)
· skipped word or phrase
· permutation of words
· additional words not in the original
· replacement of a word with an appropriate synonym
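To make a few of these error categories concrete, here is a minimal sketch that aligns a test-taker's written response with the original passage and counts skipped, additional, and replaced words. It uses Python's standard difflib module; the function name tally_dictation_errors and the three-way tally are assumptions for illustration. Telling a spelling slip from a grammatical error or an acceptable synonym still needs a human rater, so every substituted word is simply counted as "replaced".

```python
from difflib import SequenceMatcher


def _words(text):
    """Lowercase the text and split it into words, dropping edge punctuation."""
    stripped = (w.strip(".,;:!?\"()") for w in text.lower().split())
    return [w for w in stripped if w]


def tally_dictation_errors(reference, response):
    """Count word-level differences between the passage and the response.

    Illustrative only: the categories are a simplification of the list above.
    """
    ref, res = _words(reference), _words(response)
    counts = {"skipped": 0, "additional": 0, "replaced": 0}
    matcher = SequenceMatcher(None, ref, res, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":
            # Words in the passage that the test-taker left out.
            counts["skipped"] += i2 - i1
        elif tag == "insert":
            # Words the test-taker wrote that are not in the passage.
            counts["additional"] += j2 - j1
        elif tag == "replace":
            # Misspellings, mishearings, and synonyms all land here.
            counts["replaced"] += max(i2 - i1, j2 - j1)
    return counts


if __name__ == "__main__":
    print(tally_dictation_errors("I can't do it.", "I can do it"))
```

How each count is weighted in a final score is a separate decision that depends on the context and purpose of the dictation.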
Communicative Stimulus-Response Tasks
Assessment tasks in which the test-taker is presented with a stimulus monologue
or conversation and then is asked to respond to a set of comprehension
questions are commonly used in commercially produced proficiency tests. The
monologues, lectures, and brief conversations used in such tasks are sometimes
a little contrived, and certainly the subsequent multiple-choice questions
don't mirror communicative, real-life situations.
Authentic Listening Tasks
Ideally, the language assessment field would have a stockpile of listening test
types that are cognitively demanding, communicative, and authentic, not to
mention interactive by means of an integration with speaking. However, the
nature of a test as a sample of performance and a set of tasks with limited
time frames implies an equally limited capacity to mirror all the real-world
contexts of listening performance.
Assessing Speaking
BASIC TYPES OF SPEAKING
1. Imitative.
At one end of a continuum of types of speaking performance is the ability to
simply parrot back (imitate) a word or phrase or possibly a sentence. While
this is a purely phonetic level of oral production, a number of prosodic,
lexical, and grammatical properties of language may be included in the
criterion performance. We are interested only in what is traditionally labeled
"pronunciation"; no inferences are made about the test-taker's ability
to understand or convey meaning or to participate in an interactive
conversation. The only role of listening here is in the short-term storage of a
prompt, just long enough to allow the speaker to retain the short stretch of
language that must be imitated.
2. Intensive.
A second type of speaking frequently employed in assessment contexts is the
production of short stretches of oral language designed to demonstrate
competence in a narrow band of grammatical, phrasal, lexical, or phonological
relationships (such as the prosodic elements of intonation, stress, rhythm, and
juncture). The speaker must be aware of semantic properties in order to be able
to respond, but interaction with an interlocutor or test administrator is
minimal at best. Examples of intensive assessment tasks include directed
response tasks, reading aloud, sentence and dialogue completion; limited
picture-cued tasks including simple sequences; and translation up to the simple
sentence level.
3. Responsive.
Responsive assessment tasks include interaction and test comprehension but at
the somewhat limited level of very short conversations, standard greetings and
small talk, simple requests and comments, and the like.
4. Interactive.
The difference between responsive and interactive speaking is in the length and
complexity of the interaction, which sometimes includes multiple exchanges
and/or multiple participants. Interaction can take the two forms of
transactional language, which has the purpose of exchanging specific
information, or interpersonal exchanges, which have the purpose of maintaining
social relationships. (In the three dialogues cited above, A and B were
transactional, and C was interpersonal.) In interpersonal exchanges, oral
production can become pragmatically complex with the need to speak in a casual
register and use colloquial language, ellipsis, slang, humor, and other
sociolinguistic conventions.
5. Extensive (monologue).
Extensive oral production tasks include speeches, oral presentations, and
story-telling, during which the opportunity for oral interaction from listeners
is either highly limited (perhaps to nonverbal responses) or ruled out
altogether. Language style is frequently more deliberative (planning is
involved) and formal for extensive tasks, but we cannot rule out certain
informal monologues such as casually delivered speech (for example, my vacation
in the mountains, a recipe for outstanding pasta primavera, recounting the plot
of a novel or movie).
MICRO- AND MACROSKILLS OF SPEAKING
Micro- and macroskills of oral production
Microskills
1. Produce differences among English phonemes and allophonic variants.
2. Produce chunks of language of different lengths.
3. Produce English stress patterns, words in stressed and unstressed positions,
rhythmic structure, and intonation contours.
4. Produce reduced forms of words and phrases.
5. Use an adequate number of lexical units (words) to accomplish pragmatic
purposes.
6. Produce fluent speech at different rates of delivery.
7. Monitor one's own oral production and use various strategic devices (pauses,
fillers, self-corrections, backtracking) to enhance the clarity of the message.
8. Use grammatical word classes (nouns, verbs, etc.), systems (e.g., tense,
agreement, pluralization), word order, patterns, rules, and elliptical forms.
9. Produce speech in natural constituents: in appropriate phrases, pause
groups, breath groups, and sentence constituents.
10. Express a particular meaning in different grammatical forms.
11. Use cohesive devices in spoken discourse.
Macroskills
12. Appropriately accomplish communicative functions according to situations,
participants, and goals.
13. Use appropriate styles, registers, implicature, redundancies, pragmatic
conventions, conversation rules, floor-keeping and -yielding, interrupting, and
other sociolinguistic features in face-to-face conversations.
14. Convey links and connections between events and communicate such relations
as focal and peripheral ideas, events and feelings, new information and given
information, generalization and exemplification.
15. Convey facial features, kinesics, body language, and other nonverbal cues
along with verbal language.
16. Develop and use a battery of speaking strategies, such as emphasizing key
words, rephrasing, providing a context for interpreting the meaning of words,
appealing for help, and accurately assessing how well your interlocutor is
understanding you.
DESIGNING ASSESSMENT TASKS: IMITATIVE SPEAKING
An occasional phonologically focused repetition task is warranted as long as
repetition tasks are not allowed to occupy a dominant role in an overall oral
production assessment, and as long as you artfully avoid a negative washback
effect. Such tasks range from word level to sentence level, usually with each
item focusing on a specific phonological criterion. In a simple repetition
task, test-takers repeat the stimulus, whether it is a pair of words, a
sentence, or perhaps a question (to test for intonation production).
DESIGNING ASSESSMENT TASKS: INTENSIVE SPEAKING
Directed Response Tasks
In this type of task, the test administrator elicits a particular grammatical
form or a transformation of a sentence. Such tasks are clearly mechanical and
not communicative, but they do require minimal processing of meaning in order
to produce the correct grammatical output.
Read-Aloud Tasks
Intensive reading-aloud tasks include reading beyond the sentence level up to a
paragraph or two. This technique is easily administered by selecting a passage
that incorporates test specs and by recording the test-taker's output; the
scoring is relatively easy because all of the test-taker's oral production is
controlled. Because of the results of research on the PhonePass test, reading
aloud may actually be a surprisingly strong indicator of overall oral
production ability.
Sentence/Dialogue Completion Tasks and Oral Questionnaires
Another technique for targeting intensive aspects of language requires
test-takers to read dialogue in which one speaker's lines have been omitted.
Test-takers are first given time to read through the dialogue to get its gist
and to think about appropriate lines to fill in. Then as the tape, teacher, or
test administrator produces one part orally, the test-taker responds.
Picture-Cued Tasks
One of the more popular ways to elicit oral language performance at both
intensive and extensive levels is a picture-cued stimulus that requires a
description from the test-taker. Pictures may be very simple, designed to
elicit a word or a phrase; somewhat more elaborate and "busy"; or composed of a
series that tells a story or incident.
Translation (of Limited Stretches of Discourse)
Translation is a part of our tradition in language teaching that we tend to
discount or disdain, if only because our current pedagogical stance plays down
its importance. Translation methods of teaching are certainly passé in an era
of direct approaches to creating communicative classrooms. But we should
remember that in countries where English is not the native or prevailing
language, translation is a meaningful communicative device in contexts where
the English user is called on to be an interpreter. Also, translation is a
well-proven communication strategy for learners of a second language.
DESIGNING ASSESSMENT TASKS: RESPONSIVE SPEAKING
Assessment of responsive tasks involves brief interactions with an
interlocutor, differing from intensive tasks in the increased creativity given
to the test-taker and from interactive tasks by the somewhat limited length of
utterances.
Question and Answer
Question-and-answer tasks can consist of one or two questions from an
interviewer, or they can make up a portion of a whole battery of questions and
prompts in an oral interview. They can vary from simple questions like "What is
this called in English?" to complex questions like "What are the steps
governments should take, if any, to stem the rate of deforestation in tropical
countries?" The first question is intensive in its purpose; it is a display
question intended to elicit a predetermined correct response. We have already
looked at some of these types of questions in the previous section. Questions
at the responsive level tend to be genuine referential questions in which the
test-taker is given more opportunity to produce meaningful language in
response.
Giving Instructions and Directions
We are all called on in our daily routines to read instructions on how to
operate an appliance, how to put a bookshelf together, or how to create a
delicious clam chowder. Somewhat less frequent is the mandate to provide such
instructions orally, but this speech act is still relatively common. Using such
a stimulus in an assessment context provides an opportunity for the test-taker
to engage in a relatively extended stretch of discourse, to be very clear and
specific, and to use appropriate discourse markers and connectors. The
technique is simple: the administrator poses the problem, and the test-taker
responds. Scoring is based primarily on comprehensibility and secondarily on
other specified grammatical or discourse categories. Here are some
possibilities.
Paraphrasing
Another type of assessment task that can be categorized as responsive asks the
test-taker to read or hear a limited number of sentences (perhaps two to five)
and produce a paraphrase of those sentences.
TEST OF SPOKEN ENGLISH (TSE)
The tasks on the TSE are designed to elicit oral production in various
discourse categories rather than in selected phonological, grammatical, or
lexical targets. The following content specifications for the TSE represent the
discourse and pragmatic contexts assessed in each administration:
1. Describe something physical.
2. Narrate from presented material.
3. Summarize information of the speaker's own choice.
4. Give directions based on visual materials.
5. Give instructions.
6. Give an opinion.
7. Support an opinion.
8. Compare/contrast.
9. Hypothesize.
10. Function "interactively."
11. Define.
DESIGNING ASSESSMENT TASKS: INTERACTIVE SPEAKING
The final two categories of oral production assessment (interactive and
extensive speaking) include tasks that involve relatively long stretches of
interactive discourse (interviews, role plays, discussions, games) and tasks of
equally long duration but that involve less interaction (speeches, telling
longer stories, and extended explanations and translations).
Interview
When "oral production assessment" is mentioned, the first thing that comes to
mind is an oral interview: a test administrator and a test-taker sit down in a
direct face-to-face exchange and proceed through a protocol of questions and
directives. The interview, which may be tape-recorded for re-listening, is then
scored on one or more parameters such as accuracy in pronunciation and/or
grammar, vocabulary usage, fluency, sociolinguistic/pragmatic appropriateness,
task accomplishment, and even comprehension.
Role Play
Role playing is a popular pedagogical activity in communicative
language-teaching classes. Within constraints set forth by the guidelines, it
frees students to be somewhat creative in their linguistic output. In some
versions, role play allows some rehearsal time so that students can map out
what they are going to say. And it has the effect of lowering anxieties as
students can, even for a few moments, take on the persona of someone other than
themselves.
Discussions and Conversations
As formal assessment devices, discussions and conversations with and among
students are difficult to specify and even more difficult to score. But as
informal techniques to assess learners, they offer a level of authenticity and
spontaneity that other assessment techniques may not provide. Discussions may
be especially appropriate tasks through which to elicit and observe such
abilities as
· topic nomination, maintenance, and termination;
· attention getting, interrupting, floor holding, control;
· clarifying, questioning, paraphrasing;
· comprehension signals (nodding, "uh-huh," "hmm," etc.);
· negotiating meaning;
· intonation patterns for pragmatic effect;
· kinesics, eye contact, proxemics, body language; and
· politeness, formality, and other sociolinguistic factors.
Games
Among informal assessment devices are a variety of games that directly involve
language production. Consider the following types:
1. "Tinkertoy" game: A Tinkertoy (or Lego block) structure is built behind a
screen. One or two learners are allowed to view the structure. In successive
stages of construction, the learners tell "runners" (who can't observe the
structure) how to re-create the structure. The runners then tell "builders"
behind another screen how to build the structure. The builders may question or
confirm as they proceed, but only through the two degrees of separation.
Object: re-create the structure as accurately as possible.
2. Crossword puzzles are created in which the names of all members of a class
are clued by obscure information about them. Each class member must ask
questions of others to determine who matches the clues in the puzzle.
3. Information gap grids are created such that class members must conduct
mini-interviews of other classmates to fill in boxes, e.g., "born in July,"
"plays the violin," "has a two-year-old child," etc.
4. City maps are distributed to class members. Predetermined map directions are
given to one student who, with a city map in front of him or her, describes the
route to a partner, who must then trace the route and get to the correct final
destination.
ORAL PROFICIENCY INTERVIEW (OPI)
The best-known oral interview format is one that has gone through a
considerable metamorphosis over the last half-century, the Oral Proficiency
Interview (OPI). Originally known as the Foreign Service Institute (FSI) test,
the OPI is the result of a historical progression of revisions under the
auspices of several agencies, including the Educational Testing Service and the
American Council on the Teaching of Foreign Languages (ACTFL). The latter, a
professional society for research on foreign language instruction and
assessment, has now become the principal body for promoting the use of the OPI.
Oral Presentations
In the academic and professional arenas, it would not be uncommon to be called
on to present a report, a paper, a marketing plan, a sales idea, a design of a
new product, or a method. A summary of oral assessment techniques would
therefore be incomplete without some consideration of extensive speaking tasks.
Once again the rules for effective assessment must be invoked: (a) specify the
criterion, (b) set appropriate tasks, (c) elicit optimal output, and (d)
establish practical, reliable scoring procedures.
Picture-Cued Story-Telling
One of the most common techniques for eliciting oral production is through
pictures, photographs, diagrams, and charts. We have already looked at this
elicitation device for intensive tasks, but at this level we consider a picture
or a series of pictures as a stimulus for a longer story or description.
Retelling a Story, News Event
In this type of task, test-takers hear or read a story or news event that they
are asked to retell. This differs from the paraphrasing task discussed above in
that it is a longer stretch of discourse and a different genre. The objectives
in assigning such a task vary from listening comprehension of the original to
production of a number of oral discourse features (communicating sequences and
relationships of events, stress and emphasis patterns, "expression" in the case
of a dramatic story), fluency, and interaction with the hearer. Scoring should
of course meet the intended criteria.
Translation (of Extended Prose)
Translation of words, phrases, or short sentences was mentioned under the
category of intensive speaking. Here, longer texts are presented for the
test-taker to read in the native language and then translate into English.
Those texts could come in many forms: dialogue, directions for assembly of a
product, a synopsis of a story or play or movie, directions on how to find
something on a map, and other genres. The advantage of translation is in the
control of the content, vocabulary, and, to some extent, the grammatical and
discourse features.
The disadvantage is that translation of longer texts is a highly specialized
skill for which some individuals obtain post-baccalaureate degrees! To judge a
nonspecialist's oral language ability on such a skill may be completely
invalid, especially if the test-taker has not engaged in translation at this
level. Criteria for scoring should therefore take into account not only the
purpose in stimulating a translation but the possibility of errors that are
unrelated to oral production ability.
Reference:
Brown, H. Douglas. 2004. Language Assessment: Principles and Classroom
Practices. New York: Pearson Education.