Evaluation and Assessment Techniques in Language Proficiency Testing


Evaluation and assessment in language proficiency testing involve systematic data gathering to assess students' communicative abilities. Different types of evaluation, such as formative and summative, are discussed along with techniques like observation, analysis of work, and creating situations to test students' knowledge and creativity.


Uploaded on Sep 12, 2024 | 0 Views



Presentation Transcript


  1. Topic 11: EVALUATION
     Daniel Madrid (Faculty of Education, University of Granada)
     Mª Luisa Pérez Cañado (Faculty of Humanities and Education, University of Jaén)
     - INTRODUCTION
     - EVALUATION AND ASSESSMENT
     - TYPES OF EVALUATION
     - THE NATURE OF LANGUAGE PROFICIENCY
     - TESTS
     - TYPES OF TESTS
     - TESTING REQUIREMENTS
     - TESTING THE STUDENT'S COMMUNICATIVE ABILITY
     - CHARACTERISTICS OF TEST ITEMS
     - DEMANDS OF THE SPANISH EDUCATIONAL ADMINISTRATION
     - EVALUATION TECHNIQUES
     - EVALUATION THROUGH ACHIEVEMENT TESTS
     - ANALYSING THE SCORES
     - GRADING THE STUDENTS

  2. 1. INTRODUCTION
     Evaluation implies gathering data on the teaching and learning of the English language, not only in terms of written and oral proficiency, but also in terms of abilities, skills, attitudes, and values.
     2. EVALUATION AND ASSESSMENT
     Evaluation (broader than assessment) implies the systematic gathering of information for purposes of decision-making. Assessment is a form of evaluation. It refers to the measurement of the ability of a person or the quality or success of a teaching course.
     Evaluation includes:
     - Systematic data-gathering and the assessment of such data in order to implement curricular decisions and improve the curriculum.
     - Assessing curricular effectiveness or efficiency.
     - Evaluating the participants' attitudes.
     - Examining the school context within which teaching-learning activities develop.

  3. 2.1. Product-oriented approaches Stages of product-oriented approaches (Hammond 1973:168): - Identification of what is to be evaluated. - Definition of the variables to be evaluated. - Formulation of teaching objectives in operative and observable terms. - Analysis of the results and assessment of the effectiveness of the curricular programme in terms of such outcomes. 2.2. Process-oriented approaches Scriven, for instance, advocates checking not only final learning outcomes, but also process quality through formative evaluation. 2.3. Internal and external evaluation External evaluation is carried out by external assessors. Internal evaluation is carried out by the people involved in it.

  4. 3. TYPES OF EVALUATION
     3.1. Formative, summative, initial, and final evaluation
     a) Formative, process, continuous or ongoing evaluation monitors performance throughout the course of the development of teaching-learning programmes.
     b) Summative or achievement testing is carried out at the end of a specific period of instruction.
     c) Initial evaluation includes contents covered in previous courses and is applied at the outset of the school year with a diagnostic purpose.
     d) Final evaluation is implemented to assess the level of attainment at the end of the school year.

  5. Techniques for formative or continuous assessment
     In order to carry out this type of evaluation, a variety of techniques are recommended:
     - Systematic observation of the students and their work.
     - Analysis of their work, performance and academic activities.
     - Creation of situations which foster the deployment of the students' knowledge, originality, and creativity.
     - Comprehension and expression quizzes and tests (listening, speaking, reading and writing).
     - Tests, dialogues, conversations.
     3.4. Quantitative and qualitative evaluation
     If we resort to a quantification of the phenomena we are attempting to assess, we will be making use of a quantitative type of evaluation. If we choose to observe and describe such phenomena, we will be employing a qualitative one.

  6. 3.4.1. Profiling
     Profiling is an example of qualitative evaluation. This type of assessment describes a wide range of pupils' qualities, attitudes and behaviour, so that a more complete picture of each student may be given.
     PORTFOLIO: The individual records of the student's profiles and informal experiences with the language and its culture present different aspects of their language biography and can be registered in his/her portfolio. For example, in the European Language Portfolio the students can record and reflect on their language learning and cultural experiences. It includes three parts: language passport, language biography and dossier (see http://culture2.coe.int/portfolio).
     STUDENT'S PROFILE ON HIS/HER WRITING ABILITY
     Criteria and profile:
     1. Background information about the student's writing. Profile: The student has written a draft with his own version of the story. Simple text, but well organised for his level.
     2. Student's response to writing. Profile: He has included some details. Quite long for his level and age.
     3. Conventions of writing: grammar, vocabulary, spelling. Profile: Variety of vocabulary; very basic and simple constructions; some spelling mistakes.
     4. Student's progress and development in writing. Profile: Considerable progress in relation to the last piece of writing. His text is very similar to the model provided; needs to be more creative.

  7. 4. THE NATURE OF LANGUAGE PROFICIENCY
     Language proficiency is a term that refers to our competence or ability to use a language for a specific purpose; the degree of skill with which we understand and use it.
     4.1. The unitary approach to language proficiency
     Oller (1979) has argued that the nature of second language proficiency is unitary, global and holistic. It depends on the learner's pragmatic expectancy grammar. In order to evaluate the learner's capacity to interpret, understand and produce messages, Oller proposes the use of pragmatic or integrative tests. Examples:
     - Dictations
     - Combined cloze and dictation
     - Oral cloze procedure
     - Oral interviews
     - Composition or essay writing
     - Narrations
     - Translation

  8. Example of cloze test (an option would be to give the missing words at the top)
     The old woman who lived in a shoe
     There once was an old woman who (1) ____ in a shoe. This must have been very cramped and difficult because living (2) ____ a shoe is not very comfortable, I expect. One day, she went out and there (3) ____ some children playing in the street nearby where she lived. They began shouting (4) ____ her. "You silly old woman, why do you live in a shoe?", they shouted, and other things like that. They were very insulting (5) ____ the old woman. I don't know why the old woman had to live in a shoe, but she (6) ____ have been very poor, and it was not nice to (7) ____ fun of the poor woman because she was so hard up that she had nowhere (8) ____ to live. But children can be very cruel sometimes, and this case was (9) ____ exception. However, on this occasion the old woman didn't just (10) ____ their insults meekly, but became very angry and, shouting "I will teach you a (11) ____", she chased them with a cane.
     (Based on a nursery rhyme)

  9. Missing words: 1. lived 2. in 3. were 4. at 5. to 6. must 7. make 8. else 9. no 10. take 11. lesson
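Scoring a cloze test of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses exact-word scoring (one point per gap that matches the key, ignoring case and surrounding spaces), and the function name `score_cloze` and the sample responses are hypothetical, not part of the original slides.

```python
# Key for the eleven gaps of "The old woman who lived in a shoe".
ANSWER_KEY = ["lived", "in", "were", "at", "to", "must",
              "make", "else", "no", "take", "lesson"]


def score_cloze(responses, key=ANSWER_KEY):
    """Exact-word scoring: one point for each gap whose answer matches the key.

    Assumption: case and surrounding whitespace are ignored; acceptable
    synonyms are NOT credited (that would be a different scoring method).
    """
    if len(responses) != len(key):
        raise ValueError("one response is expected for every gap")
    return sum(given.strip().lower() == expected.lower()
               for given, expected in zip(responses, key))


# Invented answers from one student: gap 3 is wrong ("was" instead of "were").
student = ["Lived", "in", "was", "at", "to", "must",
           "make", "else", "no", "take", "lesson"]
print(score_cloze(student), "out of", len(ANSWER_KEY))
```

A teacher who prefers to accept any contextually appropriate word would need a key with several alternatives per gap, which this sketch does not attempt.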

  10. 4.2. Multidimensional concept of proficiency
      The pluridimensional syllabus presented in chapter 4 involves a multidimensional concept of proficiency that includes the learning of concepts and the development of skills and positive attitudes related to:
      - Block 1: Comprehension of oral texts (LISTENING)
      - Block 2: Production of oral texts: expression and interaction (SPEAKING)
      - Block 3: Comprehension of written texts (READING)
      - Block 4: Production of written texts (WRITING)
      - The language syllabus
      - The sociolinguistic, pragmatic and discourse syllabus
      - The sociocultural syllabus
      - The cross-curricular syllabus

  11. 5. TESTS
      These are normally used to measure certain capacities, abilities, or skills; conceptual aspects; or the learners' attitudes and values. Valette (1977: 4) assigns three basic roles to tests: they are usually based on syllabus objectives, they may stimulate and encourage students' progress, and they provide data (and information) on the degree of objective attainment and on the students' level of progress.
      Some limitations of tests
      - The same test cannot be expected to be suitable or applicable to all teaching-learning situations.
      - We cannot expect more from tests than they can offer us: they simply provide some information about the aspects that we test.
      - Nor should we have a blind faith in rating scales or in their sophisticated arithmetic.

  12. 6. TYPES OF TESTS
      Tests can be classified according to their aims (cf. Valette 1977:6-7 and Hughes 1991:9-21).
      6.1. Aptitude tests
      The aim of these tests is to measure the subjects' aptitude for language learning. The two best-known aptitude tests are Carroll and Sapon's (1967) Elementary Modern Language Aptitude Test (EMLAT) and Pimsleur's (1967) Language Aptitude Battery (LAB).
      6.2. Achievement and progress tests
      They measure what the students have learned after a specific period of instruction.
      6.3. Diagnostic and placement tests
      These tests are applied to establish the students' level of competence, that is, the degree of mastery of certain linguistic aspects. This is the type of test typically administered at the outset of the school year and maintained as a reference point for course planning.
      6.4. Standardised tests
      They measure an individual's linguistic and communicative competence according to pre-established criteria and demands set by certain committees, organisations, or institutions, not taking into account the type of syllabus or course which (s)he might have previously followed. Some examples include (cf. Valette 1977: 323-333):
      - Preliminary English Test (PET)
      - Cambridge First Certificate (CFC)
      - Test of English as a Foreign Language (TOEFL)
      - Test for the Spanish as a Foreign Language Diploma (set by the Ministry of Education, Science, and Technology) (DELE)

  13. 7. TESTING REQUIREMENTS
      For a test to be useful it must meet a series of requirements (cf. Bachman 1990:160-294, Hughes 1991:22-47, Bachman and Palmer 1996:17-40):
      7.1. Reliability
      A test is said to be reliable if, when administered again to the same group of students without further learning or attrition, the results are the same. If it is reliable, the same test applied to the same students on two consecutive days should yield, in theory, identical results.
      7.2. Validity
      It is the extent to which a test measures what it is intended to measure. To give an example, a vocabulary or grammar test might not be valid in providing information about the global communicative competence of a group of students.
      Content validity: a test has content validity when it contains a representative sample of the elements that we want to test. For an oral communication test to be valid, for instance, it must be interactive and include listening and speaking items.
      Construct validity: a test has construct validity if it is consistent with the theory of language on which it is based. For example, if we aim to develop the student's communicative competence, the test must also be communicative and its items must be selected in accordance with communicative theory.
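The idea of reliability in 7.1 can be illustrated numerically. The sketch below assumes a test-retest design and uses the Pearson correlation between two administrations as the reliability estimate; this is one common way of quantifying it, not a procedure stated on the slide, and the two score lists are invented for the example.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two lists of scores from the same students."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    covariation = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    spread_1 = sum((a - mean_1) ** 2 for a in first)
    spread_2 = sum((b - mean_2) ** 2 for b in second)
    return covariation / sqrt(spread_1 * spread_2)


# Invented scores of six students on the same test, one day apart.
day_1 = [46, 38, 30, 25, 22, 14]
day_2 = [45, 39, 28, 27, 20, 15]
print(round(pearson_r(day_1, day_2), 2))
```

A coefficient close to 1 suggests that the test ranks the students consistently across the two days; identical results on both days would give exactly 1.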

  14. 7.3. Authenticity
      For a test to be authentic, the items or questions must be related to real-life tasks and activities. In other words, they should be similar to the problems which normally arise and must be solved in real life.
      7.4. Interactive characteristics
      Interactiveness depends on how much the student must rely on his/her individual character traits and competence in order to solve the questions the test poses. Some of these factors may be:
      - Individual differences (age, sex, social class, aptitude, attitudes, etc.).
      - Linguistic and communicative competence (the latter includes strategic competence).
      - Knowledge of the real world (experiential knowledge).
      - The affective schemata of cognitive structure (values).

  15. 7.5. Social and educational impact
      This is referred to as washback. The effect which the Reválida and Selectividad have on 2º de Bachillerato students, their parents, and their schools is an example.
      7.6. Practicality
      The application of a test is conditioned by a set of circumstances which make it possible, viable and practical.

  16. 8. TESTING THE STUDENT'S COMMUNICATIVE ABILITY
      At present, the English level attained by students is established by measuring the development of their linguistic and communicative competence, which includes:
      Canale (1983: 22-25): COMMUNICATIVE COMPETENCE
      - Grammatical competence: phonology, orthography, vocabulary, word formation, sentence formation
      - Sociolinguistic competence
      - Discourse competence: cohesion, coherence
      - Strategic competence: for grammatical difficulties, for sociolinguistic difficulties, for discourse difficulties, for performance factors
      Bachman (1990): LANGUAGE COMPETENCE
      - ORGANIZATIONAL COMPETENCE
        - Grammatical competence: vocabulary, morphology, syntax, phonetics, graphology
        - Textual competence: cohesion, rhetorical organisation
      - PRAGMATIC COMPETENCE
        - Illocutionary competence: ideational functions, manipulative functions, heuristic functions, imaginative functions
        - Sociolinguistic competence: sensitivity to dialects & variety, sensitivity to differences in register, sensitivity to naturalness, cultural references & figures of speech
      Common European Framework (2001): COMMUNICATIVE LANGUAGE COMPETENCES
      - Linguistic competences: lexical, grammatical, semantic, phonological, orthographic, orthoepic
      - Sociolinguistic competence: markers of social relations, politeness conventions, expressions of folk wisdom, register differences, dialect and accent
      - Pragmatic competences: discourse competence, functional competence

  17. 9. CHARACTERISTICS OF TEST ITEMS
      During the psychometric-structuralist period, discrete-point tests were used to measure the students' competence only in relation to the linguistic elements learned in class, especially grammar, vocabulary, and phonetics, or to the four language skills (listening, speaking, reading and writing) in isolation.
      9.1. Communicative items
      They focus on:
      - Learning context and situations.
      - The purposes and intentions expressed in communication.
      - Speech acts and communicative functions.

  18. Important implications for communicative test item construction:
      - Interaction: use of the L2 to address someone or to answer the interlocutor.
      - The unpredictable nature of information exchange.
      - Contextualisation: language use appropriate to the specific context and situation.
      - Purposeful FL/L2 use.
      - Communicative performance to express our purposes with efficiency and appropriateness.
      - Authentic language use, avoiding artificial, simplified language not used in real life.

  19. 9.2. The Task-based approach: Implications for test construction: - the interactive and unpredictable character of communication - contextualisation - awareness of communicative purpose and implicit meanings - authenticity - focus on meaning; relevance of context - emphasis on problem-solving - fostering of a variety of procedures - balance between oral and written work - interrelation of the subcompetences which integrate communicative ability What features are applied in the following task? - Find information on the following films - Have you watched any of them? Did you like them?

  20. 10. THE DEMANDS OF THE SPANISH EDUCATIONAL ADMINISTRATION 10.1. Evaluation of CONCEPTS AND PRINCIPLES Grammatical or linguistic competence - Grammar (structures) - Lexis - Phonetics (pronunciation) - Orthography (spelling) Sociolinguistic and pragmatic competence: - Language functions and registers Sociocultural competence - Culture (sociocultural aspects of English) - Cross-curricular aspects and real world knowledge (History, Science, Geography, ...) Basic competences / Cross-curricular aspects

  21. 10.2. Evaluation of PROCEDURES AND SKILLS
      This basically involves the evaluation of strategic competence and of the four skills (listening, speaking, reading, writing), and is carried out by testing receptive (listening, reading), productive (speaking, writing) and interactive procedures (listening-speaking; reading-writing) in an integrative way and by means of interviews or other tasks which require reading and writing, or listening, reading, and writing. This aspect of evaluation provides information on the students' procedural competence (knowing how, "savoir-faire").
      10.3. Evaluation of ATTITUDES
      These factors constitute the basis of the students' attitudinal or existential competence ("savoir-être") and include attitudes, values, motivations, beliefs, cognitive styles and personality factors (Common European Framework 2001:105).
      10.4. Ability to learn (LEARNING TO LEARN)
      This general competence includes the ability to incorporate new items of knowledge into existing knowledge. The ability to learn ("savoir-apprendre") has several components (Common European Framework 2001:107-108): language and communication awareness, general phonetic awareness and skills, study skills and heuristic skills.

  22. 11. EVALUATION TECHNIQUES
      11.1. Some constraints
      - The validity of tests administered in class is only relative.
      - The competences and subcompetences subsumed within communicative competence develop and operate in an interrelated and integrated manner; it is thus extremely difficult (if not impossible) to evaluate them in an isolated way.
      - Oller's (1979) unitary and holistic model is very difficult to apply in the initial stages of language learning.

  23. 11.2. Evaluating the language curriculum We can use a questionnaire (see Madrid and McLaren 1995:258):

  24. 11.5. The students' self-evaluation
      POTENTIAL ADVANTAGES OF SELF-EVALUATION
      - Students may become more aware of the FL syllabus, they may pay more attention to the learning activities and may carry them out with more enthusiasm, at least at the first stage of self-evaluation.
      - The learners are given opportunities to reflect on their learning process: attitudes, interest, participation, effort and other aspects of their behaviour in the EFL classroom.
      - The self-evaluation process may also generate interesting comments among the students and the teacher on the learning tasks (difficulty, appropriateness, etc.) and thus provide valuable feedback (see Madrid 1991).
      - The self-evaluation of learning tasks may help students to understand their achievement better and accept the grades they are given by the teacher.
      POTENTIAL DRAWBACKS
      - Some students take the whole process seriously at the beginning, but they soon relax and make a distorted and imprecise appraisal.
      - In the Primary stage, students have difficulties in applying the criteria given for self-evaluation. They often need guidance in the correct interpretation of some criteria.
      - The self-evaluation process may not satisfy the low-ability students, who often have to admit their lack of interest and dedication.

  25. SELF-EVALUATING ATTITUDES AND MOTIVATION
      Grade the following statements from 1 to 5. Use: 1 = never, 2 = rarely, 3 = sometimes, 4 = very often, 5 = always.
      As a result of the English classes received this year...
      ATTITUDES AND MOTIVATION:
      ( ) 1. I am interested in the English class and I wish to study and practise the language more and more in order to learn it better. Comments: ..........
      ( ) 2. I like English and enjoy studying it, practising it and learning it. Comments: ..........
      ( ) 3. I feel motivated to study English, consequently I do my best to practise it and learn it. Comments: ..........
      DEVELOPMENT OF ATTITUDES:
      ( ) 4. I feel (more) respect for English-speaking people. Comments: ..........
      ( ) 5. I appreciate their social habits (more). Comments: ..........
      ( ) 6. I value team work and co-operative learning. Comments: ..........
      ( ) 7. I feel (more) enthusiastic about using English and taking part in communicative situations with native speakers. Comments: ..........
      ( ) 8. I feel (more) confident with English and like to use it more and more, orally or in writing. Comments: ..........
      ( ) 9. I appreciate the importance of autonomous learning and try to learn by myself as much as Comments: ..........

  26. Self-evaluation activity (enter YouTube)

  27. 11.7. Continuous assessment techniques Observing the student at work daily. Analysis of his/her school work. Suggesting tasks in which the student has to apply concepts and procedures already studied and be creative in solving the new problem. Oral and written quizzes. Assignment of mini-projects and other extracurricular tasks: analysis and evaluation, etc., bearing in mind the results of the oral and written tests administered throughout the school year. Project work, murals, and other activities which involve extracurricular tasks.

  28. 12. EVALUATION THROUGH ACHIEVEMENT TESTING
      12.1. EVALUATING ORAL COMMUNICATION
      12.1.1. Evaluating the listening comprehension skill
      Listening comprehension is a receptive skill and depends on our ability in three areas:
      - Discrimination of sounds and other phonetic elements.
      - Understanding of specific elements.
      - Overall comprehension.
      Draw the route from A to the post office:
      (The student hears): Go down London Street. Pass the square and take the first turning on the left. Go along that street and wait for me at the Post Office on the right.
      Listening comprehension ability and some aspects of the students' sociocultural competence:
      Think about the socio-cultural aspects you have studied and choose the right option:
      1) In Britain, a very popular take-away lunch is a) b) c) d)
      2) In America, when you say "thank you", people usually answer a) b) c) d)
      The student hears:
      1) In Britain, a very popular take-away lunch is a) peas and sausage b) fish and chips c) orange juice d) bread and butter
      2) In America, when you say "thank you", people usually answer a) Have a good meal b) Sorry! c) I beg your pardon d) You're welcome

  29. Scanning for specific information:
      Listen and fill in the table:
      Flight # / Departure time / Day / Arrival time
      The student hears:
      - Can you pick me up at the airport?
      - Of course! What time's your flight?
      - Let me see ... flight IB 610, departure 17.40, arrival Malaga 19.55.
      - O.K. See you on Sunday! Bye, now!

  30. INTEGRATING SKILLS: LISTENING COMPREHENSION AND WRITING (SPELLING)
      Listen to this text and write:
      Kids often wear fancy costumes to celebrate Halloween, but some people dress up as ghosts, witches, devils or spirits. They put a piece of candle in their jack-o'-lantern to scare away ghosts. They go from house to house, ring doorbells and shout "trick or treat". People give them candy: sweets, gum, nuts, etc.

  31. 12.1.2. Evaluating the speaking skill
      There are two general qualities that determine the success of the learner: fluency and propositional precision (Common European Framework, 2001: 128). Fluency is the ability to articulate and keep going in communication acts. Propositional precision is the ability to formulate thoughts and propositions to make our meaning clear.
      Some important criteria used to judge the students' oral performance are their:
      - Fluency
      - Pronunciation
      - Use of grammar
      - Vocabulary
      - Communicative ability

  32. SPEAKING TEST
      Primary Education, 6th grade
      Part 1: Questions about the student's personal life:
      1. Hi, what's your name?
      2. Where do you live?
      3. What's your mother's first language?
      4. And your father's first language?
      5. What language do you speak at home?
      6. How many brothers or sisters have you got?
      7. What are your hobbies? What do you like?
      8. What did you do last weekend?
      9. What are your plans for this summer? What will you do?
      Now let's talk about this park:
      10. Are there any people in this park? What people can you see?
      11. Are there any animals? What animals can you see?
      12. Are there any swings? Where are they?
      13. What's the boy in picture 9 doing?
      14. Is there a park near your house or in your town? What is it like? / What is there in that park?

  33. Now you tell this story (15, 16, 17, 18, 19, 20):

  34. RATING SCALE FOR THE SPEAKING SKILL
      Grade and performance:
      - A (Very good). Very good speaker: able to express him/herself with fluency within the scope of the language studied.
      - B (Good). Good speaker: able to express most of the messages. Some occasional mistakes may occur that do not impede communication.
      - C (Acceptable). Average speaker: able to express only the essential parts of the message in an intelligible way. Makes some errors in grammar and pronunciation, but is able to transmit the basic aspects.
      - D (Insufficient). Poor speaker: produces only isolated words and chunks and is unable to produce connected messages. Pronunciation is poor and impedes communication.

  35. 12.2. EVALUATING WRITTEN COMMUNICATION
      12.2.1. Reading items (linguistic competence)
      DIALOGUES
      Read and complete. Use: do, don't, milk, bread, like
      - Do you like ____?
      - Yes, I ____.
      - And do you ____ coffee?
      - No, I ____.
      - OK, have ____.
      - Lovely!

  36. POEMS AND RHYMES
      Read and complete. Use: day, pond, play, fish, tree, grass
      Let's go to the park
      Let's go to the park.
      Let's go and ____.
      Let's sit on the ____
      For the whole ____.
      The birds in the ____,
      The ducks on the ____,
      The ____ in the sea,
      And me with my blonde.

  37. 12.2.2. Writing items Write instructions for the boy to get to the park:

  38. Describe some characteristics of the hotel below following the model: Example: The Loire Hotel in Paris is a three star hotel. Its telephone number is 886622. There is central heating and all the bedrooms have hot and cold water. Guests can use an electric cooker in their rooms. There are ironing facilities for guests, too. The garage is suitable for physically handicapped persons. Pets are not accepted.

  39. 13. ANALYSING THE SCORES AND THE TEST ITEMS
      Range: The test's range can be determined by finding the highest and lowest scores. If a test of 50 items is administered to 40 students and the maximum score obtained is 46 and the minimum 14, the range runs from 14 to 46.
      Mode: The most frequently occurring score (or scores) in a sample.
      Median: This is a measure of the central tendency of a distribution. It is the value of the middle item or score in a sample arranged in order from lowest to highest. It is the most appropriate measure of central tendency for ordinal (ranked) data.
      Mean: It is the sum of all scores divided by the total number of scores.
      Standard deviation (SD): It is a measure of the spread of the scores on a test. It indicates the degree to which scores vary from the mean.
      The normal curve or normal distribution
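The measures defined on this slide can be computed with Python's standard `statistics` module. The sketch below is only an illustration: the five scores are invented, the function name `describe` is hypothetical, and the population form of the standard deviation (`pstdev`) is an assumption, since the slide does not say whether the class is treated as a whole population or as a sample.

```python
import statistics


def describe(scores):
    """Return the range, mode(s), median, mean and SD of a list of test scores."""
    return {
        "range": (min(scores), max(scores)),       # lowest and highest score
        "modes": statistics.multimode(scores),     # most frequent score(s)
        "median": statistics.median(scores),       # middle score when ordered
        "mean": statistics.mean(scores),           # sum of scores / number of scores
        "sd": statistics.pstdev(scores),           # population SD (assumption)
    }


# Invented scores of five students on a 50-item test.
scores = [14, 20, 20, 30, 46]
for name, value in describe(scores).items():
    print(name, value)
```

Here the median (20) is lower than the mean (26) because the single high score of 46 pulls the mean upwards; with a class of 40 students the same function applies unchanged.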

  40. ITEM ANALYSIS
      It is also important to analyse the items included in the test in relation to the scores obtained.
      Item difficulty
      An item difficulty index shows how easy or difficult the item has proved to be in the test. It can be obtained by using a statistical programme and calculating the percentage of correct answers to a particular item. It is interpreted as follows (see Lafourcade 1977):
      Solved by % of students and difficulty:
      - 0-15%: Very difficult
      - 15-50%: Difficult
      - 50-85%: Easy
      - 86-100%: Very easy
      Item discrimination
      This indicates the extent to which the item separates the more able testees from the less able. It is assumed that good items are completed successfully by good learners and unsuccessfully by the less able. If that is not the case, the item does not discriminate between students (see Heaton 1975: 174-176).
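Both indices can be worked out by hand or with a few lines of code. The sketch below rests on several assumptions that are not in the slides: the function names are hypothetical, the data are invented, the shared band limits (15% and 50%) are assigned to the lower band, and discrimination is computed with the upper-lower group method (correct answers in the top group minus correct answers in the bottom group, divided by the group size), which is one common index and not necessarily the exact procedure the authors have in mind.

```python
def item_difficulty(item_answers):
    """Percentage of students who answered the item correctly (1 = correct, 0 = wrong)."""
    return 100 * sum(item_answers) / len(item_answers)


def difficulty_band(percent_correct):
    """Map a percentage to the bands on the slide (boundary handling is an assumption)."""
    if percent_correct <= 15:
        return "very difficult"
    if percent_correct <= 50:
        return "difficult"
    if percent_correct <= 85:
        return "easy"
    return "very easy"


def item_discrimination(item_answers, total_scores, group_size):
    """Upper-lower index: ranges from -1 to 1; higher means better discrimination."""
    ranked = sorted(range(len(total_scores)),
                    key=lambda i: total_scores[i], reverse=True)
    upper = ranked[:group_size]      # students with the highest total scores
    lower = ranked[-group_size:]     # students with the lowest total scores
    correct_upper = sum(item_answers[i] for i in upper)
    correct_lower = sum(item_answers[i] for i in lower)
    return (correct_upper - correct_lower) / group_size


# Invented data: six students, their answers to one item and their total scores.
answers = [1, 1, 1, 0, 0, 0]
totals = [50, 45, 40, 20, 15, 10]
percent = item_difficulty(answers)
print(percent, difficulty_band(percent))
print(item_discrimination(answers, totals, group_size=2))
```

An index near zero or below zero would signal an item that the weaker students answer as well as, or better than, the stronger ones, which is the situation the slide describes as failing to discriminate.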

  41. 14. GRADING THE STUDENTS
      Four common practices are briefly summarised below:
      - We can calculate what each correct answer is worth by dividing the 10 points of the scale by the total number of items and assigning each item the corresponding grade. For example, if the test contains 50 items, each correct answer is worth 0.2 points.
      - We can adapt the grading criteria to the group's level in the test and treat the maximum score that the students have actually obtained as the total number of items. For example, if the test contains 50 items and the highest score was 36, we divide 10 by 36 and give each correct answer 0.28 points.
      - Those teachers who follow the normal distribution procedure may work out the mean and standard deviation. Then they subtract one standard deviation from the mean, and the students who do not reach that level fail. For example, if the mean in the previous test is 18 points and the standard deviation 2.1, the cut-off is 15.9, so the students who scored less than 16 points fail the test.
      - Some teachers adapt their results to the normal curve and establish five or six intervals to grade the students, as shown below:
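The arithmetic behind the first three practices can be sketched as follows. This is an illustration only: the function names are hypothetical, the 10-point scale and the example figures (50 items, highest score 36, mean 18, SD 2.1) come from the slide, and the fourth practice is left out because its intervals are not given in the text.

```python
def grade_fixed(correct, n_items, scale=10):
    """Practice 1: every item is worth scale / n_items points."""
    return correct * scale / n_items


def grade_relative(correct, highest_score, scale=10):
    """Practice 2: the highest score actually obtained counts as full marks."""
    return correct * scale / highest_score


def passes_cutoff(score, mean, sd):
    """Practice 3: a student fails if the score is below mean minus one SD."""
    return score >= mean - sd


print(grade_fixed(1, 50))             # value of one correct answer out of 50 items
print(round(10 / 36, 2))              # value of one correct answer when the top score is 36
print(grade_relative(36, 36))         # the best student receives the full 10 points
print(passes_cutoff(16, 18, 2.1))     # 16 is above the 15.9 cut-off
print(passes_cutoff(15, 18, 2.1))     # 15 is below the 15.9 cut-off
```

Note that the second practice raises every grade when the best score is well below the number of items, so the same raw score can pass in one group and fail in another.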
