Testing is a huge part of learning, but most teachers don't know a lot about it. We ask test guru Dan Ellsworth all about testing. What is a test? How is it different to an assessment? How can you go about writing a test? How can you assess students for a learner profile? How do tests affect teaching? Yes, we asked him a lot of questions. Listen to hear the answers...
What is Testing and How Does It Shape Our Teaching (with Dan Ellsworth)?
Tracy Yu: Welcome to the TEFL Training Institute Podcast. The bite‑sized TEFL podcast for teachers, trainers and managers.
Tracy: Hi, everyone. Welcome to our podcast today. We've got our special guest...
Dan Elsworth: This is Dan Elsworth.
Ross Thorburn: Hi, everyone. Dan, very awesome to have you here.
Dan: It's good to be here.
Ross: Cool. For any of you that know Dan, he's a, I was going to say, jack of all trades.
Dan: It's as good as any explanation...
Ross: One of his specific trades is testing.
Dan: I work in English language testing or assessment. I was an English teacher before. I've been an IELTS examiner. I've designed courses and things like that. At the moment, I'm running a large English language center. Not as large as some of the really large ones.
Dan: Actually, it's probably quite small in comparison to others. We work in about 80 countries at the moment.
Ross: Wow. 80 not 18?
Dan: 80. We actually did a test in North Korea.
Ross: Wow! That's so cool.
Dan: Yeah. When I started working there, I got a parcel at my desk one day that just said Dan Elsworth and the postage stamps said North Korea. That was quite exciting for a minute. Another thing that we did was when Aung San Suu Kyi came to power in Burma, or half came to power, all of her ministers took one of our tests on the day they went to democracy.
Those are the two things that I can say are very interesting about my work. I've used them all up right at the beginning.
Ross: [inaudible 1:38] interesting podcast. If you could turn off...
Ross: Before we get into it, can I ask you, what's the difference between testing and assessment? Because I use those terms interchangeably.
Dan: To be honest, it depends who you ask in the industry. I think generally, the way it's used colloquially is that testing is a formalized process in that it tends to fit into a certain amount of set time. It tends to be like a specific measurement instrument.
Assessment is a broader category. It includes testing but it could include formative assessment or it could include some form of observation. The problem with all of these types of definitions is that when you really get down into the detail, the lines are pretty blurry anyway. I would just say use them interchangeably unless you're particularly worried about the semantics of that, I guess.
Ross: With Dan today, we're going to concentrate on testing and as usual, we have three questions. The first question is, what are some different kinds of tests?
Tracy: Number two, how do you write a test?
Dan: And number three, what does that mean for teachers?
What is Testing? What Types of Language Test Are There?
Ross: You mentioned already, Dan, what's the difference between assessment and testing. Do you want to tell us a bit more about what is testing? When does it happen and what sort of tests?
Dan: There are a few different types of tests. The ones that teachers normally come across can be divided into a few categories. Again, you get blurred lines between those categories.
What they'll often come across is an achievement test. An achievement test is at the end of a course and one of the things that identifies an achievement test is that it's a set curriculum, it normally covers a very thin slice of the syllabus and it's always specifically around something that you've already done.
Teachers probably do that quite a lot. The problem with achievement tests is that there's not a lot of investment in them. Unfortunately, with a lot of course materials that you get out there generically on the market at the moment, you buy a set of course books that are all beautifully designed with nice pictures.
Then you get this kind of flimsy paper booklet, or sometimes a set of PDFs, and those are your achievement tests. They're not particularly accurate measurement instruments because they're not really used to make any important decisions.
Ross: Why bother doing them, then? Just to assure the parents that it wasn't a waste of money taking the...?
Dan: That's often as good a reason as any.
Dan: It's a business. I mean, with any assessment, when you're asking what the point is, what you're doing is you're asking what the decision is that's being made about that candidate, about that student. In some language schools, there is no decision being made. All you're doing is giving them some feedback on how they've done so that they can take that away and think about it.
In other places, that decision is a bit more important, and so it dictates what course they get into in future. When I was at a Beijing school the other day, they were using our tests and it decided whether the kids were streamed into a course that takes three years or a course that takes four years.
Ross: Wow. That's a big decision.
Dan: It's a big decision. Generally, the more important the decision, the more investment is made in the test and the more accurate it is. An achievement test is normally quite low investment.
Ross: Can you give us an example there of a test that's the opposite and it's high investment?
Dan: A high investment test would be a proficiency test like IELTS or TOEFL. Those proficiency tests tend to cover a huge range. If you're looking at the CEFR, they'll go all the way from A1 to C1 or C2.
Ross: Everything, basically?
Dan: Everything. The problem with that is you have to sample from each of those areas. You can ask a few A1 questions, a few A2 questions. You have to be very careful and do lots of analysis on the questions. In testing they are often called items. I occasionally call them an item by mistake.
Dan: That's why. You have to do lots of analysis on those. You have to spend a lot of time researching it because the decisions that are made on the back of a proficiency test, particularly ones used for immigration or university entry, are really important.
You have a huge impact on people's lives, which is why governments, universities and other people who use them will want to know that they've been carefully thought through. We've talked about achievement tests, proficiency tests. There's a couple of others in there as well. You might have a diagnostic test.
Ross: This is when you're about to start a course, is it, to put you in the right level?
Ross: Then the sales people just put you in a high level...
Dan: I would call that a placement test.
Ross: Ah. OK.
Dan: I get asked about placement tests quite often. Placement tests are very commercial objects in that they always want them to be about 10 to 15 minutes long. The decision isn't very important because it's useless...
Ross: [laughs] The sales person puts the student wherever they want to.
Dan: All they get, even if the language school is being fairly legitimate, is that they'll often have a process where, if a student is obviously in the wrong class, you move them and they've lost a couple of weeks, maybe. That is what you call a placement test.
The diagnostic test is a bit more, like...
Ross: Oh, can I guess this one? Is this when you find out what sections of your skills that you're good in or bad in...?
Ross: ...strong in speaking, bad at listening, strong in writing.
Tracy: It reminds me of teachers doing a learner needs analysis, like the learner profile for teaching qualifications. They have to interview a number of different students and really find out where they are, then try to plan the lessons and make them very personalized for each person. That's really interesting.
Dan: We were talking earlier about the difference between testing and assessment. Again, I think the colloquial definition is a bit more important than the scientific one.
I would call that a type of assessment. It might include a test. You might have a test result in there. You might use a placement test or a diagnostic test. You might use a couple of measurement tools but you're also using observation, you're using some of the student's coursework, you're using some formative assessment, you're seeing how they interact in class.
Building a learner profile is a type of assessment. You're not making a value judgment, you're not saying...Well, maybe you are.
Dan: With all the teachers maybe you are...But essentially, you're trying to put together a picture of that student so that you can make decisions about how you interact with them.
Ross: Or what to teach them, I guess?
Dan: Who they should work with, or what kinds of activities they're comfortable with. That way... I'm not going into learning styles or anything.
How Are Language Tests Written?
Ross: Nice. Next thing, Dan, do you want to tell us a bit about how tests are written and who writes them?
Dan: It's an interesting area because it's a particular type of person and the job is called an item writer. Remember earlier I said that most assessment people call the question an item, for reasons best known to themselves.
Dan: An item writer will write an individual item. They have a very specific set of specifications. They'll put a stimulus.
If it's a reading test, it's a reading passage. They'll normally not write that themselves. They'll normally find that online. Or it will be a listening audio, a video or a picture, and then you'll have a few other things in that item. You'll have a stem, which is like the beginning of your question or prompt, and then you'll have some ways of responding.
Now, there's two types of response. We're just going through lots of definitions.
Dan: But it is important. You have selected response, which is where you're selecting an answer, as in a multiple choice or reordering exercise, or constructed response, which is when you're writing an answer or a report, or speaking. They write them, then they normally go through lots of quality assurance and then they try them out around the world.
Let's say I'm writing a grammar test. I'll try it out in lots of different countries. I'll give some trial candidates a test. In that test, there'll be some items where I know how good they are. What an item does is that it divides a group of people in two, people who can do it and people who can't, normally.
Sometimes you have more sophisticated ones. You have some items where you know how they perform and some where you don't. You use the ones where you know how they perform to decide roughly what level someone is, and you use the other ones and you work out how they...
Ross: ...is it? Does this do what I thought it did?
Dan: Yeah. Sometimes they'll behave really weirdly. Sometimes they'll do the opposite of what you think they're going to do. Let's say, if it's a multiple‑choice question, the right answer is normally called the key and the other ones are normally called distractors.
Sometimes, the distractor will be too attractive. It would just seem so right for a low‑level candidate because it's something that they've been taught about in school but maybe slightly in the wrong way, and that won't be a very good item because it will draw too many people to that [inaudible 10:06]. It's quite an art to get it to the right level of attractiveness.
Sometimes, it will work very well for candidates who are around B1 or B2 but for some reason, low‑level candidates will be getting it more right if you...
Ross: Oh, really?
Tracy: I see.
Ross: ...you start off getting it right and you get it wrong then you get it right.
Dan: You'll find that even the best written questions sometimes have these weird behaviors around high or low‑levels within certain areas. Actually, you end up throwing away a lot of them.
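A minimal sketch of the kind of check Dan is describing, for a single multiple‑choice item. It assumes trial candidates have already been sorted into rough level bands using items whose behavior is known. The function names, the band labels and the response data are all invented for illustration and are not taken from any real testing tool.

```python
from collections import Counter, defaultdict


def option_shares(responses):
    """Return {band: {option: share of that band's candidates who chose it}}."""
    by_band = defaultdict(Counter)
    for band, option in responses:
        by_band[band][option] += 1
    return {
        band: {opt: n / sum(counts.values()) for opt, n in counts.items()}
        for band, counts in by_band.items()
    }


def facility_by_band(responses, key):
    """Share of each band that chose the key (the right answer)."""
    return {band: opts.get(key, 0.0) for band, opts in option_shares(responses).items()}


def overly_attractive(responses, key):
    """List (band, distractor) pairs where a distractor beat the key."""
    flagged = []
    for band, opts in option_shares(responses).items():
        key_share = opts.get(key, 0.0)
        for opt, share in opts.items():
            if opt != key and share > key_share:
                flagged.append((band, opt))
    return flagged


def rises_with_level(facility, band_order):
    """True if higher bands never do worse on the item than lower bands."""
    values = [facility[band] for band in band_order]
    return all(lower <= higher for lower, higher in zip(values, values[1:]))


# Invented trial data: (level band, option chosen). Option "A" is the key.
responses = (
    [("A2", "B")] * 6 + [("A2", "A")] * 3 + [("A2", "C")] * 1
    + [("B1", "A")] * 7 + [("B1", "B")] * 2 + [("B1", "C")] * 1
    + [("B2", "A")] * 9 + [("B2", "B")] * 1
)

facility = facility_by_band(responses, key="A")
print(facility)
print(overly_attractive(responses, key="A"))
print(rises_with_level(facility, ["A2", "B1", "B2"]))
```

With this made‑up data, distractor "B" draws more A2 candidates than the key does, which is the "too attractive" case. An item where `rises_with_level` comes back false is one where lower‑level candidates are getting it right more often than higher‑level ones, the weird behavior that gets an item thrown away.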
Ross: For me, I remember my first year as a teacher, I taught this group of primary school kids and at the end of the term, I had to make my own speaking test for them. [laughs] I obviously didn't know what I was doing at all. One of the things I taught them was adjectives for what people look like.
One of the questions I put in there was, "What does your best friend look like?" [laughs] I remember one of the girls, who was actually one of the better students in the class, looked at me with a puzzled look and paused for a second and went, "My best friend looks like a bird."
Ross: They had obviously never learned that way of asking questions. I wanted to ask you, for teachers, if you're ever in that position as a teacher and you have to write a basic test for one of those reasons, what are some simple rules that teachers might follow?
Dan: Actually, the thing that is most approachable to teachers is quite short but regular quizzes and assessments. Something that you're doing fairly regularly, that's fairly low stakes and that you're doing in every class, that's going to give you more information. It's also going to mean that you can learn better.
You write a pop quiz that's got 10 questions and you've got, "What does your best friend look like?" and you get some really weird answers, but you start to think about how people are responding and that's a really interesting exercise.
How Does Testing Affect Teaching and Learning?
Ross: Dan, final thing to talk about with you with testing today is how do tests affect what teachers end up doing in the classroom?
Dan: That's a really important question. It's an important question for the assessment community and it's called wash back, sometimes impact or a few other words. It means when you introduce a test or an assessment to any kind of education system, you distort that education system.
As soon as a teacher or a student knows there's going to be a test at some point in their education system, it affects how they behave. If the test is not very good at measuring what you're trying to measure, you'll get a negative effect and that's why it's called negative wash back.
Ross: This is like one of the ones where it's a speaking test but it's tested through multiple choice, so no one in the class speaks, they just practice multiple choice questions.
Dan: That's right. Or it might be that they know, for example, that there's 800 questions they might be asked in that test and so they spend all of their time just practicing those 800 questions because they've got all of the past papers, but they're not actually practicing the language.
Or it could be that that test focuses very specifically on one skill area. Receptive skills are very common, and grammar and vocabulary, just because they're easier to assess at scale.
Ross: Usually this is one of the reasons why speaking often happens so little in a lot of classrooms. It's because it's really difficult to test.
Dan: It's very difficult to assess at scale and cost effectively, but actually, it's one of the first things that we do as language teachers. One of the first skills that a language teacher learns is how to assess someone's speaking skills and get to know it but it's just very hard to scale that.
You can have positive wash back where, let's say, you're in an education system where there isn't much speaking being practiced. If you introduce a decent speaking test into that education system, in a cost‑effective way, suddenly everyone's going to spend more time practicing speaking.
That will mean that they have more opportunities later in life. It means they're better prepared to pass high‑stakes exams or get into a good university.
When people talk about, in practice, backwash or wash back or whatever, they often talk about it in a negative sense, but actually, in a lot of the contexts I've seen it working in, if you're replacing something that's old or wasn't there at all or needs some adjustment, you can have a really positive effect on people's lives.
Tracy: Thanks very much Dan for coming to our podcast.
Dan: It was a pleasure.
Ross: Great! Bye, Dan! See you soon.
Tracy: Bye, guys!