Seeing the (lack of) value of standardized assessment firsthand with WayFind
I’ve always had a thing for standardized testing. That’s because I thought the tests meant something, and that’s because I was good at them. I can remember learning what the word stanine meant as a 4th grader, and how proud I was to be in the 9th stanine. In high school, I took the SAT more than half a dozen times, with the ultimate goal of earning a perfect 1600 that I never achieved. As a naive high school student, I never really thought about what these tests measured. I just thought they measured “intelligence” and the fact that I scored well on them must be a good thing. I’m also too ashamed to admit what I thought of peers who didn’t score as well. Luckily, my last experience with standardized testing was a rather humiliating encounter with the Physics GRE my senior year, and I left college much more humbled than I entered it.
A few years spent working as a college counselor helped me to see just how warped my perspective on standardized testing was. In that job, I encountered incredible students, intellectual leaders in their classes who made outstanding contributions to the extracurricular life of the school, yet found their college applications tarnished to some degree by difficulty with standardized testing. These students all went on to see success later in college, and in countless conversations with other college counselors and admissions officers, I came to realize that all these tests do is provide a convenient basis for comparing students across the world with a single number. In the age of high-pressure college admissions, where officers have to read hundreds of applications a day, this can be a very helpful thing. The key is to remember all the things that number does not measure, all the ways in which it might be a distorted comparison, and all the ways it may be abused when we forget these things.
Even so, these lessons remained somewhat abstract, and it wasn’t until very recently that they hit home for me in the most real way.
My school has made major investments in technology: we’re implementing a 1:1 program throughout the school, we’ve made massive upgrades to the bandwidth and technology infrastructure, and we’ve increased support staffing and professional development to help teachers develop ways to use this technology in their classes that align with 21st century learning. This investment has cost millions of dollars, and naturally, as with most investments, someone is going to want some data to see what the return on that investment is.
And so it was with technology. We needed a way to efficiently measure how teachers use technology in their teaching so that we could target training and support for the future. The first question is: what should we measure? Luckily, some very thoughtful educators have already developed the ISTE NETS-T standards and performance indicators. These are mostly excellent; they provide a clear framework for what a technology-empowered teacher should be able to do.
The real question is: how do you measure these standards? I’ll present how I think you might do this at the end of this post, but my school chose to measure them by having the full faculty take a 45-minute multiple-choice test, the WayFind Teacher Assessment for Effective 21st Century Teachers, designed by learning.com. Learning.com is a supplemental business of the 18-billion-dollar conglomerate Educomp Solutions Ltd, based in India.
Of course, as soon as a new set of standards is created for just about anything in education, a cottage industry of testing outfits tries to develop some new “diagnostic instrument” aligned to these standards, eager to “monetize this space with a disruptive but authentic assessment that takes advantage of the latest advances in automated scoring technology.”
At first, WayFind seemed particularly promising, since it is touted as more than just a multiple-choice test. It features task questions that require the user to show how to do a particular task on a simulated computer.
WayFind loses its way
And so I spent one afternoon taking this WayFind assessment. Standardized tests were my friends, right? Very quickly, I learned something was up with this particular standardized test. Many of the questions were vague or ambiguous.
Many asked you to choose the “best” method, and while I’m sure that some methods are better than others, I often found myself able to make a case for multiple methods, depending on the circumstances. Here’s an example:
Suppose you were hosting a workshop on digital literacy for parents. What is the best way to invite parents to such a workshop?
- send out an E-vite
- send an email
- write a letter
- write a blog post
Though I’m not sure my answer is correct, I think there’s a very strong case to be made that sending a letter home would be the best way to attract those who aren’t digitally literate to a workshop on digital literacy. Of course, if the entire school has a working mailing list, with email addresses for all parents, that might also be the best way. I could even see that a blog post might very well be the best method for a school with a regularly updated blog that many parents go to see, since such a workshop probably isn’t so critical that we need to clog parent inboxes or mailboxes with invites.
Other questions asked you to simply select the right keyboard shortcut for a common task, like opening a file. However, all the provided answers were Windows shortcuts, and our faculty use Macs. While I think shortcuts are vital, and one of those small stepping stones that really empower users to excel with technology, I’m not sure testing a user on whether he or she remembers a particular shortcut from a different platform does much to tell me whether he or she is an effective 21st century educator.
Then there were the tasks. They presented you with a simulated computer interface that looked like a bad reproduction of the Windows 95 interface, complete with a cramped 640×480 window that I remember from more than a decade ago. The tasks were simple: show students how to save this file. You had to walk your way through this artificial interface, and then when you clicked on the right (or wrong) button, presto, your answer was recorded and you moved on to the next question with no feedback. Of course, even the world’s worst computer interface wouldn’t do this. If I were truly trying to save a file, I would get some confirmation of the file being saved from the user interface, and if I didn’t, I would know to go back and try again.
Mostly, I found myself puzzling over what this test was actually testing, and how these questions, which seemed to be plucked from 2008, were measuring my mastery of the ISTE standards. I wonder what statistical tests the designers of WayFind did to verify that their test measures what they say it does. This information isn’t available on the WayFind website.
Some reflections on my results
A couple of days later, I got my results, which I have included here in the interest of full disclosure.
I don’t want to be immodest, but it’s a bit hard to understand how someone who helped to start the Global Physics Department and has attended every single meeting in the past year, even setting up a VPN connection between my iPad and my home computer to attend from Puerto Rico, only qualifies as “proficient” in Professional Growth. Similarly, I’ve written and read extensively on Digital Citizenship, surely to a point past “basic.” I’ve Skyped with teachers from across the country to measure the circumference of the Earth, which must be an advanced example of “digital age learning experiences.” My students have completed capstone projects on their own blogs that have gathered feedback from teachers and professors across the nation, surely a sign of “digital age learning” and “creativity.” I took some heart when an art teacher colleague told me that WayFind had identified creativity as an area for growth, and that a librarian had missed questions on information technology and copyright.
My score is actually below average for my department and just barely above the average of the school as a whole. Yet, I’m tasked with being the Department Integration Specialist responsible for helping my colleagues use technology to increase the effectiveness of their teaching.
If you look more closely at the results, you’ll see that almost every one of the individual standards is tested by only three questions, and the difference between proficient and advanced could very likely be a disputed interpretation of a single ambiguous question.
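To put a number on how coarse a three-question subscale is, here is a minimal sketch in Python. It assumes each question is simply marked right or wrong and that a standard’s score is the fraction answered correctly. I don’t know how WayFind actually turns answers into levels, so the function names and the scoring rule below are my own assumptions, not theirs.

```python
from fractions import Fraction


def possible_scores(num_questions):
    """Every achievable fraction-correct score on a subscale (assumed scoring rule)."""
    return [Fraction(k, num_questions) for k in range(num_questions + 1)]


def score_step(num_questions):
    """How far a single question moves the fraction-correct score."""
    return Fraction(1, num_questions)


# A standard tested by three questions, as on my results page.
scores = possible_scores(3)
print([f"{float(s):.0%}" for s in scores])   # the only scores anyone can earn
print(f"{float(score_step(3)):.0%}")         # shift caused by one question
```

Under that assumption, there are only four scores anyone can earn on a given standard, and one disputed question moves you a full third of the scale.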
That’s when I remembered the lessons I’ve learned about standardized testing; I just never thought they would apply to me. In almost every case, standardized tests are shallow and impoverished attempts to measure something. The SAT doesn’t measure intelligence; it has gone through so many revisions that the acronym SAT now literally has no meaning. And WayFind doesn’t even begin to measure technological expertise or effectiveness as a 21st century teacher.
But that’s not to say that these tests don’t have consequences. I must say I felt inadequate upon seeing these results, and wondered immediately how a 9th grader, forced to take the PSAT for literally no reason, must feel about scoring in the 20th percentile in math, when he has yet to study all of the math on the test. What does this do to that student’s confidence in math? If taking assessments like this helps me and my colleagues to develop a greater sense of empathy for students and the struggles they face with standardized testing, that would be a wonderful benefit.
Just having the data itself might also have consequences. It’s in our nature to want to use all the information in front of us, even when that information is deeply flawed or incomplete. I’ve seen colleagues at previous schools turn to PSAT scores to make decisions about whether a student deserves to be in an honors science class, despite the fact that the teacher has never studied how PSAT scores predict success in an honors science class (likely because no such correlation exists).
I trust that when my administrators say that these scores are diagnostic, they are just that. But I know we are all human, and if you had a score that purported to measure each teacher’s effectiveness as a 21st century teacher, wouldn’t you be tempted to use it? Maybe in assigning classes, you’d be just a small bit inclined to pair up that low-scoring teacher with a high-scoring colleague so that they can plan classes together. And certainly, when those hard decisions come around and you have to decide whether to renew a contract, you would not let an abysmally low score add just a bit more weight to the “do not renew” decision, would you? I would certainly be tempted to use this information, simply because it is there.
Most importantly, what do I do now that I’m deemed proficient? How do I become advanced? How does this test help me to learn? I don’t get to see the questions I missed, nor, as far as I understand, do my administrators. This assessment doesn’t tell me what things I can do successfully, nor does it provide me with challenges to further improve my skills. This provides me with very little opportunity to grow, and only the vaguest possible notion of my weaknesses. Again, this makes me feel lots of empathy for students across the country who get standardized test results back and only see a single number.
A possible alternative
We could do so much better. Why do we need to turn to impoverished assessments like this when we can design far better assessments on our own? The task assignments on WayFind give us a glimpse of what could be a truly useful assessment. You’re taking this test on a computer—so give the teacher a real task. Here are just a few:
- Show that you can successfully take an image from the web, add callouts to that image, and properly cite it for use on an assessment.
- Record a screencast annotation of grading a student paper, and post it on a blog.
- Start a Twitter account and find 5 teachers at your school and 5 teachers outside of your school in your discipline to follow.
- Troubleshoot a common error message with Google.
- Create a document you wish to send to a colleague, save it as a PDF, and upload it to a cloud-based service like Dropbox so that you can email the colleague a link without the need for an attachment.
You could generate tasks at every level so that every faculty member could demonstrate mastery of some task, and still have other tasks that would give them ideas and challenges for future growth.
If the goal is to have every teacher develop into an effective 21st century teacher who understands technology, why not ask faculty to keep a portfolio, and provide and reflect upon artifacts from their own teaching that they feel demonstrate these standards? This would seem to be far more beneficial to the faculty member, as it would leave them with something tangible upon completing the assessment, and similarly for administrators, who would have real examples of effective teaching they could point to.
I wish I could say that WayFind is a magical standardized assessment that will help you to diagnose the technology needs of your department or school. Certainly, it is in its early stages as an assessment, and so maybe it will grow into something more useful. But based on my experience, and conversations with a number of colleagues, I can’t recommend it now. There are meta lessons to be learned from taking a standardized assessment as an adult, especially one so poorly designed as WayFind, but if this is the lesson you seek to teach your faculty, you’ll save money and time by having them read this 35-year-old’s account of taking the SAT as an adult.