
Seeing the (lack of) value of standardized assessment first hand with WayFind

April 11, 2012

I’ve always had a thing for standardized testing. That’s because I thought the tests meant something, and because I was good at them. I can remember learning what the word stanine meant as a 4th grader, and how proud I was to be in the 9th stanine. In high school, I took the SAT more than half a dozen times, chasing a perfect 1600 that I never achieved. As a naive high school student, I never really thought about what these tests measured. I just assumed they measured “intelligence,” and that scoring well on them must be a good thing. I’m too ashamed to admit what I thought of peers who didn’t score as well. Luckily, my last experience with standardized testing was a rather humiliating encounter with the Physics GRE my senior year, and I left college much more humbled than I entered it.

A few years spent working as a college counselor helped me to see just how warped my perspective on standardized testing was. In that job, I encountered incredible students: intellectual leaders in their classes, making outstanding contributions to the extracurricular life of the school, who found their college applications tarnished by some difficulty with standardized testing. These students all went on to succeed in college, and in countless conversations with other college counselors and admissions officers, I came to realize that all these tests do is provide a convenient basis for comparing students across the world with a single number. In the age of high-pressure college admissions, where officers have to read hundreds of applications a day, this can be a very helpful thing. The key is to remember all the things that number does not measure, all the ways in which it might be a distorted comparison, and all the ways it may be abused when these things are forgotten.

Still, these lessons remained somewhat abstract, and it wasn’t until very recently that they hit home for me in the most real way.

My school has made major investments in technology—we’re implementing a 1:1 program throughout the school, we’ve made massive upgrades to bandwidth and technology infrastructure, and we’ve increased support staffing and professional development to help teachers develop ways to use this technology in their classes that align with 21st century learning. This investment has cost millions of dollars, and naturally, as with most investments, someone is going to want some data to see what the return on that investment is.

And so it was with technology. We needed a way to efficiently measure how teachers use technology in their teaching so that we can target training and support for the future. The first question was: what should we measure? Luckily, some very thoughtful educators have already developed the ISTE NETS-T standards and performance indicators. These are mostly excellent—they provide a clear framework for what a technology-empowered teacher should be able to do.

Enter WayFind

The real question is how do you measure these standards? I’ll present how I think you might do this at the end of this post, but my school chose to measure them by having the full faculty take a 45 minute multiple choice test, the WayFind Teacher Assessment for Effective 21st Century Teachers, designed by learning.com. Learning.com is a supplemental business of the 18 billion dollar conglomerate Educomp Solutions Ltd, based in India.

Of course, as soon as a new set of standards is created for just about anything in education, a cottage industry of testing outfits tries to develop some new “diagnostic instrument” aligned to those standards, eager to “monetize this space with a disruptive but authentic assessment that takes advantage of the latest advances in automated scoring technology.”

At first, WayFind seems particularly promising, since it is touted as more than just a multiple-choice test. It features task questions that require the user to show how to do a particular task on a simulated computer.

WayFind loses its way

And so I spent one afternoon taking this WayFind assessment—standardized tests were my friend, right? Very quickly, I learned something was up with this particular standardized test. Many of the questions were vague or ambiguous.

Many of the questions ask you to choose the “best” method, and while I’m sure that some methods are better than others, I often found myself being able to make a case for multiple methods, depending on circumstances. Here’s an example:

Suppose you were hosting a workshop on digital literacy for parents. What is the best way to invite parents to such a workshop?

  1. send out an E-vite
  2. send an email
  3. write a letter
  4. write a blog post

Though I’m not sure my answer is correct, I think there’s a very strong case to be made that sending a letter home would be the best way to attract those who aren’t digitally literate to a workshop on digital literacy. Of course, if the entire school has a working mailing list, with email addresses for all parents, that might also be the best way. I could even see that a blog post might very well be the best method for a school with a regularly updated blog that many parents go to see, since such a workshop probably isn’t so critical that we need to clog parent inboxes or mailboxes with invites.

Other questions asked you to simply select the right keyboard shortcut for a common task, like opening a file. However, all the provided answers were Windows shortcuts, and our faculty use Macs. While I think shortcuts are vital, and one of those small stepping stones that really empower users to excel with technology, I’m not sure testing whether a user remembers a particular shortcut from a different platform does much to tell me whether he or she is an effective 21st century educator.

Then there are the tasks. They presented you with a simulated computer interface that looked like a bad reproduction of the Windows 95 interface, complete with a cramped 640×480 window that I remember from more than a decade ago. The tasks were simple: show students how to save this file. You had to walk your way through this artificial interface, and when you clicked on the right (or wrong) button, presto, your answer was recorded and you moved on to the next question with no feedback. Of course, even the world’s worst computer interface wouldn’t do this—if I were truly trying to save a file, I would get some confirmation from the user interface that the file had been saved, and if I didn’t, I would know to go back and try again.

Mostly, I found myself puzzling over what this test was actually testing, and how these questions, which seemed to be plucked from 2008, were measuring my mastery of the ISTE standards. I wonder what statistical tests the designers of WayFind did to verify that their test measures what they say it does. This information isn’t available on the WayFind website.

Some reflections on my results

A couple of days later, I got my results, which I have included here in full disclosure.


I don’t want to be immodest, but it’s a bit hard to understand how someone who helped to start the Global Physics Department, and has attended every single meeting in the past year, even setting up a VPN connection between my iPad and my home computer to attend from Puerto Rico, only qualifies as “proficient” in Professional Growth. Similarly, I’ve written and read extensively on Digital Citizenship, surely to a point past “basic.” I’ve skyped with teachers from across the country to measure the circumference of the earth, which must be an advanced example of “digital age learning experiences.” My students have completed capstone projects on their own blogs that have gathered feedback from teachers and professors across the nation—surely a sign of “digital age learning” and “creativity.” I was at least heartened by these results when an art teacher colleague told me that WayFind identified creativity as an area for growth, and a librarian said he had missed questions on information technology and copyright.

My score is actually below average for my department and just barely above the average of the school as a whole. Yet, I’m tasked with being the Department Integration Specialist responsible for helping my colleagues use technology to increase the effectiveness of their teaching.

If you look more closely at the results, you’ll see that almost every individual standard is tested by only three questions, so the difference between proficient and advanced could very likely come down to a disputed interpretation of a single ambiguous question.
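The arithmetic here is worth spelling out. With only three questions per standard, each question is worth a third of the subscore, so one disputed answer swings the result by roughly 33 percentage points. A minimal sketch, in Python, assuming a hypothetical rating cutoff (WayFind does not publish its actual thresholds):

```python
QUESTIONS_PER_STANDARD = 3

def subscore(correct):
    """Fraction of questions answered correctly for one standard."""
    return correct / QUESTIONS_PER_STANDARD

def rating(score, advanced_cutoff=0.9):
    """Map a subscore to a rating, using an assumed cutoff of 90%."""
    return "advanced" if score >= advanced_cutoff else "proficient"

# 3/3 correct: rated advanced.
print(rating(subscore(3)))
# Miss a single ambiguous question and the subscore drops a full
# 33 points, from 100% to 67%: rated merely proficient.
print(rating(subscore(2)))
```

Whatever the real cutoffs are, any threshold between 67% and 100% makes the advanced/proficient distinction hinge entirely on one question.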

That’s when I remembered the lessons I’ve learned from standardized testing—I just never thought they would apply to me. In almost every case, standardized tests are shallow and impoverished attempts to measure something. The SAT doesn’t measure intelligence—it has gone through so many revisions that the acronym SAT now literally has no meaning. And WayFind doesn’t even begin to measure technological expertise or effectiveness as a 21st century teacher.

But that’s not to say that these tests don’t have consequences. I must say I felt inadequate upon seeing these results, and wondered immediately how a 9th grader, forced to take the PSAT for literally no reason, must feel about scoring in the 20th percentile in math, when he has yet to study all of the math on the test. What does this do to that student’s confidence in math? If taking assessments like this helps me and my colleagues to develop a greater sense of empathy for students and the struggles they face with standardized testing, that would be a wonderful benefit.

Just having the data itself might also have consequences. It’s in our nature to want to use all the information in front of us, even when that information is deeply flawed or incomplete. I’ve seen colleagues at previous schools turn to PSAT scores to make decisions about whether a student deserves to be in an honors science class—despite the fact that the teacher has never studied whether PSAT scores predict success in an honors science class (likely because no such correlation exists).

I trust that when my administrators say that these scores are diagnostic, they are just that. But I know we are all human, and if we had this score that purported to measure each teacher’s effectiveness as a 21st century teacher, wouldn’t you be tempted to use it? Maybe in assigning classes, you’d be just a small bit inclined to pair up that low scoring teacher with a high scoring colleague so that they can plan classes together. And certainly, when those hard decisions come around and you have to decide whether to renew a contract, you would not let an abysmally low score add just a bit more weight to the “do not renew” decision, would you? I would certainly be tempted to use this information, simply because it is there.

Most importantly, what do I do now that I’m deemed proficient? How do I become advanced? How does this test help me to learn? I don’t get to see the questions I missed, nor, as far as I understand, do my administrators. This assessment doesn’t tell me what things I can do successfully, nor does it provide me with challenges to further improve my skills. This provides me with very little opportunity to grow, and only the vaguest possible notion of my weaknesses. Again, this makes me feel lots of empathy for students across the country who get standardized test results back and only see a single number.

A possible alternative

We could do so much better. Why do we need to turn to impoverished assessments like this when we can design far better assessments on our own? The task assignments on WayFind give us a glimpse of what could be a truly useful assessment. You’re taking this test on a computer—so give the teacher a real task. Here are just a few:

  • Show that you can successfully take an image from the web, add callouts to that image, and properly cite it for use on an assessment.
  • Record a screencast annotation of grading a student paper, and post it on a blog.
  • Start a Twitter account and find 5 teachers at your school and 5 teachers outside of your school in your discipline to follow.
  • Troubleshoot a common error message with Google.
  • Create a document you wish to email to a colleague, save it as a PDF, and upload it to a cloud-based service like Dropbox so that you can email the document to the colleague without the need for an attachment.

You could generate tasks at every level so that every faculty member could demonstrate mastery of some task, and still have other tasks that would give them ideas and challenges for future growth.

If the goal is to have every teacher develop into an effective 21st century teacher who understands technology, why not ask faculty to keep a portfolio, and provide and reflect upon artifacts from their own teaching that they feel demonstrate these standards? This would seem far more beneficial to the faculty member, as it would leave them with something tangible upon completing the assessment, and similarly for administrators, who would have real examples of effective teaching they could point to.

I wish I could say that WayFind is a magical standardized assessment that will help you to diagnose the technology needs of your department or school. Certainly, it is in its early stages as an assessment, and maybe it will grow into something more useful. But based on my experience, and conversations with a number of colleagues, I can’t recommend it now. There are meta-lessons to be learned from taking a standardized assessment as an adult, especially one as poorly designed as WayFind, but if this is the lesson you seek to teach, you’ll save money and time by having your faculty read this 35-year-old’s account of taking the SAT as an adult.

8 Comments
  1. April 18, 2012 7:52 am

    Test taking skills are REALLY important, John!

  2. April 18, 2012 8:43 am

    How long will it be before there are Kaplan (etc.) courses available for teachers to “improve their performance” on the WayFind assessment? I see an opportunity for someone to monetize this space (sadly, I’m probably only about 6 months ahead of the curve on this).

    • April 18, 2012 8:49 am

      I fear you may be right. In fact, I think Educomp Solutions could just make another subsidiary devoted to prepping for this assessment. I’d really love to know how much profit the College Board makes selling officially licensed AP and SAT study guides compared to test fees.

  3. Frank Lock permalink
    April 18, 2012 9:13 am

    I have been tutoring 3rd and 5th graders in an after school program, working to improve their performance on the math CRCT. The program was poorly managed (a change was recently made which may improve the situation) and this resulted in some classroom management challenges. I have been using Cognitive Instruction in Mathematical Modeling strategies, and incorporating CRCT questions in the practice sheets the students work on. The CRCT questions are multiple-choice, and with very little thought the students would select an answer and be done very quickly. It took me way too long to realize that to make these questions useful, I needed to eliminate the choices. Once I did, the students made some real progress in thinking about how to solve the CRCT-type problems.
    Based on that experience, I believe the standardized test/multiple choice format of the CRCT is largely responsible for the poor performance of many students. The format sure makes grading the test easy though – is that the real goal of that test, and all multiple choice standardized tests?

  4. jsb16 permalink
    April 18, 2012 9:54 pm

    Standardized testing is the lazy way of assessing people, whether it’s done to 3rd graders or just-pre-retirement civil servants. It’s being pushed by the people who sell the tests and the test prep materials. (Did you see the article on how awful the FCAT science is?)

    The 35-year-old-man-retakes-SAT essay was funny. I particularly liked “It’s as if the entire test had been conceived of and written by the SS,” because, as I understand it, the SAT was conceived of and written as a way “prove” that WASPs were better suited for college than others (especially Jews) without being as overt as quotas. [That plan fell through when Kaplan (the founder of the company) started teaching his Jewish friends how to do well on the SAT...]

    • April 18, 2012 10:05 pm

      The history of college admissions is fascinating— The Chosen details how admissions essays, and application reads looking at extracurriculars for “well-rounded” people were also tactics taken by Ivy League universities to exclude Jews. It’s a very sad history, and I wish that students learned more of it as they applied for college.

      Here’s the link to the article on the FCAT science assessment. From what I read here, it’s certainly in the running for worst standardized test of the year.

  5. April 18, 2012 11:35 pm

    There have been studies of the correlation between pSAT and science courses. Of course the easiest thing to measure performance with is yet another standardized test (like the AP tests). http://www.collegeboard.com/counselors/app/score.html has correlations between pSAT and various AP tests. The lowest is between math and Physics C (the pSAT math is way too low level to be predictive).
