Archive for 2011


Can we slow the flow of money to test thieves?


Friday, October 28th, 2011

By: Dennis Maynes, Chief Scientist, Caveon Test Security

This week, Julian Assange, founder of WikiLeaks, announced that his organization is running out of money and may be forced to cease operations by the end of 2011. On October 24, 2011 Reuters reported: “WikiLeaks says ‘blockade’ threatens its existence.” (Source: http://www.reuters.com/article/email/idUSTRE79N46K20111024) The blockade occurred when the major financial processing firms suspended their agreements with WikiLeaks, after WikiLeaks released thousands of secret US diplomatic cables in December, 2010, and threatened the Bank of America with the release of internal documents which resulted in a 3% decrease of Bank of America’s share price.

Assange claims the blockade is illegal and has filed anti-trust lawsuits against Visa and MasterCard. On the day before the blockade, WikiLeaks received $135,000. Currently, WikiLeaks receives less than $10,000 per month. The net effect of the blockade on WikiLeaks has been the loss of 95% of its operating cash.

Whether you agree with WikiLeaks’ goals or not, it is clear that WikiLeaks has routinely infringed upon the rights of copyright holders by distributing information and documents without authorization. If it is not obvious why this story has important test security ramifications, let me make it clear: (1) many websites, operated by pirates and thieves, infringe upon the copyrights of secured exam content, (2) it has been very difficult to effectively shut down this activity, which is costing testing organizations millions of dollars per year in lost test development expenditures, and (3) if payment processors would agree to cease providing services to these thieves and pirates, many of them would fold. The WikiLeaks story demonstrates that copyright infringers will have a difficult time remaining in business without the support of payment processors.

At Caveon, we have been very successful in removing copyrighted exam materials from the Internet. Often our success is based upon respectful and courteous requests to unintentional copyright infringers. However, respect and courtesy do not work against pirates and thieves. At that point, potentially expensive legal action must be commenced.

An alternative to expensive legal proceedings is to work with payment processors to protect their brands. For example, Visa does not want any transaction to bring disrepute upon its brand (source: http://corporate.visa.com/_media/visa-international-operating-regulations.pdf). If we, as an industry, can convince the payment processors that the sale and distribution of pilfered exam content is disreputable, we may be able to slow the flow of money to the test thieves and protect valuable exam content.

What do you think? How can we help payment processors understand that their services facilitate the distribution of stolen exam content? Should ATP (Association of Test Publishers) contact the payment processors, on behalf of its members?

Several months ago, Ben Mannes, Test Security Director at ABIM, expressed this thought: “ATP should be trying to get a meeting with Victoria Espinel [White House intellectual property czar], bring 1-2 industry security experts, and state the case as to why exam content is a vital component to our nation’s infrastructure requiring heightened public sector IP enforcement.”

***************

Please comment below. Thank you for reading.



Eight Years of Improving Security


Friday, October 21st, 2011

By: Steve Addicott, Caveon Vice President

October is an important month for Caveon. Eight years ago, in October 2003, several assessment industry veterans formed a small consulting company focused solely on improving the security of our clients’ test programs. That company is Caveon Test Security!

Fast forward to 2011, and it’s gratifying to consider what this entrepreneurial group of test security zealots has accomplished.  Since that fateful October day, we have

  • conducted over 50 Security Audits of leading test organizations and vendors,
  • flagged and removed tens of thousands of internet-based risks, and
  • conducted statistical analyses of over 30,000,000 test instances for many of the largest, most important test programs in the world.

As I consider the number and breadth of these engagements, perhaps it is worth sharing a few of the core values under which we always operate:

Confidentiality

Throughout our years of operation, one fundamental operating principle has always applied: client confidentiality. We never reveal the details of our client engagements without the express approval of our clients. Our clients require and appreciate this sensitivity as we investigate security incidents and provide reports on our forensic analyses. This is not secrecy; this privacy stems from respect for our clients and for the right to privacy of individuals and organizations.

Innovation

We constantly strive to improve means and methods for strengthening exam security. We are always interested in sharing the nature of our work. Not only do we share our methods and science with clients, client stakeholders, TAC members, educational measurement researchers, and other appropriately interested parties; we are also committed to furthering the science around test security. We regularly present at conferences and webinars where we openly share our Caveon approach, theories, and methodologies. In fact, this past year we have presented at conferences in Phoenix, Orlando, Chicago, Seattle, Washington DC, Hong Kong, and Prague.

Conservative Recommendations

When we conduct an engagement, our approach is to focus on the situations and incidents that are most egregious, as evidenced in the data and the results that we analyze. We highlight those problems that are most readily identified, documented, and ideally, resolved. Dealing with these problems effectively will have the greatest positive impact on the overall validity and security of test results. This reasonable approach helps our clients, most of which suffer from ever-constrained budgets and resources, effectively concentrate their time, resources, and dollars where the likelihood of inappropriate test taking is highest.

Lastly, our growth and success are directly attributable to a few overarching principles: we always strive to exceed our clients’ expectations, comport ourselves honorably, provide valuable services, and share, as openly and honestly as we can, recommendations for improving the fairness and validity of our clients’ test programs. These principles result in proven, practical protection for our clients, and we intend to follow them for another eight years!

Please Submit Your Comments Below. Thank you!



Item Exposure Is Not the Problem — Poor Security Is


Friday, October 14th, 2011

By: David Foster, CEO, Caveon Test Security

Item exposure during an exam in the testing world is often viewed as a bad thing, because it seems obvious that item exposure leads to item over-use, which in turn leads to item compromise. It is common for psychometricians to limit item exposure, defining it as either a too-high absolute number of presentations of the items in a test, or a too-high rate at which the items are presented on tests. Unfortunately, there is no scientific research, nor even unscientific guidelines or reasonable casual suggestions, about how many exposures are too many or which rate of exposure is too high.

It does not follow that item exposure is the same as item compromise. In fact, I’ve seen items compromised with an extremely small number of presentations. Some items have even been compromised prior to the first test being administered!

In my opinion, the notion that item compromise results from item exposure, as defined above, leads to improper conclusions and decisions and to ineffective procedures. I have a few reasons for this opinion, a couple of which I’ll give here. First, item exposure is absolutely necessary. It is obvious that no test can be effective unless its items are exposed during the exam. Test designers even let examinees view an item multiple times, encouraging them to return to and review previous items again and again. Second, item compromise has very little to do with the definitions of item exposure given above. Consider this simple example: Suppose that an item was shown to one million test takers and was presented on every exam administered. This would be considered a very high number of exposures along with a 100% exposure rate. But suppose that none of those examinees were able to share the item with others. In this simple example, the item remains uncompromised and perfectly secure, and can continue to be used on the exam.

If we wish to reduce item compromise, the example illustrates that limiting the number of presentations or rate of presentations of an item is not as important as the methods used to secure the items, to protect them from theft, and to keep them from being used for cheating. For this reason we need improved item security, which means better ways to keep items from being stolen and used for cheating on subsequent exams. We need methods to detect when an item is truly compromised and then to take it out of service immediately. Instead, we often see stubborn adherence to a century-old model of relatively unsecure test administration, and a belief that keeping an item from being presented on a test is a sensible way to secure it.

It is certainly possible to improve the way we secure items. As examples, there are protective item and test designs available, and certainly better test monitoring procedures, that we can use. And perhaps we can learn a little from other industries as well. Consider the problem with the theft of music over the Internet. No one would suggest that music is stolen because it was listened to by too many people. Instead, we see serious efforts to protect the music, to keep it from being stolen, to detect when it is stolen, and to punish those that are responsible. We should be doing the same.

We welcome comments below!



Beautiful Prague and a Very Successful ATP Conference


Friday, October 7th, 2011

By: John Fremer, President, Caveon Consulting Services

The third European ATP Conference concluded the last week in September in the spectacularly lovely city of Prague in the Czech Republic, and it was a rousing success. Attendance was 225, besting the planners’ target of 200 attendees. The keynote talks were of exceptionally high quality and there were a large number of productive and well-attended sessions. The weather was also as good as one can imagine and the city welcomed us in every way.

I pay special attention to how the prevention of cheating and test piracy is addressed at any conference that I participate in, and I was struck by the substantial increase in attention to the topic from even a year ago, both in formal sessions and in conversations with other attendees. Part of the reason seems to be a high level of awareness of developments in the US, especially in our state assessment programs. There is also a strong country-specific set of reasons related to security breaches or cheating episodes in critical programs that have received extensive media coverage. As is the case in the US, once a cheating story breaks in the UK, the Netherlands, or other European countries, it tends to get a great deal of coverage that can last months or more.

There was a good deal of attention to “authentication,” improved ways of using biometrics, proctor training, and closer monitoring of the testing process to make it harder for test takers to invalidate our best efforts to ensure fair testing. The degree to which testing transcends borders within Europe and in the larger world was also emphasized. Steve Addicott of Caveon and Aimee Rhodes of the Chartered Financial Analysts spoke to a packed house on the international aspects of testing and security. Cheaters and pirates can be based in one country in the morning and another in the afternoon if you are resourceful enough to shut down or limit the place where their day started. The situation has been compared to the arcade game “Whack-a-mole,” where as soon as you hit one varmint, another pops up out of a different hole. I like that metaphor as it reflects my view of these unscrupulous enemies of fairness in testing.

Several of the keynotes really impressed me. Two were given by very successful entrepreneurs. Madan Padaki from Bangalore, India, CEO of MeritTrac, in an address entitled “The 500 Million Dream: Building a Nation,” described the progress made in India raising 400 million people out of poverty. It is an astonishing story, as is the development and growth of MeritTrac into a major provider of testing services in a ten-year period. Madan did not say his path had been easy. Rather, he indicated that he might not have made the attempt if he had realized the challenges that he would face.

Another extraordinary session was given by Lucian Tarnowski, the driving force behind “Brave New Talent,” a social-media-based way of nurturing and locating talent. Lucian is all of 27 years old and talks about digital immigrants, i.e., most of the people now working in assessment. We are “newly arrived” to a world with so many ways to be connected. It was not that way when most of us were in school or starting our careers. Digital natives, by contrast, grew up in this world and it is very familiar to them. Lucian describes his own Dad, who has stayed with his typewriter as his way of composing and communicating, as a “digital refugee.” I emerged from that session convinced that I am way overdue on my promise to myself to use social networks wherever it will help me keep in touch with the colleagues, clients, and fellow professionals with whom I share interests and goals.

Another session, a plenary in the form of a debate, saw Cor Sluijter of CITO do a very fine job defending our efforts to produce high quality tests that serve valuable purposes against Donald Clark of Learn Direct. Clark pointed out a number of flaws in testing and argued that assessment is not “fit for purpose” in the 21st century world. Clark’s criticisms were thoughtful ones, but Sluijter held his own. All of us who attended welcomed the fact that both did their best, with the help of Eugene Burke of SHL, who moderated with élan, to show us different sides of a meaty and much talked about issue.

I have not captured all of the ATP Europe Conference, but I hope I have conveyed some of the substance and spirit. Next year it will be in Berlin in mid-September. The exact date has not yet been set. You will surely see a note with the date on this blog as well as in Caveon’s Newsletter “Cheating in the News.” If I get my own social media act together, you might get a tweet from me about it. I like to think this particular digital immigrant can still learn new tricks.



Empowering Schools to Use Data Forensics


Friday, September 30th, 2011

By: Dennis Maynes, Chief Scientist, Caveon Test Security

(The following is an excerpt from an invited talk that was presented to the US Department of Education, September 1, 2011.)

It was sometime after we started Caveon that I realized the primary goal of conducting security analyses was the strengthening of exam security, not catching cheaters. This is a message that resonates very well with the testing program managers with whom I have interacted. They agree that the primary goal of security actions should be to obtain trustworthy test results, which occurs when the exams are administered securely and with integrity. Disciplining cheaters is important and supports this goal, but it is only a means to an end.

Exam security can be strengthened in two ways, and both should be used: (1) Prevention of cheating, and (2) Detection and discipline of cheaters which will result in deterrence.

Prevention of cheating is gained by implementing effective security processes through policies and procedures. An important element of this effort is the periodic review of security processes and how well they have been implemented.

Detection and discipline of cheaters occurs through (1) performing regular forensic analysis, (2) qualifying the anomalies, and (3) imposing sanctions and invalidating scores.

Deterrence results when security actions and consequences for cheating are publicized.

It’s important to realize that security is a process, not a state. As an example, I have an alarm system at home. Installation of an alarm system does not mean that my home is secure. Only by arming and testing the alarm system can I be assured that it is functioning properly. Speaking of alarm systems, I am delighted when no one breaks into my home. The absence of break-ins does not lessen the value of the alarm system. I have had clients who felt that web patrolling and data forensics monitoring had no value because we did not detect security breaches. The non-existence of security breaches does not lessen the value of the security processes that have been implemented.

Except for some fraud laws, there are very few laws regulating cheating. Cheating is difficult to prove, and there is no physical evidence of material loss or harm. I often hear the phrase “Prove that I cheated.” In fact, I recently saw a headline in the papers expressing the same idea. It’s important to realize that state departments of education do not need absolute proof of cheating. They have an obligation to ensure that tests are administered securely and with integrity. In order to meet this obligation, states require a “preponderance of evidence” in order to act, not absolute proof. However, the departments of education must treat students and teachers fairly, and they must communicate policies clearly.

Because security is a process, it is important to have a ready-prepared security breach response plan, before the breach occurs. It’s not a matter of if the plan will be activated; it’s only a matter of when the plan will be activated. The planning process helps the department of education to have a focused and coordinated response for conducting investigations, imposing discipline and, of utmost importance, communicating with the public and the media.

Without such a plan, the department of education must create a response to the security breach in a potentially haphazard manner. The press is very good at uncovering haphazard and hastily prepared communications.

In summary, state departments of education are empowered to use data forensics wisely and effectively when they have implemented security policies, processes, and procedures which enable them to administer tests securely and with integrity. Regular data forensics monitoring allows states to measure and manage security risks that are inherent in all forms of high-stakes testing.



Best Practices in Computer-Based and Online Testing


Friday, September 23rd, 2011

By: David Foster, CEO, Caveon Test Security

In 2010, a very useful book was published by The Council of Chief State School Officers and the Association of Test Publishers. It is titled Operational Best Practices for Statewide Large-Scale Assessment Programs. Caveon’s very own Dr. John Fremer contributed as part of the working committee to the overall effort and provided a chapter or two. As the title suggests, the book provides some “best practices” in a good many areas of interest to all testing professionals, particularly those involved in paper-and-pencil state assessments. A testing program can use the book to evaluate its own practices, and to guide efforts at change if necessary.

Given the intense interest today in delivering tests on the computer, it’s not a surprise that there was immediate interest in a revision of the book, one that would include best practices for programs using or wishing to implement technology-based tests. These are tests that are administered on computers via local servers, or delivered online through secure browsers. Choosing the specific technology used to administer the tests is not an easy chore and should be carefully done. The newest model, online testing—testing administered securely through browsers—is becoming more and more popular with high-stakes testing programs.

But what are we to think about the concept of best practices when a methodology is new and developing, when few organizations are experienced with it? How can a best practice even be identified with so little applied experience and when change accompanies that technology almost daily? It’s my opinion that our concept of what is a best practice has to evolve if we are to find it useful in the face of new and constantly changing technology.

To solve this conundrum I’d like to propose that we adopt a more accepting approach toward innovation and technology. This means that we should seriously consider innovations even though dozens or hundreds of other programs have not yet tried them out. This optimistic attitude is critical if we are to find these innovations immediately helpful, and, more importantly, if we are to set ourselves on a path to accommodate change occurring on a more constant basis. New technologies can be evaluated against reasonable criteria that reveal how the innovation will improve the reliability, validity, security, and fairness of the tests. This is especially easy to do if by implementing the technology we are solving a long-standing concern or problem. My own experience developing and using new technologies over the past 30 years has been very rewarding.

Just a word about standards and technology. Some feel that using new technology violates or threatens standards. That certainly hasn’t been my experience. Throughout my career, as I used new technologies in testing, I have found that in each case it enhanced my ability to meet the standards rather than threatening them. An example may help here. In 1990 at Novell we implemented a new multiple choice question type that allowed for more than one correct answer. No one had used it before. It immediately helped us to eliminate confusion for our test takers from negatively worded multiple choice questions. There is no standard that states that multiple choice questions must only have a single correct answer, but there are standards that require us to improve the quality of our questions.

Now, a final word about statewide educational testing. The joint committee working on the revision of Operational Best Practices for Statewide Large-Scale Assessment Programs will provide a set of best practices in the coming months for technology-based tests. Hopefully these suggestions will be met with enthusiasm and optimism. If they are, statewide assessment programs will find it much easier to meet the very ambitious goals set by themselves, the federal government, and other stakeholders.



Spurious Investigations Arise From Flawed Statistics


Friday, September 16th, 2011

By: Dennis Maynes, Chief Scientist, Caveon Test Security

This summer, many cheating investigations of schools based on forensics evidence have been inconclusive or weak. This is troubling because investigations require time and money, and they can be disruptive. In my experience, the statistics are often questioned and with good cause.

A famous economist once said, “If you torture the data long enough, they will confess.” (Attributed to Ronald Coase: http://en.wikipedia.org/wiki/Ronald_Coase). Forensics monitoring requires inspection of all the data and listing the extreme observations. Because this is done with the intent of FINDING anomalies, extreme care is required. Knowledge of order statistics teaches how to analyze the data properly. Unfortunately, order statistics are not taught in basic statistics courses, which means that many analysts lack this knowledge and tend to torture the data. It’s like saying my basketball team is taller than your basketball team using only the height of the tallest player on each team. Such a comparison ignores the starting five and the height of all the players on the team.

Let me illustrate the concern with an example. Journalists routinely analyze the score gains of schools within the state using regressions. Because students of basic statistics courses are taught that a regression residual exceeding three standard deviations IS an outlier, the newspaper will report all schools having gains of three standard deviations or more. The named schools will receive “extra attention” in the press and by the public. Such attention is not warranted because the fact that ALL schools were examined has been ignored. The three standard deviation rule is correct only if you were to examine one school and one school only, at random. But, the analysis was not restricted to one school, which means that many schools were inappropriately named. The statistics guarantee it.
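To put rough numbers on that guarantee, here is a minimal sketch. It assumes the standardized residuals behave like independent draws from a normal distribution and that only unusually large gains (one tail) are flagged; the function name `chance_flags` is hypothetical and used only for illustration.

```python
from statistics import NormalDist


def chance_flags(n_schools, z_cutoff=3.0):
    """Return (expected chance flags, probability of at least one flag)
    when no school has a real anomaly."""
    # One-sided tail probability that a single clean school exceeds the cutoff.
    p_single = 1.0 - NormalDist().cdf(z_cutoff)
    expected = n_schools * p_single
    # Assumes schools are independent, which real data only approximates.
    p_at_least_one = 1.0 - (1.0 - p_single) ** n_schools
    return expected, p_at_least_one


expected, p_any = chance_flags(1000)
print(f"expected chance flags among 1,000 schools: {expected:.2f}")
print(f"probability of at least one chance flag:   {p_any:.2f}")
```

Under these assumptions, examining 1,000 schools yields a little more than one school flagged by luck alone on average, and the chance that at least one clean school gets named is well over one half. The expected count grows in proportion to the number of schools examined.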

Journalists are not the only group to make this error. Most forensics reports that I have read have ignored this issue. Using order statistics, statisticians have developed corrections to avoid overstating anomalies. For example, Bonferroni’s correction divides the target probability level by the sample size (http://en.wikipedia.org/wiki/Bonferroni_correction). Using this correction, 4.2 standard deviations, not 3, is the correct threshold to use if 1,000 schools were inspected.
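The correction can be sketched in a few lines. This is an illustration under stated assumptions, not the exact calculation behind the figure quoted above: it assumes a one-sided test on normally distributed residuals, and the target probability level `alpha` is a value chosen for the example. The function name `bonferroni_z` is hypothetical.

```python
from statistics import NormalDist


def bonferroni_z(n_tests, alpha):
    """One-sided z cutoff after dividing alpha by the number of units inspected."""
    if n_tests < 1 or not 0.0 < alpha < 1.0:
        raise ValueError("need n_tests >= 1 and 0 < alpha < 1")
    # Bonferroni: each individual comparison gets alpha / n_tests.
    per_test_alpha = alpha / n_tests
    return NormalDist().inv_cdf(1.0 - per_test_alpha)


# alpha = 0.01 is an assumed target level, chosen only for illustration.
for n in (1, 100, 1000):
    print(n, round(bonferroni_z(n, alpha=0.01), 2))
```

With the assumed one-sided level of 0.01, the cutoff for 1,000 schools lands between 4.2 and 4.3 standard deviations. A different target level, or a two-sided test, moves the cutoff, so the threshold should always be reported together with the level and the number of units inspected.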

The next time that you see a forensics report in the media, ask yourself, “Was the Bonferroni correction or similar conservative threshold used?” If the answer is no, be very, very careful because the number of anomalies has probably been overstated. On the other hand, if the answer is yes, the investigations are probably warranted and justified.



Courage Required


Tuesday, September 13th, 2011

By Steve Addicott, Vice President of Sales, Caveon Test Security

Over the past few weeks, I have participated in several planning sessions with Caveon clients who contract with us to analyze test results.  This service, Caveon Data Forensics ™, represents a proactive means to better protect their programs by identifying statistical anomalies that may indicate cheating.

While each of these programs is different (state education, IT certification, medical licensure, construction certification, etc.), it’s interesting to me that they face common challenges in confronting test fraud.  For each of these programs, test results matter…they really matter…in making important decisions in the lives of test takers (and in education, teachers and principals).  Thus, the integrity of the test administration matters greatly, too.

My overarching impression?   Tackling test fraud head-on is not for the faint of heart.  It takes commitment—a genuine, unwavering commitment to fair and valid test results—to say “We know there is a subset of our test-taking population that is taking shortcuts, and we’re going to do something about it.”

These days, everyone is busy.   So, why would any sane test program leader willingly add to his/her workload?  The results of our Data Forensics analyses do just that.  We identify:

  • candidates/students that may require invalidations;
  • test centers/schools that merit investigations; and
  • items/exams that should be retired and/or revised.

The committed leaders we work with understand that this is what is required to be able to stand in front of stakeholders and proclaim that their test administrations are fair and valid.

Another important takeaway I’ve gained is that a successful Data Forensics program requires the cooperation and coordination of many  groups.  Last week, I met with a client, a large state department of education.  Its leadership, in order to ensure the data forensics program possessed real teeth, sought cooperation with several other state departments:

  • Legal, to ensure any sanctions resulting from the data forensics would hold up in court;
  • Communications, to ensure sound, consistent messaging to the media and public alike;
  • Inspector General, for conducting investigations; and
  • Professional Practices, in case sanctions might be brought against a state certified educator.

This sort of cross-organizational coordination is not easy to facilitate, but it is critically important in the fight for fair and valid testing.

If you’re considering how you can augment the security of your program, you might find one of our company webinars to be a help.   In “Don’t Shoot The Messenger”, three Caveon clients present the good, the bad, and the challenging in instituting program invalidations through data forensic analyses.  You can get a copy of the webinar slides here:  http://caveon.com/df_blog/ .



Adapting Security for Twenty-First Century Testing


Friday, September 2nd, 2011

By David Foster, President, Caveon Test Security

Over 20 years ago, the first high-stakes tests were delivered using computers. The movement of using technology to administer tests was launched. The many advantages of the new ways of testing were realized almost immediately. These included the convenience of being able to take the test at any time and receive an immediate score report. Unfortunately, the new exams were more susceptible to theft and cheating because tests were given during a long testing window. When the same test is given over a long time frame, say, weeks or months or even years, opportunities exist for people early in the period to steal the questions and share them with people later in the window. How the questions are stolen isn’t all that important, but testing programs have never really come up with good security solutions. The problem remains as bad today as it has ever been.

It’s not that the problem can’t be solved. It’s just that we have tried to apply the security methods of the past century to a new way of testing. It would be like harnessing horses to pull a car. Or, using a telegraph to send an email. Proctors—the standard security default approach—are not able to stop most modern ways of stealing exams, and they are just about as ineffective at stopping cheating. And worse yet, sometimes they even do the stealing and cheating.

So, what’s the answer? How can technology-based tests be made more secure? How can a test today be protected while remaining in active service for long periods or even indefinitely? It seems logical to me that technology, having created the problem, should also provide the solution. As the saying often traced to Shakespeare’s King John goes, “fight fire with fire,” meaning that we should enhance our security efforts with new technologies.

If new security technologies can stem exam theft and cheating, what are some of them?

  1. New data forensics analyses go beyond detecting erasures on answer sheets. They can detect clusters of unusually similar tests. They can make use of item response times to detect tests that are taken too quickly or too slowly.
  2. Exams can be “watermarked” intelligently in such a way as to identify the thief if the items are later found on a website.
  3. New computer-only item formats can protect content better than traditional item designs. Items can be administered one at a time in a design that doesn’t allow for returning and reviewing them.
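As one illustration of the first item, the sketch below flags tests whose total completion time is far from typical. It is a simplified stand-in, not Caveon's method: the robust z-score rule (median and median absolute deviation), the cutoff of 3.5, the function name `flag_unusual_times`, and the sample data are all assumptions made for the example.

```python
from statistics import median


def flag_unusual_times(times_by_test, cutoff=3.5):
    """Return {test_id: robust z-score} for tests taken unusually fast or slow."""
    times = list(times_by_test.values())
    center = median(times)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(t - center) for t in times)
    if mad == 0:
        return {}  # no spread to measure against, so nothing can be flagged
    flagged = {}
    for test_id, t in times_by_test.items():
        # 0.6745 rescales the MAD so the score is comparable to a normal z-score.
        robust_z = 0.6745 * (t - center) / mad
        if abs(robust_z) > cutoff:
            flagged[test_id] = robust_z
    return flagged


# Made-up completion times in minutes, for illustration only.
sample = {"A": 42, "B": 45, "C": 47, "D": 50, "E": 44,
          "F": 46, "G": 48, "H": 43, "I": 5, "J": 120}
print(sorted(flag_unusual_times(sample)))
```

Here the five-minute and two-hour tests stand out while the rest do not. A flag of this kind is only an anomaly to be examined further, not evidence of cheating in itself.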

This list can go on, and the above methods can be refined and improved. Plus, many of these can be used effectively to improve security for paper-and-pencil tests. We’ve given the thieves and cheaters quite a head start. It’s time to fight back with fire of our own.



What Everyone Wants to Know about Cheating in Schools.


Thursday, August 25th, 2011

By Dr. John Fremer

There was a time, not that long ago, when state assessments did not create much interest in the media.  State assessment leaders and testing company professionals tried to think of ways to make testing results more newsworthy, scheduling special briefings for the press, holding “mock” testing sessions and the like.  Not any more.  The media is buzzing with stories about cheating in schools. What are the questions being asked and how can they be answered?

In the past year, I have been interviewed weekly about preventing and/or detecting cheating on high-stakes tests.  By “high stakes,” I mean any standardized tests used to make important decisions about students, teachers, schools, and programs.  There are three questions I am almost always asked, regardless of the specific purpose of the reporter’s intended story:

  • Is there more cheating by educators now than in the past?
  • How can you detect whether cheating has occurred?
  • What can states and schools do to eliminate or minimize cheating?

There are other questions, but these three are asked very consistently and I will answer each one.

Is There More Cheating Now?

Yes, there is more cheating by educators now than in the past.  There has been a trend toward greater levels of cheating that goes back decades, and I say this based on experience, not just “book learning.”  I am in my 50th year as a testing professional and I have seen this trend unfold in many areas of testing, not just in education.

How Can You Detect Cheating?

Although cheating by educators is a very worrisome problem, I take comfort from two aspects of the situation.  First of all, I estimate the proportion of educators involved in high-stakes state testing who engage in cheating to be between one and two percent.  You can find higher estimates.  In the important and very widely read book “Freakonomics,” a testing misbehavior estimate of five percent is provided, but I think that overstates the actual amount.

Another source of my positive feeling is that I know there is a good-sized set of tools that can identify how much cheating is occurring, where it is taking place, and how serious it is.  Caveon uses seven different indicators when we run our detection approach, “Caveon Data Forensics™.”  Here are three very critical ones:

  • Very unusual gains from one year to another. If the results from a class or school seem too good to be true, don’t accept them without a great deal of careful scrutiny.
  • Very high levels of similarity in the specific test results at the individual test question level between a pair or a group of students. How could it be that two classmates taking a 40-item state test would not only get the same 20 questions correct and 20 wrong, but would also choose the identical wrong answer in every instance?  That simply does not happen when students work independently.
  • Very high numbers of erasures, especially wrong to right.  Most students make very few erasures on each section of a state assessment.  So when substantially larger numbers of erasures show up and when a huge proportion of the changes are from a wrong to a right answer, alarms need to go off – “trouble here.”
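
To make the similarity and erasure indicators a little more concrete, here is a minimal Python sketch. The function names, the data layout, and the sample answers are all assumptions invented for illustration; this is not the Caveon Data Forensics™ method, only a toy version of the counting behind two of the indicators above.

```python
def shared_wrong_answers(resp_a, resp_b, key):
    """Return (items both students missed, items missed with the same choice)."""
    both_wrong = 0
    same_wrong = 0
    for a, b, correct in zip(resp_a, resp_b, key):
        if a != correct and b != correct:
            both_wrong += 1
            if a == b:
                same_wrong += 1
    return both_wrong, same_wrong

def wrong_to_right_count(erasures, key):
    """Return (total erasures, erasures that changed a wrong answer to the right one).

    erasures is a list of (item_index, old_answer, new_answer) tuples.
    """
    w2r = sum(1 for i, old, new in erasures if old != key[i] and new == key[i])
    return len(erasures), w2r

# Made-up 10-item answer key and three students' responses.
key = list("ABCDABCDAB")
s1 = list("ABCDACDABC")
s2 = list("ABCDACDABC")  # identical to s1, including every wrong choice
s3 = list("ABCDBADCBA")

print(shared_wrong_answers(s1, s2, key))  # (5, 5)
print(shared_wrong_answers(s1, s3, key))  # (5, 2)

# Made-up erasure records for one answer sheet.
erasures = [(5, "C", "B"), (6, "D", "C"), (0, "A", "B")]
print(wrong_to_right_count(erasures, key))  # (3, 2)
```

In this toy data, s1 and s2 miss the same five items and pick the same wrong choice every time, while s1 and s3 also share five misses but agree on only two of the wrong choices. Deciding how much agreement is too much requires comparing such counts with what independent work would produce, which this sketch does not attempt.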

Caveon advocates using multiple indicators because a single number cannot tell the entire story by itself.

What Can States and Schools Do?

Even though the level of educator cheating is low, any cheating at all by teachers and others is profoundly worrisome to parents, school board members, community leaders and others.  So what can be done to reduce cheating?

  1. Communicate Zero Tolerance – It needs to be crystal clear to everyone involved in testing that the school or district or state will not tolerate cheating. It is not enough merely to say “we do not tolerate cheating” at a staff meeting or to include a “No Cheating Allowed” commitment in training materials for testing.  It is essential to deliver this message at every stage of the testing process and to get clear evidence of each person’s commitment to follow all testing rules. It is also essential that those individuals who have the primary responsibility in each school know that they have the unqualified support of school, district, and state management to train staff and to monitor all phases of testing, so that the fairness and validity of results can be counted on.
  2. Analyze Test Results after Every Administration – All involved in testing should be made aware that thorough analyses of test results are going to be carried out and that there will be significant consequences if rules are not followed.
  3. Act on Problem Results – When evidence surfaces of failures to adhere to the rules of testing, it is essential to take action on these findings.  Only in this way will the seriousness of managers be made clear.

You now have my take on three of the public’s highest-concern questions about cheating.  If anyone wants to talk about their own experiences or to learn more about mine, please let me know. I thoroughly enjoy exchanging ideas with others who are committed to fair and valid testing.


