Archive for 2008
Monday, April 7th, 2008
Hindsight: Perfect understanding of an event after it has happened; a term usually used with sarcasm in response to criticism of one’s decision, implying that the critic is unfairly judging the wisdom of the decision in light of information that was not available when the decision was made.
After every airplane crash or incident, the FAA conducts an exhaustive investigation to determine the cause. The purpose of the investigation is “to identify safety deficiencies and unsafe conditions which are then referred to the responsible FAA office for evaluation and corrective action.” The remarkable air safety record in this country is primarily the result of these extensive analyses. Setting sarcasm aside, the FAA has learned that hindsight is 20-20: a perfect understanding of the event is often attainable, and from that understanding air safety has improved.
I believe that all testing programs can learn from this example. If each program conducts a “security breach post mortem,” its security processes can be improved. A good practice in security is learning from your own mistakes. A better practice is learning from the mistakes of others. A best practice is creating processes so that those mistakes are never repeated.
As an example of what might be possible with a security breach post mortem, consider two recent news stories. In the UK, many immigrants are reportedly being coached to pass the spoken language and listening portions of the citizenship tests even though they cannot speak English. The BBC went undercover and filmed “an appraisal,” which the undercover reporter understood to be the process for passing the language test; the reporter didn’t even need to speak or listen in English. The video is fascinating. In other news, the results of Boston’s promotion exams for firefighters are being discarded and all candidates will be required to retest, following a security breach in November 2007 in which cell phones were used to cheat. The retesting is required because the investigation was inconclusive and the cheaters were never identified.
Both of the above breaches would likely have been prevented if proper security safeguards had been in place. The purpose of the post mortem is to learn the security strengths and weaknesses of the testing program so that security may be improved and strengthened. In my experience, we generally do not obtain all the information we could from a security breach investigation. In Boston, for example, the investigation was conducted to determine who cheated. While some improvements to security should happen as a result of that investigation, I believe a serious post mortem would reveal even more information that could prevent similar breaches in the future. The post mortem allows us to learn from our mistakes.
In an earlier essay, I suggested that testing programs should, “Read stories of cheating in the news to learn how the media might portray your cheating incident negatively.” This is one form of learning from the mistakes of others. In addition to studying security breaches in the media, several other methods exist for learning best security practices and processes from others. Some of these are (1) attending presentations where security breaches are discussed, (2) talking directly with program personnel who have been involved in security breaches, and (3) working with experts who study and analyze security breaches and best security practices. At Caveon, we are doing our best to expand our expertise so that we may effectively assist all testing programs in their efforts to strengthen their test security.
If you have never conducted a security breach post mortem, you are probably wondering how you might start.
The first step is to determine the extent and nature of the security breach. When the breach involves cheating during the test or tampering with the test results, a data forensics analysis is invaluable in making this assessment. When the breach involves the distribution and sale of protected test content, an Internet investigation or Caveon Web Patrol can determine the scope and size of the breach. When the breach involves a breakdown of security procedures and processes, a post-mortem security audit will be needed. Some security breaches may require all three information-gathering activities.
The second step is to perform a cause-and-effect flow analysis or a fault tree analysis. This analysis establishes where the test security vulnerabilities exist and how those vulnerabilities were exploited.
The third step is to identify necessary changes to the testing program’s security processes. These changes should first be considered as suggestions or recommendations. They should be prioritized. They should be assessed for effectiveness using security threat models. They should be evaluated against the required resource allocations so that their practicality can be measured in terms of the program’s budget and expertise.
Finally, proposed recommendations are presented to the executive management team with an implementation roadmap. The executive report should clearly state that the purpose of the post mortem is to improve and strengthen test security. A post mortem analysis is not conducted with the purpose of apprehending cheaters and imposing discipline upon test frauds. These actions may result from the investigations. But, the post mortem provides the tactical and strategic initiatives to prevent test fraud in the future.
Caveon is willing and able to assist you in these efforts. We wish you the best as you consider how to learn from your own mistakes and the mistakes of others.
Wise men profit more from fools than fools from wise men; for the wise men shun the mistakes of fools, but fools do not imitate the successes of the wise. – Cato the Elder
Hindsight is indeed 20-20, and it is not to be scoffed at when we use it to improve.
Monday, March 31st, 2008
Georgia bit her lip nervously as she peered into the rear-view mirror of her car. She had already been idling 10 minutes longer than allowed and campus security would be returning shortly. Then, she saw them, exiting the library. Ignacio was detained by a man in uniform. Vincenzo broke into a run, sprinted to the car, and hopped in. “Step on it,” he said. Georgia sped away. “What about Ignacio?” she asked. “Don’t worry. I have it right here,” he replied as he slipped a digital camera from beneath his jacket, extracted a memory card and handed it to Georgia. She grinned. Now, she would be able to pass the test and become an intern at Waldo & Cramer Industries. Once she was inside W & C, her computer skills would soon make her current employers very, very happy.
The above fictionalized account is based upon an incident which Caveon was asked to investigate in 2004. Our client wrote,
“We had an incident over the weekend concerning the XYZ exam …. The examiner contacted our office during the 3rd section of the examination. Two examinees were acting suspiciously throughout the exam. They had questions about how long the breaks were and what would happen if they returned late from the break. During the break, the proctor noticed that one of the test booklets was not on the applicant’s desk.
The proctors noticed that the two examinees went to their car and came back late from the break. When addressed about the booklet, they said they did not have the booklet and then dropped it from their jacket and said, ‘there it is’. They were allowed to continue, although the proctor told them their scores would be invalidated. They were addressed by the proctor and campus police after the exam and questioned. One of the examinees was released as he stated he had nothing to do with the incident. The other fled the scene in a car that was waiting for him, as he was being escorted to check his car to see if there were images on his cell phone of the test booklet. The names of the suspects are Inigo and Vinny.” (Actual names have been changed.)
Results of Investigation
Caveon conducted an investigation into this incident and we discovered that the two individuals, Inigo and Vinny, were enrolled at a nearby university but they were not enrolled in courses of study or college majors that would be consistent with taking the admissions test connected with this incident. Furthermore, we determined that one of these students had lost his passport during the summer and the other had his driver’s license stolen. The information was corroborated and led us to infer that both of these students were victims of identity theft. Some other individuals committed test fraud in their names.
We also discovered that the test thieves were given the opportunity to steal the test because the test site administrator had not collected testing materials during breaks or the lunch period, as per test administration policy and procedures. One of these individuals, “Inigo,” had taken and failed the test approximately six weeks earlier. We presume that this individual determined that an opportunity existed to sneak the test booklet out of the testing site at that time.
In our report, we concluded that the imposters (or identity thieves) took the exam with the intent of exposing the exam content for one or more of the following purposes: for themselves, on behalf of another individual(s), for mass distribution, or for financial gain. We also suggested that, with suitable revision to the test administration policies and procedures, the likelihood of a security breach could be reduced.
Another phase of the analysis was to statistically analyze the test responses. It is difficult to infer “intent to steal” from data analysis, but the data are revealing. One of the statistics that we use in Caveon Data Forensics™ is known as the bimodality statistic. With this statistic, we assume that most individuals answer the test questions consistently according to the observed performance (or a single level of ability). However, we allow the possibility that some individuals answer the test questions according to two levels of ability (or in two different modes, hence the name bimodality). Using this statistic, we found that Vinny’s test was somewhat aberrant (at the probability level of one in 2,000) and that Inigo’s test was extremely aberrant (at the probability level of one in 200 million). These data, along with comparative “normal” data at the same ability levels, are shown in Figures 1 and 2.
Figure 1: Comparison of Vinny’s test with a normal test
Figure 2: Comparison of Inigo’s test with a normal test
The data confirm that both of these individuals took the exam at two levels of ability. The probability of the high level is shown using the yellow line. The probability of the selected response using the low and high levels is shown using the blue and pink lines, respectively. We infer that Inigo demonstrated more information and knowledge about the test content than Vinny, but both of them appeared to be answering the test questions for a purpose other than obtaining a score and an actual measure of their knowledge of this content area. It appears likely that these individuals were connected with the content area being tested.
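The two-mode idea can be sketched in a few lines. The code below is my own simplified illustration, not Caveon’s actual statistic: it fits a response pattern once with a single Rasch ability level and once with two levels, and reports how much better the two-level model fits. The item difficulties, ability grid, and response patterns are all invented for the example.

```python
import math
from itertools import product

def rasch_p(theta, b):
    """Probability of a correct answer at ability theta for item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def loglik_one_mode(responses, difficulties, theta):
    """Log-likelihood of the responses under a single ability level."""
    total = 0.0
    for x, b in zip(responses, difficulties):
        p = rasch_p(theta, b)
        total += math.log(p if x else 1.0 - p)
    return total

def loglik_two_modes(responses, difficulties, t_lo, t_hi):
    """Each response is credited to whichever ability level explains it better."""
    total = 0.0
    for x, b in zip(responses, difficulties):
        likes = [(p if x else 1.0 - p)
                 for p in (rasch_p(t_lo, b), rasch_p(t_hi, b))]
        total += math.log(max(likes))
    return total

GRID = [i / 10.0 for i in range(-40, 41)]  # candidate abilities, -4.0 to +4.0

def bimodality_index(responses, difficulties):
    """Gain in fit from allowing two ability levels; large values are aberrant."""
    best_one = max(loglik_one_mode(responses, difficulties, t) for t in GRID)
    best_two = max(loglik_two_modes(responses, difficulties, lo, hi)
                   for lo, hi in product(GRID, GRID) if lo < hi)
    return best_two - best_one

# Four easy items (b = -2) and four hard items (b = +2).
difficulties = [-2, -2, -2, -2, 2, 2, 2, 2]
aberrant = [0, 0, 0, 0, 1, 1, 1, 1]    # misses easy items, answers hard ones
consistent = [1, 1, 1, 1, 0, 0, 0, 0]  # the pattern a single ability predicts

print(bimodality_index(aberrant, difficulties))    # large gap: two modes
print(bimodality_index(consistent, difficulties))  # small gap: one mode suffices
```

Turning such an index into the one-in-2,000 or one-in-200-million probabilities quoted above requires a reference distribution of the index for honest test takers, which is where the real statistical work lies.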
This incident is extremely instructive. It illustrates that not all test takers are as they appear and that an unfair advantage may be gained in many ways. I had always wondered whether there would be a motive to steal an identity for the purpose of taking a test and now I know.
Monday, March 24th, 2008
Last time I discussed how RFID (Radio Frequency Identification) chips are finding their way into schools. I promised I would write about RFID applications in testing next. There are at least three areas where RFID technology could help testing program administrators maintain fair and accurate programs: (1) tracking and counting test booklets or answer sheets, (2) verifying that the correct information regarding the test taker and the test form has been recorded on the answer sheet, and (3) maintaining information about test results and test taking status.
Tracking or accounting for test booklets or answer sheets
RFID technology is widely used in materials tracking and handling control systems because individual items may be counted and inventoried quickly and accurately. The obvious need in large-scale testing is for the accurate tracking of thousands of test booklets and answer sheets. You can’t put a chip on a test booklet or answer sheet, can you? Recent innovations in RFID technology say that you can.
To illustrate the need, consider the following actual occurrences:
1. In 2005, approximately 27,000 TAKS test booklets were lost, representing 0.22% of all the test booklets. This would not be serious except that at least some test questions are usually reused in a later year; if these booklets were copied and used as study materials, some students would gain an unfair advantage in a later year.
2. In 2003, 232 test booklets (3.3% of 7,000) were lost in New Mexico.
3. In 2005, a significant number of answer sheets were misplaced in Nevada. After a frantic scramble, the answer sheets were found and the affected seniors were awarded the scores they needed for graduation.
4. There appear to be many other situations where exam booklets or answer sheets are misplaced or lost. For example, Edexcel in the UK lost exam papers, the UK “Sats” papers were stolen in 2003 and offered for sale on the Internet, and in Jamaica this year the sixth-grade tests were leaked.
As an illustration of the requirement to properly track the test booklets, consider the situation with Colorado’s Student Assessment Program (CSAP). As it was reported in the news, we read: “In preparation for the testing, administrators have spent hours counting and recounting exams, outlining strict rules for administering the exams to prevent anyone from getting an early peek, and aligning themselves with the proper procedures.”
Do we really want educators counting test booklets instead of teaching? RFID technology has the potential to handle this problem. If every test booklet is identified with an RFID tag and if every answer sheet has an RFID printed label affixed, an RFID reader could process an entire stack of test booklets and answer sheets in just a few minutes and determine if all the materials are present and which ones, if any, are missing.
I haven’t actually seen RFID chips used for test booklets, but I just renewed my US passport and it has an RFID tag embedded in it. I don’t know where the tag is, and I don’t think that I need to know. The key point is that most test booklets are similar to the passport: there is a cover and several pages that are stapled in the middle (at the binding). Affixing RFID labels onto answer sheets is potentially more difficult because the answer sheet needs to be processed through a scanner. However, RFID printing technology exists for affixing labels to documents. A Google search for “RFID printing” brings up many links to vendors of solutions for creating printable labels.
If RFID tags are embedded in the test booklets, then the entrances to storage rooms can be fitted with RFID readers and any unauthorized removal of a test booklet can be detected. Similarly, if secure test materials are tagged with RFID devices when they are reviewed by standard-setting teams, we can be assured that none of the materials will leave the secured area in an unauthorized manner.
These are standard tracking and inventory control processes where RFID has demonstrated its value in other industries.
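To make the inventory step concrete, reconciling a scanned stack against a shipping manifest is essentially a set comparison. The tag IDs below are hypothetical, invented for this sketch; a real deployment would read EPC codes through the reader vendor’s API rather than simulate them.

```python
# Hypothetical tag IDs; a real system would read EPC codes from the RFID reader.
expected = {f"BOOKLET-{i:05d}" for i in range(1, 501)}   # 500 booklets on the manifest
scanned = expected - {"BOOKLET-00042", "BOOKLET-00317"}  # simulate two tags not seen

missing = sorted(expected - scanned)     # booklets that must be located
unexpected = sorted(scanned - expected)  # tags that do not belong at this site

print("missing:", missing)        # ['BOOKLET-00042', 'BOOKLET-00317']
print("unexpected:", unexpected)  # []
```

Note that the “unexpected” set matters too: a tag from another site’s shipment turning up in the stack is itself a security signal.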
Verification of information
The other two main areas where RFID technologies might be applied in testing are in verifying that correct information is recorded concerning the test and in maintaining information about test taking status. I see a lot of testing data from public schools and other industries, and nearly all of it contains errors. Often, test taker identifiers are recorded incorrectly. These errors require a lot of time to find and correct, and if they are not corrected, individual students will be affected. RFID has the potential to help in this area if we give students RFID badges. Even though we currently have systems for processing these data, I see enough of these errors in testing data to know that current technologies are not solving the problem.
Maintaining test results and status
If the students were issued a “smart card” (i.e., an RFID card that can be read and written), we could record on the smart card the student’s transcript and test taking status. Such a card could be beneficial in recording attendance during testing and in recording the test result. Although it’s not obvious that smart cards improve current testing practices in schools, I could see how the smart card might be beneficial in other scenarios, such as military testing. Smart cards are useful when you need to maintain information in a distributed, rather than a centralized, database.
Only time will tell whether RFID applications will bring improvements to testing, but regardless of the potential applications security concerns persist. If the chips are used to record testing results or identify secure materials, they need to be secured against unauthorized tampering. If the chips are used to access secure test materials, they need to be secured against unauthorized duplication. If the chips are used to identify test takers (i.e., by containing biometric data), they need to be secured against unauthorized retrieval. If the chips are used to confirm tests are taken properly (i.e., the test taker’s identifying information is transferred to the test result), they need to be secured against inadvertent data loss.
Monday, March 17th, 2008
The ACLU is opposing a pilot project in Rhode Island to track students as they enter and exit school buses. Steven Brown, executive director of the Rhode Island chapter of the American Civil Liberties Union, called the plan “a solution in search of a problem,” saying the school district should already have procedures in place to track where its students are. “There’s absolutely no need to be tagging children,” he said, adding that the program raises enormous privacy and safety concerns.
If my research is accurate, there have been at least four previous projects for tagging children in schools with RFID (Radio Frequency Identification) chips in the United States:
- Enterprise Charter School, Buffalo, New York (2003) – badges on children, tagging library books, school cafeteria purchases, and visits to the school nurse;
- Spring Independent School District, Houston, Texas (2004) – school bus pass program that is still operational;
- Brittan Elementary School, Sutter, California (2005) – badges on children; due to public outcry this project was cancelled, which led the California State Senate to debate banning the use of RFID chips to identify people in the state; and
- Tucson Unified School District, Tucson, Arizona (2007?) – school bus pass program that may still be under discussion.
In the UK, two clothing manufacturers are sewing RFID tags into school uniforms for the express purpose of tracking students while they are in school. RFID tags sewn into clothing are not new, and neither are RFID badges in the workplace. Now RFID badges are being used in universities as well. The University of Chicago is revamping all of its student ID cards, primarily to ensure secure building access. At Slippery Rock University, north of Pittsburgh, Pennsylvania, an RFID label affixed to a student’s cell phone allows for payment processing.
The idea of chipping children is controversial. On one hand, privacy advocates warn of possible abuses and intrusions. On the other, security proponents promote increased safety. In between are administrators who want improved efficiency and convenience. No one is seriously considering implanting RFID chips into children yet, but this is happening for patients with Alzheimer’s and dementia, and it is being seriously considered by the British government for prisoners. The University of Washington has started a human experiment in its computer science building to assess the possibilities of RFID tracking. Several states have passed legislation prohibiting a person from being forced to accept RFID implants, which are approved by the FDA.
There is no doubt that RFID tags can be abused. With an RFID reader, a bad person can gather information about you surreptitiously. A bad person with a database can profile you and even create an inventory of your belongings. But this potential exists today, even without ubiquitous RFID tags and readers. Bad people with cameras can gather information about you surreptitiously and create an inventory of your belongings. They do it to children, families, and the elderly. The concern about RFID is that this can be done more efficiently. If you use membership cards and discount shopping cards, your purchases may already be tied to you in some database, somewhere, that at some time in the future may be hacked by someone. To a certain degree, the anti-RFID movement promotes the fear that at some time your information will be stolen.
We should be aware that the school district personnel who are investigating this technology are trying to solve real problems. It’s important to keep track of library books. And, it’s even more important to know that students are entering and exiting the school buses at the proper times and locations. We live in a changing world and what we dismiss as paranoia today may become essential tomorrow. For example, there were no lockers in my elementary school. The first time I saw a locker was in junior high. While in high school, we moved to a small town and the lockers did not lock in that school (unless you brought your own lock from home). Given our changing world, I would be extremely surprised if this situation still exists in my alma mater.
I’m somewhat surprised that the ACLU has not opposed cell phones in schools. Who would have ever imagined that in the name of privacy and safety we would allow everyone to carry a camera to school and take a picture of anything there (e.g., students spitting in a teacher’s water bottle or a teacher filming the girls’ bathroom)? But that is precisely what has happened with cell phones. We can’t pry cell phones away from students. There is also great potential to abuse cell phones, as demonstrated by the FBI’s ability to remotely activate a cell phone’s microphone and use it to eavesdrop on nearby conversations. If cell phones are ever fitted with RFID tags, this entire debate could be over.
Mr. Brown from the ACLU is right: there are safety and security concerns with RFID devices. However, he doesn’t seem to understand those concerns. RFID chips can be hacked, and the information on those chips can be transferred to other chips. For example, if an RFID card allows access to a secured area, a bad actor may pilfer the electronic codes and in essence make a copy of the electronic key, as demonstrated by James Van Bokkelen. If the chips do not have proper electronic safeguards, the information may be overwritten or used illegitimately.
While I have not directly addressed testing, there are implications for using RFID chips in testing, which I will discuss the next time I write. But today, I just couldn’t resist this topic. In my opinion, we need to ignore the fear mongering and use this technology wisely. RFID technology is not a panacea, but it can solve real problems.
Monday, March 10th, 2008
The ATP (Association of Test Publishers) conference this year did everything a good conference should do. We networked. We shared industry information. We discussed best practices. We met with clients and vendors. And we created, renewed, and strengthened friendships. Rather than discuss those things, let me share a few observations relating to test security.
Exam security was a hot topic, with many sessions and many serious conversations around test security. Wayne Camara of the College Board asked me, “Was the emphasis on security due to Caveon?” I replied, “I think it is partly due to our outreach effort, and more programs are dealing with security issues.” I think there are deeper reasons.
There were more stories describing successful security efforts this year than I remember in the past. Just to name a few: the FSBPT discussed their breach and resolution in the Philippines, the GMAC caught a proxy test taker in the very act, EMC presented successful risk management cases, and the Mississippi Department of Education has effectively addressed cheating in schools. We celebrate these successes, because they give us confidence that these problems can be solved.
There is deep concern about test and exam piracy. In the past, this concern was primarily expressed by IT (Information Technology) companies. This year many other organizations had the same concern. I heard several instances of exams being stolen from within computer-based testing centers. I have no reason to doubt these reports.
Theft vulnerabilities had been voiced privately in the past, but the discussions were more open this year. I attribute this to at least three reasons: (1) there were new attendees who wanted to expressly discuss security and stayed for the Test Security Summit, (2) the Boston Globe article “Job Exam Piracy Rising,” dated December 26, 2007, gave the topic national prominence, and (3) some presenters disclosed that their entire item banks, including answer keys and digital representations, had been stolen. In the session, “Cheater, Cheater, Pumpkin Eater,” EMC Corporation reported great success in detecting and shutting down test sites where exams are being stolen. Test pirates refused to resell test content because their test sites were shut down immediately after they stole the tests.
To the best of my recollection, there were more lawyers present at ATP this year than any other year. Representatives from at least four different firms had been invited to attend by conference organizers or conference presenters. I have paraphrased some of their very instructive comments below:
“Gather all your evidence in preparation to litigate, but only litigate as a last resort.”
“You can use statistics to invalidate scores and to take other security actions if you can demonstrate that your actions and decisions are made in good faith. The courts are interpreting these actions using contract law and it’s important that your agreements and contracts support your intended actions.”
“All test items are copyrighted, but you must register the copyrights before the items are stolen. Registered copyrights provide stronger protection than unregistered copyrights. There is a special provision in copyright law to protect secure tests for this purpose.”
GMAC and Pearson VUE described initiatives for preventing and detecting imposters. GMAC verifies a candidate’s current photo against the candidate’s registration photo and attaches the photo to the score report. (I call this “testing event authentication.”) Pearson VUE demonstrated Fujitsu’s PalmSecure biometric authentication technology. The readers are priced at around $700, which is within reach for secure testing applications.
Gene Radwin and Liz Burns of EMC Corporation captured our imagination. Gene shared his success in detecting users of braindump content using Trojan items. Liz Burns described her security efforts. She visualizes a triangle. At the base of the triangle are honest people who will not lie and will not cheat. At the top of the triangle are those who will cheat if at all possible. In the middle of the triangle are individuals who may cheat depending upon the circumstances. The “at risk group” is where Liz concentrates her efforts.
The Education Division meeting had an interesting discussion concerning the image of testing in education. I think that a positive image of testing is critical. As an example of how an incorrect image of testing can be damaging, consider the report that South Africa has effectively banned unproctored Internet testing because these tests are thought to be unfair and insecure (reported by Hennie Kriek, President of SHL, USA).
Finally, if you believe that test publishers are cold and dispassionate, let me disabuse you of that image. I saw a lot of passion and emotion at this conference. Testing professionals are very concerned that tests are administered securely. As an example, Cindy Simmons, State Assessment Director of Mississippi, showed great forthrightness and passion as she described her state’s initiatives to address cheating on the Subject Area Tests.
It’s true there is much work to do. But members of ATP are committed to fairness and integrity in testing. They comprise “the intelligent voice of testing.”
Thursday, March 6th, 2008
The Association of Test Publishers (ATP) Conference of 2008 ended yesterday. As always, it was a good conference. In 2004 we stated, “You can’t manage what you don’t measure.” Being a sponsor of the conference, we placed a bag of M&M’s (i.e., manage and measure) in each attendee’s conference packet. And, we printed the message on the hotel room key cards.
I have just completed analyses for three testing programs and I am so impressed with what they have done that I want to share their results with you. Good news concerning exam security is refreshing in the midst of so many cheating stories. We recognize dramatic acts of heroism, but often ignore the good that happens with steady, persistent progress. I am so proud of these three programs. They are achieving their common goals: “Reduce cheating, strengthen exam security and emphasize ethical test taking.” The data demonstrate this convincingly. Caveon’s message at ATP this year was, “The answer is in the data.” So let’s look at the data.
Figure 1: Percent of anomalous tests for three programs
Let me describe the data in Figure 1. The percent of anomalous tests for successive analyses are plotted for each program. A trend line has been fit to the data to aid your eye in visualizing the trend pattern. An anomalous test is one that deviates from normal test taking, and will exhibit at least one of the following: aberrance (answering hard questions correctly and missing easy questions), large numbers of erasures, inexplicable score changes from a previous test score, or excessive similarity in the selected answers with at least one other test. An anomalous test does not mean the test taker cheated. For example, when we observe excessively similar tests it is very likely that one person cheated (the copier) and the other person did not (the source). The percent of anomalous tests does not measure the precise number of people who have cheated, but it is highly correlated with that number.
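To give a feel for how an answer-similarity flag works, the toy check below is my own illustration, not the index we actually use: it counts identical wrong answers between two answer sheets and asks how surprising that count would be if wrong answers merely matched at chance. Production indices condition on the empirical popularity of each wrong option, which this sketch ignores; all the names and answer strings are invented for the example.

```python
import math

def upper_tail(matches, trials, p):
    """Binomial probability of observing at least `matches` successes in `trials`."""
    return sum(math.comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(matches, trials + 1))

def similarity_flag(sheet_a, sheet_b, key, n_options=4, alpha=1e-6):
    """Flag a pair of answer sheets whose shared wrong answers defy chance."""
    # Matches on items both examinees missed are the suspicious ones.
    both_wrong = [(a, b) for a, b, k in zip(sheet_a, sheet_b, key)
                  if a != k and b != k]
    same_wrong = sum(1 for a, b in both_wrong if a == b)
    # Crude baseline: two wrong answers agree by chance with prob 1/(n_options - 1).
    p_value = upper_tail(same_wrong, len(both_wrong), 1.0 / (n_options - 1))
    return p_value < alpha, p_value

key = "A" * 20        # 20 items; the correct answer is always A (toy data)
source = "B" * 20     # all wrong
copier = "B" * 20     # all wrong, and identical to the source's errors
stranger = "BC" * 10  # all wrong, but only half the errors match

print(similarity_flag(copier, source, key))    # flagged
print(similarity_flag(copier, stranger, key))  # not flagged
```

As noted above, a flagged pair does not tell you which sheet belongs to the copier and which to the source; that determination requires seating charts and other corroborating evidence.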
These data are important because they demonstrate that all high-stakes testing programs, irrespective of industry or application, can effectively reduce cheating. They illustrate that reductions in cheating can occur with persistence and dedication. Let me briefly describe each program and some of the positive steps they have taken.
Program 1: This program provides a professional certification with high security. We estimate that there was a 45% reduction in cheating in three years. They have followed up on every case that appeared to be a security violation and every test site that appeared to have lax security. They have emphasized proctor training. They are now reviewing their test taker agreements, proctor training, identification procedures, and physical security with the intent of using the best known security protocols.
Program 2: This program is a public education program. We estimate that there was a 72% reduction in cheating in two years. They have rewritten their test administration manuals and have begun test administration monitoring. They assign a conditional status to extremely anomalous test results and require local review of those test results. They are receiving reports that the students being flagged are admitting to having cheated.
Program 3: This program administers tests in the service industry. We estimate that there was a 78% reduction in cheating in one year. They have stressed ethical test taking. They have revised their test taking agreements and strengthened test administration policies to allow for scores to be invalidated with an appeals process. They have refreshed test forms which appeared to be exposed. They are researching the next phase of security improvements: test site monitoring and appropriate disciplinary measures for test administration personnel who may be helping test takers inappropriately.
These very different programs were the same in one important way: they started where they were, they created a plan, and they were not discouraged. Each was taken aback by the first data forensics report (we always find something disconcerting), but they pressed forward and executed their plan. Best practices used by these programs include test site monitoring, an emphasis on ethical test taking, invalidating scores as per policy, refreshing tests which appear to be overexposed, and updating security procedures.
Let’s give credit where credit is due. The numbers are impressive and the data do not lie. These programs have earned our respect and admiration.
Sunday, March 2nd, 2008
Today is the first day of the annual ATP Conference (Association of Test Publishers). This afternoon I will present a workshop titled, “Strategies and Tactics for Limiting Item Exposure.” We will be exploring innovative ideas for protecting tests and items from theft. It’s easy to understand why test publishers are concerned about test theft. High-quality items are expensive to produce and represent a substantial investment. Item development costs of $1,000 or higher per item are not unusual. In an afternoon, a thief can compromise an investment of $250,000 or more, easily. Most testing professionals will state that item theft is their number one security concern. I discussed this previously in: What is your top security concern?
I can’t share the entire workshop content with you in this short essay. But I can share the intriguing idea of answer-key arbitrage and Trojan items, developed by Gene Radwin of EMC Corporation. The idea was briefly mentioned in: Student outwits FCAT with secret pattern. Just as the Trojan horse was the Greeks’ surprise weapon for outwitting the people of Troy, we hope to outsmart users of brain-dump content with Trojan items.
The basic idea of the Trojan item, as developed and presented to me by Gene Radwin (email: radwin_gene at emc.com), is to place very easy items on the test which are miskeyed. If a test taker gives the miskeyed answers (and not the correct, easy answers) we have strong evidence that brain-dump content is being used. The fundamental principle is to create a test-within-a-test to detect test fraud. We booby-trap selected items by changing them so that a different answer choice is now correct and the compromised answer is incorrect. Without knowing which items are booby-trapped, the brain-dump user proceeds in ignorance, until detected. To illustrate, consider a math item that I “borrowed” from the SAT practice test.
Table 1: Example of a Trojan item
We do not expect the brain-dump user who has memorized the “Exposed” item to notice the small change in the “Trojan” item. As a result, the cheater will give the originally correct, but now incorrect, answer “C,” and at the same time the honest test taker will give the correct answer “E.” The change in the answer key gives us a leverage or arbitrage point, creating a powerful difference in the statistical expectations.
In order to be effective, several Trojan items will be required on the exam. I haven’t done a rigorous analysis of the statistical power of the procedure, but my current intuition suggests that ten to twelve questions will be needed.
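Here is a rough way to check that intuition. Suppose (this is my own back-of-the-envelope model, not a formal power analysis) that an honest test taker lands on a Trojan item's retired, now-incorrect answer with some small probability p by chance alone. The chance of an honest taker matching many Trojan keys is then a binomial tail probability, which shrinks rapidly as items are added:

```python
from math import comb

def false_flag_prob(n_items, k_matches, p=0.1):
    """P(an honest test taker matches >= k_matches of n_items
    Trojan keys by chance), assuming each match occurs
    independently with probability p (p = 0.1 is an assumption)."""
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(k_matches, n_items + 1)
    )

# With 10 Trojan items and a rule of "flag at 8 or more matches",
# an honest examinee is flagged roughly 4 times in 10 million.
for n in (4, 6, 10, 12):
    k = max(1, int(0.8 * n))
    print(n, k, false_flag_prob(n, k))
```

Under these assumptions, ten to twelve Trojan items push the false-flag rate well below one in a million while a brain-dump user, who reproduces the memorized key on nearly every item, is flagged almost surely.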
We recently analyzed data where one individual was suspected of having prior access to the test content. Six miskeyed items were present on the exam, and we found that the suspect’s answers agreed with the wrong answer key on all six. Using item response models, we analyzed the “score” for the miskeyed items. (We do not use standard regression techniques because the data are not normally distributed, being highly constrained and skewed.) These data are shown in Figure 1.
Figure 1: Analysis of 6 miskeyed items
We see two extreme data points in Figure 1, corresponding to the suspected exam and another exam (they had probabilities of one in 5,000 and one in 1,000, respectively). The expected score on the miskeyed items was approximately two. We note that there is no correlation between the raw score on the test and the score on the miskeyed items.
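To build intuition for how extreme an all-six agreement is, here is a simplified simulation of my own (not the item response analysis described above): assume each honest answer happens to agree with the wrong key with probability 1/3, chosen because it reproduces the observed expected score of about two out of six.

```python
import random

random.seed(42)

N_ITEMS = 6      # miskeyed items on the exam
P_AGREE = 1 / 3  # assumed chance an honest answer matches the wrong key
                 # (chosen so the expected score is about 2, as observed)

def simulate_scores(n_takers):
    """Scores on the miskeyed items for simulated honest test takers."""
    return [
        sum(random.random() < P_AGREE for _ in range(N_ITEMS))
        for _ in range(n_takers)
    ]

scores = simulate_scores(100_000)
print("mean score:", sum(scores) / len(scores))  # ~2.0
print("fraction scoring all 6:", scores.count(6) / len(scores))
```

Under this crude binomial model an honest taker agrees on all six items only about once in 700 tests ((1/3)^6 ≈ 0.0014); the item response analysis, which uses more information per item, put the suspect exam at one in 5,000.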
In the above example, analysis of miskeyed items detected a potential testing irregularity. When Trojan items are specifically designed as described above, we expect to see a strong negative relationship between the Trojan items and the total score. In other words, high-scoring individuals will give the now-correct answer rather than the original, compromised answer. This negative relationship improves our ability to detect users of brain-dump content.
In addition to my own analyses, one of our clients has told me of great success in using these techniques. For obvious reasons, the client does not want brain-dump users to know which tests are treated with Trojan items and how their cheating is being detected. When cheaters realize they are being punished for using brain-dump content, they will quit using the content. Then we will be satisfied. We just want test takers to do their own work and demonstrate their own ability when they take tests.
Thursday, February 28th, 2008
They say that cheaters only hurt themselves. In all honesty, I think that a cheater said that and we believed him. It is often the case that cheaters hurt the people who gave them the test more than themselves. If you are responsible for giving tests, some fool will eventually cheat on your test. How you handle cheating incidents can make or break you.
When you started out in your career and you began giving tests, you probably didn’t imagine that the most demanding aspect of your job might be what to do about cheating. The first time you encounter this and when the media spotlight is focused on you, you will probably wish you were a rattlesnake handler or a bomb disposal expert. You create and give tests. And, you’re good at your job. You never intended to become a test cop. Let me suggest that you anticipate and prepare for cheating incidents now, before they happen. We call it security incident response planning.
Speaking of cops, there have been a number of stories concerning police departments and cheating on tests recently. In the summer of 2007, information about the police promotion test in Boston was leaked to several officers, as reported by WBZTV. In another story, theState.com reported that twenty-one police officers in Columbia, South Carolina were implicated in cheating when they either cheated, helped others to cheat, or knew of the cheating but failed to report it. And, Houston’s crime lab was in the news twice for open-book cheating, which resulted in the shutdown of the lab, as reported by the Houston Chronicle on October 6, 2007, and January 26, 2008.
The above stories illustrate the importance of responding appropriately to cheating incidents and testing irregularities. You will not be judged harshly because a few miscreants decided to cheat on your test. But, you may be embarrassed completely if you do not address the problem adequately. Your program may suffer a loss of credibility. The public confidence in those you certify may be eroded. And, adding insult to injury, the media may portray you as a fool and a blunderer.
Your security incident response plan should be suited to your organization’s needs and requirements. There are a lot of questions that you should answer. Let me list a few:
- What discipline should the cheater(s) receive?
- Is the discipline appropriate? If it’s too harsh you may be perceived as being unfair. If it’s too lenient you may be judged as playing favorites.
- When will you inform the public about the security breach?
- What will you do if the media learns of the breach before you announce it? Or, before you learn of it?
- Is an investigation needed?
- If so, how will the investigation be conducted? Who will conduct the investigation?
- What information will you share with the media?
- What information will you keep confidential? What justification do you have for not sharing everything?
- Who will be responsible for communications and media relations?
- Is your security incident response plan recorded in policy form to guide you?
As you can see, my list focuses on “doing the right thing,” not just on “looking like we are doing the right thing.” Reporters, in particular, are very quick to suspect a cover-up or to suspect they are not being told the truth. And, if you are responsible for a testing program which is accountable to the public (e.g., tests in schools or tests involving public safety), it is vital that you maintain the public trust.
One way that you can sharpen your skills in this area is to “simulate” what you would do in specific cheating situations. During the course of a year, just about every type of cheating will be reported in the news. You can stay current with these stories by reading Caveon’s “Cheating in the News.” You can sign up on the lower right-hand corner of the main Caveon web page to receive CITN notifications by e-mail about twice a month. Read the stories. Discuss the stories with your staff. Does your security incident response plan tell you how to handle the problem if it happened to you?
Just as we expect our local emergency response teams (i.e., police, firefighters, and paramedics) to prepare for disasters, we should prepare to handle cheating incidents. A properly executed security incident response plan can keep an incident from becoming a disaster.
Ten years ago the New York Times criticized ETS, claiming that ETS elected to keep quiet rather than publicize exam security breaches. When you, as a testing program manager, have a full-scale security breach on your hands, what will you do? I can imagine that it was a very difficult decision within ETS whether to “keep the lid on” the story or to let it be told. This appears to be a no-win situation. If you publicize the security breaches, you may seriously undermine your testing program. If you keep quiet and the word leaks out, as it invariably does, your own credibility may be questioned.
Read stories of cheating in the news to learn how the media might portray your cheating incident negatively. Journalists print newspapers and sell advertising, and sensational news is good copy for them. It is especially important, when under the spotlight of the press, that your testing program be viewed as fair, responsible, and ethical. In my experience, reporters will probe for any apparent contradictions, irregularities which could have been avoided, or supposed dismissal of the severity of the situation. If they find anything that might be construed as an irregularity, it will probably be printed. In my opinion, it’s better to tell your own story first, rather than let reporters interpret the situation in a potentially harmful manner.
I wish you the best as you formulate your security incident response plan. If you could use additional guidance in preparing your security incident response plan, a Caveon test security director will be glad to consult with you.
Monday, February 25th, 2008
The State of Florida recently imposed a cell phone ban on students while taking the FCAT. All the parents of school children in the state received a letter explaining the ban. On the other hand, the Legislature in the State of Utah voted down a bill that would require school districts to establish policies governing cell phone use. The sponsoring legislator said, “[Cell phones] can be used to cheat. We’ve had inappropriate photos transmitted. The problem is pervasive.” An opposing legislator was quoted as saying that “he thinks electronic devices could be better used in education and wouldn’t necessarily like to see policies that simply prohibit them.”
In another story last week reported by Wave3 of Louisville, we read: “Teachers at Oldham County High say they’ve had problems with students using their cell phones to cheat in class. ‘I saw a boy texting under his desk during a test. Then I picked it up. Clear as day it said number five –D- and I took it to the office and we were able to trace the number and it was to another student in the same class,’ said Newkirk.” Now contrast that experience with this column from the Muskegon Chronicle, where the writer claims that gadgets don’t help cheaters. The following points were made:
- Yet research indicates that cheating in high school and college isn’t any more common today than it was 30 years ago.
- “And 99 percent of cheating is still done the old-fashioned way, like copying from a neighbor,” said Scott Gomer, media relations director for ACT.
- But in her four years at Northview High School in Plainfield Township, Kelsey Perras has heard of someone pulling out a cell phone to take a picture of a test “only once,” she said.
- “High-tech cheating isn’t really something you see a whole lot of,” said Hudsonville High senior Travis Martin. “Most people won’t pull out their cell phone during a test. It’s tough to make that discreet.”
- Perras said cheaters at Northview are caught more often than not.
Muskegon must be a very sheltered place with extremely astute teachers. The credibility of each of the above statements is easily challenged. I see a lot of data and from what I see, I feel very confident in stating that cheaters are rarely caught. The only way that I can explain some of the cheating I see is through wireless communications. And, research from the Josephson Institute and Center for Academic Integrity convincingly shows that cheating in school is rising and has been rising for the last two decades.
Confusion concerning cell phones in schools is raging throughout the whole country. The issue is being intensely debated in New York City where it has spilled into the court system. Last spring the New York State Supreme Court upheld a ban on cell phones imposed by New York City in 2006. The Supreme Court decision is now being challenged in appellate court.
Surprisingly, security arguments are given by both sides of this debate. Proponents of cell phones argue that parents and administrators need constant contact with students, and that without it student safety is jeopardized. Opponents of cell phones in schools cite privacy violations, with videos posted on the Internet of students in restrooms and of teachers disciplining students. And, of course, they do not overlook the implications for cheating. As reported by WSAZ, the solution at Marshall University has been to let each instructor state in the course syllabus whether cell phones are banned during tests, while not restricting cell phone use elsewhere on campus.
Penn State has addressed the issue by creating secure testing environments, where the computers do not have Internet access and where cell phone transmissions are silenced. The technology they are using includes secure workstations, cameras and monitors on every test taker, and metal-lined testing rooms (known as Faraday cages) that passively block wireless communications. While this may seem extreme, contrast it with the 2004 exam breach in South Korea, where 314 test results were invalidated after police discovered answer keys being transmitted by text message. As another example, consider the January 2, 2008 report by the Boston Globe, where firefighters in Boston sent text messages from the restroom to cheat on their exams.
It is clear that cell phones are used to surreptitiously cheat on tests. People, in general, feel strongly that cheating shouldn’t be tolerated on tests. We don’t want doctors, lawyers, nurses, accountants, firefighters, police or any other person who provides a service to us to be an incompetent, bumbling cheater. On the other hand, the public sentiment appears to be confused when it comes to setting aside the cell phone while an exam is being given. The public seems unwilling to restrict the individual privilege of being able to communicate with a child in school while taking a test in order to prevent cheating.
The principle of fairness and integrity dictates that students should have a level playing field. It is very difficult to convince me that the playing field was level when cheaters in China were caught with radio receivers in their shoes:
Police in Jiutai, in the northeastern province of Jilin, became suspicious when a mini-bus remained parked outside a school hosting the exam on Thursday, Xinhua said.
Inside, they found three people, “two of them staring at a computer screen and talking into a walkie-talkie,” Xinhua said.
A student in the examination hall used a wireless microphone to read out the questions and received the answers from the van, Xinhua quoted their confessions as saying.
Police had found some 42 pairs of so-called “cheating shoes” with transmitting and reception ability, selling for about 2,000 Yuan each, in a flat in Shenyang, the provincial capital, state media said on Thursday, adding that they — along with “cheating wallets” and hats — had proved popular this year.
There is no confusion in my mind on this issue. But, I’m just a statistician and who am I to know differently?
Thursday, February 21st, 2008
Taking tests should be easy and simple. We shouldn’t worry about taking tests, but most of us do. It’s called “test anxiety” or “test phobia.” The medical term for this is “Social Anxiety Disorder” (SAD). I like using the three letters SAD because it is sad when test anxiety keeps us from doing our best. Rose Oliver calls us a “test-addicted society.” In “Overcoming Test Anxiety,” she argues that
- Test anxiety is evoked and maintained by irrational beliefs and irrational demands.
- The perceived threat of harm stems from the anticipated inability to satisfy these irrational demands, and the catastrophizing of the consequences.
- The catastrophic consequence is primarily to one’s feeling of self-worth, which is irrationally equated with the test outcome.
- Irrational beliefs, irrational demands and catastrophic predictions are over-learned responses (habits) which are rehearsed before and during a test.
- Blocking on a test is an avoidance mechanism which is momentarily anxiety-reducing, but serves to maintain both the anxiety and the irrational belief system.
- Since irrational, self-defeating beliefs are learned habits, they can be unlearned.
- New, self-enhancing beliefs and behaviors can be learned.
In summary, we are afraid when we take tests because if we fail, we appear stupid. We are afraid of being ridiculed or laughed at. We are afraid that parents or friends will think less of us. We are afraid because we may not be prepared. When our fears overcome us, we become anxious and we can’t think clearly. Taking the test becomes an intense emotional experience with negative consequences when our fears turn into reality. We fail. We get a bad grade. Our parents and friends laugh at us. The next time we take a test our fear is even greater.
Taking a test shouldn’t be hard. Nearly always, we take tests for positive purposes. It’s important to find out how much we know. If we are really interested in learning, we should test ourselves all the time. But when we stress out, some of us resort to cheating.
Cheating on tests increases our fear, because we might get caught. After looking at a lot of data, I have concluded that students who cheat generally do not do well on exams. Besides the fear of getting caught, there are a few reasons why cheaters struggle. First, cheaters don’t usually prepare well, so they’re not ready to take the test. Second, cheaters often have negative feelings about themselves and that’s why they need to “cheat to succeed.” Third, they often cheat with their friends who are struggling the same way.
We break the cycle of cheating and test phobia by confronting our fears and resolving to be academically honest. When we are prepared we will not fear. As one student told me, “Academic honesty means that I truly want to learn. If I cheat, I cheat myself out of learning.” Test taking should be simple. It should be as simple as learning your ABC’s. I have given you some simple suggestions below to overcome this. I hope that SAD will not overpower you the next time you take a test.
If you enjoyed the above article, I have written other thoughts on ethical test taking that you might be interested in:
A discussion of stealing test questions and its potential consequences: What’s the big deal with sharing a few test questions?
A fable in two parts concerning Santa’s elves and cheating: The Discontent of Santa’s Lazy Elf and Trouble in Section K
Thoughts about what cheating is and what the rules are for taking a test: The rules for taking a test
I have put the above ABC’s into a PDF file that you can download here.