Randomly Parallel Testing:

The Bare Bones

Written by David Foster, Ph.D.

What is randomly parallel testing?

Randomly parallel testing (RPT) is the practice of creating a unique but equivalent test form for each of your test takers on-the-fly while they are taking the test. The test administration system does this by randomly drawing, or sampling, items from a large pool of items you have previously built. The system has access to the pool and to the randomization algorithms used. Once the pool is built, creating the RPT forms for each test taker is child’s play. Across the whole range of test development and test administration procedures, randomly parallel testing is by far the simplest.

You will have noticed that I mentioned a large pre-built pool of items, and you may be wondering where it comes from. I’ll cover it in more detail below, but essentially the pool is built primarily using technology and automated processes; relatively little manual effort is needed. Even very large pools can be created this way. The size of the pool is designed to do two things. First, the pool must have items that cover the major topics as well as the important nooks and crannies of the content domain for your test. Obviously, larger, more complex domains need more items. Second, the number of items in the pool must be sufficient to completely do away with your test security problems of theft and cheating. Pools like the one I just described can be created these days with technology and AI more quickly and less expensively than a couple of traditional test forms built from scratch. More on how this is done below.

Why should I care about randomly parallel testing?

Well, there are a lot of reasons. I’ll list them in the order that I think will interest you the most, but they are all very good reasons.

Test Security: Theft. Stops ALL of the theft of your test content from any source. This means that you will NEVER have to worry about that again. This is because there are so many items in the pool that the theft is irrelevant. Theft of your tests immediately becomes a useless activity benefiting no one.
Test Security: Cheating. Most cheating is prevented also. RPTs nullify the type of cheating that depends on fairly accurate prediction of the exact items on the test forms. Activities like making and using a cheat sheet hidden in a shoe, or even more common, buying stolen test content on websites and cheating using pre-knowledge about the test, no longer work.
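
To put rough numbers on the pool-size argument, here is a minimal sketch in Python. The pool sizes, the 50-item form length, and the 50 stolen items are made-up figures for illustration only, and the function name is hypothetical. It computes how many items on a freshly drawn random form a thief would be expected to have seen before.

```python
def expected_known_items(pool_size: int, form_length: int, stolen: int) -> float:
    """Expected number of items on one random form that the thief has already seen."""
    if not 0 <= stolen <= pool_size:
        raise ValueError("stolen must be between 0 and pool_size")
    if not 0 <= form_length <= pool_size:
        raise ValueError("form_length must be between 0 and pool_size")
    # Each item on a randomly drawn form is in the stolen set with
    # probability stolen / pool_size (the mean of a hypergeometric draw).
    return form_length * stolen / pool_size


# Hypothetical scenario: a 50-item form, and a thief who captured 50 items.
for pool_size in (100, 1_000, 10_000, 100_000):
    known = expected_known_items(pool_size, form_length=50, stolen=50)
    print(f"pool of {pool_size:>7,} items: about {known:.2f} of 50 items seen before")
```

Under these assumed numbers, a thief facing a 10,000-item pool would expect to recognize about a quarter of one item on the next form, which is why stolen content stops being worth anything as the pool grows.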

So those are some big reasons why you should care, because test fraud costs you money, causes you worry, and wastes your time and effort. Did I mention it costs you money? Can you imagine a testing world where theft is pointless, and where cheating is mostly ineffective and can be easily managed? But let me continue. There are more reasons to care about RPTs.

RPTs are built on the foundation of testing an entire domain of skills, say, Residential Plumbing (or any part of it). Because RPT forms are made by randomly sampling items from that domain, the resulting test scores are unbiased estimates of the proportion of ALL items in the pool that a test taker can answer correctly. So, if a prospective plumber answers 82% of the items on their RPT correctly, the best estimate is that they can answer about 82% of ALL the items in the pool correctly. Since modern testing began early in the last century, this has been the most sought-after but elusive goal of any testing program. Unfortunately, a traditional testing approach cannot achieve that goal. Only RPTs can.
RPTs are inherently repeatable by design. You can probably figure out why by now. When test forms are created by random sampling from a large pool, then it is possible to continue going to that item fountain as many times as you wish. Traditional tests get contaminated after a single administration. RPTs keep uncompromised items coming. With repeatable testing it is possible to measure learning growth, check if a test taker is ready to pass the high-stakes test or if they need to prepare more. This is a veritable gold mine of equivalent interchangeable test forms for educators, parents and others.
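
The unbiased-estimate claim above can be checked with a small simulation. This is a sketch under simplifying assumptions, not a psychometric model: the pool size, form length, and number of forms are hypothetical, and the test taker is assumed to either know or not know each pool item, with exactly 82% known.

```python
import random
from statistics import mean


def simulate_form_scores(pool_size, true_proportion, form_length, n_forms, seed=0):
    """Score many randomly drawn forms for one simulated test taker."""
    rng = random.Random(seed)
    n_known = round(pool_size * true_proportion)
    # 1 = the test taker would answer this pool item correctly, 0 = would not.
    pool = [1] * n_known + [0] * (pool_size - n_known)
    scores = []
    for _ in range(n_forms):
        form = rng.sample(pool, form_length)  # random draw without replacement
        scores.append(sum(form) / form_length)
    return scores


scores = simulate_form_scores(
    pool_size=5000, true_proportion=0.82, form_length=60, n_forms=2000
)
print(f"lowest form score:  {min(scores):.3f}")
print(f"highest form score: {max(scores):.3f}")
print(f"average form score: {mean(scores):.3f}")
```

Any single form score wobbles around the true value because of sampling, but the average over many forms should land very close to 0.82. That is what "unbiased" means here.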

Those are the major strategic benefits of RPTs, benefits that are not possible when traditional tests are given. In fact, it’s largely true that traditional testing practices are responsible for helping to create the current security problems and limiting the usefulness of testing.

But wait…there are more reasons to care! There are a host of practical benefits, ones that affect your job, your time, your bank account.

Before today’s technologies, RPTs were impossible. But today, they are easy to build and deliver. Using the storage, speed, and processing ability of today’s computers, along with the Internet and AI, creating RPTs and the large pools you need to support them is easier than creating two or three traditional test forms (which get stolen right away anyway). This difference translates to more time to do other things; fewer steps and deadlines to worry about; streamlined schedules; and best of all, more money to spend on more important things. I estimate that the cost to create and maintain RPTs is only about 10% of what it costs to create and maintain traditional tests.

I managed the test development group for a large organization for most of the 1990s. I was always looking to stop cheating, create better tests, and save money and time. What I wouldn’t have given for even the hint of RPTs. The sad irony is that randomly parallel testing had been invented almost four decades earlier by the psychometric G.O.A.T., Fred Lord. And I didn’t know about it?! It wasn’t in the textbooks or in the traditional testing playbook. Sadder still, even in the 1990s the computing technology was sufficient to implement RPTs. And I would have moved heaven and earth to do so. But, alas, I was completely ignorant. Being new to the field then, and generally shy about introducing tech-based changes, I didn’t stumble upon randomly parallel testing until late in my career, when I recognized it as just what we all need. That was only a few years ago, and I’m now actively promoting its benefits.

How can I know that RPTs actually work?

That’s a good question I get a lot. Other than strong psychometric and statistical support by Lord and other testing leaders, the entire foundation of the field of statistics, major supporting test theories, plenty of scientific research going back 70 years and continuing today, data simulations that demonstrate its numbing effect on cheating, and case studies of organizations using it successfully today, I got nuthin’. Sarcasm aside, current traditional ways of testing could only dream of having such an avalanche of support and evidence. Traditional testing practices—and the documented problems they enable—are mostly supported today by the response, “that’s the way we’ve always done it” or “that’s how we were told to do it”.

A colleague of mine at Caveon, Jennifer Palmer, and I recently released a bibliography to help anyone interested in randomly parallel testing and its joined-at-the-hip companion, domain-referenced testing. It will help you learn more about these topics. The bibliography can be found on our website. We recommend that you begin learning as much as you can about RPTs.

As you learn more, you will become comfortable with at least trying it out, and then moving away from traditional practices altogether. It will be an experience akin to trying out a cell phone for the first time and never going back to land lines. There is greater risk in sticking with the status quo and its current problems than there is in moving to something new that solves those problems.

How can I know that RPTs are fair?

This is a slightly different kind of question than the one just above. It asks whether test forms are interchangeable, meaning that everyone, including test takers, would be indifferent to which one a test taker takes. The question calls for assurance that no test taker is disadvantaged compared to others by randomly getting a more difficult form.

First of all, and this is the most important point, let me start out by saying that there is no process in heaven or on earth to build tests that guarantees that test forms are perfectly identical in how difficult they are. With that truth in mind, the goal is to make the test forms as equivalent or parallel or interchangeable or fair as possible. Randomly parallel tests are the very best way to build test forms that have these qualities. If you want to reduce any disparities in RPT forms, you can simply add more items to them. By increasing their length a reasonable amount, you can reduce the differences to your satisfaction. Traditional fixed forms, even those that have been statistically equated, have similar disparities. Compared to these, RPT forms retain their interchangeability pretty much forever after publication instead of being immediately ravaged by theft and cheating. Traditional fixed test forms lose any claim of equivalence and fairness when reality hits.
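
The point about lengthening forms can be illustrated with a short simulation. This is only a sketch: the pool of 5,000 items, the range of item difficulties, and the form lengths are all assumptions made up for the example. It draws many random forms at each length and measures how much the average difficulty varies from form to form.

```python
import random
from statistics import mean, pstdev


def form_difficulty_spread(pool, form_length, n_forms, rng):
    """Standard deviation of average form difficulty across many random forms."""
    form_means = [mean(rng.sample(pool, form_length)) for _ in range(n_forms)]
    return pstdev(form_means)


rng = random.Random(42)
# Hypothetical pool: each value is an item's difficulty, expressed as the
# proportion of test takers expected to answer it correctly.
pool = [rng.uniform(0.30, 0.95) for _ in range(5000)]

spreads = {
    length: form_difficulty_spread(pool, length, n_forms=1000, rng=rng)
    for length in (20, 40, 80, 160)
}
for length, spread in spreads.items():
    print(f"{length:>3} items per form: spread in form difficulty = {spread:.4f}")
```

The spread shrinks as the forms get longer, roughly in proportion to one over the square root of the form length, so doubling the length of a form does not halve the disparity but does reduce it noticeably.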

What if I’m a little worried about changing my tests, about getting yelled at, or even sued?

I believe these can be avoided. Not the yelling, of course, as someone is always not going to like what you are trying to do. It probably messes with their traditional testing world. But the worry about changing what you are doing and the legal defensibility of RPTs can be easily handled. Your worry isn’t about adopting RPTs or any other new technology that changes testing, it’s about what you don’t yet know. Before taking the plunge, dip your toe in the water, check out the depth of the pool (no pun intended). Read the papers on RPTs, call and chat with me, talk to others who have gone that route. I’m hoping the benefits described above will lure you to at least entertain the notion, perhaps even to try it out to see if it works. There’s not much personal risk in that. And worry doesn’t even enter into it.

Now, a note about legal defensibility, the club that so-called testing experts use to beat you into submission whenever you question the orthodoxy. It’s my opinion that the ability to legally defend the practice of randomly parallel testing is many times greater than it is to defend the dominant current practices. Just because everyone is doing something doesn’t make it right. Current practices are not defensible today. Here is one example. Instead of stopping cheating, they enable it. This leads to a huge fairness imbalance. Test takers who cheat get higher scores on tests, and better opportunities for education and work. It’s my opinion that RPTs have a much more solid defense against lawsuits. At least I would be more comfortable as an expert witness supporting that position.

At some point, the decision to change will have to be made for people at all testing programs. The current process is unsustainable, particularly when better technology and solutions are available.

How do I make RPTs?

This section will give you a high-level overview, a road map, of the process for creating RPTs. There are actually several routes you can take and different destinations that end up being worthwhile. And nothing you do now will prevent you from doing it differently later.

As I said earlier in the paper, there are two major elements of randomly parallel testing that need to be put in place for RPTs to work. The first is to create a large pool, which we can refer to as a “universe”, of items that cover the domain for your test. Smaller domains, as well as sub-domains, will naturally require fewer items. The second is to build or locate a testing system that can manage a large operational pool of items for a test, randomly sample from the pool on-the-fly during the exam, and randomly assign those items to each test taker while the test is going on. The end result is that every test taker always gets a unique test form, each designed to your specifications and made up of sampled items that statistically represent the domain.

I was going to go into detail about the process for creating a large pool, because there are alternative ways to go about it. Instead, I’ll describe it generally. If you are interested, you can contact me directly or others at Caveon for more instruction.

Making a Large Item Pool

You can create a large number of items as “finished” items like those used on traditional computerized tests, or you can create a pool of virtual items. The latter pool needs a little more explaining. A virtual pool is created by programming an algorithm that can build any suitable item on-the-fly by combining an item format with content “parts” from different data resources. A helpful example might be a spelling test. Take all the words in a dictionary and place them in a large spreadsheet. When it is time to give a spelling item to an examinee, randomly select a row of the spreadsheet, and add the word in that row to the stem of the item. The question might say, using audio, “Please spell the word <random word> out loud.” The examinee’s oral spelling attempt is captured and converted to text and then compared with the correct spelling. The item is scored as correct or incorrect. The next item follows the same process, continuing until the test has completed. This is an RPT where the critical content is stored in a spreadsheet and combined, when needed, with an item format to instantaneously make a complete high-quality item.
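
The spelling example above might be sketched as follows. This is an illustration only, not a description of any real system: the five-word list stands in for the dictionary spreadsheet, the names `SpellingItem`, `STEM_TEMPLATE`, and `make_spelling_item` are hypothetical, and the audio playback and speech-to-text steps are left out, so the response is assumed to arrive already converted to text.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SpellingItem:
    prompt: str   # what the examinee hears
    answer: str   # the correct spelling

    def score(self, response: str) -> bool:
        """Score the transcribed response as correct (True) or incorrect (False)."""
        return response.strip().lower() == self.answer.lower()


# Stand-in for the spreadsheet of dictionary words (hypothetical content).
WORD_LIST = ["necessary", "rhythm", "accommodate", "plumber", "valve"]

# The item format: a fixed stem with a slot for the randomly chosen word.
STEM_TEMPLATE = "Please spell the word {word} out loud."


def make_spelling_item(words, rng):
    """Build one complete item on-the-fly from the format plus a random row."""
    word = rng.choice(words)
    return SpellingItem(prompt=STEM_TEMPLATE.format(word=word), answer=word)


rng = random.Random()
item = make_spelling_item(WORD_LIST, rng)
print(item.prompt)
print("scored correct:", item.score(item.answer))
```

Nothing is stored except the word list and the template. Each finished item exists only for the moment it is delivered, which is what makes the pool "virtual".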

You can probably imagine how virtual pools can be built efficiently, stored and managed easily, can be very large in size, and result eventually in actual RPTs.

So, you can either build actual “complete” items, assign an item ID to them, and find a place to store them, or you can create the components of items and programmatically join them together at the last second.

Both of these ways to build large item pools are feasible and should be considered. AI can make the process easier for both. At Caveon we have built some technology to help with either process and can point you to other resources.

Randomly Selecting Items and Assigning them to Examinees

The second technical task is to randomly select items during an exam from the pool individually for each examinee. This assures that each test form is unique and representative of the item pool. To do this, the test administration system needs to make your large item pool available whenever RPTs are given to examinees. That’s the first challenge as most testing systems do not currently accommodate large operational pools in this way. You can either convince your test administration vendor to support your large pools or find a vendor that does.

If the pool is not immediately accessible during an exam, RPTs are impossible. If the pool is available, then the test employs a randomization procedure (or stratified randomization procedure) to randomly draw an actual (or virtual) item to give to a target test taker. This process takes place right after the previous item has been answered, and should take only milliseconds. Testing systems that have in the past only accommodated fixed-form traditional testing need to be modified to select items for RPTs properly. There are some systems out there that support this form of random selection and assignment. Caveon’s Scorpion system is one of them.
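
As a rough illustration of a stratified randomization procedure, here is a minimal Python sketch. It does not describe how Scorpion or any other product works. The topic names, pool sizes, blueprint counts, and the `draw_items` function are all hypothetical. Items are handed out one at a time, the way a delivery system would select the next item after the previous one has been answered.

```python
import random
from collections import Counter

# Hypothetical pool for a plumbing test: topic -> list of item IDs.
POOL_BY_TOPIC = {
    "pipes": [f"pipes-{i}" for i in range(400)],
    "fixtures": [f"fixtures-{i}" for i in range(300)],
    "codes": [f"codes-{i}" for i in range(300)],
}
# Hypothetical test specification: how many items each form needs per topic.
BLUEPRINT = {"pipes": 4, "fixtures": 3, "codes": 3}


def draw_items(pool_by_topic, blueprint, rng):
    """Yield one randomly chosen item ID at a time, honoring the blueprint."""
    for topic, count in blueprint.items():
        if count > len(pool_by_topic.get(topic, [])):
            raise ValueError(f"not enough items for topic {topic!r}")
    # One slot per required item, placed in a random order for this test taker.
    slots = [topic for topic, count in blueprint.items() for _ in range(count)]
    rng.shuffle(slots)
    used = set()
    for topic in slots:
        # Sample without replacement so no item repeats on a form.
        candidates = [i for i in pool_by_topic[topic] if i not in used]
        item_id = rng.choice(candidates)
        used.add(item_id)
        yield item_id


form = list(draw_items(POOL_BY_TOPIC, BLUEPRINT, random.Random()))
print(form)
print(Counter(item_id.split("-")[0] for item_id in form))
```

Every run produces a different form, but each form contains the same number of items per topic, which is how stratification keeps forms aligned with your specifications while the items themselves stay unpredictable.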

Except for these two important components of the testing process, everything else works the same.

Summary

This paper answers the major questions about RPTs at a high level. It should be enough information to encourage further interest. My colleagues at Caveon and I, and others at other organizations, will be able to answer your more detailed questions or provide you with more documentation on the topic.
