## Turning the tables on patient-specific reference data

How much time do you have left?

Yes, in that sense. The existential one.

If the question is difficult to ask, it’s even harder to answer. Ask an actuary. Calculating life expectancy is a complex matter; more complex, at least, than plugging your date of birth and today’s date into a function. An informative life expectancy depends on a host of additional factors, like your sex, current health, and lifestyle habits.

“Multifactorial” calculations like the one above dominate medicine, so it’s no surprise that they should dominate clinical research, too. Take a plasma urea level of 39 mg/dL. Is that above, below, or within the normal range? The question is misconceived, because normal in this case is relative to patient age. A 30-year-old’s “slightly above normal” is a sixty-year-old’s “slightly below normal.”

Age is only one factor. For many ranges, patient gender, ethnicity, and co-morbidities, in addition to age, determine a normal range. Often, researchers can set these factors aside without raising undue safety concerns or undermining the generalizability of their results. But as personalized medicine continues to inform drug discovery and clinical care, researchers will turn to more finely-grained reference data more often. For this reason, data management systems must make it easy for these researchers to apply reference data that’s sensitive to as many factors as they choose.

Of course “easy”, just like “slightly below normal”, is a relative term–for the most part. In no context is writing a lengthy formula of nested “if, then” clauses easy, e.g.

If the participant is male and Hispanic and between 18 and 25 years old and the test is for ALT, then set the lower limit to 12 U/L and the upper limit to 102 U/L, and if the participant is male and Hispanic and between 26 and 34 years old and the test is for ALT, then set the lower limit to…

Completing the formula above would mean assigning a lower and upper bound to every combination of gender, ethnicity, and age range. The process could easily take hours, just to set the normal limits of ALT. If the study involved a dozen analytes, the data manager would need to devote the better part of a week to programming these constraints. If, at a later date, any one of those constraints changed, he or she would face the unenviable task of modifying (without breaking) the original formula. Too many “modern” EDC systems force the data manager to soldier through this error-prone task. With paper, it’s a non-starter.

How much better, then–for efficiency and quality–to rely on a general constraint; one that leverages a tool that’s easy to build, easy to read, and easy to amend? I’m talking about the humble table.

Yes, the table. For all our advancements in data architecture, the same grid that set us on the path to multiplication in second grade remains an asset today. It’s human readable, it’s intuitive, and it’s powerful.

Powerful? Really? How much can you accomplish with just two axes?

Great question! It’s true that most spreadsheet applications don’t offer more than two axes, at least not through their GUI. But who needs them when you have thousands of rows and hundreds of columns at your disposal?

Suppose I need to assign a unique value to every combination of three hand preferences (left, right, or ambidextrous), four eye colors (blue, green, brown, or hazel), and the eight blood types (O+, O-, A+, A-, B+, B-, AB+, AB-). At first blush, it seems a table won’t suffice. I have more dimensions (three) than I do axes (two). But a single axis can accommodate any number of dimensions, because nothing prevents me from treating each combination of values on those dimensions as its own, n-factored value. For example, I can treat each triad of handedness, eye color, and blood type as one of 96 phenotypes.

Laying these combinations along a vertical axis, I can assign a value to each with just two columns.

Maybe I’m partial to a more compact format. If so, I can combine the variables from two dimensions to specify one axis, and let the variables from the third dimension define the other:

Here I make the 96 assignments with 13 rows and 9 columns. (The virtue of this method is fewer total cells.)

In any case, I’m free to work with as many factors as the situation demands, and distribute them between the two axes in any way that makes the most sense to me. Leaning on a familiar format, I’ve made the difficult part of a multifactorial reference much easier. All that remains is to add to the form a simple instruction for “looking up” the values needed. Even if those values change, the form doesn’t need to.
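The idea of folding several dimensions into one axis can be sketched in a few lines of Python. This is only an illustration: the `make_key` helper, the `|` separator, and the running numbers used as placeholder values are assumptions, not part of any particular EDC.

```python
from itertools import product

HANDS = ["left", "right", "ambidextrous"]
EYES = ["blue", "green", "brown", "hazel"]
BLOOD = ["O+", "O-", "A+", "A-", "B+", "B-", "AB+", "AB-"]


def make_key(hand, eye, blood):
    """Join three factor values into a single composite key (hypothetical format)."""
    return f"{hand}|{eye}|{blood}"


# Tall layout: one row per phenotype, two columns (key, value).
# The values are placeholder running numbers, not real data.
tall = {
    make_key(h, e, b): i
    for i, (h, e, b) in enumerate(product(HANDS, EYES, BLOOD), start=1)
}

# Compact layout: rows keyed by (hand, eye), columns keyed by blood type.
compact = {
    (h, e): {b: tall[make_key(h, e, b)] for b in BLOOD}
    for h, e in product(HANDS, EYES)
}


def lookup(hand, eye, blood):
    """Retrieve the value assigned to one combination from the tall table."""
    return tall[make_key(hand, eye, blood)]


print(len(tall), len(compact))  # 96 12
```

Both layouts hold the same 96 assignments; only the distribution of factors between the two axes differs.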

Fair enough. But won’t real use cases require gargantuan tables?

Sure. But what’s gargantuan to you and me is a walk around the block for the right technology. OpenClinica’s EDC relies on fast and flexible XForms to move data through a nimble, microservices architecture, so “clinically-sized tables” pose no threat to smooth performance. Consider these common parameters:

• 81 ages (18 years old to 98 years old)
• 6 ethnic and racial categories
• 2 genders
• 40 analytes
• 2 limits of normal (one upper, one lower)

A mere 972 rows (plus one header) accommodate every combination of age, ethnic and racial category, and gender. 80 columns (plus one on the left for the combination labels) accommodate the 40 lower and 40 upper limits. The resulting 973 x 81 grid is small potatoes for database applications that power software like OpenClinica’s. Simple formulas in that context can retrieve the value from any coordinate within milliseconds.
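The arithmetic behind that grid is easy to check. The `table_shape` function below is a hypothetical helper, written only to make the counting explicit: rows are the product of the factor levels plus a header, and columns are the limits per analyte plus one label column.

```python
from math import prod


def table_shape(factor_levels, analytes, limits_per_analyte=2):
    """Return (rows, columns) for a reference table.

    factor_levels: number of levels for each row factor, e.g. [81, 6, 2]
    analytes: number of analytes, each with `limits_per_analyte` columns
    """
    rows = prod(factor_levels) + 1               # +1 for the header row
    cols = analytes * limits_per_analyte + 1     # +1 for the label column
    return rows, cols


rows, cols = table_shape([81, 6, 2], analytes=40)
print(rows, cols)  # 973 81
```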

Great. But what’s the big deal? I hardly ever need to apply reference data for this many factors at once.

Yes, a heart rate is a heart rate, and while population differences might exist for this measure, they’re hardly a concern on your vitals form. But don’t confuse the frequency of a need with its importance. Take safety. An insignificant drop in a lab value for one patient may portend real danger for another. Even apart from lab interpretation, though, tables can drive efficiency and accuracy. Dosing can vary between countries participating in the same study, due to differences in labeling and regulation. The same goes for eligibility and arm allocation. Whenever we try to account for these variables within our form, we accept programming delays and chances for error that we don’t need to accept. It is possible, of course, to make an error when assembling our table, but those errors are easier to spot and correct within a grid than they are in some extended, conditional formula. The tables themselves are easier to build in the first place, too, as their source data usually comes to us in the form of a spreadsheet. A little re-labeling of our first row and column, some testing, and voilà: trusted reference values are now a part of our study.

The lesson is simple, then. First, make sure you’re using the right EDC. Your form builder should allow you to specify reference data with tables, and your forms themselves should retrieve values in that table based on user input all but instantly. Second, use your two axes to their full potential: fill those rows and columns with as many dimensions as are relevant by tapping some basic combinatorics. Third, congratulate yourself.

You’ve just used a bit of the time you have left more wisely.

### Real-world example: applying lab reference data that’s gender- and age-specific for two analytes

Not every analyte carries with it age- or gender-specific normal ranges. But for those that do, their differences are critical. In this example, I’m concerned with two levels from a blood serum panel: Insulin-like growth factor 1 (IGF-1) and Dehydroepiandrosterone-sulfate (DHEA-S). Both play a key role in several endocrinological disorders, and both have normal ranges that vary by age and gender.

Our example form first asks the user to specify the patient’s sex, patient’s date of birth, and date of sample collection. The form then calculates the patient’s age, in years, at the time of collection.

Next, the user is prompted to enter the value for IGF-1.

As soon as it’s entered, the form compares that value to the upper and lower limits of normal corresponding to the patient’s age and sex, as found on the table below. Note that the user’s selection for gender, together with the calculated age, forms a unique key (‘female40’).

The lower limit of normal (igf_ll) for a 40-year-old female is 106 ng/mL. The upper limit (igf_ul) is 267 ng/mL. Because the entered value of 145 falls within that range, no query is raised.

The form then prompts the user to enter a DHEA-S level. For this analyte, the user enters 278 µg/dL. That value is outside the range for a 40-year-old female. As a result, an auto-query instantly fires.
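The lookup-and-compare step can be sketched as follows. This is a minimal illustration in Python, not the form's actual expression syntax. The table holds only the one IGF-1 row quoted above; the `check_value` function, its messages, and the choice to treat the limits as inclusive are assumptions.

```python
# One row of the reference table, keyed by sex + age in whole years.
# Limits are the IGF-1 values from the worked example (ng/mL).
REFERENCE = {
    "female40": {"igf_ll": 106, "igf_ul": 267},
}


def check_value(sex, age, value, low_col, high_col, table=REFERENCE):
    """Return a query message, or None when the value is within normal limits."""
    key = f"{sex}{age}"                 # e.g. 'female40'
    row = table.get(key)
    if row is None:
        return f"no reference row for {key}"
    if value < row[low_col]:
        return "query: below lower limit of normal"
    if value > row[high_col]:
        return "query: above upper limit of normal"
    return None                         # in range (limits treated as inclusive)


print(check_value("female", 40, 145, "igf_ll", "igf_ul"))  # None
```

Adding an analyte means adding two columns to each row and passing their names; the function itself does not change.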

The full reference table includes 191 rows…

• 95 rows for men aged 18 to 112
• 95 rows for women aged 18 to 112
• 1 header row

… and 5 columns…

• 1 column for the gender-age combinations
• 1 column for IGF-1 lower limit
• 1 column for IGF-1 upper limit
• 1 column for DHEA-S upper limit
• 1 column for DHEA-S lower limit

Introducing racial and ethnic categories, along with more analytes, would multiply the area of our table. Six racial and ethnic categories combined with two genders and 95 whole-year ages would generate a total of 1,141 rows (6 x 2 x 95 combinations plus 1 header row). Specifying the upper and lower limits for three dozen analytes would occupy 73 columns (2 limits x 36 analytes + 1 label column). The resulting 1,141 x 73 table would contain 83,293 cells, a total that’s roughly 87 times greater than our original table’s count of 955 cells (191 x 5). Should you expect a proportional increase in your form’s response time? Not at all! The “lookup” still happens within milliseconds.

## Headed down registry road? Here are the EDC features you’ll need.

Here in Massachusetts, with the March winds whipping and snow always a threat, a week’s vacation down south is a common fantasy. Even if it means a 10-hour car ride, most of us relish the thought.

But suppose our usual set of wheels, a Mini Cooper, say, is in the shop. (Potholes the size of craters are a common reality here.) Instead of forgoing our vacation, we decide to rent a vehicle. Chances are another Mini Cooper won’t rank as our first choice. Sure, a car that size could get us from Boston to the Outer Banks. But at what cost to our comfort and cargo?

We can think of study designs as kinds of road trips, and our eClinical tools as vehicles. Randomized controlled trials (RCTs) and registry studies are only two such journeys, but they’re two of the most frequent we in the research community take. In both cases, most of us rely on electronic data capture (EDC) to help us reach our destination.

How do we choose the EDC “vehicle” that will get us there safely, with minimal delays? Marquee brand names matter less than road-tested features. Consider the relative importance of these EDC features in RCTs versus registries.

| Feature | RCTs | Registries |
| --- | --- | --- |
| Automatic reporting and notification | Important, especially as interim analyses approach | Very important, to maintain desired balance among subgroup sizes and to ensure that sites contact participants at the appropriate intervals |
| Interoperability | Important, especially for trials that need to consume a high volume of lab and imaging data on a regular basis | Very important, as EHR data can easily account for more than half of a registry’s data |
| Researcher ease-of-use | Very important, to drive data entry timelines, reduce queries, and ensure quality | Critically important, for the reasons listed under RCTs, as well as to minimize collection burden and complement the flow of clinical care |
| Participant ease-of-use | Often irrelevant, otherwise critically important, depending on whether patient-reported outcomes (PRO) are collected | Often critically important, as PRO is a far more common data source for registries |

Let’s look briefly at each of these four features in turn.

## Automatic reporting and notification

Registries may be observational, but make no mistake: there’s still plenty to do, especially when it comes to ensuring the internal and external validity of the study design. As with RCTs, registries begin that task before the first participant is ever enrolled. Inclusion and exclusion criteria define the patient population from which the study will draw. Enrollment targets and duration parameters are set to deliver the necessary statistical power. Data elements are selected ahead of time, as are relevant outcomes.

But RCTs wield two defenses against bias that registries do not: highly specific eligibility criteria, and randomization itself. The first defense minimizes the role confounding factors can play, while the second helps ensure that the influence of confounders is balanced between comparison groups. Registries, on the other hand, because of their greater need to reflect the diversity of the real world, cast “a wider net” with their eligibility criteria. In doing so, the room for selection bias–and confounder impact–grows. And because oversampled patient types are not randomized to one or more groups in a registry, they can distort findings more powerfully.

The registry data manager, then, is often engaged in a constant battle against selection bias. She has no more powerful weapon than real-time reporting, which can signal when enrollment efforts need to be retargeted.

Typically, criteria for registry enrollment aren’t as selective as they are for RCTs. That kind of wiggle room leaves the door open for selection bias. Regular, visual reporting of subgroup counts (e.g. patients of a certain race, ethnicity, sex, age, or socioeconomic status) is indispensable to maintaining a registry population that is representative of the general population with the disease, exposure, or treatment under study.
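A subgroup report of this kind boils down to counting and comparing. The sketch below assumes participants are available as simple records and that the study team has set a target share for each subgroup; the field name, the targets, and the five-point tolerance are all hypothetical.

```python
from collections import Counter


def subgroup_report(participants, field, targets, tolerance=0.05):
    """Compare each subgroup's enrolled share with its target share.

    Returns {group: (count, share, status)} where status is
    'under', 'over', or 'ok' relative to target +/- tolerance.
    """
    counts = Counter(p[field] for p in participants)
    total = sum(counts.values())
    report = {}
    for group, target_share in targets.items():
        count = counts.get(group, 0)
        share = count / total if total else 0.0
        if share < target_share - tolerance:
            status = "under"
        elif share > target_share + tolerance:
            status = "over"
        else:
            status = "ok"
        report[group] = (count, round(share, 3), status)
    return report


# Illustrative enrollment: 8 female and 2 male participants, 50/50 target.
enrolled = [{"sex": "female"}] * 8 + [{"sex": "male"}] * 2
print(subgroup_report(enrolled, "sex", {"female": 0.5, "male": 0.5}))
```

A report flagged "under" for a subgroup is the signal that enrollment efforts may need retargeting.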

That same real-time reporting, directed now at the site, can automatically prompt CRCs to contact participants in a longitudinal study at the right intervals. Why is this important? Missed visits mean missing data, which poses two risks. The first is a failure to collect enough overall data points to achieve the desired statistical power. The second, more subtle risk pertains to whom the missing data belongs. If a certain patient subgroup is disproportionately more likely to miss visits (and therefore leave blank spaces in the final dataset), results become biased toward the subgroups who were compliant with visit schedules.

Missing data is the scourge of registries. Without consistent outreach to all participants from sites, the data collected can easily be skewed by those participants who are proactive in keeping their appointments. Give your sites helpful, regular reminders of upcoming milestones for their participants.
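A reminder list for a site can be derived from nothing more than each participant's next due date. The function below is a sketch under that assumption; the participant IDs, the dates, and the seven-day lead time are illustrative only.

```python
from datetime import date, timedelta


def upcoming_visits(schedule, today, lead_days=7):
    """Return (participant_id, due_date) pairs due within `lead_days`, soonest first.

    schedule: {participant_id: next_due_date}
    """
    horizon = today + timedelta(days=lead_days)
    due = [(pid, d) for pid, d in schedule.items() if today <= d <= horizon]
    return sorted(due, key=lambda item: item[1])


# Hypothetical schedule for one site.
schedule = {
    "P001": date(2019, 3, 4),
    "P002": date(2019, 3, 20),
    "P003": date(2019, 3, 1),
}
for pid, due_date in upcoming_visits(schedule, today=date(2019, 3, 1)):
    print(pid, due_date.isoformat())
```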

The takeaway? Look for a data management system that allows you to build clear, actionable reports, and to push them out automatically to sites and other stakeholders on a schedule you set.

## Interoperability

The life sciences are awash with data, and yet how little of it flows smoothly from tank to tank. My blood type, and yours, is very likely recorded in a database somewhere. Yet, if either of us participates in a study where that blood type is a variable, we are almost certainly looking at a new finger prick.

The situation is poor enough for RCTs, but becomes dire with registries. Registries that don’t easily consume extant secondary data place increased burden on site staff, who are rarely reimbursed well or at all for their contribution. RCTs, on the other hand, often pay per assessment. Also unlike RCTs, registries make more frequent use of this data:

While some data in a registry are collected directly for registry purposes (primary data collection), important information also can be transferred into the registry from existing databases. Examples include demographic information from a hospital admission, discharge, and transfer system; medication use from a pharmacy database; and disease and treatment information, such as details of the coronary anatomy and percutaneous coronary intervention from a catheterization laboratory information system, electronic medical record, or medical claims databases. – Gliklich RE, Dreyer NA, Leavy MB, editors. Registries for Evaluating Patient Outcomes: A User’s Guide [Internet]. 3rd edition. Rockville (MD): Agency for Healthcare Research and Quality (US); 2014 Apr. 6, Data Sources for Registries.

Clearly, the ability to exchange data among multiple sources in a programmatic way (i.e. interoperability) is a must-have for the EDC that will power your registry. Of course, unlike data storage capacity, you can’t quantify interoperability with just a number and a unit of measure. Interoperability is a technical trait that depends on more fundamental attributes:

• Data standards – Does the system “speak” an open, globally recognized language, such as CDISC?
• API services – Does the system offer clear, well-documented processes for accepting (and mapping) data that is pushed to it from external sources?
• Security – Will data that enter, leave, and reside within the system remain encrypted at all times?

Before selecting an EDC, press your prospective vendors on the questions above. Then inquire exactly how they’ll ensure safe and reliable integration between their system and all your data sources.

## Researcher ease-of-use

Contributing to clinical research is, for many, its own reward. The prospect of expanding our medical knowledge and, perhaps, improving patient lives, is a powerful incentive. But it’s easy for a clinician or researcher to lose sight of these ideals in the middle of a hectic workday. When the research is long and unpaid, which is more likely to be the case for a registry than an RCT, the will to “get the work done” can quickly trump the will to do it right.

Leaders of registry operations, therefore, have an even greater responsibility than their RCT peers to keep hurdles low. That’s a wide-ranging obligation, but ensuring a frustration-free data capture experience stands at or near its center.

First, a clinical research coordinator (CRC) should meet with no obstacle the tasks of signing in to their EDC and navigating to the right participant. These are the “low bars.” Even so, they can easily trip up thick-client systems, and even web-based systems that aren’t built for performance or designed with UX (user experience) principles always front of mind.

But the most important ease-of-use tests happen in the context of the case report form (eCRF). Recall that a large portion of registry data comes from clinical encounters that occur in the delivery of standard care. Think pulse oximetry, or resting heart rate. Consequently, any eCRF that can’t be completed while in the exam room ought to have you raising an eyebrow. Accept nothing less than forms that render clearly in any browser, on any device (no matter how it’s held). But that’s not all. Fields on the form need to be “smart:” appearing only when they are relevant; capable of showing specific, real-time messages when the entered value is invalid; and hanging on to input even if an internet connection is lost. Finally, these fields should “remember” and calculate for the CRC, instantly pulling in patient data from visits ago to reference in the current form, and effortlessly turning a height and weight into a BMI.
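Two of those "smart" behaviors, calculating on the CRC's behalf and returning a specific message for an invalid entry, can be sketched briefly. The function names and the plausibility limits in the usage lines are assumptions for illustration; BMI itself is the standard weight in kilograms divided by the square of height in metres.

```python
def bmi(weight_kg, height_cm):
    """Body mass index: weight (kg) divided by height (m) squared, to one decimal."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100
    return round(weight_kg / height_m ** 2, 1)


def range_message(label, value, low, high, unit):
    """Return a specific message for an implausible entry, or None if it passes."""
    if value < low or value > high:
        return (
            f"{label} of {value} {unit} is outside the expected range "
            f"{low}-{high} {unit}. Please re-check the entry."
        )
    return None


print(bmi(70, 175))  # 22.9
# The 30-220 bpm limits are illustrative, not clinical guidance.
print(range_message("Resting heart rate", 400, 30, 220, "bpm"))
```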

Can’t pull medical history from the EHR? Help your CRC out with fast and responsive autocomplete fields.

In short, contributing to your registry should go hand in hand with delivering excellent patient care and keeping accurate, up-to-date records. The further those drift apart, the more your registry suffers.

## Participant ease-of-use

What endpoints are to RCTs, outcomes are to registries. And where there’s a concern with outcomes, there is (often) a concern with patient self-reports. Ergo, chances are high that your next registry may rely on patient-reported outcomes (PRO) as one of its data sources.

If we need to keep the barriers to data submission low for researchers, we need to keep them all but invisible to participants–while ensuring data quality. The simple paper form may appear to offer this balance. Historically, it may have done just that. But twenty years of Internet use have changed our expectations when it comes to offering personal information. Without sacrificing one bit (or byte) of security, we want the same ease in reporting aches to a physician as we find in booking a flight. We want instant “help” when we don’t understand a question, and we don’t want to be asked about matters that don’t apply to us.

Given the expectations above, a study that utilizes even a single PRO instrument can benefit from making the conversion to ePRO. Real-time edit checks, for example, re-orient the participant when their input conflicts with field requirements, without risking the influence of a human interpreter. The time and cost of transcription disappear.

When PRO takes the form of a patient diary, paper’s dirty secrets truly come into the light. Provided the paper form isn’t lost or damaged in the first place, it’s virtually impossible to tell whether a patient made daily diary entries as instructed, or retrospectively wrote responses just prior to a study visit, raising data quality concerns.

As a field, we’ve embraced ePRO for the last decade. But too many ePRO solutions don’t offer the ease or convenience they should. Many depend on provisioned devices, difficult to use and prone to malfunction. Web-based ePRO technologies are a step in the right direction. Here, too, though, industry efforts to deliver an effortless experience often fall short. Special software (such as smartphone apps) requires storage space, not to mention the know-how and patience for download, installation, and activation. Along with everything else participants need to remember, is it really fair–or feasible–to add a password, browser recommendations, and “virtual check-in times” to the list?


The answer lies in allowing patients to use their own devices, be it a laptop or smartphone, and to submit their data on the browser with which they’re most comfortable. Form URLs specially encoded for each participant make passwords unnecessary, while auto-scheduled email and SMS messages provide a friendly, “just-in-time” reminder to make their report. And what better way to convey a message of collaboration with the participant than eConsent? While its role in risky, interventional trials may still be unclear, eConsent is tailor-made for registries: it can deliver an interactive education on the purpose of the study, ensure comprehension with in-form quizzes, and signal to registry leaders real-time recruitment trends.
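One common way to build per-participant links is to attach a long random token to the form URL and keep a server-side map from token to participant. The sketch below shows that general pattern only; it is not a description of how any particular ePRO product encodes its URLs, and the base URL and parameter name are hypothetical.

```python
import secrets

BASE_URL = "https://edc.example.org/form"  # hypothetical address


def issue_links(participant_ids):
    """Create one unguessable link per participant.

    Returns (links, tokens): links maps participant -> URL,
    tokens maps token -> participant for server-side resolution.
    """
    links, tokens = {}, {}
    for pid in participant_ids:
        token = secrets.token_urlsafe(32)   # 32 random bytes, URL-safe text
        tokens[token] = pid
        links[pid] = f"{BASE_URL}?t={token}"
    return links, tokens


def resolve(token, tokens):
    """Return the participant for a token, or None if the token is unknown."""
    return tokens.get(token)


links, tokens = issue_links(["P001", "P002"])
print(links["P001"])
```

In practice such tokens would also need expiry and revocation, which this sketch leaves out.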

As for ePRO data collection itself, layout, question order, and response mechanism can all make the difference between valid, timely data and no data at all. The participant isn’t an amateur researcher, and won’t tolerate the kinds of screens all of us envision when we think of EMRs. Data collection should proceed from the simple to the complex, leveraging skip logic to trigger only those questions that are relevant, and using autocomplete to help with terminology. A single column layout, a conspicuous progress bar and page advance button, autosave–all of these features are crucial to treating patients like the study VIPs that they are.

## Why you should make better forms your top data management resolution for 2019

Chances are you’ve already set personal goals for the new year. But have you set professional ones? If not, let me suggest the most meaningful data management resolution you can make for 2019.

### “I will build better forms.”

Of all the aspects of eClinical, why rally around forms? For us, the answer is simple. Of all the tools in your toolbelt, optimized forms offer you the greatest leverage in capturing clean data promptly.

Just think about it. You don’t have control over the buzz of clinical and research activity at your sites. You don’t have control over source documents. And you can’t personally visit all your sites, train all your CRCs, or SDV all the items in your study.

So how do you bring order to the (mostly) controlled chaos of a clinical study? You encourage prompt entry of accurate data with forms that are smart, standardized, and, yes, even appealing. Think about what capable forms deliver at the point of entry and downstream:

• Timely data entry from CRCs who are thrilled to use your beautiful eCRFs
• More accurate data, thanks to specific, real-time edit check messages
• Less missing data, thanks to sensible skip logic and clear instructions
• Reduced SDV burden, as more and more of your clean, flexible forms become the source
• Reduced time to database lock
• Easier analysis, thanks to sophisticated “in form” scoring and calculations
• Smoother submission, with CDISC-standardized exports

Don’t get us wrong. Tools that expedite study design and user management, fast and reliable system performance, rock-solid security – these are crucial too. But forms are where you, your CRCs, and your data live, day in and day out. So in terms of overall study success, the “ROI” on perfecting your forms is hard to beat.

That’s why we’ll never take our eyes off this so-called fundamental. In fact, we devoted the last few months of 2018 to assembling the best thinking on forms. Not just our thinking, but yours, and that of experts. You can see what we’ve been up to by reading our blog series on cross-form logic or streaming our two December webinars. And we hope you’ll let us know what (in addition to better forms, of course) will change the clinical research landscape this year. Take the poll below!

## Take the poll

Which of the following will make the biggest impact on clinical data in 2019?

• Deep learning/AI
• Risk-based monitoring
• Enhanced security (e.g. blockchain… or quantum!)
• eSource
• Wearables

## Master form design with these two on-demand webinars

Equipped with the right system, data managers today have more tools than ever before to capture high-quality data right at its source. But what can the “right system” do? And how should data managers deploy those capabilities to prompt accurate, efficient entry from site staff and participants?

We hosted two webinars this month to answer those questions. Now you can watch them on demand. In Kitchen Sink, you’ll spend an economical 30 minutes understanding how OpenClinica’s form capabilities – from cross-form intelligence to modern, multi-media question types – all work together to serve as the user’s partner in capturing better data, faster. In Good Form, we step back to understand the proper role of these capabilities (not all scales are Likerts!) and climb inside the heads of CRCs and participants to better craft our forms for these study VIPs.

## The Kitchen Sink: See everything OpenClinica forms can do for you

In just thirty minutes, explore all of OpenClinica’s form capabilities, each doing its part to ensure better data, faster! See how skip logic, autocomplete, clickable image maps, real-time edit checks, autosave and a LOT more all work together on beautiful UX to drive cleaner data from the start.

## What can cross-form do for you? (Part 3 of 3)

In the examples given in Part 1 and Part 2 of this series, our cross-form logic depended on data with a known location. In the first case, we knew exactly which event, form, and item to turn to in order to retrieve participant sex and date of birth. In the second case, each of our event dates marked the start of a unique, one-time event, so “finding their address” within the database was a straightforward process.

But what happens when we need to reference data with an indeterminate location, supposing that it even exists? In these cases, we may need to walk around a remote neighborhood, comparing building shapes and sizes, before we find what we’re looking for.

Consider a study that requires drug cessation if a certain adverse event recurs within 90 days. For an Alzheimer’s study, that adverse event may be detection of ARIA (amyloid-related imaging abnormalities) on an MRI scan. Suppose that a second presentation of ARIA within 90 days of the first means that the participant must discontinue study drug. What better occasion could there be for “checking the records” than while reporting a new ARIA? Checking the records here means:

• retrieving the start dates of any previous AE whose report indicated ARIA
• calculating the days’ difference between the most recent of the dates above and the new ARIA presentation date
• showing an alert if that difference is less than 91 days

It’s hardly complex, but a busy CRC working with dozens of participants in multiple trials may forget to follow the process. Cross-form logic, on the other hand, never forgets. The screenshots below depict the result of a third AE report for a single participant. Of the two previous reports, the first indicated detection of ARIA on 1-Nov-2018.  Because the newest ARIA, on 24-Jan-2019, falls within 90 days of the prior one, the form displays instructions to discontinue study drug.
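The three "checking the records" steps can be sketched in Python. This illustrates the logic only, not the form's expression syntax; the record layout, the function name, and the date of the second, non-ARIA report are assumptions.

```python
from datetime import date


def aria_recurrence_alert(prior_reports, new_onset, window_days=90):
    """Return an alert if a prior ARIA began within `window_days` of the new one."""
    # Step 1: start dates of earlier AE reports that indicated ARIA.
    prior_aria = [
        r["start"] for r in prior_reports
        if r["aria"] and r["start"] <= new_onset
    ]
    if not prior_aria:
        return None
    # Step 2: days between the most recent prior ARIA and the new presentation.
    gap = (new_onset - max(prior_aria)).days
    # Step 3: alert when the difference is less than 91 days.
    if gap < window_days + 1:
        return f"ARIA recurred after {gap} days: discontinue study drug"
    return None


prior = [
    {"start": date(2018, 11, 1), "aria": True},
    {"start": date(2018, 12, 10), "aria": False},  # hypothetical second report
]
print(aria_recurrence_alert(prior, date(2019, 1, 24)))
# ARIA recurred after 84 days: discontinue study drug
```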

Few questions are too complex for cross-form logic to answer and act upon. If you can state a rule in logical or mathematical terms, you can most likely implement it using a straightforward expression, no matter how many other forms you need to reference. The OpenDataKit library of XPath functions offers a wealth of tools you can combine to create smart, versatile forms that collaborate with researchers.  So don’t let your innovation stop with drug development or study design: carry it through to your forms!

## What can cross-form do for you? (Part 2 of 3)

In the previous post, we presented a cross-form example of clinical data collected in one event factoring into the normal lab range for a subsequent event. But clinical data aren’t the only factors that drive decisions. When an event occurred may determine when it should happen next. Dosing visits provide a common example. Depending on the protocol, dosing might occur at precise intervals (e.g. exactly 21 days between doses) or within windows (e.g. at least 7 days and no more than 10 days from the previous dose). Your EDC system should be able to enforce either type of scheduling, by reading not only the dates entered into forms, but dates found in form and event metadata.

In the example illustrated below, the form makes calculations between the start of a current event (“Dosing Visit 2”) and the start of the previous visit (“Dosing Visit 1”). According to this imaginary protocol, no fewer than 7 and no more than 10 days may elapse between these two visits.

• If dosing visit 2 occurs within this range, the form guides the site-based user on how to prepare the dose.
• If dosing visit 2 has a start date fewer than 7 days after dosing visit 1, the form displays instructions not to proceed, and provides the earliest and latest start dates for the visit.
• Finally, if dosing visit 2 has a start date more than 10 days after dosing visit 1, the form displays instructions to submit a protocol deviation note.

All of these calculations and feedback take place instantaneously.
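The three outcomes above amount to one date subtraction and two comparisons. Here is a minimal Python sketch of that logic; the function name, status labels, and message wording are assumptions, and the 7-to-10-day window comes from the imaginary protocol.

```python
from datetime import date, timedelta


def check_dosing_window(visit1_start, visit2_start, min_days=7, max_days=10):
    """Classify dosing visit 2 against the allowed window after visit 1."""
    gap = (visit2_start - visit1_start).days
    earliest = visit1_start + timedelta(days=min_days)
    latest = visit1_start + timedelta(days=max_days)
    if gap < min_days:
        return ("too_early",
                f"Do not proceed. Visit may start between {earliest} and {latest}.")
    if gap > max_days:
        return ("too_late", "Submit a protocol deviation note.")
    return ("in_window", "Proceed with dose preparation.")


visit1 = date(2019, 1, 1)
print(check_dosing_window(visit1, date(2019, 1, 5)))
```

A fixed-interval protocol (e.g. exactly 21 days) is the same check with `min_days` and `max_days` set to the same value.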

## What can cross-form do for you? (Part 1 of 3)

Your study database has just locked. To celebrate, you decide to treat your two in-house monitors to dinner in the city. You’d like to offer your colleagues a choice of three restaurants. Take a moment and imagine which three restaurants you’d choose. Got it? Now suppose you recall that one of your monitors follows a gluten-free diet. Does that change your selection? If all of your initial picks specialized in wheat pasta, it ought to.

What’s good for dinner plans is essential for study conduct, when data quality and safety are on the line. An unremarkable value for height in one participant ought to trigger a query for another; for example, for a teen and a six-year-old enrolled in a pediatric trial. To evaluate the input in one field based on data in another, data managers rely on cross-field edit checks. If the fields are part of distinct forms, that evaluation is known as a cross-form edit check.

At OpenClinica, we extend this capability. We give data managers a tool to make their forms responsive to any element they choose from their database, from the participant’s most recently recorded blood pressure to the start date of the last dosing visit. Using easy-to-understand syntax in the form definition, data managers may reference any item in the database to trigger dynamic edit checks, make calculations, show/hide relevant information, and even change the content and logic of the active form. Your forms can now “know it all” to present a fully contextualized data capture experience.

We developed this feature to help our users:

• capture more consistent, higher quality data,
• drive protocol compliance,
• highlight potential safety issues, and
• mitigate the risk of unnecessary study procedures.

The capability goes far beyond cross-form edit checks. This is total study intelligence. Below is the first of three case studies we’ll present this month that illustrate the difference. If your study has requirements that resemble these, don’t hesitate to contact us for a more in-depth guided tour.

Case #1: Age- and Sex-Dependent Normal Lab Range

One person’s high blood pressure is another person’s normal. Why? The primary reasons are differences in age and sex. When it comes to lab results, “normal” may also be a function of the specific lab conducting the analysis. But not all studies rely on lab-specific ranges. For clarity’s sake, let’s imagine a study that will evaluate lab values against standards that are lab-independent. Known as “textbook ranges”, these ranges define the upper and lower bound of normal based solely on patient-specific factors. Below is a table indicating the lower and upper limits of normal for DHEA-S in adult males and females:

Chances are that a form associated with a screening event has already captured a participant’s sex and date of birth. A subsequent event may include a lab form. It’s inefficient to ask for the participant’s sex and age on this lab form. Sex has already been documented, after all, and age requires a calculation that compares the specimen collection date with the participant’s date of birth. Asking a CRC to make that calculation opens the door to error, especially if collection occurred close to the participant’s birthday. But age and sex information is required to determine whether an entered value is above or below the limits of normal.

Enter cross-form logic, which handily “pulls in” the required data from an external form. In the current case, an expression within the form compares the specimen collection date with the externally-supplied date of birth to calculate participant age. That age, the externally-supplied sex, and the lower and upper limits indicated within the lab form are all it takes to instantly evaluate the lab value against the appropriate range for the participant. Results of that evaluation may trigger or hide additional fields, get piped into question or response text for a separate item, or simply provide instructions, as shown in this video.
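
The logic described above can be sketched in a few lines of Python. This is an illustration only, not OpenClinica syntax: the function names are hypothetical, and the limits in `PLACEHOLDER_RANGES` are invented numbers, not clinical reference values for DHEA-S or any other analyte.

```python
from datetime import date

# HYPOTHETICAL placeholder limits: (sex, min_age, max_age, lower, upper).
# Invented for illustration only. These are NOT clinical reference values.
PLACEHOLDER_RANGES = [
    ("F", 18, 49, 10.0, 20.0),
    ("F", 50, 120, 5.0, 15.0),
    ("M", 18, 49, 12.0, 24.0),
    ("M", 50, 120, 6.0, 18.0),
]


def age_in_years(date_of_birth: date, on_date: date) -> int:
    """Completed years between date_of_birth and on_date."""
    years = on_date.year - date_of_birth.year
    # Subtract one if the birthday has not yet occurred in on_date's year.
    if (on_date.month, on_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def find_limits(sex: str, age: int) -> tuple:
    """Look up the (lower, upper) limits for a given sex and age."""
    for row_sex, min_age, max_age, lower, upper in PLACEHOLDER_RANGES:
        if row_sex == sex and min_age <= age <= max_age:
            return lower, upper
    raise LookupError(f"no range defined for sex={sex!r}, age={age}")


def flag_lab_value(value: float, sex: str, date_of_birth: date,
                   collection_date: date) -> str:
    """Evaluate a lab value against the range for the participant."""
    age = age_in_years(date_of_birth, collection_date)  # age at collection
    lower, upper = find_limits(sex, age)
    if value < lower:
        return "below normal"
    if value > upper:
        return "above normal"
    return "within normal"
```

With these placeholder limits, the same value of 16.0 for a female participant born on 1975-06-15 is within normal for a specimen collected on 2025-06-14 (age 49) and above normal for one collected a day later (age 50). That is precisely the birthday-adjacent case where a manual age calculation is most likely to go wrong.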

## Get better dates (in your eCRFs)

Dating is hard, especially if you’re not aware of the relevant conventions and etiquette. The same could be said for collecting date information in your EDC. As technologists enamored with data above all, those of us here at OpenClinica probably aren’t your best source for romantic advice. But if you’re searching for the most efficient way to capture unambiguous and properly formatted date information in your eCRF, prepare to swoon. Here are four tips for working with dates.

#### Tip #1: Allow for full and partial dates.

Requiring a full date when only a certain month or year is known to a CRC or participant is a major hazard for analysis. If that full date field is required, it’s quite possible that the user will select a placeholder for day of the month–the 1st or 15th, say–when that piece of information is unavailable to her. The value in the database, then, implies a level of specificity that wasn’t intended.

To avoid this pitfall, ensure that the individual entering the date indicates which portions are known and which are unknown (“UNK”). Then, offer the corresponding field for input.

During analysis, your statisticians will need to convert entered partial dates into imputed ones, using clear and consistent rules. So it’s important that your EDC support aggregating full and partial dates into a single item.
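
As a rough illustration of what a single item holding full and partial dates might look like, here is a small Python sketch. The `PartialDate` class is hypothetical, it assumes the year is always known, and the imputation rule (first month, first day) is only one possible convention; the real rule belongs to your statistical analysis plan.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PartialDate:
    """A date whose month and day may be unknown (None stands for "UNK")."""
    year: int
    month: Optional[int] = None
    day: Optional[int] = None

    def __post_init__(self):
        # A known day without a known month is not a meaningful partial date.
        if self.month is None and self.day is not None:
            raise ValueError("day cannot be known when month is unknown")

    @property
    def is_complete(self) -> bool:
        return self.month is not None and self.day is not None

    def impute(self, default_month: int = 1, default_day: int = 1) -> date:
        """Fill unknown parts with defaults (an assumed, illustrative rule)."""
        month = self.month if self.month is not None else default_month
        day = self.day if self.day is not None else default_day
        return date(self.year, month, day)
```

Because the unknown parts stay unknown in storage, the imputed value is derived only at analysis time, and the database never claims more precision than the user actually supplied.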

#### Tip #2: Offer a user-friendly UI that boosts accuracy.

Suppose that a participant knows that she took a dose of medication on the first Monday of the month. You don’t want her to leave the ePRO form in search of a calendar to retrieve that date. Similarly, for a CRC who prefers point-and-click wherever possible, you want a UI that works with her style, to encourage prompt entry. The same is true for a CRC who prefers to type.

Human factors like these impact both data quality and speed. That’s why they are a major design consideration for OpenClinica, and ought to be for you, as well. The datepicker shown here is the clear and flexible standard we rely on, supporting point-and-click, typed entry, and convenient scrolling through months, years, and even decades. Make sure your mechanism offers the same virtues.

#### Tip #3: Let your forms do the math.

Exclusion criteria for some trials require no history of a certain condition for a specified amount of time; for example, no history of melanoma for the last five years. In this case, performing the “date math” mentally may not pose a big challenge. But the situation becomes more complex when, rather than being disqualified, a potential participant is assigned to a particular cohort based on the date of some clinical event (or even multiple events). Error-free calculation then becomes as difficult as it is paramount. A capable form engine is your saving grace in these situations. Your EDC ought to support real-time calculations, and respond in a protocol-compliant way based on the results. Use your system’s calculation and logic functions to hide or require certain fields, enforce eligibility rules, make assignments, and otherwise ensure proper workflows.

The example shown here employs skip logic and date math to assign a cohort based on the date that a CT scan confirmed the absence of melanoma. Three years or more of complete response triggers assignment to cohort A, less than three years to cohort B.
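
The cohort rule can be sketched as follows. This is a Python illustration rather than form logic; the function names are hypothetical, and it assumes the comparison is made against a reference date such as the screening date, which the example does not specify.

```python
from datetime import date


def completed_years(start: date, end: date) -> int:
    """Whole years elapsed from start to end."""
    years = end.year - start.year
    # Subtract one if the anniversary has not yet been reached in end's year.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


def assign_cohort(scan_date: date, reference_date: date) -> str:
    """Cohort A for three or more years of complete response, else cohort B."""
    if scan_date > reference_date:
        raise ValueError("scan date cannot be after the reference date")
    return "A" if completed_years(scan_date, reference_date) >= 3 else "B"
```

Counting completed years, rather than dividing a day count by 365, keeps the three-year boundary on the calendar anniversary of the scan regardless of leap years.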

#### Tip #4: Follow the standards.

The final step when working with dates is to format them in accordance with a recognized standard. CDISC is arguably the most recognizable and robust. Following the ISO 8601 standard, CDISC takes YYYY-MM-DD as its date format.
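
For illustration, the short Python sketch below formats full and partial dates as ISO 8601 strings. The helper name `to_iso8601` is hypothetical. It relies on the fact that ISO 8601 permits reduced-precision forms such as YYYY-MM and YYYY, so a partial date can be written without inventing the missing parts.

```python
from datetime import date
from typing import Optional


def to_iso8601(year: int, month: Optional[int] = None,
               day: Optional[int] = None) -> str:
    """Format a full or partial date; unknown parts are simply omitted."""
    if month is None:
        if day is not None:
            raise ValueError("day cannot be known when month is unknown")
        return f"{year:04d}"
    if day is None:
        if not 1 <= month <= 12:
            raise ValueError("month must be between 1 and 12")
        return f"{year:04d}-{month:02d}"
    # date() validates the combination (e.g. rejects February 30).
    return date(year, month, day).isoformat()
```

For example, `to_iso8601(2019, 3, 7)` yields `2019-03-07`, while `to_iso8601(2019, 3)` yields `2019-03`.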