A Prescription for Data Management Health

“Three days to enter data, five days to answer queries.”

The rule couldn’t be any clearer. You’ve told your sites at the investigator meeting and reminded them in each newsletter. You know you won’t get 100% compliance, and that’s fine. You’re reasonable.

But this is getting out of control.

As a data manager, you’ll always live with missing forms, blank fields and open queries. It’s a chronic condition that gives rise to acute episodes around interim and final locks. You’ve learned to manage it, even thrive with it, but you know there’s got to be a more effective treatment regimen.

Good news. While there’s no panacea, I’d like to offer a tool you can begin using today, regardless of your systems or processes, to spur your sites on to improved data entry, query resolution, and even enrollment. But as with any treatment, we need to consider directions, precautions and potential side effects.

First, though, some background. If you use EDC and IxRS to facilitate data collection and enrollment, you’ve probably made it a habit to pull their stable of available reports at some regular interval. (If not — if you’re relying solely on the summary statistics and visualizations available on these systems’ dashboards — consider getting acquainted with the detailed exports. This post will explain why.) These reports are almost always available in some Excel-readable format, and chances are you’ve become practiced at applying formulas to the data inside. (If not, here’s a tutorial on getting started.)

The calculations you make are vital in assessing which sites are leading the pack in subject recruitment and data management tasks, including the timely entry of data and resolution of queries. You and your fellow study leaders depend on this information to refine projections, meet lock deadlines, and offer assistance to sites behind the curve on key operational metrics. But do you share this information with the sites themselves?

“Yes! As interim locks approach, I always email out the number of total open queries and missing forms, along with encouragement to tidy these issues up.” If that’s your response, you’ve already adopted a best practice. But there’s more you can do.

Provided you do so with the right context and tone, you can, and in many cases should, communicate to each site exactly how it compares to its peers on several key metrics, from average open query age to subjects screened per month. When you supply this information, you recognize the site’s invaluable contributions, feed its natural and justified curiosity, and tap its desire to maximize performance.

This practice involves three major challenges. The first is calculating useful, “apples-to-apples” site metrics from the raw data found in your EDC and IxRS reports. The second is distributing this information to each site in a systematic way. The third is couching it in a message that conveys gratitude and support. Each challenge can be met.

Making the calculations

Here, I can offer some great news for users of OpenClinica, and a valuable tool for everyone. OpenClinica now supports a suite of configurable reporting dashboards, providing data managers and those they authorize (including sites) with clear, real-time visualizations of their study data. If you’re currently using OpenClinica, contact us and we’ll gladly share more details.

To help you get started now, regardless of your EDC or IxRS, we’ve created a workbook that performs dozens of calculations for each of your sites based on reports common to nearly every system. It’s free and guides you step-by-step through converting raw exports into powerful analytics.
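If you’d like a feel for the kinds of calculations involved, here’s a minimal sketch in Python with pandas. It’s an illustration only: the file names and column names are hypothetical stand-ins for whatever your own EDC and IxRS exports contain.

```python
import pandas as pd

# Hypothetical export files and column names; substitute your system's fields.
queries = pd.read_csv("query_export.csv", parse_dates=["opened_date"])
subjects = pd.read_csv("ixrs_subject_export.csv")  # one row per screened subject

today = pd.Timestamp.today().normalize()
open_q = queries[queries["status"] == "Open"].copy()
open_q["age_days"] = (today - open_q["opened_date"]).dt.days

per_site = pd.DataFrame({
    "screened": subjects.groupby("site_id").size(),
    "open_queries": open_q.groupby("site_id").size(),
    "avg_query_age_days": open_q.groupby("site_id")["age_days"].mean().round(1),
}).fillna(0)

per_site["queries_per_subject"] = (
    per_site["open_queries"] / per_site["screened"]
).round(2)
# Rank ascending, so rank 1 = fewest open queries per subject screened.
per_site["global_rank"] = per_site["queries_per_subject"].rank(method="min")

print(per_site.sort_values("global_rank"))
```

The same few lines, repeated for screening rates, missing pages, and the rest, yield a complete per-site metrics table.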

Distributing the information

Once you’ve created a table of performance metrics by site, you have the beginnings of a “mail merge.” You simply need to add a column specifying the email address of the individual responsible for data entry at each site.

The steps for executing a mail merge differ from email client to email client, so consult your client’s documentation for specifics. If you’re comfortable with a little scripting, you can also perform the merge yourself, as in the sketch below.
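Here’s a minimal, hypothetical sketch of a scripted merge using Python’s standard library. The SMTP host, credentials, file name, and column names are all placeholders; the merge fields mirror a few of those in the report template later in this post.

```python
import csv
import smtplib
from email.message import EmailMessage

# Hypothetical message template; fields correspond to columns in the metrics file.
TEMPLATE = """Hello Site {site_id},

Open Queries: {oq}
Average Age of Open Queries: {avgqage} days
Queries/Subject Global Rank: {qrankg}

Thank you for all you do in service to our study and your patients!
"""

with smtplib.SMTP("smtp.example.org", 587) as server:  # placeholder SMTP host
    server.starttls()
    server.login("study_admin", "app_password")        # placeholder credentials
    with open("site_metrics.csv", newline="") as f:    # one row per site
        for row in csv.DictReader(f):
            msg = EmailMessage()
            msg["Subject"] = f"Site {row['site_id']} By the Numbers"
            msg["From"] = "study_admin@example.org"
            msg["To"] = row["email"]
            msg.set_content(TEMPLATE.format(**row))
            server.send_message(msg)
```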

Setting the context

So far, we’ve touched on the technology of quantitative performance reporting. But what about the art? It’s crucial that sites understand your intent isn’t to chastise, but to inform and encourage. The metrics you calculate are just one piece of a broader discussion, one that includes particularities a spreadsheet simply can’t reflect, such as patient availability and staff experience. A site whose “screened per month” measure ranks in the bottom quartile may have had to overcome incredible hurdles to enroll its six or seven subjects. Meanwhile, it may be contributing valuable thought leadership.

To establish the right tone, you might consider adopting a message template like this one:

Hello Site <<site_id>>,

The Data Safety Monitoring Committee will meet two weeks from today, so it’s important we enter all data for visits that occurred on or before March 31st by this Friday, and close all queries by next Wednesday. We can’t thank you enough for your diligence in screening qualified patients and entering data. As you well know, your efforts here support not just our study, but the patients themselves.

It’s been an incredibly busy month, and we recognize it’s not always possible to enter data within five days of events. We realize some queries take weeks to close. And we know your first priority remains and should remain your patients, whether they’re participating in this study or not. Your accomplishments are all the more impressive in light of these facts.

We believe you deserve insight into the contributions you’re making to our study. That’s why we’re initiating a weekly, custom report to share your site’s progress with you. We understand you may be curious about how your “numbers” stack up against those of other sites, so we’ve included some comparative measures in this report. Also, to help you navigate data management, we’ve listed your missing forms and open queries as of the report date shown. (Please note that you may have closed one or more queries or submitted one or more forms in the time between report generation and your receipt of this email. The numbers below are not real-time.)

Thank you again for all you do in service to our study and your patients!

Site <<site_id>> By the Numbers
Report date: <<date>>
Screened: <<screened>>
Failed: <<failed>>
Randomized: <<rand>>
Screen Failure Rate (Failed / (Failed + Randomized)): <<sfrate>>
Months Activated: <<mons>>
Screened/Month: <<srate>>
Screen Rate Country Rank: <<srankc>>
Screen Rate Global Rank: <<srankg>>
Randomized/Month: <<rrate>>
Randomization Rate Country Rank: <<rrankc>>
Randomization Rate Global Rank: <<rrankg>>
Days Since Last Screening: <<dsls>>
Days Since Last Randomization: <<dslr>>
Open Queries: <<oq>>
Queries Per Subject Screened: <<qrate>>
Queries/Subject Country Rank: <<qrankc>>
Queries/Subject Global Rank: <<qrankg>>
Average Age of Open Queries (Days): <<avgqage>>
Age of Oldest Query (Days): <<oldestq>>
Query List: <<qlist>>
Missing Pages: <<mpgs>>
Missing Pages Per Subject Screened: <<mrate>>
Missing Pages Per Subject Country Rank: <<mrankc>>
Missing Pages Per Subject Global Rank: <<mrankg>>
Average Age of Missing Pages (Days): <<avgmpgage>>
Age of Oldest Missing Page (Days): <<oldestmpg>>
Missing Page List: <<mpgslist>>

Some final precautions

How often you provide a report like the one above, and what you include in it, are at your discretion. Fast-moving infectious disease trials may warrant a weekly report, while large, endpoint-driven cardiac studies may benefit from just one report per month. Also, carefully consider the cultural differences that exist among sites in various countries; in some, there may be no acceptable way to communicate comparative metrics at all.

There’s power in your metadata, and you should consult it frequently on your own, weekly if not daily. You can use the workbook above for that purpose alone, if you choose. But we have an obligation to patients worldwide to conduct trials in the most efficient manner compatible with the highest data quality, and bringing some gentle pressure to bear on sites is one way to do so. If you adopt some version of the practice described in this post, please let us know your experience with a comment or email.

Around the World in Three Data Integrations

Big data has been a recurring topic in medical research news for years now, and it deserves our attention. Big data’s potential to revolutionize fields like genomics and to advance precision medicine generally is stunning. Today, though, much of the press is speculation. Robustly effective designer drugs for cancer, based on the patient’s genetic markers, remain an ideal that is likely decades away.

But if we adopt a broader conception of big data–one that includes the massive infrastructure supporting social media, the Internet of Things, and (potentially) interoperable health record platforms–real world applications are not hard to find.

March 2015: Researchers at Stanford University recruit 11,000 subjects into a cardiovascular study in 24 hours using an app built on Apple’s open source ResearchKit framework.

September 2015: “By outfitting trial participants with wearables, companies are beginning to amass precise information and gather round-the-clock data in hopes of streamlining trials and better understanding whether a drug is working… So far, there are at least 299 such clinical trials using wearables, according to the National Institutes of Health’s records.” – Bloomberg News

July 2016: The National Cancer Institute introduces OpenGeneMed, “a portable, flexible and customizable informatics hub for the coordination of next-generation sequencing studies in support of precision medicine trials.”


One facet of these examples stands out. For all their diversity, projects that rely on big data rely just as much on collaboration. Moving from genomes to biomarkers to disease risk models and personalized treatment requires more than one big dataset: it requires the integration of data from multiple systems that are secure, geographically separated, and disparately schematized.

Ten years ago, the ability to handle this task might have been seen as a leading-edge, if not commonly leveraged, feature of clinical technology. Today, software that cannot facilitate integration is doomed to obsolescence.

eCitizen of the Data World

What does this requirement mean for EDC? Simply put, those of us building data capture solutions need to look far beyond the “coordinator keying in vitals” use case. (Our solution for that use case had better already be rapid, reliable and easier to execute than ever, considering the burdens placed on trial sites in 2017.) With “insight by integration” at the forefront of research strategies, we technologists had better think of our system as a world traveler: one familiar with the laws in multiple countries, authorized to enter and leave those countries, and fully knowledgeable of their languages and customs. In the world of data management, this means the ability to pass authentication to enter a source database, map the data to a target, and leave the source while maintaining data provenance.

As a long-standing promoter of open, standards-based interoperability, OpenClinica embodies this “world traveler.” The native language of OpenClinica’s EDC is the Clinical Data Interchange Standards Consortium Operational Data Model (CDISC ODM). This fact alone makes the OpenClinica data model an ideal cosmopolitan, instantly conversant with research peers around the globe. But holding fast to one standard is not sufficient; we need to be willing to learn new languages. By offering a well-documented web services API, OpenClinica makes it easy for its users to leverage RESTful web services, together with the OAuth 2.0 protocol, to systematically:

  • extract data from almost any third-party source (e.g. labs and imaging centers), and
  • associate each element of that data with the relevant Case Report Form (CRF) field (see the sketch below).
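To make the pattern concrete, here’s a minimal sketch of the flow: obtain an OAuth 2.0 token, pull records from a source system, and map each element to a CRF field identifier. Every URL, credential, and field name below is a hypothetical placeholder rather than a documented endpoint; consult each system’s API documentation for the real ones.

```python
import requests

# 1. Authenticate against the source system (OAuth 2.0 client-credentials flow).
token = requests.post(
    "https://lab.example.org/oauth/token",  # placeholder URL
    data={"grant_type": "client_credentials",
          "client_id": "my_client", "client_secret": "my_secret"},
).json()["access_token"]

# 2. Extract lab results for a subject from the third-party source.
results = requests.get(
    "https://lab.example.org/api/results",  # placeholder URL
    params={"subject": "SUBJ-001"},
    headers={"Authorization": f"Bearer {token}"},
).json()  # assume a list of {"analyte": ..., "value": ...} records

# 3. Associate each source element with a hypothetical CRF field OID,
#    ready for submission to the EDC's import interface.
field_map = {"hemoglobin": "I_LABS_HGB", "wbc_count": "I_LABS_WBC"}
payload = {field_map[r["analyte"]]: r["value"]
           for r in results if r["analyte"] in field_map}
print(payload)  # e.g. {"I_LABS_HGB": 13.2, "I_LABS_WBC": 5.4}
```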

APIs and authentication protocols offer the most direct route to turnkey integration. But it’s not enough to be powerful in the pursuit of data integration; a system has to be flexible, too, when tapping data sources that aren’t accessible via an API. For OpenClinica, this means providing a host of configurable tools to data managers and data entry personnel.

  • OpenClinica’s Jobs feature allows for custom imports from local files. A Job may be scheduled to run at any frequency, so that users responsible for data entry based on a regularly updated flat file (e.g. a CSV on their hard drive) may provide that data without keying in each element. A well-defined Job, set up just once, improves accuracy and saves hours of research time.
  • An Import Data feature makes ad hoc batch uploads easy, as well. Users simply generate an XML file based on OpenClinica-supplied Object Identifiers (OIDs) to map data from the import file to the EDC (see the sketch after this list).
  • OpenClinica supports a variety of Single Sign On (SSO) protocols, reducing repetitive authentication while maintaining security. OpenClinica is also an early and experienced adopter of SMART on FHIR, a set of open specifications for integrating its core EDC with Electronic Medical Records (EMR) and other health IT systems.
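As an illustration of that OID-based mapping, here’s a minimal sketch that generates an ODM-style import file with Python’s standard library. The overall structure follows CDISC ODM, but every OID value below is hypothetical; in practice you would copy the OIDs defined in your own study metadata.

```python
import xml.etree.ElementTree as ET

# All OIDs below are hypothetical; use the ones your study metadata defines.
odm = ET.Element("ODM")
clinical = ET.SubElement(odm, "ClinicalData",
                         StudyOID="S_DEMO", MetaDataVersionOID="v1.0.0")
subject = ET.SubElement(clinical, "SubjectData", SubjectKey="SS_SUBJ001")
event = ET.SubElement(subject, "StudyEventData", StudyEventOID="SE_VISIT1")
form = ET.SubElement(event, "FormData", FormOID="F_VITALS")
group = ET.SubElement(form, "ItemGroupData",
                      ItemGroupOID="IG_VITALS", TransactionType="Insert")
ET.SubElement(group, "ItemData", ItemOID="I_VITALS_HEIGHT", Value="172")
ET.SubElement(group, "ItemData", ItemOID="I_VITALS_WEIGHT", Value="68.5")

ET.ElementTree(odm).write("import.xml", encoding="UTF-8", xml_declaration=True)
```

Once generated, a file like this can be uploaded through the Import Data feature in a single step.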

A Look at Our Passport

So far, I’ve outlined a set of capabilities required of any EDC in 2017, and claimed that OpenClinica meets them all. But where’s the evidence? In the second half of this post, I’m going to put three of our partners in the spotlight. In each case, OpenClinica played a pivotal role in bringing together multiple data sources.

The Dutch Translational Research IT (TraIT) project, an initiative from the Center for Translational Molecular Medicine (CTMM), “enables integration and querying of information across the four major domains of translational research: clinical, imaging, biobanking and experimental (any-omics).” While multiple systems power that integration, OpenClinica is the central hub. TraIT continues to host and support https://www.openclinica.nl, which brought together 10 trials on the platform in October 2011. By March 2015, adoption had grown to 852 users at 157 sites conducting 136 studies, and by October 2016, usage had grown to more than 2,800 researchers and 250 research projects.

Among the selection criteria used to evaluate and ultimately select OpenClinica as a partner, TraIT specifically cited:

  • “links to other data storage and analysis tools within the TraIT platform, allowing researchers to integrate and analyse case report data, imaging data, experimental data and bio banking information,” and…
  • the “possibility to integrate with Trusted Third Party which handles proper (de-)identification of participant data within OpenClinica and other tools/services used in TraIT.”

It is worth noting that, in addition to an infrastructure that allows database integration, TraIT relies equally on OpenClinica’s open source model to build custom integrations. “The advantage of the Open Source model compared to a proprietary model, is that multiple independent contributors can review the source code, making enhancements which are then added to the version available to the entire OpenClinica community.”

Usage by the broader community helps ensure the innovation’s longevity and continued evolution. TraIT leverages these tools (such as the OC Data Importer) to help their sites import vast quantities of data in bulk fashion, eliminating transcription errors and delays.

The 100,000 Genomes Project, led and funded by Genomics England, is another example of a large-scale effort to combine clinical and genomic data. The 100,000 Genomes Project is sequencing 100,000 genomes in order to:

  • better diagnose rare disease,
  • understand its causes, and
  • set a direction for research on effective treatment

Whole genome sequencing (WGS) offers the best hope for determining which genetic mutations give rise to particular phenotypes, including disease states. WGS yields the syntactical equivalent of the three billion nucleotide base pairs that make up just one strand of one individual’s DNA, so a research program involving even one such sequencing has already entered the territory of “big data.” While highly specialized systems are responsible for the sequencing itself, and still others for analysis of the output, an equally essential tool for this research is a system that can manage the clinical data and biospecimen tracking of subjects visiting one of several geographically dispersed clinical centers. Here, too, OpenClinica serves as the hub. Researchers at 13 NHS Genomic Medicine Centres are using OpenClinica to register participants, capture clinical information, and ensure that blood samples stay matched with their de-identified contributors.

Project leaders have made public a 10-page guide for researchers on this process, one whose brevity and clarity speak to how easy OpenClinica makes it. Due to the dedication of the researchers, the collaboration of participants, and the fitness of the technology, the project is on track for completion in 2017.


PECARN, the Pediatric Emergency Care Applied Research Network, is the first federally funded pediatric emergency medicine research network in the United States. To date, PECARN has conducted 24 studies that have already changed how clinicians prevent and manage acute illness and injury in children.

As part of their mission to advance clinical practice, PECARN has taken a lead role in the implementation and study of clinical decision support tools. For all the potential benefit offered by these tools, questions remain about their adoption and effectiveness. Do physicians and nurses generally follow evidence-based recommendations for treatment or diagnostic procedures? When they do, are outcomes improved?

To help answer these questions, PECARN study leaders conducted a nonrandomized trial with concurrent controls at thirteen emergency departments between November 2011 and June 2014. These thirteen departments were consolidated into ten research sites. At eight of these sites, clinicians creating an EHR record for any patient <18 years old with minor blunt head trauma were automatically presented with a custom template. This template solicited additional data about the injury before providing recommendations on CT use and risk estimates of clinically important traumatic brain injuries. (CT imaging of the brain is associated with a non-negligible risk of tumor formation in those who undergo the procedure, especially children. At the same time, early detection of ciTBI–i.e. injuries leading to death or requiring neurosurgery, intubation for more than 24 hours, or hospital admission for two or more nights–is critical for effective intervention. The recommendations provided by the EHR template were intended to limit CT use to those patients who met established predictive criteria for significant ciTBI risk.)

The clinicians’ work in their EHR, together with subsequent cranial imaging and TBI-related outcomes, generated data that required aggregation to determine (1) how frequently care providers heeded recommendations surrounding CT use, and (2) whether the predictive rules for ciTBI risk were valid. That aggregation fell to OpenClinica. By accepting reports generated by each site’s EHR to automatically create study subjects, and by integrating with the source of imaging data at each site, OpenClinica enabled a true e-source study that left clinical workflows unaffected. Not one of the 28,669 subjects created in the study database required manual data entry.

Images courtesy of Jeff Yearley, BA, Manager of Clinical Data Management, Data Coordinating Center, University of Utah. Click here to download the slides containing the images above.

The moral? Big data isn’t just found: it’s made, through the coordinated efforts of both people and systems that travel light and fast. You’re contributing to big data during more and more of your waking hours these days. If you want to help shape it through technology, get ready to cooperate… and pack your digital bags.

OC Participate Delivers Better Data, Faster, Again

Some topics in clinical trials bear repeat attention. With patient-centricity claiming more and more of the spotlight in both research and care (rightfully so), we think patient-reported outcomes (PRO) is one of them.

In our last post, we described some of the most common obstacles to getting quality data from PRO measures. Patients, especially those who are very sick, don’t want to hand-write dry medical diary entries. They don’t want to learn yet another electronic device, download and manage an app, or have to recall yet one more password. And who can blame them? Trial participants are the heroes of the research story, and when it comes to the collaborative process of data gathering, they deserve a hero’s welcome.

That’s why we developed OpenClinica Participate. We’re gratified by the success our clients have found leveraging this innovative ePRO solution, but we’re not surprised. When you prioritize a trial subject’s convenience and obsess over making things simple, you simply get better results. Here’s another example. Let’s call it:

Out with the old, in with the new

OpenClinica teamed with the Danish CRO Signifikans to implement OpenClinica Participate for a leading Denmark-based bioscience technology company developing an innovative treatment to alleviate colorectal disorders, a common side effect of numerous medicines affecting millions of people at any given time. The study’s objective was to investigate exercise-induced intestinal permeability, immune markers, and bowel habits in healthy volunteers aged 18 to 40. Participants were given two strains of Bifidobacterium, an anaerobic bacterium that resides in the intestinal tract. The study involved 48 participants throughout Sweden, and each was required to provide 65 daily diaries in addition to 5 in-person visits over the course of the study.

The Old Way

In a similar prior study, the sponsor collected paper diaries from 700 participants. Each participant provided their (hopefully) completed and accurate paper diary to their site coordinator during the in-person study visits. The site coordinators then delivered the completed diaries to a data coordinating center, where staff scanned them into a document management system and uploaded them in batches to an overseas vendor, which used a double data entry workflow to populate a database. Phew!

On average, four months elapsed between the point of data capture and the first day that data was available to the sponsor. Monitoring participant compliance was also a challenge in this study, as it was impossible to discern when each patient actually completed their daily diary. Data entry expenditures for this process tallied over $213 per patient diary, or $4.97 per diary page.

Overall, this process was cumbersome, expensive, and logistically complex. The sponsor was planning a similar new study, and this time around was determined to find a way to

  • get faster access to the study data,
  • improve data reliability, and
  • reduce costs

The New Way

The sponsor enlisted the local specialty CRO Signifikans to help it identify and implement a better approach. Signifikans recognized that with OpenClinica Participate, the sponsor could have immediate access to patient data, and that this data would automatically sit right alongside data captured from other sources during the study. No EDC integration was necessary.

Signifikans also took the lead on configuring the study in OpenClinica. (Our “make it simple” credo guides how we design tools for data managers, as well. That’s why we have invested so much in our forms engine, a topic for another post.) While building the study, Signifikans was able to easily demo prototypes to the sponsor along the way, iterating rapidly through edits and changes. Data capture forms were developed in the Swedish language, and the study was configured to send email reminders to patients to help ensure diaries were completed on time. The reminders contained a secure, uniquely-identified link the participant could click to go right to their diary, eliminating the need for participants to remember usernames and passwords.

Results

As soon as the study went live, the sponsor was able to monitor precisely when data were captured, something that was not possible with the old paper-based method. They observed, for example, that all five participants enrolled in the study’s first week completed their diary card daily, per the protocol. The sponsor’s confidence in patient compliance and data quality surged, so much so that they increased the amount of data being collected this way. Scaling that quickly would have been impossible with paper diaries and slow transcription processes.

“Participate was very low friction: set-up was quick and efficient, and patients really seemed to embrace the technology.”

– Andreas Habicht, CEO, Signifikans Aps

The OpenClinica solution delivered a unified study database out-of-the-box, with patient-reported data sitting alongside clinician-reported data and accessible via the same interface. Having everything in one audit log made it easy to follow the patient’s trajectory through the study. Signifikans was able to use the same tools to configure and manage both ePRO and non-ePRO aspects of the study, resulting in a faster time-to-launch, and facilitating mid-study changes.

In addition to enhanced data quality and faster access to data, OpenClinica Participate reduced the cost of data capture per diary by over 80%.


Comparison: Paper vs. Participate

Keep an eye out for more ePRO success stories on this blog. Our next post will delve into a different topic, but, as with this one, you can be sure it will feature better results through a better eClinical experience.

Is Your Clinical Trial Software Effective, or Just Efficacious? (Part 2 of 2)

When it comes to assessing your trial technology, your data managers, study coordinators, Investigators and senior leaders are all study subjects.

In the previous post, I described the difference between efficacy and effectiveness, an increasingly important concept in clinical research and healthcare. After stressing the importance of effectiveness research to health policy planning and patient decision-making, I summarized seven criteria for identifying effectiveness studies. Finally, I asked whether these criteria could be re-purposed beyond a medical intervention to inform how we measure the effectiveness of software systems used to conduct clinical trials.

Is it possible to assess clinical trial software through the lens of effectiveness, as opposed to just efficacy?

I believe that it’s not only possible, but crucial. Why? We all want to reduce the time and cost it takes to deliver safe, effective drugs to those that need them. But if we don’t scrutinize our tools for doing so, we risk letting the status quo impede our progress. When lives are on the line, we can’t afford to let any inefficiency stand.

In this post, I adapt the criteria for effectiveness studies in clinical research into a methodology for evaluating the effectiveness of clinical research software. I limit the scope of adaptation to electronic data capture (EDC) systems, but I suspect that a similar methodology could be developed for CTMS, IVR, eTMF and other complementary technologies. If I open a field of inquiry, or even just broaden one that exists, I’ll consider it time well spent.

Continue reading Is Your Clinical Trial Software Effective, or Just Efficacious? (Part 2 of 2)

Is Your Clinical Trial Software Effective, or Just Efficacious? (Part 1 of 2)

Are you measuring all the relevant outcomes of your clinical trial technology?

For pure pathogen-killing power, it’s hard to beat a surgeon’s hand scrub. Ask any clinician, and she’ll tell you how thoroughly chlorhexidine disinfects skin. If she’s a microbiologist, she’ll even explain to you the biocide’s mechanism of action–provided you’re still listening. But how would the practice fare, say, as a method of cold and flu prevention on a college campus? Your skepticism here would seem justified. After all, it’s hard to sterilize a cough in the dining hall.

Efficacy and effectiveness. It’s unfortunate their phonetics are so close, because while the terms do refer to relative locations along a continuum, they’re the furthest thing from synonyms, as the ever-accumulating literature on the topic will attest.

In this post and the one that follows, I’d like to offer some clarity on efficacy vs. effectiveness and illustrate the value that each type of analysis offers. If nothing else, what emerges should provide an introduction to the concepts for those new to clinical research. But I have a more speculative aim, too. I’d like to propose standards for assessing trial technology through each of these lenses. Why? Because while we’ve been asking whether a particular technology does what it’s explicitly designed to do, as we should and must, we may have forgotten to ask a critical follow-up question: does it improve the pace and reliability of our research?

Continue reading Is Your Clinical Trial Software Effective, or Just Efficacious? (Part 1 of 2)