The Evolution of Electronic Data Capture

OpenClinica was recently featured in an article in Genetic Engineering and Biotechnology News titled “Commandeering Data with EDC Systems,” written by Dr. James Netterwald. The article briefly recounts the early days of clinical trial Electronic Data Capture (EDC). But how far have we come? Dr. Netterwald’s title (perhaps unintentionally) conjures up images of struggle and strife, which may be a more apropos description of the journey of Electronic Data Capture than it first appears.

As an industry, it’s taken us a good 20 years to get to where we are, and to be plain, it’s been a slow start. (In my own defense, I, and my company Akaza Research, have only been a party to the industry for the last 5 of those 20 years.) Climbing the evolutionary ladder from shipping laptops to sites to keying data into electronic case report forms is certainly progress by any measure. However, while the days of mailing tapes and disks are over, the days of real electronic data capture are yet to come. Today, most experts agree that only somewhere between one-half and two-thirds of all new clinical trials use EDC software, and of these only a very small fraction are “e-source,” defined as collecting data in electronic form at its source as opposed to keying it in from some other source. In some ways it is ironic that cutting-edge biopharmaceutical technologies are themselves developed with technologies that are, relatively speaking, much further down the technology food chain.

Notwithstanding, there are an enterprising few who have pushed the pace towards true EDC. Spaulding Clinical, a large phase 1 unit in West Bend, Wisconsin, has developed a system that automatically captures ECG data from their facility’s patients and directly populates the clinical trial database with these data. A patient wears the ECG device and the data are transmitted wirelessly to the EDC system. However, this slick and highly productive solution was not developed by either the ECG vendor or the EDC vendor. It was developed by hand by one of Spaulding’s own software developers.

Why isn’t this type of solution more commonplace in clinical trials? What prevents the industry from making the most of today’s information technology? With the strong incentives currently in place to make research more efficient, our field could certainly benefit from some more forward thinking.

– Ben Baumann

The Open Source Effect: Akaza Research Provides Insight into Rapid Growth of OpenClinica

OpenClinica has seen a surge in usage over the past year, according to a recent survey conducted by Akaza Research.

“Our annual survey of the OpenClinica community showed strong expansion in all key measurements of system usage,” said Cal Collins, Chief Executive Officer at Akaza. “In the past year we have seen doubling in the number of OpenClinica users and subjects, and a nearly 10-fold increase in regulatory submissions.”

The company reports that 168,989 subjects have been involved in OpenClinica-powered clinical trials, a 224 percent increase from the prior year. In tandem, the company identified a 246 percent increase in the number of OpenClinica software users. The figure measures users working at the sponsor or CRO level and does not include users at clinical trial sites.

“Since these figures are based on a voluntary survey of the OpenClinica community, they are likely underestimates,” said Collins. “While it can be difficult to precisely measure the usage of freely distributed open source software, they provide a clear indication of the growth in OpenClinica adoption around the world,” he added.

The Professional Open Source Model

OpenClinica stands in stark contrast to the landscape of other EDC products that are provided under a closed source license. Akaza Research’s “professional open source” business model makes OpenClinica available in two editions. The OpenClinica Community Edition is freely available to download, use, and modify. The OpenClinica Enterprise Edition is a certified build of the open source technology commercially supported by Akaza Research. In many respects, the company’s business model is similar to that of Red Hat (Linux), MySQL (database software), and other open source companies.

The rapidly growing OpenClinica open source community currently comprises over 10,500 users and developers, many of whom help review and adapt the open source software. Roughly 33 percent of OpenClinica users are located in North America, 30 percent in Europe, 14 percent in Asia, 9 percent in Africa, 7 percent in South America, and 7 percent in Australia. OpenClinica community members drive much of the product’s evolution, and in recent years have helped to usher the technology into a wide variety of clinical trial settings.

Worldwide Acceptance in Regulated Trials

The composition of the OpenClinica community is changing over time, with an increasing number of OpenClinica users representing commercial clinical trials. Currently, 55 percent of OpenClinica community members identify themselves as working in industry, with the remainder in academic or government settings.

According to Collins, “the robust overall growth is highlighted by an increasing proportion of OpenClinica users representing pharmaceutical, biotech, device, and other companies. We saw a 975 percent increase in OpenClinica-powered trials used in regulatory submissions in the past year, and in the next 12 months OpenClinica adopters expect to increase this number by another 200 percent. This is consistent with our OpenClinica Enterprise Edition customer growth, where a majority of new customers are from industry.”

OpenClinica Community Surpasses 10,000 Members …and oh yeah, what is this open source thing?

Heartfelt thanks to everyone who has supported the OpenClinica project over its relatively brief history. Our community now stands at over 10,000 registered members, representing a 3-fold increase in size over the past two years alone. With members in over 70 countries across six continents, open source is now a central part of the clinical trials software landscape. This is a major accomplishment that we should all be proud of.

While 10,000 may sound like a lot of people, there are still many within the clinical trials industry who do not understand the key concepts of open source. Other software categories have a high prevalence of open source offerings. For instance, when you look at database products (like MySQL, Postgres) and operating systems (like Linux, Android, BSD) there are numerous open source options. Open source is even widely prevalent in the EMR/EHR space, with OpenVista, and over 20 others to choose from.

As OpenClinica ushers the benefits of open source into the clinical trials space, it is instructive to periodically revisit the fundamentals of what exactly open source is.

What is open source?

Open source is a type of free software license–free as in “freedom,” not “beer.”[1] It is not “freeware” and it is not “shareware.” More specifically, open source provides users with[2]:

  • The freedom to run a program, for any purpose
  • The freedom to study how a program works and adapt it to a person’s needs. (Access to the source code is a precondition for this.)
  • The freedom to redistribute copies so that you can help your neighbor.
  • The freedom to improve a program and release improvements to the public, so the whole community benefits. (Access to source code is a precondition for this.)

There are numerous open source software licenses based on the above tenets, and roughly 60 open source licenses have been approved by the non-profit Open Source Initiative. The OpenClinica Community Edition is distributed under the LGPL open source license.

Open source as a development model

The software development models around open source projects are typically characterized by transparency and collaboration within the community. Opening the product up to the community, allowing anyone to see the good with the bad, helps to quickly uncover problems and identify areas for improvement. Most open source projects will publicly maintain a project roadmap and defect tracking system. Release cycles of active open source projects tend to be early and often.

The result of such openness and transparency is software that is often more reliable and better performing than proprietary, closed alternatives.

What is professional open source?

In a healthy professional open source model, a symbiotic relationship exists between the Community, Company, and Customer.

Some people may think of open source projects as purely volunteer efforts. That is definitely not the case! While governance models vary from project to project, commercial enterprises have helped make open source consumable by ordinary people and businesses. For example, through its OpenClinica Enterprise Edition, Akaza Research provides support and regulatory assurances that help to minimize business risk and ensure success for organizations wishing to use OpenClinica in mission critical settings. Organizations can turn to Akaza to rapidly develop in-house expertise, obtain hosting and expert professional services, and ensure their OpenClinica systems and users are productive and satisfied.

A pervasive trend in software

Open source is everywhere[3]. From the Firefox web browser to the most popular websites, everyone who uses the World Wide Web uses open source. As a web-based technology, OpenClinica and its community are direct beneficiaries of numerous other open source projects. Those within the clinical trials space who recognize the significance of open source will be a step ahead of their colleagues.

– Ben Baumann, Co-Founder, Akaza Research, LLC

Preview of the March 22nd OpenClinica Global Conference

With just a week to go, the OpenClinica Global Conference is shaping up to be an excellent event for learning about OpenClinica and networking with members of the OpenClinica community.

This is the first ever Global Conference and we are thrilled to have as a keynote speaker Mark Adams, Project Manager for the National Cancer Institute’s Cancer Biomedical Informatics Grid (caBIG). A fellow pioneer working to bring open source to the clinical research domain, caBIG has developed a set of interoperable, open source clinical informatics tools which address functions such as adverse event reporting, patient registries, study calendaring, clinical trial management, imaging, and tissue banking. The caBIG project and OpenClinica together illustrate the broad impact open source is having on clinical trials.

The conference program extends along three tracks with case studies, panel discussions, tutorials, and presentations from clinical trial sponsors, CROs, academic groups, and IT services companies. Content is oriented towards both technical and non-technical audiences. Selected topics include:

  • An unveiling of the new OpenClinica CRF Library, a curated repository of standards-based eCRFs for OpenClinica
  • Case studies from sponsors and CROs showing how they have used OpenClinica
  • Presentations of tools and extensions developed around OpenClinica
  • Tutorial for installing OpenClinica
  • Tools, tips, and techniques for using OpenClinica data in SAS
  • Live demonstration of automated data interchange with OpenClinica
  • Automating the data import process
  • Validating OpenClinica for 21 CFR Part 11 compliance
  • Generating ad hoc reports
  • Modularization of the OpenClinica source code and introduction to the OpenClinica Developer Network

A full suite of training classes for data managers, biostatisticians, project managers, system administrators, and developers is also being offered immediately preceding and following the conference.

We look forward to seeing you next week!

OpenClinica 3.0 Completes Functional Testing; Enters Deployment Testing

OpenClinica 3.0 is almost here! The quality team has successfully completed functional testing and has moved on to the phase of software quality assurance called deployment testing. Deployment tests cover 8 different target “platforms” that range from a clean installation of OpenClinica 3.0 on a Windows server using Postgres, to upgrades of OpenClinica 2.5 to 3.0 on a Linux machine using an Oracle Database. Of course, we also test on all combinations in between.

Before the production version of the application can be released, it must successfully pass through our Quality System. For those of you familiar with such a thing, all of the testing and documentation that OpenClinica 3.0 is going through will end up generating thousands of pages of “paper” that include user requirements, a traceability matrix, and a large set of screenshots which prove that the expected results of the test cases did in fact happen.

In addition to the team at Akaza that has invested thousands of hours testing the application, this release has also undergone road testing in our first OpenClinica Pilot Program.  I would like to warmly thank the participants of the program for committing their time and effort in making sure OpenClinica 3.0 is our most well vetted release to date.

Please look for an announcement from me in the coming days when OpenClinica 3.0 is available for download.

– Paul Galvin, Project Manager

OpenClinica 3.0 Features Preview: Part III

Welcome to the 3rd and final installment of the OpenClinica 3.0 features preview!  This post covers the new Web Services interface that is part of 3.0 and the job scheduler that can be used to automate Data Import and Data Export jobs.

OpenClinica 3.0 allows for programmatic interaction with external applications to reduce manual data entry and facilitate real-time data interchange with other systems.  The OpenClinica web services interface uses a SOAP-based API to allow the registering of a subject and scheduling of an event for a study subject.

OpenClinica provides a WSDL (Web Service Definition Language) that defines a structured format which allows OpenClinica to accept “messages” from an external system. For example, an EHR system could register subjects for a study in OpenClinica without direct human intervention. At the same time, the EHR could also be programmatically scheduling study events for these subjects. More information about the OpenClinica API can be found on the OpenClinica developer wiki.
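
As a rough illustration of what such a “message” might look like, the sketch below builds a SOAP envelope using only Python’s standard library. The SOAP 1.1 envelope namespace is the standard one; the service namespace and the element names (createRequest, studyIdentifier, subjectLabel, enrollmentDate) are hypothetical placeholders and are not taken from the OpenClinica WSDL, which should be consulted for the real names.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # standard SOAP 1.1 namespace
SVC_NS = "http://example.org/hypothetical/subject/v1"  # hypothetical, NOT the real OpenClinica namespace


def build_register_subject_message(study_id: str, subject_label: str, enrollment_date: str) -> str:
    """Return a SOAP 1.1 envelope (as a string) asking to register one subject.

    All element names under SVC_NS are illustrative placeholders.
    """
    ET.register_namespace("soapenv", SOAP_NS)
    ET.register_namespace("subj", SVC_NS)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")

    # The request payload lives inside the SOAP Body.
    request = ET.SubElement(body, f"{{{SVC_NS}}}createRequest")
    ET.SubElement(request, f"{{{SVC_NS}}}studyIdentifier").text = study_id
    ET.SubElement(request, f"{{{SVC_NS}}}subjectLabel").text = subject_label
    ET.SubElement(request, f"{{{SVC_NS}}}enrollmentDate").text = enrollment_date

    return ET.tostring(envelope, encoding="unicode")


message = build_register_subject_message("STUDY-1", "SUBJ-001", "2010-03-01")
```

An external system such as an EHR would send a message of this general shape to the web service endpoint over HTTP; the endpoint address and authentication details are deliberately left out here because they depend on the installation.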

An early reference implementation conducted by clinical lab Geneuity used the API to create a web service which inserts data programmatically into OpenClinica CRFs directly from laboratory devices. See the post by Geneuity’s Colton Smith below.

Another major productivity tool in 3.0 is the introduction of a Job Scheduler for automating bulk data import and export. With this feature users can define a job that will generate an export at a specified time interval. The Job Scheduler can also be configured to regularly scan a specific location for CDISC ODM files and run data imports when a new file is available. This feature can be particularly helpful in automating routine functions, such as the incorporation of lab data into OpenClinica from an external system. The lab data does need to be in a valid CDISC ODM format (this can be accomplished via another great open source tool called Mirth), but it does save a person from entering data in two applications separately.
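
To make the “scan a location for new files” idea concrete, here is a minimal sketch in Python. It is not the Job Scheduler’s actual implementation; the function names and the idea of tracking already-imported file names in a set are assumptions made for illustration. The check only confirms that a file is well-formed XML with a root element named ODM, which is far weaker than full validation against the CDISC ODM schema.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def is_odm_file(path: Path) -> bool:
    """Return True if the file parses as XML and its root element is named ODM."""
    try:
        root = ET.parse(path).getroot()
    except (ET.ParseError, OSError):
        return False
    # Strip any namespace, e.g. "{http://www.cdisc.org/ns/odm/v1.3}ODM" -> "ODM"
    return root.tag.rsplit("}", 1)[-1] == "ODM"


def find_new_odm_files(watch_dir: Path, already_imported: set[str]) -> list[Path]:
    """Return ODM files in watch_dir whose names are not in already_imported, sorted by name."""
    return [
        p for p in sorted(watch_dir.glob("*.xml"))
        if p.name not in already_imported and is_odm_file(p)
    ]
```

A scheduler would call find_new_odm_files at each interval, hand each returned file to the import routine, and then add its name to already_imported so that it is not picked up again.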

At the time of this post, OpenClinica 3.0 is released as beta3, but the production-ready application is soon to follow. The application is passing through the highly rigorous strictures of our quality system (think Navy SEALs training for software) and the output will be fully validated and ready for use in roughly a month. Needless to say, I and everyone else here at Akaza are very excited to be so close to releasing 3.0. It is already quite clear that this release will have a momentous, positive impact on the community.

Selling open source without mentioning open source

I am a regular reader of “The Open Road” blog by Matt Asay. In one of his latest posts, “Getting open-source criticism wrong”, he does a great job of making the case that commercial open source software is about ease of adoption, flexibility, and choice.

It struck a chord because my sales team and I spend a great amount of time and effort explaining to prospective customers that we offer the same level of quality, stability, performance, service, and support as a proprietary vendor. In many cases we must meet a higher threshold than those vendors, because we do not have the lock-in of a commercial software license to compel customers to come back to us for repeat business. Our track record of successful long-term customer relationships is evidence we meet this threshold.

In certain sales situations, for the sake of simplicity and clarity, we have to focus only on these apples-to-apples characteristics, and do not have the opportunity to educate on the economic and technical advantages of OSS as much as we would like. It’s great to know that our open source clinical data management software technology and service offerings can stand successfully on these merits. However, as many readers of this blog already know, open source offers an additional set of critical benefits: “the ability to adopt software rapidly and at low cost, the flexibility to develop and extend their systems as they choose, and the ability to reduce risk by obtaining paid commercial-grade [or better] support”. As more decision makers are coming to understand, it is following this path, rather than the adoption of pricey, monolithic proprietary software, that leads to better outcomes and greater ROI.

Facilitated Data Entry of Lab Results Using OpenClinica’s New Web Services Feature

As mentioned previously, we at Geneuity Clinical Research Services are big fans of OpenClinica and are even more so now with the upcoming release of version 3.0 with its new web services capability.  This article describes how we exploit this new feature to help automate entry of lab results, a particularly important topic given that we do lots of batch testing of specimens and oftentimes test the same specimen for many different analytes.

Prior to 3.0, you had three options when it came to CRF data entry.  The first was to log into OpenClinica’s web interface and manually enter your data.  This was no problem so long as you didn’t have lots and lots of data.  But we did.

Alternatively, you could upload a flat file of your data as long as it was formatted in XML and associated with the appropriate subject id’s and visit descriptions.  Assembling this file wasn’t trivial though and manually looking up each specimen’s subject and event nearly defeated the purpose of the procedure, which was to save time and effort.

Finally, you could do what we did: write custom code to automate the job. Lab data is amenable to this sort of approach because it is always tagged with something called an accession number that uniquely identifies it. When designing CRFs, we always make sure to include a field for the event’s accession number, and when a specimen first arrives through our door the first thing we do is log into OpenClinica and enter the specimen’s accession number in the appropriate event’s CRF. Because the number is unique to the study, this entry effectively tags the event and provides a ‘hook’ inside the database so that the event_crf_id of any data item subsequently annotated with the accession number can be easily looked up using a database query like so: ‘SELECT event_crf_id FROM item_data WHERE value = '<accession_number>'’. This, in turn, gives you the requisite information to insert the lab data thusly: ‘INSERT INTO item_data VALUES ('event_crf_id', 'value' …)’, provided you also know the item_id.
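
The lookup-then-insert pattern can be sketched in Python as follows. This is only an illustration: it uses an in-memory SQLite table with three columns as a stand-in for the real item_data table (which has more columns and lives in Postgres or Oracle), and the function name insert_lab_value, the sample accession number, and the ids are all made up for the example.

```python
import sqlite3


def insert_lab_value(conn, accession_number, item_id, value):
    """Look up the event CRF tagged with accession_number and attach one lab value to it."""
    rows = conn.execute(
        "SELECT DISTINCT event_crf_id FROM item_data WHERE value = ?",
        (accession_number,),
    ).fetchall()
    # The approach relies on the accession number being unique within the study.
    if len(rows) != 1:
        raise LookupError(
            f"expected exactly one event CRF for {accession_number!r}, found {len(rows)}"
        )
    event_crf_id = rows[0][0]
    conn.execute(
        "INSERT INTO item_data (event_crf_id, item_id, value) VALUES (?, ?, ?)",
        (event_crf_id, item_id, value),
    )
    return event_crf_id


# Simplified stand-in table: only the three columns the lookup needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item_data (event_crf_id INTEGER, item_id INTEGER, value TEXT)")
# Item 1 plays the role of the accession-number field, entered when the specimen arrives.
conn.execute("INSERT INTO item_data VALUES (42, 1, 'ACC-0001')")
# Later, a lab result for item 7 arrives tagged with the same accession number.
insert_lab_value(conn, "ACC-0001", item_id=7, value="5.3")
```

Using query parameters (the ? placeholders) rather than pasting values into the SQL string avoids quoting problems when a value itself contains a quote character.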

To implement this strategy, we wrote custom servlets that operated within the context of our OpenClinica installations.  More recently, we configured MirthConnect channels to do the same.   They worked well and data entry was greatly expedited, but the coding was complex and had to be refactored over and over again for each study and for every CRF change.  While helpful, this strategy wasn’t sustainable in the long run.

Luckily, the latest version of OpenClinica provides a way out.  It incorporates the Spring WS Framework which allows programmers to write something called a ‘web service.’  A web service digests and acts upon XML data sent to it on an on-demand basis over a network.  The source need not be a human being uploading data on a web form, but, more usefully, it can be, say, a clinical testing platform automatically spitting out HL7 messages.  This, of course, is ideal in our case.  So we wrote a web service called ‘EventDataInsert’ that parses XML containing lab data values annotated with accession numbers and item names, looks up the corresponding event_crf_id’s and item_id’s, and inserts the data into item_data accordingly.  The service is generic enough so that it doesn’t have to be refactored for each and every study, but it does make some critical assumptions.  Namely, it assumes that both accession numbers and item names are unique.  So care has to be taken to ensure both these preconditions are met.
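
The parse-and-resolve step might look something like the sketch below. It is not the EventDataInsert source code: the XML layout (labResults, specimen, item), the function names, and the use of plain dictionaries in place of database lookups are all assumptions made to keep the example self-contained. What it does show is the uniqueness assumption in action: each accession number and each item name must resolve to exactly one id.

```python
import xml.etree.ElementTree as ET


def parse_lab_payload(xml_text):
    """Return a list of (accession_number, item_name, value) tuples from the payload."""
    root = ET.fromstring(xml_text)
    results = []
    for specimen in root.findall("specimen"):
        accession = specimen.get("accession")
        for item in specimen.findall("item"):
            results.append((accession, item.get("name"), (item.text or "").strip()))
    return results


def resolve(results, crf_ids_by_accession, item_ids_by_name):
    """Map each parsed result to (event_crf_id, item_id, value).

    Each lookup maps a key to the list of matching ids; a key must match
    exactly one id, mirroring the uniqueness assumption described above.
    """
    rows = []
    for accession, name, value in results:
        crf_ids = crf_ids_by_accession.get(accession, [])
        item_ids = item_ids_by_name.get(name, [])
        if len(crf_ids) != 1 or len(item_ids) != 1:
            raise ValueError(f"ambiguous or unknown accession/item: {accession!r}, {name!r}")
        rows.append((crf_ids[0], item_ids[0], value))
    return rows


payload = (
    '<labResults><specimen accession="ACC-0001">'
    '<item name="GLUCOSE">5.3</item><item name="HBA1C"> 6.1 </item>'
    "</specimen></labResults>"
)
parsed = parse_lab_payload(payload)
```

Failing loudly on an ambiguous match, rather than picking the first hit, keeps a duplicated accession number or item name from silently attaching results to the wrong CRF.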

The power of EventDataInsert doesn’t just lie in the fact that it handles inserts on an unattended basis, but also in that, like most web services, it requires only simple XML as input. The latter makes the source of the data irrelevant as long as it can be correctly mapped and transformed into XML. We often use MirthConnect to do this, using its easy-to-use graphical interface to configure channels between incoming raw data and OpenClinica’s web-service interfaces.

The figure below shows a typical deployment of OpenClinica at Geneuity.  MirthConnect is used not only to get data into OpenClinica but also to generate canned PDF reports of the results.  This scenario works for us and gets easier and easier to maintain as OpenClinica evolves new electronic data capture features and makes old ones ever more robust.

Diagram of OpenClinica at Geneuity Clinical Research Services

OpenClinica 3.0 Features Preview – Part II

Welcome to Part II of the OpenClinica 3.0 features preview! I previously wrote about three of the main features for 3.0: Source Data Verification, a new User Interface for navigation in the system, and a new Home Page for each user.

This post is about three additional features: (i) the new Build Study module, (ii) setting the length and significant digits of items, and (iii) the improved performance of the Subject Matrix.

In 3.0, all the study build tools will be accessed from one main page following a task-based approach. There are five tasks available to the user at the outset. Once the user finishes these first five tasks, two more tasks will become available (see image). This allows the complete study setup, from CRFs to event definitions to sites to assignment of users, to be done from a single page. There is also a checklist that lets the user easily see how many tasks have been completed, so they know how much more configuration is needed before the study is ready to start enrolling subjects.

Build Study Page in OpenClinica

OpenClinica 3.0 also allows the creator of CRFs to set the allowable length of text fields, as well as the number of decimal places a REAL number should be rounded to. This parameter is set in the OpenClinica CRF Template in a new field called Width_Decimal. The CRF creator specifies the width and decimal for a particular field, which forces the data entry person to enter data at the intended precision. No longer will the system always round to the 4th decimal place and allow up to 255 characters in the field if the CRF creator does not want it to. For example, the creator could specify that a field should have no more than 5 digits total with a maximum of 1 decimal place by entering 5(1) in the Width_Decimal column of the OpenClinica template. If the data entry person tried to enter 3.4444 or 678913, they would be told the value is invalid.

By providing this functionality, OpenClinica will help the users get their data into SAS and SPSS more easily.
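
For readers who want to reason about what a width(decimal) setting accepts, here is a small Python sketch of the 5(1) rule from the example above. It is an illustration only, not OpenClinica’s validation code: the function names are made up, and it assumes that only digits count toward the width (not the sign or the decimal point), which may differ from how OpenClinica actually counts.

```python
import re
from decimal import Decimal, InvalidOperation


def parse_width_decimal(spec):
    """Parse a spec such as "5(1)" into (width, decimals)."""
    match = re.fullmatch(r"\s*(\d+)\((\d+)\)\s*", spec)
    if not match:
        raise ValueError(f"not a width(decimal) spec: {spec!r}")
    return int(match.group(1)), int(match.group(2))


def fits(value, spec):
    """Return True if value has at most `width` digits in total
    and at most `decimals` digits after the decimal point."""
    width, decimals = parse_width_decimal(spec)
    try:
        number = Decimal(value)
    except InvalidOperation:
        return False  # not a number at all
    if not number.is_finite():
        return False  # reject NaN and Infinity
    _sign, digits, exponent = number.as_tuple()
    decimal_places = max(0, -exponent)
    integer_digits = max(0, len(digits) + exponent)
    return decimal_places <= decimals and integer_digits + decimal_places <= width
```

Under these assumptions, fits("1234.5", "5(1)") is accepted, while fits("3.4444", "5(1)") fails on decimal places and fits("678913", "5(1)") fails on total width, matching the two invalid entries in the example.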

One of the most important and information-rich pages in OpenClinica is the Subject Matrix page, and OpenClinica 3.0 provides significant performance enhancements on this page for studies with large numbers of subjects. From the Subject Matrix page users can see a snapshot of where the subjects are in the study, schedule a new event, view a subject record, view a subject event, enter data in a CRF, and sign a subject’s record without having to navigate to different pages in the system. A number of users were reporting sluggish performance with the Subject Matrix when they had 5000 or more subjects enrolled in a study.

OpenClinica 3.0 utilizes a new table structure that allows users to load the Subject Matrix containing over 10,000 subjects and 15 event definitions in fewer than 5 seconds (this process could take upwards of a minute in previous releases of OpenClinica).

OpenClinica 3.0 Features Preview – Part I

We have been working hard on OpenClinica 3.0 for the last 9 months and are getting closer and closer to a production release ready for use in live clinical studies. In the meantime, I wanted to talk about some of the new features over the next few weeks to let folks know what is coming.

OpenClinica 3.0 is sure to bring a lot of excitement to all users of the rapidly growing open source electronic data capture system. A lot of focus in this release has been put on the way trial sponsors use an EDC system and I’d like to point out some of the new features that should enhance their user experiences.

OpenClinica 3.0 will provide a new home page to study-level users providing key information about the progress of a study. These users will be able to see a summary of the subjects enrolled at each site compared to their expected total enrollment as well as the overall subject enrollment for the complete study. Also, these study-level users will be shown a count of the number of study events that are in a particular status. A summary for the number of subject statuses will be displayed so the study-level user can easily see how many subjects are signed, source data verified etc.

OpenClinica 3.0 will provide monitors a workspace to source data verify subjects and their data. The workspace will allow users to source data verify information collected at each visit one-by-one, or verify the information in a bulk process. These two options allow the monitors to perform remote source data verification daily for subjects in the study. Or, if the monitor has to be on site to review and verify information, he/she can go back to the hotel room and check off verification for many subjects and events at once, rather than going one-by-one through every subject and event CRF.

The top-level navigation in OpenClinica 3.0 has been streamlined so site users of the application understand exactly what they have to do after they log in. A new home page for investigators and clinical research coordinator users will show the number of queries assigned to them with a link to see every query assigned to them. The home page will show the 5 most recent queries to give the user an idea of what they need to respond to that day.

The new navigation points to the 3 main actions the site users should take. The “Subject Matrix” link will bring them to the new and improved subject matrix in OpenClinica. This matrix will allow users to easily add subjects, schedule events and even enter data from a single, powerful screen. The “Add Subject” link will bring them to a page where they can add a new subject to the study. “Notes & Discrepancies” will bring them to a page where they can see all the queries for their site and allow them to provide a response.

– Paul Galvin