Thoughts on Code: OpenClinica and Open Standards with CDISC

One of the strengths of open source is the ability to open up the code base and learn by reading and doing, that is, the transparency of the code base allows everyone to get involved. However, the barrier to entry can be the complexity of the code itself; without a qualified guide, you can get ‘lost in the code jungle’ pretty quickly.

Welcome to our code

With that in mind, we are starting today to author blog posts about the OpenClinica code base, including topics like how the code is organized, what the code does, and so on. A lot more detail on this can be found on the OpenClinica Developer Wiki, but these posts, viewed as a whole, can be seen as a gentle introduction, before interested parties start to dive deeper.

When we began to design OpenClinica, we had very few requirements, but the desire  to create a fully-featured database for clinical data, aligned with open standards, making use of the best technology available. Call it the ‘tyranny of the blank page’, if you will. Every start-up faces it. Where do you start? What’s the plan? How do you build it, and what do you build first?

Luckily for us, we could use an open standard to base our schema, and our code, on top of; the CDISC ODM.

What’s a CDISC ODM?

The Operational Data Model, or ODM for short, is a standard published by the Clinical Data Interchange Standards Consortium (CDISC), and is “designed to facilitate the archive and interchange of the metadata and data for clinical research”, as it states in their website. This is a standard which is designed to a) hold metadata about a Study and all Events contained within a given Study, and b) hold Clinical Data which has been collected for a given Study. All of this information is held in XML, which is a very useful format for exchanging between sites, labs and institutions.

Figure 1: Study Metadata and OpenClinica

In the above image, you can see an XML file on one side using CDISC ODM and on the other side, an OpenClinica database. Inside the database are tables that map directly to different objects described in the XML. You’ll notice that the tables associated with study metadata also have a column called ‘oc_oid’, which are the Object Identifiers we use in all aspects of the OpenClinica application.

Figure 2: ODM Clinical Data and OpenClinica

In the second image, you see that the latter half of the XML file (the part  contained in the <ClinicalData> tags) also links to specific tables in the OpenClinica database. Since we link back to the Study metadata through those OIDs, we don’t use OIDs in those tables, but instead the conventional methods of primary keys and foreign keys in the database is good enough.

OK, so they map. But where’s the beef?

Of course, the ODM XML in the images is rather simple, and does not capture the full capability of the metadata that can be passed back and forth between different ODM data sources. For a longer example, you can take a look at the following XML, which defines the Rules governing a single Item:

Sample ItemDef in CDISC ODM XML

As you start to piece together the XML in the above example, you’ll see that not only can you define the Question in multiple languages, but you can specify which measurement it is using and what kinds of values you can accept.  The XML standard is extensible enough to add other pieces of information as well, including coded lists, data types, and so on.  More information can be found at XML4Pharma’s page entitled, ‘Using CDISC-ODM in EDC.’

In future posts, we hope to describe more about the code base, and show how it all comes together as a full-featured application. If there are topics that are of specific interest, we hope you’ll comment below and let us know what you’d like to see here in the coming months.