OpenClinica to present at DIA China Annual Meeting

There are numerous opportunities to learn about OpenClinica at the upcoming 5th DIA China Annual Meeting <http://www.diahome.org/en/Meetings-and-Training/Find-Meetings-and-Training/Meeting-Details.aspx?ProductID=31144> . The conference, held at the Beijing International Convention Center, runs from May 12-15. The event is expected to attract over 1,200 participants and is the largest annual meeting held by DIA in the Asia and Pacific region.

OpenClinica representatives will be hosting an exhibit as well delivering three poster presentations and one lecture:

  • CDISC Standards to Enhance Clinical Trial Efficiency
  • The Importance of Validated EDC System to Ensure Quality of Data
  • Electronic Data Capture (EDC): Providing A Significant Cost-Efficient Platform In Current Clinical Research Management
  • The Adoption of Electronic Data Capture (EDC) for Clinical Trial Data Management in China – The Case of ATHENA Healthcare Consultancy

Be sure to stop by the OpenClinica booth A18 to learn about new developments with the leading open source clinical trial software.

Coming up soon, OpenClinica will be running a public, open enrollment training class in Shanghai (June 4-7). More information <https://www.openclinica.com/open-enrollment-training> .

The OpenClinica Platform – Developer Round Table Discussions

OpenClinica is a clinical trial software platform that aims to provide data capture, data management, and operations management functionality to support human subjects-based research. It can be used for traditional clinical trials as well as a wide variety of other types of human subjects-based research.

Our vision for the product is to provide data capture, data management, and operations management functionality out-of-the-box, in an easily configurable, usable, and highly reliable manner. The underlying platform should be interoperable, modular, extensible, and familiar – so users can solve specific problems, in a generalizable way.

This past spring, the development team here at OpenClinica, LLC held a series of round table discussions about how this vision is reflected in the product. Our goals were to learn critical standards and information models needed for our technology to truly reflect this vision, to develop a consistent, shared vocabulary for the problem domain and the OpenClinica technology, and identify the most urgent opportunities to put these lessons into practice in the product and the community. In particular, we spent a lot of the time in these discussions about how OpenClinica’s use of the CDISC Operational Data Model helps enable this vision.

The discussions were invigorating and thought-provoking. We’ve recorded them to share with the greater community of OpenClinica developers, integrators, and others who want to better understand how the technology works, the design philosophy behind it, and where we’re going in the future. The videos are embedded below.

But before getting to the videos, here’s a bit more background on how we think about OpenClinica as a product and a platform.

First, OpenClinica functionality should be ready out-of-the-box, easily configurable and highly usable. Some of the most important features include:

  • Data definition and instrument/form design with no or minimal programming
  • Sophisticated data structures such as repeating items and item groups
  • Support for a wide variety of response types and data types (single select, multiple choice, free text, image)
  • Data management and review capabilities (including discrepancy management and clinical monitoring) with flexible workflows
  • ALCOA-compliant controls and audit history over all data and metadata, including electronic signature capabilities
  • Patient visit calendar design with management of multiple patient encounters and multi-form support
  • Reporting and data extract to a wide variety of formats (tab, SPSS, CDISC ODM)
  • Ability to combine electronic patient reported outcome (ePRO) data with clinically reported data using common form definitions (write once, run anywhere)
  • Deployment via multiple media, mobile or standard web browser

Many of these things have already been implemented, and more are under development.

The core concept around which OpenClinica is organized is the electronic case report form (CRF). In OpenClinica, a CRF is a resource that is essentially a bunch of metadata modeled in CDISC ODM with OpenClinica extensions. It doesn’t (necessarily) have to correspond to a physical or virtual form; it may represent a lab data set or something similar. An OpenClinica Event CRF is that same bunch of metadata populated with actual data about a particular study participant. Thus, it combines the metadata with the corresponding item (field) data, with references to the study subject, event definition, CRF version, and event ordinal that it pertains to. In this conceptual view of the world, CRFs (as well as CRF items, studies, study events, etc.) are resources with core, intrinsic properties and then some other metadata that has to do with how they are presented in a particular representation. Built around these core resources are all the workflow, reporting, API, security, and other mechanisms that allow OpenClinica to actually save you time and increase accuracy in your research.

Second, OpenClinica should be interoperable. The ultimate measure of interoperability is having shared, machine readable study protocol definitions, and robust, real-time, ALCOA-compliant exchange of clinical data and metadata that aligns with user’s business processes. It should be easy to plug in and pull out or mix-and-match different features, such as forms, rules, study definitions, report/export formats, and modules, to transport them across OpenClinica instances or interact with other applications. Establishing well defined methods and approaches for integration into existing health data environments is a key goal of interoperability.

Third, OpenClinica should be modular and extensible. OpenClinica already provides some of the most common data capture and data management components and aims to have a very broad selection of input types, rules, reports, data extracts, and workflows. However OpenClinica developers should also have the freedom to come up with their own twist on a workflow, module, or data review workflow and have it work easily and relatively seamlessly with the rest of OpenClinica. User identification, authentication, and authorization should be highly configurable and support commonly used general purpose technologies for user credentialing and single-sign-on (such as LDAP & OAuth).

The CRF-centric model allows us a great deal of flexibility and extensibility. We can support multiple modalities, with different representation metadata for rendering the same form, or perhaps the shared representation metadata but applied in a different way (i.e. web browser vs. mobile vs. import job). We can address any part of the CRF in an atomic, computable manner. This approach has been successfully applied in the Rule Designer, which takes the ODM study metadata and allows browsing of the study CRFs and items, with the ability to drag and drop those resources to form rule expressions. Features such as rules and report/export formats are represented as XML documents. These documents define how the features behave in standardized ways so that one rule can, say, be easily replaced with another rule without having to modify all the code that makes use of the rule.

Finally, OpenClinica aims to be familiar in the sense of allowing data managers, developers, statisticians to work in a design/configuration/programming environment that they already know. Programmers don’t all have the same experience, and it would be somewhat limiting to force OpenClinica developers to all use the same language (Java) that OpenClinica was written in. We are constantly looking at ways to make it possible (not to mention reliable and easy!) for users and developers to interact with and extend OpenClinica in a programmatic way. This can mean anything from data loading to more meaningful integrations of applications common to the clinical research environment. As proponents of open, standards-based interoperability, our starting point is always to develop interfaces for these interactions based on the most successful, open, and proven methods in the history of technology – namely the protocols that power the World Wide Web (such as HTTP, SSL, XML, OAuth 2.0). They are relatively simple, extensively documented, widely understood, and well-supported out of the box in a large number of programming and IT environments. On top of this foundation, we rely heavily on the wonderful work of CDISC and the CDISC ODM to model and represent the clinical research protocol and clinical data.

Session 1:  from 30-March-2012 (start at the 5 min 20 sec mark)

Session 2:  from 06-April-2012 (start at the 1 min 25 sec mark)

Session 2a:  from 20-April-2012

Session 3:  from 27-April-2012

Session 4:  from 11-May-2012

OpenClinica at Society for Clinical Trials Annual Meeting (May 20-23, Miami)

OpenClinica will be both exhibiting and presenting at this year’s Society for Clinical Trials Annual Meeting, May 20-23 in Miami. Come by the OpenClinica booth to introduce yourself and learn about the many exciting things happening within the OpenClinica software and community! On Wednesday, May 23, OpenClinica’s, CEO, Cal Collins, will lead a presentation showcasing an innovative new OpenClinica mobile data capture technology. The session is titled “Electronic Patient Reported Data for Risk Screening in Primary Care Clinics using OpenClinica and CDISC ODM.”

eClinical Integration

Increasingly I am seeing real momentum for reducing the costs and barriers to integration of eclinical applications and data in a way that benefits users.

A great example is a recent LinkedIn discussion (you may need to join the group to read it).  Several software vendors and industry experts engaged in a dialogue about the pros and cons of different integration approaches. There is an emerging consensus that integration approaches should adopt open, web standards and harnesses the elegance and flexibility of the CDISC Operational Data Model. This consensus may signal a sea change in attitudes to standards-based integration that makes it the norm rather than the exception.

This is not new to members of the OpenClinica community. Over the years we’ve had many examples of such integration efforts described on this blog and at OpenClinica conferences. To make such efforts more powerful, reusable, and robust, the OpenClinica team has invested a great deal over the past year to create a meaningful, CDISC ODM-based model for interacting with OpenClinica. We have incorporated open web standards (RESTful APIs for transport and OAuth for security) to make the interfaces easily accessible with commonly used software tools.  This is part of a newly published resource for OpenClinica development and integration, the OpenClinica 3.1 Technical Documentation Guide. The first version of the specification can be viewed at https://docs.openclinica.com/3.1/technical-documents/rest-api-specifications. I’ve reproduced the introduction here:

Overview

We are constantly looking at ways to make it possible (not to mention reliable and easy!) for users and developers to interact with and extend OpenClinica in a programmatic way. This can mean anything from data loading to more meaningful integrations of applications common to the clinical research environment.

As proponents of open, standards-based interoperability here at OpenClinica, our starting point is always to develop interfaces for these interactions based on the most successful, open, and proven methods in the history of technology – namely the protocols that power the World Wide Web (such as HTTP, SSL, XML, OAuth 2.0). They are relatively simple, extensively documented, widely understood, and well-supported out of the box in a large number of programming and IT environments. On top of this foundation, we rely heavily on the wonderful work of CDISC and the CDISC ODM to model and represent the clinical research protocol and clinical data.

This chapter describes a CDISC ODM-based way to interact with OpenClinica using RESTful APIs and OAuth. The REST web services API relies on HTTP, SSL, XML, OAuth 2.0. This architecture makes the ODM study protocol representation for an OpenClinica study available and supports other interactions for study design.

Why REST?

The OpenClinica RESTful architecture was developed to (initially) support one particular use case, but with the intention of becoming more broadly applicable over time. This use case is based on a frequent request of end users: for OpenClinica to support a visual method for designing, editing, and testing “rules” which define edit checks, email notifications, skip pattern definitions, and the like to be used in OpenClinica CRFs. Users have had to learn how to write rules in XML, which can be confusing and have a big learning curve for non-technical individuals. The OpenClinica Rule Designer is an application that allows end users to build cross field edit checks and dynamics within a GUI based application. It is centrally hosted Software as a Service (SaaS) based application available for OpenClinica Enterprise customers at https://designer.openclinica.com.

To support interaction of the centrally hosted rule designer with any instance of OpenClinica Enterprise installed anywhere in the world, we needed to implement a secure protocol and set of API methods to allow exchange of study information between the two systems, and do so in a way where the user experience was as integrated as if these applications were part of the same integrated code base. In doing so, and by adopting the aforementioned web and clinical standards to achieve this, we have built an architecture that can be extended and adapted for a much more diverse set of uses.

This chapter specifies how 3rd party applications can interact with an OpenClinica instance via the REST API and OAuth security, and details the currently supported REST API methods. The currently supported API methods are not comprehensive, and you may get better coverage from our SOAP API. However the OpenClinica team is continuing to expand this API and since it is open source anyone may extend it to add new methods to meet their own purposes. If you do use the API in a meaningful way or if you extend the API with new methods, please let others know on the OpenClinica developers list (developers@openclinica.org), and submit your contributions for inclusion back into the codebase – you’ll get better support, increased QA, and compatibility with future OpenClinica releases.

RESTful Representation, based on ODM

“REST”, an acronym for REpresentational State Transfer, describes an architectural style that allows definition and addressing of resources in a stateless manner, primarily through the use of Uniform Resource Identifiers (URIs) and HTTP.

From Wikipedia: A RESTful web service (also called a RESTful web API) is a simple web service implemented using HTTP and the principles of REST. It is a collection of resources, with three defined aspects:

  • the base URI for the web service, such as http://example.com/resources/
  • the Internet media type of the data supported by the web service. This is often JSON, XML or YAML but can be any other valid Internet media type.
  • the set of operations supported by the web service using HTTP methods (e.g., POST, GET, PUT or DELETE).

REST is also a way of looking at the world, as eloquently articulated by Ryan Tomayko.

In the context of REST for clinical research using OpenClinica, we can conceptually think of an electronic case report form (CRF) as a resource that is essentially a bunch of metadata modeled in CDISC ODM with OpenClinica extensions:

  • Some of this metadata (data type, item name, response set, etc) is intrinsic metadata – i.e. tied to the definition of the CRF and its items and mostly unchangeable after it is initially defined.
  • Some of this metadata is representation metadata and used only when the CRF is represented as a web-based HTML form (in the OpenClinica database schema we call this form_metadata, but it also can include other things like CRF version information and rules).

An OpenClinica Event CRF is that same bunch of metadata with the corresponding item data, plus references to the study subject, event definition, CRF version, event ordinal, etc that it pertains to.

  • The notion of a CRF version pertains to the representation of the CRF. It is not intrinsic to the event CRF (this is debatable but it is how OpenClinica models CRFs). Theoretically you should be able to address and view any Event CRF in any available version of the CRF (i.e. http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/edit and http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v2/edit both show you the same data represented in different versions of the CRF). Of course the audit history needs to clearly show which version/representation of the CRF was used for key events such as data capture, signature, etc.
  • Rules are also part of the representation metadata as opposed to intrinsic metadata, even though you don’t need to specify them on a version-by-version basis.
  • Anything attached to the actual event CRF object or its item data – discrepancy notes, audit trails, signatures, SDV performance, etc is part of that event data and should be addressable in the same manner (e.g. http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/GROUPOID/ORDINAL/ITEM…)

In this conceptual view of the world, CRFs (as well as CRF items, studies, study events, etc.) are RESTful resources with core, intrinsic properties and then some other metadata that has to do with how they are presented in a particular representation. We now have a model that allows us a great deal of flexibility and adaptability. We can support multiple modalities, with different representation metadata for rendering the same form, or perhaps the shared representation metadata but applied in a different way. We can address any part of the CRF in an atomic manner. This approach has been successfully applied in the Rule Designer, which takes the ODM study metadata and allows browse of the study CRFs and items, with the ability to drag and drop those resources into rule expressions. Here are some examples of additional future capabilities that could be easily realized on top of this architecture:

  • Multiple data entry modalities – a user may need to deploy patient based data entry via web, a tablet, a thick client, or even paper/OCR, each with a very different presentation. Each of these may be part of OpenClinica-web or a separate application altogether, but all will rely on the same resource metadata to represent the CRF (according to the UI + logic appropriate for that modality), and use the same REST-based URL and method for submitting/validating the data.
  • Apply a custom view (an XSL or HTML/CSS) to a patient event CRF or full casebook – some uses of this could be to represent as a PDF casebook, show with all audit trails/DNs embedded in line with the CRF data, show a listing of data for that subject, provide (via an XSL mapping) as an XForm or HL7 CCD document for use by another application) – http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/view?renderer=somemap…
  • The same path used in the URLs, eg http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/GROUPOID/ORDINAL/ITEMOID could be used as the basis for XPath expressions operating on ODM XML representations of CRFs and of event CRF data
  • Internationalization – OpenClinica ought to allow our CRF representation metadata to have an additional sub-layer to render the form in different languages, and then automatically show the appropriate language based on client/server HTTP negotiation (like we do with the rest of the app). Currently internationalization of CRFs requires versioning the CRF.
  • View CRF & Print CRF – use the same representation metadata (form metadata) but apply slightly different rules on how the presentation works (text values instead of form fields, no buttons, turn drop down lists into text values)
  • Discrepancy manager popup – one requested use case would allow a user to update a single event CRF item data value directly from the discrepancy note UI point of view. In this case you could think of just updating that one item as addressing the resource http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/GROUPOID/ORDINAL/ITEM…. In this model, whatever rules and presentation metadata need to get applied at presentation and save time happen automatically.
  • Import of CDISC ODM XML files – imported data would be processed through the same model, but only use the metadata that’s relevant to the data import modality. Same for data coming in as raw ODM XML via a REST web service. A lot of times the import only populates one part of a CRF and the other parts are expected to be finished via data entry. This model would help us manage that process better that the current implementation of ODM data import.

There are many considerations related to user roles and permissions, workflows, and event CRF/item data status attributes that need to be overlaid on top of this REST model, but the model itself is a conceptually useful way to think about clinical trials and the information represented therein. When implemented using CDISC ODM XML syntax it becomes even more powerful. As widespread support for ODM becomes the norm, the barriers to true interoperability – shared, machine readable study protocol definitions, and robust, real-time, ALCOA-compliant exchange of clinical data and metadata that aligns with user’s business processes – get eviscerated.

* This chapter frequently refers to ODM-based representations of study metadata and clinical data in OpenClinica. We strive as much as possible to implement ODM-based representations of OpenClinica metadata and data according to the generic ODM specifications (currently using ODM version 1.3). However, to ensure our representations support the full richness of information used in OpenClinica we often have to rely on ODM’s vendor extensions capability. We have not always made distinctions in this chapter as to where we are using ‘generic’ ODM versus OpenClinica extensions, but that is documented here. It is our goal as ODM matures and supports richer representations of study information to migrate our extensions back into the generic ODM formats.

** Also note the RESTful URL patterns referred to above are conceptual. Refer to the technical subchapters of this REST API specification for the actual URLs.

The spec (like much of the code that implements it) is open source. I’m looking forward to hearing comments and feedback, and sharing thoughts on how we can encourage broader adoption across different types of eclinical applications.

Thoughts on Code: OpenClinica and Open Standards with CDISC

One of the strengths of open source is the ability to open up the code base and learn by reading and doing, that is, the transparency of the code base allows everyone to get involved. However, the barrier to entry can be the complexity of the code itself; without a qualified guide, you can get ‘lost in the code jungle’ pretty quickly.

Welcome to our code

With that in mind, we are starting today to author blog posts about the OpenClinica code base, including topics like how the code is organized, what the code does, and so on. A lot more detail on this can be found on the OpenClinica Developer Wiki, but these posts, viewed as a whole, can be seen as a gentle introduction, before interested parties start to dive deeper.

When we began to design OpenClinica, we had very few requirements, but the desire  to create a fully-featured database for clinical data, aligned with open standards, making use of the best technology available. Call it the ‘tyranny of the blank page’, if you will. Every start-up faces it. Where do you start? What’s the plan? How do you build it, and what do you build first?

Luckily for us, we could use an open standard to base our schema, and our code, on top of; the CDISC ODM.

What’s a CDISC ODM?

The Operational Data Model, or ODM for short, is a standard published by the Clinical Data Interchange Standards Consortium (CDISC), and is “designed to facilitate the archive and interchange of the metadata and data for clinical research”, as it states in their website. This is a standard which is designed to a) hold metadata about a Study and all Events contained within a given Study, and b) hold Clinical Data which has been collected for a given Study. All of this information is held in XML, which is a very useful format for exchanging between sites, labs and institutions.

Figure 1: Study Metadata and OpenClinica

In the above image, you can see an XML file on one side using CDISC ODM and on the other side, an OpenClinica database. Inside the database are tables that map directly to different objects described in the XML. You’ll notice that the tables associated with study metadata also have a column called ‘oc_oid’, which are the Object Identifiers we use in all aspects of the OpenClinica application.

Figure 2: ODM Clinical Data and OpenClinica

In the second image, you see that the latter half of the XML file (the part  contained in the <ClinicalData> tags) also links to specific tables in the OpenClinica database. Since we link back to the Study metadata through those OIDs, we don’t use OIDs in those tables, but instead the conventional methods of primary keys and foreign keys in the database is good enough.

OK, so they map. But where’s the beef?

Of course, the ODM XML in the images is rather simple, and does not capture the full capability of the metadata that can be passed back and forth between different ODM data sources. For a longer example, you can take a look at the following XML, which defines the Rules governing a single Item:

Sample ItemDef in CDISC ODM XML

As you start to piece together the XML in the above example, you’ll see that not only can you define the Question in multiple languages, but you can specify which measurement it is using and what kinds of values you can accept.  The XML standard is extensible enough to add other pieces of information as well, including coded lists, data types, and so on.  More information can be found at XML4Pharma’s page entitled, ‘Using CDISC-ODM in EDC.’

In future posts, we hope to describe more about the code base, and show how it all comes together as a full-featured application. If there are topics that are of specific interest, we hope you’ll comment below and let us know what you’d like to see here in the coming months.

OpenClinica 3.1 Release Update

We are in the final development stretch of the OpenClinica 3.1 project.  Over the course of alphas and betas some items have been inevitably been added, removed, and changed, so I thought it would be useful to provide an up-to-date list of features and bugs that are being addressed in this release.

Those of you attending the OpenClinica Global Conference May 8-9 here in Boston will get to experience numerous demonstrations of 3.1 functionality, and of course get to interact directly with members of our rockstar development team 🙂

First, here’s a quick review of some of the more popular new features:

  • Dynamic items in a CRF
    • Supports showing and hiding of items based on values provided in a separate field
    • Supports insertion of data across fields and CRFs based on values provided in a separate field.
  • Re-factoring of the Extract Data module to support custom data transformations allowing you to extract data in essentially any conceivable format.
  • Improved capabilities for sharing data entry duties among a given site’s users
  • Discrepancy Note “Days Open” filter to better monitor discrepancy aging
  • Discrepancy Note “Days Since Updated” filter to ensure productive discrepancy resolution
  • Audit log and discrepancy note data included in ODM exports
  • Improved CRF page turn time (in many cases over 10x faster)
  • CRF Header updates to provide a cleaner data entry experience
  • Ability to assign Failed Validation Checks (in addition to Queries)
  • Relative Paths for Item OIDs in Rules to enable portability and reusability of Rules
  • Discrepancy Note flag colors now appear in the CRF
  • Additional study parameter configuration options to opt out of using certain fields such as Interviewer Name, Interview Date, and Location
  • OpenClinica Datamart for ad-hoc reporting (Enterprise only)
  • GUI Based, drag-and-drop Rule Designer (Enterprise only)

More detailed descriptions of the above features can be found in the previous blog posts “OpenClinica 3.1 (project “Amethyst”) Preview” and “New Capabilities Added to OpenClinica Version 3.1.”

For those sufficiently ambitious, yes, you can see a complete list of ALL the 457 separate enhancements and bug fixes in 3.1 as of April 6, on the OpenClinica Amethyst Project Roadmap.

And finally, here’s the current release timeline:

  • Beta 4 released week of April 11th
  • Release Candidate 1 released week of April 18th
  • Production Release of OpenClinica 3.1 early May.

Many thanks to the dedicated team of folks who, for the past 19 months, have worked tirelessly to make this the most significant OpenClinica release ever.

– Paul

Plug-in Architecture for OpenClinica Data Extracts

A major part of the Akaza mission is to make OpenClinica more flexible and customizable. Having a code base that is open source is a great place to start. But not everybody wants to develop Java code to meet their own requirements. We aim wherever possible to add configuration options and easy-to-use design tools within the user interface, but not all problems are a good fit for that approach. The solution is a series of “plug-in” interfaces that allow users to add their own capabilities and configurations, or interact with other applications. Some of these interfaces, such as loading of spreadsheet-based CRF definitions, are a critical part of OpenClinica, without which the system would not be functional. Other interfaces include CDISC ODM data import, job scheduler for import and export, SOAP-based web services, and the HTML5 popup interface that allows 3rd party applications to enter CRF data. Along the way community members have improved these interfaces and taught us a lot about how to design them better.

OpenClinica 3.1 will include a completely rewritten version of Extract Data based around a plug-in architecture that increases flexibility and functionality. We’ve learned that user requirements for organizing, formatting, and presenting data are tremendously diverse (and often conflicting), depending on the user, the intended purpose, the study, and the organization. Our old Extract Data architecture made it difficult to add new output formats or tweak the ones already there. The new functionality provides a highly extensible, easily configurable means to get data formats that meet a user’s precise requirements. It does this by:

  • Using XSL stylesheet transformations to read native CDISC ODM XML and output the data in a transformed format.
  • Specifying available formats, their associated stylesheets, and associated properties (like filename, archival settings, and whether to compress the file) in a properties file (the extract.properties file)
  • Optionally, enabling postprocessing of the transformed data to output to certain non-text file formats and destinations

We started out with a desire to simplify the native output of the OpenClinica Extract Data java application, in a way that increased quality, stability, completeness, and performance. From now on, the OpenClinica core application will only produce CDISC ODM (version 1.3, with OpenClinica Extensions) as the natively supported format. With only one native format, we’re better able to test, document, and guarantee the output. All other output formats generated are transformations from this native ODM 1.3 w/extensions format. We made sure (via the OpenClinica vendor extensions) that we can export all possible data related to a study and its clinical data in this format. In 3.1, this also includes export of audit trail, discrepancy, and electronic signature information.

After we devised a way to improve the quality, stability, and performance of the data coming out of the core, we needed to provide a way to execute the data transformations, into any of a wide variety of outputs. It was important for us to adopt standard, widely used formats and open source technologies as the basis for these transformations. We selected the XSLT (Extensible Stylesheet Language Transformations) language because of its applicability to CDISC ODM XML, extensive features, and reasonably simple learning curve. The implementation of these transformations is powered by a widely used open source engine, the Saxon XSLT and XQuery processor. The behavior of Export Data is determined by the extract.properties configuration file and the XSL stylesheets. The extract.properties file specifies the available data formats available in the system, each with a corresponding XSL stylesheet. OpenClinica 3.1 by default includes a set of XML stylesheet transformations for commonly used formats, such as HTML, Tab-delimited Text, and SPSS. The OpenClinica Enterprise Edition will include additional new formats including SAS, annotated CRFs, printable PDF casebooks with integrated audit trail and discrepancy notes, and a SQL-based data marts with normalized CRF-based table structure for ad-hoc reporting.

At this point, we can now reproduce the extract functionality available in OpenClinica 3.0, at a higher level of quality and stability. The stylesheets replicate the HTML, SPSS, tab-delimited, and multiple CDISC XML formats that were available in 3.0, and the framework will make it much easier to add new formats. However all of these data output formats are some type of text or XML based file. Users have also voiced the need to do things that XSLT cannot do by itself, like produce PDF files or load the data into external relational databases for ad-hoc reporting. The solution was implementation of a postprocessor framework that allows more sophisticated functionality. With postprocessing we can do things like generate binary output formats or send data to a target destination. Two postprocessors are included in 3.1 by default: output to a database using JDBC connectivity and generating PDF files using XSL-FO. The postprocessing step is transparent to end-users; they simply get their files for download or alternatively receive a message that the data has been loaded into the database. And the framework exists to add additional postprocessors via the addition of Java classes with references to those class names in the extract.properties file.

Execution of data export occurs when a user or job initiates a request for data. The request includes the active study or site, the dataset id, and the requested format. The end user will notice only minor differences in how they use the Extract Module. The process of creating datasets has not changed. The Extract can be still initiated from the ‘Download Data’ screen or via a job by selecting the desired output format. At this point however, rather than waiting for the download page to load, the user will be told that their extract is in queue, and receive an email and on-screen notification when the extract is complete. Execution follows a four step process:

Step 1.   Generate native CDISC ODM XML version 1.3 with OpenClinica Extensions

Step 2.   Apply XSL transformation and generate output file according to the settings in extract.properties for the specified format

Step 3.   Optionally, if postprocessing is enabled for the requested format, run the post processing action according to the settings in extract.properties.

Step 4.   Provide user notification with success or failure message.

We’ve also improved the logging and messaging surrounding extracts, which will be crucial for anyone developing, customizing, or debugging XSL stylesheets. As always, full internationalization is supported – if you want a value to be internationalized, it should be prefaced with an & (ampersand) symbol in the extract.properties file, and the corresponding text placed in the notes.properties i18n files.

As is common with software, we didn’t get to do everything we wanted in the first release of these capabilities. Some future features include:

  • Allow extract formats to be restricted to specific users, studies/sites, and/or datasets.
  • Allow loading and validation of formats within the web UI or via web services rather than via the extract.properites config file.
  • Create an exchange for XSL formats similar to the CRF Library.

Other than that we think we’ve thought of everything :-). Have we?

– Cal Collins

New Capabilities Added to OpenClinica Version 3.1

For the past 12 months, the OpenClinica development team has been working diligently on the next major OpenClinica release. In a prior blog post (OpenClinica 3.1 (project “Amethyst”) Preview), I discussed some of the new functionality and key improvements this forthcoming release will include. Since then, the development team has added some more pieces of functionality, most notably:

  • Re-factoring of Extract Data to enable custom data transformations
  • Discrepancy Note “Days Open” filter
  • Discrepancy Note “Days Since Updated” filter
  • CRF Header updates
  • Additional study parameter configuration options
    • Interviewer Name – not used
    • Interview Date – not used
    • Location – not used
  • OpenClinica Data Mart for ad hoc reporting*
  • GUI-based Rules designer*

Custom Data Transformations

From a development perspective, perhaps the most significant piece of functionality in the above list is the complete re-factoring of the Extract Data module. In prior versions of OpenClinica, the Extract Data module would natively export CDISC ODM 1.2, CDISC ODM 1.3, both ODM 1.2 and 1.3 with OpenClinica extensions, tab-delimited text, and an SPSS file format. While these formats will continue to be available in 3.1, OpenClinica will have the ability to process (via XSL style sheets) any CDISC ODM 1.3 data set into any other data format.

This new XSL method of extracting data from OpenClinica opens up a new world of formats in which data may be obtained from OpenClinica. To create a new extract format, all one has to do is configure a new XSL style sheet. Organizations can write their own XSL style sheets then configure a file called extract.properties that allows OpenClinica’s XSL transformation engine to convert the data in the ODM XML file into the organization’s desired format.

GUI-based Rules Designer*

The GUI-based Rules designer is a project currently underway that allows users to build cross-field/cross-form multivariate edit checks without having to interface directly with XML. Its intuitive, drag and drop interface hides much of the current complexity in creating and managing Rules, while retaining the system’s robust capabilities in this area. Users will be able to work directly with the Item Names as defined in a CRFs definition, eliminating the need to keep track of OIDs.

OpenClinica Data Mart*

The OpenClinica Data Mart presents clinical data from OpenClinica studies in a readily-accessible relational database for ad hoc reporting and analysis. The Data Mart enables data managers, project managers, and researchers to create reports and mine data using standard business intelligence software or SQL reporting tools

Beta Timeframe

We currently expect the first Beta of 3.1 to be available towards the end of November. This Beta will be feature complete, and blocking issues that prevent the testing of new features will have been resolved. Once the first Beta is released, we will begin the 3.1 Pilot Program. We currently have 10 organizations participating in the Pilot Program, but if your organization would like to volunteer as well please contact me.

Please be sure to check back the OpenClinica Blog often as I post future updates on the 3.1 timelines.

– Paul Galvin, OpenClinica Project Manager

_________________

*Both the GUI-based Rules designer and Data Mart will be Enterprise only applications and will not be part of the publicly released 3.1 Beta.

OpenClinica 3.0 Features Preview: Part III

Welcome to the 3rd and final installment of the OpenClinica 3.0 features preview!  This post covers the new Web Services interface that is part of 3.0 and the job scheduler that can be used to automate Data Import and Data Export jobs.

OpenClinica 3.0 allows for programmatic interaction with external applications to reduce manual data entry and facilitate real-time data interchange with other systems.  The OpenClinica web services interface uses a SOAP-based API to allow the registering of a subject and scheduling of an event for a study subject.

OpenClinica provides a WSDL (Web Service Definition Language) that defines a structured format which allows OpenClinica to accept “messages” from an external system. For example, an EHR system could register subjects for a study in OpenClinica without direct human intervention. At the same time, the EHR could also be programmatically scheduling study events for these subjects. More information about the OpenClinica API can be found on the OpenClinica developer wiki.

An early reference implementation conducted by clinical lab Geneuity used the API to create a web service which inserts data programmatically into OpenClinica CRFs directly from laboratory devices. See the post by Geneuity’s Colton Smith below.

Another major productivity tool in 3.0 is the introduction of a Job Scheduler for automating bulk data import and export.  With this feature users can define a job that will generate an export at a specified time interval.  The Jobs Scheduler can also be configured to regularly scan a specific location for CDISC ODM files and run data imports when a new file is available. This feature can be particularly helpful in automating routing functions, such as the incorporation of lab data into OpenClinica from an external system.  The lab data does need to be in a valid CDISC ODM format (this can be accomplished via another great open source tool called Mirth), but it does save a person from entering data in two applications separately.

At time of this post, OpenClinica 3.0 is currently released as a beta3, but the production ready application is soon to follow. The application is passing through the highly rigorous strictures of our quality system (think Navy Seals training for software) and the output will be fully validated and ready of use in roughly a month. Needless to say, I, and everyone else here at Akaza is very excited to be so close to releasing 3.0. It is already quite clear that this release will have a momentous, positive impact on the community.

An Opportunity for Transformational Change in Clinical Trials

Life sciences research is recognized as one of the most technologically advanced, groundbreaking endeavors of modern times. Nevertheless until very recently the preferred technology for executing the most critical, costly stage of the R&D process – clinical trials – has been paper forms. Only in 2008 did adoption of electronic alternatives to paper forms take place in more than half of new trials. This recent uptick in adoption rates is encouraging, but further transformational change in the industry is necessary to fully realize the promise of Electronic Data Capture (EDC) and associated “eClinical” technologies. Two developments that could provide the framework for such change are adoption of open data standards and the use of Open Source Software.

Data standards provide uniform ways to represent information or processes within a specific frame of reference and according to a detailed specification. A standard is “open” when it is not encumbered by patent, cost, or usage restrictions. Open Source Software (OSS) is defined loosely as software that allows programmers to openly read, redistribute, and modify the source code of that software. The combination of OSS and open standards is a proven way to deliver improved flexibility, quality, and efficiency.

A community-driven open source offering that harnesses open standards can produce robust, innovative technology solutions for use in regulated clinical trial environments. Most Open Source Software is built using a collaborative development model. The OSS development and licensing model encourages experimentation, reduces ‘reinvention of the wheel’, and allows otherwise unaffiliated parties to build on the work of others. The result is that OSS can become a key driver of increased IT efficiency and a way to wring out unnecessary costs. In many cases, users can have the best of all worlds: the ability to adopt software rapidly and at low cost, the flexibility to develop and extend their systems as they choose, and the ability to reduce risk by obtaining paid commercial-grade support.

As clinical research struggles to become more automated and efficient, we need to rely on interoperable systems to meet challenges of flexibility, quality, and speed. The OSS development model also naturally leads to the adoption of well-documented, open standards. Because OSS product designers and developers tend to reuse successful components and models where available, OSS technologies are often leading implementers of standards. For example, the National Cancer Institute’s Cancer Bioinformatics Grid (caBIG) initiative is “designed to further medicine’s potential through an open source network” based on open data standards and infrastructure that support sharing of heterogeneous data. This remarkable effort aims to connect large networks of researchers in ways that enables efficient re-use of data, eliminates duplicate systems, and enables new types of translational research.

In industry-sponsored clinical trials, standards such as the CDISC Operational Data Model (ODM), Clinical Data Acquisition Standards Harmonization (CDASH), and Study Data Tabulation Model (SDTM) have gained adoption in both proprietary and OSS software platforms. In some cases, standards are mandated for regulatory submission and reporting (SDTM, clinicaltrials.gov) and obviously must be adopted. Other cases, such as use of ODM, CDASH, and general web standards such as web services and XForms tend to be adopted to the degree they have a compelling business case.

The business case for standards centers on increasing accuracy and repeatability, enabling reuse of data, and enhancing efficiency by use of a common toolset. A well-designed standard does not inhibit flexibility, but presupposes idiosyncrasies and allows extension to support ‘corner cases’. Leading industry voices share compelling arguments how to use standards such as ODM, CDASH, XForms, and Web Services to achieve these goals. Though the details are complicated, the approach offers orchestration of disparate applications and organization of metadata across multiple systems. There is change control support and a single ‘source of truth’ for each data point or study configuration parameter, so when study designs change (as they inevitably do) or a previously committed data point is rolled back, it is automatically shared and manual updates to systems are not necessary. Because the ODM, CDASH, and SDTM are used as a common “language”, the systems know the meaning and structure of data and can process transactions accordingly. Here’s a tangible example:

Lets imagine an IVR system wanted to check with an EDC system if a subject was current in a study (current meaning not dropped out, early terminated or a screen failure).  A Web Service could be offered by the EDC system to respond with a ‘True’ or ‘False’ to a call ‘IS_SUBJECT_CURRENT’ ?  Of course hand-shaking would need to occur before it hand [sic] for security and so on, but following this, the IVR system would simply need to make the call, provide a unique Subject identifier, and the EDC system web service would respond with either ‘True’ or ‘False.  With Web Services, this can potentially occur in less than a second.

Electronic Data Capture – Technology Blog, September 28, 2008

While this integration requirement could be satisfied by development of point-to-point, proprietary interfaces, this approach is brittle, costly, and does not scale well to support a third or fourth-party system participating in the transaction. It is critical that standards be open so that parties can adopt and implement them independently, and later interface their systems together when the business case calls for it. A leading industry blogger makes the case for the openness of standards within the ODM’s ‘Vendor Extension’ architecture: ”The ODM is an open standard, the spec is available for free and anyone can implement it. This encourages innovation and lowers the barriers to entry and therefore costs. Vendor Extensions are not open, the vendor is under no obligation to share them with the market and the effect is that meta-tools and inter-operability are held back.”

Having the software that implements these standards released as open source code only strengthens its benefits. Proprietary software can implement open standards, however given the proprietary vendor’s business interest to lock-in license revenue, might the vendor be tempted into tweaking or ‘extending’ the standard in a way that is encumbered to lock users into their platform? This strategy of “embrace, extend, extinguish” was made famous in the Microsoft anti-trust case of the 1990s, where it came to light that the company attempted to apply these principles to key Internet networking protocols, HTML and web browser standards, and the Java programming language. They hoped to marginalize competing platforms that did not support their “extended” versions of the standards. Thankfully, they had limited success in this effort, and the Internet has flourished into the open, constantly innovating, non-proprietary network that we know today. The eClinical technology field is at a similar crossroads. By embracing open standards, and working concertedly to provide business value in re-usable OSS technology, we can achieve a transformation in the productivity of our clinical technology investments.