Engineering OpenClinica’s Future

We recently introduced OpenClinica Participate™.

We believe all research participants (patients, clinicians, and researchers) should have technology that meets the ‘anytime, anywhere’ expectations of a mobile, smartphone-enabled world. Based on conversations with the OpenClinica community, many of you share this view. We are committed to making sure, at minimum, that OpenClinica’s patient engagement technology ‘just works’ in mobile, real-world environments. Wherever possible, we will go beyond that and work to make the participant experience engaging, fun, and inspiring.

As transformational as these patient engagement capabilities can be, what we’ve been working on is about more than that. This is about a foundation for the future of the OpenClinica project.

As I briefly pointed out in an earlier post, OpenClinica Participate forms are powered by the new enketo-express app, which is built around the widely used enketo-core form engine (both available on GitHub).

OpenClinica will soon natively support the OpenRosa API, which will let you run Enketo, ODK Collect, or any of a number of other OpenRosa-compliant data capture clients. Eventually, we envision the Enketo forms engine will replace the current CRF engine in the OpenClinica code base.

If you’re not familiar with Enketo, ODK, or OpenRosa, here’s a primer. Most important is understanding that there is a rich global ecosystem of technology, developers, and users around the OpenRosa XForms standard. The resulting solutions have been battle-tested in diverse health care and field-based data collection settings over many years. In keeping with the open source principles of flexibility and choice, aligning the OpenClinica ecosystem with this community will provide new features and options that you can use.

As my 5-year-old son has taught me when we watch Spider-Man cartoons, with great power comes great responsibility. So it is with open source software. Tapping into the richness and variety of the OpenRosa community creates new possibilities, but it can also add complexity by expanding the options you have to choose from. OpenClinica is released under an open source license so that many developers can improve, combine, and share their code in a way that enhances quality, usability, and features, and we believe that this richness will drive the next cycle of innovation.

With this goal of better encouraging code contributions, the focus of the repositories and downloads will be easy-to-use open-source libraries: building blocks for developers to create their own OpenClinica-powered apps and modules.

If you are developing on the OpenClinica code base to add features or build custom solutions, you’ll have a greater ability to mix and match just the pieces you need, and to share back your improvements in a modular fashion. It will be much easier for developers to use the libraries and share their experience and contributions back with the community. We will gladly help out if you experience issues. Our own engineers will be able to focus more of their time on improving quality, usability, and functionality, rather than on packaging, testing, and supporting so many different environments. We hope to build a strong collaboration with the Enketo and OpenRosa communities that spawns new ideas and developments.

So try it out! Check out OpenClinica Participate or get started by hooking up OpenClinica with Enketo.

And need I say, you’ll certainly be able to learn more about these OpenClinica innovations at the upcoming OC15 conference in Amsterdam, May 31-June 1.

Let’s get social about code!

The move to GitHub is a powerful one, and one that we believe will foster a strong, active community.

When we started building OpenClinica more than eight years ago, we wanted to build a community around it, and one has really emerged. I’ve personally been able to interact with hundreds of users over the years and I’ve learned a lot from them. People have been pushing the boundaries: starting their own user groups (all over the world!) and building cool tools and add-ons for OpenClinica. The only downside is that I don’t think there has been enough visibility into what other people have been building – but why? Perhaps the right tools for sharing weren’t in place.

GitHub lets you fork, pull request, and merge! Not only that, but you’ll get credit on your GitHub profile for pull requests that get merged. It’s like social networking for code contributions. We really want to lower the barriers to solid contributions. In the past two months we’ve received five pull requests; two are in the process of being merged and will be in the next release (3.4). Of course, we’re really excited about this and look forward to more contributions. We really want this process to be simple, transparent, and fun. Let us know what you think!

You can find us on GitHub here: https://github.com/OpenClinica/OpenClinica

And for more info on our road map, you can check here:

https://docs.openclinica.com/release-notes

Cheers!

Alicia

Synchronizing OpenClinica Instances: Another Option for Using OpenClinica in Disconnected Settings

While tablet software maker Mi-Co is showcasing an integration of their Mi-Forms tablet-based forms software with OpenClinica that can be used in “offline” settings, elsewhere within the OpenClinica community, Raymond Omollo and Michael Ochieng have developed a separate option for using OpenClinica in settings without internet connectivity. Their method synchronizes multiple locally deployed instances of OpenClinica with a central OpenClinica database. Michael and Raymond recently presented their work at the OC13 conference. You can access their presentation slides here to see how they address key issues such as synchronization, back-ups, encryption, and user training.

Synchronization Flow Chart

While working for the Drugs for Neglected Diseases initiative (DNDi), Michael and Raymond devised this approach for a WHO study of Buruli ulcers in West Africa (Ghana and Benin). The study, which is ongoing as of the date of this post, is a randomized controlled trial comparing the efficacy of 8 weeks of treatment with clarithromycin and rifampicin versus streptomycin and rifampicin. It involves 430 subjects across 5 sites. The participating sites have limited or unstable internet connectivity, so a solution was needed that provided timely, auditable, and quality data entry given this constraint. A positive byproduct is enhancing the capacity of these disconnected sites to utilize EDC.

As they say, necessity is the mother of invention. And open source makes it easier for people to believe that what is necessary can in fact be accomplished. Kudos to Raymond and Michael for devising a solution that works for them. Perhaps it may work for others as well.  If you’d like to access the source code and documentation for their work, you can download these from the OpenClinica Tools and Tips page (scroll to bottom).  You can reach Raymond and Michael on the OpenClinica Developers mailing list: developers@openclinica.org.

– Ben Bauman

More About DNDi

Headquartered in Geneva, DNDi is a global organization that develops safe, effective, and affordable treatments for neglected diseases. The neglected diseases that DNDi tackles afflict many of the world’s poorest people (Malaria, Leishmaniasis, Chagas disease, Sleeping Sickness, Paediatric HIV, Filaria). DNDi’s goal is to develop 11 to 13 new treatments by 2018. More at www.dndi.org.

The OpenClinica Platform – Developer Round Table Discussions

OpenClinica is a clinical trial software platform that aims to provide data capture, data management, and operations management functionality to support human subjects-based research. It can be used for traditional clinical trials as well as a wide variety of other types of human subjects-based research.

Our vision for the product is to provide data capture, data management, and operations management functionality out-of-the-box, in an easily configurable, usable, and highly reliable manner. The underlying platform should be interoperable, modular, extensible, and familiar – so users can solve specific problems, in a generalizable way.

This past spring, the development team here at OpenClinica, LLC held a series of round table discussions about how this vision is reflected in the product. Our goals were to learn the critical standards and information models needed for our technology to truly reflect this vision, to develop a consistent, shared vocabulary for the problem domain and the OpenClinica technology, and to identify the most urgent opportunities to put these lessons into practice in the product and the community. In particular, we spent much of the time in these discussions on how OpenClinica’s use of the CDISC Operational Data Model helps enable this vision.

The discussions were invigorating and thought-provoking. We’ve recorded them to share with the greater community of OpenClinica developers, integrators, and others who want to better understand how the technology works, the design philosophy behind it, and where we’re going in the future. The videos are embedded below.

But before getting to the videos, here’s a bit more background on how we think about OpenClinica as a product and a platform.

First, OpenClinica functionality should be ready out-of-the-box, easily configurable and highly usable. Some of the most important features include:

  • Data definition and instrument/form design with no or minimal programming
  • Sophisticated data structures such as repeating items and item groups
  • Support for a wide variety of response types and data types (single select, multiple choice, free text, image)
  • Data management and review capabilities (including discrepancy management and clinical monitoring) with flexible workflows
  • ALCOA-compliant controls and audit history over all data and metadata, including electronic signature capabilities
  • Patient visit calendar design with management of multiple patient encounters and multi-form support
  • Reporting and data extract to a wide variety of formats (tab, SPSS, CDISC ODM)
  • Ability to combine electronic patient reported outcome (ePRO) data with clinically reported data using common form definitions (write once, run anywhere)
  • Deployment via multiple media, mobile or standard web browser

Many of these things have already been implemented, and more are under development.

The core concept around which OpenClinica is organized is the electronic case report form (CRF). In OpenClinica, a CRF is a resource that is essentially a bunch of metadata modeled in CDISC ODM with OpenClinica extensions. It doesn’t (necessarily) have to correspond to a physical or virtual form; it may represent a lab data set or something similar. An OpenClinica Event CRF is that same bunch of metadata populated with actual data about a particular study participant. Thus, it combines the metadata with the corresponding item (field) data, with references to the study subject, event definition, CRF version, and event ordinal that it pertains to. In this conceptual view of the world, CRFs (as well as CRF items, studies, study events, etc.) are resources with core, intrinsic properties and then some other metadata that has to do with how they are presented in a particular representation. Built around these core resources are all the workflow, reporting, API, security, and other mechanisms that allow OpenClinica to actually save you time and increase accuracy in your research.
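To make the distinction between a CRF and an Event CRF concrete, here is a minimal Python sketch of the two concepts. The class names, field names, and sample OIDs are illustrative assumptions, not OpenClinica's actual schema or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ItemDefinition:
    """Intrinsic metadata for one CRF item (hypothetical field names)."""
    oid: str
    name: str
    data_type: str


@dataclass
class CRF:
    """A CRF as a bundle of metadata, with no participant data attached."""
    oid: str
    items: dict[str, ItemDefinition] = field(default_factory=dict)


@dataclass
class EventCRF:
    """The same metadata populated with data for one participant and event."""
    crf: CRF
    study_subject: str
    event_definition: str
    crf_version: str
    event_ordinal: int
    item_data: dict[str, str] = field(default_factory=dict)

    def set_value(self, item_oid: str, value: str) -> None:
        # Only items defined in the CRF metadata may carry data.
        if item_oid not in self.crf.items:
            raise KeyError(f"{item_oid} is not defined in CRF {self.crf.oid}")
        self.item_data[item_oid] = value


# Made-up example: one form definition, one populated instance of it.
vitals = CRF(oid="F_VITALS", items={
    "I_WEIGHT": ItemDefinition("I_WEIGHT", "Weight (kg)", "float"),
})
visit = EventCRF(vitals, "Subj1234", "VisitA", "v1", 1)
visit.set_value("I_WEIGHT", "72.5")
```

The Event CRF holds a reference to the CRF rather than a copy of it, which mirrors the idea that the metadata is defined once and then populated per participant and event.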

Second, OpenClinica should be interoperable. The ultimate measure of interoperability is having shared, machine-readable study protocol definitions, and robust, real-time, ALCOA-compliant exchange of clinical data and metadata that aligns with users’ business processes. It should be easy to plug in and pull out or mix-and-match different features, such as forms, rules, study definitions, report/export formats, and modules, to transport them across OpenClinica instances or interact with other applications. Establishing well-defined methods and approaches for integration into existing health data environments is a key goal of interoperability.

Third, OpenClinica should be modular and extensible. OpenClinica already provides some of the most common data capture and data management components and aims to have a very broad selection of input types, rules, reports, data extracts, and workflows. However, OpenClinica developers should also have the freedom to come up with their own twist on a module, workflow, or data review process and have it work easily and relatively seamlessly with the rest of OpenClinica. User identification, authentication, and authorization should be highly configurable and support commonly used general purpose technologies for user credentialing and single sign-on (such as LDAP & OAuth).

The CRF-centric model allows us a great deal of flexibility and extensibility. We can support multiple modalities, with different representation metadata for rendering the same form, or perhaps the shared representation metadata but applied in a different way (i.e. web browser vs. mobile vs. import job). We can address any part of the CRF in an atomic, computable manner. This approach has been successfully applied in the Rule Designer, which takes the ODM study metadata and allows browsing of the study CRFs and items, with the ability to drag and drop those resources to form rule expressions. Features such as rules and report/export formats are represented as XML documents. These documents define how the features behave in standardized ways so that one rule can, say, be easily replaced with another rule without having to modify all the code that makes use of the rule.

Finally, OpenClinica aims to be familiar in the sense of allowing data managers, developers, and statisticians to work in a design/configuration/programming environment that they already know. Programmers don’t all have the same experience, and it would be somewhat limiting to force OpenClinica developers to all use the same language (Java) that OpenClinica was written in. We are constantly looking at ways to make it possible (not to mention reliable and easy!) for users and developers to interact with and extend OpenClinica in a programmatic way. This can mean anything from data loading to more meaningful integrations of applications common to the clinical research environment. As proponents of open, standards-based interoperability, our starting point is always to develop interfaces for these interactions based on the most successful, open, and proven methods in the history of technology – namely the protocols that power the World Wide Web (such as HTTP, SSL, XML, OAuth 2.0). They are relatively simple, extensively documented, widely understood, and well-supported out of the box in a large number of programming and IT environments. On top of this foundation, we rely heavily on the wonderful work of CDISC and the CDISC ODM to model and represent the clinical research protocol and clinical data.

Session 1:  from 30-March-2012 (start at the 5 min 20 sec mark)

Session 2:  from 06-April-2012 (start at the 1 min 25 sec mark)

Session 2a:  from 20-April-2012

Session 3:  from 27-April-2012

Session 4:  from 11-May-2012

eClinical Integration

Increasingly I am seeing real momentum for reducing the costs and barriers to integration of eclinical applications and data in a way that benefits users.

A great example is a recent LinkedIn discussion (you may need to join the group to read it). Several software vendors and industry experts engaged in a dialogue about the pros and cons of different integration approaches. There is an emerging consensus that integration approaches should adopt open web standards and harness the elegance and flexibility of the CDISC Operational Data Model. This consensus may signal a sea change in attitudes toward standards-based integration, making it the norm rather than the exception.

This is not new to members of the OpenClinica community. Over the years we’ve had many examples of such integration efforts described on this blog and at OpenClinica conferences. To make such efforts more powerful, reusable, and robust, the OpenClinica team has invested a great deal over the past year to create a meaningful, CDISC ODM-based model for interacting with OpenClinica. We have incorporated open web standards (RESTful APIs for transport and OAuth for security) to make the interfaces easily accessible with commonly used software tools.  This is part of a newly published resource for OpenClinica development and integration, the OpenClinica 3.1 Technical Documentation Guide. The first version of the specification can be viewed at https://docs.openclinica.com/3.1/technical-documents/rest-api-specifications. I’ve reproduced the introduction here:

Overview

We are constantly looking at ways to make it possible (not to mention reliable and easy!) for users and developers to interact with and extend OpenClinica in a programmatic way. This can mean anything from data loading to more meaningful integrations of applications common to the clinical research environment.

As proponents of open, standards-based interoperability here at OpenClinica, our starting point is always to develop interfaces for these interactions based on the most successful, open, and proven methods in the history of technology – namely the protocols that power the World Wide Web (such as HTTP, SSL, XML, OAuth 2.0). They are relatively simple, extensively documented, widely understood, and well-supported out of the box in a large number of programming and IT environments. On top of this foundation, we rely heavily on the wonderful work of CDISC and the CDISC ODM to model and represent the clinical research protocol and clinical data.

This chapter describes a CDISC ODM-based way to interact with OpenClinica using RESTful APIs and OAuth. The REST web services API relies on HTTP, SSL, XML, OAuth 2.0. This architecture makes the ODM study protocol representation for an OpenClinica study available and supports other interactions for study design.

Why REST?

The OpenClinica RESTful architecture was developed to (initially) support one particular use case, but with the intention of becoming more broadly applicable over time. This use case is based on a frequent request of end users: for OpenClinica to support a visual method for designing, editing, and testing “rules” which define edit checks, email notifications, skip pattern definitions, and the like to be used in OpenClinica CRFs. Users have had to learn how to write rules in XML, which can be confusing and has a steep learning curve for non-technical individuals. The OpenClinica Rule Designer is an application that allows end users to build cross-field edit checks and dynamics within a GUI-based application. It is a centrally hosted Software as a Service (SaaS) application available for OpenClinica Enterprise customers at https://designer.openclinica.com.

To support interaction of the centrally hosted Rule Designer with any instance of OpenClinica Enterprise installed anywhere in the world, we needed to implement a secure protocol and set of API methods to allow exchange of study information between the two systems, and to do so in a way where the user experience was as seamless as if these applications were part of the same code base. In doing so, and by adopting the aforementioned web and clinical standards, we have built an architecture that can be extended and adapted for a much more diverse set of uses.

This chapter specifies how 3rd party applications can interact with an OpenClinica instance via the REST API and OAuth security, and details the currently supported REST API methods. The currently supported API methods are not comprehensive, and you may get better coverage from our SOAP API. However, the OpenClinica team is continuing to expand this API, and since it is open source, anyone may extend it to add new methods to meet their own purposes. If you do use the API in a meaningful way or if you extend the API with new methods, please let others know on the OpenClinica developers list (developers@openclinica.org), and submit your contributions for inclusion back into the codebase – you’ll get better support, increased QA, and compatibility with future OpenClinica releases.
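As a rough illustration of what such a 3rd party interaction might look like, the Python sketch below builds (but does not send) an HTTPS request for a study's ODM metadata. It assumes the OAuth 2.0 access token is presented as a standard bearer token. The function name, host, and URL path are hypothetical placeholders; the real URLs are defined in the technical subchapters of the REST API specification.

```python
from urllib.parse import quote
from urllib.request import Request


def build_metadata_request(base_url: str, study_oid: str, access_token: str) -> Request:
    """Build (but do not send) a GET request for a study's ODM metadata."""
    # Hypothetical path layout, used only for illustration.
    url = (
        f"{base_url.rstrip('/')}/hypothetical-rest/studies/"
        f"{quote(study_oid, safe='')}/metadata"
    )
    return Request(
        url,
        method="GET",
        headers={
            "Authorization": f"Bearer {access_token}",  # OAuth 2.0 bearer token
            "Accept": "application/xml",                # ODM is exchanged as XML
        },
    )


# Made-up host, study OID, and token.
request = build_metadata_request(
    "https://oc.example.org/OpenClinica", "S_STUDYA", "example-token"
)
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; obtaining the access token itself is outside the scope of this sketch.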

RESTful Representation, based on ODM

“REST”, an acronym for REpresentational State Transfer, describes an architectural style that allows definition and addressing of resources in a stateless manner, primarily through the use of Uniform Resource Identifiers (URIs) and HTTP.

From Wikipedia: A RESTful web service (also called a RESTful web API) is a simple web service implemented using HTTP and the principles of REST. It is a collection of resources, with three defined aspects:

  • the base URI for the web service, such as http://example.com/resources/
  • the Internet media type of the data supported by the web service. This is often JSON, XML or YAML but can be any other valid Internet media type.
  • the set of operations supported by the web service using HTTP methods (e.g., POST, GET, PUT or DELETE).

REST is also a way of looking at the world, as eloquently articulated by Ryan Tomayko.

In the context of REST for clinical research using OpenClinica, we can conceptually think of an electronic case report form (CRF) as a resource that is essentially a bunch of metadata modeled in CDISC ODM with OpenClinica extensions:

  • Some of this metadata (data type, item name, response set, etc) is intrinsic metadata – i.e. tied to the definition of the CRF and its items and mostly unchangeable after it is initially defined.
  • Some of this metadata is representation metadata and used only when the CRF is represented as a web-based HTML form (in the OpenClinica database schema we call this form_metadata, but it also can include other things like CRF version information and rules).

An OpenClinica Event CRF is that same bunch of metadata with the corresponding item data, plus references to the study subject, event definition, CRF version, event ordinal, etc that it pertains to.

  • The notion of a CRF version pertains to the representation of the CRF. It is not intrinsic to the event CRF (this is debatable but it is how OpenClinica models CRFs). Theoretically you should be able to address and view any Event CRF in any available version of the CRF (i.e. http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/edit and http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v2/edit both show you the same data represented in different versions of the CRF). Of course the audit history needs to clearly show which version/representation of the CRF was used for key events such as data capture, signature, etc.
  • Rules are also part of the representation metadata as opposed to intrinsic metadata, even though you don’t need to specify them on a version-by-version basis.
  • Anything attached to the actual event CRF object or its item data – discrepancy notes, audit trails, signatures, SDV performance, etc is part of that event data and should be addressable in the same manner (e.g. http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/GROUPOID/ORDINAL/ITEM…)
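The idea that a CRF version belongs to the representation rather than to the data can be sketched in a few lines of Python. Everything here (item OIDs, labels, the `render` function) is made up for illustration and is not how OpenClinica stores or renders forms.

```python
# Item data captured once for an event CRF (values are made up).
item_data = {"I_WEIGHT": "72.5", "I_SMOKER": "1"}

# Representation metadata per CRF version: labels and display order only.
versions = {
    "v1": [("I_WEIGHT", "Weight"), ("I_SMOKER", "Smoker?")],
    "v2": [("I_SMOKER", "Current smoker (1=yes, 0=no)"), ("I_WEIGHT", "Body weight (kg)")],
}


def render(data: dict[str, str], representation: list[tuple[str, str]]) -> list[str]:
    """Render the same item data using one version's labels and ordering."""
    return [f"{label}: {data[oid]}" for oid, label in representation]


view_v1 = render(item_data, versions["v1"])
view_v2 = render(item_data, versions["v2"])
```

Both views draw on the same `item_data`; only the labels and ordering differ, which is the sense in which the two conceptual `/v1/` and `/v2/` URLs show the same data.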

In this conceptual view of the world, CRFs (as well as CRF items, studies, study events, etc.) are RESTful resources with core, intrinsic properties and then some other metadata that has to do with how they are presented in a particular representation. We now have a model that allows us a great deal of flexibility and adaptability. We can support multiple modalities, with different representation metadata for rendering the same form, or perhaps the shared representation metadata but applied in a different way. We can address any part of the CRF in an atomic manner. This approach has been successfully applied in the Rule Designer, which takes the ODM study metadata and allows browsing of the study CRFs and items, with the ability to drag and drop those resources into rule expressions. Here are some examples of additional future capabilities that could be easily realized on top of this architecture:

  • Multiple data entry modalities – a user may need to deploy patient based data entry via web, a tablet, a thick client, or even paper/OCR, each with a very different presentation. Each of these may be part of OpenClinica-web or a separate application altogether, but all will rely on the same resource metadata to represent the CRF (according to the UI + logic appropriate for that modality), and use the same REST-based URL and method for submitting/validating the data.
  • Apply a custom view (an XSL or HTML/CSS) to a patient event CRF or full casebook – some uses of this could be to represent it as a PDF casebook, show it with all audit trails/DNs embedded in line with the CRF data, show a listing of data for that subject, or provide it (via an XSL mapping) as an XForm or HL7 CCD document for use by another application – http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/view?renderer=somemap…
  • The same path used in the URLs, eg http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/GROUPOID/ORDINAL/ITEMOID could be used as the basis for XPath expressions operating on ODM XML representations of CRFs and of event CRF data
  • Internationalization – OpenClinica ought to allow our CRF representation metadata to have an additional sub-layer to render the form in different languages, and then automatically show the appropriate language based on client/server HTTP negotiation (like we do with the rest of the app). Currently internationalization of CRFs requires versioning the CRF.
  • View CRF & Print CRF – use the same representation metadata (form metadata) but apply slightly different rules on how the presentation works (text values instead of form fields, no buttons, turn drop down lists into text values)
  • Discrepancy manager popup – one requested use case would allow a user to update a single event CRF item data value directly from the discrepancy note UI point of view. In this case you could think of just updating that one item as addressing the resource http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/GROUPOID/ORDINAL/ITEM…. In this model, whatever rules and presentation metadata need to get applied at presentation and save time happen automatically.
  • Import of CDISC ODM XML files – imported data would be processed through the same model, but only use the metadata that’s relevant to the data import modality. The same goes for data coming in as raw ODM XML via a REST web service. A lot of times the import only populates one part of a CRF and the other parts are expected to be finished via data entry. This model would help us manage that process better than the current implementation of ODM data import.
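The suggestion that the conceptual URL path could double as the basis for XPath expressions over ODM XML can be sketched as follows. This is a minimal illustration under stated assumptions: the element and attribute names follow the generic ODM 1.3 clinical data structure, the sample document and OIDs are made up, the `path_to_xpath` helper is hypothetical, and the CRF version segment is ignored because the text treats it as representation rather than data.

```python
import xml.etree.ElementTree as ET

ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Made-up ODM clinical data for a single item value.
SAMPLE_ODM = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="StudyA" MetaDataVersionOID="v1.0.0">
    <SubjectData SubjectKey="Subj1234">
      <StudyEventData StudyEventOID="VisitA">
        <FormData FormOID="FormB">
          <ItemGroupData ItemGroupOID="IG_VITALS" ItemGroupRepeatKey="1">
            <ItemData ItemOID="I_WEIGHT" Value="72.5"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>"""


def path_to_xpath(path: str) -> str:
    """Map Study/Subject/Event/Form/Version/Group/Ordinal/Item to an XPath."""
    # Assumes OIDs contain no quote characters; no escaping is attempted.
    study, subject, event, form, _version, group, ordinal, item = path.strip("/").split("/")
    return (
        f"./odm:ClinicalData[@StudyOID='{study}']"
        f"/odm:SubjectData[@SubjectKey='{subject}']"
        f"/odm:StudyEventData[@StudyEventOID='{event}']"
        f"/odm:FormData[@FormOID='{form}']"
        f"/odm:ItemGroupData[@ItemGroupOID='{group}'][@ItemGroupRepeatKey='{ordinal}']"
        f"/odm:ItemData[@ItemOID='{item}']"
    )


root = ET.fromstring(SAMPLE_ODM)
xpath = path_to_xpath("StudyA/Subj1234/VisitA/FormB/v1/IG_VITALS/1/I_WEIGHT")
element = root.find(xpath, ODM_NS)
value = element.get("Value") if element is not None else None
```

The standard library's ElementTree supports only a subset of XPath, which is enough for attribute predicates like these; a full XPath engine would be needed for anything more elaborate.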

There are many considerations related to user roles and permissions, workflows, and event CRF/item data status attributes that need to be overlaid on top of this REST model, but the model itself is a conceptually useful way to think about clinical trials and the information represented therein. When implemented using CDISC ODM XML syntax it becomes even more powerful. As widespread support for ODM becomes the norm, the barriers to true interoperability – shared, machine-readable study protocol definitions, and robust, real-time, ALCOA-compliant exchange of clinical data and metadata that aligns with users’ business processes – fall away.

* This chapter frequently refers to ODM-based representations of study metadata and clinical data in OpenClinica. We strive as much as possible to implement ODM-based representations of OpenClinica metadata and data according to the generic ODM specifications (currently using ODM version 1.3). However, to ensure our representations support the full richness of information used in OpenClinica we often have to rely on ODM’s vendor extensions capability. We have not always made distinctions in this chapter as to where we are using ‘generic’ ODM versus OpenClinica extensions, but that is documented here. It is our goal as ODM matures and supports richer representations of study information to migrate our extensions back into the generic ODM formats.

** Also note the RESTful URL patterns referred to above are conceptual. Refer to the technical subchapters of this REST API specification for the actual URLs.

The spec (like much of the code that implements it) is open source. I’m looking forward to hearing comments and feedback, and sharing thoughts on how we can encourage broader adoption across different types of eclinical applications.

Plug-in Architecture for OpenClinica Data Extracts

A major part of the Akaza mission is to make OpenClinica more flexible and customizable. Having a code base that is open source is a great place to start. But not everybody wants to develop Java code to meet their own requirements. We aim wherever possible to add configuration options and easy-to-use design tools within the user interface, but not all problems are a good fit for that approach. The solution is a series of “plug-in” interfaces that allow users to add their own capabilities and configurations, or interact with other applications. Some of these interfaces, such as loading of spreadsheet-based CRF definitions, are a critical part of OpenClinica, without which the system would not be functional. Other interfaces include CDISC ODM data import, job scheduler for import and export, SOAP-based web services, and the HTML5 popup interface that allows 3rd party applications to enter CRF data. Along the way community members have improved these interfaces and taught us a lot about how to design them better.

OpenClinica 3.1 will include a completely rewritten version of Extract Data based around a plug-in architecture that increases flexibility and functionality. We’ve learned that user requirements for organizing, formatting, and presenting data are tremendously diverse (and often conflicting), depending on the user, the intended purpose, the study, and the organization. Our old Extract Data architecture made it difficult to add new output formats or tweak the ones already there. The new functionality provides a highly extensible, easily configurable means to get data formats that meet a user’s precise requirements. It does this by:

  • Using XSL stylesheet transformations to read native CDISC ODM XML and output the data in a transformed format.
  • Specifying available formats, their associated stylesheets, and associated properties (like filename, archival settings, and whether to compress the file) in a properties file (the extract.properties file)
  • Optionally, enabling postprocessing of the transformed data to output to certain non-text file formats and destinations
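To give a feel for the transformation step, the sketch below flattens ODM clinical data into tab-delimited text. Note the substitution: the actual feature uses XSL stylesheets, whereas this is a plain Python stand-in that only shows the shape of the transformation. The function name, column layout, and sample data are assumptions; the element names and namespace follow generic ODM 1.3.

```python
import csv
import io
import xml.etree.ElementTree as ET

ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Made-up ODM clinical data with two item values.
SAMPLE_ODM = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="StudyA" MetaDataVersionOID="v1.0.0">
    <SubjectData SubjectKey="Subj1234">
      <StudyEventData StudyEventOID="VisitA">
        <FormData FormOID="FormB">
          <ItemGroupData ItemGroupOID="IG_VITALS" ItemGroupRepeatKey="1">
            <ItemData ItemOID="I_WEIGHT" Value="72.5"/>
            <ItemData ItemOID="I_HEIGHT" Value="180"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>"""


def odm_to_tab_delimited(odm_xml: str) -> str:
    """Flatten ODM item data into one tab-delimited row per item value."""
    root = ET.fromstring(odm_xml)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["SubjectKey", "StudyEventOID", "FormOID", "ItemOID", "Value"])
    for subject in root.iterfind(".//odm:SubjectData", ODM_NS):
        for event in subject.iterfind("odm:StudyEventData", ODM_NS):
            for form in event.iterfind("odm:FormData", ODM_NS):
                for item in form.iterfind(".//odm:ItemData", ODM_NS):
                    writer.writerow([
                        subject.get("SubjectKey"),
                        event.get("StudyEventOID"),
                        form.get("FormOID"),
                        item.get("ItemOID"),
                        item.get("Value"),
                    ])
    return out.getvalue()


tab_text = odm_to_tab_delimited(SAMPLE_ODM)
```

Because every output format starts from the same ODM input, adding a format amounts to adding another transformation of this kind rather than changing how the data is produced.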

We started out with a desire to simplify the native output of the OpenClinica Extract Data Java application, in a way that increased quality, stability, completeness, and performance. From now on, the OpenClinica core application will only produce CDISC ODM (version 1.3, with OpenClinica Extensions) as the natively supported format. With only one native format, we’re better able to test, document, and guarantee the output. All other output formats generated are transformations from this native ODM 1.3 w/extensions format. We made sure (via the OpenClinica vendor extensions) that we can export all possible data related to a study and its clinical data in this format. In 3.1, this also includes export of audit trail, discrepancy, and electronic signature information.

After we devised a way to improve the quality, stability, and performance of the data coming out of the core, we needed to provide a way to execute the data transformations into any of a wide variety of outputs. It was important for us to adopt standard, widely used formats and open source technologies as the basis for these transformations. We selected the XSLT (Extensible Stylesheet Language Transformations) language because of its applicability to CDISC ODM XML, extensive features, and reasonably simple learning curve. The implementation of these transformations is powered by a widely used open source engine, the Saxon XSLT and XQuery processor. The behavior of Extract Data is determined by the extract.properties configuration file and the XSL stylesheets. The extract.properties file specifies the data formats available in the system, each with a corresponding XSL stylesheet. OpenClinica 3.1 by default includes a set of XSL stylesheet transformations for commonly used formats, such as HTML, Tab-delimited Text, and SPSS. The OpenClinica Enterprise Edition will include additional new formats including SAS, annotated CRFs, printable PDF casebooks with integrated audit trail and discrepancy notes, and SQL-based data marts with a normalized CRF-based table structure for ad-hoc reporting.

At this point, we can reproduce the extract functionality available in OpenClinica 3.0, at a higher level of quality and stability. The stylesheets replicate the HTML, SPSS, tab-delimited, and multiple CDISC XML formats that were available in 3.0, and the framework will make it much easier to add new formats. However, all of these data output formats are some type of text- or XML-based file. Users have also voiced the need to do things that XSLT cannot do by itself, like produce PDF files or load the data into external relational databases for ad-hoc reporting. The solution was to implement a postprocessor framework that allows more sophisticated functionality. With postprocessing we can do things like generate binary output formats or send data to a target destination. Two postprocessors are included in 3.1 by default: output to a database using JDBC connectivity and generating PDF files using XSL-FO. The postprocessing step is transparent to end users; they simply get their files for download or alternatively receive a message that the data has been loaded into the database. The framework also allows additional postprocessors to be added as Java classes, referenced by class name in the extract.properties file.

Execution of data export occurs when a user or job initiates a request for data. The request includes the active study or site, the dataset id, and the requested format. The end user will notice only minor differences in how they use the Extract Module. The process of creating datasets has not changed. An extract can still be initiated from the ‘Download Data’ screen or via a job by selecting the desired output format. At this point, however, rather than waiting for the download page to load, the user will be told that their extract is in the queue, and will receive an email and on-screen notification when the extract is complete. Execution follows a four-step process:

Step 1.   Generate native CDISC ODM XML version 1.3 with OpenClinica Extensions

Step 2.   Apply XSL transformation and generate output file according to the settings in extract.properties for the specified format

Step 3.   Optionally, if postprocessing is enabled for the requested format, run the post processing action according to the settings in extract.properties.

Step 4.   Provide user notification with success or failure message.
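
The four steps above can be sketched in a few lines of Python. This is only an illustration of the control flow, not OpenClinica's code: ExtractFormat, generate_odm and run_extract are hypothetical names, and the transform and postprocessor callables stand in for the Saxon XSLT run and the Java postprocessor classes described earlier.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stand-ins for a stylesheet run and a postprocessor class.
Transform = Callable[[str], str]
PostProcessor = Callable[[str], str]


@dataclass
class ExtractFormat:
    name: str
    transform: Transform                            # step 2: stylesheet stand-in
    postprocessor: Optional[PostProcessor] = None   # step 3: optional


def generate_odm(study: str, dataset_id: int) -> str:
    # Step 1 stand-in: the real core emits CDISC ODM 1.3 with extensions.
    return f"<ODM study='{study}' dataset='{dataset_id}'/>"


def run_extract(study: str, dataset_id: int, fmt: ExtractFormat) -> str:
    """Run the four steps and return the text of the user notification."""
    try:
        odm = generate_odm(study, dataset_id)       # step 1
        output = fmt.transform(odm)                 # step 2
        if fmt.postprocessor is not None:           # step 3 (only if enabled)
            output = fmt.postprocessor(output)
        # Step 4: success message
        return f"SUCCESS: {fmt.name} extract ready ({len(output)} chars)"
    except Exception as exc:
        # Step 4: failure message
        return f"FAILURE: {fmt.name} extract failed: {exc}"
```

A format without a postprocessor simply skips step 3, and an error raised in any step surfaces as the failure notification of step 4.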

We’ve also improved the logging and messaging surrounding extracts, which will be crucial for anyone developing, customizing, or debugging XSL stylesheets. As always, full internationalization is supported – if you want a value to be internationalized, it should be prefaced with an & (ampersand) symbol in the extract.properties file, and the corresponding text placed in the notes.properties i18n files.
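
As a rough sketch of the ampersand convention, the lookup could behave like the function below. The name resolve_label, the example key and the fallback for a missing key are assumptions made for illustration; they are not taken from the OpenClinica code base.

```python
from typing import Mapping


def resolve_label(value: str, messages: Mapping[str, str]) -> str:
    """Return the localized text for '&key' values; pass other values through."""
    if value.startswith("&"):
        key = value[1:]
        # Assumed behavior: fall back to the bare key so that a missing
        # translation is visible rather than fatal.
        return messages.get(key, key)
    return value
```

Here messages stands in for the contents of the i18n properties file for the active locale, so resolve_label("&extract.tab", messages) would return whatever text is stored under the (hypothetical) key extract.tab.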

As is common with software, we didn’t get to do everything we wanted in the first release of these capabilities. Some future features include:

  • Allow extract formats to be restricted to specific users, studies/sites, and/or datasets.
  • Allow loading and validation of formats within the web UI or via web services rather than via the extract.properties config file.
  • Create an exchange for XSL formats similar to the CRF Library.

Other than that we think we’ve thought of everything :-). Have we?

– Cal Collins

XForms and OpenClinica

Here at Geneuity, we do a lot of contract research work for sponsors of clinical trials.  That sometimes means developing and validating custom assays that aren’t available off the shelf. And this, in turn, often means developing special interfaces for data capture that can’t be configured inside OpenClinica (as yet).

Does that mean we jettison OpenClinica?

No, absolutely not!

Since OpenClinica is open source and dedicated to interoperability, it can be easily extended.  This article describes the specifics of one such case involving XForms.

A while back, Geneuity was developing an ELISA to measure the abundance of a specific protein in plasma.  We needed a single web form with a spreadsheet-like feel and functionality in which lab technicians could enter data, interpolate unknowns, send the resulting interpolated values for insertion into OpenClinica and then perform source data verification.  Additionally, the form needed to be pre-populated with the accession numbers of specimens requiring testing.  The form had to work on any browser and not be dependent on any browser plugins.  And we had to have it fast.

Impossible? No, not with XForms and the open source XForms renderer betterFORM.

XForms have several advantages.  First of all, you author the forms in relatively simple XML while the complicated AJAX that makes the spreadsheet-like feel and functionality is rendered by the renderer and presented to the client browser for you.  Secondly, XForms allows you to easily call upon webservices to populate data items in the form.  In our case, we developed three such services: one to return a list of accession numbers corresponding to specimens yet to be tested along with corresponding hyperlinks directed inside OpenClinica for rapid source data verification (SDV); another to calculate and graph a standard curve and to interpolate the specimen unknowns; and still another to insert the interpolated values into OpenClinica.  The data flow is diagrammed in Figure 1.  Although a great deal is going on behind the scenes, the user experience is seamless as a single form is being used while data retrieval and refreshment are being done without interrupting the display (thanks to AJAX as automatically implemented by betterFORM). The form itself is pictured in Figure 2.
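
To give a feel for what the interpolation service does, here is a minimal Python sketch. It assumes simple linear interpolation between neighboring standards; the article does not say which curve model our service uses, and real assays often fit a nonlinear curve instead. The function name and the data layout are hypothetical.

```python
from bisect import bisect_left
from typing import List, Tuple


def interpolate_concentration(signal: float,
                              standards: List[Tuple[float, float]]) -> float:
    """Linearly interpolate a concentration from (signal, concentration) standards."""
    points = sorted(standards)
    signals = [s for s, _ in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal is outside the range of the standard curve")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        # The unknown reads exactly like one of the standards.
        return points[i][1]
    (s0, c0), (s1, c1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
```

Readings outside the range of the standards are rejected rather than extrapolated, which is a design choice of this sketch.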

And there you have it.  The bottom line: no matter how complicated your data capture requirements are, you can count on the interoperability of OpenClinica.

Figure 1. First, the lab tech summons the XForm, which calls upon OpenClinica via a webservice for the list of specimens for which there are no test results. The tech enters the data and clicks on the appropriate button to call for data interpolation and calculation. After review of the results, the tech then submits the data for insertion into OpenClinica (for more on this see here). Finally, the tech performs source data verification using the hyperlinks populated in the XForm as the result of step 1.
Figure 2. This is a snapshot of the XForm as it exists after interpolation and calculation. Only two specimens are shown, but usually there are many more in a batch.

Trial Sponsors and Their Contract Labs: Better Collaboration via OpenClinica

At Geneuity Clinical Research Services, we do lab tests for trial sponsors. As readers of this blog know, we use OpenClinica internally as an LIS (laboratory information system), but as more and more drug companies and CROs adopt OpenClinica, we foresee the day when we will be using their installations as our LIS, not ours.  A common platform will eliminate lots of duplicated effort and will allow for real-time transparency and better collaboration.  But it will also require sponsors to design their CRFs with their contracting laboratories in mind.  In this article, we describe how this could be done.

First, consider specimen collection and tracking.  Normally, trial sponsors don’t consider doing this within the context of OpenClinica.  But they should.  Let’s say a specimen accidentally thaws in transit between the collection site and the contract lab.  Shouldn’t that fact be summarily recorded in the same context as the resulting lab test whose value may ultimately be reported to the FDA?  I should say so.

So, can OpenClinica be configured to do this? Yes, and easily. A separate CRF dedicated to specimen collection could be designed and assigned to each event.  Alternatively, a specimen section could be added to already existing CRFs.  Either way, fields for such things as accession number and specimen type could then be included.  These would be filled in by site personnel responsible for specimen collection.  Additional fields like ‘shipping deviations’ and ‘laboratory receipt date’ could also be included and would be filled in by lab personnel upon specimen delivery to the testing lab.  When it comes time for data analysis, the sponsor can use OpenClinica’s data export capabilities to exclude or include those lab results with shipping deviations and to investigate the consequences.

Other important aspects of specimen collection include printing labels to barcode samples and generating an attendant paper manifest (known as a requisition) against which labs can check incoming shipments of specimens.  OpenClinica can’t do such things currently.  It would require a whole new software module, but lots of added value could be achieved if one were written.  For instance, one can envision that after accessioning a specimen, site personnel could print a corresponding requisition from the same application window.  Also, imagine the time savings if lab personnel could conveniently print barcode labels after receiving a specimen and recording its receipt date and shipping deviations (if any).  And because the paper requisitions would be generated within the context of OpenClinica, subsequent source data verification by lab personnel could be expedited using QR-encoded URLs that drill down into the patient-event matrix. For more on this, see here.

Specimen tracking is just part of the story when it comes to sponsors and their contract labs.  Getting lab data from the laboratory testing platform into OpenClinica is another.  Recently, during OpenClinica’s March 22 Global Conference in Bethesda, Akaza Research and Geneuity did a live demonstration of how this can be achieved using a set of MirthConnect channels. A batch of raw lab data keyed only to accession numbers was sent from Geneuity’s corporate headquarters in Maryville, TN to a remote OpenClinica installation hosted at Akaza’s Waltham, MA facility, where it was inserted into the database programmatically via an awaiting web service. The insertion was streamlined, secure and seamless.  When setting up a trial, sponsors should think about the lessons this demo provides and consider distributing already configured and validated MirthConnect channels to their contract labs.  In this way, sponsors can control how their data is treated and understand every detail of its electronic provenance. And because MirthConnect can be configured to store its history, the trial’s audit trail can be extended upstream to the data’s very source.

Finally, consider invoicing.  Contract labs have to be paid when they do a test.  Monthly invoicing reports could be generated from OpenClinica by configuring an appropriate ‘data set’ and having it execute at the end of each month using the application’s new built-in Quartz scheduler.  In this way, billing would be a snap and everybody would be on the same page.

In summary, how can trial sponsors configure OpenClinica to collaborate better with their contract labs? Do the following, keeping the workflow shown in Figure 1 in mind:

1.    Include a specimen accessioning CRF for each event.  Educate your collection-site people and your lab people as to who is responsible for which fields.  Use OpenClinica’s internal messaging system to remind people of their roles when the study is actually underway.
2.    Exploit OpenClinica’s web services framework to enable batch uploads of laboratory data.
3.    Configure and validate MirthConnect channels to get the lab results from the source data files to your OpenClinica installation.
4.    Distribute these channels to your lab contractors and educate them on their use.
5.    Configure OpenClinica to automatically generate monthly data sets for billing purposes.

The bottom line: OpenClinica is infinitely configurable, and sponsors should start configuring it with their lab contractors in mind.  The result will be both better collaboration and lower costs.

Figure 1: A specimen is collected from a subject on site. The on-site data manager logs into OpenClinica and accessions the sample and prints an accompanying hard-copy requisition. The sample is then shipped to the contracting laboratory where lab personnel log into OpenClinica and indicate they have received the sample. Specimens are then tested in batch and the results are then uploaded en masse into the sponsor's installation of OpenClinica using a thoroughly vetted, validated and auditable MirthConnect channeling system.

Pipes, Hats … and OpenClinica: Digesting HL7 in OpenClinica

If you’re an OpenClinica administrator somewhere, the chances are good somebody has asked you: “Can OpenClinica handle HL7 messaging?”

“No, it can’t,” you’ve said.

You probably said that with a sigh of relief because HL7 is a byzantine data exchange standard whose complexity keeps an army of consultants employed and drives neophytes like myself to madness.  The HL7 2.X specification uses eye-fatiguing pipes (“|”) and hats (“^”) as delimiters and has been referred to by experts as the “non-standard standard” (see this). Unfortunately, it is also the lingua franca of health-care messaging currently, and will likely continue to be for a long time to come.

So, it is with a heavy sense of resignation that we here at Geneuity are taking up the challenge to make OpenClinica fluent in HL7. As a contract clinical laboratory, we are particularly interested in having OpenClinica able to digest HL7 ORU messages that convey lab results.  This article details our first pass at the problem.

Our approach is shown in Figure 1.  It makes use of Mirth and a new web service in OpenClinica developed by Geneuity called EventDataInsert.  As shown, an HL7 message containing a lab result is sent by TCP to a Mirth channel which is configured to transform it into a SOAP message palatable to EventDataInsert.  EventDataInsert reads the message and then sees if the specimen has already been accessioned into OpenClinica.  If so, it inserts the data into the underlying database and signals a successful entry.  If not, it does nothing and signals a rejection.  These signals are transmitted back to Mirth which issues a standard HL7 acknowledgment (ACK) message coded with either ‘AA’ for ‘Application Accept’ or ‘AR’ for ‘Application Reject’.  It is the responsibility of whoever (or whatever) sent the HL7 message in the first place to follow up when a lab result is rejected.
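
The accept-or-reject logic can be sketched in Python as follows. This is an illustration only, not the Mirth channel or the EventDataInsert code: build_ack and is_accessioned are hypothetical names, the sketch assumes the accession number travels in OBR-3, and for brevity it reuses the original timestamp and control ID in the ACK header.

```python
from typing import Callable

SEGMENT_SEP = "\r"   # HL7 v2 segments end with a carriage return


def build_ack(message: str, is_accessioned: Callable[[str], bool]) -> str:
    """Return an ACK coded AA if the specimen is known, otherwise AR."""
    segments = [s.split("|") for s in message.strip().split(SEGMENT_SEP) if s]
    msh = next(s for s in segments if s[0] == "MSH")
    obr = next(s for s in segments if s[0] == "OBR")
    control_id = msh[9]                 # MSH-10, message control ID
    # Assumption: the accession number is the first component of OBR-3.
    accession = obr[3].split("^")[0]
    code = "AA" if is_accessioned(accession) else "AR"
    # Swap sender and receiver so the ACK travels back to the originator.
    ack_msh = ["MSH", msh[1], msh[4], msh[5], msh[2], msh[3],
               msh[6], "", "ACK", control_id, msh[10], msh[11]]
    ack_msa = f"MSA|{code}|{control_id}"   # MSA-1 code, MSA-2 original control ID
    return SEGMENT_SEP.join(["|".join(ack_msh), ack_msa]) + SEGMENT_SEP
```

The is_accessioned callback stands in for the database check that EventDataInsert performs, so the same function yields an ‘AA’ or an ‘AR’ acknowledgment depending on what that check returns.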

To develop this strategy, we used several tools.  To generate HL7 test messages, we utilized the HL7 generator (freely available here) made by the people responsible for the ELINCS initiative.  To send and receive HL7 messages to and from Mirth via TCP, we used Netcat, another freely available utility.

And there you have it!  Of course, the HL7 standard covers much more than the delivery of lab results, but this exercise is most relevant to our concerns and represents an important first step in making OpenClinica talk the talk when it comes to HL7.


Figure 1:  HL7 Strategy for OpenClinica

First, an HL7 message conveying lab results is sent to a Mirth channel listening for TCP requests.  Mirth parses the message and transforms it into a SOAP message which it then hands off to the EventDataInsert webservice listening within OpenClinica.  EventDataInsert looks to see if the specimen to which the lab result pertains has been accessioned into OpenClinica’s underlying database.  If so, it inserts the results and signals that fact back to Mirth.  If not, it enters nothing and signals back to Mirth that it did nothing.  Mirth digests these signals and sends back to the sender an appropriately configured ACK message via TCP.

Rapid Deployment of New Functionality in OpenClinica Using MirthConnect

In a previous article, we described how we at Geneuity Clinical Research Services exploit OpenClinica’s new web services feature to automate the entry of lab data keyed to accession numbers.  Here, we describe more fully how and why we use MirthConnect.

Started in 2006, MirthConnect is an open source project sponsored by the Mirth Corporation of Irvine, CA.  It is middleware designed to transform, route and deliver data.  It supports HL7, X12, XML, DICOM, EDI, NCPDP and plain old delimited text.  It can route via MLLP, TCP/IP, HTTP, files, databases, S/FTP, Email, JMS, Web Services, PDF/RTF Documents and custom Java/JavaScript.  MirthConnect has been likened to a Swiss army knife and justifiably so.

Channels are the heart and soul of a MirthConnect installation.  A channel is user defined and has a source and a destination.  A source may be a flat file residing on a remote server or a web service call or a database query or even another channel—whatever you like, it doesn’t matter. A destination may be to write a PDF document, email somebody an attachment or enter data into a database.  Again, whatever!

To illustrate, say you want to poll a database and generate a weekly report.  No sweat! Using MirthConnect’s easy-to-use drag-and-drop template-based editor, define a channel with a database reader as a source, and a document writer as a destination, fill in details like user names, passwords and machine names, define which database fields you want to retrieve and how you want to display the data, and you’re done!  MirthConnect’s daemon handles the rest based on your channel’s configuration.
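
For readers who think in code, the source-and-destination idea can be sketched in Python. To be clear, this is not how MirthConnect is used: its channels are configured in the editor, not written by hand. Channel, database_reader and document_writer are hypothetical names, and the results table is invented for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

Row = Tuple[str, float]


@dataclass
class Channel:
    source: Callable[[], Iterable[Row]]       # where the data comes from
    destination: Callable[[List[Row]], str]   # what is done with it

    def run(self) -> str:
        return self.destination(list(self.source()))


def database_reader(conn: sqlite3.Connection) -> Callable[[], Iterable[Row]]:
    """Build a source that polls a (hypothetical) results table."""
    def read() -> Iterable[Row]:
        query = "SELECT accession, concentration FROM results ORDER BY accession"
        return conn.execute(query).fetchall()
    return read


def document_writer(rows: List[Row]) -> str:
    """Build the report text; a real destination would write a document."""
    lines = ["Weekly report"] + [f"{acc}: {conc}" for acc, conc in rows]
    return "\n".join(lines)
```

Calling Channel(database_reader(conn), document_writer).run() on an open SQLite connection returns the report text, one line per result.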

Once defined, a channel can be exported as XML for later import into another MirthConnect installation.  This is all done with the point and click of a mouse.

At Geneuity, we use MirthConnect to get data in and out of OpenClinica.  Originally, we used custom Java code to do this.  But once we found MirthConnect, we quickly realized we were reinventing the wheel.  Why do that?

Here’s a concrete example.  Consider the very simple CRF from a mock OpenClinica installation shown in Figure 1.  It has three groups of items: accessioning, results and reportage.  When a specimen arrives at Geneuity, the lab tech looks up the patient and event pairing in the subject matrix as specified by the requisition and types into the CRF the accession number, the receipt date and any shipping deviations.  This is done by hand and is indicated as step 1 of Figure 2.

Then, as shown in step 2 of Figure 2, the tech tests the specimen at the testing platform.  In step 3, the platform spits out the data whereupon a collection of MirthConnect channels operating in tandem parses the results, transforms them into SOAP messages and sends them to the EventDataInsertEndpoint web service feature of OpenClinica for upload into the CRF fields designated ‘Assay date’ and ‘Analyte concentration’.
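
To make the transformation in step 3 concrete, here is a small Python sketch that wraps one lab result in a SOAP 1.1 envelope. The envelope namespace is the standard one, but the payload element names (labResult, accessionNumber, assayDate, analyteConcentration) are hypothetical; the actual message schema expected by the web service is not shown in this article.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace


def build_result_envelope(accession: str, assay_date: str, concentration: str) -> str:
    """Wrap one lab result in a SOAP 1.1 envelope (payload names are hypothetical)."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    result = ET.SubElement(body, "labResult")
    ET.SubElement(result, "accessionNumber").text = accession
    ET.SubElement(result, "assayDate").text = assay_date
    ET.SubElement(result, "analyteConcentration").text = concentration
    return ET.tostring(envelope, encoding="unicode")
```

In our setup this work is done by the channel configuration itself; the sketch only shows the shape of the message that results.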

After the tech reviews the data and marks it complete, another collection of channels polls the database for results newly marked complete, generates and delivers PDF reports of the corresponding data (step 4) and then reports back to OpenClinica (step 5) via EventDataInsert the details of the reportage, including status, time and any errors (see the third and last item grouping labeled ‘REPORTAGE’ in Figure 1).

The scenario outlined above requires NO CUSTOM CODE beyond the channel configurations and these are encapsulated and standardized by design.  As such, you don’t need an army of coders on staff to develop and maintain them.

Both OpenClinica and MirthConnect are great as standalone products.  Linked together, however, they really sizzle.

Figure 1: A simple CRF from a mock OpenClinica installation
Figure 2: This shows how the different item groupings in the CRF depicted in Figure 1 are populated.  Values for items under ACCESSIONING are entered manually by the lab tech.  Values for items under RESULTS are populated by the Mirth channels continuously listening for in-coming data from the clinical testing platform.  Values for items under REPORTAGE are populated by a distinct set of  Mirth channels responsible for polling and reporting newly completed results.