EDC Scandinavia uses OpenClinica for BYOD ePRO

Krister Kristianson, PhD.
EDC Scandinavia AB, Stockholm

RESTful web services with OpenClinica

In a recent study involving several hundred patients, we decided to offer patients the ability to collect their diary data using their own smart phones instead of the traditional paper diary. Patients who chose to participate downloaded the app to their smart phone, or could use their desktop to access the application.

The apps were developed for iPhone and Android, with a reminder function that notified patients when to report their symptoms. The data was transferred to OpenClinica using RESTful web services immediately upon entry. Each patient's ID and PIN code were verified before data was added to the database, to prevent any illegitimate entries.
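As an illustration of the idea only – the endpoint URL, payload fields, and credential handling below are assumptions made for this sketch, not OpenClinica's actual REST interface or the study's actual implementation – a submission from the app might look roughly like this:

    // Illustrative sketch of a diary submission from the patient app.
    // The endpoint and field names are hypothetical.
    function submitDiaryEntry(entry, patientId, pinCode) {
      return $.ajax({
        url: '/rest/diary',                     // hypothetical endpoint
        type: 'POST',
        contentType: 'application/json',
        data: JSON.stringify({
          patientId: patientId,                 // checked server-side before insert
          pin: pinCode,                         // checked server-side before insert
          recordedAt: new Date().toISOString(), // date/time captured automatically
          symptoms: entry
        })
      });
    }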

About 80% of the patients decided to use the electronic diary – 65% using iPhones and 35% using various Android devices. They could also download the app to iPads or other tablets, or, if they preferred, use the application on a desktop.

Outcomes:

  • Paper CRFs: Of the patients who used the traditional method of reporting diary data on paper, 2.5 times more failed to report at scheduled time points compared to the patients using the app.
  • The app recorded the date and time automatically. With paper, you can never be sure that the diary was actually completed at the time listed.
  • The addition of simple edit checks mitigated data entry errors, greatly contributing to the increased quality of the data.
  • The electronic diary further reduced the manpower needed to manually enter data onto the eCRF, and enabled us to monitor patients in real time and contact them if anything went wrong.

Although the patient population was relatively young, in this part of the world even elderly patients are likely to use smart phones or desktops and would be willing to use electronic data capture (EDC) for reporting diary data. The easy configuration of web services in OpenClinica and the ability to query data upon arrival made it an easy task to set up and validate the study.

New OpenClinica Developer Release: Revamped Print Module

A new developer release is available for OpenClinica. While it contains a number of significant improvements, one of the more fundamental changes is a reconstruction of OpenClinica’s print CRF functionality. Until now, all printable screens resembled the existing web form interface for the eCRF; this engine has been completely redesigned to more closely follow industry standards for printable views of web forms.

Why This Matters To Developers

Ok, I get it. You are a developer, so what’s the big deal about being able to send CRFs to the printer?  It is the technology behind this improvement that will hopefully catch your attention. Important changes have been occurring in the sphere of web apps, and I don’t mean just a sprinkling of AJAX here and there along with a UI that looks like Facebook. The real revolution goes much deeper.

Setting a precedent for OpenClinica: Completely Decoupled Client Technology using Web Services

The notion is simple. Create a browser-based client with the same decoupled technology that an iOS or Android mobile client requires. In plain English, this means that only data is sent back and forth between the client and the server. Just as is expected with a native OS mobile client, the browser-based client never relies on the server to generate or send any part of itself except when the URL is first accessed and loaded into the browser. The client is therefore completely decoupled and in complete control of its own state. This results in a more reliable, speedier, feature-packed, and easier-to-maintain platform for both the client and the server. Another way to put it … in the not too distant future, our 400 or so JSP pages will be replaced by one or two main HTML templates and about a dozen small (less than 100 lines each) HTML component files. But for now, take a look at our printable CRF design as an example of the way forward.

Getting to the Point: REST, JSON, JQuery, HTML, and CSS

What follows is a description of the processing path that starts with a REST URL and ends with a printable CRF.

  1. A user clicks a print icon that is part of many of the CRF view forms. The RESTful URL referenced in the link takes the form rest/metadata/html/print/{StudyOID}/{StudyEventOID}/{FormDefOID}. A wildcard asterisk character (*) can be placed in any of these positions to specify all studies, all study events, or all CRFs, respectively.
  2. The first path element, “rest,” indicates that the request is handled by our implementation of org.akaza.openclinica.web.restful.ODMMetadataRestResource, our Jersey JAX-RS controller servlet. The second path element, “metadata,” indicates that this is for metadata only and that no clinical data will be transmitted. The third path element, “html,” indicates that the result will be a rather simple JSP page at /WEB-INF/jsp/printcrf.jsp. This JSP page is the container for the JQuery code that makes a second REST call to the back end to retrieve the CDISC ODM in JSON form, and for the Javascript and JQuery code that converts the ODM JSON into an HTML DIV element containing the rendered printable CRF.
  3. The AJAX call made by the Javascript method getPrintableContent() in js/app.js references a URL of the form rest/metadata/json/view/{StudyOID}/{StudyEventOID}/{FormDefOID}. The third path element, “json,” indicates that the same ODMMetadataRestResource servlet will now fetch the relevant ODM XML metadata, convert it to JSON, and send it back to the callback portion of getPrintableContent(). In the callback, a call to app_odmRenderer.renderPrintableStudy() kicks off the process by which JavaScript inspects the JSON ODM object returned by the server and builds a DOM element representing the portion of the metadata meant to be displayed as one or more CRFs (a simplified sketch of this flow follows the list).
  4. The DOM HTML is rendered with the help of JQuery Templates. This allows HTML fragments such as template/print_item_def.html, which are loaded into memory up front, to be combined with extracted key/value pairs to render an individual component or a list of components.
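For orientation, here is a heavily simplified sketch of that flow. It is not the actual code in js/app.js; the JSON property names, template ID, and target element are assumptions made for the illustration.

    // Simplified illustration of the printable CRF flow described above.
    // Property names, template IDs, and element IDs are assumptions.
    function getPrintableContent(studyOID, eventOID, formOID) {
      var url = 'rest/metadata/json/view/' + studyOID + '/' + eventOID + '/' + formOID;
      $.getJSON(url, function (odm) {
        // Walk the ODM JSON and build a DOM fragment for each form definition.
        var $printable = $('<div class="printable-crf"></div>');
        $.each(odm.Study.MetaDataVersion.FormDef || [], function (i, formDef) {
          // Render each form definition through a preloaded JQuery Template.
          $('#print_item_def_template')
            .tmpl({ name: formDef.Name, oid: formDef.OID })
            .appendTo($printable);
        });
        $('#print-target').empty().append($printable);
      });
    }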

What’s on the Horizon

This first release covers printing blank CRFs. Next, we will extend it to handle printing CRFs containing clinical data, as well as very large printable form sets. The process described above will be similar, with the exception that large documents, typically over 100 pages long, will be rendered using a Java rendering class built on Velocity templates. The resulting server-side HTML page will be converted to a downloadable PDF.

– Nick Sophinos, Senior Developer

Click here for Developer Release

Importing OpenClinica Data Into R

R is a powerful open source statistical software package – many people think of it as an open source alternative to SAS. A number of OpenClinica users have been asking how to bring data from OpenClinica into R. So, we recently published a write-up in the OpenClinica reference guide, providing instructions for importing data into R in three different ways:

  •     using a Tab-delimited file,
  •     using a CSV file, and
  •     using an Excel file.

Feedback and additional ideas are welcome.

– Ben Baumann

Beyond Single-Selects – Managing Long Lists in OpenClinica

Here at the VU Medical Center Amsterdam, we’re implementing OpenClinica (OC) for CTMM TRACER. TRACER aims at improving diagnosis, prognosis and therapy selection for rheumatoid arthritis. A series of eCRFs are being developed, ranging from questionnaires to joint scoring and DAS-score calculations.

One of our most recent CRFs is concerned with a patient’s medication. In this CRF, one of the items of interest is the medication the patient is currently on. The code system employed for medication is the Anatomical Therapeutic Chemical (ATC) Classification System, which contains over 500 entries. Initially, we intended to create a single-select field containing the ATC codes and their descriptions. Unfortunately, the number of characters involved far exceeds the maximum of 4000 allowed by OC. The easy alternative would have been to define a free text field, but it is generally best to avoid those, as the data in such fields tends to become polluted easily.

The solution we created employs JavaScript and JQuery. Without getting into the technical details (you can get them here), the approach is to create a (read-only) text field in OC and let a non-OC single-select item write data to this field. This ensures that the values written to OC are limited to those specified in the list, without having to store the complete list in OC. The list entries themselves are kept in an external XML file stored on the OC server. If at some point the ATC codes need to be updated, we can simply update the XML file and the updates will be available to the users.
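The core of the approach might look something like the sketch below. The file name, XML layout, and element IDs are assumptions made for the illustration; the actual code and field names we use are on the wiki page.

    // Illustrative sketch: populate a long code list from an external XML file
    // and copy the chosen value into a read-only OpenClinica text field.
    // The file name, XML structure, and element IDs are hypothetical.
    $(function () {
      $.ajax({
        url: '/includes/atc_codes.xml',   // XML file hosted on the OC server
        dataType: 'xml',
        success: function (xml) {
          var $select = $('#atcSelect');  // non-OC select shown next to the CRF field
          $(xml).find('entry').each(function () {
            $select.append(
              $('<option></option>')
                .val($(this).attr('code'))
                .text($(this).attr('code') + ' – ' + $(this).attr('description'))
            );
          });
          // When an entry is picked, write its code into the read-only OC input,
          // so only values from the list ever reach the database.
          $select.on('change', function () {
            $('input[name="atc_code"]').val($(this).val());
          });
        }
      });
    });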

Whereas our problem concerned a long list of ATC codes, the solution can be applied to any list that exceeds OC’s maximum number of characters. All you have to do is create an XML file describing the list, make some minor changes to the example code provided on the wiki page, and upload the two files to your OC server.

Sander de Ridder (MSc Computer Science, MSc Bioinformatics)

OpenClinica Web Services Tutorial Videos

OpenClinica’s web services layer provides a powerful mechanism for programmatic data interchange between OpenClinica and other systems. Below are the first two videos in a series of tutorials on working with web services. The videos were created by Hiro Honshuku – including the background music! Thanks, Hiro!





The OpenClinica Platform – Developer Round Table Discussions

OpenClinica is a clinical trial software platform that aims to provide data capture, data management, and operations management functionality to support human subjects-based research. It can be used for traditional clinical trials as well as a wide variety of other types of human subjects-based research.

Our vision for the product is to provide data capture, data management, and operations management functionality out-of-the-box, in an easily configurable, usable, and highly reliable manner. The underlying platform should be interoperable, modular, extensible, and familiar – so users can solve specific problems, in a generalizable way.

This past spring, the development team here at OpenClinica, LLC held a series of round table discussions about how this vision is reflected in the product. Our goals were to learn the critical standards and information models needed for our technology to truly reflect this vision, to develop a consistent, shared vocabulary for the problem domain and the OpenClinica technology, and to identify the most urgent opportunities to put these lessons into practice in the product and the community. In particular, we spent much of these discussions on how OpenClinica’s use of the CDISC Operational Data Model helps enable this vision.

The discussions were invigorating and thought-provoking. We’ve recorded them to share with the greater community of OpenClinica developers, integrators, and others who want to better understand how the technology works, the design philosophy behind it, and where we’re going in the future. The videos are embedded below.

But before getting to the videos, here’s a bit more background on how we think about OpenClinica as a product and a platform.

First, OpenClinica functionality should be ready out-of-the-box, easily configurable and highly usable. Some of the most important features include:

  • Data definition and instrument/form design with no or minimal programming
  • Sophisticated data structures such as repeating items and item groups
  • Support for a wide variety of response types and data types (single select, multiple choice, free text, image)
  • Data management and review capabilities (including discrepancy management and clinical monitoring) with flexible workflows
  • ALCOA-compliant controls and audit history over all data and metadata, including electronic signature capabilities
  • Patient visit calendar design with management of multiple patient encounters and multi-form support
  • Reporting and data extract to a wide variety of formats (tab, SPSS, CDISC ODM)
  • Ability to combine electronic patient reported outcome (ePRO) data with clinically reported data using common form definitions (write once, run anywhere)
  • Deployment via multiple media, mobile or standard web browser

Many of these things have already been implemented, and more are under development.

The core concept around which OpenClinica is organized is the electronic case report form (CRF). In OpenClinica, a CRF is a resource that is essentially a bunch of metadata modeled in CDISC ODM with OpenClinica extensions. It doesn’t (necessarily) have to correspond to a physical or virtual form; it may represent a lab data set or something similar. An OpenClinica Event CRF is that same bunch of metadata populated with actual data about a particular study participant. Thus, it combines the metadata with the corresponding item (field) data, with references to the study subject, event definition, CRF version, and event ordinal that it pertains to. In this conceptual view of the world, CRFs (as well as CRF items, studies, study events, etc.) are resources with core, intrinsic properties and then some other metadata that has to do with how they are presented in a particular representation. Built around these core resources are all the workflow, reporting, API, security, and other mechanisms that allow OpenClinica to actually save you time and increase accuracy in your research.
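As a rough illustration of that distinction – using made-up OIDs and a JSON-style shape rather than the actual CDISC ODM XML schema – the two concepts look something like this:

    // Rough illustration only; OIDs and structure are made up for the example.

    // A CRF: pure metadata – which items exist and how they are defined.
    var crfMetadata = {
      formOID: 'F_VITALS_V1',
      items: [
        { itemOID: 'I_HEIGHT', dataType: 'float', units: 'cm' },
        { itemOID: 'I_WEIGHT', dataType: 'float', units: 'kg' }
      ]
    };

    // An Event CRF: the same metadata populated with actual data for one
    // study participant, with references back to the subject, event, and form.
    var eventCrf = {
      studySubjectOID: 'SS_001',
      studyEventOID: 'SE_BASELINE',
      eventOrdinal: 1,
      formOID: 'F_VITALS_V1',      // ties the data back to the metadata above
      itemData: [
        { itemOID: 'I_HEIGHT', value: '172' },
        { itemOID: 'I_WEIGHT', value: '68.5' }
      ]
    };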

Second, OpenClinica should be interoperable. The ultimate measure of interoperability is having shared, machine-readable study protocol definitions, and robust, real-time, ALCOA-compliant exchange of clinical data and metadata that aligns with users’ business processes. It should be easy to plug in and pull out or mix-and-match different features, such as forms, rules, study definitions, report/export formats, and modules, to transport them across OpenClinica instances or interact with other applications. Establishing well-defined methods and approaches for integration into existing health data environments is a key goal of interoperability.

Third, OpenClinica should be modular and extensible. OpenClinica already provides some of the most common data capture and data management components and aims to have a very broad selection of input types, rules, reports, data extracts, and workflows. However, OpenClinica developers should also have the freedom to come up with their own twist on a workflow, module, or data review process and have it work easily and relatively seamlessly with the rest of OpenClinica. User identification, authentication, and authorization should be highly configurable and support commonly used general purpose technologies for user credentialing and single sign-on (such as LDAP & OAuth).

The CRF-centric model allows us a great deal of flexibility and extensibility. We can support multiple modalities, with different representation metadata for rendering the same form, or perhaps the shared representation metadata applied in a different way (e.g. web browser vs. mobile vs. import job). We can address any part of the CRF in an atomic, computable manner. This approach has been successfully applied in the Rule Designer, which takes the ODM study metadata and allows browsing of the study CRFs and items, with the ability to drag and drop those resources to form rule expressions. Features such as rules and report/export formats are represented as XML documents. These documents define how the features behave in standardized ways, so that one rule can, say, be easily replaced with another rule without having to modify all the code that makes use of the rule.

Finally, OpenClinica aims to be familiar, in the sense of allowing data managers, developers, and statisticians to work in a design/configuration/programming environment that they already know. Programmers don’t all have the same experience, and it would be somewhat limiting to force OpenClinica developers to all use the same language (Java) that OpenClinica was written in. We are constantly looking at ways to make it possible (not to mention reliable and easy!) for users and developers to interact with and extend OpenClinica in a programmatic way. This can mean anything from data loading to more meaningful integrations of applications common to the clinical research environment. As proponents of open, standards-based interoperability, our starting point is always to develop interfaces for these interactions based on the most successful, open, and proven methods in the history of technology – namely the protocols that power the World Wide Web (such as HTTP, SSL, XML, and OAuth 2.0). They are relatively simple, extensively documented, widely understood, and well supported out of the box in a large number of programming and IT environments. On top of this foundation, we rely heavily on the wonderful work of CDISC and the CDISC ODM to model and represent the clinical research protocol and clinical data.

Session 1:  from 30-March-2012 (start at the 5 min 20 sec mark)

Session 2:  from 06-April-2012 (start at the 1 min 25 sec mark)

Session 2a:  from 20-April-2012

Session 3:  from 27-April-2012

Session 4:  from 11-May-2012

Three Things I Learned About Export Job Performance From My Uncle Paulo

1. “Realize you can try to do it all…but you’ll do it slowly, very slowly.”

It’s understandable that there may be a desire to export every single dimension, for every CRF in your study, for all time. However, if that export takes 15 hours, then a job starting during off-hours may impact users the next business day – or, worse, users in another time zone who may be just starting their work day.

It’s better to first get a sense of how long a small subset of the data (e.g. reducing the temporal scope to a month) takes to export manually, then multiply the elapsed time by the scope quotient (i.e. the estimated total scope divided by the reduced scope) to arrive at a rough estimate of how long your full export may take.
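As a trivial sketch of that calculation (the function and its inputs are made up for illustration):

    // Rough export-duration estimate from a small trial run.
    function estimateExportMinutes(trialRunMinutes, totalScopeMonths, trialScopeMonths) {
      var scopeQuotient = totalScopeMonths / trialScopeMonths;
      return trialRunMinutes * scopeQuotient;
    }

    // e.g. a one-month trial export that took 20 minutes, for a 12-month study:
    // estimateExportMinutes(20, 12, 1) === 240, i.e. roughly 4 hours.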

2. “Those temporal scope fields are your friend.”

If you don’t choose a temporal scope for your dataset, you will get data for a wider timeframe than you’re likely to be alive. This may be fine for a dataset with a few dimensions, or if you are looking for specific data that you aren’t sure you would capture with too narrow a time scope.

However, if you KNOW what the scope is, you should specify it. This is even more important with very large datasets. Breaking the dataset down into smaller, scoped datasets gives you the flexibility of manageable chunks of data to schedule for export.

3. “Let your jobs breathe.”

You wouldn’t schedule two appointments, one after the other, if you didn’t know what time the first appointment would end, would you? The same idea applies to scheduled export jobs. While there’s nothing inherently wrong with scheduling one job after the other, you can only get away with this when you have a clear sense of when the first job will end. If you don’t, the first job may complete successfully, but the second will not if the first overruns it. When in doubt, give your jobs enough time to complete by spacing them out.

Remember, it’s better to measure twice, and cut once. (My uncle Paulo didn’t actually say this last one.)

– Tope Oluwole

Clinical Trials in the Cloud

I got a phone call the other day from a longtime OpenClinica user about the announcement of our new OpenClinica Optimized™ Hosting. He remarked on how leading companies in the industry (including his) are making big investments in cloud computing products and services, because these technologies provide easy-to-access functionality on an infrastructure that is more redundant, scalable, and cost-effective than you could hope to build or buy on your own.

However, in the clinical research field, putting together such an offering is not for the faint of heart. Though our free OpenClinica Community Edition has been installed and run by users on cloud servers for years, our OpenClinica Enterprise Edition offering (which carries regulatory guarantees) would have to meet rigorous reliability, security, and regulatory compliance requirements. How can this be accomplished if you don’t actually know where your data physically resides at any point in time on the cloud?

Prior to the launch of Optimized Hosting, we offered each hosted customer a dedicated server, or a two-server (application + database) setup. This provided a certain peace of mind from knowing that your clinical data lives on a dedicated piece of hardware, but for many the costs were high, and the setup suffered from the inherent limitations of being tied to a physical machine. At the end of last year our data center partner achieved SAS 70 Type II certification for their cloud services, and we decided it was time to begin due diligence on a cloud-based offering for OpenClinica.

We have spent the past 9 months listening to our customers’ needs and concerns, and designing and testing a solution. The resulting OpenClinica Optimized™ Hosting is an innovative hybrid architecture that provides the best of both worlds: the scalability, high availability, and flexibility of the cloud, combined with the peace of mind that your data lives on purpose-built, dedicated hardware.

In short, OpenClinica Optimized Hosting offers greater fault tolerance, with better scalability and performance, at a lower cost than alternatives. Here’s how it works:

Application

Each OpenClinica application instance is a cloud server cloned from an image that has been qualified according to our exacting installation instructions. We configure the instance according to the customer’s supplied configuration parameters and complete operational qualification (OQ). The instance is typically available and ready for production use within a day or two. Thanks to the cloud, computing resources are instantly scalable on-demand.

Database

Dedicated (non-cloud) high-performance database machines are configured in a master/slave relationship to provide instant data replication and fault tolerance. By utilizing multiple slave databases located in different geographic regions, the OpenClinica Optimized Hosting database cluster is designed for zero data loss even in the event of a nuclear strike. The servers use the fastest hard disk technology available today (Fusion-io®), dramatically improving database performance. For example, in our testing we commonly see data extracts run up to 10x faster than in the prior environment. Database servers are physically isolated behind a Cisco ASA firewall, eliminating all nonessential access.

Validation and Compliance

OpenClinica Optimized Hosting provides maximum flexibility and transparency in the area of change control and compliance. It has been constructed around a carefully designed set of controls to ensure all updates are fully tested (and documented) in the environment prior to release, and that customers can have upgrades and maintenance releases applied according to their individual schedules and priorities.

One of the great advantages of OpenClinica is the choice it offers – you can use and extend the open source licensed code, you can choose between OpenClinica Community Edition and OpenClinica Enterprise, you can deploy it locally or choose the hosted option. Or, any combination of the above. The new Optimized Hosting environment enhances that choice by providing a fast, reliable, and cost-effective way to get up and running with OpenClinica.

For more on security in OpenClinica Optimized Hosting, see Clinical Trials in the Cloud – Part II.

– Cal Collins

Plug-in Architecture for OpenClinica Data Extracts

A major part of the Akaza mission is to make OpenClinica more flexible and customizable. Having a code base that is open source is a great place to start, but not everybody wants to develop Java code to meet their own requirements. We aim wherever possible to add configuration options and easy-to-use design tools within the user interface, but not all problems are a good fit for that approach. The solution is a series of “plug-in” interfaces that allow users to add their own capabilities and configurations, or interact with other applications. Some of these interfaces, such as the loading of spreadsheet-based CRF definitions, are a critical part of OpenClinica, without which the system would not be functional. Other interfaces include CDISC ODM data import, a job scheduler for import and export, SOAP-based web services, and the HTML5 popup interface that allows third-party applications to enter CRF data. Along the way, community members have improved these interfaces and taught us a lot about how to design them better.

OpenClinica 3.1 will include a completely rewritten version of Extract Data based around a plug-in architecture that increases flexibility and functionality. We’ve learned that user requirements for organizing, formatting, and presenting data are tremendously diverse (and often conflicting), depending on the user, the intended purpose, the study, and the organization. Our old Extract Data architecture made it difficult to add new output formats or tweak the ones already there. The new functionality provides a highly extensible, easily configurable means to get data formats that meet a user’s precise requirements. It does this by:

  • Using XSL stylesheet transformations to read native CDISC ODM XML and output the data in a transformed format.
  • Specifying available formats, their associated stylesheets, and associated properties (like filename, archival settings, and whether to compress the file) in a properties file (the extract.properties file)
  • Optionally, enabling postprocessing of the transformed data to output to certain non-text file formats and destinations

We started out with a desire to simplify the native output of the OpenClinica Extract Data Java application, in a way that increased quality, stability, completeness, and performance. From now on, the OpenClinica core application will only produce CDISC ODM (version 1.3, with OpenClinica Extensions) as the natively supported format. With only one native format, we’re better able to test, document, and guarantee the output. All other output formats are transformations from this native ODM 1.3 w/extensions format. We made sure (via the OpenClinica vendor extensions) that we can export all possible data related to a study and its clinical data in this format. In 3.1, this also includes export of audit trail, discrepancy, and electronic signature information.

After we devised a way to improve the quality, stability, and performance of the data coming out of the core, we needed to provide a way to execute the data transformations into any of a wide variety of outputs. It was important for us to adopt standard, widely used formats and open source technologies as the basis for these transformations. We selected the XSLT (Extensible Stylesheet Language Transformations) language because of its applicability to CDISC ODM XML, extensive features, and reasonably simple learning curve. The implementation of these transformations is powered by a widely used open source engine, the Saxon XSLT and XQuery processor. The behavior of Extract Data is determined by the extract.properties configuration file and the XSL stylesheets. The extract.properties file specifies the data formats available in the system, each with a corresponding XSL stylesheet. OpenClinica 3.1 by default includes a set of XSL stylesheet transformations for commonly used formats, such as HTML, tab-delimited text, and SPSS. The OpenClinica Enterprise Edition will include additional new formats, including SAS, annotated CRFs, printable PDF casebooks with integrated audit trail and discrepancy notes, and a SQL-based data mart with a normalized CRF-based table structure for ad-hoc reporting.

At this point, we can reproduce the extract functionality available in OpenClinica 3.0, at a higher level of quality and stability. The stylesheets replicate the HTML, SPSS, tab-delimited, and multiple CDISC XML formats that were available in 3.0, and the framework will make it much easier to add new formats. However, all of these output formats are some type of text- or XML-based file. Users have also voiced the need to do things that XSLT cannot do by itself, like produce PDF files or load the data into external relational databases for ad-hoc reporting. The solution was the implementation of a postprocessor framework that allows more sophisticated functionality. With postprocessing we can do things like generate binary output formats or send data to a target destination. Two postprocessors are included in 3.1 by default: loading data into a database using JDBC connectivity, and generating PDF files using XSL-FO. The postprocessing step is transparent to end users; they simply get their files for download, or alternatively receive a message that the data has been loaded into the database. And the framework exists to add additional postprocessors via the addition of Java classes, with references to those class names in the extract.properties file.

Execution of data export occurs when a user or job initiates a request for data. The request includes the active study or site, the dataset ID, and the requested format. The end user will notice only minor differences in how they use the Extract Module. The process of creating datasets has not changed, and the extract can still be initiated from the ‘Download Data’ screen or via a job by selecting the desired output format. Now, however, rather than waiting for the download page to load, the user is told that their extract is in the queue, and receives an email and on-screen notification when the extract is complete. Execution follows a four-step process:

Step 1.   Generate native CDISC ODM XML version 1.3 with OpenClinica Extensions

Step 2.   Apply XSL transformation and generate output file according to the settings in extract.properties for the specified format

Step 3.   Optionally, if postprocessing is enabled for the requested format, run the postprocessing action according to the settings in extract.properties.

Step 4.   Provide user notification with success or failure message.

We’ve also improved the logging and messaging surrounding extracts, which will be crucial for anyone developing, customizing, or debugging XSL stylesheets. As always, full internationalization is supported – if you want a value to be internationalized, it should be prefaced with an & (ampersand) symbol in the extract.properties file, and the corresponding text placed in the notes.properties i18n files.

As is common with software, we didn’t get to do everything we wanted in the first release of these capabilities. Some future features include:

  • Allow extract formats to be restricted to specific users, studies/sites, and/or datasets.
  •     Allow loading and validation of formats within the web UI or via web services rather than via the extract.properties config file.
  • Create an exchange for XSL formats similar to the CRF Library.

Other than that we think we’ve thought of everything :-). Have we?

– Cal Collins