Growing the Community

A recent discussion on the OpenClinica users mailing list centered on making OpenClinica better suited to small, low-budget academic research studies, and on how open source community participation is helping OpenClinica meet those users’ needs.

Regarding OpenClinica’s target audience: OpenClinica is used in a tremendous variety of studies and organizations. Many of these are paying customers of OpenClinica LLC (f/k/a Akaza Research) who are running large, GCP-compliant, multi-site clinical trials. Clearly we as a company have an obligation and incentive to support them and provide software that fits their needs.

But I am also committed to making OpenClinica (the technology) successful and widely used. We want OpenClinica to power as many studies as possible, both small and large. If you’re determined to scratch down some data into Excel I won’t stop you. But since you’re reading this you likely know the arguments against doing so, and want a more robust solution for capturing and managing your clinical research data. We want lots of OpenClinica researchers, entrepreneurs and service providers to thrive in a growing ecosystem. If you are a researcher with no budget and you need OpenClinica to do X, my staff and I are ready and willing to prioritize getting feature X into a release if we have meaningful participation from community members to get that feature designed, coded, documented, tested, etc. Every participant in the OpenClinica community has the ability to make those contributions and/or to mobilize community participation to get their feature defined and developed. It’s not benevolence or volunteer work – with open source software the benefit you get out is proportional to the investment you put in.

There are growing examples of this participation. But more is needed to truly realize OpenClinica’s potential. As a community, we need to ask: (1) How do we grow participation? (2) How do we make it easier to participate? and (3) What can you contribute?

Clinical Trials in the Cloud (Part II)

The other day I posted an overview of the new OpenClinica Optimized Hosting offering. Since then we have received requests for more detail on how we secure the data in a customer’s OpenClinica instance against unauthorized access. This is obviously a very important topic!

The particular questions were asked in the context of HIPAA–particularly the HIPAA Security Rule–and the answer below is framed in this context. But even if HIPAA is not relevant to you (because you have no PHI in your OpenClinica instance, you’re not part of a covered entity, or you’re outside the U.S.), the safeguards described below are generally applicable best practices and can be applied in the context of most security compliance/regulatory regimes.

In general the requirements of the HIPAA Security Rule can be summed up as:

  1. Ensure the confidentiality, integrity, and availability of all e-PHI you create, receive, maintain or transmit;
  2. Identify and protect against reasonably anticipated threats to the security or integrity of the information;
  3. Protect against reasonably anticipated, impermissible uses or disclosures; and
  4. Ensure workforce compliance.

Adherence to these requirements is generally demonstrated via a risk analysis that determines reasonable and appropriate security measures for protecting e-PHI, and by implementing administrative and technical safeguards consistent with the risk analysis (see http://www.hhs.gov/ocr/privacy/hipaa/understanding/srsummary.html for more info). These safeguards may include:

Administrative Safeguards

  • Implement security measures that reduce risks and vulnerabilities to a reasonable and appropriate level.
  • Limit uses and disclosures of PHI to the “minimum necessary.”
  • Appropriate training, authorization, and supervision of workforce members who work with e-PHI
  • Regular review and evaluation

Technical Safeguards

  • Implement technical policies and procedures that allow only authorized persons to access electronic protected health information.
  • Ensure that e-PHI is not improperly altered or destroyed. Electronic measures must be put in place to confirm that e-PHI has not been improperly altered or destroyed.
  • Implement technical security measures that guard against unauthorized access to e-PHI that is being transmitted over an electronic network.

So how do we do this? Many of these safeguards have long been in place as part of the SOPs and other controls we have for our staff and suppliers. The OpenClinica application itself enforces controls such as password policies, audit history, role-based access control, and user access logs. On top of these safeguards, what’s notable with OpenClinica Optimized Hosting are the specific controls surrounding this new hybrid/cloud-based hosting environment. Below are excerpts of our new Standard Operating Procedure associated with OpenClinica Optimized Hosting. The full SOP and supporting documentation are available as part of a compliance audit.

Excerpt from SOP-SA002 – Managing Hosted OpenClinica

7.1               Security

7.1.1                       Access to any customer instance is limited, via login credentials, to authorized customer users for the web interface only. Customers have no access to the server itself [except through defined application and programmatic interfaces].

7.1.2                       All OpenClinica employees are granted access only to computer and networking areas necessary to perform their duties.

7.1.3                       Each customer’s installation is separate, and cannot be accessed from any other customer installation.

7.1.4                       Connection to a hosted instance is encrypted by means of secure socket layer.

7.1.5                       Application server and database server are secured via firewall, hardened to remove nonessential access credentials, and strong password compliance.

7.1.6                       Hosted systems are constantly monitored for latencies and intrusion.

7.2.1     Installation qualification is performed on initial setup of the OpenClinica Optimized Hosting environment image, and documented in an IQ Report. Qualification items are checked by inspection, review of vendor documentation, or direct testing as appropriate; items are specified in the Installation Qualification Protocol.

7.2.2     Installation qualification for each customer instance is performed when configuring that instance, and is documented in an IQ Report. Qualification items are checked by inspection, or direct testing as appropriate.

We conduct qualification of our own IT practices and our data center provider to assure security, reliability, availability, performance, and data protection within our hosted services. Items reviewed include:

  • Data Center physical security procedures
  • Data center HVAC, power conditioning, and fire suppression systems
  • Disaster prevention and disaster recovery processes
  • Back-up and data retention procedures
  • Network redundancy
  • Firewalls
  • SSL certificate (encryption)
  • System and network monitoring (for latencies, intrusion, and failure prediction)
  • Load balancing

Our data center has a SAS 70 Type II security certification, a well-known security certification that originated from financial industry compliance requirements and aligns well with the requirements of the clinical trials industry. We regularly audit their policies and procedures in the context of our quality system, including review of the SAS 70 Type II audit report they provide. Our data center assures secure and reliable operation in part by maintaining appropriate physical resources at the facility. Fire suppression, conditioned power, and redundant HVAC all protect computing equipment against damage from extreme conditions, while physical access security and surveillance guard against unauthorized intrusion. The full report is available for our customers to review as part of a compliance audit.

The above are some highlights of our multi-tier strategy to ensure the highest level of security of critical clinical data while maintaining accessibility and ease-of-use. Like any good security strategy, we treat it within the company as a dynamic function, subject to regular review and assessment. We recognize our strategy must always be evolving to respond to emerging threats and new requirements. At the end of the day it is the combination of process and technology controls, and subjecting these controls to continual scrutiny, that leads to strong security.

- Cal Collins

Clinical Trials in the Cloud

I got a phone call the other day from a longtime OpenClinica user about the announcement of our new OpenClinica Optimized™ Hosting. He remarked on how leading companies in the industry (including his) are making big investments in cloud computing products and services, because these technologies provide easy-to-access functionality on an infrastructure that is more redundant, scalable, and cost-effective than you could hope to build or buy on your own.

However, in the clinical research field, putting together such an offering is not for the faint of heart. Though our free OpenClinica Community Edition has been installed and run by users on cloud servers for years, our OpenClinica Enterprise Edition offering (which carries regulatory guarantees) would have to meet rigorous reliability, security, and regulatory compliance requirements. How can this be accomplished if you don’t actually know where your data physically resides at any point in time on the cloud?

Prior to the launch of Optimized Hosting, we offered each hosted customer a dedicated server or a two-server (application + database) setup. This provided a certain peace of mind from knowing that your clinical data lives on a dedicated piece of hardware, but for many the costs were high, and the setup suffered from the inherent limitations of being tied to a physical machine. At the end of last year our data center partner achieved SAS 70 Type II certification for their cloud services, and we decided it was time to begin diligence on a cloud-based offering for OpenClinica.

We have spent the past 9 months listening to our customers’ needs and concerns, and designing and testing a solution. The resulting OpenClinica Optimized™ Hosting is an innovative hybrid architecture that provides the best of both worlds: the scalability, high availability, and flexibility of the cloud combined with the peace of mind that your data lives in purpose-built dedicated hardware.

In short, OpenClinica Optimized Hosting offers greater fault tolerance, with better scalability and performance, at a lower cost than alternatives. Here’s how it works:

Application

Each OpenClinica application instance is a cloud server cloned from an image that has been qualified according to our exacting installation instructions. We configure the instance according to the customer’s supplied configuration parameters and complete operational qualification (OQ). The instance is typically available and ready for production use within a day or two. Thanks to the cloud, computing resources are instantly scalable on-demand.

Database

Dedicated (non-cloud) high performance database machines are configured in a master/slave relationship to provide instant data replication and fault tolerance. By utilizing multiple slave databases located in different geographic regions, the OpenClinica Optimized Hosting database cluster is designed for zero data loss even if an entire site is lost. The servers use the fastest storage technology available today (Fusion-io®), dramatically improving database performance. For example, in our testing, we commonly see data extracts run up to 10x faster than in the prior environment. Database servers are isolated behind a Cisco ASA firewall and hardened to eliminate all nonessential access credentials.

Validation and Compliance

OpenClinica Optimized Hosting provides maximum flexibility and transparency in the area of change control and compliance. It has been constructed around a carefully designed set of controls to ensure all updates are fully tested (and documented) in the environment prior to release, and that customers can have upgrades and maintenance releases applied according to their individual schedules and priorities.

One of the great advantages of OpenClinica is the choice it offers – you can use and extend the open source licensed code, you can choose between OpenClinica Community Edition and OpenClinica Enterprise, you can deploy it locally or choose the hosted option. Or, any combination of the above. The new Optimized Hosting environment enhances that choice by providing a fast, reliable, and cost-effective way to get up and running with OpenClinica.

For more on security in OpenClinica Optimized Hosting, see Clinical Trials in the Cloud – Part II.

- Cal Collins

New! OpenClinica 3.1 “Gap” Training

We are excited to announce a new course designed to get you up to speed with the new changes in OpenClinica 3.1.  “OpenClinica 3.1 Gap Training” is a 4-hour course delivered via the web providing an in-depth overview of changes in existing functionality from OpenClinica v3.0 to v3.1 as well as formal instruction on the brand new features in 3.1.  Particular emphasis is placed on the new functionality for Dynamic logic and new Rules capabilities. Topics covered include:

  •     Dynamic Logic with Rules
  •     Simple Show/Hide in CRFs
  •     Additional Rules improvements
  •     Extract Data module changes
  •     Changes and enhancements to Notes & Discrepancies
  •     Changes in site level user to enhance access to “started” CRFs
  •     Rules Designer
  •     Data Mart
  •     Modularization

For course times, pricing, and registration, see https://www.openclinica.com/open-enrollment-training.

Course duration: approximately 4 hours
Delivery medium: online/web meeting

Note: While open to all, this course is intended for people who have received prior formal OpenClinica training and/or have significant experience with OpenClinica 3.0.

eClinical Integration

Increasingly I am seeing real momentum for reducing the costs and barriers to integration of eclinical applications and data in a way that benefits users.

A great example is a recent LinkedIn discussion (you may need to join the group to read it). Several software vendors and industry experts engaged in a dialogue about the pros and cons of different integration approaches. There is an emerging consensus that integration approaches should adopt open web standards and harness the elegance and flexibility of the CDISC Operational Data Model. This consensus may signal a sea change in attitudes to standards-based integration that makes it the norm rather than the exception.

This is not new to members of the OpenClinica community. Over the years we’ve had many examples of such integration efforts described on this blog and at OpenClinica conferences. To make such efforts more powerful, reusable, and robust, the OpenClinica team has invested a great deal over the past year to create a meaningful, CDISC ODM-based model for interacting with OpenClinica. We have incorporated open web standards (RESTful APIs for transport and OAuth for security) to make the interfaces easily accessible with commonly used software tools.  This is part of a newly published resource for OpenClinica development and integration, the OpenClinica 3.1 Technical Documentation Guide. The first version of the specification can be viewed at https://docs.openclinica.com/3.1/technical-documents/rest-api-specifications. I’ve reproduced the introduction here:

Overview

We are constantly looking at ways to make it possible (not to mention reliable and easy!) for users and developers to interact with and extend OpenClinica in a programmatic way. This can mean anything from data loading to more meaningful integrations of applications common to the clinical research environment.

As proponents of open, standards-based interoperability here at OpenClinica, our starting point is always to develop interfaces for these interactions based on the most successful, open, and proven methods in the history of technology – namely the protocols that power the World Wide Web (such as HTTP, SSL, XML, OAuth 2.0). They are relatively simple, extensively documented, widely understood, and well-supported out of the box in a large number of programming and IT environments. On top of this foundation, we rely heavily on the wonderful work of CDISC and the CDISC ODM to model and represent the clinical research protocol and clinical data.

This chapter describes a CDISC ODM-based way to interact with OpenClinica using RESTful APIs and OAuth. The REST web services API relies on HTTP, SSL, XML, and OAuth 2.0. This architecture makes the ODM study protocol representation for an OpenClinica study available and supports other interactions for study design.

Why REST?

The OpenClinica RESTful architecture was developed to (initially) support one particular use case, but with the intention of becoming more broadly applicable over time. This use case is based on a frequent request of end users: for OpenClinica to support a visual method for designing, editing, and testing “rules” which define edit checks, email notifications, skip pattern definitions, and the like to be used in OpenClinica CRFs. Users have had to learn how to write rules in XML, which can be confusing and involves a steep learning curve for non-technical individuals. The OpenClinica Rule Designer is an application that allows end users to build cross-field edit checks and dynamics within a GUI-based application. It is a centrally hosted, Software as a Service (SaaS) based application available for OpenClinica Enterprise customers at https://designer.openclinica.com.

To support interaction of the centrally hosted rule designer with any instance of OpenClinica Enterprise installed anywhere in the world, we needed to implement a secure protocol and set of API methods to allow exchange of study information between the two systems, and do so in a way where the user experience was as integrated as if these applications were part of the same integrated code base. In doing so, and by adopting the aforementioned web and clinical standards to achieve this, we have built an architecture that can be extended and adapted for a much more diverse set of uses.

This chapter specifies how 3rd party applications can interact with an OpenClinica instance via the REST API and OAuth security, and details the currently supported REST API methods. The currently supported API methods are not comprehensive, and you may get better coverage from our SOAP API. However the OpenClinica team is continuing to expand this API and since it is open source anyone may extend it to add new methods to meet their own purposes. If you do use the API in a meaningful way or if you extend the API with new methods, please let others know on the OpenClinica developers list (developers@openclinica.org), and submit your contributions for inclusion back into the codebase – you’ll get better support, increased QA, and compatibility with future OpenClinica releases.
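As a rough sketch of what a third-party call might look like, the Python below prepares (but does not send) an HTTP GET that carries an OAuth 2.0 bearer token and asks for XML. The base URL, the path, and the token are hypothetical placeholders, not actual OpenClinica endpoints; the technical subchapters of the specification define the real ones.

```python
import urllib.request

# Hypothetical values for illustration only. The real endpoint paths are
# defined in the technical subchapters of the REST API specification.
BASE_URL = "https://oc.example.org/OpenClinica"
STUDY_METADATA_PATH = "/rest/metadata/S_EXAMPLE"  # assumed, not an actual path
ACCESS_TOKEN = "example-access-token"  # would come from an OAuth 2.0 flow


def build_metadata_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Prepare a GET request for ODM XML, authorized with a bearer token."""
    request = urllib.request.Request(base_url + path, method="GET")
    # OAuth 2.0 bearer tokens travel in the Authorization header.
    request.add_header("Authorization", f"Bearer {token}")
    # ODM is XML, so ask for an XML representation of the resource.
    request.add_header("Accept", "application/xml")
    return request


request = build_metadata_request(BASE_URL, STUDY_METADATA_PATH, ACCESS_TOKEN)
print(request.get_method(), request.full_url)
```

Sending the request (for example with urllib.request.urlopen) is left out on purpose, since the host above does not exist.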

RESTful Representation, based on ODM

“REST”, an acronym for REpresentational State Transfer, describes an architectural style that allows definition and addressing of resources in a stateless manner, primarily through the use of Uniform Resource Identifiers (URIs) and HTTP.

From Wikipedia: A RESTful web service (also called a RESTful web API) is a simple web service implemented using HTTP and the principles of REST. It is a collection of resources, with three defined aspects:

  • the base URI for the web service, such as http://example.com/resources/
  • the Internet media type of the data supported by the web service. This is often JSON, XML or YAML but can be any other valid Internet media type.
  • the set of operations supported by the web service using HTTP methods (e.g., POST, GET, PUT or DELETE).

REST is also a way of looking at the world, as eloquently articulated by Ryan Tomayko.

In the context of REST for clinical research using OpenClinica, we can conceptually think of an electronic case report form (CRF) as a resource that is essentially a bunch of metadata modeled in CDISC ODM with OpenClinica extensions:

  • Some of this metadata (data type, item name, response set, etc) is intrinsic metadata – i.e. tied to the definition of the CRF and its items and mostly unchangeable after it is initially defined.
  • Some of this metadata is representation metadata and used only when the CRF is represented as a web-based HTML form (in the OpenClinica database schema we call this form_metadata, but it also can include other things like CRF version information and rules).

An OpenClinica Event CRF is that same bunch of metadata with the corresponding item data, plus references to the study subject, event definition, CRF version, event ordinal, etc that it pertains to.

  • The notion of a CRF version pertains to the representation of the CRF. It is not intrinsic to the event CRF (this is debatable but it is how OpenClinica models CRFs). Theoretically you should be able to address and view any Event CRF in any available version of the CRF (e.g. http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/edit and http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v2/edit both show you the same data represented in different versions of the CRF). Of course the audit history needs to clearly show which version/representation of the CRF was used for key events such as data capture, signature, etc.
  • Rules are also part of the representation metadata as opposed to intrinsic metadata, even though you don’t need to specify them on a version-by-version basis.
  • Anything attached to the actual event CRF object or its item data – discrepancy notes, audit trails, signatures, SDV performance, etc is part of that event data and should be addressable in the same manner (e.g. http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/GROUPOID/ORDINAL/ITEM…)
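The addressing idea above can be sketched in a few lines of Python. This only illustrates the conceptual URL patterns: the class name, its field names, and the IG_VITALS and I_WEIGHT identifiers are hypothetical, and the actual URLs are the ones given in the technical subchapters of the specification.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class EventCrfAddress:
    """Conceptual address of an Event CRF, or of one item inside it."""
    study: str
    subject: str
    event: str
    form: str
    version: str  # representation only, not part of the data's identity
    group_oid: Optional[str] = None
    ordinal: Optional[int] = None
    item_oid: Optional[str] = None

    def data_key(self) -> Tuple[str, str, str, str]:
        """Identify the underlying Event CRF data, ignoring the CRF version."""
        return (self.study, self.subject, self.event, self.form)

    def to_path(self) -> str:
        """Join the segments in the order used by the conceptual URLs."""
        parts = [self.study, self.subject, self.event, self.form, self.version]
        has_item = None not in (self.group_oid, self.ordinal, self.item_oid)
        if has_item:
            parts += [self.group_oid, str(self.ordinal), self.item_oid]
        return "/".join(parts)


BASE = "http://oc/RESTpath"  # conceptual base taken from the examples above

v1 = EventCrfAddress("StudyA", "Subj1234", "VisitA", "FormB", "v1")
v2 = EventCrfAddress("StudyA", "Subj1234", "VisitA", "FormB", "v2")
item = EventCrfAddress("StudyA", "Subj1234", "VisitA", "FormB", "v1",
                       group_oid="IG_VITALS", ordinal=1, item_oid="I_WEIGHT")

print(f"{BASE}/{v1.to_path()}/edit")
print(f"{BASE}/{v2.to_path()}/edit")
print(f"{BASE}/{item.to_path()}")
```

Keeping the version out of data_key() mirrors the point made in the first bullet: two URLs that differ only in CRF version address the same Event CRF data.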

In this conceptual view of the world, CRFs (as well as CRF items, studies, study events, etc.) are RESTful resources with core, intrinsic properties and then some other metadata that has to do with how they are presented in a particular representation. We now have a model that allows us a great deal of flexibility and adaptability. We can support multiple modalities, with different representation metadata for rendering the same form, or perhaps the shared representation metadata but applied in a different way. We can address any part of the CRF in an atomic manner. This approach has been successfully applied in the Rule Designer, which takes the ODM study metadata and allows browse of the study CRFs and items, with the ability to drag and drop those resources into rule expressions. Here are some examples of additional future capabilities that could be easily realized on top of this architecture:

  • Multiple data entry modalities – a user may need to deploy patient based data entry via web, a tablet, a thick client, or even paper/OCR, each with a very different presentation. Each of these may be part of OpenClinica-web or a separate application altogether, but all will rely on the same resource metadata to represent the CRF (according to the UI + logic appropriate for that modality), and use the same REST-based URL and method for submitting/validating the data.
  • Apply a custom view (an XSL or HTML/CSS) to a patient event CRF or full casebook – some uses of this could be to represent as a PDF casebook, show with all audit trails/DNs embedded in line with the CRF data, show a listing of data for that subject, provide (via an XSL mapping) as an XForm or HL7 CCD document for use by another application) – http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/view?renderer=somemap…
  • The same path used in the URLs, eg http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/GROUPOID/ORDINAL/ITEMOID could be used as the basis for XPath expressions operating on ODM XML representations of CRFs and of event CRF data
  • Internationalization – OpenClinica ought to allow our CRF representation metadata to have an additional sub-layer to render the form in different languages, and then automatically show the appropriate language based on client/server HTTP negotiation (like we do with the rest of the app). Currently internationalization of CRFs requires versioning the CRF.
  • View CRF & Print CRF – use the same representation metadata (form metadata) but apply slightly different rules on how the presentation works (text values instead of form fields, no buttons, turn drop down lists into text values)
  • Discrepancy manager popup – one requested use case would allow a user to update a single event CRF item data value directly from the discrepancy note UI point of view. In this case you could think of just updating that one item as addressing the resource http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/GROUPOID/ORDINAL/ITEM…. In this model, whatever rules and presentation metadata need to get applied at presentation and save time happen automatically.
  • Import of CDISC ODM XML files – imported data would be processed through the same model, but only use the metadata that’s relevant to the data import modality. Same for data coming in as raw ODM XML via a REST web service. A lot of times the import only populates one part of a CRF and the other parts are expected to be finished via data entry. This model would help us manage that process better than the current implementation of ODM data import.
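To make the XPath idea from the list above more concrete, here is a minimal Python sketch that turns a conceptual path into an XPath expression over an ODM 1.3 ClinicalData fragment. The sample XML, the OIDs, and the function names are invented for illustration, and the sketch assumes the CRF version segment is ignored when locating data, in line with the view that the version belongs to the representation.

```python
import xml.etree.ElementTree as ET

ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Hypothetical ODM fragment; real exports carry more attributes and elements.
SAMPLE_ODM = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="StudyA" MetaDataVersionOID="v1">
    <SubjectData SubjectKey="Subj1234">
      <StudyEventData StudyEventOID="VisitA">
        <FormData FormOID="FormB">
          <ItemGroupData ItemGroupOID="IG_VITALS" ItemGroupRepeatKey="1">
            <ItemData ItemOID="I_WEIGHT" Value="72"/>
            <ItemData ItemOID="I_HEIGHT" Value="180"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>"""


def path_to_xpath(path: str) -> str:
    """Map Study/Subject/Event/Form/Version/Group/Ordinal/Item to an XPath."""
    study, subject, event, form, _version, group, ordinal, item = path.strip("/").split("/")
    return (
        f"odm:ClinicalData[@StudyOID='{study}']"
        f"/odm:SubjectData[@SubjectKey='{subject}']"
        f"/odm:StudyEventData[@StudyEventOID='{event}']"
        f"/odm:FormData[@FormOID='{form}']"
        f"/odm:ItemGroupData[@ItemGroupOID='{group}'][@ItemGroupRepeatKey='{ordinal}']"
        f"/odm:ItemData[@ItemOID='{item}']"
    )


def find_item_value(odm_xml: str, path: str):
    """Return the Value attribute of the addressed ItemData, or None."""
    root = ET.fromstring(odm_xml)
    element = root.find(path_to_xpath(path), ODM_NS)
    return None if element is None else element.get("Value")


print(find_item_value(SAMPLE_ODM, "StudyA/Subj1234/VisitA/FormB/v1/IG_VITALS/1/I_WEIGHT"))
```

The same expression could be reused for other attachments to an item if they were modeled as child elements, though how those are represented depends on the OpenClinica ODM extensions.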

There are many considerations related to user roles and permissions, workflows, and event CRF/item data status attributes that need to be overlaid on top of this REST model, but the model itself is a conceptually useful way to think about clinical trials and the information represented therein. When implemented using CDISC ODM XML syntax it becomes even more powerful. As widespread support for ODM becomes the norm, the barriers to true interoperability – shared, machine readable study protocol definitions, and robust, real-time, ALCOA-compliant exchange of clinical data and metadata that aligns with users’ business processes – get eviscerated.

* This chapter frequently refers to ODM-based representations of study metadata and clinical data in OpenClinica. We strive as much as possible to implement ODM-based representations of OpenClinica metadata and data according to the generic ODM specifications (currently using ODM version 1.3). However, to ensure our representations support the full richness of information used in OpenClinica we often have to rely on ODM’s vendor extensions capability. We have not always made distinctions in this chapter as to where we are using ‘generic’ ODM versus OpenClinica extensions, but that is documented here. It is our goal as ODM matures and supports richer representations of study information to migrate our extensions back into the generic ODM formats.

** Also note the RESTful URL patterns referred to above are conceptual. Refer to the technical subchapters of this REST API specification for the actual URLs.

The spec (like much of the code that implements it) is open source. I’m looking forward to hearing comments and feedback, and sharing thoughts on how we can encourage broader adoption across different types of eclinical applications.

Sign-up for the OpenClinica 3.1 Webinar

Join us for a free webinar presentation and demonstration of the newly released OpenClinica 3.1. This is a great opportunity to see the technology firsthand and ask questions.

To accommodate different time zones, the same webinar will be run at two different times and dates. You may use the links below to sign up:

  • Thu, Aug 11, 2011 12:00 PM – 1:30 PM EDT (GMT -4:00)  – Register
  • Wed, Sep 14, 2011 9:00 PM – 10:30 PM EDT (GMT -4:00) – Register

Space is limited, so sign-up today!

Thoughts on Code: OpenClinica and Open Standards with CDISC

One of the strengths of open source is the ability to open up the code base and learn by reading and doing, that is, the transparency of the code base allows everyone to get involved. However, the barrier to entry can be the complexity of the code itself; without a qualified guide, you can get ‘lost in the code jungle’ pretty quickly.

Welcome to our code

With that in mind, we are starting today to author blog posts about the OpenClinica code base, including topics like how the code is organized, what the code does, and so on. A lot more detail on this can be found on the OpenClinica Developer Wiki, but these posts, viewed as a whole, can be seen as a gentle introduction, before interested parties start to dive deeper.

When we began to design OpenClinica, we had very few requirements beyond the desire to create a fully-featured database for clinical data, aligned with open standards, making use of the best technology available. Call it the ‘tyranny of the blank page’, if you will. Every start-up faces it. Where do you start? What’s the plan? How do you build it, and what do you build first?

Luckily for us, we could base our schema and our code on an open standard: the CDISC ODM.

What’s a CDISC ODM?

The Operational Data Model, or ODM for short, is a standard published by the Clinical Data Interchange Standards Consortium (CDISC), and is “designed to facilitate the archive and interchange of the metadata and data for clinical research”, as stated on their website. This standard is designed to a) hold metadata about a Study and all Events contained within a given Study, and b) hold Clinical Data which has been collected for a given Study. All of this information is held in XML, which is a very useful format for exchange between sites, labs and institutions.

Figure 1: Study Metadata and OpenClinica

In the above image, you can see an XML file on one side using CDISC ODM and on the other side, an OpenClinica database. Inside the database are tables that map directly to different objects described in the XML. You’ll notice that the tables associated with study metadata also have a column called ‘oc_oid’, which holds the Object Identifiers we use in all aspects of the OpenClinica application.

Figure 2: ODM Clinical Data and OpenClinica

In the second image, you see that the latter half of the XML file (the part contained in the <ClinicalData> tags) also links to specific tables in the OpenClinica database. Since we link back to the Study metadata through those OIDs, we don’t use OIDs in those tables; instead, the conventional primary keys and foreign keys in the database are good enough.
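As a rough illustration of that mapping, the Python sketch below reads a small ODM <ClinicalData> fragment and produces rows that point at metadata through integer keys rather than OIDs. The XML content, the row layouts, and the ITEM_PK_BY_OID lookup (standing in for a metadata table keyed by oc_oid) are all hypothetical and are not the actual OpenClinica schema.

```python
import xml.etree.ElementTree as ET
from itertools import count

NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Stand-in for a metadata table: primary key of each item, looked up by its OID.
ITEM_PK_BY_OID = {"I_WEIGHT": 101, "I_HEIGHT": 102}

ODM_XML = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="S_EXAMPLE" MetaDataVersionOID="v1">
    <SubjectData SubjectKey="SS_001">
      <StudyEventData StudyEventOID="SE_VISIT1">
        <FormData FormOID="F_VITALS">
          <ItemGroupData ItemGroupOID="IG_VITALS">
            <ItemData ItemOID="I_WEIGHT" Value="72"/>
            <ItemData ItemOID="I_HEIGHT" Value="180"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>"""


def clinical_data_to_rows(odm_xml: str):
    """Flatten ClinicalData into subject rows and item data rows."""
    root = ET.fromstring(odm_xml)
    subject_ids, item_data_ids = count(1), count(1)
    subject_rows, item_data_rows = [], []
    for subject in root.iterfind("odm:ClinicalData/odm:SubjectData", NS):
        subject_id = next(subject_ids)  # surrogate primary key
        subject_rows.append({"subject_id": subject_id, "label": subject.get("SubjectKey")})
        item_path = "odm:StudyEventData/odm:FormData/odm:ItemGroupData/odm:ItemData"
        for item in subject.iterfind(item_path, NS):
            item_data_rows.append({
                "item_data_id": next(item_data_ids),
                "subject_id": subject_id,                        # foreign key to the subject row
                "item_id": ITEM_PK_BY_OID[item.get("ItemOID")],  # OID resolved to a metadata key
                "value": item.get("Value"),
            })
    return subject_rows, item_data_rows


subject_rows, item_data_rows = clinical_data_to_rows(ODM_XML)
for row in item_data_rows:
    print(row)
```

The OID appears only at the boundary, when the XML is read; once resolved, the data rows carry plain keys.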

OK, so they map. But where’s the beef?

Of course, the ODM XML in the images is rather simple, and does not capture the full capability of the metadata that can be passed back and forth between different ODM data sources. For a longer example, you can take a look at the following XML, which defines the Rules governing a single Item:

Sample ItemDef in CDISC ODM XML

As you start to piece together the XML in the above example, you’ll see that not only can you define the Question in multiple languages, but you can specify which measurement it is using and what kinds of values you can accept.  The XML standard is extensible enough to add other pieces of information as well, including coded lists, data types, and so on.  More information can be found at XML4Pharma’s page entitled, ‘Using CDISC-ODM in EDC.’
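To show how those pieces can be read programmatically, here is a minimal Python sketch that parses a small ItemDef and applies its range checks. The ItemDef below is a hypothetical one written for this illustration (its OIDs, question text, and limits are invented), and the function names are likewise assumptions; only the element and attribute names come from ODM 1.3.

```python
import operator
import xml.etree.ElementTree as ET

NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
COMPARATORS = {"LT": operator.lt, "LE": operator.le, "GT": operator.gt, "GE": operator.ge}

# Hypothetical ItemDef: a weight question in two languages with soft limits.
ITEM_DEF_XML = """<ItemDef xmlns="http://www.cdisc.org/ns/odm/v1.3"
         OID="I_WEIGHT" Name="Weight" DataType="float" Length="5" SignificantDigits="1">
  <Question>
    <TranslatedText xml:lang="en">Body weight</TranslatedText>
    <TranslatedText xml:lang="fr">Poids corporel</TranslatedText>
  </Question>
  <MeasurementUnitRef MeasurementUnitOID="MU_KG"/>
  <RangeCheck Comparator="GE" SoftHard="Soft"><CheckValue>30</CheckValue></RangeCheck>
  <RangeCheck Comparator="LE" SoftHard="Soft"><CheckValue>250</CheckValue></RangeCheck>
</ItemDef>"""


def parse_item_def(xml_text: str) -> dict:
    """Extract the question texts, unit reference, and range checks."""
    item = ET.fromstring(xml_text)
    questions = {
        text.get(XML_LANG): text.text
        for text in item.iterfind("odm:Question/odm:TranslatedText", NS)
    }
    unit = item.find("odm:MeasurementUnitRef", NS)
    checks = [
        (check.get("Comparator"), float(check.findtext("odm:CheckValue", namespaces=NS)))
        for check in item.iterfind("odm:RangeCheck", NS)
    ]
    return {
        "oid": item.get("OID"),
        "data_type": item.get("DataType"),
        "questions": questions,
        "unit_oid": None if unit is None else unit.get("MeasurementUnitOID"),
        "checks": checks,
    }


def failed_checks(value: float, checks: list) -> list:
    """Return the (comparator, limit) pairs that the value does not satisfy."""
    return [(name, limit) for name, limit in checks if not COMPARATORS[name](value, limit)]


item_def = parse_item_def(ITEM_DEF_XML)
print(item_def["questions"])
print(failed_checks(300.0, item_def["checks"]))
```

Only the four ordering comparators are handled here; ODM also defines EQ, NE, IN, and NOTIN, which a fuller version would need to cover.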

In future posts, we hope to describe more about the code base, and show how it all comes together as a full-featured application. If there are topics that are of specific interest, we hope you’ll comment below and let us know what you’d like to see here in the coming months.
