Thanks for making me laugh, Jackie 12182!
Data Management: Queries in clinical trials
Thanks for making me laugh, Jackie 12182!
Data Management: Queries in clinical trials
Sponsors of clinical research must increasingly focus on improving patient engagement in order to meet many of today’s research challenges. Promising disruptions are already under way that could define new models for patient recruitment and retention.
In a time when drug development success is becoming scarcer and more expensive, the industry is looking everywhere it can for new, innovative approaches to improving health. Meeting recruitment goals is one of the biggest challenges for traditional clinical research. Less than one-third of people who come in for a screening end up completing a clinical trial.1 Thinking in a more patient-centric manner can help is in recruiting patients. A fundamental idea behind patient-centered research is to “amplify the patient’s role in the research process.”2 Employing new ways to engage patients and physicians while increasing their level of knowledge and trust can improve the sponsor’s ability to meet recruitment goals.
One often overlooked factor for study participation and retention is convenience. Raising the level of convenience for both the investigator and participant can eliminate a huge obstacle to non-participation or non-completion. There are many ways to incorporate increased levels of patient and physician convenience into trial design and execution, particularly using Internet-based technologies. For instance, social media can be an effective recruiting tool and an important way to build trust with targeted populations. Disease-specific online communities are becoming more and more prominent for chronic diseases. Matchmaking tools act as mediators that draw together researchers and participants. “Traditional” social media offers a less targeted, but no less effective, way to engage patients and investigators.
In general, the four key determinants of a person’s likelihood to participate in a trial are prior participation in research, existing relationships with researchers, involvement of trusted leaders, and trust in the organization. Keys to recruiting success in social media should keep these determinants in mind, and engage communities in a thoughtful, ethical way while respecting the norms of the community you are targeting.
Participant retention post-recruitment can be improved by strengthening the connections between participants and researchers, and enhancing communication structures to support these relationships.3 Capturing Patient Reported Outcomes electronically (ePRO), through the web or mobile devices, offers a way to interact with the participant in a meaningful way while also capturing critical data. For instance, offering the ePRO user risk scores and health recommendations based on their data, or using gamification techniques to increase protocol adherence, can enhance the traditional ePRO experience by offering direct, immediate value to the user. Enabling a “Bring Your Own Device” (BYOD) strategy can increase convenience for populations who already own their own smartphones or tablets. Of course, the study design and applicable regulatory considerations should drive when and how these techniques are used.
Increased focus on the patient experience is not a phenomenon unique to research, but something that is rapidly permeating healthcare systems. These rapid changes can enhance research engagement. There is enormous potential to capture far more robust data and have better follow up than ever before as widespread infrastructure is put in place for coordinated team-based care, home-based continuous monitoring, and wireless data reporting from medical devices. The (still elusive) promise of using the Electronic Health Record system in research to identify participants and capture clean, accurate trial data is more critical than ever before. As medical practices become more electronic and less paper-driven, investigators and staff should be engaged by providing them trial-specific information at the points in their workflow when they can best make use of it. Conversely, requiring them to go outside the workflows and systems they use in routine practice creates complexity and hassle that can deter research participation. A new level of integration between research and health data systems, based on standards (which exist) and open interfaces (which are coming, as part of Meaningful Use), will be necessary to make good on this potential.
As difficult research questions drive increased complexity in trial designs, many feel that the answer is to use technology in simple, scalable ways to engage more participants in research and capture more data. Dr. Russ Altman, a physician and Stanford professor recently told the New York Times, “There’s a growing sense in the field of informatics that we’ll take lots of data in exchange for perfectly controlled data. You can deal with the noise if the signal is strong enough.”4
1. Getz, Ken, The Gift of Participation: A Guide to Making Informed Decisions About Volunteering for a Clinical Trial, 2007, p40.
2. Pignone, Michael, MD, MPH, Challenges to Implementing Patient-Centered Research, Ann Intern Med. 18 September 2012;157(6):450-451
3. Nicholas Anderson, Caleb Bragg, Andrea Hartzler, Kelly Edwards, Participant-centric initiatives: Tools to facilitate engagement in research, Applied & Translational Genomics, Volume 1, 1 December 2012, Pages 25-29, ISSN 2212-0661, 10.1016/j.atg.2012.07.001.
OpenClinica is a clinical trial software platform that aims to provide data capture, data management, and operations management functionality to support human subjects-based research. It can be used for traditional clinical trials as well as a wide variety of other types of human subjects-based research.
Our vision for the product is to provide data capture, data management, and operations management functionality out-of-the-box, in an easily configurable, usable, and highly reliable manner. The underlying platform should be interoperable, modular, extensible, and familiar – so users can solve specific problems, in a generalizable way.
This past spring, the development team here at OpenClinica, LLC held a series of round table discussions about how this vision is reflected in the product. Our goals were to learn critical standards and information models needed for our technology to truly reflect this vision, to develop a consistent, shared vocabulary for the problem domain and the OpenClinica technology, and identify the most urgent opportunities to put these lessons into practice in the product and the community. In particular, we spent a lot of the time in these discussions about how OpenClinica’s use of the CDISC Operational Data Model helps enable this vision.
The discussions were invigorating and thought-provoking. We’ve recorded them to share with the greater community of OpenClinica developers, integrators, and others who want to better understand how the technology works, the design philosophy behind it, and where we’re going in the future. The videos are embedded below.
But before getting to the videos, here’s a bit more background on how we think about OpenClinica as a product and a platform.
First, OpenClinica functionality should be ready out-of-the-box, easily configurable and highly usable. Some of the most important features include:
Many of these things have already been implemented, and more are under development.
The core concept around which OpenClinica is organized is the electronic case report form (CRF). In OpenClinica, a CRF is a resource that is essentially a bunch of metadata modeled in CDISC ODM with OpenClinica extensions. It doesn’t (necessarily) have to correspond to a physical or virtual form; it may represent a lab data set or something similar. An OpenClinica Event CRF is that same bunch of metadata populated with actual data about a particular study participant. Thus, it combines the metadata with the corresponding item (field) data, with references to the study subject, event definition, CRF version, and event ordinal that it pertains to. In this conceptual view of the world, CRFs (as well as CRF items, studies, study events, etc.) are resources with core, intrinsic properties and then some other metadata that has to do with how they are presented in a particular representation. Built around these core resources are all the workflow, reporting, API, security, and other mechanisms that allow OpenClinica to actually save you time and increase accuracy in your research.
Second, OpenClinica should be interoperable. The ultimate measure of interoperability is having shared, machine readable study protocol definitions, and robust, real-time, ALCOA-compliant exchange of clinical data and metadata that aligns with user’s business processes. It should be easy to plug in and pull out or mix-and-match different features, such as forms, rules, study definitions, report/export formats, and modules, to transport them across OpenClinica instances or interact with other applications. Establishing well defined methods and approaches for integration into existing health data environments is a key goal of interoperability.
Third, OpenClinica should be modular and extensible. OpenClinica already provides some of the most common data capture and data management components and aims to have a very broad selection of input types, rules, reports, data extracts, and workflows. However OpenClinica developers should also have the freedom to come up with their own twist on a workflow, module, or data review workflow and have it work easily and relatively seamlessly with the rest of OpenClinica. User identification, authentication, and authorization should be highly configurable and support commonly used general purpose technologies for user credentialing and single-sign-on (such as LDAP & OAuth).
The CRF-centric model allows us a great deal of flexibility and extensibility. We can support multiple modalities, with different representation metadata for rendering the same form, or perhaps the shared representation metadata but applied in a different way (i.e. web browser vs. mobile vs. import job). We can address any part of the CRF in an atomic, computable manner. This approach has been successfully applied in the Rule Designer, which takes the ODM study metadata and allows browsing of the study CRFs and items, with the ability to drag and drop those resources to form rule expressions. Features such as rules and report/export formats are represented as XML documents. These documents define how the features behave in standardized ways so that one rule can, say, be easily replaced with another rule without having to modify all the code that makes use of the rule.
Finally, OpenClinica aims to be familiar in the sense of allowing data managers, developers, statisticians to work in a design/configuration/programming environment that they already know. Programmers don’t all have the same experience, and it would be somewhat limiting to force OpenClinica developers to all use the same language (Java) that OpenClinica was written in. We are constantly looking at ways to make it possible (not to mention reliable and easy!) for users and developers to interact with and extend OpenClinica in a programmatic way. This can mean anything from data loading to more meaningful integrations of applications common to the clinical research environment. As proponents of open, standards-based interoperability, our starting point is always to develop interfaces for these interactions based on the most successful, open, and proven methods in the history of technology – namely the protocols that power the World Wide Web (such as HTTP, SSL, XML, OAuth 2.0). They are relatively simple, extensively documented, widely understood, and well-supported out of the box in a large number of programming and IT environments. On top of this foundation, we rely heavily on the wonderful work of CDISC and the CDISC ODM to model and represent the clinical research protocol and clinical data.
Session 1: from 30-March-2012 (start at the 5 min 20 sec mark)
Session 2: from 06-April-2012 (start at the 1 min 25 sec mark)
A recent discussion on the OpenClinica users mailing list centered around making OpenClinica more optimal for small, low-budget academic research studies, and how open source community participation is helping OpenClinica to meet those users’ needs.
Regarding OpenClinica’s target audience: OpenClinica is used in a tremendous variety of studies and organizations. Many of these are paying customers of OpenClinica LLC (f/k/a Akaza Research) who are running large GCP compliant, multi-site clinical trials. Clearly we as a company have an obligation and incentive to support them and provide software that fits their needs.
But I also am committed to making OpenClinica (the technology) successful and widely used. We want OpenClinica to power as many studies as possible, both small and large. If you’re determined to scratch down some data into Excel I won’t stop you. But since you’re reading this you likely know the arguments against doing so, and want a more robust solution for capturing and managing your clinical research data. We want lots of OpenClinica researchers, entrepreneurs and service providers to thrive in a growing ecosystem. If you are a researcher with no budget and you need OpenClinica to do X, me and my staff are ready and willing to prioritize getting feature X into a release if we have meaningful participation from community members to get that feature designed, coded, documented, tested, etc. Every participant in the OpenClinica community has the ability to make those contributions and/or to mobilize community participation to get their feature defined and developed. It’s not benevolence or volunteer work – with open source software the benefit you get out is proportional to the investment you put in.
There are growing examples of this participation. But more is needed to truly realize OpenClinica’s potential. As a community, we need to ask: (1) How do we grow participation? (2) How do we make it easier to participate? and (3) What can you contribute?
The other day I posted an overview of the new OpenClinica Optimized Hosting offering. Since then we have received requests for more detail on how we secure the data in a customer’s OpenClinica instance against unauthorized access. This is obviously a very important topic!
The particular questions were asked in the context of HIPAA–particularly the HIPAA Security Rule–and the answer below is framed in this context. But even if HIPAA is not relevant to you (because you have no PHI in your OpenClinica instance, you’re not part of a covered entity, or you’re outside the U.S.), the safeguards described below are generally applicable best practices and can be applied in the context of most security compliance/regulatory regimes.
In general the requirements of the HIPAA Security Rule can be summed up as:
Adhering to these requirements is generally demonstrated via a risk analysis that determines reasonable and appropriate security measures for protecting ePHI, and implementing administrative and technical safeguards consistent with the risk analysis (see http://www.hhs.gov/ocr/privacy/hipaa/understanding/srsummary.html for more info). These safeguards may include:
So how do we do this? Many of these safeguards have long been in place as part of the SOPs and other controls we have for our staff and suppliers. The OpenClinica application itself enforces controls such as password policies, audit history, role based access control, and user access log. On top of these safeguards, what’s notable with OpenClinica Optimized Hosting are the specific controls surrounding this new hybrid/cloud-based hosting environment. Below are excerpts of our new Standard Operating Procedure associated with OpenClinica Optimized Hosting. The full SOP and supporting documentation are available as part of a compliance audit.
Excerpt from SOP-SA002 – Managing Hosted OpenClinica
7.1.1 Access to any customer instance is limited, via login credentials, to authorized customer users for the web interface only. Customers have no access to the server itself [except through defined application and programmatic interfaces].
7.1.2 All OpenClinica employees are granted access only to computer and networking areas necessary to perform their duties.
7.1.3 Each customer’s installation is separate, and cannot be accessed from any other customer installation.
7.1.4 Connection to a hosted instance is encrypted by means of secure socket layer.
7.1.5 Application server and database server are secured via firewall, hardened to remove nonessential access credentials, and strong password compliance.
7.1.6 Hosted systems are constantly monitored for latencies and intrusion.
7.2.1 Installation qualification is performed on initial setup of the OpenClinica Optimized Hosting environment image, and documented in an IQ Report. Qualification items are checked by inspection, review of vendor documentation, or direct testing as appropriate; items are specified in the Installation Qualification Protocol.
7.2.2 Installation qualification for each customer instance is performed when configuring that instance, and is documented in an IQ Report. Qualification items are checked by inspection, or direct testing as appropriate.
We conduct qualification of our own IT practices and our data center provider to assure security, reliability, availability, performance, and data protection within our hosted services. Items reviewed include:
Our data center has a SAS 70 Type II security certification, a well known security certification that originated from financial industry compliance requirements and aligns well with the requirements of the clinical trials industry. We regularly audit their policies and procedures in the context of our quality system, including review of the SAS 70 Type II audit report they provide. Our data center assures secure and reliable operation in part by maintaining appropriate physical resources at the facility. Fire suppression, conditioned power, and redundant HVAC all protect computing equipment against damage from extreme conditions, while physical access security and surveillance guard against unauthorized intrusion. The full report is available for our customers to review as part of a compliance audit.
The above are some highlights of our multi-tier strategy to ensure the highest level of security of critical clinical data while maintaining accessibility and ease-of-use. Like any good security strategy, we treat it within the company as a dynamic function, subject to regular review and assessment. We recognize our strategy must always be evolving to respond to emerging threats and new requirements. At the end of the day it is the combination of process and technology controls, and subjecting these controls to continual scrutiny, that leads to strong security.
– Cal Collins
I got a phone call the other day from a longtime OpenClinica user about the announcement of our new OpenClinica Optimized™ Hosting. He remarked on how leading companies in the industry (including his) are making big investments in cloud computing products and services, because these technologies provide easy-to-access functionality on an infrastructure that is more redundant, scalable, and cost-effective than you could hope to build or buy on your own.
However, in the clinical research field, putting together such an offering is not for the faint of heart. Though our free OpenClinica Community Edition has been installed and run by users on cloud servers for years, our OpenClinica Enterprise Edition offering (which carries regulatory guarantees) would have to meet rigorous reliability, security, and regulatory compliance requirements. How can this be accomplished if you don’t actually know where your data physically resides at any point in time on the cloud?
Prior to the launch of Optimized Hosting, we offered each hosted customer a dedicated server or two server (application + database) setup. This provided a certain peace of mind from knowing that your clinical data lives on a dedicated piece of hardware, but for many the costs were high and suffered from the inherent limitations of being tied to a physical machine. At the end of last year our data center partner achieved SAS 70 Type II certification for their cloud services, and we decided it was time to begin diligence on a cloud-based offering for OpenClinica.
We have spent the past 9 months listening to our customers’ needs and concerns, a designing and testing a solution. The resulting OpenClinica Optimized™ Hosting is an innovative hybrid architecture that provides the best of both worlds: the scalability, high availability, and flexibility of the cloud combined with the peace of mind that your data lives in purpose-built dedicated hardware.
In short, OpenClinica Optimized Hosting offers greater fault tolerance, with better scalability and performance, at a lower cost than alternatives. Here’s how it works:
Each OpenClinica application instance is a cloud server cloned from an image that has been qualified according to our exacting installation instructions. We configure the instance according to the customer’s supplied configuration parameters and complete operational qualification (OQ). The instance is typically available and ready for production use within a day or two. Thanks to the cloud, computing resources are instantly scalable on-demand.
Dedicated (non-cloud) high performance database machines are configured in a master/slave relationship to provide instant data replication and fault tolerance. By utilizing multiple slave databases located in different geographic regions, the OpenClinica Optimized Hosting database cluster is designed for zero data loss even in event of nuclear strike. The servers use the fastest hard disk technology available today (Fusion-io®), dramatically improving database performance. For example, in our testing, we commonly see data extracts run up to 10x faster than in the prior environment. Database servers are physically isolated via CISCO ASA firewall to eliminate all nonessential access credentials.
Validation and Compliance
OpenClinica Optimized Hosting provides maximum flexibility and transparency in the area of change control and compliance. It has been constructed around a carefully designed set of controls to ensure all updates are fully tested (and documented) in the environment prior to release, and that customers can have upgrades and maintenance releases applied according to their individual schedules and priorities.
One of the great advantages of OpenClinica is the choice it offers – you can use and extend the open source licensed code, you can choose between OpenClinica Community Edition and OpenClinica Enterprise, you can deploy it locally or choose the hosted option. Or, any combination of the above. The new Optimized Hosting environment enhances that choice by providing a fast, reliable, and cost-effective way to get up and running with OpenClinica.
For more on security in OpenClinica Optimized Hosting, see Clinical Trials in the Cloud – Part II.
– Cal Collins
Increasingly I am seeing real momentum for reducing the costs and barriers to integration of eclinical applications and data in a way that benefits users.
A great example is a recent LinkedIn discussion (you may need to join the group to read it). Several software vendors and industry experts engaged in a dialogue about the pros and cons of different integration approaches. There is an emerging consensus that integration approaches should adopt open, web standards and harnesses the elegance and flexibility of the CDISC Operational Data Model. This consensus may signal a sea change in attitudes to standards-based integration that makes it the norm rather than the exception.
This is not new to members of the OpenClinica community. Over the years we’ve had many examples of such integration efforts described on this blog and at OpenClinica conferences. To make such efforts more powerful, reusable, and robust, the OpenClinica team has invested a great deal over the past year to create a meaningful, CDISC ODM-based model for interacting with OpenClinica. We have incorporated open web standards (RESTful APIs for transport and OAuth for security) to make the interfaces easily accessible with commonly used software tools. This is part of a newly published resource for OpenClinica development and integration, the OpenClinica 3.1 Technical Documentation Guide. The first version of the specification can be viewed at https://docs.openclinica.com/3.1/technical-documents/rest-api-specifications. I’ve reproduced the introduction here:
We are constantly looking at ways to make it possible (not to mention reliable and easy!) for users and developers to interact with and extend OpenClinica in a programmatic way. This can mean anything from data loading to more meaningful integrations of applications common to the clinical research environment.
As proponents of open, standards-based interoperability here at OpenClinica, our starting point is always to develop interfaces for these interactions based on the most successful, open, and proven methods in the history of technology – namely the protocols that power the World Wide Web (such as HTTP, SSL, XML, OAuth 2.0). They are relatively simple, extensively documented, widely understood, and well-supported out of the box in a large number of programming and IT environments. On top of this foundation, we rely heavily on the wonderful work of CDISC and the CDISC ODM to model and represent the clinical research protocol and clinical data.
This chapter describes a CDISC ODM-based way to interact with OpenClinica using RESTful APIs and OAuth. The REST web services API relies on HTTP, SSL, XML, OAuth 2.0. This architecture makes the ODM study protocol representation for an OpenClinica study available and supports other interactions for study design.
The OpenClinica RESTful architecture was developed to (initially) support one particular use case, but with the intention of becoming more broadly applicable over time. This use case is based on a frequent request of end users: for OpenClinica to support a visual method for designing, editing, and testing “rules” which define edit checks, email notifications, skip pattern definitions, and the like to be used in OpenClinica CRFs. Users have had to learn how to write rules in XML, which can be confusing and have a big learning curve for non-technical individuals. The OpenClinica Rule Designer is an application that allows end users to build cross field edit checks and dynamics within a GUI based application. It is centrally hosted Software as a Service (SaaS) based application available for OpenClinica Enterprise customers at https://designer.openclinica.com.
To support interaction of the centrally hosted rule designer with any instance of OpenClinica Enterprise installed anywhere in the world, we needed to implement a secure protocol and set of API methods to allow exchange of study information between the two systems, and do so in a way where the user experience was as integrated as if these applications were part of the same integrated code base. In doing so, and by adopting the aforementioned web and clinical standards to achieve this, we have built an architecture that can be extended and adapted for a much more diverse set of uses.
This chapter specifies how 3rd party applications can interact with an OpenClinica instance via the REST API and OAuth security, and details the currently supported REST API methods. The currently supported API methods are not comprehensive, and you may get better coverage from our SOAP API. However the OpenClinica team is continuing to expand this API and since it is open source anyone may extend it to add new methods to meet their own purposes. If you do use the API in a meaningful way or if you extend the API with new methods, please let others know on the OpenClinica developers list (email@example.com), and submit your contributions for inclusion back into the codebase – you’ll get better support, increased QA, and compatibility with future OpenClinica releases.
RESTful Representation, based on ODM
“REST”, an acronym for REpresentational State Transfer, describes an architectural style that allows definition and addressing of resources in a stateless manner, primarily through the use of Uniform Resource Identifiers (URIs) and HTTP.
From Wikipedia: A RESTful web service (also called a RESTful web API) is a simple web service implemented using HTTP and the principles of REST. It is a collection of resources, with three defined aspects:
- the base URI for the web service, such as http://example.com/resources/
- the Internet media type of the data supported by the web service. This is often JSON, XML or YAML but can be any other valid Internet media type.
- the set of operations supported by the web service using HTTP methods (e.g., POST, GET, PUT or DELETE).
REST is also a way of looking at the world, as eloquently articulated by Ryan Tomayko.
In the context of REST for clinical research using OpenClinica, we can conceptually think of an electronic case report form (CRF) as a resource that is essentially a bunch of metadata modeled in CDISC ODM with OpenClinica extensions:
- Some of this metadata (data type, item name, response set, etc) is intrinsic metadata – i.e. tied to the definition of the CRF and its items and mostly unchangeable after it is initially defined.
- Some of this metadata is representation metadata and used only when the CRF is represented as a web-based HTML form (in the OpenClinica database schema we call this form_metadata, but it also can include other things like CRF version information and rules).
An OpenClinica Event CRF is that same bunch of metadata with the corresponding item data, plus references to the study subject, event definition, CRF version, event ordinal, etc that it pertains to.
- The notion of a CRF version pertains to the representation of the CRF. It is not intrinsic to the event CRF (this is debatable but it is how OpenClinica models CRFs). Theoretically you should be able to address and view any Event CRF in any available version of the CRF (i.e. http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/edit and http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v2/edit both show you the same data represented in different versions of the CRF). Of course the audit history needs to clearly show which version/representation of the CRF was used for key events such as data capture, signature, etc.
- Rules are also part of the representation metadata as opposed to intrinsic metadata, even though you don’t need to specify them on a version-by-version basis.
- Anything attached to the actual event CRF object or its item data – discrepancy notes, audit trails, signatures, SDV performance, etc is part of that event data and should be addressable in the same manner (e.g. http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/GROUPOID/ORDINAL/ITEM…)
In this conceptual view of the world, CRFs (as well as CRF items, studies, study events, etc.) are RESTful resources with core, intrinsic properties and then some other metadata that has to do with how they are presented in a particular representation. We now have a model that allows us a great deal of flexibility and adaptability. We can support multiple modalities, with different representation metadata for rendering the same form, or perhaps the shared representation metadata but applied in a different way. We can address any part of the CRF in an atomic manner. This approach has been successfully applied in the Rule Designer, which takes the ODM study metadata and allows browse of the study CRFs and items, with the ability to drag and drop those resources into rule expressions. Here are some examples of additional future capabilities that could be easily realized on top of this architecture:
- Multiple data entry modalities – a user may need to deploy patient based data entry via web, a tablet, a thick client, or even paper/OCR, each with a very different presentation. Each of these may be part of OpenClinica-web or a separate application altogether, but all will rely on the same resource metadata to represent the CRF (according to the UI + logic appropriate for that modality), and use the same REST-based URL and method for submitting/validating the data.
- Apply a custom view (an XSL or HTML/CSS) to a patient event CRF or full casebook – some uses of this could be to represent as a PDF casebook, show with all audit trails/DNs embedded in line with the CRF data, show a listing of data for that subject, provide (via an XSL mapping) as an XForm or HL7 CCD document for use by another application) – http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/view?renderer=somemap…
- The same path used in the URLs, eg http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/GROUPOID/ORDINAL/ITEMOID could be used as the basis for XPath expressions operating on ODM XML representations of CRFs and of event CRF data
- Internationalization – OpenClinica ought to allow our CRF representation metadata to have an additional sub-layer to render the form in different languages, and then automatically show the appropriate language based on client/server HTTP negotiation (like we do with the rest of the app). Currently internationalization of CRFs requires versioning the CRF.
- View CRF & Print CRF – use the same representation metadata (form metadata) but apply slightly different rules on how the presentation works (text values instead of form fields, no buttons, turn drop down lists into text values)
- Discrepancy manager popup – one requested use case would allow a user to update a single event CRF item data value directly from the discrepancy note UI point of view. In this case you could think of just updating that one item as addressing the resource http://oc/RESTpath/StudyA/Subj1234/VisitA/FormB/v1/GROUPOID/ORDINAL/ITEM…. In this model, whatever rules and presentation metadata need to get applied at presentation and save time happen automatically.
- Import of CDISC ODM XML files – imported data would be processed through the same model, but only use the metadata that’s relevant to the data import modality. Same for data coming in as raw ODM XML via a REST web service. A lot of times the import only populates one part of a CRF and the other parts are expected to be finished via data entry. This model would help us manage that process better that the current implementation of ODM data import.
There are many considerations related to user roles and permissions, workflows, and event CRF/item data status attributes that need to be overlaid on top of this REST model, but the model itself is a conceptually useful way to think about clinical trials and the information represented therein. When implemented using CDISC ODM XML syntax it becomes even more powerful. As widespread support for ODM becomes the norm, the barriers to true interoperability – shared, machine readable study protocol definitions, and robust, real-time, ALCOA-compliant exchange of clinical data and metadata that aligns with user’s business processes – get eviscerated.
* This chapter frequently refers to ODM-based representations of study metadata and clinical data in OpenClinica. We strive as much as possible to implement ODM-based representations of OpenClinica metadata and data according to the generic ODM specifications (currently using ODM version 1.3). However, to ensure our representations support the full richness of information used in OpenClinica we often have to rely on ODM’s vendor extensions capability. We have not always made distinctions in this chapter as to where we are using ‘generic’ ODM versus OpenClinica extensions, but that is documented here. It is our goal as ODM matures and supports richer representations of study information to migrate our extensions back into the generic ODM formats.
** Also note the RESTful URL patterns referred to above are conceptual. Refer to the technical subchapters of this REST API specification for the actual URLs.
The spec (like much of the code that implements it) is open source. I’m looking forward to hearing comments and feedback, and sharing thoughts on how we can encourage broader adoption across different types of eclinical applications.
A major part of the Akaza mission is to make OpenClinica more flexible and customizable. Having a code base that is open source is a great place to start. But not everybody wants to develop Java code to meet their own requirements. We aim wherever possible to add configuration options and easy-to-use design tools within the user interface, but not all problems are a good fit for that approach. The solution is a series of “plug-in” interfaces that allow users to add their own capabilities and configurations, or interact with other applications. Some of these interfaces, such as loading of spreadsheet-based CRF definitions, are a critical part of OpenClinica, without which the system would not be functional. Other interfaces include CDISC ODM data import, job scheduler for import and export, SOAP-based web services, and the HTML5 popup interface that allows 3rd party applications to enter CRF data. Along the way community members have improved these interfaces and taught us a lot about how to design them better.
OpenClinica 3.1 will include a completely rewritten version of Extract Data based around a plug-in architecture that increases flexibility and functionality. We’ve learned that user requirements for organizing, formatting, and presenting data are tremendously diverse (and often conflicting), depending on the user, the intended purpose, the study, and the organization. Our old Extract Data architecture made it difficult to add new output formats or tweak the ones already there. The new functionality provides a highly extensible, easily configurable means to get data formats that meet a user’s precise requirements. It does this by:
We started out with a desire to simplify the native output of the OpenClinica Extract Data java application, in a way that increased quality, stability, completeness, and performance. From now on, the OpenClinica core application will only produce CDISC ODM (version 1.3, with OpenClinica Extensions) as the natively supported format. With only one native format, we’re better able to test, document, and guarantee the output. All other output formats generated are transformations from this native ODM 1.3 w/extensions format. We made sure (via the OpenClinica vendor extensions) that we can export all possible data related to a study and its clinical data in this format. In 3.1, this also includes export of audit trail, discrepancy, and electronic signature information.
After we devised a way to improve the quality, stability, and performance of the data coming out of the core, we needed to provide a way to execute the data transformations, into any of a wide variety of outputs. It was important for us to adopt standard, widely used formats and open source technologies as the basis for these transformations. We selected the XSLT (Extensible Stylesheet Language Transformations) language because of its applicability to CDISC ODM XML, extensive features, and reasonably simple learning curve. The implementation of these transformations is powered by a widely used open source engine, the Saxon XSLT and XQuery processor. The behavior of Export Data is determined by the extract.properties configuration file and the XSL stylesheets. The extract.properties file specifies the available data formats available in the system, each with a corresponding XSL stylesheet. OpenClinica 3.1 by default includes a set of XML stylesheet transformations for commonly used formats, such as HTML, Tab-delimited Text, and SPSS. The OpenClinica Enterprise Edition will include additional new formats including SAS, annotated CRFs, printable PDF casebooks with integrated audit trail and discrepancy notes, and a SQL-based data marts with normalized CRF-based table structure for ad-hoc reporting.
At this point, we can now reproduce the extract functionality available in OpenClinica 3.0, at a higher level of quality and stability. The stylesheets replicate the HTML, SPSS, tab-delimited, and multiple CDISC XML formats that were available in 3.0, and the framework will make it much easier to add new formats. However all of these data output formats are some type of text or XML based file. Users have also voiced the need to do things that XSLT cannot do by itself, like produce PDF files or load the data into external relational databases for ad-hoc reporting. The solution was implementation of a postprocessor framework that allows more sophisticated functionality. With postprocessing we can do things like generate binary output formats or send data to a target destination. Two postprocessors are included in 3.1 by default: output to a database using JDBC connectivity and generating PDF files using XSL-FO. The postprocessing step is transparent to end-users; they simply get their files for download or alternatively receive a message that the data has been loaded into the database. And the framework exists to add additional postprocessors via the addition of Java classes with references to those class names in the extract.properties file.
Execution of data export occurs when a user or job initiates a request for data. The request includes the active study or site, the dataset id, and the requested format. The end user will notice only minor differences in how they use the Extract Module. The process of creating datasets has not changed. The Extract can be still initiated from the ‘Download Data’ screen or via a job by selecting the desired output format. At this point however, rather than waiting for the download page to load, the user will be told that their extract is in queue, and receive an email and on-screen notification when the extract is complete. Execution follows a four step process:
Step 1. Generate native CDISC ODM XML version 1.3 with OpenClinica Extensions
Step 2. Apply XSL transformation and generate output file according to the settings in extract.properties for the specified format
Step 3. Optionally, if postprocessing is enabled for the requested format, run the post processing action according to the settings in extract.properties.
Step 4. Provide user notification with success or failure message.
We’ve also improved the logging and messaging surrounding extracts, which will be crucial for anyone developing, customizing, or debugging XSL stylesheets. As always, full internationalization is supported – if you want a value to be internationalized, it should be prefaced with an & (ampersand) symbol in the extract.properties file, and the corresponding text placed in the notes.properties i18n files.
As is common with software, we didn’t get to do everything we wanted in the first release of these capabilities. Some future features include:
Other than that we think we’ve thought of everything :-). Have we?
– Cal Collins