OpenClinica in an Academic Environment

Here at the Women & Children’s Health Research Institute, Edmonton, Alberta, we have been using OpenClinica since version 1. In that time the product has evolved significantly, providing more functionality and tighter focus on support for regulatory clinical trials. Our objective in this institute is to support our researchers regardless of the type of study they are undertaking. However, in the last two years, we have not been involved in any studies that needed to conform to regulatory standards. How then does OpenClinica perform in a purely academic environment?

Because OpenClinica is designed to support rigorous clinical trials, it has many features that are of value to academic researchers. In our environment we stress the following features to our researchers, many of whom are used to managing their data in Excel or Access:

  • OpenClinica is hosted within a secure data centre provided by the University’s Faculty of Medicine and the servers are supported by the Faculty’s IT team.
  • Access to data is controlled through individual logins and user roles. The system is web based, using 128-bit encryption between the browser and the server. Studies implemented in OpenClinica are compliant with provincial privacy requirements.
  • Data entry rules and validations allow input to be validated at the point of entry. This greatly reduces the opportunity for user error, which in turn lessens the need for double data entry and lowers the cost of data entry and cleaning.
  • CRF versioning enables changes to the data entry tools during the course of a study whilst maintaining the integrity of existing data.
  • Collaborative, multi-site studies are well supported in OpenClinica.
  • Discrepancy management and monitoring workflow allow annotation of the data and facilitate quality management whilst the study is ongoing. This reduces the cost of end-of-study data cleaning. It also minimizes the interval between end of study and analysis, leading to reduced time to publication.

Many of the studies we have implemented in OpenClinica have been unregulated clinical trials, and these are clearly a good fit for the product. Many others did not fit into this category, yet OpenClinica still proved well suited to the job. The following two examples demonstrate our approach to some research projects that were not clinical trials.

Example 1 – Retrospective Chart Review (Double Data Collection)

We were approached by an investigator who asked us to perform double data entry for a project involving a retrospective review of 50 patient charts. However, when detailed requirements were established, it became apparent that the charts had been independently reviewed by two separate investigators (double data collection). As a result, traditional double data entry was not appropriate, as the data entry staff were not qualified to adjudicate between data collected by two medically qualified reviewers. What was actually required was single data entry and comparison of two separate sets of forms.

For this study we entered the two sets of data into two separate sites in OpenClinica. This allowed the data to be easily subset (into two separate libraries) once we’d imported it into SAS. We then wrote a SAS macro which used PROC COMPARE to compare the two libraries, and generate a difference report.
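The comparison step can be illustrated with a minimal sketch. This is not the SAS macro described above. It is a Python approximation of the same idea, assuming each site's extract has been loaded into a dictionary keyed by subject ID. The function name `difference_report` and the field names in the usage example are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# One record per subject: field name -> entered value (hypothetical layout).
Record = Dict[str, str]
Difference = Tuple[str, str, Optional[str], Optional[str]]


def difference_report(
    site_a: Dict[str, Record], site_b: Dict[str, Record]
) -> List[Difference]:
    """List every (subject, field) whose value differs between two sites.

    A value of None means the subject or field is missing from that site.
    """
    report: List[Difference] = []
    # Walk the union of subjects so records present in only one site appear.
    for subject in sorted(set(site_a) | set(site_b)):
        rec_a = site_a.get(subject, {})
        rec_b = site_b.get(subject, {})
        for field in sorted(set(rec_a) | set(rec_b)):
            value_a = rec_a.get(field)
            value_b = rec_b.get(field)
            if value_a != value_b:
                report.append((subject, field, value_a, value_b))
    return report


if __name__ == "__main__":
    reviewer_1 = {"001": {"sex": "F", "diagnosis": "asthma"}}
    reviewer_2 = {"001": {"sex": "F", "diagnosis": "Asthma"}}
    for row in difference_report(reviewer_1, reviewer_2):
        print(row)
```

The comparison here is deliberately exact, so differences in documentation style (such as capitalization) are reported rather than hidden. Deciding which of them matter is left to the adjudicating investigator.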

The final stage of the process was for the Principal Investigator (PI) to review the difference report and adjudicate the discrepancies. In over 90% of cases the discrepancies were due to different styles of documentation between the two investigators and were not significant. However, approximately 10% of the differences required the subject’s chart to be checked before the discrepancy could be resolved. As the discrepancies were resolved, the PI annotated the report and a final round of data entry was performed to apply the corrections in one of the OpenClinica sites. This site was designated the ‘primary’ data and was extracted for analysis.

The combination of OpenClinica and SAS performed well for a chart review where traditional double data entry was not practical. Discrepancies between the two separate reviews were identified by data management staff, which resulted in significant time savings for the two investigators.

Example 2 – Research Data Warehouse

Staff at Edmonton’s Pediatric Centre for Weight and Health (PCWH) required a database to collect physical examination, demographic, exercise and diet information from their patients and their patients’ care givers. The intention was to proactively build a research data warehouse from which future studies could perform retrospective analysis. Budgetary constraints meant that bespoke database development was out of the question, so we suggested they try collecting their data with OpenClinica. After four hours of training, and with support from the informatics team, the PCWH research coordinator built CRFs reflecting the content of every data collection form used in the clinic.

In order to facilitate our data extraction and analysis needs, a SAS database was created to reflect the contents of OpenClinica, but in a more analysis friendly form. Data is extracted overnight using a Java application that we developed to simplify and automate data extraction into SAS.

As the project evolved, the research team’s understanding of their data improved and modifications were made to OpenClinica CRFs. As a result, data structures increased in complexity and additional SAS code was written in order to consolidate data structures for the end user.

It also became apparent that the database was required to manage complex relationships between the clinic’s patients and their care givers. For instance:

  • Data was required for all the care givers who attended a clinic visit with the patient
  • Care givers could be parents, other family members, foster parents, etc.
  • Unrelated patients could have common care givers. A woman could be the mother of patient A, aunt of patient B and foster mother to patient C.

To handle these relationships, the team created separate numbering conventions for the subjects and their care givers. When a care giver attends the clinic for the first time, they are entered as a new subject. OpenClinica’s ‘secondary identifier’ field is used to enter a comma separated list of ‘relationship codes’ representing patients to whom the individual is a care giver and also their relationship to that patient. The data in this field is used by SAS to create a relationships table that defines all the caregivers for a patient and the relationship. It allows the researcher to subset the data based on these relationships and facilitates research involving complex multi-family structures.
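The following minimal sketch shows how such a field could be turned into a relationships table. It is written in Python rather than the SAS used by the team, and the code format is an assumption: the article does not specify it, so the sketch supposes each relationship code looks like `PATIENT_ID:RELATIONSHIP` (for example `P001:MOTHER`). The names `Relationship`, `parse_relationships` and `caregivers_for` are hypothetical.

```python
from typing import List, NamedTuple


class Relationship(NamedTuple):
    caregiver_id: str
    patient_id: str
    relationship: str


def parse_relationships(
    caregiver_id: str, secondary_identifier: str
) -> List[Relationship]:
    """Split a comma separated list of codes into one row per patient.

    Assumes the hypothetical code format 'PATIENT_ID:RELATIONSHIP'.
    """
    rows: List[Relationship] = []
    for code in secondary_identifier.split(","):
        code = code.strip()
        if not code:
            continue  # tolerate trailing commas and empty fields
        patient_id, sep, relationship = code.partition(":")
        patient_id, relationship = patient_id.strip(), relationship.strip()
        if not sep or not patient_id or not relationship:
            raise ValueError(f"Malformed relationship code: {code!r}")
        rows.append(Relationship(caregiver_id, patient_id, relationship))
    return rows


def caregivers_for(
    patient_id: str, table: List[Relationship]
) -> List[Relationship]:
    """Return every caregiver row recorded for the given patient."""
    return [row for row in table if row.patient_id == patient_id]


if __name__ == "__main__":
    # One care giver linked to three unrelated patients.
    table = parse_relationships("C010", "P001:MOTHER, P002:AUNT, P003:FOSTER")
    print(caregivers_for("P002", table))
```

Storing one row per caregiver and patient pair keeps the many-to-many structure explicit, so the same table can be filtered by patient, by caregiver or by relationship type.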

Inventive use of subject numbers and the secondary identifier field in OpenClinica allowed the PCWH to model complex relationships that OpenClinica isn’t primarily designed to handle. The SAS data warehouse allows investigators and their trainees to query retrospective clinic data rapidly, enabling studies that would otherwise have been performed by chart reviews.


OpenClinica performs well in an academic environment where low cost, flexibility and study implementation speed are critical factors. We are able to extend OpenClinica with additional tools (notably SAS based data manipulations) in order to provide a tailored solution.

It should also be noted that, as is the case with any research, a deep understanding of the study and its data collection requirements is critical to the success of the project.

Rick Watts B.Sc, FICR, CSci

Team Lead, Clinical Research Informatics Core

Women & Children’s Health Research Institute

University of Alberta