The CDISC Operational Data Model: Ready to Roll?

July 1, 2004

Applied Clinical Trials

Applied Clinical Trials, Volume 11, Issue 7


In December 1998, NASA launched a spacecraft designed to enter a fixed orbit around the planet Mars. The Climate Orbiter reached Mars in late 1999 and, early in the morning of 23 September, fired its main engine to commence maneuvers into the target orbit. As the craft passed behind Mars, everything seemed normal.

The spacecraft never re-emerged from behind the planet. NASA instigated an investigation and concluded that two teams involved in the project had each used different measurement units in navigation calculations; one team had used imperial units while the other used the metric system. This practice had resulted in data being passed between the teams and used without the necessary conversions. As a result, a multimillion dollar spacecraft was lost due to a simple data transfer issue.

A similar event happened during the testing of computer systems destined for use as part of a clinical trial. Fortunately, although it was essentially the same problem, it did not have the same impact as the NASA incident. Blood glucose readings and other data were to be collected from a number of subjects and stored on a server located in Europe. The data would then be transferred, in an encrypted ASCII format, to another server in the United States, where it was to be used for analysis.

An issue arose upon the commencement of system testing. The glucose readings for the subjects were being stored on the server in Europe in mmol/L, while the analysis software was expecting the readings in mg/dL. The protocol had not specified which unit was to be used. The fix was simple, luckily, and a few days later the problem was resolved.
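To give a sense of the size of such a mismatch, the short Python sketch below converts a glucose reading between the two units. The conversion factor follows from the molar mass of glucose (about 180.16 g/mol); the reading of 5.5 and the function name are illustrative assumptions, not values taken from the trial.

```python
# Molar mass of glucose (C6H12O6) is about 180.16 g/mol,
# so 1 mmol/L corresponds to about 18.016 mg/dL.
MG_DL_PER_MMOL_L = 18.016


def mmol_per_l_to_mg_per_dl(value_mmol_l: float) -> float:
    """Convert a glucose concentration from mmol/L to mg/dL."""
    return value_mmol_l * MG_DL_PER_MMOL_L


reading = 5.5  # hypothetical reading, stored in mmol/L
converted = mmol_per_l_to_mg_per_dl(reading)
print(f"{reading} mmol/L is about {converted:.1f} mg/dL")
```

The same number read in the wrong unit differs by a factor of roughly 18, which is why an unstated unit is more than a cosmetic problem.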

These two examples, albeit at opposite ends of the scale in terms of their impact, illustrate the potential cost of errors in data transfers. Within the pharmaceutical industry, one organization is working hard to improve the mechanisms available for the transfer of clinical data. The Clinical Data Interchange Standards Consortium (CDISC), a nonprofit organization, has been working over the last four years on the development of standards to support clinical data interchange.

One of the CDISC standards is the Operational Data Model (ODM), and this article is intended as a primer for those who would like an introduction to the model. The article tries not to delve into great technical detail, either about the ODM itself or the technical aspects of the model's implementation. Hopefully, it provides just enough information to understand the concepts.

A little history

The ODM has its roots in 1999, when work first commenced on what became the first draft of the standard, Draft 0.8, issued in March 2000. Since then, several versions have been released. Version 1.0 was released in October 2000. November 2001 saw the review release of Version 1.1, and its formal release followed in April 2002. The most recent version, Version 1.2, was released for comment in July 2003 and for implementation in January 2004.

The ODM has taken some time to evolve into the model we see today. The slow rate of progress in developing the ODM, and the other three CDISC models, is a criticism often levelled at CDISC as an organization. This seems somewhat unfair. The subject domain is a complex one, and the wide-ranging interests of the parties involved make achieving consensus difficult and time-consuming.

The role of the ODM

The CDISC data models are designed to support the end-to-end data flow within clinical trials, from the operational database through analysis to regulatory submission. The role of the ODM is to facilitate the movement of clinical data collected from multiple acquisition sources to an operational database, but it also has application in the subsequent exchange and archiving of such data.

The model has been designed to be flexible enough to accommodate data from a variety of sources, such as electronic or paper case report forms, electronic patient diaries, and laboratory systems, while also allowing for data to be passed to and from data stores.

ODM mechanics

The ODM specification defines the format of an ODM file as containing the ODM root element with four main subelements (see Figure 1).

Figure 1. ODM file structure.

  • The Study element, used to detail basic study information such as study name, and to define the MetaData for the study.

  • The AdminData element, used to hold user attributes such as locations and signatures for those participating in the trial. These items are used for audit trail functionality, for example.

  • The ReferenceData element, used to detail data used in the interpretation of clinical data. An example of such data is laboratory normal ranges.

  • The ClinicalData section, used to hold the data collected as part of the trial.
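The four-part layout listed above can be sketched in a few lines of Python using the standard library's XML support. This is only a skeleton under simplifying assumptions: a real ODM file also carries an XML namespace and a number of attributes on the root element, all of which are omitted here.

```python
import xml.etree.ElementTree as ET

# Build an empty ODM skeleton: a root element with its four main subelements.
# Namespace and root attributes required by the specification are left out.
odm = ET.Element("ODM")
for name in ("Study", "AdminData", "ReferenceData", "ClinicalData"):
    ET.SubElement(odm, name)

child_names = [child.tag for child in odm]
print(child_names)
print(ET.tostring(odm, encoding="unicode"))
```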

At the start of this article, two scenarios where data was transferred and interpreted incorrectly were described. In these scenarios, the users had misinterpreted the data. They had the data but did not fully understand how that information should be interpreted; in other words, they had the data but not the metadata. Data are the actual values while metadata is the description of that data.

The metadata held within the Study element is central to the concepts behind the ODM, as it defines the structures and interpretation for the data that is to be collected as part of the specific clinical protocol. This inherent ability within the ODM to define the structures provides the means by which we can avoid error scenarios such as those described at the start of this article. Since the ODM specification details the format and ordering of elements, a computer can check the validity of not only an ODM file but also, by using the metadata, the data within that file.
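The idea of checking data against its metadata can be illustrated with a small Python sketch. The dictionary below is a hypothetical stand-in for ODM metadata, not the ODM format itself, and the item names, units, and function name are assumptions made for the example.

```python
# Hypothetical item definitions standing in for ODM metadata.
ITEM_DEFS = {
    "GLUCOSE": {"type": float, "unit": "mmol/L"},
    "SEX": {"type": str, "allowed": {"M", "F"}},
}


def check_item(item_oid, raw_value, unit=None):
    """Return a list of problems found when checking one value against its definition."""
    definition = ITEM_DEFS.get(item_oid)
    if definition is None:
        return [f"{item_oid}: no definition in metadata"]
    try:
        value = definition["type"](raw_value)
    except ValueError:
        return [f"{item_oid}: {raw_value!r} is not of type {definition['type'].__name__}"]
    problems = []
    allowed = definition.get("allowed")
    if allowed is not None and value not in allowed:
        problems.append(f"{item_oid}: {value!r} is not an allowed value")
    expected_unit = definition.get("unit")
    if expected_unit is not None and unit != expected_unit:
        problems.append(f"{item_oid}: unit {unit!r} given, {expected_unit!r} expected")
    return problems


print(check_item("GLUCOSE", "5.5", unit="mg/dL"))   # unit mismatch is reported
print(check_item("GLUCOSE", "5.5", unit="mmol/L"))  # no problems
```

With a definition of this kind available to both sender and receiver, the glucose mismatch described earlier would be flagged by software rather than discovered during testing.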

To illustrate the use and power of the ODM, consider the following simplified example of the data collection process employed during a clinical trial. An investigator recruits a patient and proceeds to collect some basic demographic data. This data is written on a paper case report form (CRF), and then entered via a computer for onward transmission into a database located on the servers of a CRO. The CRO is then requested to pass the data to a third party.

The data collected is nothing more complex than the site number at which the subjects were recruited, their sex, weight, and height, and a unique subject identifier, as illustrated by an extract from the paper CRF shown in Figure 2.

Figure 2. Case report form.

This data could be passed in a number of ways. For example, it might become a comma-separated values (CSV) file, similar to that used to import data into Microsoft Excel. Figure 3 depicts the various data flows.

Figure 4 shows an example of a single entry that is encoded within such a CSV file. The data is for subject 12 recruited at study center 5. Subject 12 is male, weighs 87.2 kg, and is 156 cm tall. Note that the position of each of the data items is assumed, as are the units for each field, a practice that can lead to errors, as has already been discussed.

Alternatively, the ODM could be employed to transport the data. Figure 5 shows extracts from an equivalent file for the same data as within the CSV file. As can be seen in the figure, the ODM file is composed of the metadata and the ClinicalData elements. The metadata element contains the definition of the paper CRF within the StudyEvent, Form, and Item elements. The subject data is held within corresponding elements within the ClinicalData.
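The weakness of positional data is easy to demonstrate. In the Python sketch below, the CSV line is a hypothetical encoding of the record just described; the field order (site, subject, sex, weight, height) is an assumption, which is precisely the point.

```python
import csv
import io

# Hypothetical CSV line; assumed order is site, subject, sex, weight (kg), height (cm).
csv_text = "5,12,M,87.2,156\n"
row = next(csv.reader(io.StringIO(csv_text)))

# The sender's interpretation of the positions.
site, subject, sex, weight, height = row
record = {
    "site": int(site),
    "subject": int(subject),
    "sex": sex,
    "weight_kg": float(weight),
    "height_cm": float(height),
}

# A receiver who assumes height comes before weight parses the line without
# any error, but ends up with the two values swapped.
_, _, _, assumed_height, assumed_weight = row
misread = {"height_cm": float(assumed_height), "weight_kg": float(assumed_weight)}

print(record)
print(misread)
```

Nothing in the file itself tells the receiver that the second interpretation is wrong.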
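For readers who prefer code to figures, the Python sketch below builds a comparable ClinicalData fragment for subject 12 and reads the values back by name. The element and attribute names follow the author's reading of ODM 1.2 and should be checked against the specification; all of the OID values (ST.001, FM.DEMOG, and so on) are invented for illustration, and namespaces are omitted.

```python
import xml.etree.ElementTree as ET

# All OIDs below are hypothetical; a real study would define them in its metadata.
clinical = ET.Element("ClinicalData", StudyOID="ST.001", MetaDataVersionOID="MDV.1")
subject = ET.SubElement(clinical, "SubjectData", SubjectKey="12")
ET.SubElement(subject, "SiteRef", LocationOID="SITE.5")
event = ET.SubElement(subject, "StudyEventData", StudyEventOID="SE.SCREENING")
form = ET.SubElement(event, "FormData", FormOID="FM.DEMOG")
group = ET.SubElement(form, "ItemGroupData", ItemGroupOID="IG.DEMOG")
for oid, value in (("IT.SEX", "M"), ("IT.WEIGHT", "87.2"), ("IT.HEIGHT", "156")):
    ET.SubElement(group, "ItemData", ItemOID=oid, Value=value)

# Values are looked up by item identifier, not by position in the file.
values = {item.get("ItemOID"): item.get("Value") for item in clinical.iter("ItemData")}
print(values["IT.WEIGHT"])
print(ET.tostring(clinical, encoding="unicode"))
```

Because each value is tied to an item identifier, and each identifier to a definition in the metadata, the receiver no longer has to guess which number is the weight.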

One immediately apparent issue is the verbosity of XML; the ODM file is markedly larger than the CSV file holding the same data. However, it would normally be the case that each ODM file would not contain the metadata definitions. The ODM specification allows ODM files to refer to others, and this allows for the metadata to be held within a single ODM file referred to by all others. This results in most ODM files only holding data within the ClinicalData section.

Why use the ODM?

By adopting the ODM, pharmaceutical companies can benefit in a number of ways to improve their clinical development processes and reduce their costs. Data quality is a major consideration for all involved within the clinical trials process and, as already discussed, the ability to define the data content and the meaning of that data content can help in the battle to reduce errors when data is transferred.

Whatever the format used in exchanging data, be it CSV, ODM, or some other format, the costs of setting up and operating the exchange of clinical data have to be met. Costs will typically be incurred for activities such as:

  • The definition of the specification that defines the format

  • Programming effort to implement the support for the format

  • Testing to ensure that the interface works

  • Importing, exporting, and checking during the trial

  • Maintenance and support of the format

  • Archiving the data at the end of the trial.

These costs can escalate rapidly if the work is repeated for each trial or for each organization involved in the trial. This is especially true when the costs of programming and testing are taken into account, as these are expensive activities. By employing a standard mechanism, an organization can avoid the additional costs created by additional formats. Further benefits can be achieved once a standard mechanism is in place. By employing this mechanism on each subsequent trial, an organization will increase the savings with each trial undertaken once the costs of the initial investment have been met.

In addition to the cost savings that standards bring, the use of the standard provides for operational flexibility. As adoption increases within the industry, and organizations adopt the standard and associated tools, companies will enjoy the freedom to work with a greater number of suppliers or customers and choose from toolsets that support the standard.

Guidance

The ODM is a powerful mechanism for the transfer of clinical data. However, with any power there is always the ability to abuse it. So, some words of caution.

The first recommendation for any organization wishing to use the ODM is to stop and think: think about how you wish to use it across an organization, not just within a single trial. How many times have we seen various technology pilots that work but do not scale to encompass the organization as a whole?

Simple issues like conventions for trial identifiers, site identifiers, and location identifiers need to be considered. All that needs to happen is for the same identifier to be used to denote two different studies, and the data for both appear to come from the same study! None of this is rocket science; it simply requires a little thought and organization, and yes, maybe a Standard Operating Procedure or two. The same caution needs to be applied to the names of items used to hold the clinical data, as the names may need to take into account the processes undertaken during the analysis phase.
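A check for identifier collisions need not be elaborate. The Python sketch below assumes an organization keeps a simple registry of study identifiers and titles; the registry contents and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical registry of (study identifier, study title) pairs.
study_registry = [
    ("ST.001", "Glucose monitoring study"),
    ("ST.002", "Demographics pilot"),
    ("ST.001", "Hypertension study"),  # the same identifier reused for another study
]


def find_collisions(entries):
    """Return identifiers that have been assigned to more than one study title."""
    titles_by_oid = defaultdict(set)
    for oid, title in entries:
        titles_by_oid[oid].add(title)
    return {oid: sorted(titles) for oid, titles in titles_by_oid.items() if len(titles) > 1}


collisions = find_collisions(study_registry)
print(collisions)
```

Running such a check whenever a new study is registered costs little, and it catches the problem before any data is exchanged.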

An old adage, but one worth remembering, is Keep It Simple. It is recommended that, rather than placing all metadata and data into a single file, organizations use a number of files. Consideration should be given to file structures, and the best way of working given the toolsets being deployed.

The ODM permits a single file to contain some or all of the elements. It will almost certainly be true that by putting everything into a single file, that file will become large and complex. It is true that the ODM file will be processed by a tool running on a large and powerful computer, but well-organized data can only assist the general logistics of a trial.

It is recommended that you keep metadata separate from clinical data, and always refer to the metadata. It is worth remembering that the metadata represents the protocol, so it is necessary to keep it under configuration control as well as all of the other files.
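One way to enforce this discipline is to check, before a clinical data file is accepted, that it refers to a metadata version that is actually under control. The Python sketch below uses two small in-memory documents in place of real files. The attribute names follow the author's reading of ODM 1.2, the OIDs are invented, and namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata file, held separately and under configuration control.
METADATA_FILE = """<ODM FileOID="F.META">
  <Study OID="ST.001">
    <MetaDataVersion OID="MDV.1" Name="Protocol v1"/>
  </Study>
</ODM>"""

# Hypothetical clinical data file that refers to the metadata by identifier.
CLINICAL_FILE = """<ODM FileOID="F.DATA.0001">
  <ClinicalData StudyOID="ST.001" MetaDataVersionOID="MDV.1"/>
</ODM>"""


def known_versions(metadata_xml):
    """Collect (study OID, metadata version OID) pairs defined in a metadata file."""
    root = ET.fromstring(metadata_xml)
    return {
        (study.get("OID"), version.get("OID"))
        for study in root.iter("Study")
        for version in study.iter("MetaDataVersion")
    }


def unresolved_references(clinical_xml, versions):
    """List references in a clinical data file that match no known metadata version."""
    root = ET.fromstring(clinical_xml)
    references = [
        (data.get("StudyOID"), data.get("MetaDataVersionOID"))
        for data in root.iter("ClinicalData")
    ]
    return [reference for reference in references if reference not in versions]


versions = known_versions(METADATA_FILE)
print(unresolved_references(CLINICAL_FILE, versions))
```

An empty result means every reference resolves; anything else identifies a data file that cannot be interpreted with the metadata on hand.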

It should also be remembered that the ODM is a means of representing data and not the means by which the data is sent; it is simply a file. Therefore the file may well need to be encrypted prior to transmission, depending on where and how it is being sent, to ensure the data stays confidential.

Validation

The use of the ODM within any computer system will have an impact on the validation undertaken. The cornerstones of the FDA's policy on computerized systems used within clinical trials are data quality and integrity. The mechanism used for data transfer within a system, whatever it may be, is one area that should be closely scrutinized.

The danger with any import or export function is that any flaw in the process affects all subsequent processing based upon the data. There is little point spending significant amounts of money and effort in a data collection process that produces high-quality data only to have the whole exercise invalidated by a function that fails to export the data correctly. Any risk analysis undertaken on a system as part of a validation exercise should take such an eventuality into account.

Data import and export functions within any computer system are, by their nature, hidden from the users. The function may well be automated to run at a certain time of the day, or it may be initiated by the user. Once initiated, though, there is little evidence to a user that the processing is underway. A resulting data file will be generated, but is it correct?

There is an obvious need to ensure that the generation of ODM files, and the subsequent re-importation of those same files, does not result in any data being altered or lost in the process. Given that the function is an internal one, significant checking should be undertaken on the tools during their development to prove that the function has been adequately tested.
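A round-trip test is one simple form such checking can take: export a known set of values, import the result, and confirm that nothing has changed. The Python sketch below does this for a single group of items. The export and import functions are deliberately minimal stand-ins for a real tool, and the item identifiers are hypothetical.

```python
import xml.etree.ElementTree as ET


def export_items(items):
    """Serialize a mapping of item identifier to value as an ItemGroupData fragment."""
    group = ET.Element("ItemGroupData", ItemGroupOID="IG.DEMOG")
    for oid, value in items.items():
        ET.SubElement(group, "ItemData", ItemOID=oid, Value=value)
    return ET.tostring(group, encoding="unicode")


def import_items(xml_text):
    """Read the fragment back into a mapping of item identifier to value."""
    group = ET.fromstring(xml_text)
    return {item.get("ItemOID"): item.get("Value") for item in group.iter("ItemData")}


original = {
    "IT.SEX": "M",
    "IT.WEIGHT": "87.2",
    "IT.HEIGHT": "156",
    "IT.NOTE": "weight < 90 & fasting",  # characters that must be escaped in XML
}
round_tripped = import_items(export_items(original))
print(round_tripped == original)
```

Including awkward values, such as text containing characters that XML treats specially, makes the test considerably more useful than one built only from tidy numbers.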

With the ODM being based upon XML technology, there is a high probability that the tools used may be off-the-shelf products developed for use in industry at large, and consequently the processes and procedures used in their development are unknown. Care needs to be taken in this circumstance. Such tools are developed in a variety of ways, some more rigorously than others. The selection of such tools should be carefully considered to check their suitability so that, if selected, the effort required in undertaking the validation is minimized.

Because of the ability of the ODM to support multiple files, there is a need to ensure that a trial's processes and procedures can accommodate this. Validation checks should encompass handling of sets of files and making sure that individual files do not get lost.
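One common way to guard against lost or altered files, offered here as a suggestion and not as part of the ODM itself, is to send a manifest of checksums alongside the set. The Python sketch below works on in-memory byte strings with invented file names; a real process would read the files from disk.

```python
import hashlib


def build_manifest(files):
    """Map each file name to the SHA-256 digest of its content."""
    return {name: hashlib.sha256(content).hexdigest() for name, content in files.items()}


def check_against_manifest(manifest, received):
    """Return the names of files that are missing and of files whose content differs."""
    missing = sorted(set(manifest) - set(received))
    altered = sorted(
        name
        for name in manifest
        if name in received and hashlib.sha256(received[name]).hexdigest() != manifest[name]
    )
    return missing, altered


# Hypothetical set of files as sent.
sent = {
    "metadata.xml": b"<ODM><Study/></ODM>",
    "site5_data.xml": b"<ODM><ClinicalData/></ODM>",
    "site6_data.xml": b"<ODM><ClinicalData/></ODM>",
}
manifest = build_manifest(sent)

# The set as received: one file is missing and one has been changed.
received = {
    "metadata.xml": b"<ODM><Study/></ODM>",
    "site5_data.xml": b"<ODM><ClinicalData></ClinicalData></ODM>",
}
missing, altered = check_against_manifest(manifest, received)
print(missing, altered)
```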

Security and confidentiality are two other areas that should be closely examined. When the ODM files are transported between sites they should be secure, with the data remaining locked away. Care needs to be taken, too, when ODM files are present on a system to protect from unauthorized access. To assist with this need, Version 1.2 of the ODM introduced XML signatures that can be used to provide tamper-proofing of files.

Conclusion

This article has introduced the ODM, outlined its uses, and provided visibility of its inherent power. The ODM brings the power to easily interconnect systems, and it allows for the use of standard tools in standard roles now that they have the mechanism for a standard way to communicate.

However, to gain acceptance and be adopted, the ODM needs to go through a period of stability. Vendors, CROs, and sponsors should be able to use the ODM knowing that the model will not change, and if it does, that the changes will not be significant. The stability of Version 1.2 of the model suggests we have reached this stage.

CDISC continues to improve the ODM and the other models for Laboratory and Submissions data. It is also striving to harmonize them, so as to provide for seamless data transfers from capture to submission while minimizing the risk of error.

Marie Curie once wrote in a letter, "One never notices what has been done; one can only see what remains to be done." The pharmaceutical industry needs to recognize the progress CDISC has achieved over the last few years as an organization, and begin to employ the fruits of its labors.