Document Management


Applied Clinical Trials

Applied Clinical Trials, 10-01-2011
Volume 20
Issue 10

Five key steps for producing e-submission-ready documents and avoiding pre-submission rework.

It's been said that a new drug application is 100,000 pages (or more) in support of a label. While this might be a simplified view, it does provide some idea of the scope and challenges involved in preparing the enormous body of documentation that is needed to convince a regulatory authority that a product is safe and effective.

In taking a closer look at the required documentation, it's easy to see that a number of challenges await those responsible for assembling the submission:

Administrative and prescribing information. Agencies require a variety of forms, which may be locked-down fillable PDF forms, XML forms, or scanned paper. Required correspondence and certifications may originate from a variety of sources. Labeling may take several forms, ranging from MS Word documents showing tracked changes for draft labeling to XML (e.g., FDA's Structured Product Labeling).

Summaries and overviews. Because the documents depend on being able to analyze and reference final documents in other parts of the submission, the ability to finalize them depends on the other documents being available and stable.

Quality documents. In this area, the submission publisher must locate the correct content, taking into account document versions and specific authority requirements. For example, different stability documents and compendial excipients may apply in different parts of the world, despite the harmonization goals of the International Council for Harmonisation (ICH). Other challenges include the need to include scanned content such as batch records, as well as manufacturing documents that may be subject to late modifications during manufacturing scale-up.

Nonclinical reports. Because the nonclinical program is so lengthy for most products, reports may be needed that were created five years ago or even earlier. The reports were often created by contract research organizations (CROs) or by scientists no longer working on the drug project. These factors often make reports difficult to locate, and the reports that are found may not have been created to current standards.

Clinical reports. Clinical reports represent the largest portion of submission content, averaging around two-thirds of an application. In this area, the publisher must work with study reports that may not have been formatted in accordance with agency requirements, or that may exist in many small pieces. In the United States, a large volume of case report forms and SAS datasets must be assembled in accordance with their own specifications.

In the world of paper submissions, the assembly process was complicated, but largely free of technical compliance requirements. In other words, the publisher could take documents in many formats, assemble and print them, with the main concern being the appearance of the final paper product. If a problem was found, often a replacement page could be slipped in at the last moment.

Now, electronic submission standards place many technical requirements on the individual documents included in a submission, resulting in a new set of quality assurance responsibilities that sometimes fall on a documentation group, but more often end up with the regulatory publisher.

Sprint to the finish

Of course, a primary component of profitability for a drug, biologic, or medical device is its effective patent life. Delivering a submission to a regulatory agency and triggering a review clock directly affects the effective patent life. So why do so many companies wait until the last minute to address document quality issues that can jeopardize their carefully planned timelines?

In general, the root causes trace back to not fully understanding the requirements and failing to put in place a system to ensure they are addressed throughout the entire process:

  • Sponsors may have put off becoming compliant for years, believing that they did not have to worry about electronic submissions yet and not understanding that the documents they were creating would be included in electronic submissions in the not-so-distant future.

  • Sponsors may not have studied the requirements thoroughly, or may have believed that they would be fully met through the use of commercially available document templates. They may also have relied too much on their CROs, suppliers, or partners to produce compliant documents.

  • Sponsors may have instituted a program for compliant documentation, but failed to monitor it and make corrections.

Whatever the reasons, dealing with documents that are not e-submission-ready at the last minute is costly and time consuming. If problems are found with approved documents in PDF format, fixing them may invalidate electronic or digital signatures, break hyperlinks and bookmarks, and result in the need for re-review and additional publishing activities.

Risks of noncompliance

In the past, regulatory authorities were so eager to encourage electronic submissions that they would often accept significantly flawed applications. However, sponsor adoption of electronic submissions has increased significantly in recent years. For example, the FDA now receives over 50% of new drug applications (originals and supplements) in electronic common technical document (eCTD) format.1 As a result, agencies are beginning to demand greater compliance with the published specifications and standards. For example:

  • The FDA began checking for dozens of errors related to PDFs and fillable forms in 2010.2 Currently, FDA is rejecting around 7% of eCTD applications due to errors.3

  • Reportedly, the French health authority has revealed that it is rejecting half of all NeeS (non-eCTD electronic submissions) dossiers because they have been poorly formatted and do not meet basic validation standards.

  • The Belgian authority has required full compliance with published standards since 2007.4

  • SwissMedic requires the sponsor to perform technical validation of an eCTD before submission, and to submit a form documenting all unresolved errors.

Each of the major authorities accepting eCTD has published validation specifications and established its authority to reject electronic submissions.5-13 The impact of a rejected submission is obvious: delay due to rework. What is less commonly understood is that a technically deficient submission may also result in review problems, leading to further delays in approval, such as the issuance of a complete response letter in the United States. Understanding and complying with requirements as documents are created, reviewed, and assembled minimizes the risk of such costly delays due to technical and organizational issues.

Planning for compliance

We've established some of the underlying causes of non-compliance, as well as the risks and consequences. The remainder of this article will discuss key document management practices supporting compliant e-submissions. The practical information and recommendations offered will help to identify gaps in current practices and understand how to address them.

The key document management practices that will be discussed include:

  • Establishing taxonomy and metadata that map to eCTD/NeeS definition of documents and granularity

  • Use of high quality authoring templates

  • Generation of compliant PDF renditions

  • Judicious choice of document titles

  • Limiting file size to acceptable values

Taxonomy and metadata

Taxonomy and metadata are buzzwords for concepts with which most of us are already familiar.

Taxonomy is the practice and science of classification. In the document management world, this means the classification scheme for the types of documents that you manage. For example, in the general sense, a document may be a "quality" or "CMC" document, then more specifically a "control of drug substance" document, and then a "specification." Taken together, the entire inventory of documents included in a submission (and sometimes others as well) is organized in an electronic document management system (EDMS) using a taxonomy.

Metadata is loosely defined as data about data. In the document management world, this means information about your documents which may or may not be included within the documents themselves. Document metadata (also known as properties or attributes) is used to organize and retrieve documents. It may include title, author, approval date, study number, drug substance, or many other pieces of information.

The importance of taxonomy. In the paper world, it didn't make much difference how your documents were structured. You could create a single large document representing all of module 3, paginate it continuously, and break it up (or "volumate" it) for binding into physical volumes. For electronic submissions, however, files (documents) are expected to be provided at very specific levels, slotting into a submission table of contents (the eCTD backbone). Compliance with this requirement has a profound effect on authoring, review, and approval—and starts with the implementation of a compliant taxonomy in an EDMS.

The ICH Electronic Common Technical Document Specification defines the standard sections (overall table of contents) of the CTD/eCTD. The guideline "Organization of the Common Technical Document for the Registration of Pharmaceuticals for Human Use M4" adds the concept of granularity—the level at which distinct, standalone documents can exist. This reference provides a definition for the term document:

A document is defined for a paper submission as a set of pages, numbered sequentially and divided from other documents by a tab. A document can be equated to a file for an electronic submission... In an electronic submission, a new file starts at the same point at which in a paper submission, a tab divides the documents.

The guideline then defines the appropriate granularity to support the submission table of contents (Figure 1).

Figure 1. Yellow shading indicates that documents rolled up to this level are not considered appropriate, and the purple shading indicates that a single document can be submitted at this level.

So how does this affect the EDMS? The answer is that the taxonomy in your EDMS should be optimized to support the organization and taxonomy of the CTD/eCTD (as well as any other submission types you might need to produce). For example:

  • The system should not normally allow you to create a "module 2" document.

  • The system should allow you to create a single document representing all of 2.3 (the quality overall summary). However, it should also allow you to create a document representing the drug substance portion of the QOS (2.3.S) or even general information about the drug substance (2.3.S.1).

  • The taxonomy should be specific enough to map back to the submission sections. For example, if a user creating a document only identifies it as a "report" or "protocol," it could belong in many locations in the quality, nonclinical, or clinical section of a submission.

  • The taxonomy should ensure that there is a logical place for everything to be included in a submission, while minimizing the use of categories like "other."
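
The points above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular EDMS implements it: the type keys, labels, and the set of "too generic" types are invented for the example, and only the section numbers (2, 2.3, 2.3.S, 2.3.S.1) come from the discussion above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentType:
    """One entry in a hypothetical EDMS taxonomy."""
    label: str
    ctd_section: str          # CTD/eCTD section the type maps back to
    single_document_ok: bool  # False where a rolled-up document is not appropriate


# Illustrative excerpt only; keys and labels are invented for this sketch.
TAXONOMY = {
    "module-2": DocumentType("Module 2 (entire module)", "2", False),
    "qos": DocumentType("Quality overall summary", "2.3", True),
    "qos-drug-substance": DocumentType("QOS: drug substance", "2.3.S", True),
    "qos-ds-general-info": DocumentType(
        "QOS: drug substance, general information", "2.3.S.1", True
    ),
}

# Types too generic to map back to a single submission section (assumed list).
TOO_GENERIC = {"report", "protocol", "other"}


def section_for(type_key: str) -> str:
    """Return the CTD section for a document type, or raise ValueError."""
    if type_key in TOO_GENERIC:
        raise ValueError(f"'{type_key}' is too generic to place in a submission")
    doc_type = TAXONOMY.get(type_key)
    if doc_type is None:
        raise ValueError(f"'{type_key}' is not in the taxonomy")
    if not doc_type.single_document_ok:
        raise ValueError(
            f"a single document is not appropriate at section {doc_type.ctd_section}"
        )
    return doc_type.ctd_section
```

With a structure like this, an author who picks "qos-drug-substance" gets section 2.3.S without having to memorize it, while an attempt to create a "module-2" or plain "report" document is refused at creation time rather than discovered during publishing.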

Sponsors don't always do a good job in organizing documentation in accordance with the CTD/eCTD guidance, with only about half of them doing a good or excellent job14 (Figure 2).

Figure 2. About half of sponsors are doing a good or excellent job organizing documentation in accordance with the CTD/eCTD guidance.

The best practice for implementing a taxonomy in an EDMS is to design one that clearly maps each type of document back to its regulatory requirements, including submission section number, and enforces granularity requirements. Properly implemented, this will assist in organizing information correctly without requiring EDMS users to memorize details about what documents are placed where. One such taxonomy is provided by the EDM reference model—a taxonomy/metadata reference model developed by an industry working group that maps heavily to regulatory requirements.

Walking the metadata tightrope. Metadata is information about a document stored within the EDMS in association with the document. It's commonly used to place a document in a correct folder structure in an EDMS (for example, one including a specific product or study number). It's also used to place a document in the correct eCTD section (e.g., one associated with a specific drug substance) or to provide information used to create a "study tagging file" describing a nonclinical or clinical study. Finally, it's used for search and retrieval within an EDMS.

How is metadata populated? The short answer is that it is entered by a user. Anyone who has been associated with electronic document management has heard various horror stories about systems that require users to enter dozens of pieces of metadata. The user entering them doesn't see the value, and often rebels and ends up working outside the system. On the other hand, without context a document management system becomes an undifferentiated repository for documents of unknown value or relevance. The solution to this problem involves limiting metadata to that providing true value, and automating metadata population wherever practical.

The core metadata required for eCTD includes:

  • Various regional administrative information, such as submission date and type.

  • Drug substance name and potentially manufacturer (handling of manufacturers in eCTD is outside the scope of this article).

  • Drug product name, dosage form, and manufacturer.

  • Indication.

  • For nonclinical studies, study title and ID, and for certain types of studies, species, route of administration, and study duration.

  • For clinical studies, study title and ID, and for certain types of studies, type of control.

In an efficient EDMS, the user should only have to populate fairly minimal metadata. For example, the minimum metadata for many clinical documents can be the title, product, and study ID. Using this information, other metadata such as study title and other study data can be auto-populated from a central source. The key is to limit metadata to what is essential for classification in the EDMS, filing within a submission structure, or retrieving a document in a search.
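
As a rough sketch of auto-population, assume a central study registry keyed by study ID. The registry, its field names, and the sample study below are all hypothetical; a real system might draw on a clinical trial management system or a study master list instead.

```python
# Hypothetical central source of study-level metadata; every value here is
# invented for illustration.
STUDY_REGISTRY = {
    "ABC-101": {
        "study_title": "Single ascending dose study in healthy volunteers",
        "indication": "Hypertension",
        "type_of_control": "Placebo",
    },
}


def populate_metadata(
    title: str, product: str, study_id: str, registry: dict = STUDY_REGISTRY
) -> dict:
    """Combine the three user-entered fields with registry-supplied fields."""
    study = registry.get(study_id)
    if study is None:
        # Failing here keeps untagged documents from entering the repository.
        raise KeyError(f"study '{study_id}' is not in the registry")
    metadata = {"title": title, "product": product, "study_id": study_id}
    metadata.update(study)  # auto-populated, never typed by the author
    return metadata
```

The author supplies three values; everything needed later for the study tagging file comes from the single maintained source, so it is entered once and stays consistent across documents.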

Benefits of optimized taxonomy and metadata. When an EDMS is built to map to submissions and collect key submission-related properties:

  • Authors categorize documents as they are created.

  • Authors assign the metadata that allow documents to be placed in the submission TOC and categorized via study tagging files.

  • Publishers can locate the content needed for a submission, understand if it is ready, and place it under the correct table of contents heading.

A key element is that this classification occurs as a core part of the regular business process—not as an activity done further down the line.

Authoring templates

Many people are already aware of the need for authoring templates (and associated style guides) to comply with ICH and regional requirements for: margins, fonts, styles, tables of contents, headers, footers, pagination, page size, etc.

ICH and the various regional authorities provide guidance on requirements; most specifically, Appendix 7 of the "Electronic Common Technical Document Specification" and the FDA document "Portable Document Format Specifications."

For companies without the internal staff availability or expertise needed to develop their own templates, a number of commercial template sets are available for purchase. These templates can be adapted to include a sponsor's own boilerplate text, authoring instructions, and other content.

Adoption and proper use of good authoring templates should be a priority for sponsors, even those who aren't doing electronic submissions yet or who are using a publishing partner to prepare them. Investing a relatively small amount of time in producing a compliant document now will pay off years in the future when the document is actually submitted, or even sooner if costly re-work by a publishing partner can be avoided.

Providing timely and relevant training to authors is a key factor in the successful use of templates. Some companies simply add their templates to their EDMS and expect that authors will use them correctly. However, not all authors are expert users of Microsoft Word, and many do not fully understand the concept of styles and headings, which are essential to producing compliant documents. Authors may introduce custom symbol fonts, use heading fonts but not actually apply heading styles, decrease font sizes, and decrease or change margins, all in support of their legitimate goals of producing what they believe to be compact and usable documents. A simple training session will help them to understand the unintended consequences of failing to follow authoring guidelines. Some authors may need more extensive Word training in addition.

Another common mistake related to templates and authoring standards is failure to audit documents and monitor compliance with standards. Sponsors who use a monitoring program report major increases in compliance, resulting in 80 to 95% compliance according to industry speakers. Approaches range from using automatic tools to manual checking, but most successful programs involve rejecting non-compliant documents back to their authors and providing additional training or re-training as needed.

Compliant PDFs

Although templates are used to produce high quality Word documents, those documents are almost always rendered to PDF format for submissions. When good templates are used, for the most part good PDF renditions will follow. However, producing compliant PDF renditions is a complex technical process and sponsors should review their PDF generation tools and processes to ensure that they comply with the myriad of requirements. Table 1 shows important PDF requirements that health authorities check for using their eCTD or NeeS validation tools (other requirements, such as those around margins, page orientations, etc. are not checked but are still important).

Table 1. PDF requirements that health authorities check for using their eCTD or NeeS validation tools.

In addition, ensure that the bookmarks and hyperlinks created in your Word document are preserved in the PDF, and that bookmarks are created down to the fourth level to match the document table of contents. Agencies validate bookmarks and hyperlinks for correctness and to ensure they do not use absolute file paths, which will not work once delivered to agency servers.
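
A simple pre-submission check for absolute link targets might look like the sketch below. It assumes the link targets have already been extracted from the PDF by whatever tool you use (that step is not shown), and the pattern is a heuristic of our own, not an agency validation rule: it flags Windows drive paths, paths starting with a slash or backslash (including UNC shares), and `file:` URLs.

```python
import re

# Heuristic for "absolute" targets: drive letter paths (C:\ or C:/), paths
# starting with / or \ (including \\server\share), and file: URLs.
_ABSOLUTE_TARGET = re.compile(r"^(?:[A-Za-z]:[\\/]|[\\/]|file:)", re.IGNORECASE)


def absolute_link_targets(targets: list[str]) -> list[str]:
    """Return the link targets that look like absolute file paths."""
    return [t for t in targets if _ABSOLUTE_TARGET.match(t.strip())]
```

Relative targets such as `../m3/spec.pdf#page=4` pass untouched. Web links (`http://`, `https://`) are deliberately not flagged here, since they are a separate question from absolute file paths.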

Most of these settings can be controlled by correct configuration of your rendering solution, even if you just use Distiller (Adobe PDF).

To ensure that the authorities can search documents and copy and paste text into their assessment reports, they prefer that PDFs are created from electronic sources (such as Word) whenever possible, and that scanning be kept to a minimum. The US FDA will generate a validation error (5057) for each document that is not text searchable. The draft eCTD guidance of the Australian Therapeutic Goods Administration (TGA) specifies documents that must be generated from an electronic source and text searchable (see Reference 13). Other authorities state that PDFs produced from an electronic document are much preferred and in some cases require them for expert reports, summaries, forms, etc. The overall message to sponsors is to minimize scanning.

Document titles

Document titles (not file names) are extremely important to agency reviewers. Many people simply assign the title on the document's title page as the EDMS/eCTD title, but that's not always the best choice.

It's a best practice to create good titles in your EDMS so that publishers don't have to re-work titles when they create submissions. Keep these four guidelines in mind when creating titles:

  • Keep titles succinct. Reviewers generally only see the first 20 to 40 characters of a title unless they scroll or change the size of the panes within their review application.

  • Load important information to the left (beginning). For example, envision two protocols with the same very long title, except that the second includes the word "amendment" at the very end. The reviewer will not be able to distinguish the amendment from the original protocol.

  • Make sure that your title distinguishes a document from other leaves/documents in the same section (sometimes even across sequences). A section containing a number of files with the same titles drives the FDA's reviewers crazy. If you provide four investigator CVs called "investigator CV" for the same study, the reviewer will not know which one is which. Likewise, if you assign a title of "cover letter" to every cover letter, think about what the reviewer will see someday when you have provided 100 eCTD sequences, each with its own cover letter.

  • Define the title such that the reviewer does not have to open the document to understand what's in it. Providing "method 1" and "method 2" is not helpful.
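
Some of these guidelines lend themselves to an automated check within a single submission section. The sketch below is one possible approach under stated assumptions: it treats 40 characters (the upper end of the range mentioned above) as the visible width, compares titles case-insensitively, and uses message strings invented for this example. It cannot judge whether a title is meaningful, so "method 1" style titles still need a human eye.

```python
from collections import Counter

VISIBLE_CHARS = 40  # assumed visible width; upper end of the 20 to 40 range

DUPLICATE = "same title as another document in the section"
SAME_START = "not distinguishable within the visible characters"
TOO_LONG = "longer than the visible width"


def review_titles(titles: list[str], visible_chars: int = VISIBLE_CHARS):
    """Return (title, problems) pairs for titles that break a guideline."""
    normalized = [t.strip().lower() for t in titles]
    exact = Counter(normalized)
    visible = Counter(n[:visible_chars] for n in normalized)
    findings = []
    for title, norm in zip(titles, normalized):
        problems = []
        if exact[norm] > 1:
            problems.append(DUPLICATE)
        elif visible[norm[:visible_chars]] > 1:
            # Different titles whose distinguishing words fall off the right edge.
            problems.append(SAME_START)
        if len(title.strip()) > visible_chars:
            problems.append(TOO_LONG)
        if problems:
            findings.append((title, problems))
    return findings
```

Run against a section's titles, this would flag two documents both called "Investigator CV", as well as a protocol and its amendment whose long titles differ only in the final word.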

File size

Physical file size must not exceed 100 MB according to ICH specifications. In fact, all major agencies trigger validation errors for files exceeding 100 MB—FDA error 1238 (low), HC error 3 (error), EU Error 15.BP1 (best practice). An exception is SAS SDTM datasets, which can be up to 400 MB.

It's a best practice to monitor the size of PDFs as they are created in your EDMS, if possible. If you create, review, and approve a large PDF and then have to split it, you will have to re-establish bookmarks and hyperlinks. Where you can, reduce the size of the file (there are various techniques for doing so); where you cannot, at least you will be aware of the need to split the document before external hyperlinks are created to or from it.
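
A size check of this kind is easy to script. The sketch below uses the 100 MB and 400 MB limits given above, but two details are assumptions of ours: it counts a megabyte as 1024 × 1024 bytes, and it recognizes datasets by a `.xpt` (SAS transport) file extension. Adjust both to match the specification you are validating against.

```python
from pathlib import Path

MB = 1024 * 1024             # assumption: binary megabytes
DEFAULT_LIMIT = 100 * MB     # general file size limit
DATASET_LIMIT = 400 * MB     # higher limit for SAS datasets
DATASET_SUFFIXES = {".xpt"}  # assumption: datasets identified by extension


def limit_for(name: str) -> int:
    """Return the size limit in bytes that applies to a file name."""
    if Path(name).suffix.lower() in DATASET_SUFFIXES:
        return DATASET_LIMIT
    return DEFAULT_LIMIT


def exceeds_limit(name: str, size_bytes: int) -> bool:
    """True if a file of this size is over the limit for its type."""
    return size_bytes > limit_for(name)


def oversized_files(root: str) -> list[tuple[str, int]]:
    """Walk a folder tree and list (path, size) for files over their limit."""
    found = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            if exceeds_limit(path.name, size):
                found.append((str(path), size))
    return found
```

Running `oversized_files` against a rendition folder on a schedule, rather than at publishing time, surfaces large files while splitting them is still cheap.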


A review of the requirements for e-submission-ready documents, as well as the risk and cost of non-compliance, provides clear justification for implementing a set of tools and procedures that build compliance into documents as they are created, reviewed, and approved in an EDMS—not just at submission time.

Many sponsors are following some, or even most, practices to ensure e-submission-ready documents. The checklist presented in Table 2 will help to identify gaps in processes and tools that can result in costly re-work.

Table 2. Gaps in processes and tools can result in costly re-work.

Kathie Clark is Director, Product Management at NextDocs Corporation, 240 King of Prussia, PA, e-mail:


1. G. M. Gensinger, "FDA International Update," presented at the Drug Information Association annual meeting in Washington, DC (June 20, 2010).

2. V. Ventura, "FDA eCTD Update: What's Ahead for Validation Codes and Module 1," presented at the Drug Information Association Electronic Submissions Conference in San Diego, CA (October 28, 2010).

3. V. Ventura, "CDER Update: eCTD & Gateway Submissions," presented at the GPhA/FDA ANDA Labeling Workshop, North Bethesda, MD (April 14, 2010).

4. Federal Agency for Medicines and Health Products, "e-Submission."

5. International Council for Harmonisation, "Organization of the Common Technical Document for the Registration of Pharmaceuticals for Human Use," M4 (January 13, 2004).

6. International Council for Harmonisation, "Electronic Common Technical Document Specification," V3.2.2 (July 16, 2008).

7. Food and Drug Administration, "Portable Document Format Specifications," V2.0 (June 4, 2008).

8. Food and Drug Administration, Draft, "Guidance for Industry: Providing Regulatory Submissions in Electronic Format — Receipt Date" (June 2007).

9. Food and Drug Administration, "Specifications for eCTD Validation Criteria," V1.0 (March 10, 2008).

10. Food and Drug Administration, "Specifications for eCTD Validation Criteria," V2.0, Draft (December 10, 2010).

11. Health Canada, "Health Canada eCTD Validation Rules" (November 13, 2007).

12. European Medicines Agency, "EU eCTD Validation Criteria," V2.1 (April 2009).

13. Therapeutic Goods Administration, "Draft Guidance for Industry on Providing Regulatory Submissions for Prescription Medicines in Electronic Format (eCTD) in Australia" (January 2009).

14. S. M. Connelly, "CTD/eCTD Quality: FDA Survey Results," presented at the Drug Information Association annual meeting in Washington, DC (June 20, 2010).
