Vault automatically generates a PDF rendition (the “viewable rendition”) when a user uploads a file to Vault as a new document or version. Viewable renditions display in the inline viewer and allow users with appropriate access to add annotations to the document. Each version of a document has a unique viewable rendition to reflect any changes made during the editing process. See a list of file formats we support for auto-generated viewable renditions.
Like source files, the viewable rendition is available to download. If users have added annotations to the document, an additional PDF with Annotations download option is available. This option creates a new PDF file by merging the comments with the viewable rendition.
Note that to download the viewable rendition, you must have a security profile that grants the Document: Download Rendition permission.
Vault generates all viewable renditions in PDF version 1.7, in accordance with ICH Export Working Group M2 Recommendation – Electronic Standards for the Transfer of Regulatory Information (ESTRI) File Format Recommendation – PDF. (See the ICH website for details.)
Note: This setting only applies to renditions that Vault generates. If the original format of the uploaded file is PDF, Vault does not convert it to PDF 1.7.
Vault also sets the Fast Web View option to “true” and sets the default view for viewable renditions with bookmarks to “page plus bookmarks.”
By default, Vault renders documents as PDFs. Admins can configure Rendition Settings to render documents as PDF/A-1b compliant. If PDF/A-1b renditions are enabled, Vault will no longer render documents as standard PDFs.
About Protected PDF Renditions
By default, Vault-generated PDF renditions are unprotected. Admins can configure their Vault’s Rendition Settings to render protected PDFs, which apply safeguards that prevent the PDF rendition from being altered. When protected renditions are enabled, Vault no longer renders documents as unprotected PDFs.
Some PDF files include Adobe® Acrobat® security settings that prevent viewing of the files in the Vault inline viewer. Admins can enable handling of these PDF files in the Admin area. If enabled, Vault can display and allow annotation of PDF files that have permissions passwords enabled through Adobe® Acrobat® security settings, as well as XFA-based PDF files produced with Adobe Experience Manager® or Adobe LiveCycle Designer®.
Vault does not support the viewing of PDF files with document open passwords or certificate security. Additionally, Vault cannot apply overlays or eSignature pages to protected files.
Veeva Support can enable locked fields for Microsoft Word™ documents to prevent the fields from updating when the document renders. Currently, fields in document headers and footers do not lock.
Note that the Microsoft™ DATE field automatically updates each time Vault renders your document. If you do not want the date to update due to rendering, enter the desired date manually, or use the Microsoft™ SAVEDATE field, which stores the date and time that the document was last saved.
To learn more about Microsoft Word™ fields, view the Microsoft Word™ documentation.
If you use password protection in a Microsoft Office™ file to control view access, Vault cannot create a viewable rendition. If you need to use password protection on Microsoft Office™ files, but want Vault to auto-generate viewable renditions, you can change the settings for the file to only protect file editing.
When you upload image files with transparent backgrounds, the Doc Info page displays those images against a white background. In some cases, such as a white logo image with a transparent background, you may not be able to see the image properly in Vault.
Image Quality for Word Files
The Image Quality setting preserves the native image resolution (up to 5000 pixels) for PNG, JPEG, and TIFF raster images in viewable renditions that Vault generates from MS Word™ DOCX source files. With this setting enabled, Vault takes longer to generate viewable renditions, and renditions may have a larger file size. Before inserting images, you must set the MS Word™ Image Size and Quality option in your source files to Do not compress images in file; otherwise, MS Word™ downsamples images to its default of 220 PPI.
Vault can also render native-resolution rasterized images for vector images in EMF and WMF formats. This avoids issues with vector images where Vault does not properly render some characters or lines. Note that this makes text in images non-searchable on viewable renditions.
Note: You must contact Veeva Support to enable this setting. This setting only applies to viewable renditions from MS Word™ source files.
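To see why the 220 PPI default matters, the arithmetic can be sketched as follows. This is only an illustration of downsampling in general, not MS Word's™ or Vault's actual algorithm; the function name and the example dimensions are hypothetical.

```python
def downsampled_pixels(pixels: int, placed_inches: float, target_ppi: int = 220) -> int:
    """Estimate the pixel width left after downsampling to target_ppi.

    pixels:        native pixel width of the image
    placed_inches: width the image occupies on the page, in inches
    Images already at or below target_ppi are returned unchanged.
    """
    effective_ppi = pixels / placed_inches
    if effective_ppi <= target_ppi:
        return pixels
    return round(placed_inches * target_ppi)


# A 3000-pixel-wide scan placed 6 inches wide has 500 PPI.
# At 220 PPI it would keep only 6 * 220 = 1320 pixels of width.
print(downsampled_pixels(3000, 6.0))
```

In this example, more than half of the horizontal detail would be lost before Vault ever receives the file, which is why the compression option needs to be set in the source file first.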
OCR (Optical Character Recognition) extracts and indexes text within scanned image and PDF source files with no editable text. Vault then incorporates the text into the viewable rendition to support both text annotations and search within the document viewer. OCR offers optimal performance for English text, but does extract Latin characters for other languages as well.
Please note that this process can only extract typed text.
Partial Text Extraction & Limitations
Vault can extract up to 50 pages on bulk upload (migration) and up to 100 pages on manual upload or on re-render for high quality scanned documents. Migrated documents may be manually re-rendered to extract text for more pages if needed. When extracting text, Vault assigns a confidence score based on how well the words match and how readable the document is. In some cases, Vault will not extract text because the confidence score is too low. If you suspect that the OCR process may have failed for a specific document, check the document audit trail. OCR status and information on extracted content will be visible in the audit trail.
In some cases, Vault will fail to extract text:
- If word confidence scores are low
- If OCR does not detect any text
- If the process times out because the document is very long (over a hundred pages) or the scanned documents are not high quality
OCR Status Notification
Two shared document fields, OCR Requested and Pages OCR’d, offer information on the status of OCR content extraction. You can use these fields as filters when generating reports.
- OCR Requested shows whether a request for content extraction was made.
- Pages OCR’d displays the percentage of pages successfully OCR’d.
Note that an Admin must assign these shared fields to specific document types in order for you to see them.
OCR will automatically attempt to extract text on files with supported formats:
- PDF or PDF/A-1b with no editable text
- Portable Network Graphics (PNG)
- Tagged Image File Format (TIF, TIFF)
- JPEG (JPEG, JPG)
- Graphics Interchange Format (GIF)
- Bitmap (BMP)
File Size Limitations for OCR
By default, Vault does not extract OCR text for documents that exceed any of the following limits:
- PDF or PDF/A-1b files: 100 pages, 20MB
- TIFF files: 100 pages, 20MB
- Other supported formats: 5MB
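If you want to pre-screen files before upload, the default limits above can be expressed as a small check. This is a minimal sketch, not a Vault API: the function name, the extension sets, and the assumption that 1 MB equals 1,048,576 bytes are all illustrative.

```python
from typing import Optional

MB = 1024 * 1024  # assumption: the limits are measured in binary megabytes

# Formats with a page limit (PDF/A-1b files also use the .pdf extension).
PAGED_FORMATS = {"pdf", "tif", "tiff"}
# Other formats listed as supported for OCR.
SINGLE_IMAGE_FORMATS = {"png", "jpg", "jpeg", "gif", "bmp"}


def within_default_ocr_limits(extension: str, size_bytes: int,
                              pages: Optional[int] = None) -> bool:
    """Return True if a file falls within the default OCR limits listed above."""
    ext = extension.lower().lstrip(".")
    if ext in PAGED_FORMATS:
        if pages is None:
            raise ValueError("page count is required for PDF and TIFF files")
        return pages <= 100 and size_bytes <= 20 * MB
    if ext in SINGLE_IMAGE_FORMATS:
        return size_bytes <= 5 * MB
    return False  # not a format listed as supported for OCR


print(within_default_ocr_limits(".pdf", 12 * MB, pages=80))   # within limits
print(within_default_ocr_limits("png", 6 * MB))               # over 5 MB
```

A file that passes this check can still fail OCR for the other reasons described earlier, such as low confidence scores or a timeout.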
If the Admins for your Vault have configured overlays or eSignature manifestation pages, Vault automatically adds these when you download the viewable rendition. Overlays display text in the header, footer, and/or diagonally on the document pages. Signature pages can display details for eSignatures on a separate page that appears before or after your document pages.
On some documents, you may see the Disable Vault Overlays document field. If you need to skip overlays for a specific document, you can set this field to Yes. If the field is set to No, is blank, or is not applied to the document, Vault applies overlays according to your Vault’s configuration.
Vault makes embedded web (http://veeva.com), “mailto” (firstname.lastname@example.org), and internal (table of contents, cross references, etc.) links “clickable” in the viewable rendition if the source file is:
- PDF or PDF/A-1b
- Microsoft Office™ (DOC, DOCX, PPTX, etc.)
When you hover over an embedded link in the document viewer, the URL is visible in the lower left corner of the browser or as a pop-up information card (in Annotate mode only). In either mode, clicking the link opens the URL in a separate window or tab.
Clicking an embedded link while in View mode opens the URL in a new mini-browser window for easy document review. Clicking another link in the document refreshes the current mini-browser window to display the new target. When link annotations in View mode are enabled, Vault also opens annotation targets in this mini-browser window. In Annotate mode, links open in a separate tab or window as usual.
Vault supports a limited set of embedded link types. Vault presents unsupported links as either plain text (such as links created in Acrobat or Word to invoke software-specific functions), or as hypertext. Unsupported hypertext links display a non-clickable Unsupported Link tooltip.
Supported links meet some or all of the following criteria:
- Target URL is absolute (full path) and uses a whitelisted protocol (
- Target contains no angle brackets (< >) or square brackets ([ ]), except when escaped with character encoding.
- Target URL lacks a protocol but begins with www.; in these cases, Vault adds
- Target is a bookmark within the same document.
- Target URL is relative path and points to a document in the same binder, such as:
Unsupported links include the following:
- Relative path URLs that point to documents that are not in the same binder, such as:
- Local file links such as:
- Links beginning with a non-whitelisted protocol (for example, FTP, Sopcast, or Telnet) or an invalid-format protocol (for example, http: //example.com with a space after the colon).
- Links that lack a valid protocol and do not begin with
- Software-specific link-types (for example, various Acrobat-specific link types).
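The rules above can be approximated in a small pre-check for link targets. This is a rough sketch only and does not reproduce Vault's actual logic. In particular, the set of allowed protocols is an assumption made for illustration (this page only shows web and “mailto” links as examples), the use of a leading # to mean a same-document bookmark is an assumption, and the function name is hypothetical. Relative paths cannot be judged from the text alone, because the result depends on whether the target is in the same binder.

```python
import re

# ASSUMPTION: illustrative set only; the real whitelist is not given here.
ALLOWED_SCHEMES = {"http", "https", "mailto"}

# Splits "scheme:rest"; also matches Windows drive paths such as C:\docs\a.pdf.
SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):(.*)$", re.DOTALL)


def classify_link(target: str) -> str:
    """Return 'supported', 'unsupported', or 'needs-context' for a link target."""
    # Raw angle or square brackets are not allowed; escaped forms (%5B) pass.
    if re.search(r"[<>\[\]]", target):
        return "unsupported"
    # ASSUMPTION: '#name' denotes a bookmark within the same document.
    if target.startswith("#"):
        return "supported"
    # No protocol, but begins with www.
    if target.lower().startswith("www."):
        return "supported"
    match = SCHEME_RE.match(target)
    if match:
        scheme, rest = match.group(1).lower(), match.group(2)
        # Reject unknown schemes and malformed ones such as "http: //..."
        if scheme in ALLOWED_SCHEMES and rest and not rest[0].isspace():
            return "supported"
        return "unsupported"
    # Relative path: supported only if the target is in the same binder.
    return "needs-context"


for link in ["http://veeva.com", "http: //example.com", "ftp://example.com",
             "www.veeva.com", "reports/summary.pdf"]:
    print(link, "->", classify_link(link))
```

A check like this is useful for catching obviously unsupported targets in source files before upload; it is not a substitute for reviewing the links in the viewable rendition.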
When Vault renders a DOC or DOCX file, the viewable rendition can display links in blue text. These links include both web links (for example, http://www.veeva.com) and internal links in the MS Word™ file (table of contents links, cross references, etc.).
This feature depends on your Vault’s configuration. Admins can enable rendering links in the Admin area.
Note that some links in MS Word™ source files will convert into images during PDF rendition. When this occurs, linked text can’t display in blue, and Vault cannot index the text converted into images for full-text search. To avoid this, clear the formatting on the MS Word™ source file before creating the viewable rendition.
Vault shows bookmarks in PDF and PDF/A-1b documents automatically. When Vault renders a DOC or DOCX file, the viewable rendition displays bookmarks. Admins can configure bookmarks in the Rendition Settings page.
For more information on bookmarks, see Bookmark Support.
When Vault renders a PDF or PDF/A-1b source file with manually created Destinations, the viewable rendition displays them in the Destinations panel. For more information on Destinations, see Navigating to Destinations in the Document Viewer.
When enabled by an Admin, Vault automatically populates basic document metadata on Vault-generated PDF and PDF/A-1b viewable renditions. Vault populates the Title, Author, Subject, and Keywords fields in the Document Properties of the viewable rendition from the corresponding file properties of the source document.
This feature is only compatible with Microsoft Word, Excel, and PowerPoint source files.
Viewable renditions for MS Word™ documents include all markup and comments that exist on the source file. Admins can enable viewable renditions without document markup for all documents in your Vault. Users with the Vault Owner Actions: Re-render or Manage Viewable Rendition permissions can also enable this setting for individual documents by selecting Word Rendition Settings in the document’s Actions menu.
Vault does not include Speaker Notes in viewable renditions for MS PowerPoint documents. Admins can enable viewable renditions with speaker notes for all documents in your Vault. Users with the Vault Owner Actions: Re-render or Manage Viewable Rendition permissions can also enable this setting for individual documents by selecting PowerPoint Rendition Settings in the document’s Actions menu.
About File Encoding
To help ensure proper rendering of text-based files, make sure that each file declares the encoding it uses (UTF-8, ANSI, etc.). If the proper encoding is not declared, the text in the Vault-generated viewable rendition may display incorrectly.
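One way to check a text-based file before upload is to look for an encoding marker near the start of the file. The sketch below is an assumption-laden illustration: it looks for a Unicode byte order mark, an XML declaration, or an HTML meta charset, and this page does not state which of these markers Vault actually reads. The function name is hypothetical.

```python
import codecs
import re

# Byte order marks that identify a Unicode encoding.
BOMS = (codecs.BOM_UTF8, codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

# XML declaration with an encoding attribute, or an HTML meta charset.
DECLARATION_RE = re.compile(
    rb"""<\?xml[^>]*encoding=["'][\w.-]+["']|<meta[^>]*charset=["']?[\w.-]+""",
    re.IGNORECASE,
)


def declares_encoding(data: bytes) -> bool:
    """Return True if the first 1024 bytes contain a recognizable encoding marker."""
    head = data[:1024]
    return head.startswith(BOMS) or DECLARATION_RE.search(head) is not None


# Saving with "utf-8-sig" writes a UTF-8 byte order mark at the start of the file.
print(declares_encoding("Résumé".encode("utf-8-sig")))   # BOM present
print(declares_encoding("Résumé".encode("utf-8")))       # no marker
```

If a file has no marker, re-saving it from your editor with an explicit encoding option is usually the simplest fix.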
About HWP Viewable Renditions
Vault automatically creates viewable renditions for Hangul Word Processor (HWP) source files. See About Viewable Renditions for Hangul Word Processor for information about HWP viewable renditions.