Full-text search allows you to find search term matches within the content of a document, whereas the standard search only locates matches within document fields.
Note: For performance reasons, full-text search returns only the 5,000 most relevant matching documents.
You can only use full-text search from the Advanced Search dialog. To include document content in searches:
- Click the binoculars icon in the search bar to open Advanced Search.
- In Search Scope, choose Include Content.
- Optional: Select Include Attachments to search for matches in document and object record attachments.
- Enter your search term in either the Any of these words field or the All of these words field, depending on which search operator you want to use. You can use both fields if needed.
- Fill in the remaining fields as needed.
- Click Search.
Vault separates search terms into various segments when searching on alpha-numeric and punctuation fields. This process is called “tokenization.”
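The exact tokenization rules are not described here, so the following is only a minimal sketch of the general idea. It assumes a hypothetical rule in which runs of letters and runs of digits become separate tokens and punctuation acts as a separator. The function name `tokenize` and the sample term are illustrative and are not part of Vault.

```python
import re


def tokenize(term: str) -> list[str]:
    """Split a search term into lowercase tokens.

    Assumed rule (hypothetical, not Vault's documented behavior):
    letter runs and digit runs are separate tokens, and any
    punctuation or whitespace between them is discarded.
    """
    return [t.lower() for t in re.findall(r"[A-Za-z]+|\d+", term)]


# A made-up identifier that mixes letters, digits, and punctuation.
tokens = tokenize("VV-QMS-00123")
print(tokens)
```

Under this assumed rule, a term that mixes letters, digits, and hyphens is searched as several shorter segments rather than as one literal string, which is why partial matches on such values can appear in results.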
By default, searches include document fields or object record fields only. Select Include Content to include document content when using Advanced Search. Select Include Attachments to search for matches in document and object record attachments.
You can also use Advanced Search to search within the document archive. To do this, select Archive from the Search In drop-down or set the Search Archives toggle to Yes. These options only appear if you have a security profile that includes the View Archive permission.
If you enter a search term in the All of the words field with both Include Content and Include Attachments selected, Vault performs a separate search for each search scope. In addition, Vault performs a third search on the document’s metadata. The search term must appear in at least one of the search scopes to generate results.
Note: If you search an object record using All of the words and Include Attachments, Vault performs one search on the entire record and another on the record’s attachments.
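One way to picture the per-scope behavior is the sketch below. It assumes that "separate search for each search scope" means every word must be found within a single scope, and that words split across two scopes do not produce a match. That reading, the function name `scopes_matching_all_words`, and the sample document are all assumptions made for illustration.

```python
def scopes_matching_all_words(words: list[str], scopes: dict[str, str]) -> list[str]:
    """Return the names of the scopes whose text contains every word.

    Each scope (metadata, content, attachments) is checked on its own,
    mirroring the idea of one search per scope.
    """
    matched = []
    for name, text in scopes.items():
        tokens = set(text.lower().split())
        if all(word.lower() in tokens for word in words):
            matched.append(name)
    return matched


# Hypothetical document with three searchable scopes.
document = {
    "metadata": "Stability Report 2023",
    "content": "The batch passed stability testing",
    "attachments": "raw chromatography data",
}

# Both words occur together in the metadata scope only.
print(scopes_matching_all_words(["stability", "report"], document))

# Each word occurs somewhere, but never together in one scope.
print(scopes_matching_all_words(["stability", "chromatography"], document))
```

In this sketch, the document is returned as a result when the list of matching scopes is not empty.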
About Search Results
When you search within the content of a document, Vault runs separate searches for document fields and document content, and then merges the final set of results. If the search results include more than 5,000 documents, Vault limits the results to the first 5,000 documents that are most relevant to your search terms and displays a warning. To see a complete set of results, apply additional filters before performing another full-text search.
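The merge-and-limit behavior described above can be sketched as follows. This is an illustration under stated assumptions, not Vault's implementation: the relevance scores, the rule of keeping the higher score when a document matches in both searches, and the names `merge_results` and `MAX_RESULTS` are all hypothetical.

```python
MAX_RESULTS = 5000  # result cap stated in this article


def merge_results(field_hits, content_hits, limit=MAX_RESULTS):
    """Merge two lists of (doc_id, score) pairs and cap the result.

    Returns (ranked_hits, truncated). `truncated` is True when more
    documents matched than the limit allows, which is the case where
    a warning would be shown.
    """
    best = {}
    for doc_id, score in list(field_hits) + list(content_hits):
        # Assumption: a document found by both searches keeps its best score.
        if doc_id not in best or score > best[doc_id]:
            best[doc_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit], len(ranked) > limit


# Small example with an artificially low limit of 2.
hits, truncated = merge_results(
    [("a", 0.9), ("b", 0.2)],
    [("b", 0.7), ("c", 0.5)],
    limit=2,
)
print(hits, truncated)
```

When `truncated` is true, narrowing the search with additional filters reduces the number of matches so that the full set fits within the limit.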
If Vault finds a match for your search terms within the document content, the search results page displays an excerpt from the document to provide context for the matching term.
Vault automatically indexes the full text of documents with supported source file formats to support full-text search. Document content is typically available for search within minutes after upload, but there may be a delay when many documents are being uploaded simultaneously. Vault also indexes document and object attachments.
Vault can extract and index text within scanned source documents that users upload as images or PDF files. This functionality, called Optical Character Recognition (OCR), allows you to use full-text search on these documents. Vault only extracts typed, English-language text.
Supported Formats for Text Extract
OCR automatically attempts to extract text from non-animated files in the following supported formats:
- PDF (only if the PDF does not already contain text)
- Portable Network Graphics (PNG)
- Tagged Image File Format (TIF, TIFF)
- JPEG (JPEG, JPG)
- Graphics Interchange Format (GIF)
- Bitmap (BMP)
- AV1 Image File Format (AVIF)
- Scalable Vector Graphics (SVG)
- WebP (WEBP)
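As a rough pre-upload check, the list above can be expressed as a lookup on file extensions. This is a minimal sketch: the function name `is_ocr_candidate` and its two flags are hypothetical, and the caller is assumed to know already whether a PDF contains text or whether an image is animated, since the sketch does not inspect file contents.

```python
from pathlib import Path

# File extensions taken from the supported formats listed above.
OCR_EXTENSIONS = {
    ".pdf", ".png", ".tif", ".tiff", ".jpeg", ".jpg",
    ".gif", ".bmp", ".avif", ".svg", ".webp",
}


def is_ocr_candidate(filename: str, pdf_has_text: bool = False,
                     animated: bool = False) -> bool:
    """Return True if a file looks eligible for OCR text extraction."""
    extension = Path(filename).suffix.lower()
    if extension not in OCR_EXTENSIONS:
        return False
    if animated:
        # Only non-animated files are processed.
        return False
    if extension == ".pdf" and pdf_has_text:
        # PDFs that already contain text are not processed by OCR.
        return False
    return True


print(is_ocr_candidate("scan_001.TIFF"))
print(is_ocr_candidate("report.pdf", pdf_has_text=True))
```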