PDF/A: PDF for Archiving
It's 2007. You just created an important document for a client—
a complex regulatory filing for the client's new power plant.
Fast forward to 2027—twenty years from today. Will the documents
In the legal industry, document conversion problems are legion.
Many attorneys started with WordPerfect and may have migrated to
Word. Opening all of those old documents can be troublesome.
Will everything convert? Will it look the same?
The PDF format, designed to capture the printed intent of a
document, is a great solution. With over half a billion copies
of Adobe Reader installed, PDF has become a de facto standard.
Adobe publishes the PDF specification, and more than 1,000
third-party products create, consume or work with PDF in one way
or another.
However, government and industry need more assurances—they
require de jure standards. A de jure standard is endorsed by an
independent standards body such as the International
Organization for Standardization (ISO).
Fortunately, we have PDF/A, an ISO standard.
PDF/A was developed by the PDF/A Joint Working Group, operating
under the auspices of AIIM (Association for Information and
Image Management).
AIIM has a rich history of developing standards for information
management, document management and imaging.
The legal market was well represented in the working group.
Chairing the group was Stephen Levenson, from the Administrative
Office of the U.S. Courts. The courts, of course, need to
safeguard important court decisions stored in electronic
The PDF/A working group first met in mid-2002 and the
specification was formally approved in May 2005. I’m told that
is extraordinarily fast, which indicates strong demand for this
standard in the document archiving community.
First things first: I’m not here to offer a complete technical
overview of PDF/A.
To keep it simple, here’s what law firms need to know about
PDF/A is based on PDF Reference 1.4, the Acrobat 5 file format
The standard dictates that some features are required and others
There are two "flavors" of PDF/A.
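PDF/A-1a (Level A conformance): intended for electronic
documents such as word processing files, spreadsheets, etc.
This level also requires that the document be tagged.
PDF/A-1b (Level B conformance): suited to documents scanned
from paper or microfiche, because it requires only that the
visual appearance be preserved.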
Do’s and Don’ts
The PDF/A specification notes that documents should be
self-contained, unfettered, device independent and tagged. What
does that mean?
Long-term predictability requires that documents do not rely on
outside elements to render properly. It makes sense that PDF/A
requires that fonts are embedded in the document.
What fonts are used in your organization? Some fonts carry a
"do not embed" flag that prevents them from being embedded by Adobe
Fonts add considerably to the "weight" of electronic files, so
you can expect that PDF/A files may be larger than the same PDF
without the fonts embedded.
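For the technically curious, the "do not embed" flag mentioned
above lives in the font file itself. In TrueType and OpenType
fonts it is the fsType field of the OS/2 table. The following is
a minimal sketch, not what Acrobat itself does, that reads that
field with Python's standard library. The function names are
made up for this example, and it assumes a single .ttf or .otf
file rather than a font collection.

```python
import struct

# Embedding-permission bits in the OS/2 table's fsType field.
RESTRICTED = 0x0002       # "restricted license": must not be embedded
PREVIEW_PRINT = 0x0004    # may be embedded for viewing and printing
EDITABLE = 0x0008         # may be embedded for editing
PERMISSION_MASK = RESTRICTED | PREVIEW_PRINT | EDITABLE


def read_fs_type(font_bytes: bytes):
    """Return the fsType value of a font, or None if it has no OS/2 table."""
    # Offset table: sfnt version (4 bytes), then the table count (2 bytes).
    num_tables = struct.unpack(">H", font_bytes[4:6])[0]
    for i in range(num_tables):
        # Table records start at byte 12 and are 16 bytes each:
        # tag, checksum, offset, length.
        start = 12 + 16 * i
        tag, _checksum, offset, _length = struct.unpack(
            ">4sLLL", font_bytes[start:start + 16]
        )
        if tag == b"OS/2":
            # fsType is the fifth 16-bit field of the table (byte offset 8).
            return struct.unpack(">H", font_bytes[offset + 8:offset + 10])[0]
    return None


def embedding_restricted(font_bytes: bytes) -> bool:
    """True when 'restricted' is the only embedding permission bit set."""
    fs_type = read_fs_type(font_bytes)
    if fs_type is None:
        return False
    # If a less restrictive bit is also set, that permission applies instead.
    return (fs_type & PERMISSION_MASK) == RESTRICTED
```

To try it, read a font file into memory with
open(path, "rb").read() and pass the bytes to
embedding_restricted(). A result of True suggests the font
cannot legally go into a PDF/A file, so it is worth checking
your firm's standard fonts before committing to the format.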
A self-contained document should not be reliant on any outside
media player or scripting system. PDF/A does not allow external
These restrictions rule out certain kinds of documents. For
example, a rich, cross-linked eBrief may not be PDF/A
The PDF/A spec demands that color is expressed in a
If you've ever compared the output from two different printers
or monitors, you have probably noticed subtle differences. For
long-term archiving, wouldn’t you want to know what the color
was really supposed to look like?
Put simply, device independence means using a known, standard
color space. Software in the application or operating system can
then translate the known space to the user color space—e.g. your
printer or monitor.
One color space I’d recommend for law firms is sRGB (Standard
RGB). sRGB is supported by most digital cameras and by Adobe’s
product line, including Photoshop.
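To make "a known, standard color space" concrete: sRGB publishes
exact formulas for turning its numbers into device-independent
CIE XYZ values, and that is what lets software translate them
for any particular printer or monitor. Below is a minimal Python
sketch of that first step, using the standard sRGB constants
rounded to four decimals. The function names are illustrative
only, and real color management software would normally work
from ICC profiles rather than hand-written math.

```python
def srgb_to_linear(channel: int) -> float:
    """Decode one 8-bit sRGB channel (0-255) to linear light (0.0-1.0)."""
    c = channel / 255.0
    # sRGB uses a short linear segment near black and a power curve above it.
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def srgb_to_xyz(r: int, g: int, b: int):
    """Convert an 8-bit sRGB color to CIE XYZ (D65 white point)."""
    rl, gl, bl = (srgb_to_linear(v) for v in (r, g, b))
    # Standard sRGB-to-XYZ matrix, coefficients rounded to four decimals.
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl
    return x, y, z
```

Because the same sRGB triple always maps to the same XYZ value,
a reader in 2027 can work out what the color was meant to be,
whatever hardware happens to be on the desk.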
The PDF/A specification insists that documents are unencumbered.
PDF security of any kind is not allowed. Besides, who would
remember a password twenty years from now?
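A secured PDF announces itself with an /Encrypt entry in the
file's trailer, so a rough first-pass screen for "unencumbered"
files is easy to sketch. The Python below is a heuristic only:
it scans the raw bytes for that entry rather than parsing the
file, so it can be fooled by unusual files and is no substitute
for a real PDF/A validator. The function names are invented for
this example.

```python
import re
from pathlib import Path

# An /Encrypt key followed by an indirect reference ("9 0 R")
# or an inline dictionary ("<<").
ENCRYPT_ENTRY = re.compile(rb"/Encrypt\s*(?:\d+\s+\d+\s+R|<<)")


def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the raw bytes appear to declare an /Encrypt entry."""
    return ENCRYPT_ENTRY.search(pdf_bytes) is not None


def file_looks_encrypted(path: str) -> bool:
    """Apply the same heuristic to a file on disk."""
    return looks_encrypted(Path(path).read_bytes())
```

Run over an archive folder, a check like this can flag files
that need their security removed before they are converted to
PDF/A.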
PDF/A documents require a metadata structure in the file.
Metadata (information about documents) may be used to record
items such as Title, Subject, Author, Keywords, and so on.
PDF/A-1a does not dictate that user fields be populated, but the
metadata structure must be present in the document.
Metadata has a negative connotation in the legal market, but the
intent with metadata in PDF/A is to allow future readers of
documents to more easily search and classify material.
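In practice, the metadata structure in a PDF/A-1 file is an XMP
packet, a small block of XML embedded in the PDF. The file
identifies itself as PDF/A through two properties in that
packet, pdfaid:part and pdfaid:conformance. The Python sketch
below builds a bare-bones packet and reads the level back out.
It is meant only to show what the structure looks like: the
function names are hypothetical, the packet holds just a title,
and actually embedding it in a PDF is a job for your PDF
software.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"

XMP_TEMPLATE = """<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
   <pdfaid:part>{part}</pdfaid:part>
   <pdfaid:conformance>{conformance}</pdfaid:conformance>
   <dc:title>
    <rdf:Alt><rdf:li xml:lang="x-default">{title}</rdf:li></rdf:Alt>
   </dc:title>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>"""


def build_pdfa_xmp(title: str, conformance: str = "B") -> str:
    """Return a minimal XMP packet that declares PDF/A-1a or PDF/A-1b."""
    if conformance not in ("A", "B"):
        raise ValueError("PDF/A-1 defines conformance levels A and B only")
    # Escape the title so characters such as '&' stay valid XML.
    return XMP_TEMPLATE.format(part=1, conformance=conformance,
                               title=escape(title))


def read_pdfa_level(packet: str):
    """Return (part, conformance) from an XMP packet, or None if absent."""
    root = ET.fromstring(packet)
    ns = {"pdfaid": PDFAID_NS}
    part = root.find(".//pdfaid:part", ns)
    conformance = root.find(".//pdfaid:conformance", ns)
    if part is None or conformance is None:
        return None
    return part.text, conformance.text
```

Note that the title is the only user field filled in here. As
described above, the standard cares that the structure exists,
not that every field is populated.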
Tagging is the structure added to documents so that the visually
impaired may more easily consume the document.
Tagging offers anybody reading documents on a computer screen a
number of benefits, however.