Digital Archiving...
- Required Components
- How To Create Digital Archives
- How To Create CD-ROM Collections
- Archiving Old Manuscripts
Many of the most important documents within companies are
already digital files. These documents are assets of the
company, representing the final packaged results of countless
man-hours of labor: maps, forms, brochures, vector plots,
GIS, photographic scans, spreadsheets, graphs, word
processing, annual reports, training manuals, etc. Their
accessibility and reusability are a real concern, particularly
in an environment as prone to change as the one companies
face today.
The problem is that the files may be products of any
number of different programs, produced on an array of
different machines.
How can these files be archived when there are so many
different variables involved?
One tactic is to convert the files to PDF. Files that
have lain dormant in one end of an organization take on new
life when they are made compatible with all of the machines
running off a common server or intranet. Converting and
cataloging the files reclaims otherwise lost assets that can
still be of service to the organization. With the fonts
embedded, even the most sophisticated graphics layout files
can be shared and printed on local printers by the newest
assistant or intern.
1. Required Components
- Minimum system requirements
- PDF conversion software - PDFWriter and/or Acrobat
Distiller®
- Acrobat Catalog® for cataloging and indexing
files
- Acrobat Reader® for opening converted files
- Acrobat Exchange® for conducting cross-document
indexed searches and for recomposing pages from different
sources into new documents
- Adobe Illustrator® for opening PDF files in order to
re-edit vector graphics and save them again as PDF
files
Component Availability
PDFWriter, Acrobat Distiller®, Catalog, Reader, and
Exchange are modules of the Adobe Acrobat package that can
be purchased directly via mail order for about $200.
Adobe Illustrator® is available as a separate program
for about $400.
2. Creating Digital Archives
There are two quick ways to create a PDF file:
- Save or Export the file directly in the PDF format
from a growing list of programs - Adobe Illustrator 5.x
and above, PageMaker 6.5, etc.
- Choose PDFWriter (available in the Adobe Acrobat
package) as the print driver. Instead of printing to a
printer, the file is output in PDF format. This works
well for Microsoft Word and Excel. It does NOT work well
for programs that rely on PostScript: QuarkXPress,
Photoshop clipping paths, imported Illustrator files, etc.
The other risk you run is that the fonts will not be
embedded, which means that the recipient may not see the
document in the intended fonts.
The "clean" way to create a PDF file is to:
- Save your source file, regardless of the originating
program, as a PostScript file with the fonts embedded.
The file may be many pages in length.
- Open Distiller and select the appropriate "Job Options"
for font embedding and particularly "Compression." For
proof e-mailing we recommend a compression resolution of
72 dpi. For proofing on-screen we recommend an
intermediate compression resolution of 300 dpi.
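To get a feel for what the two recommended resolutions mean in practice, the arithmetic can be sketched as follows. This is only an illustration: the 8.5 x 11 inch page and the 3 bytes per pixel (24-bit color) are assumptions, and the figures are for uncompressed image data, before Distiller applies any compression of its own.

```python
def scanned_page_size(width_in, height_in, dpi, bytes_per_pixel=3):
    """Return (width_px, height_px, uncompressed_bytes) for a full-page image.

    bytes_per_pixel=3 assumes 24-bit color; this is an illustrative default.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


# Compare the two resolutions mentioned in the text for a letter-size page.
for dpi in (72, 300):
    w, h, size = scanned_page_size(8.5, 11, dpi)
    print(f"{dpi} dpi: {w} x {h} px, about {size / 1_000_000:.1f} MB uncompressed")
```

Because the pixel count grows with the square of the resolution, the 300 dpi image carries roughly 17 times as much data as the 72 dpi one, which is why the lower setting suits e-mail.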
Modify the file in Adobe Acrobat Exchange®:
- Open Exchange and edit your files - add other pages,
crop, rotate, create links and bookmarks, append notes,
etc.
- "Save As" an optimized file and add security
passwords if desired. Optimizing reduces the size of the
PDF file and adds byteserving (a.k.a. linearization),
which means that the end user can view the document
one page at a time while the rest of it downloads in
the background.
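If you want to confirm that a saved file really was optimized, one rough check is to look for the linearization marker near the start of the file. The sketch below is a heuristic, not a validator: it relies on the PDF specification's rule that a linearized file carries a /Linearized dictionary within its first 1024 bytes, and it does not verify the hint tables or the rest of the structure. The function name and the file name in the usage note are hypothetical.

```python
def looks_linearized(pdf_bytes: bytes) -> bool:
    """Heuristic check for a linearized ("optimized") PDF.

    Only the first 1024 bytes are inspected, since the linearization
    dictionary must appear within that range in a linearized file.
    """
    head = pdf_bytes[:1024]
    return head.startswith(b"%PDF-") and b"/Linearized" in head


# Synthetic examples standing in for real files.
optimized = b"%PDF-1.2\n1 0 obj\n<< /Linearized 1 /L 12345 >>\nendobj\n"
plain = b"%PDF-1.2\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
print(looks_linearized(optimized), looks_linearized(plain))
```

For a file on disk, read just the head in binary mode, for example `looks_linearized(open("main.pdf", "rb").read(1024))`.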
Catalog the files using Adobe Acrobat Catalog®.
The end user can open the individual files in Acrobat
Reader® (free) or in Acrobat Exchange®. Exchange also
allows the user to conduct searches across many documents
through indexes created in Catalog®.
3. Creating CD-ROM Collections
There are two sets of instructions for creating a
collection of PDF documents for distribution on CD-ROM.
One, written by Ken Anderson of Adobe Systems, is called
"Authoring An Acrobat CD Product, Complete Project Guide."
It is available from the PurePDF website.
Compare his instructions to the following step-by-step
production notes written by Chris Lane of Computer
Multimedia Productions Corp. (813-531-7279), downloaded
6/22/98 from the PDFZone User Forum.
- Create all source PDF files remembering to use 8.3
filenames with no dashes (-) or other non-ISO characters
in them. In general it is also best to use only one case
(upper or lower... I prefer lower) because Acrobat in
Unix will try to match case, but most Unix CD-ROM drivers
change the case of the files to either all upper or all
lower case. This means that if you have a file called
BillyBob.PDF, and you have a PDF link from main.pdf to
BillyBob.PDF, inside Acrobat, this link will be case
sensitive as BillyBob.PDF. On the PC and Mac side, the OS
manages to ignore the case and call BILLYBOB.PDF or
whatever, but in Unix the OS wants to find BillyBob.PDF.
The bummer is that most Unix CD-ROM drivers are
written to assume ISO standard filenames, which means
8.3 and all uppercase. To add to the problem, some Unix
CD-ROM drivers will change the case from upper to lower
so that when you look at the filenames on the CD-ROM they
will look more "Unix" or lowercase (hey, I didn't write
the drivers so don't shoot the messenger). The result is
that in Unix, Acrobat cross-document links between files
on CD-ROM have a good chance of not working, particularly
if you use mixed-case filenames. If you use all upper
or all lower you at least have a chance of making it
work. There is also something called the "Rock Ridge"
extension which allows Unix to handle names longer than 8.3
and case-sensitive names from CD-ROM, but creating one of these
has its own set of pains in the ass. Also some newer Unix systems will do
long file names from CD as well. (NOTE from Tom
Thiersch <thiersch@env-sol.com> amending the step
one instructions: If by "ISO" you mean "ISO 9660", that
spec requires that all letters be UPPERCASE; so, despite
your personal preference for lowercase, you should keep
that in mind. Failure to strictly follow the ISO 9660
standard can cause your CD-ROM to be un-mountable in
certain situations (Netware seems to be notorious for
this). Don't forget that Acrobat cross-file links
embedded in your PDFs will also be case-sensitive in
those environments which care about it, so the file names
need to be all uppercase on the hard drive where they are
originally created and linked, not just on the CD-ROM
after lower-to-upper translation is done by your CD-ROM
mastering software. This requirement to create all
uppercase filenames can be particularly tricky to do in
certain versions of Windows, so be careful!)
- Place all the source PDF files in a Mac virtual
volume (you can create the virtual volume using Adaptec's
Toast which is what we also use to burn the CD master). A
side benefit of making a virtual volume is that you wind
up with an easy configuration management snapshot of what
you put on the CD-ROM, whereas grabbing the files off of
the hard disk tends to lead to situations where later
attempts to modify the CD image find that files have
been modified or deleted from the hard disk. The Toast
virtual volume becomes a file which you can place on a
Jaz or other rewritable media and then easily open later
to modify the CD image for a new burn.
- Create the Acrobat Catalog index of the files on the
Mac virtual volume. A side note here is that we have
found that Catalog runs faster on the PC, so we often
create a volume on the PC side (using a Jaz or other
rewritable disk) and copy the PDFs in the Mac virtual
volume to the PC volume. If you use a PC and Mac LAN
(such as PCMacLAN) you can index the virtual volume
across the LAN with some speed loss.
- Place all the Acrobat install files in the Mac
virtual volume in a sub-folder called Acrobat (or
whatever you like). You can copy all the installers from
the Adobe website which has reader installers for Windows
32-bit, Windows 16-bit, Mac, and various Unix flavors.
Note that you will probably want to use Acrobat Reader
with Search and not just the Acrobat Reader since this
will give your users search capabilities.
- You may want to optimize the Mac virtual volume using
Norton's Speed Disk or similar. Older versions of Toast
required this before burning the CD.
- Under Windows, use an installer creator package such
as InstallShield or even a shareware install creator (we
have used Install/Setup
http://ourworld.compuserve.com/homepages/kpherzog/) to
create groups/folders and then launch the Acrobat
installer. Remember to make the pathing to the Acrobat
installer match what will be on the CD (for example
\acrobat\win32\setup.exe). Place the resulting installer
file into the Mac virtual volume along with your other
files (we usually put it at root). Note that your
installation routine should either detect Windows95+ or
Windows3.1 or ask the user which they are running and
then run the appropriate Acrobat reader installer. You
can also make your installation routine optionally run
the Acrobat installer or check if Acrobat is installed
(this is tricky since you also need to check the version
currently installed). Note that under Windows you can
also play with the Acrobat abcpy.ini file which will
customize the Acrobat Preferences to your
specifications.
- Arrange the Mac virtual volume so that it is open and
you only see the files that the user needs. We normally
set this folder up to show only the main PDF and an
alias (Mac shortcut) to the Mac version of the Acrobat
installer.
- In the installation instructions for the Mac, tell the
user to double-click the Acrobat installer icon
only if they don't have Acrobat Reader with Search
version xxx or higher already installed on the machine.
Note that on the Mac you do not have to worry about
creating folders/groups or adding to the Start menu, so
you do not need to write special installation routines
here.
- In the installation instructions, direct Unix users
to the Unix install scripts and tell 'em good luck.... I
think Unix users are used to pain so this does not seem
to faze them.
- In the installation instructions, tell Windows users
to run the installation routine you created above.
- If you want an autorun CD, create an autorun.inf file
and place it in the root of the Mac virtual volume. Make
sure that you create this file on the PC side, since the
Mac uses a different end-of-line than the PC.
- Start up Toast and set it to create an ISO/HFS
hybrid. Tell Toast where the data files are (the Mac
virtual volume). Let it rip.
- Test like crazy on all platforms.
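Two of the steps above, the filename rules and the autorun.inf line endings, are easy to get wrong by hand, so a small pre-burn check can help. The following is a minimal sketch under stated assumptions: it follows the stricter all-uppercase reading from Tom Thiersch's note, it simplifies the ISO 9660 Level 1 rule to "1 to 8 characters, a dot, 1 to 3 characters, using only A-Z, 0-9 and underscore", and the helper names and the `setup.exe` command are hypothetical.

```python
import re

# Simplified ISO 9660 Level 1 pattern (assumption: both name and extension present).
ISO_8_3 = re.compile(r"[A-Z0-9_]{1,8}\.[A-Z0-9_]{1,3}")


def filename_problems(names):
    """Return {name: reason} for names likely to cause trouble on the CD-ROM."""
    problems = {}
    for name in names:
        if ISO_8_3.fullmatch(name):
            continue  # already a clean uppercase 8.3 name
        if ISO_8_3.fullmatch(name.upper()):
            problems[name] = "not all uppercase"
        else:
            problems[name] = "not 8.3 or has non-ISO characters"
    return problems


def autorun_inf_bytes(command):
    """Build autorun.inf content with PC-style CRLF line endings."""
    lines = ["[autorun]", f"open={command}"]
    return ("\r\n".join(lines) + "\r\n").encode("ascii")


print(filename_problems(["MAIN.PDF", "BillyBob.PDF", "my-file.pdf", "LONGFILENAME.PDF"]))
print(autorun_inf_bytes("setup.exe"))
```

When saving the autorun.inf bytes, open the output file in binary mode (`"wb"`) so the line endings are written exactly as built, whichever platform the script runs on. Remember that links inside the PDFs must use the same case as the final filenames, so run the check before the documents are linked, not just before burning.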
4. Archiving Old Manuscripts
If you assume that only crisp, clean text documents
can (or should) be archived using this technology, take a
look at the work being done by Octavo. This company
specializes in archiving old books and rare manuscripts in
their original form, preserving the look of the original
paper, text, and binding. But these documents are also
text-and-image PDFs, which means that they can be searched
by text. They use sophisticated cameras, and Acrobat for
compressing, storing, and distributing the finished work.
These are necessarily larger-than-usual files, but the
benefits are obvious. See a PDF sample (5 pages, 1170K) of
Shakespeare poems.
Another exciting example of archiving is the work done by
Direct Imagination. They have reproduced Owen Jones' The
Grammar of Ornament (originally published in 1856) as a
series of
PDF-rich CD-ROMs that contain 112 digitized plates from the
original color edition, 160 vector renderings of many of the
graphic designs and patterns, introductions and research
commentary. The finished work has been accepted into the
permanent collection of the Smithsonian Institution. Other
volumes published by Direct Imagination include: Studies
in Design and Art of Decorative Design by
Christopher Dresser, Le Costume Historique by
Racinet, and under development is Art Nouveau... Art
of M.P. Verneuil. Be sure to visit their website.