Digital preservation ensures that the digital information we
have today can still be accessed in the future. Before becoming a professional genealogist, I worked in the IT department of the National Archives in the Netherlands where I picked up
some knowledge about digital preservation that I hope will be useful to you.
Most people think that all you need to do is make back-ups, but there is a lot
more to it than that.
Four issues with digital preservation
There are four main issues with digital preservation:
- Files get corrupted when hardware fails. Think of crashed hard drives or
demagnetized floppy disks. Or your house burning down with your backups
inside.
- Files are stored on obsolete media. Who still has a 5 1/4" floppy
drive? Many new computers don't even come with a CD/DVD drive
anymore.
- Files are in obsolete file formats. For example, a WordStar 1.0 or
WordPerfect 4.2 document.
- Files get lost because of human behavior. For example, files on hard
drives that nobody bothered to copy before recycling the computer, or
emails and documents that get deleted when you lose access to your account
because you change jobs or providers. People also lose files because they
don't understand how their cloud storage solution (Dropbox, Google Drive)
works and don't realize that deleting a file from their hard drive may
also delete it from the cloud.
With paper information, you have to actively do something to
make it go away (throw it out). If you don't do anything, it will get preserved
in your attic or wherever you saw it last.
With digital information, you have to actively make sure
that it gets preserved, otherwise it will be lost.
Back-ups only help so much...
Back-ups can help combat issues number 1, 2 and
4 (corrupt files, obsolete media and human error). A good backup strategy is the 3-2-1
rule: have 3 copies of each file, on 2 different types of media, 1 of which is
stored in a different geographical location. By actively making backups of
things you want to preserve, you ensure that this information is not lost
accidentally.
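To make the 3-2-1 rule concrete, here is a minimal Python sketch that checks whether a set of copies meets it. The names `Copy` and `satisfies_3_2_1`, and the media and location labels, are made up for illustration; this is one way to express the rule, not a tool I am recommending.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a file (hypothetical model for this example)."""
    media: str     # e.g. "internal disk", "external disk", "cloud"
    location: str  # e.g. "home", "office"


def satisfies_3_2_1(copies, primary_location):
    """Return True if the copies meet the 3-2-1 rule."""
    enough_copies = len(copies) >= 3                    # 3 copies
    enough_media = len({c.media for c in copies}) >= 2  # 2 types of media
    # 1 copy stored somewhere other than the primary location
    offsite = any(c.location != primary_location for c in copies)
    return enough_copies and enough_media and offsite


copies = [
    Copy("internal disk", "home"),
    Copy("external disk", "home"),
    Copy("cloud", "data center"),
]
print(satisfies_3_2_1(copies, primary_location="home"))
```

Three copies on two external drives in the same house would fail the last check, which is exactly the "house burning with your backups inside" scenario.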
A good backup strategy will give you access to your files.
But whether they are in a format that you can use is another matter.
Different types of file formats
The way a file is stored on the computer is called a file
format. A file format tells the software how to open the file. Examples of
file formats are .jpg (JPEG image file), .docx (Microsoft Word) and
.pdf (Portable Document Format).
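The extension is only a label. Many formats also start with a few fixed bytes (a signature) that software can use to recognize them. The sketch below, in Python, checks a handful of well-known signatures. The function name `sniff_format` and the short table are my own illustration and far from complete.

```python
# Leading bytes that identify a few common formats (small hand-picked sample).
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG image"),
    (b"%PDF-", "PDF document"),
    (b"II*\x00", "TIFF image (little-endian)"),
    (b"MM\x00*", "TIFF image (big-endian)"),
    # .docx files are ZIP containers, so they share the ZIP signature.
    (b"PK\x03\x04", "ZIP container (for example .docx)"),
]


def sniff_format(data):
    """Guess the format from the first bytes of a file's content."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return "unknown"


print(sniff_format(b"%PDF-1.7\n..."))
```

To try it on a real file, read the first few bytes with `open(path, "rb").read(8)` and pass them in. A file that comes back as "unknown" is not necessarily unreadable; it just isn't in this small table.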
Digital preservation problems start when you have a file
that is stored in a file format that you cannot open anymore. You may have a
WordPerfect 4.2 file, but you may not have WordPerfect anymore. Perhaps you can
convert the file, but information may be lost along the way. Or you may use an
old computer (or emulate one) and run your old software.
The type of file format determines how hard it will be to
open in the future. Some file formats are proprietary and
require the vendor's software to open them. A good example is .FTM, the
file format that Family Tree Maker uses. You can't open an FTM file in another
program.
Other file formats use open standards. An open
standard means that the specification of the standard is published and
everybody is allowed to use it to create their own programs. GEDCOM is an
example of an open standard: dozens of tools can read a GEDCOM file, which is
why it is still being used to exchange information between different genealogy
programs.
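Because the GEDCOM specification is published, anybody can write a reader for it. A GEDCOM file is plain text where each line holds a level number, an optional cross-reference ID between @ signs, a tag, and an optional value. The Python sketch below splits one such line into those parts. The function name `parse_gedcom_line` is my own, and the code is a simplification that skips validation and continuation lines; it is not a full GEDCOM parser.

```python
def parse_gedcom_line(line):
    """Split one GEDCOM line into (level, xref, tag, value).

    Simplified: assumes a well-formed line and does no validation.
    """
    parts = line.strip().split(" ", 2)
    level = int(parts[0])
    if len(parts) > 1 and parts[1].startswith("@") and parts[1].endswith("@"):
        # Record line with a cross-reference ID, e.g. "0 @I1@ INDI"
        xref = parts[1]
        rest = parts[2] if len(parts) > 2 else ""
        tag, _, value = rest.partition(" ")
    else:
        # Ordinary line, e.g. "1 NAME John /Doe/"
        xref = None
        tag = parts[1]
        value = parts[2] if len(parts) > 2 else ""
    return level, xref, tag, value


for sample in ["0 @I1@ INDI", "1 NAME John /Doe/", "2 DATE 12 MAR 1850"]:
    print(parse_gedcom_line(sample))
```

The sample person and date are invented. The point is that a few lines of code are enough to start reading the format, which is not possible with a proprietary format whose layout is not published.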
Most software offers a way to save files in a different file
format, which often includes open standards, like the GEDCOM export function in
genealogical software. You have that option as long as you can use the
software. But you may not be able to run that software after your next
operating system upgrade, if your license expires or if the vendor goes out of
business.
The problem with programs like Evernote
Cloud-based solutions (like Evernote) give people a false
sense of security because they think everything is "backed up" in the
cloud. That is not really true though: Evernote synchronizes, which
is not the same as making a backup. If you delete a note in Evernote, it will be gone
from all your devices, because the deletion is synchronized too. With a
real backup, you would still be able to restore the deleted note.
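The difference between synchronizing and backing up can be shown with a toy model in Python. This is only an illustration using dictionaries as stand-ins for storage; the function names are hypothetical and it does not describe how Evernote or any real service is implemented.

```python
import copy


def synchronize(source, replica):
    """Make the replica mirror the source exactly, deletions included."""
    replica.clear()
    replica.update(source)


def take_backup(source):
    """Return an independent snapshot that later changes cannot touch."""
    return copy.deepcopy(source)


notes = {"census.txt": "notes on the 1850 census"}
cloud = {}

synchronize(notes, cloud)    # the cloud copy now matches the device
backup = take_backup(notes)  # a separate snapshot, frozen at this moment

del notes["census.txt"]      # accidental deletion on the device
synchronize(notes, cloud)    # the deletion is carried over to the cloud

print("census.txt" in cloud)   # False: gone everywhere that is synchronized
print("census.txt" in backup)  # True: still restorable from the backup
```

The synchronized copy faithfully repeats the mistake, while the snapshot keeps the state from before it.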
Another problem with programs like Evernote is that you are
totally dependent on the vendor. I have no idea if I will be able to access my
Evernote files if Evernote goes out of business ten years from now and the software stops
working. I use the web version as well as the Windows version. If Evernote goes
out of business, I won't be able to use the web version but will probably still
have access to the Windows version, at least for a while. But the information
is all stored in Evernote-specific .EXB files on my hard drive. As far as I
know, no program other than Evernote can get my notes out, short of
copying and pasting them.
[BTW, I'm using Evernote as an example because it's very popular among genealogists, but most software, especially cloud-based software, has the same issues]
My advice: combine solid backups with migration to open file formats
For information that is important to you, that you want
generations from now to be able to open, my advice is to make sure to save it
in a file format that has an open standard. For example, I save scans of my
family photos as TIFF, not PSD (the Photoshop file format). A hundred years from now, any programmer can
take the TIFF scans and create a viewer for the amazing devices that they will
have then. But Photoshop might not be around anymore.
This does not mean I don't use other file formats as well. I use
Family Tree Maker and RootsMagic for my research. That is no problem since I am
aware of the issues and will periodically export a GEDCOM file and put that in my
backup locations using the 3-2-1 rule. I also keep the FTM files, because conversion often loses some of the information. But if for some reason I don't have FTM anymore, or the old program doesn't work on my shiny new operating system, I will still be able to access at least a large part of the information.
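Putting an export in several backup locations is easy to automate. The Python sketch below copies one file to a list of folders and compares SHA-256 checksums afterwards, so a corrupted copy is noticed right away. The function names and the example paths are hypothetical; adjust them to wherever your own drives and folders live.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file as a hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_to_backups(source, backup_dirs):
    """Copy source into every backup folder and verify each copy."""
    source = Path(source)
    expected = sha256_of(source)
    written = []
    for directory in backup_dirs:
        target_dir = Path(directory)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / source.name
        shutil.copy2(source, target)  # copies content and timestamps
        if sha256_of(target) != expected:
            raise OSError(f"Checksum mismatch for {target}")
        written.append(target)
    return written
```

A call could look like `copy_to_backups("family.ged", ["E:/backup", "D:/CloudFolder/backup"])`, where both paths are placeholders. Keep in mind that a folder that is synchronized to the cloud only counts as a backup if deleting the original does not delete that copy too.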
Every time I read a blog extolling the benefits of cloud storage, throwing out ALL your paper, I cringe. These are adults. Surely they have lost digital information one way or another. I will never throw out my paper books or my paper documents.
Interesting post. I wanted to tell you that I've included it in my NoteWorthy Reads for this week: http://jahcmft.blogspot.com/2015/03/noteworthy-reads-5.html
Superb advice, Yvette. And as for Evernote... I use it for convenience in some situations, but I certainly do not think of it as a long-term backup.
I have a love/hate relationship with cloud storage.
Is it backed up someplace else?? Always asking that question.
@Carol I'm sure they store it redundantly, so it's safe if one of their hard drives crashes, but that's not the same as a back-up. As I explained in the article, typical cloud solutions synchronize your files so will delete all copies if you delete one with no option to restore (or only a limited time to undo).
Excellent post - I love this Worldwide Genealogy group of writers - always learning something (or getting a gentle reminder). This is helpful plus a good reminder about how I save and use my digital work. Thanks so much.
Totally agree. I am a cheerleader for Evernote but as with any tool, you need to be aware of its strengths and weaknesses. I don't use Evernote as a storage tool, I use it as an access tool. It allows me to access and work on my genealogy and other research on the go (via my iPad and the web version). But I keep paper copies of all critical documentation in addition to copies of all files on my hard drive (which is backed up to a couple of other cloud storage services and DVD).