Thursday, 5 March 2015

Digital preservation, or why I worry about Evernote

Digital preservation ensures that the digital information we have today can still be accessed in the future. Before becoming a professional genealogist, I worked in the IT department of the National Archives in the Netherlands where I picked up some knowledge about digital preservation that I hope will be useful to you. Most people think that all you need to do is make back-ups, but there is a lot more to it than that.

Four issues with digital preservation

There are four main issues with digital preservation:
  1. Files get corrupted when hardware fails. Think crashed hard drives or demagnetized floppy disks. Or your house burning with your backups inside. 
  2. Files are stored on obsolete media. Who still has a 5 1/4" floppy drive? Many new computers don't even come with a CD/DVD player anymore. 
  3. Files are in obsolete file formats. For example, a WordStar 1.0 or WordPerfect 4.2 document. 
  4. Files get lost because of human behavior. For example, files on hard drives that nobody bothered to copy before recycling the computer, or emails and documents that get deleted when you lose access to your account because you change jobs or providers. People also lose files because they don't understand how their cloud storage solution (Dropbox, Google Drive) works and don't realize that deleting a file from their hard drive may also delete it from the cloud. 
With paper information, you have to actively do something to make it go away (throw it out). If you don't do anything, it will get preserved in your attic or wherever you saw it last. 
With digital information, you have to actively make sure that it gets preserved, otherwise it will be lost. 



Back-ups only help so much...

Back-ups can help combat issues number 1, 2 and 4 (corrupt files, obsolete media and human error). A good backup strategy is the 3-2-1 rule: have 3 copies of each file, on 2 different types of media, 1 of which is stored in a different geographical location. By actively making backups of things you want to preserve, you ensure that this information is not lost accidentally. 
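For readers who like to see things spelled out, the 3-2-1 rule can be written down as a small check in Python. This is only a sketch: the `Copy` record, the `satisfies_3_2_1` function, and the media and location labels are invented for this example and do not come from any backup tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of a file (hypothetical record for this sketch)."""
    media: str      # e.g. "internal disk", "external disk", "cloud"
    location: str   # e.g. "home", "office", "data center"


def satisfies_3_2_1(copies):
    """Return True if the copies meet the 3-2-1 rule."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    # "1 in a different location" means the copies are not all in one place.
    offsite = len({c.location for c in copies}) >= 2
    return enough_copies and enough_media and offsite


family_photos = [
    Copy(media="internal disk", location="home"),
    Copy(media="external disk", location="home"),
    Copy(media="cloud", location="data center"),
]
print(satisfies_3_2_1(family_photos))  # True
```

Three copies on one external drive in your house would fail this check on both the media and the location count.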

A good backup strategy will give you access to your files. But whether they are in a format that you can use is another matter.

Different types of file formats

The way a file is stored on the computer is called a file format. A file format tells the software how to open the file. Examples of file formats are .jpg (JPEG image file), .docx (Microsoft Word) and .pdf (Portable Document Format). 
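The extension is only a label. Many formats also start with a recognizable sequence of bytes, which is how software can identify a file even after it has been renamed. Here is a minimal sketch of that idea in Python. The `guess_format` and `guess_file_format` helpers are made up for this example, and the table lists only a handful of well-known signatures, nowhere near all of them.

```python
# Well-known leading bytes for a few formats (not a complete list).
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP container (used by .docx, among others)"),
    (b"II*\x00", "TIFF image (little-endian)"),
    (b"MM\x00*", "TIFF image (big-endian)"),
]


def guess_format(data):
    """Guess a format from the first bytes of a file (hypothetical helper)."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return "unknown"


def guess_file_format(path):
    """Read the first few bytes of a file on disk and guess its format."""
    with open(path, "rb") as handle:
        return guess_format(handle.read(8))


print(guess_format(b"%PDF-1.4 rest of the file"))  # PDF document
```

Knowing what a file is does not mean you can open it, of course. That still takes software that understands the format.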

Digital preservation problems start when you have a file that is stored in a file format that you cannot open anymore. You may have a WordPerfect 4.2 file, but you may not have WordPerfect anymore. Perhaps you can convert the file, but information may be lost along the way. Or you may use an old computer (or emulate one) and run your old software. 

The type of file format determines how hard it will be to open in the future. Some file formats are proprietary and require the vendor's software to open them. A good example is .FTM, the file format that Family Tree Maker uses: you can't open an FTM file in another program.

Other file formats use open standards. An open standard means that the specification of the standard is published and everybody is allowed to use it to create their own programs. GEDCOM is an example of an open standard: dozens of tools can read a GEDCOM file, which is why it is still being used to exchange information between different genealogy programs.
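To give an idea of why a published, plain-text format is so easy for others to support: each line in a GEDCOM file consists of a level number, an optional cross-reference ID between @ signs, a tag, and an optional value. The sketch below reads such lines in Python. It is deliberately simplified (it ignores character encodings and the continuation lines the specification describes), the `parse_gedcom_line` function is my own illustration rather than part of any genealogy program, and the sample person is invented.

```python
SAMPLE = """\
0 HEAD
0 @I1@ INDI
1 NAME Jan /Jansen/
1 BIRT
2 DATE 5 MAR 1815
0 TRLR
"""


def parse_gedcom_line(line):
    """Split one GEDCOM line into (level, xref, tag, value)."""
    tokens = line.strip().split(" ")
    level = int(tokens.pop(0))
    # A record's own ID, such as @I1@, comes before the tag when present.
    xref = tokens.pop(0) if tokens[0].startswith("@") else None
    tag = tokens.pop(0)
    value = " ".join(tokens)
    return level, xref, tag, value


for raw_line in SAMPLE.splitlines():
    level, xref, tag, value = parse_gedcom_line(raw_line)
    if tag == "NAME":
        print(value)  # Jan /Jansen/
```

A programmer who has never seen your genealogy software could write something like this from the specification alone, which is exactly the point of an open standard.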

Most software offers a way to save files in a different file format, which often includes open standards, like the GEDCOM export function in genealogical software. You have that option as long as you can use the software. But you may not be able to run that software after your next operating system upgrade, if your license expires or if the vendor goes out of business.

The problem with programs like Evernote

Cloud-based solutions (like Evernote) give people a false sense of security because they think everything is "backed up" in the cloud. That is not really true, though: Evernote synchronizes, which is not the same as a backup. If you delete a note in Evernote, it will be gone from all your devices, because the deletion is synchronized too. With a real backup, you would be able to restore the deleted note. 
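The difference between synchronizing and backing up can be shown with a toy model in Python. This is not how Evernote or any real service works internally. The dictionaries simply stand in for devices, and the `synchronize` function is a made-up stand-in for a sync service.

```python
laptop = {"note-1": "Birth record of Jan", "note-2": "Census 1840"}

# A backup is a separate copy, frozen at the moment it was made.
backup = dict(laptop)


def synchronize(source):
    """Return an exact mirror of the source, deletions included."""
    return dict(source)


phone = synchronize(laptop)

# Delete a note on the laptop, then let the sync service do its job.
del laptop["note-1"]
phone = synchronize(laptop)

print("note-1" in phone)   # False: the deletion was synchronized
print("note-1" in backup)  # True: the backup still has the note

# Only the backup lets you bring the note back.
laptop["note-1"] = backup["note-1"]
```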

Another problem with programs like Evernote is that you are totally dependent on the vendor. I have no idea if I will be able to access my Evernote files if Evernote goes out of business ten years from now and the software stops working. I use the web version as well as the Windows version. If Evernote goes out of business, I won't be able to use the web version but will probably still have access to the Windows version, at least for a while. But the information is all stored in Evernote-specific .EXB files on my hard drive. As far as I know, no program other than Evernote can read those files, so the only other way to get my stuff out is to copy and paste it.

[BTW, I'm using Evernote as an example because it's very popular among genealogists, but most software, especially cloud-based software, has the same issues]

My advice: combine solid backups with migration to open file formats

For information that is important to you, that you want generations from now to be able to open, my advice is to make sure to save it in a file format that has an open standard. For example, I save scans of my family photos as TIFF, not PSD (the Photoshop file format). A hundred years from now, any programmer can take the TIFF scans and create a viewer for the amazing devices that they will have then. But Photoshop might not be around anymore. 

This does not mean I don't use other files as well. I use Family Tree Maker and RootsMagic for my research. That is no problem since I am aware of the issues and will periodically create a GEDCOM and put that in my backup locations using the 3-2-1 rule. I also keep the FTM files, because conversion often loses some of the information. But if for some reason I don't have FTM anymore, or the old program doesn't work on my shiny new operating system, I will still be able to access at least a large part of the information.

More information

If you want to learn more about how archives professionals deal with digital preservation, check out the website of the Open Preservation Foundation. For a more lighthearted introduction, view the videos by Team Digital Preservation on YouTube.

7 comments:

  1. Every time I read a blog extolling the benefits of cloud storage, throwing out ALL your paper, I cringe. These are adults. Surely they have lost digital information one way or another. I will never throw out my paper books or my paper documents.

  2. Interesting post. I wanted to tell you that I've included it in my NoteWorthy Reads for this week: http://jahcmft.blogspot.com/2015/03/noteworthy-reads-5.html

  3. Superb advice, Yvette. And as for Evernote... I use it for convenience in some situations, but I certainly do not think of it as a long-term backup.

  4. I have a love/hate relationship with cloud storage.

    Is it backed up someplace else?? Always asking that question.

  5. @Carol I'm sure they store it redundantly, so it's safe if one of their hard drives crashes, but that's not the same as a back-up. As I explained in the article, typical cloud solutions synchronize your files so will delete all copies if you delete one with no option to restore (or only a limited time to undo).

  6. Excellent post - I love this Worldwide Genealogy group of writers - always learning something (or getting a gentle reminder). This is helpful plus a good reminder about how I save and use my digital work. Thanks so much.

  7. Totally agree. I am a cheerleader for Evernote but as with any tool, you need to be aware of its strengths and weaknesses. I don't use Evernote as a storage tool, I use it as an access tool. It allows me to access and work on my genealogy and other research on the go (via my iPad and the web version). But I keep paper copies of all critical documentation in addition to copies of all files on my hard drive (which is backed up to a couple of other cloud storage services and DVD).

