Tuesday, 29 April 2014

To do lists in Evernote

I have enjoyed reading how others use Evernote to help with their family history research.

Screenshot from my desktop version
My main collecting point for my family history is my tree on ancestry.com, synced with a tree on my desktop computer using Family Tree Maker and also on my iPad using the ancestry.com app.

For research projects, though, I use Evernote. In particular, if there is a book, a manuscript or a database I want to look up and explore, I add it to Evernote with the questions I want answered and tag it with the relevant surname and the repository plus to do.

The same to do list on my phone - I can click on a note to bring up the content
For example, if there is a manuscript in the State Library of Victoria, I will tag it with SLV to do. When I get down to the SLV, all my to dos can be grabbed together by retrieving the notes with that tag. Once I have completed a to do, I might collect my findings against that note and update the tags, removing the to do tag and replacing it with the repository, i.e. the tag SLV to do is replaced with SLV.

I use this for interstate repositories too, for example the State Library of South Australia which I get to visit less than once a year. It is really useful to have collected in advance all the things I want to achieve there as I think of them.

Evernote is synchronised across my desktop computer, my iPad and my Android phone. Thus I have my to do list with me all the time.

See also Diane Hewson's post in February on collaborating with Evernote: http://worldwidegenealogy.blogspot.com.au/2014/02/evernote-collaboration-sharing-of-tips.html

Saturday, 26 April 2014

Passing Things Down

This week, I received two big boxes from my mother.  They contained belongings of my father's maternal grandmother, Sarah Adeline "Addie" O'Hara Hall.   Our family has very few family heirlooms and now I am keeper of these.  I have a few other things from other branches that may appear in a future post.  One of my plans for this year is to make a catalog of the family treasures we do have so that my children will know what they are and to whom they belonged.
Since my grandmother saved these bowls to pass down to my dad, they were important to her and probably special to her mother as well.  Having them in my possession has made me wonder about Addie this week and so this month's post is about her family.

Addie's floral china bowl

Addie's glass bowl on a stand

Addie was born in Kansas City, KS to James O'Hara and Emaline Jane Reynolds in 1879.  She and I share a birthday -- 86 years apart.  She died in 1970 when I was four years old, but I never met her.  She and her husband, Arthur Hall, had six children.  I never met any of them except for my grandmother, Treasa.  I feel a loss not having known them.  I knew/know, or at least met, most of the siblings of my other three grandparents.  
Arthur and Addie Hall
A few generations of my grandmother's family can be easily traced through census records that are available online.  However, my mother did most of the work before online census searches were available.
On the 1910 census, Arthur Hall is listed as a mechanic.  Besides himself and Addie, his household consisted of his first three children, son Postelle, daughter Vera, and baby son Lawell who had not yet been named.  Also living with them were Arthur's father, Milton Hall, and Addie's father, James O'Hara -- the Mark Twain-looking fellow in the outdoor family picture below.  Sometimes Addie's brothers, Tom and Harvey, lived with the family as well.  In other censuses, Arthur is listed as a farmer.  Arthur and Addie's children were born in Oklahoma, but the family also lived in Florida and Georgia.
The Hall family c1915
Someone wrote 1916 on this photo, but I'm pretty sure it was taken in 1917 because my grandma is the baby in her uncle's arms and she was born in the middle of 1916.

The family picture has Howie, Florida written on it.  My grandmother was born in Oklahoma and the family is on the census, owning a farm in Leesburg, Florida in 1920.  So they moved shortly after she was born.  At some point, they bought a Florida citrus farm, but suffered a freeze the first year and lost the farm.  Howey-in-the-Hills, Florida was across Lake Harris from Leesburg.  It was incorporated in the mid-1920s when its founder, William John Howey, was looking for buyers for his citrus land.  I wonder if Arthur and Addie purchased their ill-fated property from him.  By 1930, they had moved to Georgia.  Arthur never recovered from the heartbreaking loss of his farm.  He became ill and died in 1934 at 61 years of age.  My grandmother was the only child left at home by this time.

Addie, pregnant with my grandmother
Arthur a few years before his death
My grandmother, Treasa

Addie's father, James O'Hara, was a house painter.  Some census records say he was born in New York.  Other sources say he was born in Ireland and came to the USA as a young child.
Addie's mother, Emaline Jane Reynolds, was born in Iowa and spent most of her life in Kansas.  Beers and Proper are other surnames in Emaline's line.  We know James O'Hara's father was Patrick O'Hara, but we don't know any other surnames from this line.

Friday, 25 April 2014

"Pocahontas" Alias Matoaka and Her Descendants

Since I'm from the oldest surviving English colony in the United States, I thought I'd focus this month on one of the most famous Native Americans from Virginia: Pocahontas. She had one son but her descendants now number in the tens of thousands. There are several ancestor associations, which people may join if they can prove their descent from Pocahontas.

Rev. James Mitchell is my 4X great grand uncle

My relationship to Pocahontas is very tortuous and certainly wouldn't gain me admission to any Pocahontas ancestor association were I to want to join. I find it interesting nonetheless as every elementary school history book I was required to use included a chapter on her. I am from Virginia after all!

Matoaka "Pocahontas" also known as Rebecca Rolfe, engraving by 
Simon van de Passe; courtesy of Wikipedia

When I was researching her descendants, I discovered a wonderful old book on Google Play, Pocahontas and Her Descendants, written by Wyndham Robertson and published in 1887.

The book included seven generations of Pocahontas descendants, including Wyndham Robertson (1803-1888) himself. Robertson included a delightful declaration of love by John Rolfe:

"Pocahontas…to whom my hartie and best thoughts are, and have long bin so entangled and inthralled in so intricate a laborinth, that I was even awearied to unwinde myselfe thereout." 

Wyndham Robertson, painting by Louis Mathieu Didier Guillaume; 
courtesy of Wikipedia

Robertson was the acting governor of Virginia from 1836 to 1837. As senior member of the Council of State, he was also Lt Governor when Governor Littleton Waller Tazewell resigned the office.  At the time the legislature elected the governor and it was controlled by the Whigs so Robertson was not returned to office in 1837. After his term was over he was elected to the Virginia House of Delegates several times and was in that office during Virginia's struggles over secession from the Union. Robertson was a staunch Unionist and tried to prevent secession. 

When Abraham Lincoln made his call for troops on April 15, 1861, Wyndham Robertson became "zealously active in all measures in defense of his state." After the Civil War he served on the Committee of Nine, which sought Virginia's readmission into the Union. After long and faithful service to Virginia, he retired and wrote his genealogy book. He died on February 11, 1888, and is buried at Cobbs, Virginia.

Of his service to his state during the Civil War, he later said:

"And now, after twenty years of experience of yet unripened results, I have no regrets, nor repent a single act of my State, or myself, in these unhappy affairs -- welcoming the end of slavery, but still believing it would have been reached without the horrors of war."

And this is yet another reason I love old books so much -- not only are the subjects of the books fascinating, so are their authors.

Genealogy books are great references but should be considered just that: a reference. They are not sources. Even experienced researchers make mistakes. And one of those mistakes put me on a trail that ended at Pocahontas. For several months, I believed my sister-in-law descended from the Bermuda Tucker family, based on a Tucker genealogy book that claimed her 6 times great grandfather's father was a Henry Tucker of Bermuda. But DNA has proved that the author confused several Henry Tuckers who lived in Southampton County, Virginia, in the late 1600s. The Bermuda Tuckers married into the Randolph family, one of the first families of Virginia, which does descend from Pocahontas, but my sister-in-law's Tucker line still dead-ends with Benjamin Tucker (1714-1799).

Tuesday, 22 April 2014

Criteria for Assessing the Quality of Genealogy Websites and Online Data

Academic researchers, commercial vendors and volunteer interest groups have produced a vast array of online resources useful to family historians and genealogists.  The quality of the websites and data contained in them varies hugely.  For this discussion, I will focus on websites that offer access to digital copies of original records.

Quality has nothing to do with the total number of records, or the number of collections or data sets.  Quality is unrelated to the cost of a website subscription or motivation of the provider.

Without documentation of all the processing of records and the information in them, a researcher cannot assess the reliability of records. We can’t change the imperfect state that the original records come to us in. Archivists work hard to preserve both the records themselves and the context of their creation and use, but online presentation is often performed by other parties. Digitisation, indexing, and search are just a few of the processes that happen before an online version of the record is presented. Presentation can have a profound influence on how records are perceived and the conclusions drawn from them. Consequently, transparency is an ethical obligation.

What are the most important website features? How can they be assessed?

The following, in order of importance, are essential:
  1. Catalogue
  2. Transcript quality
  3. Search facilities
  4. Browsing facilities
  5. Record quality
Other features also matter, and are a bonus if included.  Examples include analytical tools, user data (e.g. family trees, imported sources, research notes etc.), collaborative tools and social networks. But for now, I will discuss the basic five points above.

Collections with different histories or characteristics should be assessed separately. Only the catalogue can be assessed across a whole website.

Catalogue First

Yes, I really do mean that the catalogue is more important than anything else.

Genealogists use archival material, whether in the form of original records or some kind of derivative.  Genealogy websites are really a digital archive of such materials, so a genealogy website’s catalogue should share many of the features of an archival catalogue.

In her blog post The Value of Archival Description, Considered, archivist Maureen Callaghan recognizes researchers’ needs:
“getting to understand who created records, why they were created, and what they provide evidence of – really gets to the nature of research. These are the questions that historians and journalists and lawyers and all of the communities that use our collections ask – they don’t just see artifacts, they see evidence that can help them make a principled argument about what happened in the past. They want to know about reliability, authenticity, chain of custody, gaps, absences and silences.”

So, a catalogue is more than just a list of collections. Such a list might be the starting point for creating a catalogue, but falls well short of the sophisticated database that comprises an archival catalogue. It contains information about the collections, so serves a quite different purpose to search and browsing facilities.

A good catalogue answers questions about the website’s collections with no fuss:
  1. Is the catalogue complete, including collections not yet digitised and indexed, with a timescale of expected online availability? This information allows the researcher to make an informed decision between using the database and seeking the records elsewhere.
  2. What record collections does it contain? You want to know that relevant records are included before paying a subscription or spending precious time searching for records, don’t you?
  3. Where did the collections come from? Typically records come from originals in an archive, or a publication. The barest minimum information for archival material is the archive and the archive reference, and for published information, the bibliographic reference. That allows the researcher to check the archive’s or bibliographic catalogues.
  4. How do the collections relate to one another? Logical groupings of record sets by record type reflect the original function of the records, whilst groupings by creator reflect the history or provenance of the records. Both are important for understanding how the records can be used. Were several types of record created by a particular process? For example, collection of taxes involved assessment of liability, records of payments, and penalties for late or non-payment.
  5. What is the structure of the data set?  How is the record set arranged? Is it by date, person or something else?
  6. Is each collection or record set complete?
  7. What is the extent of each collection and record set?  How many sub-sets, how many records in each?
  8. Does the catalogue entry describe the records? Is a brief history of the original records’ creation and provenance included? Are the digital records images of the originals, or derived from a microfilm or transcript?
  9. What information do the records typically contain?
  10. Is a scholarly work on the record type referenced, or a critique on the strengths and weaknesses of the records included?
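
As a rough illustration of the questions above, a catalogue entry can be thought of as a structured record whose gaps are easy to spot. The sketch below is only that: the field names, the example collection and the `missing_details` helper are my own assumptions for illustration, not any website's actual catalogue format.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue entry; every field name here is illustrative."""
    title: str
    source_archive: Optional[str] = None      # where the originals are held
    archive_reference: Optional[str] = None   # the archive's own reference
    arrangement: Optional[str] = None         # e.g. "by date", "by person"
    record_count: Optional[int] = None        # extent of the record set
    derived_from: Optional[str] = None        # "original", "microfilm", "transcript"
    typical_contents: Optional[str] = None    # what a record usually tells you


def missing_details(entry: CatalogueEntry) -> List[str]:
    """List the fields the catalogue entry leaves unanswered."""
    return [f.name for f in fields(entry) if getattr(entry, f.name) is None]


# An invented entry that answers only some of the questions.
entry = CatalogueEntry(
    title="Parish registers (hypothetical example)",
    source_archive="Example Record Office",
    record_count=1200,
)
gaps = missing_details(entry)
print(gaps)
```

A reviewer could fill in one such entry per collection and use the list of gaps as the basis for feedback to the supplier.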

Transcript Quality

Transcription transforms manuscript and typescript documents into computer readable text, essential for creating searchable records. In evaluating the quality of computerized records consider if the website documents the following:
  1. The completeness of the transcript.  A complete transcript captures the most information so is far more useful than an abstract or an index. 
  2. Accuracy of transcription is influenced by how it was produced.  Optical character recognition (OCR) is commonly used for typescript.  Human data entry of manuscript or handwritten material depends on palaeographic and keyboard skills.  Typically, OCR and unskilled data entry yield less accurate transcripts.
  3. Checking procedures should detect obvious gobbledygook, and common OCR and data entry errors.  Double data entry produces a consensus interpretation, but may not avoid common reading errors.
  4. Have error rates been assessed?
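
To make the idea of double data entry concrete, here is a minimal sketch of how two independent keyings of the same record might be compared field by field. The function names and the sample record are invented for illustration, and a real indexing project would have its own rules for resolving disputes.

```python
from typing import Dict, Iterable, Tuple

Record = Dict[str, str]


def compare_entries(first: Record, second: Record):
    """Split the fields of two keyings into agreed and disputed values."""
    agreed, disputed = {}, {}
    for field in sorted(set(first) | set(second)):
        a = first.get(field, "").strip()
        b = second.get(field, "").strip()
        if a == b:
            agreed[field] = a
        else:
            disputed[field] = (a, b)  # needs a third look at the image
    return agreed, disputed


def disagreement_rate(pairs: Iterable[Tuple[Record, Record]]) -> float:
    """Share of fields on which the two keyers disagreed."""
    total = disputed_count = 0
    for first, second in pairs:
        agreed, disputed = compare_entries(first, second)
        total += len(agreed) + len(disputed)
        disputed_count += len(disputed)
    return disputed_count / total if total else 0.0


# Invented example: the two keyers read the surname differently.
keyer_a = {"forename": "Wm", "surname": "Smyth", "year": "1841"}
keyer_b = {"forename": "Wm", "surname": "Smith", "year": "1841"}

agreed, disputed = compare_entries(keyer_a, keyer_b)
rate = disagreement_rate([(keyer_a, keyer_b)])
print(disputed, rate)
```

Note that the disagreement rate is only a lower bound on the error rate: if both keyers make the same misreading, the field is counted as agreed.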

Search Facilities

Good search rests on an accurate transcript, not algorithms or user added ‘corrections’.  Repeatable search, essential for confidence in the validity of results, requires a complete data set and stable search methods.  Search is not a simple operation, so inexperienced users need coaching and encouragement, not dumbed-down, limited functionality.  Consider the website’s documentation and functionality of the following:
  1. Targeted search on individual data sets and collections as default. Choosing which collection or collections to search first is much more efficient than filtering out irrelevant collections.
  2. Search on all data items in the record.  This requires a complete transcript.
  3. Full text search, the ability to search everywhere in the record, also requires a complete transcript.
  4. Complex search, the ability to specify ‘AND’, ‘OR’ and other operators.
  5. Name matching algorithm choice.  Examples include Soundex and Metaphone, which perform phonetic matching for English-language names, and Daitch-Mokotoff, which is adapted for Slavic and German spellings of Jewish names.
  6. Date ranges. Can start and end dates be specified, or a central date with a margin of error?
  7. Are place searches restricted to place names? Is a proximity search based on distance included?
  8. Wildcards, replacement characters in the search term that stand in for unknown possibilities e.g. Sm*th returns Smith, Smyth.
  9. Separation of transcribed values and interpreted values with options to search on either or a combination. For example, the abbreviation ‘Wm’ or Latinised ‘Gulielmus’ can be interpreted as William. Standardised interpretations are known to librarians and archivists as authority control (http://en.wikipedia.org/wiki/Authority_control). User added ‘corrections’ are another kind of interpreted value.
  10. Filtering of search results.
  11. Optional ‘sticky’ settings and well-chosen default settings.
  12. Result presentation.  Is it simple and clear, and does it contain the information important to you?  Does it include the search terms used?
  13. Result sorting on data fields chosen by the user. What does ‘relevance’ mean?
  14. Result export in a variety of formats, ready for use with software tools of your choice.
  15. Logged searches that document research activity.
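
Two of the search features listed above, wildcards and phonetic name matching, behave quite differently, and a small sketch may help show why the choice matters. The Soundex function below follows the commonly described American Soundex rules as I understand them; the helper names and the list of surnames are assumptions for illustration, and real websites may implement matching differently.

```python
from fnmatch import fnmatchcase
from typing import List

# Letter-to-digit table for Soundex; vowels, H, W and Y have no code.
_CODES = {}
for _digit, _letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                         "4": "L", "5": "MN", "6": "R"}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code for a name."""
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


def wildcard_search(pattern: str, names: List[str]) -> List[str]:
    """Case-insensitive match where * stands for any run of characters."""
    wanted = pattern.lower()
    return [n for n in names if fnmatchcase(n.lower(), wanted)]


def soundex_search(query: str, names: List[str]) -> List[str]:
    """Return the names that share a Soundex code with the query."""
    target = soundex(query)
    return [n for n in names if soundex(n) == target]


surnames = ["Smith", "Smyth", "Smythe", "Schmidt", "Jones"]
by_wildcard = wildcard_search("Sm*th", surnames)
by_sound = soundex_search("Smith", surnames)
print(by_wildcard)  # ['Smith', 'Smyth']
print(by_sound)     # ['Smith', 'Smyth', 'Smythe', 'Schmidt']
```

The wildcard search is exact about the letters it is given, while the phonetic search casts a wider net. That is why a website should say which algorithm it uses and let the researcher choose.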

Browsing Facilities

Browsing is a tool for examining records in the context of the record set.  It should replicate the experience of turning the pages of the original.  The order of digital images should exactly follow the order of the original pages.  The structure of the record set and relationships between individual records contain subtle information about the creation and use of the original.

Browsing does not replace search. When browsing is needed as a last resort because search has failed, that is an indicator of poor search or transcription.

Record Quality

Original records are typically presented as digital images. Genealogists use the most original source available so they can be confident that the information is as reliable as possible. A digital image is not the same as the original, but can come acceptably close, provided that:
  1. The image quality is good enough to be legible.  Sharp focus, resolution, colour accuracy, and contrast all contribute to legibility. Digital image file types vary in the degree of data compression, which influences image quality.
  2. Information that identifies the record portrayed is included in the image file. That means all the information that you want in your citation, such as the archive and archive reference of the original, page number, record identifier, person of interest etc. A meaningful file name is helpful, as is human-readable text added to the image, but including enough detail makes a file name unreasonably long. Potentially most useful is citation information embedded in the image file metadata, which is computer-readable.
  3. Technical camera or scanner metadata is retained, providing provenance of the image, including whether it has been modified.
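
To show what embedded, computer-readable citation information can look like, the sketch below writes a citation into a PNG image as a standard text chunk and reads it back, using only Python's standard library. The keyword "Citation", the tiny test image and the citation wording are all invented for illustration. A real digitisation workflow would more likely use an imaging library and an established metadata standard such as XMP or IPTC.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, then CRC of type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return (struct.pack(">I", len(data)) + chunk_type + data
            + struct.pack(">I", crc))


def iter_chunks(png: bytes):
    """Yield (type, data) for each chunk in a well-formed PNG."""
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        yield png[pos + 4:pos + 8], png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def add_text(png: bytes, keyword: str, text: str) -> bytes:
    """Insert a tEXt chunk just before the closing 12-byte IEND chunk."""
    data = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    return png[:-12] + make_chunk(b"tEXt", data) + png[-12:]


def read_text(png: bytes) -> dict:
    """Collect every tEXt keyword and value found in the image."""
    found = {}
    for chunk_type, data in iter_chunks(png):
        if chunk_type == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
    return found


def tiny_png() -> bytes:
    """A 1x1 greyscale image, standing in for a real scan."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte plus one black pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


citation = "Example Record Office, ref. EX/1/2, page 14"  # invented
tagged = add_text(tiny_png(), "Citation", citation)
print(read_text(tagged))
```

Because the citation travels inside the file, it survives renaming and copying, which a descriptive file name alone does not guarantee.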

Your Challenge – Review one data set

There is a lot to consider in assessing the quality of genealogy websites and the data they contain. Of course, we want all the features mentioned above in a user-friendly package, but I think there is quite enough to start with above. Have I omitted anything vital? Do you agree with these criteria?

Before we can hold genealogy data suppliers accountable, we need to fairly assess whether what they offer is of sufficiently good quality for our purposes.  What constitutes ‘fit for purpose’ is open to debate.  I think genealogy data consumers would benefit from setting expectations and demanding quality, and that suppliers would benefit greatly from carefully considered feedback.

In the interest of collaboration between suppliers and consumers, I challenge you to review one data set using these criteria.