Protecting Privacy When Using AI

Introduction

Artificial intelligence (AI) tools are transforming how we do genealogy. From helping us transcribe old documents to helping us understand our DNA results, AI tools (especially large language models, or LLMs, like ChatGPT) are becoming valuable aids in family history research. However, these powerful tools raise important privacy concerns that every genealogist needs to understand and account for when using them.

Objective

To prevent sensitive family information from being inadvertently exposed or misused while still benefiting from AI in genealogical research and analysis.

Preparation

Why Should We Care About Privacy with AI?

To understand why privacy matters when using AI, it helps to know a bit about how AI tools work. Large language models are trained on vast amounts of information from the internet and other sources. Depending on the service and its settings, what we share with these tools may also be stored, combined with other data, or used to train future versions of the software.

For genealogists, this creates some specific risks. When we research family history, we often work with sensitive information about both living and deceased relatives. For example:

  • Birth dates and addresses of living people could be used for identity theft
  • Medical histories might reveal sensitive family conditions that living relatives want kept private
  • Adoption records could impact family relationships
  • DNA test results could expose unexpected family connections

Implementation

The Water Cooler Rule: A Simple Way to Think About Privacy

Before entering anything into a large language model chatbot like ChatGPT, ask yourself: “Would I be comfortable seeing this information posted for everyone to see above the water cooler at work?”

Practical Steps to Protect Private Information

1. Anonymize Your Data

When working with AI tools, get in the habit of removing or changing identifying details. For example:

  • Use initials or pseudonyms instead of full names
  • Replace specific dates with approximate ones
  • Replace specific addresses with general locations
  • Create ID numbers to track individuals (e.g., “Person A” or “Family 123”) rather than using their names
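
For anyone comfortable with a little scripting, the substitutions above can be sketched in Python. This is only a minimal illustration, not a complete anonymizer: the function name `anonymize`, the "Person A" labeling scheme, and the single date format it recognizes (such as "4 March 1952") are all assumptions made for the example. It only replaces names you list yourself, so always reread the result before pasting it into an AI tool.

```python
import re
from string import ascii_uppercase

# Assumption: dates are written like "4 March 1952". Other formats are NOT caught.
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
FULL_DATE = re.compile(rf"\b\d{{1,2}} (?:{MONTHS}) (\d{{4}})\b")


def anonymize(text, names):
    """Swap listed names for 'Person A', 'Person B', ... and blur exact dates.

    Returns the anonymized text and the name-to-label key, which you keep
    privately so you can translate the AI's answer back afterwards.
    """
    if len(names) > len(ascii_uppercase):
        raise ValueError("this sketch only labels up to 26 people")
    labels = {name: f"Person {letter}"
              for name, letter in zip(names, ascii_uppercase)}
    # Replace longer names first so "Mary Ann Smith" is handled before "Mary Ann".
    for name in sorted(labels, key=len, reverse=True):
        text = text.replace(name, labels[name])
    # Keep only the year, marked as approximate.
    text = FULL_DATE.sub(r"about \1", text)
    return text, labels


safe_text, key = anonymize(
    "Mary Smith was born 4 March 1952 to John Smith.",
    ["Mary Smith", "John Smith"],
)
print(safe_text)  # Person A was born about 1952 to Person B.
```

The key stays on your own computer; only the anonymized text goes to the AI tool.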

2. Know Your Tools

Before using any AI platform:

  • Read the privacy policy and terms of service
  • Check whether conversations are saved
  • Look for options to delete your history, or to keep conversations from being saved in the first place
  • Verify whether the service claims any rights to your input data

3. Set Clear Boundaries

Develop personal guidelines for what you will and won’t share. For example, you might decide to never:

  • Share family stories without anonymizing them first
  • Input DNA test results without anonymizing them first
  • Input medical information, unless the people mentioned have been dead for at least 50 years
  • Share documents about people, unless they’ve been dead for at least 50 years
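
If you keep your research in a spreadsheet or database, a rule like the 50-year threshold above can be checked automatically. The Python sketch below is one possible way to do it. The function name `old_enough_to_share` is hypothetical, and treating an unknown death date as "do not share" is an assumption made here for safety, not something the guideline itself spells out.

```python
from datetime import date


def old_enough_to_share(death_date, today=None, years=50):
    """Return True only if the person died at least `years` years ago.

    An unknown death date (None) is treated as "not shareable", since the
    person may still be living. This default is an assumption of the sketch.
    """
    if death_date is None:
        return False
    today = today or date.today()
    try:
        cutoff = death_date.replace(year=death_date.year + years)
    except ValueError:
        # A 29 February death date has no anniversary in a non-leap year;
        # fall back to 28 February.
        cutoff = death_date.replace(year=death_date.year + years, day=28)
    return today >= cutoff


print(old_enough_to_share(date(1970, 5, 1), today=date(2024, 1, 1)))  # True
print(old_enough_to_share(date(1990, 5, 1), today=date(2024, 1, 1)))  # False
print(old_enough_to_share(None))                                      # False
```

You can change `years` to match whatever boundary you have set for yourself.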

4. Understand Local Privacy Laws

Different regions have different laws governing data privacy that may affect what information you can share with AI tools.

  • Familiarize yourself with the privacy regulations that apply where you live
  • Keep in mind that privacy rules are usually stricter for living people than for deceased people

Review and Verify

Before you put sensitive or private information into an artificial intelligence tool:

  • Use the Water Cooler Rule
  • Anonymize sensitive data before sharing it
  • Read the privacy policies for your AI tools
  • Share these guidelines with your fellow genealogists

Ethical Considerations

Working Together for the Entire Genealogical Community

Privacy protection isn’t just an individual responsibility; it’s crucial for maintaining trust within the genealogical community. When one person mishandles private information, it can damage the reputation of all genealogists and make others hesitant to share their family history with anyone in the future.

Benefits and Limitations

While AI can help us analyze and summarize large amounts of family information quickly, we must carefully balance this convenience against the risk of exposing sensitive or private information. Understanding these limitations helps us make informed decisions about when and how to use AI tools in our research.

Takeaway Tips

When it comes to privacy and AI, it’s better to err on the side of caution. The family stories and information we work with aren’t just data points—they’re the personal histories of real people who deserve to have their privacy respected and protected. Together, we can harness the power of AI while protecting the private information that makes each family’s story unique.