1. Select and capture

This stage of the workflow covers planning, selecting the content for preservation, and obtaining it from the creator or depositor.

1.1 Planning

Before starting any email preservation you will need to undertake some planning to understand how emails are used and managed by the organisation or individual. This can include:

  • Identifying policies and procedures relating to email e.g. email retention policies
  • Understanding what email systems are being used
  • Considering any legal issues that may impact the preservation e.g. data protection, intellectual property rights
  • Identifying and engaging with stakeholders who understand the above and who will need to be involved in the preservation process

If you wish to preserve email within your organisation, you should consider creating an email preservation policy, email guidance for users and email retention plans, or ensuring that current policies cover these.

For external donors/depositors, you may wish to create guidance about email management and preservation.

1.2 Selection

The planning process also involves identifying what you plan to select for preservation.

There is a range of approaches that can be taken, including: keeping everything, focusing on sent items, self-curation by the account holder, or focusing on key accounts (the Capstone approach).

You will need to think about whether any appraisal takes place at this stage and who will undertake it. For example, will any ‘pre-appraisal’ by the account holder take place? The Good Practices for Acquiring Email Archives: a community guide recommends “the bulk of the appraisal work should occur before the email collection is transferred.” See ‘Step 2.2: Appraisal and sensitivity review’ for more information on appraisal.

At this point you may wish to capture some simple information from the donor or depositor such as format, date range, and size to help you understand the collection.
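If the donor or depositor has already supplied an export, some of this information can be gathered automatically. The following is a minimal sketch only, assuming the export is a single MBOX file and that Python's standard `mailbox` module can read it. The function name `summarise_mbox` and the keys of the returned dictionary are hypothetical and are not part of any guidance referenced here.

```python
import mailbox
import os
from datetime import timezone
from email.utils import parsedate_to_datetime


def summarise_mbox(path):
    """Return message count, earliest/latest Date header and file size (bytes).

    Hypothetical helper: assumes `path` points to a single MBOX file.
    """
    dates = []
    count = 0
    box = mailbox.mbox(path, create=False)
    try:
        for message in box:
            count += 1
            raw_date = message.get("Date")
            if raw_date is None:
                continue
            try:
                parsed = parsedate_to_datetime(raw_date)
            except (TypeError, ValueError):
                # Unparseable dates are skipped rather than guessed.
                continue
            if parsed.tzinfo is None:
                # Assumption: treat dates without a usable time zone as UTC
                # so that all dates can be compared with each other.
                parsed = parsed.replace(tzinfo=timezone.utc)
            dates.append(parsed)
    finally:
        box.close()

    return {
        "messages": count,
        "earliest": min(dates) if dates else None,
        "latest": max(dates) if dates else None,
        "size_bytes": os.path.getsize(path),
    }
```

Calling `summarise_mbox("export.mbox")` (the file name is illustrative) returns a dictionary that could be recorded alongside the other information supplied by the donor or depositor. The date range relies on the `Date` header of each message, which may be missing or inaccurate, so the result should be treated as indicative only.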

You will need to consider when you capture the email. You can use a rolling approach, with email captured at regular intervals (e.g. yearly), and/or capture the email when an account is no longer in use.

1.3 Capture

The capture methods will be informed by the planning you have undertaken in the ‘Planning’ and ‘Selection’ steps. They can include exporting content from mail servers, email clients or webmail into formats such as PST or MBOX.

Some organisations are migrating emails to Portable Document Format (PDF) for preservation, although the Novice to Know How training advises “this is generally only recommended for certain use cases, such as where the end users of the email archives have security concerns about loading another user’s emails into their email client.”

Disk imaging is an alternative approach, although the Novice to Know How training describes this as “one of the least desirable options as it removes any options for format selection or pre-transfer appraisal… It may, however, be the only option available due to time and resource constraints.”

At this point you should also undertake virus checks on the content and you might want to repeat these at different points in your workflow – see Section 1.3 – ‘Virus check’ – of the Digital preservation workflows guidance.

Following the capture, you may also want to create a checksum for the content – see Section 1.5 – ‘Create checksums’ – of the Digital preservation workflows guidance.
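As an illustration of what creating a checksum involves, the sketch below computes one for a single captured file. It assumes SHA-256 as the algorithm, which is a choice made for this example and not one specified in this guidance. The function name `sha256_of_file` is hypothetical. The file is read in chunks so that large exports do not need to fit in memory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file as a hexadecimal string.

    Hypothetical helper: SHA-256 is an assumption for this sketch.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting value can be stored with the captured content and recalculated later in the workflow. If the two values differ, the file has changed since capture.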
