This workflow covers planning for email preservation, selecting the content to preserve, and obtaining it from the creator or depositor.
1.1 Planning
Before starting any email preservation you will need to undertake some planning to understand how emails are used and managed by the organisation or individual. This can include:
- Identifying policies and procedures relating to email e.g. email retention policies
- Understanding what email systems are being used
- Considering any legal issues that may affect preservation, e.g. data protection or intellectual property rights (fifth definition on second page of PDF, 165 KB)
- Identifying and engaging with stakeholders who understand the above and who will need to be involved in the preservation process
If you wish to preserve email within your organisation, you should consider creating an email preservation policy, email guidance for users and email retention plans, or ensuring that existing policies cover email preservation.
For external donors/depositors, you may wish to create guidance about email management and preservation.
Further guidance
- Lesson 3: Developing an Email Preservation Program of the Novice to Know How: Email Preservation online training provides an overview of the key considerations.
- Good Practices for Acquiring Email Archives: a community guide is strong on planning.
- Why Do We Need to Preserve Email? (PDF, 263 KB) by The National Archives and the Digital Preservation Coalition is a useful guide to preservation for email users.
- Managing emails by The National Archives, aimed at email users in public sector bodies.
- Management of and Access to Email Archives User Guide (DOC, 9.9 MB) and Guide for Donors of Email Archives (PDF, 769 KB) by Jessica Smith & Paul Carlyle of the University of Manchester. Produced as part of the Palladium Project in 2022 and aimed at external donors/depositors.
1.2 Selection
Planning also involves deciding what you will select for preservation.
There is a range of approaches that can be taken, including: keeping everything, focusing on sent items, self-curation by the account holder, or focusing on key accounts (the Capstone approach; sixth definition on first page of PDF, 165 KB).
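To illustrate one of these approaches, the sketch below shows how "focusing on sent items" might be applied to an MBOX export, using Python's standard library `mailbox` module. It assumes that sent items can be recognised by comparing the From header with the account holder's address, which will not suit every collection (a dedicated Sent folder is often a better guide). The function name `copy_sent_items` and its parameters are hypothetical. The source file is only read; matching messages are written to a separate file.

```python
import mailbox
from email.utils import parseaddr


def copy_sent_items(source_path, target_path, account_address):
    """Copy messages whose From address matches account_address into target_path.

    Assumption: a message counts as 'sent' when its From header carries the
    account holder's address. Returns the number of messages copied.
    """
    wanted = account_address.lower()
    source = mailbox.mbox(source_path, create=False)  # fails if the file is missing
    target = mailbox.mbox(target_path, create=True)   # appends if the file already exists
    copied = 0
    try:
        for message in source:
            # parseaddr splits 'Name <address>' into (name, address)
            _, address = parseaddr(str(message.get("From", "")))
            if address.lower() == wanted:
                target.add(message)
                copied += 1
        target.flush()  # write the added messages to disk
    finally:
        target.close()
        source.close()
    return copied
```

For example, `copy_sent_items("all.mbox", "sent.mbox", "alice@example.org")` (hypothetical file names and address) would leave `all.mbox` unchanged and write the matching messages to `sent.mbox`.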
You will need to think about whether any appraisal (second definition on first page of PDF, 165 KB) takes place at this stage and who will undertake it. For example, will any ‘pre-appraisal’ by the account holder take place? The Good Practices for Acquiring Email Archives: a community guide recommends that “the bulk of the appraisal work should occur before the email collection is transferred.” See ‘Step 2.2: Appraisal and sensitivity review’ for more information on appraisal.
At this point you may wish to capture some simple information from the donor or depositor such as format, date range, and size to help you understand the collection.
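Where the donor or depositor can supply an MBOX file, this information could be gathered with a short script. The Python sketch below uses the standard library `mailbox` module to report the number of messages, the earliest and latest Date headers, and the file size. The function name `summarise_mbox` and the keys it returns are illustrative assumptions, not part of any of the guidance referenced here. Messages with missing or unparseable dates are skipped, so the date range is only indicative.

```python
import mailbox
import os
from datetime import timezone
from email.utils import parsedate_to_datetime


def summarise_mbox(path):
    """Return message count, earliest and latest Date header, and size in bytes."""
    dates = []
    count = 0
    box = mailbox.mbox(path, create=False)  # fails if the file is missing
    try:
        for message in box:
            count += 1
            raw_date = message.get("Date")
            if raw_date is None:
                continue
            try:
                parsed = parsedate_to_datetime(str(raw_date))
            except (TypeError, ValueError):
                continue  # skip dates that cannot be parsed
            if parsed.tzinfo is None:
                # Assumption: treat dates without a usable time zone as UTC
                parsed = parsed.replace(tzinfo=timezone.utc)
            dates.append(parsed)
    finally:
        box.close()
    return {
        "messages": count,
        "earliest": min(dates) if dates else None,
        "latest": max(dates) if dates else None,
        "size_bytes": os.path.getsize(path),
    }
```

Calling `summarise_mbox("export.mbox")` (a hypothetical file name) returns a dictionary that can be recorded alongside the other information supplied by the donor or depositor. It is safest to run this on a copy of the export.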
You will also need to consider when to capture the email. You can use a rolling approach, with email captured at regular intervals (e.g. yearly), and/or capture it when an email account is no longer in use.
Further guidance
- Both Module 4.1: Email Selection Methodologies of the Novice to Know How: Email Preservation online training and Section 2.3.2 of The Future of Email Archives report by the Council on Library and Information Resources provide a good overview of the different selection methods.
- Good Practices for Acquiring Email Archives: a community guide has a good section on selection/appraisal and a list of ‘high-level metadata’ which can be captured before acquisition.
- The ePADD tool (free) can allow account holders to review or appraise their email before deposit or transfer (see ‘Step 2.2: Appraisal and sensitivity review’).
- IRMS022 – Managing the email of the Netherlands government: in this Information and Records Management Society podcast, Vincent Hoolt describes the Netherlands government’s approach to email preservation, which draws inspiration from the Capstone method.
- Appraisal Rubric was developed for the Carcanet Press email collection by John Rylands Library and identifies what types of records will be kept.
1.3 Capture
The capture methods will be informed by the planning you have undertaken in the ‘Planning’ and ‘Selection’ steps. They can include exporting content from mail servers, email clients (seventh definition on first page of PDF, 165 KB) or webmail into formats such as PST or MBOX (sixth definition on fourth page and third definition on third page of the same PDF, 165 KB).
Some organisations are migrating emails to Portable Document Format (PDF) for preservation, although the Novice to Know How training advises “this is generally only recommended for certain use cases, such as where the end users of the email archives have security concerns about loading another user’s emails into their email client.”
Disk imaging is an alternative approach, although the Novice to Know How training describes this as “one of the least desirable options as it removes any options for format selection or pre-transfer appraisal… It may, however, be the only option available due to time and resource constraints.”
At this point you should also run virus checks on the content, and you may want to repeat these at other points in your workflow. See Section 1.3, ‘Virus check’, of the Digital preservation workflows guidance.
Following the capture, you may also want to create a checksum for the content. See Section 1.5, ‘Create checksums’, of the Digital preservation workflows guidance.
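As a minimal sketch of what creating a checksum involves, the Python example below computes a SHA-256 digest of a captured file using the standard library `hashlib` module. The choice of SHA-256 and the function name `sha256_of_file` are assumptions for illustration; use whichever algorithm and tools your own workflow specifies. The file is read in chunks so that large PST or MBOX exports do not need to fit in memory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() keeps calling read() until it returns the empty bytes sentinel
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recording the value returned for each captured file, and recalculating it later, lets you confirm that the content has not changed since capture.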
Further guidance and software
- Lesson 4: Selection and Capture of the Novice to Know How: Email Preservation online training provides an overview of this step including step-by-step guides to exporting email from classic Outlook and Gmail. It also provides an overview of the ePADD tool (free) which can capture email directly from some webmail.
- Email archive preservation which uses Preservica and ePADD (video) – Jan Whalen of the University of Manchester talks about exporting emails from Outlook into PST with the Carcanet archive.
- EA-PDF Working Group (2021), A specification for using PDF to package and represent email (PDF, 663 KB). This is a specification rather than a capture method, but the Sheffield City Archives case study (Module 8.3) in the Novice to Know How: Email Preservation online training is a good practical example of using PDF to preserve email.
- Disk imaging software such as BitCurator (free) can facilitate the capture of emails by creating disk images and extracting deleted materials.