Legionnaires' disease outbreak investigation toolbox

Data Management

Depending on the outbreak, comparatively large volumes of data might be collected. Even if the outbreak is small, data may need to be shared between organisations or countries. A good data management policy improves auditing, helps to identify lessons learnt, increases transparency and protects the security of personally identifiable information. Consideration should be given to creating a database to store contextual data in a coherent and robust manner. To facilitate this, versions of a trawling questionnaire in Epi Info and Word are provided here.

Data management plan

Data should be entered into a database suited to the software that will be used for analysis.

Errors may be introduced into the data at any stage of data collection, data entry or data analysis, and checking should take place at each stage. There are three main ways of reducing data entry errors and maintaining data quality at the data entry stage: interactive checking, double data entry and batch checking. However, none of these approaches can guarantee the identification of all data entry errors.

  • Double data entry, where the data are entered twice, ideally by two different people, and the two data sets are then compared using verification software, is the gold standard but may be impractical in an outbreak setting.
  • Interactive checking identifies errors or anomalies in the data as they are entered, and can detect range errors (e.g. an age of 176) or consistency errors (e.g. a pregnant male). It is best used when data collection proceeds in parallel with data entry, so that anomalies in the data can quickly be queried with the data source. However, interactive checking interrupts data entry.
  • Batch checking, where checks are made on the data after all the data are entered, or periodically during data entry, may therefore be preferred.
  • Interim analyses of the data, such as basic tabulations and plots, can identify further errors in the data. Where errors are corrected, it is important to maintain an audit trail of changes made to the data. One way of doing this is to leave the original data untouched and to correct errors programmatically at the time of analysis. If it is not possible to correct these errors, then it may be necessary to set their values to missing.
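The checks described above can be sketched in a few lines of Python. This is a minimal illustration only, not the toolbox's own method: the field names (case_id, age, sex, pregnant), the upper age limit of 120 and the example records and corrections are all assumptions made for the sketch. It shows a comparison of two data entry passes, a batch check for range and consistency errors, and corrections applied programmatically to a copy so that the original data stay untouched and every change is logged.

```python
from copy import deepcopy

# Hypothetical records, as they might be entered from a trawling questionnaire.
RAW_RECORDS = [
    {"case_id": "C001", "age": 54, "sex": "M", "pregnant": "N"},
    {"case_id": "C002", "age": 176, "sex": "F", "pregnant": "N"},  # range error
    {"case_id": "C003", "age": 38, "sex": "M", "pregnant": "Y"},   # consistency error
]

# Hypothetical corrections, each with a reason so the audit trail is meaningful.
CORRECTIONS = [
    {"case_id": "C002", "field": "age", "new_value": 76,
     "reason": "checked against source form"},
    {"case_id": "C003", "field": "pregnant", "new_value": None,
     "reason": "could not be verified; set to missing"},
]


def compare_double_entry(first, second):
    """Return (case_id, field, value_1, value_2) wherever two entry passes disagree."""
    second_by_id = {rec["case_id"]: rec for rec in second}
    diffs = []
    for rec in first:
        other = second_by_id.get(rec["case_id"], {})
        for field, value in rec.items():
            if other.get(field) != value:
                diffs.append((rec["case_id"], field, value, other.get(field)))
    return diffs


def batch_check(records, max_age=120):
    """Return (case_id, field, message) for every failed range or consistency check."""
    problems = []
    for rec in records:
        age = rec.get("age")
        # Range check: missing ages are skipped here, implausible ages are flagged.
        if age is not None and not 0 <= age <= max_age:
            problems.append((rec["case_id"], "age", f"age out of range: {age}"))
        # Consistency check: two fields that contradict each other.
        if rec.get("sex") == "M" and rec.get("pregnant") == "Y":
            problems.append((rec["case_id"], "pregnant", "male recorded as pregnant"))
    return problems


def apply_corrections(records, corrections):
    """Apply corrections to a copy of the data and return (cleaned, audit_log)."""
    cleaned = deepcopy(records)  # the original records are never modified
    by_id = {rec["case_id"]: rec for rec in cleaned}
    audit_log = []
    for fix in corrections:
        rec = by_id[fix["case_id"]]
        audit_log.append({
            "case_id": fix["case_id"],
            "field": fix["field"],
            "old_value": rec[fix["field"]],
            "new_value": fix["new_value"],
            "reason": fix["reason"],
        })
        rec[fix["field"]] = fix["new_value"]
    return cleaned, audit_log
```

In use, batch_check(RAW_RECORDS) would be run after data entry (or periodically during it), the flagged values queried with the data source, and apply_corrections called at the start of each analysis. Keeping the corrections as a separate list means the audit trail can be reviewed and rerun whenever new data arrive.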

Data Sharing

At times during an outbreak, there may be requests, or a need, to share information. Despite the multidisciplinary nature of the outbreak control team, additional analytical support might be required to interpret epidemiological and environmental data, or outbreak control teams may exist in other member states to respond to their own local cases. Updates to the local or national media may also be required. Click here for information regarding potential communication messages. Data sharing should be viewed as a benefit to the outbreak investigation and can be facilitated by good data management. The precise nature of the data to be shared will vary from outbreak to outbreak, but a number of key issues are likely to arise:

  • Ensure patient confidentiality
  • Ensure a clear audit trail and time stamp when data are sent out, so that if new cases or information are incorporated into the outbreak investigation, clear and logical updates can be sent out as necessary
  • Ensure data are sent out in a suitable electronic format for collaborators to open and read
  • Ensure data are structured within files in suitable and understood units (for example, case locations given in longitude/latitude coordinates as the centroid of the spatial unit the case inhabited at that time, which also protects confidentiality) and in common language (for example, so that laboratory results can be interpreted easily by other members of the outbreak control team)
  • Ensure legal obligations are met
  • Ensure outbreak investigations are not prejudiced
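Some of the points above (confidentiality, time stamps, centroid locations and a readable format) can be illustrated with a short Python sketch. Everything named here is an assumption for illustration: the field names, the list of direct identifiers, the spatial unit codes and their centroid coordinates are invented, and removing these fields does not by itself guarantee confidentiality or meet any legal obligation.

```python
from datetime import datetime, timezone

# Hypothetical lookup: spatial unit code -> (longitude, latitude) of its centroid.
UNIT_CENTROIDS = {
    "AREA_A": (-1.55, 53.80),
    "AREA_B": (-1.47, 53.38),
}

# Fields assumed to identify a person directly; they are dropped from any extract.
DIRECT_IDENTIFIERS = {"name", "address", "date_of_birth"}


def build_extract(records, version, sent_at=None):
    """Build a versioned, time-stamped extract without direct identifiers.

    Each case location is replaced by the centroid of its spatial unit,
    given as longitude/latitude.
    """
    sent_at = sent_at or datetime.now(timezone.utc)
    rows = []
    for rec in records:
        # Keep everything except direct identifiers and the raw spatial unit code.
        row = {
            key: value
            for key, value in rec.items()
            if key not in DIRECT_IDENTIFIERS and key != "spatial_unit"
        }
        longitude, latitude = UNIT_CENTROIDS[rec["spatial_unit"]]
        row["longitude"] = longitude
        row["latitude"] = latitude
        rows.append(row)
    return {
        "version": version,                    # incremented with each update sent out
        "sent_at_utc": sent_at.isoformat(),    # time stamp for the audit trail
        "n_cases": len(rows),
        "cases": rows,
    }
```

The returned dictionary could be written out with json.dump from the standard library, or its "cases" list with csv.DictWriter, depending on which format collaborators can open. Recording the version and time stamp inside the file itself lets recipients tell which update they are looking at.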