Data Management
Depending on the outbreak, comparatively large volumes of data might be collected. Even if the
outbreak is small, data sharing between organisations or countries may
be required. A good data management policy improves auditing, helps identify lessons
learnt, increases transparency and ensures the security of personally identifiable information.
Consideration should be given to the creation of a database to store contextual data in a
coherent and robust manner. To facilitate this, versions of a trawling questionnaire in Epi Info
and Word are provided here.
Data management plan
Data should be entered into a database suited to the software that will be used for
analysis.
Errors may be introduced into the data at any stage of data collection, data entry or data
analysis, and checking should take place at each stage. There are three main ways of reducing
data entry errors and maintaining data quality at the data entry stage: interactive checking,
double data entry and batch checking. However, none of these approaches can guarantee the
identification of all data entry errors.
- Double data entry (where the data are entered twice, ideally by two
different people, with the two data sets then compared using verification software) is the
gold standard, but may be impractical in an outbreak setting.
- Interactive checking identifies errors or anomalies in the data as they are entered, and can
detect range errors (e.g. an age of 176) or consistency errors (e.g. a pregnant male).
Interactive checking is best used when data collection proceeds in parallel with
data entry, and anomalies in the data can quickly be queried from the data source. However,
interactive checking interrupts data entry, and so batch checking, where checks are made on
the data after all the data are entered, or periodically during data entry, may be
preferred.
- Interim analyses of the data, such as basic tabulations and plots, can identify
further errors in the data. Where errors are corrected, it is important to maintain an
audit trail of changes made to the data. One way of doing this is to leave the original
data untouched and to correct errors programmatically at the time of analysis. If it is not
possible to correct these errors, then it may be necessary to set their values to missing.
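The checks described above can be illustrated with a minimal Python sketch. It is not a
substitute for dedicated verification software. The record layout, field names, case IDs and
the age limit of 120 are all assumptions made for illustration. The sketch compares two
independent entries of the same records, runs batch range and consistency checks, and applies
corrections to a copy of the data while logging each change, so the original entries stay
untouched.

```python
from datetime import datetime, timezone

# Hypothetical records keyed by case ID, entered twice by two different people.
entry_a = {
    "C001": {"age": 34, "sex": "F", "pregnant": "Y"},
    "C002": {"age": 176, "sex": "M", "pregnant": "N"},
    "C003": {"age": 29, "sex": "M", "pregnant": "Y"},
}
entry_b = {
    "C001": {"age": 43, "sex": "F", "pregnant": "Y"},
    "C002": {"age": 176, "sex": "M", "pregnant": "N"},
    "C003": {"age": 29, "sex": "M", "pregnant": "Y"},
}


def compare_double_entry(first, second):
    """Return (case_id, field, value_1, value_2) for every disagreement."""
    mismatches = []
    for case_id in sorted(set(first) | set(second)):
        rec_1 = first.get(case_id, {})
        rec_2 = second.get(case_id, {})
        for field in sorted(set(rec_1) | set(rec_2)):
            if rec_1.get(field) != rec_2.get(field):
                mismatches.append(
                    (case_id, field, rec_1.get(field), rec_2.get(field))
                )
    return mismatches


def batch_check(records, max_age=120):
    """Run range and consistency checks over all records after entry."""
    problems = []
    for case_id, rec in sorted(records.items()):
        age = rec.get("age")
        # Range check: missing ages (None) are not treated as range errors.
        if age is not None and not 0 <= age <= max_age:
            problems.append((case_id, "range: age"))
        # Consistency check: two fields that contradict each other.
        if rec.get("sex") == "M" and rec.get("pregnant") == "Y":
            problems.append((case_id, "consistency: pregnant male"))
    return problems


def apply_corrections(records, corrections):
    """Return a corrected copy and an audit log; the input is not modified."""
    corrected = {case_id: dict(rec) for case_id, rec in records.items()}
    audit_log = []
    for case_id, field, new_value, reason in corrections:
        audit_log.append({
            "case_id": case_id,
            "field": field,
            "old": corrected[case_id][field],
            "new": new_value,
            "reason": reason,
            "changed_at": datetime.now(timezone.utc).isoformat(),
        })
        corrected[case_id][field] = new_value
    return corrected, audit_log


mismatches = compare_double_entry(entry_a, entry_b)
problems = batch_check(entry_a)

# Values that cannot be corrected from the source are set to missing (None).
corrections = [
    ("C002", "age", None, "implausible age, source not available"),
    ("C003", "pregnant", None, "inconsistent with recorded sex"),
]
corrected, audit_log = apply_corrections(entry_a, corrections)
```

Because corrections are applied to a copy at analysis time, the same list of corrections can be
re-run whenever the raw data are updated, and the audit log records what was changed, why and when.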
Data Sharing
At times during an outbreak, there may be requests, or a need, to share information. In spite
of the multidisciplinary nature of the outbreak control team, additional analytical support
might be required to interpret epidemiological and environmental data, or outbreak control
teams in other member states may be responding to their own local cases.
Updates to the local or national media may also be required. Click here
for information regarding potential communication messages. Data sharing should be viewed as a
benefit to the outbreak investigation and can be facilitated by good data management. The
precise nature of the data to be shared will vary from outbreak to outbreak, but a number of
key issues are likely to arise:
- Ensure patient confidentiality
- Ensure there is a clear audit trail and time stamp when data are sent out, so that if new cases
or information are incorporated into the outbreak, clear and logical updates can be sent out as
necessary
- Ensure data are sent out in a suitable electronic format that collaborators can open and read
- Ensure data are structured within files in suitable and understood units (for example, case
locations given in longitude/latitude coordinates as the centroid of the spatial unit the case
inhabited at that time, which protects confidentiality) and in common language (for example,
so that laboratory results can be interpreted easily by other members of the outbreak control team)
- Ensure legal obligations are met
- Ensure outbreak investigations are not prejudiced
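Some of the points above can be sketched in Python as one possible way of preparing an extract
for collaborators. The field names, area codes, centroid coordinates and the list of identifying
fields are all hypothetical. Removing these fields does not by itself guarantee confidentiality
or meet legal obligations, which need to be assessed for each outbreak. The sketch drops
directly identifying fields, replaces each case location with the centroid of its spatial unit
in longitude/latitude, stamps every row with an extract version and time, and writes plain CSV,
a format most collaborators can open.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical lookup: spatial unit code -> centroid (longitude, latitude).
AREA_CENTROIDS = {
    "AREA01": (-1.61, 54.98),
    "AREA02": (-2.24, 53.48),
}

# Hypothetical fields that directly identify a person and must not be shared.
IDENTIFYING_FIELDS = {"name", "address", "date_of_birth"}


def prepare_for_sharing(cases, version, sent_at):
    """Return de-identified rows stamped with an extract version and time."""
    shared = []
    for case in cases:
        row = {
            key: value
            for key, value in case.items()
            if key not in IDENTIFYING_FIELDS and key != "area_code"
        }
        # Report the centroid of the spatial unit, not the exact location.
        lon, lat = AREA_CENTROIDS[case["area_code"]]
        row["centroid_lon"] = lon
        row["centroid_lat"] = lat
        # The version and time stamp let recipients tell updates apart.
        row["extract_version"] = version
        row["extract_sent_utc"] = sent_at.isoformat()
        shared.append(row)
    return shared


def to_csv(rows):
    """Serialise rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


cases = [
    {"case_id": "C001", "name": "Example Person", "address": "1 Example Street",
     "date_of_birth": "1990-01-01", "area_code": "AREA01",
     "onset_date": "2024-01-10", "lab_result": "positive"},
    {"case_id": "C002", "name": "Another Person", "address": "2 Example Street",
     "date_of_birth": "1985-06-15", "area_code": "AREA02",
     "onset_date": "2024-01-12", "lab_result": "pending"},
]

# A fixed time is used here so the example is reproducible; in practice
# datetime.now(timezone.utc) would record the actual time of sending.
sent_at = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
shared = prepare_for_sharing(cases, version=1, sent_at=sent_at)
csv_text = to_csv(shared)
```

When new cases or information arrive, the same function can be re-run with an incremented
version number, so that recipients can see which extract supersedes which.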