Legionnaires' disease outbreak investigation toolbox


2.5 Spatial data structure and interoperability

2.5.1 Spatial data model

Sections 2.1, 2.2, 2.3 and 2.4 provided an overview of the data required for a GIS-based investigation into the source of a Legionnaires' disease outbreak, together with some example schemas for the collection and storage of case data. For outbreaks both within individual countries and across borders, significant benefits can be realised if a common data model is agreed that defines what data are collected and how they should be structured within a GIS. A commonly defined and understood schema makes the data easier to interpret, and allows for more straightforward data integration in scenarios where data can be shared across borders. Even where data cannot be shared, there are still benefits in sharing analytical tools and code that analyse data based on a common schema.
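
As an illustration of how a shared schema lets the same checking code run unchanged in each country, the following minimal sketch validates a case record against an agreed field list. The field names and types are hypothetical and are not taken from the schemas in the earlier sections; an agreed data model would define its own.

```python
from datetime import date

# Hypothetical shared schema: field name -> required Python type.
# These field names are illustrative assumptions, not the toolbox's schema.
CASE_SCHEMA = {
    "case_id": str,
    "onset_date": date,
    "residence_x": float,  # easting in the agreed coordinate system
    "residence_y": float,  # northing in the agreed coordinate system
    "country_code": str,
}


def schema_errors(record: dict) -> list:
    """Return a list of problems found when checking a record against CASE_SCHEMA."""
    errors = []
    # Every agreed field must be present and of the agreed type.
    for field, expected in CASE_SCHEMA.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    # Fields outside the agreed schema are reported rather than silently kept.
    for field in record:
        if field not in CASE_SCHEMA:
            errors.append(f"unexpected field: {field}")
    return errors
```

A record that returns an empty list conforms to the schema, so any analytical tool written against that schema can process it regardless of which country produced it.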

2.5.2 Spatial data format

Even with a commonly adopted data model, it is still likely that individual nations will store data in different formats. This will largely be influenced by the database platform(s) and GIS software used by the authorities responsible for identifying and responding to Legionnaires' disease outbreaks. Spatial data can be stored as flat files, often in proprietary formats such as ESRI shapefile or MapInfo TAB files, and also in 'open' formats, as defined by the OGC, such as GML and KML. Data can also be stored in databases using platform-specific spatial data types such as Oracle SDO geometry, Microsoft SQL Server geography/geometry and PostGIS geometry, and using open data types such as the OGC WKB geometry format. Most modern GIS software can either work directly with multiple file formats or convert data into a usable format.
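
To give a sense of what an open geometry encoding looks like in practice, the sketch below encodes a two-dimensional point as OGC WKB (well-known binary) and decodes it into the equivalent WKT (well-known text) representation. It uses only the Python standard library and handles only the simplest case, a 2D point; the function names are illustrative, and real datasets would normally be read with GIS software or a spatial library.

```python
import struct


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    # '<'  little-endian, no padding
    # 'B'  byte-order flag (1 = little-endian)
    # 'I'  geometry type (1 = Point)
    # 'dd' x and y as 64-bit floats
    return struct.pack("<BIdd", 1, 1, x, y)


def wkb_to_wkt_point(wkb: bytes) -> str:
    """Decode a WKB 2D point and return it as WKT text."""
    # The first byte says how the remaining values are ordered.
    byte_order = "<" if wkb[0] == 1 else ">"
    (geom_type,) = struct.unpack(byte_order + "I", wkb[1:5])
    if geom_type != 1:
        raise ValueError("only 2D Point geometries are handled in this sketch")
    x, y = struct.unpack(byte_order + "dd", wkb[5:21])
    return f"POINT ({x} {y})"
```

Because the byte layout is publicly specified, a geometry written by one system in this form can be read by any other system that implements the same standard.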

Open standards should be considered as a potential mechanism for easing data sharing and interoperability. Open data transport mechanisms such as the Web Map Service (WMS) and Web Feature Service (WFS) can be efficient methods for exposing spatial data from databases or file servers for transport across the internet or an intranet and for subsequent consumption by applications. WMS typically transports backdrop image mapping, whilst WFS can transport intelligent feature data, with coordinates and attributes that can be recreated and interrogated within applications. The OGC has responsibility for defining open geographic standards to enhance spatial data interoperability and sharing. Any cross-European surveillance initiative that makes use of spatial data should give careful consideration to the data storage, structure and delivery it uses, and consider the use of open standards. Further details may be found at http://www.opengeospatial.org.
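
As a rough illustration of how feature data might be requested from a WFS, the sketch below builds a GetFeature request URL using the standard key-value parameters of WFS 2.0.0. The endpoint and layer name in the usage note are hypothetical, and details such as supported versions, output formats and coordinate axis order vary between servers, so any real request should be checked against the capabilities document of the target service.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(base_url, type_name, bbox, srs="EPSG:4326", max_features=100):
    """Build a WFS 2.0.0 GetFeature URL for features inside a bounding box.

    bbox is (min_x, min_y, max_x, max_y) expressed in the given srs.
    """
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,  # WFS 2.0.0 name; earlier versions use typeName
        "srsName": srs,          # coordinate system requested for the response
        "bbox": ",".join(str(v) for v in bbox) + "," + srs,
        "count": max_features,   # upper limit on the number of features returned
    }
    return base_url + "?" + urlencode(params)
```

For example, `wfs_getfeature_url("https://example.org/wfs", "demo:case_locations", (10.0, 55.0, 12.0, 57.0))` returns a URL that could be passed to any HTTP client, where both the host and the `demo:case_locations` layer are placeholders.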

2.5.3 Coordinate Systems

Locations on the Earth's spherical surface are measured in geographic coordinates. However, while latitude and longitude can locate exact positions on the Earth's surface, they are not uniform units of measure: only along the equator does the distance represented by one degree of longitude approximate the distance represented by one degree of latitude. To overcome measurement difficulties, data are commonly 'transformed' from three-dimensional geographic coordinates to two-dimensional 'projected' coordinates. In Europe, spatial data can be stored using global (e.g. Web Mercator Auxiliary Sphere), European (e.g. European Terrestrial Reference System, ETRS) and country-specific (e.g. British National Grid) mapping coordinate systems. Global and European coordinate systems are designed for representing spatial data at larger, cross-country scales, although they tend not to offer the precision that can be obtained from a coordinate system based on a projection optimised for a particular region of the Earth's surface (e.g. a single country).
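
The non-uniformity of degrees can be made concrete with a short calculation. The sketch below assumes a perfectly spherical Earth with a mean radius of 6371 km, which is a simplification, and the function names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation


def km_per_degree_latitude() -> float:
    """Length of one degree of latitude, constant on a sphere."""
    return math.pi * EARTH_RADIUS_KM / 180.0


def km_per_degree_longitude(latitude_deg: float) -> float:
    """Length of one degree of longitude, which shrinks towards the poles."""
    return km_per_degree_latitude() * math.cos(math.radians(latitude_deg))
```

Under this approximation one degree of longitude spans roughly 111 km at the equator but only about half that at 60° north, which is why distances and buffers are better computed in projected coordinates than in degrees.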

In the context of a cross-border Legionnaires' disease outbreak, it is likely that datasets sourced from individual countries would be based on differing coordinate systems (e.g. Sweden uses RT90 and neighbouring Norway uses one of two systems, one of which is NGO1948). To enable spatial analysis, however, data need to be referenced to a common coordinate system. Modern GIS software, both commercial and open source, commonly has facilities to handle the re-projection of data.
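
As a minimal sketch of what re-projection involves, the function below converts longitude and latitude in degrees to spherical Web Mercator coordinates in metres, one of the global systems mentioned in this section. This is the simplest possible case: converting between national grids such as RT90 or NGO1948 also requires datum transformation parameters and is best left to GIS software or a dedicated projection library. The function name and the latitude limit used for validation are choices made for this example.

```python
import math

SPHERE_RADIUS_M = 6378137.0  # radius used by spherical Web Mercator
MAX_LATITUDE_DEG = 85.06     # the projection is unbounded towards the poles


def lonlat_to_web_mercator(lon_deg: float, lat_deg: float):
    """Project geographic coordinates (degrees) to Web Mercator (metres)."""
    if not -MAX_LATITUDE_DEG <= lat_deg <= MAX_LATITUDE_DEG:
        raise ValueError("latitude outside the usable range of Web Mercator")
    # Easting is proportional to longitude.
    x = SPHERE_RADIUS_M * math.radians(lon_deg)
    # Northing stretches with latitude so that shapes are preserved locally.
    y = SPHERE_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y
```

Once every dataset has been converted to the same projected system in this way, coordinates from different countries can be compared, overlaid and measured against one another directly.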