1.3 Comparing potential outbreak sources

GIS can be used to explore the spatial relationships between the cases in a Legionnaires' disease outbreak and the potential sources of contamination that might be responsible for it. Descriptive epidemiology is often enough to identify a single common location as responsible (such as a swimming baths visited by all the cases in the outbreak). However, where a common location is not obvious, a cooling tower or other aerosol-emitting facility could be the source, with the cases having at some point come into contact with contaminated aerosols in relatively close proximity to it.

Once potential outbreak source locations, such as cooling towers, have been identified, GIS can be used to examine each case's movements in relation to each potential source and so assess the relative likelihood of each source being responsible.

1.3.1 Buffer and overlay analysis

Data requirements: Case data, Potential outbreak source locations

Description: Buffering can be used to establish 'zones' around potential source locations that reflect the perceived area in which those locations, if responsible for the outbreak, could have affected the population. Those buffers can then be overlaid with case data such as home/work locations and travel routes to identify which cases have been within each zone. The assumption is that those zones that have the larger number of cases travelling through them are more likely to house the location of common infection. Data should be analysed to establish the following for each buffer zone:

1. The number of home locations within the buffer

2. The number of unique cases that have been within the buffer

3. The number of locations visited within the buffer

4. The number of travel routes through the buffer

Figure 1.1 and Table 1.1 provide an illustration of the analysis outputs.

GIS tools required: A vector-based buffer tool is required for buffering potential outbreak sources. A spatial-join tool is required to overlay case data with the buffers to provide totals for 1-4 listed above.
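The four counts listed above can be sketched without a GIS package when the data are simple. The following is a minimal illustration, not a prescribed implementation: it assumes planar coordinates in metres, a circular buffer around a single point source, and a hypothetical case record layout (`id`, `home`, `visited`, `routes`) invented for this example. A route is treated as passing through the buffer when any of its segments comes within the buffer radius of the source.

```python
import math


def dist_point_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_in_buffer(point, source, radius):
    return math.hypot(point[0] - source[0], point[1] - source[1]) <= radius


def route_in_buffer(route, source, radius):
    """True if any segment of the route (a list of points) enters the buffer."""
    return any(dist_point_to_segment(source, a, b) <= radius
               for a, b in zip(route, route[1:]))


def overlay(source, radius, cases):
    """Return counts 1-4 for one buffer zone."""
    homes = visited = routes = 0
    unique_cases = set()
    for case in cases:
        hits = 0
        if point_in_buffer(case["home"], source, radius):
            homes += 1
            hits += 1
        for location in case["visited"]:
            if point_in_buffer(location, source, radius):
                visited += 1
                hits += 1
        for route in case["routes"]:
            if route_in_buffer(route, source, radius):
                routes += 1
                hits += 1
        if hits:
            unique_cases.add(case["id"])
    return {"home_locations": homes, "unique_cases": len(unique_cases),
            "locations_visited": visited, "travel_routes": routes}


# Hypothetical example data: one source at the origin with a 500 m buffer.
cases = [
    {"id": "c1", "home": (100, 100), "visited": [(300, 0), (2000, 2000)],
     "routes": [[(-1000, 200), (1000, 200)]]},
    {"id": "c2", "home": (3000, 0), "visited": [(0, 400)],
     "routes": [[(3000, 0), (0, 400)]]},
    {"id": "c3", "home": (5000, 5000), "visited": [],
     "routes": [[(5000, 5000), (5000, 0)]]},
]
result = overlay((0, 0), 500, cases)
print(result)
```

In practice a GIS buffer and spatial-join tool would replace the distance tests, and would also handle polygon sources and geographic coordinate systems, which this sketch does not.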

Examples from the literature: Garcia-Fulgueiras et al. (2003) established 30 specific zones in the city of Murcia in which potential sources of contaminated aerosols were located, and used their detailed record of patients' movements to record who had entered these zones, when, and how frequently. Similarly, in the 2003 Hereford outbreak, Kirrage et al. used a composite score methodology, with points allocated for each case that was within 500 m of the source or within 1 km downwind of the source. In both Murcia and Hereford this approach helped to identify the respective sources of the outbreak.

Considerations for a cross-border outbreak: As discussed in section 2.1 Case data, it is unlikely that detailed case data such as home/work locations and travel routes could be shared between countries in a cross-border outbreak scenario. It should, however, be possible to share data on potential outbreak source locations, such as cooling tower locations, and any buffers that are generated around them. Each nation can therefore analyse only the cases associated with its own country, as described above, and the numbers from each country can then be aggregated for each buffer zone. Assuming each country conforms to an agreed data schema, data integration should be straightforward. Figures 1.2 and 1.3 and Table 1.2 provide an illustration of this scenario.

Figure 1.1 Example data used in buffer and overlay analysis

Table 1.1 Example output of overlay analysis

Cooling Tower	Home locations within the buffer zone	Locations visited within the buffer zone	Travel routes through the buffer zone	Cases that have travelled within the buffer zone
A	3	15	25	13
B	3	9	15	9
C	0	3	4	4
D	0	3	3	3
E	3	6	13	6
F	5	10	36	12

Overlaying each case's home location, the locations they have visited and the routes they have travelled reveals that the zones surrounding cooling towers A and F have had a high number of cases pass through them. Of a total of 14 cases included in the outbreak, 13 had been within the buffer zone of cooling tower A and 12 within the buffer zone surrounding cooling tower F.

Figure 1.2 Only those cases resident in Country A are included within the analysis. Travel into Country B can still be included.

Figure 1.3 Only those cases resident in Country B are included within the analysis. Travel into Country A can still be included.

Table 1.2 Analysis outputs from Country A and Country B are combined.

Cooling Tower	Home locations within buffer zone			Locations visited within buffer zone			Travel Routes through buffer zone			Cases that have travelled within buffer zone
	Country A	Country B	Total	Country A	Country B	Total	Country A	Country B	Total	Country A	Country B	Total
A	0	3	3	7	8	15	10	15	25	7	5	13
B	0	3	3	4	5	9	5	10	15	7	2	9
C	0	0	0	2	1	3	2	2	4	2	2	4
D	0	0	0	1	2	3	1	2	3	1	2	3
E	3	0	3	4	2	6	11	2	13	1	5	6
F	4	1	5	7	3	10	26	10	36	6	6	12

The buffer and overlay analysis described in this section essentially attaches figures to a potential outbreak source (and its buffer). In a cross-border outbreak where individual-level data cannot be shared, each country can still perform the analysis before sharing its results with a neighbouring country. Provided that the neighbouring country conducts the same analysis, the figures attached to each potential source location can be combined, and the combined figures should be the same as if the analysis had been conducted with access to all the individual-level case data.
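Combining the per-country outputs amounts to summing like-for-like counts for each potential source. The sketch below assumes a simple agreed schema (a dictionary per country keyed by cooling tower, with the hypothetical field names shown) and uses the Country A and Country B figures for cooling towers B and F from Table 1.2.

```python
# Hypothetical field names standing in for an agreed data schema.
FIELDS = ("home_locations", "locations_visited", "travel_routes", "cases")


def combine(*country_outputs):
    """Sum the per-buffer counts reported by each country."""
    combined = {}
    for output in country_outputs:
        for tower, counts in output.items():
            totals = combined.setdefault(tower, dict.fromkeys(FIELDS, 0))
            for field in FIELDS:
                totals[field] += counts[field]
    return combined


# Figures for cooling towers B and F taken from Table 1.2.
country_a = {
    "B": {"home_locations": 0, "locations_visited": 4, "travel_routes": 5, "cases": 7},
    "F": {"home_locations": 4, "locations_visited": 7, "travel_routes": 26, "cases": 6},
}
country_b = {
    "B": {"home_locations": 3, "locations_visited": 5, "travel_routes": 10, "cases": 2},
    "F": {"home_locations": 1, "locations_visited": 3, "travel_routes": 10, "cases": 6},
}

combined = combine(country_a, country_b)
for tower, totals in sorted(combined.items()):
    print(tower, totals)
```

Summing the number of unique cases is only valid if each case is counted by exactly one country, for example its country of residence as in Figures 1.2 and 1.3; a case reported by both countries would otherwise be counted twice.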

1.3.2 Attack rate analysis

Data requirements: Case data, Potential outbreak source locations, Demographic data

Description: The assumption behind this technique is that, for the facility responsible for an outbreak, risk will decrease with distance from that facility. The simplest way to apply the technique is to buffer each facility at varying radii and overlay those buffers with population data and case data. This establishes the total population and the number of cases within each radius, from which both attack rates and risk ratios can be calculated for each buffer. The advantage of looking at attack rates in addition to the actual number of cases is that they provide a relative measure of disease occurrence. For example, a given number of cases reported in a sparsely populated area could prove more insightful than an identical number of cases reported in a densely populated area. Similarly, looking at age-specific attack rates can be important. Legionnaires' disease tends to affect the elderly more than the young, so a given number of cases in an elderly population may not be as significant as an identical number of cases in a younger population.

A limitation of this technique is that, in order to establish attack rates, a location must be assigned to each case. This is commonly the case's home address; however, Legionnaires' disease may well have been contracted at another location (such as the workplace) or while travelling between locations. Figure 1.4 and Table 1.3 provide an illustration of how attack rates may be presented.
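The calculation itself is simple arithmetic and can be sketched as follows, using the case and population counts from Table 1.3. Several choices here are assumptions of the sketch rather than part of the method as described: the multiplier of 1,000 persons (chosen because it reproduces the tabulated rates), the use of the 10 km buffer as the reference for the risk ratio, and the strict "rate falls at every step" test used to flag a candidate source.

```python
def attack_rate(cases, population, per=1000):
    """Cases per `per` persons within a buffer."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * per


def risk_ratio(rate, reference_rate):
    """Ratio of a buffer's attack rate to a reference attack rate."""
    if reference_rate == 0:
        return None  # undefined when the reference rate is zero
    return rate / reference_rate


def decreases_with_distance(rates):
    """True if the attack rate falls at every step outwards."""
    return all(inner > outer for inner, outer in zip(rates, rates[1:]))


# (buffer radius in km, cases, population) from Table 1.3.
buffers = {
    "CT1": [(1.0, 9, 182), (2.5, 20, 832), (5.0, 27, 2996), (10.0, 33, 12270)],
    "CT2": [(1.0, 0, 178), (2.5, 1, 737), (5.0, 18, 3845), (10.0, 32, 12292)],
    "CT3": [(1.0, 1, 150), (2.5, 2, 712), (5.0, 18, 3857), (10.0, 32, 12226)],
    "CT4": [(1.0, 3, 219), (2.5, 15, 942), (5.0, 37, 3234), (10.0, 33, 11931)],
}

rates = {tower: [attack_rate(c, p) for _, c, p in rows]
         for tower, rows in buffers.items()}
candidates = [tower for tower, r in rates.items() if decreases_with_distance(r)]

# Risk ratio of the innermost buffer against the outermost (assumed reference).
ct1_ratio = risk_ratio(rates["CT1"][0], rates["CT1"][-1])

print(candidates)
print(round(rates["CT1"][0], 2), round(ct1_ratio, 1))
```

Because the buffers are cumulative (the 10 km buffer contains the 1 km buffer), a ratio against the outermost buffer is only a rough comparison; an investigation might instead compare against the population outside all buffers or use non-overlapping distance bands.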

GIS tools required: A vector-based buffer tool is required for buffering potential outbreak sources at multiple distances. Because demographic data varies in nature, population data may come in either vector or raster form. A tool is needed to calculate the total population within each buffer from that vector or raster data.
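For gridded (raster) population data, one simple way to total the population within a buffer is to sum the cells whose centres fall inside it. The sketch below illustrates this under stated assumptions: planar coordinates in metres, a circular buffer, and a hypothetical grid of 500 m cells with 10 persons each. The cell-centre rule is a simplification; area-weighting cells that straddle the buffer edge is a common alternative.

```python
import math


def population_within(centre, radius, cells):
    """Sum the population of grid cells whose centre lies within the buffer.

    cells: iterable of (x, y, population), with (x, y) the cell centre.
    """
    cx, cy = centre
    return sum(pop for x, y, pop in cells
               if math.hypot(x - cx, y - cy) <= radius)


# Hypothetical 5 x 5 grid of 500 m cells centred on the source, 10 persons per cell.
cells = [(col * 500.0, row * 500.0, 10)
         for row in range(-2, 3) for col in range(-2, 3)]

pop_500m = population_within((0.0, 0.0), 500.0, cells)
pop_1km = population_within((0.0, 0.0), 1000.0, cells)
print(pop_500m, pop_1km)
```

The resulting totals would serve as the denominators when attack rates are calculated for each buffer.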

Examples from the literature: Nygard et al. (2008) used this technique in their investigation into the 2005 Fredrikstad and Sarpsborg outbreak in Norway. The authors calculated attack rates and risk ratios for each potential facility within 1 km, 1.5 km, 3 km, 5 km and 10 km buffers. They found that those living within 1 km of a specific facility were the most at risk, and that only for this particular facility did the risk decrease with distance.

Considerations for a cross-border outbreak: There are two challenges to conducting attack rate analysis across a border. Firstly, as individual-level case data cannot be shared across borders, each nation would need to provide the total number of cases within each buffer zone on its side of the border. Each figure could then be combined with that of the neighbouring country to provide the total number of cases within each buffer zone. The second challenge relates to the demographic data used to assign a population to each buffer. As discussed in section 2.3 Demographic data, different countries collect demographic data in different ways, possibly covering different time periods, so the data are not necessarily directly comparable. Sensibly assigning a population to each buffer would likely require a gridded population model with continuous coverage across national boundaries.

Figure 1.4 Example Legionnaires' disease outbreak scenario

Table 1.3 Attack rates per 1,000 persons. Attack rates for CT1 decrease with distance, suggesting that it could be the cooling tower responsible for the outbreak.

Cooling Tower Buffer	Cases	Population	Attack Rate (per 1,000 persons)
CT1
1 km	9	182	49.45
2.5 km	20	832	24.03
5 km	27	2996	9.01
10 km	33	12270	2.69
CT2
1 km	0	178	0
2.5 km	1	737	1.36
5 km	18	3845	4.68
10 km	32	12292	2.60
CT3
1 km	1	150	6.67
2.5 km	2	712	2.81
5 km	18	3857	4.67
10 km	32	12226	2.62
CT4
1 km	3	219	13.70
2.5 km	15	942	15.92
5 km	37	3234	11.44
10 km	33	11931	2.77

Legionnaires' disease outbreak investigation toolbox
