Under the leadership of Dr Harry Gibson, the ROAD-MAP team gathers data from cross-sectional household surveys.

These are surveys in which an agency (governmental, charitable, or research project) visits a location that has been selected using a statistically valid sampling approach and:

  • Records the latitude and longitude (or GPS coordinates) of that location
  • Asks a series of questions of all the inhabitants to determine wealth indicators, educational indicators, access to interventions, and other useful information
  • Takes blood tests from all the inhabitants to determine the parasite rate.

Each survey included in the MAP parasite rate database has been disaggregated to individual sites (geographical points), individual dates (if the same site was sampled repeatedly) and individual parasite species. Conversely, individual-level survey data have been aggregated up to the community level at a particular site and time. So, for example, 14 P. falciparum positives confirmed by microscopy out of 120 community members aged 1-10 years in Goma village, examined from 1 to 30 June 2008, would constitute a single record.
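The aggregation step described above can be illustrated with a minimal sketch. The tuple layout, the function name aggregate and the period label "2008-06" are assumptions made for this example only; they are not MAP's actual processing code.

```python
from collections import defaultdict


def aggregate(individuals):
    """Collapse individual results into one record per (site, period, species).

    `individuals` is an iterable of (site, period, species, is_positive)
    tuples. This layout is hypothetical and chosen only for illustration.
    """
    totals = defaultdict(lambda: {"examined": 0, "positive": 0})
    for site, period, species, is_positive in individuals:
        record = totals[(site, period, species)]
        record["examined"] += 1
        record["positive"] += int(is_positive)
    # Add the parasite rate (positives / examined) to each aggregated record.
    return {
        key: {**record, "pr": record["positive"] / record["examined"]}
        for key, record in totals.items()
    }


# The Goma example from the text: 14 positives among 120 people examined.
individuals = [("Goma", "2008-06", "P. falciparum", i < 14) for i in range(120)]
records = aggregate(individuals)
```

Here the 120 individual results collapse into a single record keyed by site, period and species, with a parasite rate of 14/120.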

There are two main sources of these cross-sectional surveys:

  • Published papers, located via searches in PubMed (and similar). These data are usually just parasite rates without additional questionnaire information. Sometimes, we can extract data directly from these papers (and record it as information that is publicly available). Sometimes, we have to contact the authors for data (and we record the permission they grant on the data use accordingly). These sources currently account for about 75% of the survey points MAP holds.
  • Surveys published by non-governmental and charitable organisations, principally Measure DHS. These nearly always combine the covariate questions with the parasite rates, and so form the most useful data. These sources account for about 25% of the survey points MAP holds but this proportion is increasing. They are the only source of point data we have on insecticide-treated nets (ITNs).

While these cross-sectional surveys are the most useful data to statisticians, they are mostly concentrated in sub-Saharan Africa. This is why much of the recent work carried out by MAP has concentrated on this region.

All data held by MAP is stored in accordance with our data policy.

The majority of these data are publicly available, but there are some surveys for which the data owners have requested that we restrict what we make available. As a result, each record falls into one of the four categories listed below.

Please note: the category that any given row falls into is recorded in the permissions_info field in the download (see also the data dictionary further down this page).

  1. The full data is available for the survey (permissions_info field is null)
  2. The data is available for download but the coordinates are withheld. All data from the Measure DHS surveys falls into this category (permissions_info field = “Site data available from www.measuredhs.com”); for these rows, the DHS survey ID is included. To obtain the coordinate data, register with Measure DHS as a user and download the coordinates file for the survey. For non-DHS surveys (permissions_info field = “Confidential location”), please contact the data owner cited in the bibliographic source for the coordinates.
  3. The coordinates are available but the data are withheld (permissions_info field = “No permission to release data”). Please contact the data owner cited in the bibliographic source for the data.
  4. All data is withheld at the request of the data owner. These records do not appear at all in the publicly available downloads.
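
As a rough illustration of the numbered list above, the sketch below maps a permissions_info value to its category number. It assumes that a null value arrives as None or an empty string once the download has been parsed, and that the strings match the quoted values exactly; both are assumptions about the file, and the function name release_category is hypothetical.

```python
# Values quoted in the list above; exact matching is an assumption.
DHS_NOTE = "Site data available from www.measuredhs.com"
CONFIDENTIAL = "Confidential location"
NO_DATA = "No permission to release data"


def release_category(permissions_info):
    """Return 1, 2 or 3 for a downloaded row, following the list above.

    Category 4 records never appear in the public downloads, so they
    cannot be encountered here.
    """
    if permissions_info is None or permissions_info.strip() == "":
        return 1  # full data available
    value = permissions_info.strip()
    if value in (DHS_NOTE, CONFIDENTIAL):
        return 2  # data available, coordinates withheld
    if value == NO_DATA:
        return 3  # coordinates available, data withheld
    raise ValueError(f"Unrecognised permissions_info value: {value!r}")
```

Raising an error on an unrecognised value, rather than guessing a category, makes any wording that differs from the list above visible immediately.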

MAP has provided an R package that can download data points from our Explorer tool.

Download PfPR Surveys
Download PvPR Surveys

The data fields available for each survey record are listed in the following table:

Field Description
FID A unique identifier for the data point generated by GeoServer.
id This is a unique identifier provided by MAP and must be used in any correspondence with MAP about specific data points.
dhs_id This field is only completed when the survey data was provided by MEASURE DHS. In these instances, additional site data can be obtained from the MEASURE DHS website as detailed in the permissions_info field. The dhs_id links the survey data downloaded from the MAP Data Explorer to the GPS dataset provided by MEASURE DHS.
site_id This is a unique identifier provided by MAP and must be used in any correspondence with MAP about specific sites.
site_name The name of the site in which the survey was conducted.
latitude Site location coordinates referenced to the WGS84 coordinate system.
rural_urban Whether the site location is rural (‘R’) or urban (‘U’).
country The country in which the survey was conducted.
country_id The ISO country code.
continent_id The continent in which the survey was conducted.
month_start Dates between which the survey was conducted.
lower_age Age range of the individuals surveyed. Surveys that included individuals of all ages are recorded as lower age = 0 and upper age = 99.
examined The number of individuals examined.
pv_pos The number of individuals with Plasmodium vivax parasites in the blood.
pv_pr The proportion of Pv positive individuals.
method Diagnostic method used, e.g. microscopy or RDT.
rdt_type If an RDT was used for diagnosis, the type of test used.
pcr_type If a PCR was used for diagnosis, the type of test used.
permissions_info This field explains any blanks in the other fields. There are three categories of missing data. If we do not have permission to release survey data then this is noted here. If the site data itself is sensitive and cannot be released then this is noted here. If the site data is available directly from MEASURE DHS then the URL is provided here.
citation1 The citation(s) describe the data source and should be used when publishing further work on these data. Up to three citations may be given per survey. At least one citation is always provided regardless of whether the other data fields are available for release.
malaria_metrics_available Boolean value describing whether the data point has publicly available data.
location_available Boolean value describing whether the data point has a publicly available location.
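
As an illustration of how these fields relate to one another, the sketch below reads a few rows and checks that pv_pr agrees with pv_pos divided by examined. It assumes the download is a CSV file with a header row and that the boolean fields are written as "true" or "false"; neither is confirmed here. The sample rows are invented for the example and are not real survey data.

```python
import csv
import io

# Made-up rows using field names from the table above (not real survey data).
SAMPLE = """id,examined,pv_pos,pv_pr,malaria_metrics_available
1,120,14,0.1167,true
2,,,,false
3,50,5,0.2,true
"""


def check_rates(text, tolerance=1e-3):
    """Return the ids of rows whose pv_pr disagrees with pv_pos / examined."""
    mismatched = []
    for row in csv.DictReader(io.StringIO(text)):
        # Rows whose metrics are withheld have blank values, so skip them.
        if row["malaria_metrics_available"].lower() != "true":
            continue
        examined = int(row["examined"])
        if examined == 0:
            continue  # a rate is undefined when nobody was examined
        expected = int(row["pv_pos"]) / examined
        if abs(expected - float(row["pv_pr"])) > tolerance:
            mismatched.append(row["id"])
    return mismatched


bad_ids = check_rates(SAMPLE)
```

In this sample only the third row is flagged, because 5 out of 50 is 0.1 rather than the 0.2 recorded in pv_pr.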