# MAP Data Summary and Download Page
This page supports the Wellcome Trust’s “Data Re-Use Prize – Malaria” competition. It summarises MAP’s data estate and links to where these data can be downloaded. Data for the example questions are also grouped together on this page for convenience.
MAP provides the outputs of our research, as well as broader technical advice and support, to National Malaria Control Programmes (NMCPs), non-governmental organisations (NGOs), Ministries of Health, and other third parties as part of our commitment to open access data.
To this end, MAP obtains, curates, and shares a wide variety of malariometric data. These fall into two categories:
- Input data for models. These include data that are expanded upon in the pages linked below:
- Public-domain malaria metrics reported through routine surveillance systems
- Nationally representative cross-sectional surveys of parasite rate
- Satellite imagery capturing global environmental conditions that influence malaria transmission
Further input data are available from the MAP Data Explorer page and Country Profiles, including:
- Mosquito vector occurrence surveys
- Duffy negativity surveys
- G6PD deficiency surveys
- HbS (sickle haemoglobin) surveys
- Modelled outputs. These include data that are expanded upon in the pages linked below:
- Malaria incidence, parasite rate, and interventions in sub-Saharan Africa
- Accessibility to cities
- Malaria attributable fever and treatment rates
Further modelled outputs are available from the MAP Data Explorer page and Country Profiles, including:
- Mosquito vector occurrence and relative abundance
- The spatial limits of Plasmodium falciparum and P. vivax malaria
- Temperature suitability for malaria transmission
- P. vivax relapse incidence
- Duffy-negativity phenotype frequency
- G6PD deficiency allele frequency
- HbS (sickle haemoglobin) allele frequency
# Data For the Example Questions
The Wellcome Data Re-Use Prize for Malaria is in the form of an open question. Participants are challenged to explore MAP’s data and come up with innovative uses or insights. Submissions might combine our data or modelled outputs with their own open datasets to address questions either directly associated with malaria or for which malaria might be a potential covariate.
Three example questions are included on the Wellcome Trust’s competition page. The suggested datasets to use for these example questions are gathered below for convenience. Participants should not feel constrained to just use these data.
# Example Question 1: Explaining unattributed transmission
The resources for this question come from MAP’s paper on the effect of malaria control on Plasmodium falciparum in Africa between 2000 and 2015 and include unpublished data from intermediary steps in the modelling process.
- Rasters of prevalence means, as published in the paper. There is a raster for each year; the value in each pixel indicates the estimated parasite rate (PR) in children between the ages of two and ten.
- Rasters of PR upper and lower bounds. These rasters provide the upper and lower bounds of the credible interval for each of the rasters of prevalence means.
- Rasters of credible PR – 100MB
- Rasters of residuals. These rasters show the remaining (residual) transmission that has not been accounted for by the covariates already in the model, i.e. the residuals are what remains after the model has accounted for the effects of insecticide-treated nets, access to artemisinin-based combination therapies, and indoor residual spraying with insecticides (see next item).
- Rasters of residuals – 54MB
- Rasters of covariates already used in the models. The way these covariates were used in the paper is explained in the MAP paper Re-examining environmental correlates of Plasmodium falciparum malaria endemicity: a data-intensive variable selection approach. The CSV file below shows how each of the subsequent covariates relates to Table 5 in the paper.
- Covariate Summary Sheet
- EVI – 2,948MB (Note: the size of this file will prevent Google Docs from displaying a preview)
- LGBP_Landcover – 37MB
- LST – 3,345MB (Note: the size of this file will prevent Google Docs from displaying a preview)
- PET – 4MB
- SRTM_Slope – 8MB
- TCB – 70MB
- TCW – 2,000MB (Note: the size of this file will prevent Google Docs from displaying a preview)
- TSI – 783MB
- WorldClim_Precipitation – 4MB
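One way to start on this question is to test whether a candidate explanatory layer is associated with the residuals. The sketch below is a minimal illustration and not MAP's method. It assumes the residual raster and a candidate layer have already been read into equally sized grids (plain nested lists here) and that missing pixels carry a nodata marker. The value `-9999.0` and the function names `paired_valid_pixels` and `pearson_r` are hypothetical; check the raster metadata for the real nodata value.

```python
import math

NODATA = -9999.0  # assumed nodata marker, not taken from the MAP files


def paired_valid_pixels(residuals, covariate, nodata=NODATA):
    """Return (residual, covariate) pairs for pixels valid in both grids."""
    pairs = []
    for res_row, cov_row in zip(residuals, covariate):
        for res, cov in zip(res_row, cov_row):
            if res != nodata and cov != nodata:
                pairs.append((res, cov))
    return pairs


def pearson_r(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    if len(pairs) < 2:
        raise ValueError("need at least two valid pixels")
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant layer")
    return cov_xy / math.sqrt(var_x * var_y)


# Toy 2x2 grids standing in for a residual raster and a candidate layer.
toy_residuals = [[0.10, -0.05], [0.02, NODATA]]
toy_candidate = [[3.0, 1.0], [2.0, 5.0]]
print(pearson_r(paired_valid_pixels(toy_residuals, toy_candidate)))
```

A pixel-wise correlation ignores spatial autocorrelation, so it is better treated as a screening step than as evidence of an effect.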
# Example Question 2: Downscaling areal incidence data
The resources for this question come from data published in MAP’s paper on travel times to cities to assess inequalities in accessibility and unpublished data from MAP’s paper on the effect of malaria control on Plasmodium falciparum in Africa between 2000 and 2015.
It also includes a subset (Senegal, Ethiopia, and Zambia) of our forthcoming database of annual parasite index (API), calculated from Ministry of Health routine surveillance systems.
- API Data for Senegal, Ethiopia, and Zambia. The data comprise:
- Data dictionaries
- A CSV of the calculated API and the source of the raw case figures used in the calculations
- A geometry file for use in mapping software, containing the same data as in the CSV
- API Data for Senegal, Zambia, and Ethiopia – 0.3MB
- Accessibility to Cities 2015 (Zip file, 372 MB)
- Friction Surface 2015 (Zip file, 520 MB)
- Covariate Summary Sheet
- EVI – 2,948MB (Note: the size of this file will prevent Google Docs from displaying a preview)
- LGBP_Landcover – 37MB
- LST – 3,345MB (Note: the size of this file will prevent Google Docs from displaying a preview)
- PET – 4MB
- SRTM_Slope – 8MB
- TCB – 70MB
- TCW – 2,000MB (Note: the size of this file will prevent Google Docs from displaying a preview)
- TSI – 783MB
- WorldClim_Precipitation – 4MB
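The annual parasite index is conventionally expressed as confirmed cases per 1,000 population per year. A simple baseline for downscaling is to share each administrative unit's cases among its pixels in proportion to a weight surface. The sketch below illustrates both ideas under stated assumptions: the function names are hypothetical, the weights can be any non-negative surface you construct (for example from the covariate or accessibility rasters listed above), and the data dictionaries remain the authority on how the API values in the download were calculated.

```python
def annual_parasite_index(confirmed_cases, population):
    """Confirmed cases per 1,000 population for one year."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * confirmed_cases / population


def downscale_cases(total_cases, weights):
    """Share an areal case total among pixels in proportion to weights.

    `weights` is a grid (list of rows) of non-negative numbers covering
    the pixels of one administrative unit. The returned pixel values sum
    back to `total_cases`, up to floating-point rounding.
    """
    if any(w < 0 for row in weights for w in row):
        raise ValueError("weights must be non-negative")
    total_weight = sum(w for row in weights for w in row)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return [[total_cases * w / total_weight for w in row] for row in weights]


# Toy example: 100 cases in a unit of four pixels with made-up weights.
print(annual_parasite_index(confirmed_cases=50, population=10000))
print(downscale_cases(100, [[1, 3], [0, 4]]))
```

Proportional allocation preserves the reported areal totals by construction, which makes it a useful reference point when judging more elaborate downscaling models.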
# Example Question 3: Visualisation of uncertainty
The resources for this question come from MAP’s paper on the effect of malaria control on Plasmodium falciparum in Africa between 2000 and 2015 and include unpublished data from intermediary steps in the modelling process.
- Rasters of prevalence means, as published in the paper. There is a raster for each year; the value in each pixel indicates the estimated parasite rate (PR) in children between the ages of two and ten.
- Rasters of PR upper and lower bounds. These rasters provide the upper and lower bounds of the credible interval for each of the rasters of prevalence means.
- Rasters of credible PR – 100MB
- Tables of national P. falciparum PR with credible intervals
- The rasters for each of the 100 runs (or realisations) of the models, by year. These realisations are the data from which the mean PR and credible interval rasters are produced. There are 1,600 files in total (100 for each of the 16 years from 2000 to 2015 covered by the study), so they have been split by year to make downloading more manageable.
- Realisations for 2000 – 848MB
- Realisations for 2001 – 848MB
- Realisations for 2002 – 848MB
- Realisations for 2003 – 848MB
- Realisations for 2004 – 848MB
- Realisations for 2005 – 848MB
- Realisations for 2006 – 848MB
- Realisations for 2007 – 848MB
- Realisations for 2008 – 848MB
- Realisations for 2009 – 848MB
- Realisations for 2010 – 848MB
- Realisations for 2011 – 848MB
- Realisations for 2012 – 848MB
- Realisations for 2013 – 848MB
- Realisations for 2014 – 848MB
- Realisations for 2015 – 848MB
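Because the realisations are the data behind the mean and credible interval rasters, a per-pixel summary can be recomputed from them. The sketch below shows one way to do this and is not a description of MAP's exact procedure. It assumes a 95% interval taken from the 2.5th and 97.5th percentiles, a hypothetical nodata marker of `-9999.0`, and realisations that have already been read into equally sized grids. Plain nested lists keep the sketch self-contained; an array library would be far more practical for full-size rasters.

```python
NODATA = -9999.0  # assumed nodata marker, not taken from the MAP files


def percentile(sorted_values, pct):
    """Percentile (0-100) of a sorted list, using linear interpolation."""
    if not sorted_values:
        raise ValueError("no values to summarise")
    rank = (len(sorted_values) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    step = sorted_values[upper] - sorted_values[lower]
    return sorted_values[lower] + step * fraction


def summarise_realisations(realisations, lower_pct=2.5, upper_pct=97.5,
                           nodata=NODATA):
    """Per-pixel mean and interval bounds across equally sized grids.

    Returns three grids (mean, lower, upper). A pixel that is nodata in
    any realisation is set to None in all three outputs.
    """
    n_rows = len(realisations[0])
    n_cols = len(realisations[0][0])
    mean_grid, lower_grid, upper_grid = [], [], []
    for r in range(n_rows):
        mean_row, lower_row, upper_row = [], [], []
        for c in range(n_cols):
            values = [grid[r][c] for grid in realisations]
            if any(v == nodata for v in values):
                mean_row.append(None)
                lower_row.append(None)
                upper_row.append(None)
                continue
            values.sort()
            mean_row.append(sum(values) / len(values))
            lower_row.append(percentile(values, lower_pct))
            upper_row.append(percentile(values, upper_pct))
        mean_grid.append(mean_row)
        lower_grid.append(lower_row)
        upper_grid.append(upper_row)
    return mean_grid, lower_grid, upper_grid


# Five toy 1x2 "realisations"; the second pixel is always nodata.
toy = [[[v, NODATA]] for v in (0.5, 0.1, 0.4, 0.2, 0.3)]
print(summarise_realisations(toy))
```

Working from the realisations rather than the published bounds also allows other summaries, such as the per-pixel probability that PR exceeds a chosen threshold.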