Datasets for download: World Bank, IADB, and EuropeAid
Here you can find the full datasets collected on development projects, public tenders, and contracts for three major donor agencies: the World Bank, the Inter-American Development Bank, and EuropeAid. The project was supported by the British Academy/UK Department for International Development Anti-Corruption Evidence Programme. The final datasets are the result of a joint effort by the University of Sussex, the Government Transparency Institute, and Datlab. In addition to republishing structured data gathered from official source websites, the datasets contain corruption risk red flags developed by the research team.
About the project
Development aid donors are under increasing pressure to ensure accountability and transparency in the allocation of funds, yet they have only blunt tools for monitoring whether recipient governments use aid for the agreed purposes. To address this problem, we developed a methodology for analysing big data from major aid agencies to calculate more accurate and targeted indicators of corruption in aid-funded procurement. We used these indicators to explore how the risks of corruption in aid allocation are affected by (1) different institutional control mechanisms and (2) the socio-political context in recipient countries. Our analysis is based on a multi-method research design that combines a quantitative analysis of procurement data across the developing world with four in-depth, qualitative case studies (Ghana, Tanzania, Peru, and Vietnam). We hope our findings will help guide donor agencies in developing more efficient delivery and monitoring mechanisms, and that our data analysis tools can be incorporated into donors’ evaluation frameworks in real time.
Data and documentation
Data (mirroring source data): complete json, flat csv for key variables
Data (analysis data files with red flags): EuropeAid dta, WB dta, IADB dta
Description of data collection and red flags calculations: pdf
Extract template (describing which variables are extracted from full json files into csv outputs): xlsx
Data scraping, parsing, and cleaning code: https://github.com/DatlabDasData/dfid
Combined project and procurement data structure (describing the structure of the json database): xlsx
Data validation report (describing data quality and the rate of missing data): pdf
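As a rough illustration of how an extract template can map nested json records onto flat csv columns, and how a missing-data rate can be computed for a column, here is a minimal Python sketch. The sample records, the field names, and the dotted-path template format are all assumptions made for illustration; the actual variables and paths are defined in the xlsx files listed above, and the project's own code is in the GitHub repository.

```python
import csv
import io
import json

# Hypothetical records; real field names are defined in the data structure xlsx.
SAMPLE_JSON = """
[
  {"id": "T1", "buyer": {"name": "Ministry A", "country": "GH"},
   "lots": [{"bidsCount": 1}]},
  {"id": "T2", "buyer": {"name": "Agency B"},
   "lots": [{"bidsCount": 4}]}
]
"""

# Hypothetical extract template: csv column name -> dotted path into the json.
TEMPLATE = {
    "tender_id": "id",
    "buyer_name": "buyer.name",
    "buyer_country": "buyer.country",
    "first_lot_bids": "lots.0.bidsCount",
}


def get_path(record, path):
    """Follow a dotted path through dicts and lists; return None if a step is missing."""
    current = record
    for key in path.split("."):
        if isinstance(current, list):
            if not key.isdigit() or int(key) >= len(current):
                return None
            current = current[int(key)]
        elif isinstance(current, dict):
            if key not in current:
                return None
            current = current[key]
        else:
            return None
    return current


def flatten(records, template):
    """Build one flat row per record, with one column per template entry."""
    return [
        {column: get_path(record, path) for column, path in template.items()}
        for record in records
    ]


def missing_rate(rows, column):
    """Share of rows in which the given column has no value."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if row[column] is None) / len(rows)


def to_csv(rows, template):
    """Serialise flat rows to csv text; missing values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(template))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = flatten(json.loads(SAMPLE_JSON), TEMPLATE)
csv_text = to_csv(rows, TEMPLATE)
print(csv_text)
print("missing buyer_country:", missing_rate(rows, "buyer_country"))
```

Keeping the column-to-path mapping in a separate template, rather than hard-coding it, means the list of extracted variables can change without touching the flattening logic.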