Research Data Management

A guide on managing, organising, sharing and preserving research data

Questions You Should Ask Yourself

Before using an existing dataset (i.e. secondary data), you should ask yourself the following questions:

  • Is the data documentation good enough?
    Data without appropriate documentation could be missing essential information, such as when, where, and how it was collected, as well as methodological choices you may not be aware of and that could cause issues with your research.
  • Can the source of the data be trusted?
    If the documentation of the data is complete, you should be able to identify and evaluate its source and its possible biases.
  • Can the data actually support my research question?
    Sometimes open data does not contain exactly what you need, and no amount of wrangling will change that.
  • Is the dataset licence compatible with my goal?
    You may not be able to mix and match datasets for copyright reasons. Before doing any work with them, check that they are shared under compatible open licences. CC0 and CC BY datasets can generally be reused and combined freely (CC BY requires attribution), but more restrictive licences, such as those with NonCommercial, NoDerivatives or ShareAlike conditions, can cause problems.
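
As a rough illustration of the licence check, the minimal sketch below flags datasets for manual review when their licence is not on a short allowlist. The dataset names are hypothetical, the licences are written as SPDX identifiers, and the allowlist is an assumption made for this example: it is a first filter only and does not determine legal compatibility.

```python
# Licences treated as unproblematic in this sketch (an assumption, not legal advice).
PERMISSIVE = {"CC0-1.0", "CC-BY-4.0"}


def licences_needing_review(datasets):
    """Return the names of datasets whose licence is not in PERMISSIVE."""
    return sorted(
        name for name, licence in datasets.items() if licence not in PERMISSIVE
    )


# Hypothetical datasets you plan to combine, mapped to their licences.
datasets = {
    "survey_a": "CC0-1.0",
    "census_b": "CC-BY-4.0",
    "panel_c": "CC-BY-NC-SA-4.0",
}

print(licences_needing_review(datasets))  # prints ['panel_c']
```

Any dataset the function returns should have its licence terms read in full before you combine it with other data.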

The IHEID List of Databases

Your librarians curate a Google Sheets list of databases you can use to search for open data. It is divided into several tabs:

  • Data Search Engines
  • Central and Development Bank Data
  • International Organisations
  • National Governments
  • Microdata
  • Digital Archives
  • Other Data
  • Subscriptions (these require access, which you can request from our institution)
  • APIs

You can consult it here. If you have comments or suggestions for additional databases, please contact the Library.

Data Collection Through APIs

Application Programming Interfaces (APIs) let you query databases and systems programmatically, with a tool or script of your choice, so that you can run specific searches that an open data portal does not necessarily offer. Some interesting APIs are listed in our "List of databases" above.
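
To give an idea of what working with an API looks like, here is a minimal Python sketch that uses only the standard library. The endpoint URL, the query parameters (`country`, `indicator`) and the shape of the JSON response are all hypothetical: every real API documents its own, so treat this as a pattern to adapt rather than a working client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://api.example.org/v1/observations"


def build_query_url(base_url, **params):
    """Return base_url with the parameters encoded as a query string."""
    return f"{base_url}?{urlencode(sorted(params.items()))}"


def parse_observations(payload):
    """Convert a JSON payload into (year, value) pairs, skipping missing values.

    Assumes a response shaped like {"data": [{"year": ..., "value": ...}]}.
    """
    records = json.loads(payload)["data"]
    return [(r["year"], r["value"]) for r in records if r["value"] is not None]


def fetch_observations(base_url, **params):
    """Send the request and parse the response (requires network access)."""
    with urlopen(build_query_url(base_url, **params), timeout=30) as response:
        return parse_observations(response.read().decode("utf-8"))


# Building the request URL does not need a network connection.
print(build_query_url(BASE_URL, country="CH", indicator="population"))
```

Against a real API, you would replace `BASE_URL` and the parameter names with those given in its documentation, and check its terms of use and rate limits before collecting data at scale.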

If you are interested in social media data collection, check "APIs for social scientists: A collaborative review".