Research Data Management

Structuring Folders

When working as a group or preparing to share a dataset, a clear folder structure is essential. But what makes a good folder structure?

First of all, the structure should be agreed on and adopted by all participants, so that it is coherent and understandable to everyone. Folder names should always be short and explicit: users should be able to tell what a folder contains without opening it. If your folder structure is complex due to the scope of the project, include some documentation (typically a readme.txt) at the root of your folders to explain how they are organised.
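One way to keep such a readme in step with the folders is to generate the folder listing rather than type it by hand. The following is a minimal Python sketch, not a prescribed tool: the function name `folder_tree_lines` and the two-space indent are assumptions made for illustration.

```python
from pathlib import Path


def folder_tree_lines(root, indent="  "):
    """Return one line per folder under root, indented by its depth."""
    root = Path(root)
    lines = [root.name + "/"]
    # Sorting paths keeps each folder directly below its parent.
    for folder in sorted(p for p in root.rglob("*") if p.is_dir()):
        depth = len(folder.relative_to(root).parts)
        lines.append(indent * depth + folder.name + "/")
    return lines
```

The returned lines can be joined with newlines and pasted into readme.txt, followed by a sentence or two describing what each folder is for.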

The hierarchy of your folders should be consistent and logical. Go from a general, high-level folder (starting with a single folder for the project, named after the project or its acronym) to more specific lower-level folders. Your structure should be neither too deep nor too shallow. For a typical project this means 3-4 levels, though small projects may need fewer and very large ones more.
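If you want to check how deep an existing project has grown, a short script can count the levels. This is a minimal Python sketch under stated assumptions: the function name `max_folder_depth` is hypothetical, and the project folder itself is counted as level 1.

```python
import os
from pathlib import Path


def max_folder_depth(root):
    """Return the number of folder levels under root, counting root as level 1."""
    root = Path(root)
    deepest = 1
    for dirpath, _dirnames, _filenames in os.walk(root):
        # Number of path components between root and this folder.
        relative = Path(dirpath).relative_to(root)
        deepest = max(deepest, 1 + len(relative.parts))
    return deepest
```

A result well above 4 on a mid-sized project may be a sign that some levels could be merged; what counts as too deep remains a judgement for the project team.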

For research projects, one option is to organise the folder levels based on research activity, data type, and kind of contents (publication, documentation, deliverables, etc.). But each project has its own needs and you will need to find out what works on a case-by-case basis. As part of your preservation strategy, it can also be useful to define temporary "temp" folders from which data can be safely deleted after usage.

Things to Avoid

  • Do not use a generic "current stuff" folder
  • Do not create researcher-specific folders within a project: folders are about the contents, not the authors
  • Make sure you do not have overlapping categories or folder redundancy (similar folders in different places)
  • Do not create copies of files in different folders; if you need this, use shortcuts to keep a single reference file
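Copies of the same file in different folders are hard to spot by eye. The sketch below shows one possible way to find them in Python by comparing file contents; the function name `find_duplicate_files` is an assumption, and each file is read fully into memory, so this suits modest file sizes only.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_files(root):
    """Group files under root by content hash; return groups of two or more."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        # Skip symbolic links: a shortcut to the reference file is not a copy.
        if path.is_file() and not path.is_symlink():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Each returned group lists files with identical contents, so you can decide which one to keep as the single reference file.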

Example

This example from the UK Data Service presents one way your project folders could be structured:

  • The first folder level is the project (ENBIOproject)
  • The second separates the data from the documentation
  • A further level is used to distinguish between different data types
  • The final level divides items based on the research activity

You could do it the other way around: research activities could be the second-level folder, in which case they would contain their own data and documentation folders. The only requirement is that your structure is clear and coherent with the project.
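The four levels described above can be created in one step with a short script. This is a minimal Python sketch: only the project name ENBIOproject comes from the example, while the "Data" and "Documentation" folder names, the data types, and the research activities listed in the code are hypothetical placeholders to replace with your own.

```python
from pathlib import Path

# Hypothetical names for illustration; only "ENBIOproject" is from the example.
DATA_TYPES = ["Images", "Database"]
ACTIVITIES = ["Interviews", "Survey"]


def build_structure(base):
    """Create project / Data / data type / activity folders, plus Documentation."""
    project = Path(base) / "ENBIOproject"
    created = []
    for data_type in DATA_TYPES:
        for activity in ACTIVITIES:
            folder = project / "Data" / data_type / activity
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    docs = project / "Documentation"
    docs.mkdir(parents=True, exist_ok=True)
    created.append(docs)
    return created
```

Calling `build_structure(".")` creates the folders in the current directory and leaves any existing ones untouched. To put research activities at the second level instead, reorder the path components inside the loop.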