Managing and organising data

With a little planning at the start of your research project, you can make your data files easy to store, find and reuse. The main elements of good data management throughout your project are data storage and backup, file naming and versioning, and documentation. If you work with sensitive data, their secure management should be addressed from the very start of the project. Please see the information below for further details on organising data throughout your research.

Data storage, backup and security

During your research project you will need to store your research data so that they are secure and backed up regularly, but easily accessible to those who are authorised. You should have several copies of your files, ideally in different places. If possible, these backup copies should be taken automatically. The University provides storage services to help you do this.
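
Keeping several copies is only useful if the copies match the originals. The following is a minimal Python sketch, not a University tool, of one way to check a backup folder against the original by comparing SHA-256 checksums. The function names and any folder paths you pass in are illustrative assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(original_dir, backup_dir):
    """List files (relative paths) that are missing from, or differ in, the backup."""
    original_dir, backup_dir = Path(original_dir), Path(backup_dir)
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue  # skip folders
        relative = source.relative_to(original_dir)
        copy = backup_dir / relative
        if not copy.is_file() or sha256_of(copy) != sha256_of(source):
            problems.append(relative)
    return problems
```

An empty list from find_mismatches means every file in the original folder has an identical copy in the backup folder. The check runs in one direction only, so extra files that exist only in the backup are not reported.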

UoB storage services

  • Research Data Store (RDS): The RDS is a central storage service for ‘active’ research data. It is highly resilient and is hosted in two data centres on campus. Space on the RDS is allocated to projects and managed accordingly. Applications should be submitted by Principal Investigators (PIs) via the Service Desk. Once an application is approved, up to 3TB of storage is allocated to the project by default, and additional capacity may be purchased. More details on how to request access to the RDS can be found on the IT RDS pages.
  • BEAR DataShare: BEAR DataShare is a file synchronisation and sharing service provided by IT Services. The service allows you to securely save and sync your files with your colleagues and partners anywhere in the world, from any device. It provides 25GB storage capacity per user. It can be requested through the IT Service Desk (click the ‘Make a Request’ icon and choose ‘Request a BEAR Datashare’).

File naming and versioning

Agreeing on a file naming convention at the beginning of your project will help to provide consistency. This will make it easier to find and correctly identify your files and avoid version control problems when working on files collaboratively.

Tips on creating file names

  • Name folders meaningfully.
  • Be as concise with your names as you can. Use names that describe content. Keep names short, if possible no more than 25 characters.
  • Consider a standard vocabulary for file names used by everyone on your project.
  • Version your files, for example by using a 'revision' numbering system. Major changes to a file can be indicated by numbers: v01 would be the first version, v02 the second version. For the final version, substitute the word FINAL for the version number.
  • YYYY-MM-DD dates at the beginning of the file or folder name allow chronological sorting.
  • Be consistent and stick to your naming scheme.
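
The tips above can be combined into a single naming scheme. The following Python sketch shows one possible scheme, assuming a date_description_version pattern with underscores between the parts; the function name build_file_name and the example descriptions are hypothetical and not part of any University guideline.

```python
import re
from datetime import date


def build_file_name(description, version, created, extension, final=False):
    """Return a name such as 2024-03-15_survey_v02.csv."""
    # Lower-case the description and turn runs of other characters into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    # Two-digit version number, or the word FINAL for the final version.
    suffix = "FINAL" if final else f"v{version:02d}"
    # The YYYY-MM-DD date goes first so that names sort chronologically.
    return f"{created.isoformat()}_{slug}_{suffix}.{extension}"


names = [
    build_file_name("Survey", 2, date(2024, 3, 15), "csv"),
    build_file_name("Survey", 1, date(2024, 1, 9), "csv"),
]
# Plain alphabetical sorting puts the January file before the March file.
print(sorted(names))
```

Generating names with one shared function, instead of typing them by hand, is one way to keep everyone on a project to the same scheme.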

IT Services have produced guidelines on File Naming Conventions (PDF – 396kb).

Renaming files

There are occasions when you may want to rename a large number of files at once, for example digital images such as photographs, whose default file names are simply numbers. You can do this in Windows Explorer (Windows Vista, 7, 8) or with dedicated batch renaming software.
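
If you are comfortable with a little scripting, batch renaming can also be done in a few lines of Python. The sketch below is an illustration only: the function names, the example prefix and the "*.jpg" pattern are assumptions. It first builds a list of old and new names so you can review it, and renames the files only in a second step.

```python
from pathlib import Path


def plan_renames(folder, prefix, pattern="*.jpg"):
    """Pair each matching file with a new, numbered name (nothing is renamed here)."""
    # Files are numbered in alphabetical order of their current names.
    files = sorted(Path(folder).glob(pattern))
    return [
        (old, old.with_name(f"{prefix}_{index:03d}{old.suffix.lower()}"))
        for index, old in enumerate(files, start=1)
    ]


def apply_renames(plan):
    """Rename the files, refusing to overwrite anything that already exists."""
    for old, new in plan:
        if new.exists():
            raise FileExistsError(f"{new} already exists")
        old.rename(new)


# Example use (the folder name is hypothetical):
# plan = plan_renames("photos", "2024-03-15_fieldtrip")
# for old, new in plan:
#     print(old.name, "->", new.name)
# apply_renames(plan)
```

Try this on a copy of your files first, as renaming cannot easily be undone.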

Documentation

Good documentation makes material understandable, verifiable, and reusable. Making data available to others without providing any context does not make them usable or useful. If you or others return to a dataset some time later, documentation is key to making sense of it. Data documentation includes information about when, why, and by whom the data were created, what methods were used, explanations of acronyms or jargon, as well as the content and structure of the data. Documentation should thus exist at both the project or study level and the data level. A readme file for your dataset should include, for example:

  • An inventory of the major parts of the dataset
  • Details of any particular operating system used to create (and thus re-use) the data
  • Details of any particular software required to create (and thus re-use) the data
  • Information about any other dependencies (e.g. particular libraries used in the code)
  • For tabular data, descriptions of column headings and row labels, any data codes used and units of measurement
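
The items above can be collected into a plain-text readme with a small script, which makes it easier to keep the file up to date as the dataset changes. This Python sketch is one possible layout, not a required template; the function name build_readme, the section headings and all parameter names are assumptions.

```python
def build_readme(title, inventory, operating_system, software,
                 dependencies, columns, codes):
    """Return plain readme text covering the items listed above."""
    lines = [title, "", "Inventory"]
    # inventory maps each major part of the dataset to a short summary.
    lines += [f"  - {name}: {summary}" for name, summary in inventory.items()]
    lines += ["", f"Operating system: {operating_system}"]
    lines += [f"Software: {', '.join(software)}"]
    lines += [f"Other dependencies: {', '.join(dependencies)}"]
    lines += ["", "Columns"]
    # columns is a list of (heading, meaning, unit of measurement) tuples.
    lines += [f"  - {name}: {meaning} (unit: {unit})" for name, meaning, unit in columns]
    lines += ["", "Data codes"]
    # codes maps each code used in the data to its meaning.
    lines += [f"  - {code}: {meaning}" for code, meaning in codes.items()]
    return "\n".join(lines) + "\n"
```

The returned text can be written to a file such as README.txt in the top folder of the dataset.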

You might standardise some of the information by using a metadata schema as this makes sharing data in a repository easier at the end of your project. Browse the Metadata Standards Directory for a suitable schema for your discipline.

Handling sensitive data

Sensitive data are data relating to people, rare or endangered animal or plant species, and data generated or used under a commercial research funding agreement. If you work with human participants and/or animals, make sure your research project is approved by the University Ethics Review.

Make sure your analysis, storage and data sharing options are covered by the consent forms signed by your participants or covered in your agreement with commercial partners.

Anonymise personal data early. Guidelines on data anonymisation are available from the UK Data Service. Sensitive data might be shared in a de-identified form or made available under certain terms of re-use.

If you intend to use online questionnaires, we recommend Qualtrics, which is free to use for staff and students in the College of Social Sciences and the College of Arts and Law.

For further questions and help, please contact research-data@contacts.bham.ac.uk