Process and analyse
Processing and analysing data is an important stage in the research data lifecycle. Data analysis is the process by which raw data is selected, evaluated, and interpreted to reach meaningful conclusions that other researchers and the public can understand and use. Data processing is the manipulation and conversion of data by computer in order to format or transform raw data into machine-readable data (Britannica Academic).
Work on a copy of the raw data from the collection stage when you reach the analysis stage of your project. Retain a read-only copy of the raw data as a backup against loss or corruption of the working data.
Working data may be stored in other locations for processing and analysis, but it is recommended that you keep your master copy in a secure location such as QUT's Research Data Storage Service.
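The advice above on keeping the raw data read-only and analysing a separate copy can be sketched in a few lines of Python. This is a minimal illustration only: the function names and folder layout are hypothetical, and the checksum step is an added assumption about how you might confirm the copy is intact, not a QUT requirement.

```python
import hashlib
import shutil
import stat
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(raw_file: Path, working_dir: Path) -> Path:
    """Copy raw_file into working_dir, then mark the raw file read-only."""
    working_dir.mkdir(parents=True, exist_ok=True)
    working_file = working_dir / raw_file.name
    shutil.copy2(raw_file, working_file)

    # Confirm the working copy is byte-for-byte identical to the raw file.
    if sha256_of(raw_file) != sha256_of(working_file):
        raise OSError(f"Copy of {raw_file.name} does not match the original")

    # Remove write permission from the raw file so it cannot be edited by accident.
    raw_file.chmod(stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)
    # copy2 also copies permission bits, so make sure the working copy is writable.
    working_file.chmod(working_file.stat().st_mode | stat.S_IWRITE)
    return working_file
```

For example, `make_working_copy(Path("raw/survey.csv"), Path("working"))` would leave `raw/survey.csv` read-only and return the path of an editable copy in `working/`. File permissions protect against accidental edits only; they do not replace a master copy held in secure storage.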
Easy metadata for working data
A simple way to capture metadata during the working phase of your research is to create a readme.txt file: a collection of simple metadata that describes the datasets and improves the long-term usability of the data. Save the readme.txt file in the same folder as the data files within your research storage. Store a copy of your data management plan, ethics approval and other relevant documents there too.
Download a Readme.txt template from Cornell University's guide.
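If you prefer to generate a starting readme.txt automatically, the following Python sketch shows one possible approach. The field names (Title, Creator, Date created, Description) are illustrative assumptions and are not taken from the Cornell template; adapt them to whichever template you adopt.

```python
from datetime import date
from pathlib import Path


def write_readme(data_dir: Path, title: str, creator: str, description: str) -> Path:
    """Write a simple readme.txt into data_dir that lists the files it contains."""
    lines = [
        f"Title: {title}",
        f"Creator: {creator}",
        f"Date created: {date.today().isoformat()}",
        f"Description: {description}",
        "",
        "Files:",
    ]
    # List every file in the folder except the readme itself.
    for item in sorted(data_dir.iterdir()):
        if item.is_file() and item.name != "readme.txt":
            lines.append(f"  {item.name} ({item.stat().st_size} bytes)")

    readme = data_dir / "readme.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

The generated file is only a skeleton. Details such as variable definitions, units and collection methods still need to be written by hand.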
In some cases, metadata can be generated or extracted from digital files automatically. For example, a digital camera records the date, time, exposure setting, and file format. Software programs sometimes allow structured metadata such as title, author, organisation, subjects or keywords to be added via ‘Properties’. View disciplinary metadata standards.
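As a small illustration of automatic extraction, the Python sketch below collects the basic metadata that the file system already holds for any file. It uses only the standard library and reads file-system details (name, size, modification time, format guessed from the file extension). It does not read embedded metadata such as camera exposure settings, which would need a specialised tool. The function name and dictionary keys are hypothetical.

```python
import mimetypes
from datetime import datetime, timezone
from pathlib import Path


def basic_file_metadata(path: Path) -> dict:
    """Return file-system metadata for one file as a dictionary."""
    info = path.stat()
    # guess_type works from the file extension only; it does not inspect content.
    mime_type, _ = mimetypes.guess_type(path.name)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "file_name": path.name,
        "size_bytes": info.st_size,
        "modified_utc": modified.isoformat(),
        "format": mime_type or "unknown",
    }
```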
Documentation
Methods of processing must be rigorously documented to ensure the utility and integrity of the data. Documentation at the processing and analysing stage should include:
- derived data created, with code, algorithm or command file used to create them
- weighting and grossing variables created and how they should be used
- data list describing cases, individuals or items studied, for example for logging qualitative interviews
- documentation of cases or records and variables for all structured, tabular data
- other documentation, which may be contained in user guides, reports, publications, working papers and laboratory books.
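One way to document derived data together with the code used to create it, as the first item in the list above recommends, is to append an entry to a processing log each time a derived file is produced. The Python sketch below is a minimal example under stated assumptions: the log format (one JSON object per line), the field names and the function names are all hypothetical choices, not a prescribed standard.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def log_derivation(log_file: Path, source: Path, derived: Path,
                   script: str, note: str) -> dict:
    """Append one record describing how a derived file was created."""
    entry = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "source": source.name,
        "source_sha256": file_sha256(source),
        "derived": derived.name,
        "derived_sha256": file_sha256(derived),
        "script": script,  # code, algorithm or command file used
        "note": note,      # free-text description of the processing step
    }
    # Append mode keeps earlier entries, so the log builds up a history.
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry
```

Because each entry stores checksums for both the source and the derived file, a later reader can check whether either file has changed since the step was logged.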