Publish and share

Why share data?

Publishing your data and citing its location in published works allows others to replicate and validate your results and to check their accuracy. Some research data would be impossible to collect again, e.g. recordings of a specific seismic event. Sharing data improves the scientific record and strengthens scientific integrity. The Australian Code for the Responsible Conduct of Research advises that researchers should share their data wherever possible. To support best practice in sharing data, QUT has adopted the F.A.I.R data principles to make research data Findable, Accessible, Interoperable and Reusable. The benefits of sharing data publicly include:

  • greater attention to the research and to the researchers who created the data
  • citation counts increased by up to 69% (see study here)
  • increased collaboration and networking opportunities between researchers
  • greater opportunities for grant funding as grant funding bodies encourage data sharing.

Durable data formats

Use standard, interchangeable formats that most software can interpret, or convert completed data to these formats at the end of the project. After conversion, check the data for errors or changes. Acceptable formats for long-term access to data include the following options.

Data type: Acceptable formats for sharing, reuse and preservation
Containers: TAR, GZIP, ZIP
Computer aided design: DWF, DXF, DWG, DWS, DWT, X3D, STEP, STP
Databases: XML, CSV
Geospatial: SHP, DBF, GeoTIFF, NetCDF
Moving images: MOV, MPEG, AVI, MXF
Sounds: WAV, AIFF, MP3, MXF, FLAC
Statistics: ASCII, DTA, POR, SAS, SAV, STATA, SPSS
Still images: TIFF, JPEG 2000, PDF, PNG, GIF, BMP, RAW
Tabular data: CSV, XLS, XLSX, ODS
Text: XML, PDF/A, HTML, ASCII, UTF-8, RTF, NUD*IST, NVivo, ATLAS.ti
Web archive: WARC

Based on information from Stanford University Libraries (2017) and UK Data Archive (2017).
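
The advice to check data after conversion can be sketched in code. The example below is a minimal illustration only, assuming the source data is already held in memory as rows and is being converted to CSV; the function names (write_csv, read_csv, conversion_differences) and the sample values are hypothetical and not part of any QUT tool. It writes the rows to a CSV file, reads the file back, and lists any differences in the header, the row count or the row values.

```python
import csv
import tempfile
from pathlib import Path


def write_csv(rows, header, path):
    """Write rows (a list of lists) to a UTF-8 CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def read_csv(path):
    """Read a CSV file back as (header, rows); every value is returned as a string."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        return header, [row for row in reader]


def conversion_differences(rows, header, path):
    """List differences between the source rows and the converted CSV file."""
    new_header, new_rows = read_csv(path)
    problems = []
    if new_header != list(header):
        problems.append("header changed")
    if len(new_rows) != len(rows):
        problems.append(f"row count changed: {len(rows)} -> {len(new_rows)}")
    for i, (old, new) in enumerate(zip(rows, new_rows)):
        # CSV stores text only, so compare the string form of each source value.
        if [str(value) for value in old] != new:
            problems.append(f"row {i} changed")
    return problems


# Hypothetical sample data, including a comma and a line break inside a field.
header = ["site", "depth_m", "note"]
rows = [["A1", 12.5, "clear, calm"], ["B2", 7, "line one\nline two"]]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "samples.csv"
    write_csv(rows, header, path)
    print(conversion_differences(rows, header, path))  # []
```

An empty list means the converted file matches the source at the level of text values. A check like this would not detect loss of numeric precision or formatting that the target format cannot represent, so a manual spot check is still worthwhile.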

Copyright and research data

Compilations of data are protected by copyright law as 'literary works', provided the compilation involved intellectual effort, was not copied from another source, and supplies 'intelligible information' (i.e. is human/machine readable). Research students own the copyright of their works unless there is a written agreement stating otherwise.

Image: copyright symbol. Image attribution: Mike Seyfang (2005)

Licensing data

If you decide to share your data when you have completed your research, you may use a Creative Commons (CC) licence to specify the conditions that apply to reuse. When you apply a CC licence to your data, you retain ownership of the data but license others to use the work on liberal terms. The four different licence terms are:

  • Attribution (BY): You must always provide credit to the original author.
  • Share-Alike (SA): If you remix, transform, or build upon the material, you must distribute your contributions under the same licence as the original.
  • Non-Commercial (NC): You may not use the material for commercial purposes.
  • No-Derivatives (ND): You may not distribute modified versions of the work.
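
Licence codes such as CC-BY-ND are built by joining these terms, so a code can be read off term by term. The sketch below is illustrative only: the names LICENCE_TERMS, licence_conditions and allows_commercial_use are hypothetical, the condition text is paraphrased from the list above, and only codes built from the four terms are handled. It also rejects codes that combine SA with ND, since no Creative Commons licence combines those two terms.

```python
# Conditions for each licence term, paraphrased from the list above.
LICENCE_TERMS = {
    "BY": "credit the original author",
    "SA": "distribute adaptations under the same licence as the original",
    "NC": "do not use the material for commercial purposes",
    "ND": "do not distribute modified versions of the work",
}


def licence_conditions(code):
    """Return the reuse conditions implied by a code such as 'CC-BY-ND'."""
    parts = code.upper().split("-")
    if parts[0] != "CC":
        raise ValueError(f"not a Creative Commons code: {code!r}")
    terms = parts[1:]
    unknown = [term for term in terms if term not in LICENCE_TERMS]
    if unknown:
        raise ValueError(f"unknown licence term(s): {unknown}")
    if "SA" in terms and "ND" in terms:
        raise ValueError("SA and ND cannot be combined")
    return [LICENCE_TERMS[term] for term in terms]


def allows_commercial_use(code):
    """Commercial reuse is permitted unless the code contains the NC term."""
    return "NC" not in code.upper().split("-")[1:]


for condition in licence_conditions("CC-BY-ND"):
    print(condition)
print(allows_commercial_use("CC-BY-ND"))  # True
```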

Image attribution: Creative Commons

Case study

QUT postdoctoral researcher Christopher Noune has assigned a CC-BY-ND licence to his dataset, which means that others may use, copy and redistribute his work, even for commercial purposes, so long as they give appropriate credit. If they remix, transform or build upon the work, they may not distribute the modified material without seeking permission from Christopher.

See: Noune, Christopher; Hauxwell, Caroline. (2017): HaSNPV-AC53 Genotyping and Abundance Datasets. [Queensland University of Technology]. CC-BY-ND

http://dx.doi.org/10.4225/09/595f00367d2cf

Check which terms apply to the works you want to use via Creative Commons Australia's licence chooser.

Combining and analysing existing data from multiple sources is common practice. The conditions attached to a CC licence only apply when an entire CC-licensed dataset, or a substantial part of it, is reused. You should not apply a CC licence to a dataset if you are not the copyright owner or if the data contains secret, private or confidential information.

Use the DMP Tool to document the decisions relating to sharing and licensing your data at the conclusion of your research. For more information on licensing data, refer to the Australian National Data Service (ANDS) guide to Copyright, data and licensing.

Data repositories

Consider sharing your completed datasets, or subsets of them, with other researchers, publishers and the public via QUT's data repository, QUT Research Data Finder, or an open access repository such as Figshare, Dryad or GitHub. Find other repositories at the Registry of Research Data Repositories.

In QUT Research Data Finder, you can list the following metadata about your dataset to make it Findable, Accessible, Interoperable and Reusable (F.A.I.R):

  • Descriptive: metadata required for discovery and assessment of the collection, including title, contributors, subject or keywords, study description, a list of publications the dataset contributes to, and the location and dates of the study.
  • Provenance: metadata about the data source, instruments used to collect or generate the data, version tracking and transformations (often including the steps that were applied to produce the data product).
  • Technical & Structural: metadata about file types, software, file size and the contents of components, e.g. variable names. It also covers how the data or its database is configured, how it relates to other data, and how components within a set relate to each other.
  • Rights & Access: metadata to enable access and to set licensing or usage rules, e.g. negotiated access by contacting the owner, or open access via a Creative Commons licence.
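
As a rough illustration, the four metadata groups can be pictured as a structured record. The sketch below is not the QUT Research Data Finder schema: the group and field names (descriptive, provenance, technical_structural, rights_access, and the fields inside them) are assumptions made for the example, and the values are taken from the dataset citation in the licensing case study earlier in this guide. Groups that this guide gives no details for are left out, and the helper reports them as missing.

```python
import json

# Hypothetical group names mirroring the four metadata types listed above.
REQUIRED_GROUPS = (
    "descriptive",
    "provenance",
    "technical_structural",
    "rights_access",
)


def missing_groups(record):
    """Return the metadata groups that are absent or empty in a record."""
    return [group for group in REQUIRED_GROUPS if not record.get(group)]


record = {
    "descriptive": {
        "title": "HaSNPV-AC53 Genotyping and Abundance Datasets",
        "contributors": ["Christopher Noune", "Caroline Hauxwell"],
        "year": 2017,
    },
    "rights_access": {
        "licence": "CC-BY-ND",
        "identifier": "http://dx.doi.org/10.4225/09/595f00367d2cf",
    },
    # Provenance and technical details are not given in this guide,
    # so those groups are deliberately left out of the example.
}

print(missing_groups(record))  # ['provenance', 'technical_structural']
print(json.dumps(record, indent=2))
```

Listing the missing groups before depositing a dataset is one way to confirm that each of the four metadata types has been considered.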

Case study

See how PhD candidate MD. Lifat Rahi has applied the F.A.I.R data principles to his published research dataset.

Image: FAIR data principles - applied.