
Data and Statistics for Social Sciences: Citation & sharing

Data sharing & availability statement

Some funders now require data sharing (see the Research Data Oxford (RDO) funder requirements page), and in some subject areas it has become increasingly common to make data available to everyone.

The basic data sharing policy encourages authors to deposit data in a suitable repository, cite it, and include a data availability statement, sometimes referred to as a ‘data access statement’. This statement is crucial in signposting where the data associated with a paper are available and under what conditions they can be accessed, including links (where applicable) to the dataset.

Some publishers offer data availability statement templates (see Taylor and Francis, Elsevier, Springer Nature); others give only general advice on data availability (see PLOS Genetics).

Principles of data citation

Data citation is rapidly emerging as a key practice supporting data preservation, access and reuse, as well as sound scholarship. The motivation to cite datasets arises from a recognition that data generated and archived in the course of research are just as valuable to the ongoing academic discourse as papers and monographs. This view is shared by research institutions, funding councils and a growing number of publishers.

The Data Citation Synthesis Group (FORCE11), a cross-team committee drawing on the perspectives of the various existing data citation initiatives, published a consolidated set of principles, the Joint Declaration of Data Citation Principles. It is a formal statement that pulls together practices in common use in the research and publishing arenas. The declaration encompasses eight principles that stress the importance and legitimacy of data, the need to give scholarly credit to contributors and the importance of data as evidence. Cited data should have a unique and persistent identifier such as a Digital Object Identifier (DOI), which is the equivalent of an ISBN for data; DOIs are issued by data repositories such as ORA-Data. Go to Research Data Oxford for more details.

For example, a citation for an existing ORA-Data deposit:

Tomkins, D. & Jackson, A. (2015) “Ephemera and the British Empire - colour illustrations”. Oxford University Research Archive, https://doi.org/10.5287/bodleian:xp68kg235

or a citation from the ESRC’s Economic and Social Data Service (ESDS):

University of Essex. Institute for Social and Economic Research and National Centre for Social Research, Understanding Society: Wave 1, 2009-2010 and Wave 2, Year 1 (Interim Release), 2010 [computer file]. 3rd Edition. Colchester, Essex: UK Data Archive [distributor], February 2012. SN: 6614, http://dx.doi.org/10.5255/UKDA-SN-6614-3

In short, when citing data, include the author(s), title, year of deposit, repository or distributor, and the DOI (the standard persistent identifier) or other access location. Make sure your citation includes enough information to find the data easily.
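The elements listed above can be assembled programmatically. The following is a minimal sketch in Python, not an official citation style: the `DatasetCitation` class and the `doi_to_url` and `format_citation` functions are hypothetical names introduced for illustration, and the output layout simply mirrors the ORA-Data example shown earlier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetCitation:
    """The minimum elements of a data citation (hypothetical structure)."""
    authors: List[str]
    title: str
    year: int                  # year of deposit
    publisher: str             # repository or distributor
    doi: Optional[str] = None
    url: Optional[str] = None  # other access location if there is no DOI


def doi_to_url(doi: str) -> str:
    """Return a DOI as an https://doi.org/ link, dropping any older prefix."""
    doi = doi.strip()
    prefixes = ("https://doi.org/", "http://doi.org/",
                "https://dx.doi.org/", "http://dx.doi.org/", "doi:")
    for prefix in prefixes:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return "https://doi.org/" + doi


def format_citation(c: DatasetCitation) -> str:
    """Build one citation string: Author(s) (Year) "Title". Publisher, location."""
    if not c.authors:
        raise ValueError("a data citation needs at least one author")
    if len(c.authors) > 1:
        authors = ", ".join(c.authors[:-1]) + " & " + c.authors[-1]
    else:
        authors = c.authors[0]
    # Prefer the DOI; fall back to another access location if one is given.
    location = doi_to_url(c.doi) if c.doi else (c.url or "")
    text = f'{authors} ({c.year}) "{c.title}". {c.publisher}'
    return f"{text}, {location}" if location else text


example = DatasetCitation(
    authors=["Tomkins, D.", "Jackson, A."],
    title="Ephemera and the British Empire - colour illustrations",
    year=2015,
    publisher="Oxford University Research Archive",
    doi="10.5287/bodleian:xp68kg235",
)
print(format_citation(example))
```

If a journal or funder prescribes a particular style, adapt the final string layout in `format_citation` rather than the list of elements.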

Data citation guides

An exhaustive guide from the Digital Curation Centre (DCC), How to Cite Datasets and Link to Publications, discusses data citation in great detail, with information for researchers and data repositories.

ORA-Data, the part of the Oxford University Research Archive designed to help researchers access, create, archive, share and cite research data, advises researchers to follow these guidelines.

Data archives may also provide guidelines on how to cite their data. Sometimes this information appears on individual dataset pages. More often, the website or database where you found the data explains how to cite it in its FAQs, "About" page, or "How to Use" information.

Formatting citations

The APA Style Guide provides a recommended citation format for datasets, with examples, but other style guides, including MLA and Chicago, do not, so you will have to create your own.

You can try the CrossCite Citation Generator, which generates citations in many styles; all you need to provide is the DOI of a dataset.
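The same DOI-to-citation lookup can also be scripted. The sketch below assumes that the DOI's registration agency supports DOI content negotiation for formatted citations (Crossref and DataCite both document this) and that the requested style name is one the service recognises. The function names `build_citation_request` and `fetch_citation` are hypothetical, and the call needs network access.

```python
import urllib.request
from urllib.parse import quote


def build_citation_request(doi: str, style: str = "apa",
                           locale: str = "en-GB") -> urllib.request.Request:
    """Prepare a request asking the DOI resolver for a formatted citation."""
    url = "https://doi.org/" + quote(doi, safe="/:")
    # The Accept header selects a formatted bibliography entry instead of
    # the usual redirect to the dataset's landing page.
    accept = f"text/x-bibliography; style={style}; locale={locale}"
    return urllib.request.Request(url, headers={"Accept": accept})


def fetch_citation(doi: str, style: str = "apa", timeout: float = 10.0) -> str:
    """Send the request and return the citation text (requires network)."""
    request = build_citation_request(doi, style)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


if __name__ == "__main__":
    # DOI taken from the ORA-Data example earlier in this guide.
    print(fetch_citation("10.5287/bodleian:xp68kg235"))
```

Always check the returned text against your style guide: the result is only as complete as the metadata registered for the DOI.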

For citation styles that do not have a specific dataset format, you can base your citation on the closest equivalent format: if the dataset is online, use a format for online items, and if it was created by multiple "authors", editors or researchers, use a format for edited works or items with more than one author. Alternatively, base your citation on the general reference format for your style guide.

If a grant application, scholarly journal, or instructor has strict requirements for citations, they'll usually make this clear, and may provide examples.  If they don't offer an example for citing the kind of dataset you are using, don't be afraid to ask.

MANTRA

The MANTRA research data management training course offers an interactive training module which introduces the concepts of documentation and metadata, including:
  • Why documenting your research data is important, and why documentation is important for using others’ data,
  • Why and when to use metadata,
  • The importance of citing data, and how to do it.

Cite Them Right Online

This referencing resource helps you to reference just about any source and to understand how to avoid plagiarism.
Those new to referencing, or those who could benefit from a refresher, can use the Cite Them Right eLearning tutorial.

Data citing with software

Using citation software

In EndNote

Use the reference type for "dataset".

In Mendeley

Use one of the more generic reference types and fill in the essential details for your dataset.

In Zotero

Enter the citation as a "Document" item. How you do this depends on whether, and how, the data producer provides a recommended citation:

•  Export an RIS file and import this file into Zotero;
•  Copy and paste the information from a recommended citation into a new Zotero item with the type "Document";
•  Otherwise, use the "Document" item type to add the components of the citation.
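If the data producer does not offer an RIS export, a small RIS record can be written by hand or generated. The sketch below is illustrative only: `dataset_to_ris` is a hypothetical helper, the selection of tags is an assumption rather than a producer's recommended set, and the item type that the RIS `DATA` type maps to on import may depend on your version of the reference manager.

```python
from typing import List


def dataset_to_ris(authors: List[str], title: str, year: int,
                   publisher: str, doi: str) -> str:
    """Return a single RIS record describing a dataset."""
    lines = ["TY  - DATA"]                      # RIS reference type for data files
    lines += [f"AU  - {a}" for a in authors]    # one AU line per author
    lines += [
        f"TI  - {title}",
        f"PY  - {year}",
        f"PB  - {publisher}",                   # repository or distributor
        f"DO  - {doi}",
        f"UR  - https://doi.org/{doi}",
        "ER  - ",                               # end of record
    ]
    return "\n".join(lines) + "\n"


record = dataset_to_ris(
    authors=["Tomkins, D.", "Jackson, A."],
    title="Ephemera and the British Empire - colour illustrations",
    year=2015,
    publisher="Oxford University Research Archive",
    doi="10.5287/bodleian:xp68kg235",
)

# Save the record so it can be imported through the File > Import menu.
with open("dataset.ris", "w", encoding="utf-8") as handle:
    handle.write(record)
```

After importing, check the new item and correct any fields that were not mapped as expected.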

Research data linking

Research Dataset Linking

The rise of electronic journals has led to new services being added to simple article access. One of these is the provision of forward links to papers citing the one being accessed. Such links help the reader assess the impact of the paper, place it within the wider literature and in some cases uncover refutations or counter-arguments. Forward links from datasets to papers that cite them provide all the same benefits, as well as ensuring that documentation for the dataset can be found.

Ultimately, bibliographic links between datasets and papers are a necessary step if the culture of the scientific and research community as a whole is to shift towards data sharing, increasing the rapidity and transparency with which science advances.

An example of research dataset linking can be found in Scopus. If research datasets for an article are available in an external data repository, the Scopus Document details page will include a “Related Research Data” sidebar, located to the right of the article details. To access the dataset, click on the links provided.

Additional advice on general obligations of using or sharing data and more specifically on citation of data may be found on the Research Data Oxford website.