Figure 1: FAIR in a nutshell. Image: ARDC 2018 - CC-BY 4.0.
Governments have a mandate to make non-sensitive data open. For example, the Australian Government Public Data Policy Statement says “Australian Government entities will … make non-sensitive data open by default…make high value data available for use by the public, industry and academia… ensure non-sensitive publicly funded research data is made open for use and reuse… to extend the value of public data for the benefit of the Australian public.” Making data FAIR is one way to extend its value. The G20, a forum of the world’s major economies, agreed to make Open Data Principles a priority at its 2015 meeting in Turkey, saying “Transparency… Global transformation, facilitated by technology, fuelled by data and information.. Open data is at the center of this global shift.” (p.2).
Government data custodians
Help government data custodians to understand FAIR data principles
Where “data” is used here, we also mean cultural collections, historical collections, documents, artefacts and other valuable collections.
Read G20, Australian and State policies on open data
Figure 2: Data sharing drivers. Source: Katie Hannan, 2018, CC-BY.
G20: Open Government Forum; G20 Turkey 2015. “Transparency… Global transformation, facilitated by technology, fuelled by data and information.. Open data is at the center of this global shift.” (p.2) Read and consider G20 Open Data Principles.
Familiarise yourself with your State or Territory’s data policy. See Appendix 1 for a list of Australian State open data policies.
The following legislation may apply to the management of government data:
If your organisation doesn’t have a policy on open data, who are the key stakeholders that you would need to work with to prepare an open data policy?
What main headings would you need to include as part of your data policy?
Read https://www.go-fair.org/faq/ask-question-difference-fair-data-open-data/. Can you think of examples of data you deal with that cannot be made Open but can be made FAIR? List some advantages of making this data FAIR.
Does the current wording in the policy for Open Data encourage making the data FAIR? Where do you see gaps?
See slide 14 here https://www.slideshare.net/sjDCC/open-fair-data-and-rdm
See how Geoscience Australia implement the FAIR data principles in their work. Geoscience Australia describe themselves as “the nation’s trusted advisor on the geology and geography of Australia” (GA 2018).
How FAIR is your data? Try the FAIR data self-assessment tool at https://www.ands-nectar-rds.org.au/fair-tool. We suggest using it now, then finishing the modules, making some changes to a data collection, and testing again with the tool.
Where to store data?
Some reusable content here - https://ecu.au.libguides.com/10-marine-science-rdm-things/Thing6
Read a data description on data.gov.au (e.g. Arts Victoria, ABC) or on Research Data Australia (e.g. National Archives of Australia, Australian Antarctic Data Centre, CSIRO (Commonwealth Scientific and Industrial Research Organisation), Geoscience Australia).
Reflection: Could you understand the description? Can you think of someone for whom this data or collection would be useful? Was it clear where to go next to access the data, or to ask for more information about this data or collection? What else would you like to know about this data/collection?
Activity: Post your questions or your responses to the reflection above to the data custodian, or in the comments section at data.gov.au.
If you are a data custodian or researcher, consider the five most important datasets that you have contributed to or that you manage. Pick the most important one to describe.
Q: What type of data identifier does a government data custodian have?
Add richer description to your data description, e.g. subjects and grant IDs (where applicable; Research Data Australia (RDA), the Australian national data catalogue, has permanent URLs for Australian ARC and NHMRC grants). Include a significance statement explaining why the dataset is important.
Ask a colleague in a related field if they can understand your description. This helps ensure the description is readable by someone who is not deeply knowledgeable in your field.
Add your data description to your resume, especially an online one such as LinkedIn. Send your data description to your data librarian, for addition to your Institutional Repository or Data Portal. Alternatively, post your description to a public cloud service, such as Zenodo, Figshare or Data Dryad. No data need be included. A description record is valuable in itself because it reveals the existence of data that was previously unknown and inaccessible.
To make data findable, it must be uniquely and persistently identified. A digital object identifier (DOI) is a unique, case-insensitive, alphanumeric character sequence that serves this purpose well. See also [ANDS Guide: Digital Object Identifiers (DOI) System for Research Data](https://www.ands.org.au/__data/assets/pdf_file/0006/715155/Digital-Object-Identifiers.pdf).
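Because DOIs are case-insensitive, two strings that differ only in letter case or in a resolver prefix can refer to the same object. The following is a minimal Python sketch of how a custodian might compare DOIs and build a resolver link. The function names and the example DOI `10.1234/Example.Dataset-01` are hypothetical, chosen only for illustration; the only external fact assumed is that DOIs resolve through `https://doi.org/`.

```python
from urllib.parse import quote

# Prefixes people commonly paste in front of a DOI.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(doi: str) -> str:
    """Strip a leading resolver prefix and lower-case the DOI for comparison."""
    doi = doi.strip()
    for prefix in DOI_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


def same_doi(a: str, b: str) -> bool:
    """Return True when two strings name the same DOI, ignoring case and prefix."""
    return normalise_doi(a) == normalise_doi(b)


def resolver_url(doi: str) -> str:
    """Build a link that resolves the DOI through the doi.org proxy."""
    return "https://doi.org/" + quote(normalise_doi(doi), safe="/")


if __name__ == "__main__":
    # Hypothetical DOI, used only to show the behaviour.
    print(same_doi("doi:10.1234/Example.Dataset-01",
                   "https://doi.org/10.1234/example.dataset-01"))
    print(resolver_url("doi:10.1234/Example.Dataset-01"))
```

Normalising before comparison is useful when checking a catalogue for duplicate records that cite the same dataset in different ways.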
See who mints ANDS DOIs, including the NSW Office of Environment and Heritage, Bureau of Meteorology, CSIRO, Geoscience Australia and the Department of Environment.
Types of persistent identifiers:
Watch the video Persistent identifiers and data citation explained by Research Data Netherlands - https://youtu.be/PgqtiY7oZ6k
Read about persistent identifiers at a general (awareness) level. A DOI requires five metadata fields: author, title, year, publisher, and the URL of the DOI landing page.
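Before requesting a DOI, it can help to check that a dataset description carries all five fields listed above. The sketch below is one possible way to do that in Python. The field keys (`author`, `title`, `year`, `publisher`, `url`), the function name and the sample record are assumptions made for illustration, not the schema of any particular DOI minting service.

```python
# The five fields named in the text; the key names are illustrative assumptions.
REQUIRED_FIELDS = ("author", "title", "year", "publisher", "url")


def missing_doi_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in a description record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


if __name__ == "__main__":
    # Hypothetical record: publisher is blank and the landing page URL is absent.
    draft = {
        "author": "Example Agency",
        "title": "Example dataset",
        "year": 2018,
        "publisher": "  ",
    }
    print(missing_doi_fields(draft))
```

An empty list means the record has a value for every required field; it does not check that the values themselves are correct.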
Visit http://www.doi.org/ and try resolving these DOI numbers:
See the licensing guide: what is the appropriate licence for data produced by a government agency?
Refer to Australian Government Data Statement: “At a minimum, Australian Government entities will publish appropriately anonymised government data by default: …under a Creative Commons By Attribution licence (ie CC_BY licence) unless a clear case is made to the Department of the Prime Minister and Cabinet for another open licence.”
Other CC licence conditions, which require approval from the Department of the Prime Minister and Cabinet (DPC), include NC (non-commercial), SA (share alike) and the very restrictive ND (no derivatives allowed), which ANDS does not recommend.
Examples of licensing statements:
Why is “clean” data important? Public policy, changes to medical protocols and economic decisions all depend on accurate and complete data. For more, see the ECU resource on the why and what of “dirty data”.
Read this case study. The Data Retriever automates the tasks of finding, downloading, and cleaning up publicly available data, and then stores them in a variety of databases and file formats. This lets data analysts spend less time cleaning up and managing data, and more time analysing it. https://frictionlessdata.io/articles/the-data-retriever/
What is sensitive data?
FAIR data doesn’t need to be published as open data. See Thing 2.
Useful resource: CSIRO Data61, The De-Identification Decision-Making Framework - https://publications.csiro.au/rpr/download?pid=csiro:EP173122&dsid=DS3
Indigenous Knowledge: Issues for protection and management - https://www.ipaustralia.gov.au/sites/g/files/net856/f/ipaust_ikdiscussionpaper_28march2018.pdf
Additional resources (from Library-Research-Support-Top-10-FAIR-Things_DRAFT)
Controlled vocabularies for data description
In addition to selecting a metadata standard or schema, whenever possible you should also use a controlled vocabulary. A controlled vocabulary provides a consistent way to describe data - location, time, place name, and subject.
Controlled vocabularies significantly improve data discovery. They make data more shareable with researchers in the same discipline because everyone is ‘talking the same language’ when searching for specific data, e.g. plants, animals, medical conditions or places.
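As a rough illustration of how a controlled vocabulary keeps descriptions consistent, the Python sketch below checks proposed subject terms against an agreed list. The three-term vocabulary and the function name are hypothetical; in practice the list would come from a published vocabulary used by your discipline or agency.

```python
# Hypothetical vocabulary; a real one would be loaded from a published source.
SUBJECT_VOCABULARY = {"geology", "marine science", "meteorology"}


def check_subjects(subjects, vocabulary=SUBJECT_VOCABULARY):
    """Split subject terms into those found in the vocabulary and those that are not.

    Terms are compared after lower-casing and collapsing extra whitespace,
    so "  Marine   Science " matches "marine science".
    """
    accepted, rejected = [], []
    for subject in subjects:
        term = " ".join(subject.lower().split())
        if term in vocabulary:
            accepted.append(term)
        else:
            rejected.append(subject)
    return accepted, rejected


if __name__ == "__main__":
    accepted, rejected = check_subjects(["Geology", "  Marine   Science ", "rocks"])
    print("accepted:", accepted)
    print("needs review:", rejected)
```

Terms that fall outside the vocabulary are returned unchanged so that a person can decide whether to map them to an existing term or propose a new one.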
Have a browse around the stunning level of data description and data contained in the Atlas of Living Australia.
Data dictionaries: standardised, accepted terms and protocols used for data collection.
Data reuse: this is hard to check or track when data lacks persistent identifiers and there is not much of a data citation culture.
Web stats: selected data.gov.au web analytics - https://search.data.gov.au/dataset/ds-dga-9fa9bfda-96b3-4214-8a09-497af105524b/details?q=data.gov.au
Some old uses of open data: https://data.gov.au/showcase
Use in GovHack(AU) - https://twitter.com/govhackau?lang=en
Tracking identifiers - data citation
Looking at the broader impact of how the data has been used and the benefits it has brought to society, industry, economy, etc. is a richer source of impact evidence than just looking at citations.
Australian Federal Government: Refer to the policy at the Department of the Prime Minister and Cabinet. See also the National Data Commissioner, “responsible for implementing a simpler data sharing and release framework”.
“The Victorian Government recognises the benefits from and encourages the availability of Victorian government data for the public good. The DataVic Access Policy has been developed to support this recognition.”
“The objectives of this policy are to assist NSW Government agencies to: release data for use by the community, research, business and industry; accelerate the use of data to derive new insights for better public services; embed open data into business-as-usual…”