Top 10 FAIR Data & Software Things

Imaging


Authors

Description:

This guide aims to promote the FAIR data principles and to encourage their adoption by the bioimaging and characterisation¹ community. The FAIR principles are described in the context of bioimaging and characterisation, and the activities are optional. This guide seeks to empower researchers, scientists and health professionals to adopt best data practices throughout the research lifecycle, and so to improve the quality, reproducibility and reusability of research outputs.

¹ “Characterisation is the general process of probing and measuring the structures and properties of materials at the micro, nano and atomic scales. It is essential across natural, agricultural, physical, life and biomedical sciences and engineering.”

Audience:

Researchers, neuroscientists, clinicians, microscopists, platform engineers, graduate students and computational and data scientists working on image analysis and processing.

Goals:

To inform data producers and users about the FAIR principles applied to bioimaging/characterisation and suggest activities to apply to their research.

Table of Contents:

1. What is FAIR?

2. What are publishers and funders saying about data access?

3. Data sharing and discovery

4. Reusable data repositories for the image community

5. Managing and sharing sensitive data

6. Persistent identifiers

7. Describing data: metadata

8. Reusable data best practices

9. Licensing your work

10. Data citation for access and attribution

Supplementary Information

References

1. What is FAIR?

The acronym FAIR, as detailed in 15 principles (GO FAIR 2016), stands for Findable, Accessible, Interoperable and Reusable. The FAIR principles (Wilkinson et al. 2016) are guidelines to motivate and enhance the reusability of data by facilitating its discovery, integration and evaluation. In this context, “data” refers to all research-oriented digital objects (including data, metadata, software, workflows and packages) (Wilkinson et al. 2017). Wilkinson et al. (2016) pioneered the definition of the guiding principles, “emphasising the capacity of computational systems to Find, Access, Interoperate and Reuse data with none or minimal human intervention”; this is referred to as the machine-actionable FAIR principles. FAIR is not a separate concern but the intersection of research data management and open science, as Higman, Bangert, and Jones (2019) describe.

“FAIRness is a prerequisite for proper data management and data stewardship”

Communities are motivated to apply the FAIR principles to research activities and to enable people and machines to find, read, use and reuse research data and research outputs. In 2018 a coalition of stakeholders (COPDESS 2018) representing the international Earth and Space science community set out to develop standards to connect researchers, publishers, and data repositories in this community, to enable FAIR data on a large scale. This project will accelerate scientific discovery and enhance the integrity, transparency, and reproducibility of this data. In imaging, on 1 March 2019 Euro-BioImaging (Bioimaging 2019) and other research infrastructures, including ELIXIR-Europe, joined forces as part of the European Open Science Cloud project to publish research data via FAIR databases. Community participation from academia, industry, small and medium-sized enterprises (SMEs) and regional bio-clusters is paramount for the success of this four-year project (starting in 2020). The imminent global uptake of the FAIR principles in different scientific domains should motivate the bioimaging/characterisation community to do likewise: to move forward, promote and apply them.

Activity 1: In 2018, CODATA (the Committee on Data for Science and Technology) released news of the “Enabling FAIR Data Project and Commitment Statement”. Take a look at the partners listed in (COPDESS 2018). Do you recognise partners in your discipline?

Activity 2: Can you think of the benefits of making your data FAIR? How can you align your current data practices to the FAIR principles? Consider the following resources when addressing the activity above:

2. What are publishers and funders saying about data access?

Many governments, funders, and publishers around the world have adopted data access policies that either encourage or require researchers to start their journey to FAIR research.

All research papers accepted for publication in Nature and an initial 12 other Nature Research titles are required to include information on whether and how others can access the underlying data; see the Nature announcement “Where are the data?” (Nature, 2016).

PLOS journals require authors to make all data underlying the findings described in their manuscript fully available without restriction at the time of publication (PLOS one, 2017). PLOS suggests using FAIRsharing to index resources, for example their own PLOS list.

The European Commission has drafted Guidelines on FAIR Data Management for the H2020 programme (European Commission, 2013): “those projects funded in this scheme must submit a version of this FAIR Data Management Plan (DMP)”.

The Australian Research Council states “Author(s) should consider selecting publishers and research outlets, which have policies supporting the F.A.I.R. principles, as well as immediate or early availability of Publications via Open Access, in order to maximise the availability and impact of their ARC Funded Research.”

The (Australian) National Health and Medical Research Council (NHMRC) promotes the highest quality in the research that it funds, based on international best practice. The NHMRC lists the FAIR principles under useful resources for publication and reporting of research outcomes.

Other governments, funders, and publishers adopting FAIR principles include:

3. Data sharing and discovery

Why sharing?

“Both researchers and the broader community stand to benefit from the knowledge produced through publicly funded research” (ARC Open Access Policy Version 2017.1). This is why data sharing is closely connected with the concepts of reproducibility and reusability.

“Data created and used by scientists should be managed, curated, and archived in such a way to preserve the initial investment in collecting them. Researchers must be certain that data held in archives remain useful and meaningful into the future. Funding authorities increasingly require continued access to data produced by the projects they fund, and have made this an important element in Data Management Plans. Indeed, some funders now stipulate that the data they fund must be deposited in a trustworthy repository” (Core Trust Seal, 2016).

Activity 1: (Infographic) Research data may be discovered (findable) and shared (accessible) in many ways. Start by looking at some data sharing trends across countries and research disciplines (Wiley, 2014). Consider your own current data sharing practices, and those of your project team(s), and ask yourself the question: How FAIR are those practices?

Activity 2: How can data be shared and discovered? Think about open, mediated and restricted access data repositories. What examples of these types of repositories are you aware of? Discuss your answers with others.

4. Reusable data repositories for the image community

How to walk towards FAIR?

Imagine if you were able to obtain extra datasets for your existing research project, start a new collaboration based on common research questions, or start a new project reusing publicly available datasets. You can do this by exploring the following reusable data repositories for the imaging community, divided by topic.

Reusable data repositories

  1. Neurosciences
  2. Microscopy
  3. Biomedical sciences
  4. Non-domain specific data registries and catalogues

This and the previous section show that it is becoming more common for funders and publishers to require research data to be made accessible via appropriate repositories. This list is a starting point for finding out what data already exist in your research area. If you want to share your data, or find data relevant to your research, take a detailed look at the examples provided in the next sections. Most, if not all, of the listed repositories have guides on how to share data.

Activity 1: Find repositories for imaging in FAIRsharing.org and search for repositories relevant to your research. Try for example, searching on “neuroimaging”. Explore at least one repository you find. How well does it support the FAIR data principles? Tip: look for things such as persistent identifiers, clear descriptions, licence information, download options, file formats.

5. Managing and sharing sensitive data

A clarification: FAIR data is not necessarily “open” data. There are good reasons why some data should not be open, for example to protect intellectual property, commercialisation, national security, personal privacy or endangered species. However, it may still be possible to provide mediated access to such data, or to publish a description of the data so that others can discover its existence. To align with the FAIR principles your “research data should be as open as possible, as closed as necessary”.

The FAIR principles encourage us to disseminate data as widely as possible, in the most effective manner and at the earliest opportunity. This statement takes into account any restrictions relating to privacy, confidentiality, intellectual property, embargo period, or cultural sensitivities, that need to be addressed, discussed and clarified before sharing any data. In the planning phase of a research project, researchers are encouraged to consider at least making project metadata publicly accessible (read more about metadata in section 7).

If you need examples and more information, check the OpenAIRE sensitive data guide (OpenAIRE, 2017), the ANDS guide to publishing and sharing sensitive data (ANDS, 2018), and the Earth Science Information Partners (ESIP) tutorial on handling sensitive data (Downs, 2012). In addition, the Australian Bureau of Statistics (ABS) describes the Five Safes framework, and Table 2 provides examples at different levels of accessibility.

Activity 1: Read “Promoting FAIR principles in the healthcare field” (DCC, 2019). Highlights: patient data are sensitive, and additional concerns for these data, including security and the anonymisation of data subjects, are major components to consider. For more information on FAIR related to healthcare visit FAIR4health.eu.

Activity 2: Think about when and how people can share data along the research cycle. Keep in mind that it is strongly recommended to release metadata (a description) of the project to comply with the FAIR principles, even if you cannot share the data itself. Institutional repositories or domain specific repositories should be able to store the metadata of your project and then link that information via registries (have a look at the reusable repositories section).

De-identification / Anonymisation

In the case of sensitive data, the aim is to minimise the risk of exposing confidential information. Sometimes restrictions on sharing can be resolved by de-identification or anonymisation of data. Anonymisation is sometimes used interchangeably with de-identification; ANDS (2018) clarifies these terms in its De-identification guide.

Activity 3: Look at the Future of Privacy Forum’s visual guide to practical data de-identification. What examples can you take from it and apply to your field?

Optional extra information: 1) Open de-identification tools from Open Brain Consent (Halchenko, 2018). 2) A blog by Latanya Sweeney about the HIPAA (Health Insurance Portability and Accountability Act) Privacy Rule (US), which establishes national standards to protect individuals’ medical records and personal health information. 3) Guidance about methods for de-identification by the US Department of Health and Human Services (HHS). 4) Anonymization of DICOM Electronic Medical Records (Newhauser, et al. 2014).

6. Persistent identifiers

Identifiers are essential to human-machine interoperation (F1 FAIR principle). Assigning globally unique persistent identifiers “is arguably the most important FAIR principle, because it will be hard to achieve other aspects of FAIR without them” (GO FAIR, 2017). Persistent identifiers (PIDs) help find and collect data accurately, and enable proper citation and the collection of citation metrics about the use of a dataset, article or data generator (e.g. instrument, software, workflow). For the researcher, persistent identifiers enable disambiguation of people and linking of existing works, and promote wider dissemination of research.

For individuals:

For digital objects (files, datasets, publications, software, etc.):

Disclaimer: there is a wide range of PIDs available; we only cite two examples for each type.

Activity 1: OpenAIRE/FREYA/ORCID guide for researchers “How can identifiers improve the dissemination of your research outputs?”.

Activity 2: Six Ways to Make Your ORCID ID Work for You! (Meadows, 2017). If you already have an ORCID, check this video to link publications to your ORCID profile.

Activity 3: Discuss the points highlighted by the Joint Declaration of Data Citation Principles from FORCE11 (Martone (ed), 2014).

To learn more about persistent identifiers visit GO-FAIR (F1 Principle) or the ARDC identifiers examples.
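Because persistent identifiers are meant to be read by machines as well as people, many of them can be checked automatically before they are stored in metadata. As a minimal sketch, the following Python functions validate the format and final check character of an ORCID iD, assuming the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers. The function names are hypothetical, and the check only confirms that an iD is well formed, not that it is registered to anyone.

```python
import re

# Pattern for a hyphenated 16-character ORCID iD (last character may be X).
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", flags=re.ASCII)


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid: str) -> bool:
    """Return True if the iD has the expected layout and a correct check character."""
    if ORCID_PATTERN.fullmatch(orcid) is None:
        return False
    compact = orcid.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the iD ORCID uses for its fictitious example researcher.
    print(is_well_formed_orcid("0000-0002-1825-0097"))
    print(is_well_formed_orcid("0000-0002-1825-0098"))
```

A check like this catches typing errors early, which matters because a mistyped identifier silently breaks the link between a researcher and their outputs.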

7. Describing data: metadata

“Metadata (information about data) provides means for discovering data objects as well as providing other useful information about the data objects such as experimental parameters, creation conditions, etc.” (Rajasekar, 2001). Unofficially, metadata can be grouped in two types by the way it has been created: automatically or manually. Read more about metadata in Working with Data by ARDC.

Why is building and using metadata relevant? It is a long-term mediator that supports the discovery, understanding and organisation of research data. Across different communities, metadata usually follows standards; see some examples gathered by DCC in collaboration with the Research Data Alliance. To optimise the reuse of data, metadata and data should be well described so that they can be replicated and/or combined in different settings. Moreover, the FAIR principles give clear descriptors on what metadata should contain:

These are aspects of metadata to keep in mind whether you produce, read or reuse metadata. In order to properly be interpreted by either humans or software, metadata needs to follow a standard vocabulary to precisely define what it should include. Metadata for imaging should include a standard terminology for describing the topic of study: physiological, clinical, demographic and genetic changes, and tools and instruments used for data capture and generation. The main recommendation is to share metadata per project whenever possible, even if the data is not yet available (due to case by case restrictions).

a. Why ontologies?

“By expressing image/characterisation annotation in machine computable form as a formal ontology, human knowledge can be brought to bear on effective search and interpretation of image data, especially across multiple disciplines, scales, and modalities” (Eliceiri, et al. 2012). Keep in mind that, due to privacy restrictions, any (meta)data can be listed under embargo or with limited access (go to section 5 if this is the case). Implementation, adoption and harvesting of metadata require defined ontologies. Due to increased demand for quantitative analysis and for robust curation and sharing of image/characterisation data, the need for full ontologies and annotations is growing.

“Ontologies for Neuroscience” describes three domain specific ontologies and how they build on top of each other (Larson and Martone, 2009). The authors also note that existing domain specific vocabularies were built with the help of the Open Biological Ontologies (OBO) community (Smith, et al. 2007). For example, a subset of OBO is the EDAM Ontology, which includes bio-imaging (Kalaš, et al. 2019). In addition, the Neuroscience Information Framework has developed the NIF Standard ontology (NIFSTD) for annotating and searching neuroscience resources. Plant, et al. 2011 provide an overview of what is needed to implement metadata that follows domain specific ontologies, using microscopy cell image data as an example. The National Center for Biomedical Ontology (NCBO) BioPortal provides access to more than 270 biomedical ontologies and controlled terminologies (Musen, et al. 2012), and includes some of those cited before. Also, the Ontology for Biomedical Investigations (OBI; obi-ontology.org) enables communication between existing ontologies (Bandrowski, et al. 2016).

b. Controlled vocabularies

A defined list of agreed terms constitutes a controlled vocabulary, which is usually led by a user community. Controlled vocabularies help data integration when, for example, ambiguities exist in the terms used in different datasets and across different repositories. If the data are to be re-used outside this community, additional information may be required. Controlled vocabularies are part of a model called an ontology. An ontology has controlled vocabularies and the glue to link the terms, providing an effective means whereby human and electronic agents can communicate unambiguously about concepts. This is relevant to the Interoperability principle of FAIR, I1 (GO FAIR). The goal of making data interoperable is to enable members of disparate communities to reuse and understand digital information over time.

Domain specific controlled vocabularies might be a wider landscape than ontologies to cover here, hence some more generic vocabulary examples are given. Schema.org is widely used to build controlled vocabularies. A more specific example is bioschemas.org, a collection of specifications that provide guidelines to facilitate a more consistent adoption of schema.org within the life sciences. Research Vocabularies Australia is a public database of controlled vocabularies. At the time of writing this guide, no specific bioimaging vocabularies were found; maybe that is something you can help with?
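To make the idea of a shared vocabulary concrete, here is a minimal sketch in Python of a dataset description that uses schema.org terms and is serialised as JSON-LD. The property names (`name`, `description`, `identifier`, `license`, `creator`, `keywords`) are taken from the schema.org `Dataset` type; the helper function, the dataset title, the creator name and the DOI are all hypothetical placeholders, and a real record would follow whatever profile your repository or community (for example Bioschemas) requires.

```python
import json


def build_dataset_record(name, description, identifier, license_url, creators, keywords):
    """Assemble a schema.org Dataset description as a plain dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,
        "license": license_url,
        "creator": [{"@type": "Person", "name": creator} for creator in creators],
        "keywords": list(keywords),
    }


# All values below are illustrative placeholders, not a real dataset.
record = build_dataset_record(
    name="Example MRI phantom scans",
    description="Test acquisitions of a phantom, shared to illustrate metadata.",
    identifier="https://doi.org/10.1234/example",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    creators=["A. Researcher"],
    keywords=["neuroimaging", "MRI"],
)

if __name__ == "__main__":
    print(json.dumps(record, indent=2))
```

Using agreed property names rather than ad hoc ones is what lets search services and other software interpret the record without human help.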

c. Storing and publishing metadata

Where to store and publish metadata? The short answer is that it depends on which institution you are from (we recommend asking the university library, research office or data steward). Some options are:

  1. Institutional repositories
  2. Domain specific repositories
  3. Generic repositories

FAIR principle A2, that metadata should be accessible even when the data are no longer available, reinforces the need to share at least the metadata. To answer the question of where to publish (meta)data, start with section 4 “Reusable data repositories for the image community”. For a broader view, look at the FAIRsharing.org databases for imaging. The ARDC’s Research Data Australia (RDA) harvests institutional repositories, hence it can be the link between multiple repositories. The CSIRO Data Access Portal is another option for projects related to CSIRO. The DataCite metadata store allows users to register DataCite DOIs and associated metadata in a more generic context. Zenodo is another generic option, which provides a DOI and versioning capabilities.

Activity 1: For discussion. Have a look at the metadata stored at Research Data Australia for the 7T Magnetom instrument (CAI, 2017). It contains simple but important public metadata and a PID.

Activity 2: For more information read where to store metadata? from ARDC.

8. Reusable data best practices

Here is a suggested list of data best practices to adopt in your research outputs. These will improve the reusability of data and software by others, which includes yourself in the future. Remember, making data and software publicly available for others to re-use is the goal, but not all data must be shared with everyone. Adding terms and conditions of accessibility is an option to consider. To share data, you can make use of the public infrastructures already mentioned (section 4 “Reusable data repositories”) or use the data repository provided by your institution. To get started, there are a few things you should keep in mind.

a. Provenance - Usually provenance is a manually produced metadata file (it can also be automatically produced). It is important for the future reuse of data and should contain descriptors such as the data producer, date history (log of changes) and a data dictionary. Primary data ought to be read only.

b. File formats - Most file formats are defined by the data producer (e.g. instrument or software). Whenever possible you should try to convert data to formats that are publicly accessible (open formats).

For example, the DICOM (Digital Imaging and Communications in Medicine) format, mostly used in neurosciences, can be converted to NIfTI (Neuroimaging Informatics Technology Initiative) or BIDS format. Another example is the Hierarchical Data Format version 5 (HDF5) (Dougherty 2009), an open source file format that supports large, complex, heterogeneous data and is used by MINC and Huygens Software. In addition, you can read more about why metadata matters, and a discussion of proprietary formats, in Linkert, et al. 2010, which introduces multidimensional microscopy image data and formats like TIFF and OME-TIFF.

c. Data structures - Keep consistent file and folder naming conventions across linked projects.

d. Data curation - Should be included in your data quality workflow as part of the process; ideally this will be automated.

e. Data versioning - To keep the provenance of your data you might use data versioning tools: Git or GitHub (for code). git-annex and DataLad are other options for data. You should investigate whether your repository of choice has the capability to do this.

f. Containerisation - For data processing pipelines, use containers (e.g. Singularity, Docker) or virtual environments, such as the Characterisation Virtual Laboratory.

g. Protocols - Search for imaging protocols publicly shared on Protocol Exchange, an open resource where the community of scientists pool their experimental know-how to help accelerate research, e.g. Protocol Exchange for imaging.

h. Create documentation - Write and describe everything you would need to understand a project or dataset in a few months. A README file helps ensure that your data can be correctly interpreted and reanalysed by others. For example, the DataDryad Readme is an example of minimum documentation. Write the Docs (writethedocs.org) is also a great initiative by people who care about documentation; we recommend it!

i. Benchmarks or checksums - Checksums are used to make sure that transferred or stored copies of data match the data that was originally created. Read more about data integrity checksums.
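Two of the practices above, provenance and checksums, lend themselves to a short illustration. The following Python sketch records a SHA-256 checksum in a small provenance file stored next to a data file, checks later that the file still matches, and marks primary data read only. It uses only the Python standard library; the sidecar file name and its JSON layout are assumptions made for illustration, not a community standard, so check what your repository or discipline expects.

```python
import hashlib
import json
import stat
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_provenance(data_file: Path, producer: str, data_dictionary: dict, note: str) -> Path:
    """Create or update a JSON sidecar (hypothetical layout) describing data_file."""
    sidecar = data_file.with_name(data_file.name + ".provenance.json")
    if sidecar.exists():
        record = json.loads(sidecar.read_text(encoding="utf-8"))
    else:
        record = {
            "file": data_file.name,
            "producer": producer,
            "data_dictionary": data_dictionary,
            "history": [],
        }
    # Each history entry logs when something happened and the checksum at that time.
    record["history"].append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "note": note,
        "sha256": sha256_of_file(data_file),
    })
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar


def matches_last_checksum(data_file: Path, sidecar: Path) -> bool:
    """Compare the file's current checksum with the most recent recorded one."""
    record = json.loads(sidecar.read_text(encoding="utf-8"))
    return record["history"][-1]["sha256"] == sha256_of_file(data_file)


def make_read_only(data_file: Path) -> None:
    """Remove write permission bits so primary data is not changed by accident."""
    mode = stat.S_IMODE(data_file.stat().st_mode)
    data_file.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
```

A typical use would be to call `write_provenance` when a file is first acquired, `make_read_only` straight after, and `matches_last_checksum` after every transfer or restore from backup.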

Activity 1: Recommended reading. A brain imaging case study that provides direct evidence of the impact of open sharing on data use and resulting publications over a seven-year period (2010-2017) stated: “We dispel the myth that scientific findings using shared data cannot be published in high-impact journals and demonstrate rapid growth in the publication of such journal articles”. You can choose to read the pre-print (Milham, et al. 2017) or the paper (Milham, et al. 2018). What are your thoughts and conclusions after reading it?

Activity 2 (Discussion + Action): “What Can You Do?”

  • Contribute your data – Previously published datasets.
  • Release some or all of the project metadata – your call, as a simple rule, the more the better!
  • Curate existing datasets to make available in the future - you set the upload schedule.
  • Contribute your scripts/code.
  • Have discussions with your team members about licensing and sharing.
  • Create a data management plan.

Activity 3: Go through the questions from the Horizon2020 guide to create a FAIR Data Management Plan and see if you can already answer any of the questions.

Recommended extra reading: Best Practices in Data Analysis and Sharing in Neuroimaging using MRI, Ten Simple Rules for Creating a Good Data Management Plan, Ten Simple Rules for Reproducible Computational Research and Ten principles for machine-actionable data management plans. These papers will help you connect all the concepts that you have learned so far.

9. Licensing your work

Licensing your work / research outputs to be open access (research output here means data, metadata, code, workflows) allows you as author or contributor to enable reuse and appropriate attribution of that work. If there is no licence attached to your work, you are actually preventing anyone from legally reusing it. Did you know that No licence = No permissions? Also, if you find research outputs that you want to reuse, you should only reuse them according to their licence.

Be aware that you have the right to choose the licence that best suits your purpose. There are many different licences, and versions of these, that can be applied to data and software. Some licences are applicable only in certain countries, so think of applying an international licence. Be aware that the data repository that you use might ask you to accept their “terms and conditions”, which affect how you might use or share data by expanding, modifying or limiting the intended purpose of your own licence. Also, you can have multiple licences, for different purposes or different audiences. Finally, not every part of your work / research outputs needs to be publicly available or be licensed, but the more you share with clear permissions the better.

Activity 1: “What if you don’t choose a licence?” explains and gives you a few reasons to think about licensing your work. If you are interested in reading about the GitHub terms and conditions, take 5 extra minutes.

Activity 2: (flowcharts as a survey) The ARDC has guides about licensing for three specific scenarios: a) Data creator flowchart, b) Data supplier flowchart and c) Data users flowchart. If you want to know more about licensing and copyright for data reuse, visit the ANDS page about this.

A few types of licences:

Creative Commons (CC) licences are very easy to apply and are broadly reused. They are strongly promoted in the United States, but Creative Commons is an internationally recognised licence creator. CC is good for: a) very simple, factual data sets; b) data to be used automatically. Watch out for the version in use; version 4 or later is recommended. Also watch out for attribution stacking and for the Non Commercial (NC), Share Alike (SA) and No Derivatives (ND) conditions. The NC condition should only be used with dual licensing. The SA condition reduces interoperability. The ND condition severely restricts reuse. To help you decide, use https://creativecommons.org/choose.

Copyleft is a general method for making a program (or other work) free (in the sense of freedom, not “zero price”), and requiring all modified and extended versions of the program to be free as well.

Open Data Commons also provides licences specifically for open data, good for most databases and datasets, e.g. the Open Data Commons Open Database Licence (ODC-ODbL) or the Open Data Commons Attribution Licence (ODC-By).

Licences specific to software: the Mozilla Public Licence (MPL), the MIT Licence, the GNU General Public Licence (GPL) and a list of open source licences by category are other options you might want to investigate. To help you choose a licence for software, look at the descriptions at https://choosealicense.com/.

Acknowledgement: most of the licences cited in this section were first mentioned in “How to License Research Data” from the Digital Curation Centre (Ball, 2014).

10. Data citation for access and attribution

Citation analysis and citation metrics are important to the academic community because they give recognition to researchers and their work. Data citation continues the tradition of acknowledging other people’s work and ideas. It also helps make research data more findable and accessible. It is now common practice for authors to formally cite the research datasets and associated software that underpin their research findings.

Activity 1: (Video, 12 mins) Responsible Data Use: Citation and Credit (Mayernik, 2013).

Activity 2: How to cite data and software? This example from Dryad clearly shows how to cite the dataset that underpins a journal article as well as the article itself. Note that both citations include a Digital Object Identifier (DOI).

Activity 3: What to cite and why? For data and software from ARDC.
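As a small illustration of why a DOI belongs in every data citation, the sketch below builds a citation string from a few metadata fields and always renders the DOI as a resolvable link. The layout (creators, year, title, publisher, DOI) is an assumption loosely modelled on common repository citation styles; the function name and all example values are hypothetical, so follow the style required by your journal or repository.

```python
def format_data_citation(creators, year, title, publisher, doi):
    """Build a simple data citation that ends with a resolvable DOI link."""
    authors = "; ".join(creators)
    # Accept either a bare DOI or one already written as a doi.org URL.
    doi_url = doi if doi.startswith("https://doi.org/") else f"https://doi.org/{doi}"
    return f"{authors} ({year}). {title}. {publisher}. {doi_url}"


if __name__ == "__main__":
    # Placeholder values, not a real dataset.
    print(format_data_citation(
        creators=["Researcher, A.", "Colleague, B."],
        year=2019,
        title="Example MRI phantom scans",
        publisher="Example Repository",
        doi="10.1234/example",
    ))
```

Generating citations from stored metadata, rather than typing them by hand, keeps the identifier exact and makes the citation easy for machines to follow.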

Acknowledgements

We acknowledge Chris Erdmann for reviewing the first version of this document, and all collaborators now listed as authors for their useful comments and for editing sections of this document. Paula Andrea Martinez also acknowledges the National Imaging Facility and the Australian Research Data Commons for funding this research.

Pre-print

This document is also available via the Open Science Framework as a pre-print and it is citable with the following DOI 10.17605/OSF.IO/ZKJ4R where versions of it in .docx, .odt and .md have been saved. This document links back to the website of Top 10 FAIR for Imaging.

Supplementary Information

Characterisation

“Characterisation is the general process of probing and measuring the structures and properties of materials at the micro, nano and atomic scales. It is essential across natural, agricultural, physical, life and biomedical sciences and engineering.”

Reusable data repositories Section 4

Section 4 lists various public repositories which we have collected in the following list.

Neurosciences

These data repositories are recommended by the journal Scientific Data and accept human-derived data; in addition, NeuroMorpho.org and G-Node also accept data from other organisms. Please note that human-subject data submitted to OpenNeuro must be de-identified, while the Functional Connectomes Project International Neuroimaging Data-Sharing Initiative (FCP/INDI) can handle sensitive patient data.

Microscopy

Biomedical sciences

Non-domain specific

Data registries and catalogues

re3data.org is a registry of some 2000 data repositories. See also Research Data Australia (RDA), or read more about their services. FAIRsharing.org also offers a catalogue of databases, described according to the BioDBcore guidelines. Other options are the OpenAIRE content provider, the European Open Science Cloud, Google Public Data, Google Dataset Share and, for open access publications, Open Knowledge Maps.

References

Alan Turing Institute. 2019. “Research Data Management.” The Turing Way. https://the-turing-way.netlify.com/rdm/rdm.html.

ANDS. 2017. “The FAIR Data Principles.” ANDS. https://www.ands.org.au/working-with-data/fairdata.

ANDS. 2018. “De-Identification.” ANDS. http://www.ands.org.au/working-with-data/sensitive-data/de-identifying-data.

ANDS. 2018. “Publishing and Sharing Sensitive Data.” ANDS. https://www.ands.org.au/guides/sensitivedata.

ANDS. “Licensing and Copyright for Data Reuse.” ANDS. Accessed July 17, 2019. https://www.ands.org.au/working-with-data/publishing-and-reusing-data/licensing-for-reuse.

ANDS. “Storing Metadata.” Working with Data. Accessed July 17, 2019. https://www.ands.org.au/working-with-data/metadata/storing-metadata.

ARDC. 2017a. “Citation and Identifiers.” ARDC. https://ardc.edu.au/resources/working-with-data/citation-identifiers/.

ARDC. 2017b. “Data Citation.” https://ardc.edu.au/resources/working-with-data/citation-identifiers/data-citation/.

ARDC. “Metadata.” Working with Data. Accessed July 17, 2019. https://ardc.edu.au/resources/working-with-data/metadata/.

Ball, Alex. 2014. “How to License Research Data.” DCC. http://www.dcc.ac.uk/resources/how-guides/license-research-data.

Bandrowski, Anita, Ryan Brinkman, Mathias Brochhausen, Matthew H. Brush, Bill Bug, Marcus C. Chibucos, Kevin Clancy, et al. 2016. “The Ontology for Biomedical Investigations.” PLOS ONE 11 (4): e0154556. https://doi.org/10.1371/journal.pone.0154556.

Bioimaging. 2019. “EOSC-Life: Developing an Open Collaborative Space for Digital Biology in Europe Euro-BioImaging.” http://www.eurobioimaging.eu/content-news/eosc-life-developing-open-collaborative-space-digital-biology-europe.

CAI. 2017. “7T Magnetom Metadata.” Research Data Australia. https://researchdata.ands.org.au/7t-magnetom/1305790.

Cavalli, Valentino. 2018. “Open Consultation on FAIR Data Action Plan.” LIBER. https://libereurope.eu/blog/2018/07/13/fairdataconsultation/.

CODATA. 2018. “Enabling FAIR Data Project and Commitment Statement - CODATA.” http://www.codata.org/news/299/62/Enabling-FAIR-Data-Project-and-Commitment-Statement.

Commonwealth of Australia - Australian Bureau of Statistics. 2017. “Managing the Risk of Disclosure: The Five Safes Framework.” https://www.abs.gov.au/ausstats/abs@.nsf/Latestproducts/1160.0Main%20Features4Aug%202017.

COPDESS. 2018. “Enabling FAIR Data Project – COPDESS.” http://www.copdess.org/enabling-fair-data-project/.

Core Trust Seal. 2016. “An Introduction to the Core Trustworthy Data Repositories Requirements.” https://www.coretrustseal.org/wp-content/uploads/2017/01/Intro_To_Core_Trustworthy_Data_Repositories_Requirements_2016-11.pdf.

Australian Research Council. 2018. “ARC Open Access Policy Version 2017.1.” https://www.arc.gov.au/policies-strategies/policy/arc-open-access-policy-version-20171.

DCC. 2019. “Promoting FAIR Principles in the Healthcare Field Digital Curation Centre.” http://www.dcc.ac.uk/blog/promoting-fair-principles-healthcare-field.

DCC. “DCC Curation Lifecycle Model Digital Curation Centre.” Accessed July 26, 2019. http://www.dcc.ac.uk/resources/curation-lifecycle-model.

Dougherty, Matthew T., Michael J. Folk, Erez Zadok, Herbert J. Bernstein, Frances C. Bernstein, Kevin W. Eliceiri, Werner Benger, and Christoph Best. 2009. “Unifying Biological Image Formats with HDF5.” Commun ACM 52 (10): 42–47. https://doi.org/10.1145/1562764.1562781.

Downs, Robert. 2012. “Providing Access to Your Data: Handling Sensitive Data.” https://doi.org/10.7269/p3mk69t8.

Eliceiri, Kevin W., Michael R. Berthold, Ilya G. Goldberg, Luis Ibáñez, B. S. Manjunath, Maryann E. Martone, Robert F. Murphy, et al. 2012. “Biological Imaging Software Tools.” Nat Methods 9 (7): 697–710. https://doi.org/10.1038/nmeth.2084.

European Commission. 2013. “Guidelines on FAIR Data Management in Horizon 2020.” http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf.

FAIRsharing. 2015a. “DICOM Digital Imaging and COmmunications in Medicine.” https://doi.org/10.25504/fairsharing.b7z8by.

FAIRsharing. 2015b. “NIfTI-1 Data Format.” https://doi.org/10.25504/fairsharing.jgzts3.

Future of Privacy Forum. 2017. “A Visual Guide to Practical Data de-Identification.” https://fpf.org/wp-content/uploads/2016/04/FPF_Visual-Guide-to-Practical-Data-DeID.pdf.

GO FAIR. 2016. “FAIR Principles.” GO FAIR. https://www.go-fair.org/fair-principles/.

GO FAIR. 2017a. “A2: Metadata Should Be Accessible Even When the Data Is No Longer Available.” GO FAIR. https://www.go-fair.org/fair-principles/a2-metadata-accessible-even-data-no-longer-available/.

GO FAIR. 2017b. “F1: (Meta) Data Are Assigned Globally Unique and Persistent Identifiers.” GO FAIR. https://www.go-fair.org/fair-principles/f1-meta-data-assigned-globally-unique-persistent-identifiers/.

GO FAIR. 2017c. “I1: (Meta)data Use a Formal, Accessible, Shared, and Broadly Applicable Language for Knowledge Representation.” GO FAIR. Accessed July 17, 2019. https://www.go-fair.org/fair-principles/i1-metadata-use-formal-accessible-shared-broadly-applicable-language-knowledge-representation/.

Gorgolewski, Krzysztof J., Tibor Auer, Vince D. Calhoun, R. Cameron Craddock, Samir Das, Eugene P. Duff, Guillaume Flandin, et al. 2016. “The Brain Imaging Data Structure, a Format for Organizing and Describing Outputs of Neuroimaging Experiments.” Scientific Data 3 (June): 160044. https://doi.org/10.1038/sdata.2016.44.

Halchenko, Y. 2018. “Anonymization Tools — Open Brain Consent 0.1.dev1 Documentation.” https://open-brain-consent.readthedocs.io/en/latest/anon_tools.html.

Health Information Service. 2012. “Methods for de-Identification of Protected Health Information.” HHS.gov. https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/index.html.

Higman, Rosie, Daniel Bangert, and Sarah Jones. 2019. “Three Camps, One Destination: The Intersections of Research Data Management, FAIR and Open.” Insights 32 (1): 18. https://doi.org/10.1629/uksg.468.

Kalaš, Matúš, Nataša Sladoje, Laure Plantard, Martin Jones, Leandro Aluisio Scholz, Joakim Lindblad, and contributors. 2019. “Edamontology/Edam-Bioimaging: Alpha05.” Zenodo. https://doi.org/10.5281/zenodo.2557012.

Larson, Stephen D., and Maryann E. Martone. 2009. “Ontologies for Neuroscience: What Are They and What Are They Good for?” Front. Neurosci. 3. https://doi.org/10.3389/neuro.01.007.2009.

Linkert, Melissa, Curtis T. Rueden, Chris Allan, Jean-Marie Burel, Will Moore, Andrew Patterson, Brian Loranger, et al. 2010. “Metadata Matters: Access to Image Data in the Real World.” J Cell Biol 189 (5): 777–82. https://doi.org/10.1083/jcb.201004104.

Martone, M. 2014. “Joint Declaration of Data Citation Principles - FINAL.” Force11. https://www.force11.org/datacitationprinciples.

Meadows, Alice. 2017. “Six Ways to Make Your ORCID iD Work for You!” https://orcid.org/blog/2018/07/27/six-ways-make-your-orcid-id-work-you.

Milham, Michael P., R. Cameron Craddock, Michael Fleischmann, Jake Son, Jon Clucas, Helen Xu, Bonhwang Koo, et al. 2017. “Assessment of the Impact of Shared Data on the Scientific Literature.” bioRxiv, September, 183814. https://doi.org/10.1101/183814.

Milham, Michael P., R. Cameron Craddock, Jake J. Son, Michael Fleischmann, Jon Clucas, Helen Xu, Bonhwang Koo, et al. 2018. “Assessment of the Impact of Shared Brain Imaging Data on the Scientific Literature.” Nature Communications 9 (1): 2818. https://doi.org/10.1038/s41467-018-04976-1.

Musen, Mark A., Natalya F. Noy, Nigam H. Shah, Patricia L. Whetzel, Christopher G. Chute, Margaret-Anne Story, and Barry Smith. 2012. “The National Center for Biomedical Ontology.” J Am Med Inform Assoc 19 (2): 190–95. https://doi.org/10.1136/amiajnl-2011-000523.

Nature. 2016. “Where Are the Data?” Nature News 537 (7619): 138. https://doi.org/10.1038/537138a.

Newhauser, Wayne, Timothy Jones, Stuart Swerdloff, Warren Newhauser, Mark Cilia, Robert Carver, Andy Halloran, and Rui Zhang. 2014. “Anonymization of DICOM Electronic Medical Records for Radiation Therapy.” Comput Biol Med 0 (October): 134–40. https://doi.org/10.1016/j.compbiomed.2014.07.010.

NHMRC. “Research Quality NHMRC.” Accessed May 30, 2019. https://www.nhmrc.gov.au/research-policy/research-quality.

OpenAIRE. 2017. “How to Deal with Sensitive Data.” https://www.openaire.eu/sensitive-data-guide.

OpenAIRE. “How Can Identifiers Improve the Dissemination of Your Research Outputs?” Accessed July 17, 2019. https://www.openaire.eu/how-can-identifiers-improve-the-dissemination-of-your-research-outputs.

ORCID. 2012. “ORCID Overview for Researchers.” https://orcid.org/content/orcid-overview-researchers.

Plant, Anne L., John T. Elliott, and Talapady N. Bhat. 2011. “New Concepts for Building Vocabulary for Cell Image Ontologies.” BMC Bioinformatics 12 (1): 487. https://doi.org/10.1186/1471-2105-12-487.

PLOS ONE. 2017. “PLOS ONE: Accelerating the Publication of Peer-Reviewed Science.” https://journals.plos.org/plosone/s/data-availability#loc-acceptable-data-sharing-methods.

PLOSData. 2017. “FAIRsharing Recommendation: PLOS.” https://fairsharing.org/recommendation/PLOS.

Protocol exchange. “Protocol Exchange Research Square Subject Imaging.” Accessed July 17, 2019. https://protocolexchange.researchsquare.com/?journal=protocol-exchange&limit=10&offset=0&status=all&subjectArea=Imaging.

Rajasekar, Arcot K., and Reagan W. Moore. 2001. “Data and Metadata Collections for Scientific Applications.” In High-Performance Computing and Networking, edited by Bob Hertzberger, Alfons Hoekstra, and Roy Williams, 72–80. Lecture Notes in Computer Science. Springer Berlin Heidelberg. https://link.springer.com/chapter/10.1007/3-540-48228-8_8.

Smith, Barry, Michael Ashburner, Cornelius Rosse, Jonathan Bard, William Bug, Werner Ceusters, Louis J. Goldberg, et al. 2007. “The OBO Foundry: Coordinated Evolution of Ontologies to Support Biomedical Data Integration.” Nature Biotechnology 25 (11): 1251–5. https://doi.org/10.1038/nbt1346.

Sweeney, Latanya. “Identifiability of de-Identified Data.” Accessed July 17, 2019. http://latanyasweeney.org/work/identifiability.html.

Swiss National Science Foundation. 2018. “Explanation of the FAIR Data Principles.” http://www.snf.ch/SiteCollectionDocuments/FAIR_principles_translation_SNSF_logo.pdf.

Web of Science. “Web of Science ResearcherID.” Accessed July 17, 2019. https://www.researcherid.com/#rid-for-researchers.

Wiley. 2014. “Researcher Data Sharing Insights.” http://www.acscinf.org/PDF/Giffi-%20Researcher%20Data%20Insights%20–%20Infographic%20FINAL%20REVISED.pdf.

Wiley. “Distinguish Yourself with ORCID Wiley.” Accessed July 17, 2019. https://authorservices.wiley.com/author-resources/Journal-Authors/submission-peer-review/orcid.html.

Wilkinson, Mark D., Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3 (March): 160018. https://doi.org/10.1038/sdata.2016.18.

Wilkinson, Mark D., Ruben Verborgh, Luiz Olavo Bonino da Silva Santos, Tim Clark, Morris A. Swertz, Fleur D. L. Kelpin, Alasdair J. G. Gray, et al. 2017. “Interoperability and FAIRness Through a Novel Combination of Web Technologies.” PeerJ Comput. Sci. 3 (April): e110. https://doi.org/10.7717/peerj-cs.110.
