Top 10 FAIR Data & Software Things

The European Open Science Cloud (EOSC)


Sprinters:

Marjan Grootveld / Data Archiving and Networked Services (DANS)

Frans Huigen / Data Archiving and Networked Services (DANS)

Eliane Fankhauser / Data Archiving and Networked Services (DANS)

Ellen Leenarts / Data Archiving and Networked Services (DANS)

Paula Andrea Martinez / ELIXIR-Europe

Audience:

Description:

In an ideal world, everyone would have access to all research outcomes and available knowledge, to use it, build upon it, and expand it to serve societal goals as well as private and personal interests. One way to get there is through the EOSC, but what exactly is it? Find out through these ten things!

An overview of Things:

  1. Introducing EOSC
  2. Buzzword busting
  3. FAIR principles
  4. Infrastructures
  5. FAIR in EOSC
  6. EOSC for research domains
  7. EOSC training
  8. EOSC services
  9. Scientific integrity and trust
  10. Open Science: the rest of the world

Thing 1 - Introducing EOSC

EOSC is funded through the Horizon 2020 (H2020) initiative and involves 40 countries. It offers 1.7 million European researchers and 70 million professionals in science, technology, the humanities and social sciences a virtual environment with open and seamless services for storage, management, analysis and re-use of research data, across borders and scientific disciplines. It does so by federating existing scientific data infrastructures, which are currently dispersed across disciplines and the EU Member States.

Activity 1:

Watch this 5-minute video from the Open Science MOOC on “An introduction to the European Open Science Cloud (EOSC)” by Jean-Claude Burgelman.

Activity 2:

Familiarise yourself with the key EOSC concepts presented in the figure below.

European Open Science Cloud

Figure 1: European Open Science Cloud becomes a reality (European Commission, 23 November 2018)

Thing 2 - Buzzword busting

Buzzwords can’t be completely avoided, but we will attempt to clarify and scope a handful of them for the remainder of this Top 10:

EOSC: right, this is the European Open Science Cloud. However,

FAIR: this acronym stands for Findable, Accessible, Interoperable and Reusable. 15 FAIR principles (see the next Thing) provide guidance on making your research output more machine-actionable. Since the principles were published in 2016, they have gained worldwide traction as different scientific disciplines try to operationalise them and measure the FAIRness level of data, software and other related concepts.

Activity 1:

Communication-wise, “FAIR” has been brilliant. However, some argue that other essential research aspects are missing. For instance, check out the plea for Responsible Data Science: ensuring Fairness, Accuracy, Confidentiality and Transparency (FACT). Formulate your opinion on these acronyms: do you think they are helpful?

Infrastructure: while not really a buzzword, this term is frequently used and conveniently vague. Think of it as the basic systems and services that an organization uses in order to work effectively. Keep in mind that an infrastructure can - or even should? - include human experts and support staff. For example, to make this explicit the OpenAIRE project calls itself a socio-technical infrastructure.

Thing 3 - FAIR principles

The FAIR Data Guiding Principles - for Findable, Accessible, Interoperable and Reusable data - came into existence during a workshop in Leiden in 2014 where a broad range of stakeholders in the field of research data management and stewardship came together to discuss the improvement of the reusability of research data. They published the first paper on these principles in 2016.

The principles have become an indispensable part of improving research data and software for the community. According to the 2018 report Cost of not having FAIR research data, the absence of FAIR research data costs the European economy at least €10.2 billion per year (!). FAIR is equally important to EOSC, where there has been an uptake of projects such as FAIRsFAIR, FAIRplus and the ESFRI projects.

Thing 4 - Infrastructures

EOSC facilitates open science through “vertical” or “horizontal” infrastructures:

Activity 1:

Compare research infrastructure and e-infrastructure as defined by Science Europe. Which supports you best? Consider how you could benefit more from them.

Activity 2:

For another example of an e-infrastructure, have a look at the Circulation of Knowledge and Learned Practices in the 17th-century Dutch Republic and browse through 20,000 letters that were written by and sent to 17th-century scholars who lived in the Dutch Republic. Dutch history is at your fingertips thanks to e-infrastructure.

Thing 5 - FAIR in EOSC

In early 2017, the European Commission (EC) embraced the FAIR principles. An innovative element of the Horizon 2020 grant scheme at the time was (and still is, in 2019) an Open Research Data Pilot, asking funded projects to make the data underpinning their publications available or “Open”. As EC representatives put it in April 2017: “We are now seeing openness as one component of FAIR data and aim to address all of the FAIR aspects in Horizon 2020”.

The FAIRsFAIR project contributes to the implementation of these recommendations. For instance, two work packages and a competence framework with several trainings address the skills-related recommendations. This is done by linking with other parties active in open science and FAIR such as:

FAIRsFAIR also addresses certification of FAIR services by strengthening the network of trustworthy digital repositories. Outcomes will feed into the work of the EC’s EOSC FAIR Working Group and the Working Group Rules of Participation in the EOSC. These rules will guarantee an open, secure and cost-effective federated EOSC.

Activity 1:

This publication explores how open, FAIR, and research data management (RDM) connect: “The boundaries and intersections between RDM, FAIR and open cover important elements that risk being overlooked if we only focus on one concept.” Do you agree with this?

Thing 6 - EOSC for research domains

Research infrastructures (RIs) offer digital services at the domain level, for multiple domains, and/or community-wide. Many of them also provide training for early-career researchers on how to use their services.

Several RIs are successful in supporting cross-disciplinary work, for example the Digital Research Infrastructure for the Arts and Humanities (DARIAH). It aims to support transnational researchers in all phases of their work: data acquisition, analysis, publication and data archiving.

DARIAH meets the needs of arts and humanities researchers across Europe including the musicologist analysing digital recordings, the archaeologist digitally recreating ancient buildings, and the historian studying digitised texts to investigate how place names change over time. Cross-disciplinary collaboration also supports the growth of communities, like…

Activity 1:

An archaeological study from The Netherlands, carried out before the EOSC era, presents a nice example of cross-domain research: results are preserved in the 4TU ResearchData long-term repository for technical sciences and the DANS long-term repository, which catered to the social sciences and humanities at the time. Can you find the two datasets and the study?

Thing 7 - EOSC training

EOSC training can be several things:

Thing 8 - EOSC services

The EOSC-hub service catalogue is an obvious place to go to, as long as you’re aware that some services target researchers and research communities directly, while other services require administrator expertise. Examples include:

EOSC-building projects have started to jointly present their services. See for instance this use case about complying with open science ambitions and the GDPR: how best to manage and share person-related data? OpenAIRE’s Amnesia anonymization tool can help with removing identifying information from data.

Activity 1:

Have a look at B2FIND and Zenodo. Both are cross-domain repositories for research output. Consider their respective strengths; how can you benefit from using them?

Activity 2:

Watch this webinar on how to manage your data to make them open and FAIR.

Thing 9 - Scientific integrity and trust

EOSC aims to make science more open. Replicability of research is one important aspect of openness and is greatly improved if research data, software, and methods are explicit and publicly available. You enable fellow and future researchers to learn from what you have done.

The blog Retraction Watch reports on retractions of scientific papers and erroneous research data practices. But how do we prevent such cases? One method is by using trustworthy repositories, which help to make and to keep data FAIR:

“make”: by providing a persistent identifier, supporting metadata standards, supporting findability through their public catalogue, and providing clear licences.

“keep”: by preserving the data, documenting them, and keeping them usable in the long run (through sustainable formats, repositories).

In the Guidelines on FAIR Data Management in Horizon 2020, the European Commission states: “Where will the data and associated metadata, documentation and code be deposited? Preference should be given to certified repositories which support open access where possible.” (cited from the OpenAIRE initiative). One way to make sure that your data are stored at such a certified repository is to look at the CoreTrustSeal certification.

Core Trust Seal

Figure 2: CoreTrustSeal Certification Launched (Research Data Alliance, 11 September 2017)

EOSC aims to be a trusted environment for the storage, processing, and reuse of research data. FAIR tells us something about the research data themselves, but the data also need to be preserved somewhere trustworthy. So, trust and FAIR go hand in hand.
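
To make the “make” and “keep” points above a little more concrete, here is a minimal Python sketch of a pre-deposit check on a metadata record. It is only an illustration under stated assumptions: the field names (identifier, title, creators, license, format) and the example record are hypothetical and do not come from any particular repository or metadata schema, and passing this check does not by itself make a dataset FAIR.

```python
# Hypothetical field names for illustration only; real repositories
# define their own metadata schemas and requirements.
REQUIRED_FIELDS = {
    "identifier": "persistent identifier (for example a DOI)",
    "title": "descriptive metadata",
    "creators": "descriptive metadata",
    "license": "clear licence",
    "format": "sustainable file format",
}


def missing_fair_fields(record):
    """Return the required field names that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def has_doi(record):
    """Rough check that the identifier looks like a DOI (not a full validation)."""
    identifier = str(record.get("identifier", "")).lower()
    return identifier.startswith(("10.", "doi:10.", "https://doi.org/10."))


# Made-up record: the DOI and names below are placeholders, not real data.
example = {
    "identifier": "https://doi.org/10.1234/example",
    "title": "Excavation measurements",
    "creators": ["A. Researcher"],
    "license": "",  # left empty on purpose to show a gap
    "format": "text/csv",
}

for name in missing_fair_fields(example):
    print(f"Missing: {name} ({REQUIRED_FIELDS[name]})")
print("Identifier looks like a DOI:", has_doi(example))
```

In this example the check reports the missing licence. A trustworthy repository would typically take care of several of these aspects for you, which is part of why depositing there helps.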

Activity 1:

Watch this video tutorial about FAIR data in trustworthy repositories. Do you agree with the recommendations? And which of the CoreTrustSeal requirements do you think are most important?

Thing 10 - Open Science: the rest of the world

With all this “Thingking” about E-OSC, there are clearly no Schengen-like borders around it, so let’s look at similar examples beyond Europe:

Activity 1:

We’ve covered initiatives and services on some of the continents but not all of them. One continent where there is a lot happening regarding research data management is Australia. Can you find some of the research data initiatives and services happening down under?