In an ideal world, everyone would have access to all research outcomes and available knowledge, and could use it, build upon it, and expand it to serve societal goals as well as private and personal interests. One way to get there is through the EOSC, but what exactly is it? Find out through these ten Things!
EOSC is funded through the Horizon 2020 (H2020) programme and involves 40 countries. It offers 1.7 million European researchers and 70 million professionals in science, technology, the humanities and social sciences a virtual environment with open and seamless services for the storage, management, analysis and re-use of research data across borders and scientific disciplines. It does so by federating existing scientific data infrastructures, which are currently dispersed across disciplines and EU Member States.
Watch this 5-minute video from the Open Science MOOC on “An introduction to the European Open Science Cloud (EOSC)” by Jean-Claude Burgelman.
Familiarise yourself with the key EOSC concepts presented in the figure below.
Figure 1: European Open Science Cloud becomes a reality (European Commission, 23 November 2018)
Buzzwords can’t be completely avoided, but we will attempt to clarify and scope a handful of them for the remainder of this Top 10:
EOSC: right, this is the European Open Science Cloud. However,
FAIR: this acronym stands for Findable, Accessible, Interoperable and Reusable. The 15 FAIR principles (see the next Thing) provide guidance on making your research output more machine-actionable. Since the principles were published in 2016, they have gained worldwide traction as different scientific disciplines try to operationalise them and measure the FAIRness level of data, software and other related concepts.
Communication-wise, “FAIR” has been brilliant. However, some argue that other essential research aspects are missing. For instance, check out the plea for Responsible Data Science, which calls for ensuring Fairness, Accuracy, Confidentiality and Transparency (FACT). Form your own opinion on these acronyms: do you think they are helpful?
Infrastructure: while not really a buzzword, this term is frequently used and conveniently vague. Think of it as the basic systems and services that an organization uses in order to work effectively. Keep in mind that an infrastructure can - or even should? - include human experts and support staff. For example, to make this explicit the OpenAIRE project calls itself a socio-technical infrastructure.
The FAIR Data Guiding Principles - for Findable, Accessible, Interoperable and Reusable data - came into existence during a workshop in Leiden in 2014 where a broad range of stakeholders in the field of research data management and stewardship came together to discuss the improvement of the reusability of research data. They published the first paper on these principles in 2016.
The principles have become an indispensable part of improving research data and software for the community. According to the 2018 report Cost of not having FAIR research data, the absence of FAIR research data costs the European economy at least €10.2 billion per year (!). FAIR is equally important to EOSC, where it has been taken up by projects such as FAIRsFAIR, FAIRplus and the ESFRI projects.
EOSC facilitates open science through “vertical” or “horizontal” infrastructures:
Such RIs have been around for forty years, and this video sketches the interaction between ESFRI and EOSC. You can call RIs discipline-specific or “vertical”, as opposed to…
… develop and provide digital services which are cross-disciplinary. The expertise these infrastructures share and the services they offer are generic or “horizontal”.
For another example of an e-Infrastructure, have a look at the Circulation of Knowledge and Learned Practices in the 17th-century Dutch Republic and browse through 20,000 letters that were written by and sent to 17th-century scholars who lived in the Dutch Republic. Dutch history is at your fingertips thanks to e-infrastructure.
In early 2017, the European Commission (EC) embraced the FAIR principles. An innovative element of the Horizon 2020 grant scheme at the time was (and still is, in 2019) an Open Research Data Pilot, asking funded projects to make the data underpinning their publications available or “Open”. As EC representatives put it in April 2017: “We are now seeing openness as one component of FAIR data and aim to address all of the FAIR aspects in Horizon 2020”.
The FAIRsFAIR project contributes to the implementation of these recommendations. For instance, two work packages and a competence framework with several trainings address the skills-related recommendations. This is done by linking with other parties active in open science and FAIR such as:
FAIRsFAIR also addresses certification of FAIR services by strengthening the network of trustworthy digital repositories. Outcomes will feed into the work of the EC’s EOSC FAIR Working Group and the Working Group Rules of Participation in the EOSC. These rules are intended to guarantee an open, secure and cost-effective federated EOSC.
This publication explores how open, FAIR, and research data management (RDM) connect: “The boundaries and intersections between RDM, FAIR and open cover important elements that risk being overlooked if we only focus on one concept.” Do you agree with this?
Research infrastructures (RIs) offer digital services at the domain level, for multiple domains, and/or community-wide. Many of them also provide training for early-career researchers on how to use their services.
Several RIs are successful in supporting cross-disciplinary work, for example the Digital Research Infrastructure for the Arts and Humanities (DARIAH). It aims to support transnational researchers in all phases of their work: data acquisition, analysis, publication and data archiving.
DARIAH meets the needs of arts and humanities researchers across Europe including the musicologist analysing digital recordings, the archaeologist digitally recreating ancient buildings, and the historian studying digitised texts to investigate how place names change over time. Cross-disciplinary collaboration also supports the growth of communities, like…
An archaeological study from the Netherlands, carried out before the EOSC era, presents a nice example of cross-domain research: its results are preserved both in the 4TU ResearchData long-term repository for technical sciences and in the DANS long-term repository, which catered to the social sciences and humanities at the time. Can you find the two datasets and the study?
EOSC training can be several things:
The EOSC-hub service catalogue is an obvious place to start, as long as you’re aware that some services target researchers and research communities directly, while other services require administrator expertise. Examples include:
EOSC-building projects have started to jointly present their services. See for instance this use case about complying with open science ambitions and the GDPR: how best to manage and share person-related data? OpenAIRE’s Amnesia anonymization tool can help with removing identifying information from data.
Watch this webinar on how to manage your data to make them open and FAIR.
EOSC aims to make science more open. Replicability of research is one important aspect of openness, and it is greatly improved if research data, software, and methods are explicit and publicly available. This enables fellow and future researchers to learn from what you have done.
The blog Retraction Watch reports on retractions of scientific papers and erroneous research data practices. But how do we prevent such cases? One method is by using trustworthy repositories, which help to make and to keep data FAIR:
“make”: by providing a persistent identifier, supporting metadata standards, supporting findability through their public catalogue, providing clear licences.
“keep”: by preserving the data, documenting them, and keeping them usable in the long run (through sustainable formats, repositories).
In the Guidelines on FAIR Data Management in Horizon 2020, the European Commission states: “Where will the data and associated metadata, documentation and code be deposited? Preference should be given to certified repositories which support open access where possible.” (cited from the OpenAIRE initiative). One way to make sure that your data are stored at such a certified repository is to look for the CoreTrustSeal certification.
Figure 2: CoreTrustSeal Certification Launched (Research Data Alliance, 11 September 2017)
EOSC aims to be a trusted environment for the storage, processing, and reuse of research data. FAIR tells us something about the research data themselves, but those data also need to be preserved somewhere trustworthy. So, trust and FAIR go hand in hand.
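To make the “make” criteria above a little more concrete, here is a minimal Python sketch that checks a dataset description for a persistent identifier, a metadata standard, a catalogue entry and a licence. It is an illustration only: the field names (`identifier`, `metadata_standard`, `catalogue_url`, `licence`), the example record and the simplified DOI pattern are all assumptions made for this example, not the schema of any particular repository and not a CoreTrustSeal check.

```python
import re

# Simplified DOI shape ("10.<registrant>/<suffix>"); an assumption for
# illustration, real identifier validation is stricter and repository-specific.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def check_make_fair(record: dict) -> dict:
    """Return a pass/fail flag for each 'make' aspect named in the text.

    The keys read from `record` are hypothetical field names.
    """
    identifier = record.get("identifier", "")
    return {
        "persistent_identifier": bool(DOI_PATTERN.match(identifier)),
        "metadata_standard": bool(record.get("metadata_standard")),
        "listed_in_catalogue": bool(record.get("catalogue_url")),
        "clear_licence": bool(record.get("licence")),
    }


# Made-up example record; none of these values refer to a real dataset.
example = {
    "identifier": "10.1234/example-dataset",
    "metadata_standard": "DataCite",
    "catalogue_url": "https://repository.example.org/datasets/42",
    "licence": "CC-BY-4.0",
}

report = check_make_fair(example)
for aspect, ok in report.items():
    print(f"{aspect}: {'ok' if ok else 'missing'}")
```

A checklist like this only covers what a depositor can verify up front; the “keep” side (preservation, documentation, sustainable formats) depends on the repository itself, which is where certification comes in.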
Watch this video tutorial about FAIR data in trustworthy repositories. Do you agree with the recommendations? And which of the CoreTrustSeal requirements do you think are most important?
With all this “Thingking” about E-OSC, there are clearly no Schengen-like borders around it, so let’s look at similar examples beyond Europe:
We’ve covered initiatives and services on some of the continents but not all of them. One continent where there is a lot happening regarding research data management is Australia. Can you find some of the research data initiatives and services happening down under?