Accessible

Overview

Teaching: 0 min
Exercises: 0 min

Questions

Key question

Objectives

Understand what a protocol is

Understand authentication protocols and their role in FAIR

Articulate the value of landing pages

Explain closed, open and mediated access to data

For data & software to be accessible:

A1. (meta)data are retrievable by their identifier using a standardized communications protocol
A1.1 the protocol is open, free, and universally implementable
A1.2 the protocol allows for an authentication and authorization procedure, where necessary
A2. metadata remain accessible, even when the data are no longer available

What is a protocol?

Simply put, it’s an access method of exchanging data over a computer network. Each protocol has its rules for how data is formatted, compressed, checked for errors. Research repositories often use the OAI-PMH or REST API protocols to interface with data in the repository. The following image from TutorialEdge.net: What is a RESTful API by Elliot Forbes provides a useful overview of how RESTful interfaces work:

TutorialEdge.net: What is a RESTful API? by Elliot Forbes

Zenodo offers a visual interface for seeing how formats such as DataCite XML will look like when requested for records such as the following record from the Biodiversity Literature Repository:

Formiche di Madagascar raccolte dal Sig. A. Mocquerys nei pressi della Baia di Antongil (1897-1898).

Wikipedia has a list of commonly used network protocols but check the service you are using for documentation on the protocols it uses and whether it corresponds with the FAIR Principles. For instance, see Zenodo’s Principles page.

Contributor information

Alternatively, for sensitive/protected data, if the protocol cannot guarantee secure access, an e-mail or other contact information of a person/data manager should be provided, via the metadata, with whom access to the data can be discussed. The DataCite metadata schema includes contributor type and name as fields where contact information is included. Collaborative projects such as THOR, FREYA, and ODIN are working towards improving the interoperability and exchange of metadata such as contributor information.

Author disambiguation and authentication

Across the research ecosystem, publishers, repositories, funders, research information systems, have recognized the need to address the problem of author disambiguation. The illustrative example below of the many variations of the name Jens Åge Smærup Sørensen demonstrations the challenge of wrangling the correct name for each individual author or contributor:

Jens Åge Smærup Sørensen

Thankfully, a number of research systems are now integrating ORCID into their authentication systems. Zenodo provides the login ORCID authentication option. Once logged in, your ORCID will be assigned to your authored and deposited works.

Exercise to create ORCID account and authenticate via Zenodo

Register for an ORCID.
You will receive a confirmation email. Click the link in the email to establish your unique 16-digit ORCID.
Go to Zenodo and select Log in (if you are new to Zenodo select Sign up).
Go to linked accounts and click the Connect button next to ORCID.

Next time you log into Zenodo you will be able to ‘Log in with ORCID’:

Zenodo login with ORCID and GitHub authentication

Understanding whether something is open, free, and universally implementable

ORCID features a principles page where we can assess where it lies on the spectrum of these criteria. Can you identify statements that speak to these conditions: open, free, and universally implemetable?

Answers:

ORCID is a non-profit that collects fees from its members to sustain its operations
Creative Commons CC0 1.0 Universal (CC0) license releases data into the public domain, or otherwise grants permission to use it for any purpose
It is open to any organization and transcends borders

Challenge Questions:

Where can you download the freely available data?
How does ORCID solicit community input outside of its governance?
Are the tools used to create, read, update, delete ORCID data open?

Tombstones, a very grave subject

There are a variety of reasons why a placeholder with metadata or tombstone of the removed research object exists including but not limited to staff removal, spam, request from owner, data center does not exist is still, etc. A tombstone page is needed when data and software is no longer accessible. A tombstone page communicates that the record is gone, why it is gone, and in case you really must know, there is a copy of the metadata for the record. A tombstone page should include: DOI, date of deaccession, reason for deaccession, message explaining the data center’s policies, and a message that a copy of the metadata is kept for record keeping purposes as well as checksums of the files. Zenodo offers us further explanation of the reasoning behind tombstone pages.

DataCite offers statistics where the failure to resolve DOIs after a certain number of attempts is reported (see DataCite statistics support pagefor more information). In the case of Zenodo and the GitHub issue above, the hidden field reveals thousands of records that are a result of spam.

DataCite Statistics Page

If a DOI is no longer available and the data center does not have the resources to create a tombstone page, DataCite provides a generic tombstone page.

See the following tombstone examples:

Zenodo tombstone: https://zenodo.org/record/1098445
Figshare tombstone: https://figshare.com/articles/Climate_Change/1381402

Discussion of tombstones

Key Points

First key point.

previous episode

Library Carpentry: FAIR Data and Software

next episode