Introduction to Library Carpentry
Overview
Teaching: 30 min
Exercises: 0 min
Questions
How can The Carpentries & Library Carpentry help libraries meet the data and software needs of their communities and staff?
Objectives
Learn about campus trends in data science.
Understand data science challenges and opportunities for libraries.
Learn how The Carpentries & Library Carpentry work.
See what libraries are doing with The Carpentries & Library Carpentry.
Understand how you can get involved and about training at your institution.
Introduction to Library Carpentry – Teaching Data Science Skills
Who are you?
Start with your background, role at your institution, and what role (or desired role) you have in the Library Carpentry/Carpentries community.
Our aim is to help libraries become data and software savvy.
- Campus trends in data science
- Challenges for libraries
- Opportunities for libraries
- The Carpentries - How it works
- Library Carpentry - What libraries are doing
- How to get involved
- Training at your institution
As our ability to generate data grows, research and work in almost every domain now have a data and computational component, and a whole new field, data science, has emerged.
Libraries are guided by the needs of their communities…
- Surveys point to an increasing need from researchers for data & software skills
- Researchers do not receive the training they need in software best practices
- Many universities are integrating data science into the curriculum
- Industry is looking to hire more data savvy candidates
- Early career researchers are looking for career advancement opportunities
According to Barone L, Williams J and Micklos D. Unmet Needs for Analyzing Biological Big Data: A Survey of 704 NSF Principal Investigators (2017):
- Nearly 90% of BIO PIs said they are currently/will soon be analyzing large data sets
- Majority of PIs said their institutions are not meeting 9 of 13 needs
- Training on integration of multiple data types (89%), on data management and metadata (78%), and on scaling analysis to cloud/HPC (71%) were the 3 greatest unmet needs
- Data storage and HPC ranked lowest on their list of unmet needs
- The problem is the growing gap between the accumulation of big data—and researchers’ knowledge about how to use it effectively
In a survey of biology NSF PIs, the top 3 unmet needs are around training
Importance of research software & training
- 92% of academics use research software
- 69% say that their research would not be practical without it
- 56% develop their own software (worryingly, 21% of those have no training in software development)
S.J. Hettrick et al, UK Research Software Survey 2014 [Data set]. Zenodo. http://doi.org/10.5281/zenodo.14809
Academic institutions should provide and evolve a range of educational pathways to prepare students for an array of data science roles in the workplace.
National Academies of Sciences, Engineering, and Medicine. 2018. Data Science for Undergraduates: Opportunities and Options. Washington, DC: The National Academies Press. https://doi.org/10.17226/25104.
Moore-Sloan Data Science Environments
On our campuses, library spaces have been transformed—among other things—into campus centers for data science research, training, and services, with open floor plans and furnishings that are adaptable to a range of activities that promote and support data science research and learning.
“Creating Institutional Change in Data Science” Chronicles of Higher Ed, Mar 2018
Rise of data science initiatives in academia
From the Data Science Community Newsletter by Noren & Stenger:
Brigham Young University, Caltech, Carnegie Mellon, College of Charleston, Columbia, Cornell, Dartmouth UMass, George Mason University, Georgetown University, Georgia Tech, Harvard, Illinois Wesleyan University, Johns Hopkins, Mid America Nazarene University, MIT, Northeastern University, Northern Kentucky University, Northwestern, Northwestern College in Iowa, Ohio State University, Penn State University, Princeton, Purdue, Stanford, Tufts University, UC Berkeley, UC Davis, UC Irvine, UC Merced, UC Riverside, UC San Diego, UCLA, UIUC, University of Iowa, University of Michigan, University of Oregon, University of Pennsylvania, University of Rochester, University of San Francisco, University of Warwick, University of Washington, UT Austin, UW Madison, Vanderbilt University, Virginia Tech, Washington University in St. Louis, Middle Tennessee State University, NYU, Amherst College, Brown, CU Boulder, Duke, Illinois Institute of Technology, Lehigh University, Loyola University - Maryland, Rice University, SUNY at Stony Brook, UC Santa Barbara, UC Santa Cruz, UCSF, UMass Amherst, UNC - Wilmington, University of Vermont, University of Arizona, University of British Columbia, University of Chicago, University of Virginia, USC, Worcester Polytechnic, Yale
70 and counting…
Investing in America’s data science and analytics talent
- 69% of business leaders in the United States will prefer job applicants with data skills by 2021.
- 23% of college and university leaders say their graduates will have those skills.
April 2017 Business-Higher Education Forum (BHEF) report titled “Investing in America’s Data Science and Analytics Talent: The Case for Action.”
66% of the Data Carpentry workshop attendees are early career.
Analysis of Software and Data Carpentry’s Pre- and Post-Workshop Surveys https://doi.org/10.5281/zenodo.1325463
The Carpentries workshops and lesson materials address data and software needs.
Training in data science tools and approaches provides a path to better science in less time.
Our path to better science in less time using open data science tools
Reproducibility has long been a tenet of science but has been challenging to achieve—we learned this the hard way when our old approaches proved inadequate to efficiently reproduce our own work. Here we describe how several free software tools have fundamentally upgraded our approach to collaborative research, making our entire workflow more transparent and streamlined. By describing specific tools and how we incrementally began using them for the Ocean Health Index project, we hope to encourage others in the scientific community to do the same—so we can all produce better science in less time.
Lowndes, Julia S. Stewart, et al. “Our path to better science in less time using open data science tools.” Nature ecology & evolution 1.6 (2017): 160.
Challenges & opportunities for libraries
The Strategic Value of Library Carpentry & The Carpentries to Research Libraries
For libraries, organizing Data, Software, and Library Carpentry workshops and meet-ups has provided an excellent opportunity to connect with their community, understand their data and software needs, and grow their library services.
Reasons why people come to Library Carpentry…
- People working in library- and information-related roles come to Library Carpentry hoping to:
- Cut through the jargon terms and phrases
- Learn how to apply data science concepts in library tasks
- Identify and use best practices in data structures
- Learn how to programmatically clean and transform data
- Work effectively with researchers, IT, and systems colleagues
- Automate repetitive, error-prone tasks
38 mentions of The Carpentries in The Shifting to Data Savvy Report.
Demonstrated need from libraries
- Since 2013, roughly 120 Library Carpentry workshops have been held in 16 countries addressing software and data training needs from data cleaning in OpenRefine to versioning your work with Git.
- Update information above from https://librarycarpentry.org/upcoming_workshops/ and https://carpentries.org/upcoming_workshops/.
- Include information about your region/country/metro
Expanding Library Carpentry
The California Digital Library will advance the scope, adoption, and impact of the emergent “Library Carpentry” continuing education program… The training opportunities enabled by the project will provide librarians with the critical data and computational skills and tools they need to be effective digital stewards for their stakeholders and user communities.
See Institute of Museum and Library Services Grant RE-85-17-0121-17
On April 17, 2018, the California Digital Library welcomed Chris Erdmann, Library Carpentry Community and Development Director, to help grow the Library Carpentry effort.
Growing Library Carpentry involves:
- Development and updates of core training modules optimized for the librarian community, based on Carpentries pedagogy
- Regionally-organized training opportunities for librarians, leading to an expanding cohort of certified instructors available to train fellow librarians in critical skills and tools, such as the command line, OpenRefine, Python, R, SQL, and research data management
- Community outreach to raise awareness of Library Carpentry and promote the development of a broad, engaged community of support to sustain the movement and to advance LC integration within the newly forming Carpentries organization
Involving community members has been key
- Developing Data Skills at Macquarie University Library: Drawing on Library Carpentry lessons, pedagogy and community. https://librarycarpentry.org/blog/2019/06/developing-data-skills/
- Building a Community for Digital Literacy at ZB MED: The Carpentries and HackyHours. A place where everyone can come together to share topics and learn from each other
- New England Libraries Team Up to Become Carpentries Members: Developing the New England Software Carpentry Library Consortium and a Community of Practice
- University of Oregon Libraries and Oregon State University Libraries Team Up to Teach First Library Carpentry Workshop in Oregon: UO and OSU Library Carpentry report
Software, Data, and Library Carpentry at a glance
- The Carpentries: Building skills and community
- Non-profit teaching data science skills for more effective work and career development
- Training ‘in the gaps’ that is accessible, approachable, aligned and applicable (the practical skills you need in your work)
- Volunteer instructors, peer-led hands-on intensive workshops
- Open and collaborative lesson materials
- Creating and supporting community, local capacity for teaching and learning these skills and perspectives
- 2 days, active learning
- Feedback to learners throughout the workshop
- Trained, certified instructors
- Friendly learning environment (Code of Conduct)
Workshops are 2-day, hands-on, and interactive. They offer a friendly learning environment (Code of Conduct) and teach the foundational skills and perspectives for working with software and data.
- Our workshops.
- Our learners.
- Our 2018 Annual Report (which includes data/highlights).
Focus of Data, Software, and Library Carpentry
- Data Carpentry workshops are domain-specific, and focus on teaching skills for working with data effectively and reproducibly.
- Software Carpentry workshops are domain-agnostic, and teach the Unix Shell, coding in R or Python, and version control using Git (i.e. Research software workflows).
- Library Carpentry workshops focus on teaching software and data skills for people working in library- and information-related roles. The workshops are domain-agnostic though datasets used will be familiar to library staff (i.e. Automating workflows, data cleaning, outreach to researchers/IT).
The Carpentries workshop goals
- Teach skills
- Get people started and introduce them to what’s possible
- Build confidence in using these skills
- Encourage people to continue learning
- Positive learning experience
The goals of the workshop aren’t just to teach the skills, but also to build self-efficacy, increase confidence, and create a positive learning experience. We know we can’t teach everything in two days, but we want to teach the foundational skills, get people started, and give them the confidence to continue learning. Many people have had demotivating experiences when learning things like coding or computational skills, and we want to change that perspective.
Educational pedagogy is the focus of the Instructor Training program. The following steps are required to become a certified Instructor who can teach all Carpentries lessons:
- 2 days of online training in the pedagogy
- Suggest a change to a lesson on GitHub
- 1-hr online discussion on running/teaching workshops
- Online teaching demo
More information: http://carpentries.github.io/instructor-training/
- We have an active community of lesson contributors and Maintainers that improve our 9 Library Carpentry lessons on a daily basi…
- What have the Library Carpentry Maintainers been working on? An update on what the Library Carpentry Maintainers have been working on since January 2019.
- News from the Library Carpentry Maintainer Community and Curriculum Advisory Committee: An update on Library Carpentry lesson development
- Community of people excited about software and data skills and about sharing them with others
- Mentoring program and instructor onboarding
- Discussion groups and community calls
- Email lists
- Social media, chat channels
- Teaching at other institutions
- Lesson development and maintenance
Short- and long-term surveys show that people are learning the skills, putting them into practice, and gaining confidence in their ability to do computational work. See The Carpentries January 2018 long-term survey report.
We also see researchers writing about the impact Carpentries training and approaches have had in their workflows:
Yenni, G. M., Christensen, E. M., Bledsoe, E. K., Supp, S. R., Diaz, R. M., White, E. P., & Ernest, S. M. (2019). Developing a modern data workflow for regularly updated data. PLoS Biology, 17(1), e3000125. https://doi.org/10.1371/journal.pbio.3000125
Library Carpentry core objectives
Library Carpentry workshops teach people working in library- and information-related roles how to:
- Cut through the jargon terms and phrases of software development and data science and apply concepts from these fields in library tasks;
- Identify and use best practices in data structures;
- Learn how to programmatically transform and map data from one form to another;
- Work effectively with researchers, IT, and systems colleagues;
- Automate repetitive, error-prone tasks.
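As a rough illustration of what "programmatically transform and map data" and "automate repetitive, error-prone tasks" can mean in a library setting, here is a minimal Python sketch. It is not taken from a Library Carpentry lesson. The sample catalogue export, its column names, and the helper names `clean_record` and `clean_catalogue` are all invented for this example, and it uses only the Python standard library.

```python
import csv
import io

# Invented sample data: a small catalogue export with inconsistent
# spacing and ISBN punctuation (titles and ISBNs are made up).
RAW_CSV = """title,isbn,year
  Introduction  to   Cataloguing ,978-0-00-000000-2,2015
Metadata basics,9780000000019 , 2018
"""


def clean_record(row):
    """Return a tidied copy of one catalogue row (hypothetical helper)."""
    return {
        # Collapse runs of whitespace and trim both ends.
        "title": " ".join(row["title"].split()),
        # Keep only the digits (and a possible X check character).
        "isbn": "".join(ch for ch in row["isbn"] if ch.isdigit() or ch in "Xx").upper(),
        # Map the year from text to an integer.
        "year": int(row["year"].strip()),
    }


def clean_catalogue(text):
    """Parse CSV text and clean every row (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    return [clean_record(row) for row in reader]


records = clean_catalogue(RAW_CSV)
for record in records:
    print(record)
```

The same function could be applied to thousands of rows, which is where scripting tends to pay off compared with fixing each record by hand.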
How to get involved
How can I get started? Contribute to a lesson.
All of our lessons are CC-BY and hosted on GitHub at https://github.com/LibraryCarpentry. Anyone can contribute!
How can I get started? Host, help, teach.
Request a workshop:
https://amy.carpentries.org/forms/workshop/ (cost is 2500 USD to support instructor travel)
Find a workshop in your area to attend/help with/observe:
https://carpentries.org (scroll down the page)
Apply to be an Instructor:
Become a member
- How can I get started? Become a member.
- Find out more https://carpentries.org/membership/ and/or reach out to firstname.lastname@example.org
The Carpentries & Library Carpentry websites
Connect with The Carpentries
- The Carpentries Slack, email lists, Twitter, newsletter… https://carpentries.org/connect/
- Library Carpentry Slack, email list, Gitter, Twitter… https://librarycarpentry.org/contact/
What is your library/network doing to meet the data science needs of your community?
- Have you surveyed the needs of your community/library staff/members?
- What training programs have you taken/plan to take?
- If you have taken training programs, how have you incorporated what you have learned into your programs and services?
- Your contact info
- Contact The Carpentries https://carpentries.org/connect/
Library Carpentry helps libraries…
Cut through the jargon terms and phrases of software development and data science and apply concepts from these fields in library tasks.
Identify and use best practices in data structures.
Learn how to programmatically transform and map data from one form to another.
Work effectively with researchers, IT, and systems colleagues.
Automate repetitive, error-prone tasks.