# Data Intro for Archivists: Reference

## Key Points

### Introduction to Library Carpentry

• Don’t be scared to ask for help
• Don’t think you work with data? We all have data and it is not just enough to put it into a system and forget about it

### Foundations

• Data are used in research
• Archival collections and archival description are data
• Data structures should be consistent and predictable
• Consider the standards and structures used in your own data
• Identify and use computational methods in your work
• Identify how standards and structures can be used in research

### Regular Expressions

• Regular expressions are powerful tools for pattern matching

### Introduction to Data - Multiple Choice Quiz

• Regular expressions reference guide

### Introduction to Data - Multiple Choice Quiz (answers)

• Regular expressions answer sheet

# Regular Expressions Cheat Sheet

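• `[]` defines a set or range of characters to match. For example, `[abc]` matches “a”, “b”, or “c”, and `[a-z]` matches any lowercase letter.
• `.` matches any single character (in most regex engines, any character except a newline)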
• `\` is used to escape the following character when that character is a special character. So, for example, a regular expression that found `.com` would be `\.com` because `.` is a special character that matches any character.
• `\d` matches any single digit
• `\w` matches any word character (equivalent to `[A-Za-z0-9_]`)
• `\s` matches any space, tab, or newline
• `^` asserts the position at the start of the line. So whatever you put after it will only match if it appears at the start of a line.
• `$` asserts the position at the end of the line. So whatever you put before it will only match if it appears at the end of a line.
• `\b` adds a word boundary. Putting this on either side of a word stops the regular expression from matching longer variants of that word.
• `*` matches the preceding element zero or more times. For example, `ab*c` matches “ac”, “abc”, “abbbc”, etc.
• `+` matches the preceding element one or more times. For example, `ab+c` matches “abc”, “abbbc” but not “ac”.
• `?` matches when the preceding character appears one or zero times
• `{VALUE}` matches the preceding character the number of times defined by VALUE; ranges can be specified with the syntax `{VALUE,VALUE}`
• `|` means or
• Check your regex with: regex101 https://regex101.com/, regexper http://regexper.com/, or myregexp http://myregexp.com/
• Test yourself with: Regex Crossword https://regexcrossword.com/ or our Multiple Choice Quiz http://data-lessons.github.io/library-data-intro/05-quiz/
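
The entries in this cheat sheet can be tried out in any regex tester, or in a few lines of code. The following is a minimal sketch using Python's built-in `re` module. The sample strings and variable names are invented for illustration and are not part of the lesson. Other regex engines may differ in small details; in Python, for instance, `^` and `$` only match at every line when the `re.MULTILINE` flag is set.

```python
import re

# Hypothetical sample data, invented purely for illustration.
text = "Catalog and catalogue records from 1999 to 2005 are at example.com"
lines = "Box 1\nFolder 2\nBox 3"

# \. is an escaped, literal dot; an unescaped . would match any character.
literal_dot = re.findall(r"\.com", text)              # ['.com']

# \d{4} is exactly four digits; \b on each side stops matches inside longer runs.
years = re.findall(r"\b\d{4}\b", text)                # ['1999', '2005']

# \b also keeps "catalog" from matching inside the longer word "catalogue".
whole_word = re.findall(r"\bcatalog\b", text, flags=re.IGNORECASE)  # ['Catalog']

# ? makes the preceding character optional: "color" or "colour".
spellings = re.findall(r"colou?r", "color colour colouur")

# * allows zero or more of the preceding element, + requires at least one.
samples = ["ac", "abc", "abbbc"]
star = [s for s in samples if re.fullmatch(r"ab*c", s)]   # all three match
plus = [s for s in samples if re.fullmatch(r"ab+c", s)]   # "ac" is excluded

# ^ and $ anchor to line starts and ends; Python needs re.MULTILINE
# for them to apply to every line of a multi-line string.
starts = re.findall(r"^Box", lines, flags=re.MULTILINE)   # ['Box', 'Box']
ends = re.findall(r"\d$", lines, flags=re.MULTILINE)      # ['1', '2', '3']

# | means "or"; [] lists the allowed characters; \w includes the underscore.
either = re.findall(r"Box|Folder", lines)
capitals = re.findall(r"[A-Z]", lines)
words = re.findall(r"\w+", "file_name-01")                # ['file_name', '01']

# \s+ matches any run of spaces, tabs, or newlines.
fields = re.split(r"\s+", "a  b\tc\nd")                   # ['a', 'b', 'c', 'd']

# {2,3} matches the preceding element two to three times.
runs = re.findall(r"\d{2,3}", "1 22 333 4444")            # ['22', '333', '444']
```

Note that `\d{2,3}` finds “444” inside “4444”. Wrapping the pattern in `\b` on both sides, as in the `years` example, would restrict it to runs that are exactly two or three digits long.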