Summary and Setup

This lesson is an introduction to programming in Python for library and information workers with little or no previous programming experience. It uses examples that are relevant to a range of library use cases, and is designed as a prerequisite for other Python lessons that will be developed in the future (e.g., web scraping, APIs). The lesson uses the JupyterLab computing environment and Python 3.

Prerequisites

  1. Learners need to understand what files and directories are and what a working directory is.

  2. Learners must install Python and JupyterLab, and download the dataset that will be used in the lesson, before the workshop begins.

Please see the setup instructions below for details.

Learning Objectives


After attending this training, participants will be able to:

  • Navigate the JupyterLab interface and run Python cells within a notebook.
  • Assign values to variables, identify data types, and display values in a Jupyter Notebook.
  • Create and manipulate lists in Python, including indexing, slicing, appending, and removing items to manage data collections effectively.
  • Call built-in Python functions, and use the help function to understand their usage and troubleshoot errors.
  • Import Python libraries such as pandas, load tabular data from CSV files, and perform basic data analysis.
  • Apply ‘for’ loops to iterate over collections, using the accumulator pattern to aggregate values and trace variable states to predict loop outcomes.
  • Manipulate pandas DataFrames to select data, calculate summary statistics, sort data, and save results in various formats, demonstrating basic data handling and analysis proficiency.
  • Write Python programs using conditional logic with ‘if’, ‘elif’, and ‘else’ statements, including Boolean expressions and compound conditions within loops.
  • Construct Python functions that encapsulate tasks, accept parameters, use local and global variables, and return values, making code more modular and readable.
  • Transform complex datasets into a tidy format using pandas functions like ‘melt()’ for reshaping, ‘groupby()’ for aggregation, and ‘to_datetime()’ for date handling. Address practical challenges and demonstrate the benefits of tidy data for analysis.
  • Create and customize data visualizations using pandas and Plotly, generating various plot types (line, area, bar, histogram) to analyze trends and draw insights from time-series data.
  • Prepare for advanced Python topics such as web scraping and APIs.

Installing Python Using Anaconda


Python is a popular language for research computing, and great for general-purpose programming as well. Installing all of its research packages individually can be a bit difficult, so we recommend Anaconda, an all-in-one installer.

Regardless of how you choose to install it, please make sure you install Python 3.6 or above. The latest 3.x version recommended on Python.org is fine.
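
One way to confirm which version you have is to ask Python itself. The following is a minimal sketch, assuming Python is already installed and you can start it. The helper name `meets_minimum` is made up for this illustration and is not part of the lesson materials.

```python
import sys

# Minimum version named in the setup instructions (an assumption taken
# from the text above, not from any installer).
MINIMUM = (3, 6)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


# Report the version of the Python interpreter running this code.
print("Python version:", sys.version.split()[0])
print("Meets the 3.6 minimum:", meets_minimum())
```

If the second line prints `False`, the interpreter you ran is older than the lesson expects, and installing a current release is likely the simplest fix.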

We will teach Python using JupyterLab, a programming environment that runs in a web browser (JupyterLab will be installed by Anaconda). For this to work you will need a reasonably up-to-date browser. The current versions of the Chrome, Safari and Firefox browsers are all supported (some older browsers, including Internet Explorer version 9 and below, are not).

JupyterLab


JupyterLab is part of the Jupyter family of tools, alongside Jupyter Notebook. Both provide interactive web environments where you can write and run Python code. If you installed Anaconda, JupyterLab is already installed on your system. If you did not install Anaconda, you can install JupyterLab on its own using conda, pip, or another package manager.
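
If you are unsure whether JupyterLab is available, a short check from Python can help. This is a sketch only: the function name `is_installed` is hypothetical, and the check only looks at the Python environment that runs it, which may differ from the one Anaconda set up.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the named top-level module."""
    # find_spec looks the module up without importing it.
    return importlib.util.find_spec(module_name) is not None


# "jupyterlab" is the import name of the JupyterLab package.
print("JupyterLab found:", is_installed("jupyterlab"))
```

A result of `False` does not necessarily mean the installation failed. It may mean that a different Python installation was used to run the check.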

Download the data


  1. Download this zip file and save it to your Desktop.
  2. Unzip the data.zip file, which should create a new folder called data.
  3. Create a new folder on your Desktop called lc-python and put the data folder in this folder.
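
The steps above are normally done by hand in your file manager. For those who prefer to script them, they could be sketched in Python roughly as follows. This assumes the archive was saved as data.zip on a Desktop folder inside your home directory (this can differ, for example on some Windows setups) and that the archive contains a top-level data folder. The function name `set_up_lesson_folder` is invented for this example.

```python
import zipfile
from pathlib import Path


def set_up_lesson_folder(zip_path, lesson_dir):
    """Extract zip_path into lesson_dir and return the CSV files found there."""
    lesson_dir = Path(lesson_dir)
    # Create the lesson folder if it does not exist yet.
    lesson_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(lesson_dir)
    # Search all subfolders so files inside data/ are included.
    return sorted(lesson_dir.rglob("*.csv"))


# Assumed locations; adjust these if your Desktop lives elsewhere.
desktop = Path.home() / "Desktop"
zip_path = desktop / "data.zip"

if zip_path.exists():
    for csv_file in set_up_lesson_folder(zip_path, desktop / "lc-python"):
        print(csv_file)
else:
    print("data.zip not found at", zip_path)
```

Extracting straight into lc-python combines steps 2 and 3, so the data folder ends up inside lc-python without a separate move.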

This lesson uses circulation data in multiple CSV files from the Chicago Public Library system. The data was compiled from records shared by the Chicago Public Library in the data.gov catalog. Please do not download the circulation data directly from data.gov, because the dataset you downloaded in the steps above has been altered for this lesson.