Content from Getting Started
Last updated on 2023-05-08 | Edit this page
Overview
Questions
- How do I use the Spyder IDE?
- How can I run Python programs?
Objectives
- Learners can launch the Spyder IDE
- Learners are able to use the IPython console to interact with Python
- Learners are able to write code in the Spyder editor and run this code
- Learners are able to save their code in a *.py file
- Learners can use the different buttons and panels needed in the Spyder IDE
Use the Spyder IDE for editing and running Python.
- The Anaconda package manager is an automated way to install the Spyder IDE.
  - See the setup instructions for Anaconda installation instructions.
  - It also installs all the extra libraries it needs to run.
- Once you have installed Python and the Spyder IDE requirements, open a shell and type:
This will start the Spyder IDE.
This environment has several useful tools we can use, which you can see in different panels in the Spyder IDE. We will look into some of them.
You can change the positions and sizes of these panels to your preference, as you get to know them.
Different ways of interacting with Python using Spyder
- On the left, filling half of the screen, is the editor. Here you can write and edit code, which can then be saved in a file (usually with a .py extension). We can run the code we wrote here by pressing the green ‘play’ button on top or by pressing F5 on the keyboard.
- On the bottom right, we find the IPython console. This is where we can talk directly to Python. It will interpret what you have typed as soon as you press Enter.
Python in the console
Python in the editor
The large panel on the left probably has some text in it that looks like this:
"""
Spyder Editor
This is a temporary script file.
"""
Write the following line below these lines and press run (the green ‘play’ button or F5). A window might pop up asking you to specify the run settings; leave the settings as they are and press ‘Run’. What happens?
- Code written in the editor can be saved, like any other file.
Content from Variables and Assignment
Overview
Questions
- How can I store data in programs?
Objectives
- Write programs that assign values to variables and perform calculations with those values.
- Correctly trace value changes in programs that use assignment.
Use variables to store values.
- Variables are names for values.
- In Python the = symbol assigns the value on the right to the name on the left.
- The variable is created when a value is assigned to it.
- Here, Python assigns an age to a variable age and a name in quotation marks to a variable first_name.
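A minimal sketch of the assignment just described. The values 42 and 'Ahmed' are assumptions, chosen to match the output printed later in this episode.

```python
# Assign a number to the name age and a quoted string to the name first_name.
age = 42
first_name = 'Ahmed'
```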
- Variable names:
- cannot start with a digit
- cannot contain spaces, quotation marks, or other punctuation
- may contain an underscore (typically used to separate words in long variable names)
- Underscores at the start like __alistairs_real_age have a special meaning so we won’t do that until we understand the convention.
Use print to display values.
- Python has a built-in function called print that prints things as text.
- Call the function (i.e., tell Python to run it) by using its name.
- Provide values to the function (i.e., the things to print) in parentheses.
- To add a string to the printout, wrap the string in single quotation marks.
- The values passed to the function are called ‘arguments’.
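A call along these lines would produce the output in this section. The variable values are assumed to be the ones assigned earlier in the episode.

```python
first_name = 'Ahmed'   # assumed value
age = 42               # assumed value

# print receives four arguments and separates them with single spaces.
print(first_name, 'is', age, 'years old')
```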
OUTPUT
Ahmed is 42 years old
- print automatically puts a single space between items to separate them.
- And wraps around to a new line at the end.
Variables must be created before they are used.
- If a variable doesn’t exist yet, or if the name has been misspelled, Python reports an error.
- Unlike some languages, which “guess” a default value.
ERROR
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-1-c1fbb4e96102> in <module>()
----> 1 print(eye_color)
NameError: name 'eye_color' is not defined
- The last line of an error message is usually the most informative.
- We will look at error messages in detail later.
Variables Persist Between Cells
Variables defined in one cell exist in all other cells once executed, so the relative location of cells in the notebook does not matter (i.e., cells lower down can still affect those above). Remember: Notebook cells are just a way to organize a program: as far as Python is concerned, all of the source code is one long set of instructions.
Variables can be used in calculations.
- We can use variables in calculations just as if they were values.
- Remember, we assigned 42 to age a few lines ago.
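As a sketch, a calculation using the variable might look like this. The name age_in_three_years is hypothetical and only used for illustration.

```python
age = 42                        # value assigned earlier in the episode
age_in_three_years = age + 3    # the variable is used just like the value 42
print('Age in three years:', age_in_three_years)
```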
OUTPUT
Age in three years: 45
Use an index to get a single character from a string.
- The characters (individual letters, numbers, and so on) in a string are ordered. For example, the string ‘AB’ is not the same as ‘BA’. Because of this ordering, we can treat the string as a list of characters.
- Each position in the string (first, second, etc.) is given a number. This number is called an index or sometimes a subscript.
- Indices are numbered from 0 rather than 1.
- Use the position’s index in square brackets to get the character at that position.
OUTPUT
h
Use a slice to get a substring.
- A part of a string is called a substring. A substring can be as short as a single character.
- An item in a list is called an element. Whenever we treat a string as if it were a list, the string’s elements are its individual characters.
- A slice is a part of a string (or, more generally, any list-like thing).
- We take a slice by using [start:stop], where start is replaced with the index of the first element we want and stop is replaced with the index of the element just after the last element we want.
- Mathematically, you might say that a slice selects the half-open interval [start, stop).
- The difference between stop and start is the slice’s length.
- Taking a slice does not change the contents of the original string. Instead, the slice is a copy of part of the original string.
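The indexing and slicing rules above can be tried out with a short sketch. The string 'sodium' is an assumption, picked because its first three characters match the output in this section.

```python
element = 'sodium'          # assumed example string

first_three = element[0:3]  # characters at indices 0, 1 and 2
print(element[0])           # a single character, selected by index
print(first_three)          # a substring, selected by slice
print(len(first_three))     # stop - start = 3 - 0
print(element)              # the original string is unchanged
```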
OUTPUT
sod
Use the built-in function len to find the length of a string.
OUTPUT
6
- Nested functions are evaluated from the inside out, just like in mathematics.
Python is case-sensitive.
- Python thinks that upper- and lower-case letters are different, so Name and name are different variables.
- There are conventions for using upper-case letters at the start of variable names so we will use lower-case letters for now.
Use meaningful variable names.
- Python doesn’t care what you call variables as long as they obey the rules (alphanumeric characters and the underscore).
- Use meaningful variable names to help other people understand what the program does.
- The most important “other person” is your future self.
PYTHON
x = 1.0     # x->1.0
y = 3.0     # x->1.0 y->3.0
swap = x    # x->1.0 y->3.0 swap->1.0
x = y       # x->3.0 y->3.0 swap->1.0
y = swap    # x->3.0 y->1.0 swap->1.0
The last three lines exchange the values in x and y using the swap variable for temporary storage. This is a fairly common programming idiom.
minutes is better because min might mean something like “minimum” (and actually does in Python, but we haven’t seen that yet).
OUTPUT
library_name[1:3] is: oc
- It will slice the string, starting at the low index and ending an element before the high index
- It will slice the string, starting at the low index and stopping at the end of the string
- It will slice the string, starting at the beginning of the string, and ending an element before the high index
- It will print the entire string
- It will slice the string, starting at the number index, and ending a distance of the absolute value of negative-number elements from the end of the string
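The five slice forms described above can be checked with a short sketch. The value 'social sciences' is an assumption: it is one string for which library_name[1:3] gives 'oc', as in the output above.

```python
library_name = 'social sciences'   # assumed value

print(library_name[1:3])    # low to just before high
print(library_name[7:])     # low to the end of the string
print(library_name[:6])     # beginning to just before high
print(library_name[:])      # the entire string
print(library_name[1:-3])   # index 1 up to three characters from the end
```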
Key Points
- Use variables to store values.
- Use print to display values.
- Variables persist between cells.
- Variables must be created before they are used.
- Variables can be used in calculations.
- Use an index to get a single character from a string.
- Use a slice to get a substring.
- Use the built-in function len to find the length of a string.
- Python is case-sensitive.
- Use meaningful variable names.
Content from Data Types and Type Conversion
Overview
Questions
- What kinds of data do programs store?
- How can I convert one type to another?
Objectives
- Explain key differences between integers and floating point numbers.
- Explain key differences between numbers and character strings.
- Use built-in functions to convert between integers, floating point numbers, and strings.
Every value has a type.
- Every value in a program has a specific type.
- Integer (int): whole numbers like 3 or -512.
- Floating point number (float): fractional numbers like 3.14159 or -2.5.
  - Whole numbers may also be stored as floats, e.g. 1.0, but 1.0 would still be stored as a float.
- Character string (usually called “string”, str): text.
  - Written in either single quotes or double quotes (as long as they match).
  - The quotation marks aren’t printed using print(), but may appear when viewing a value in the Jupyter Notebook or other Python interpreter.
Use the built-in function type to find the type of a value.
- Use the built-in function type to find out what type a value has.
- This works on variables as well.
  - But remember: the value has the type; the variable is just a label.
- When you change the value of a variable to a new data type, the results of print(type(your_variable)) will change accordingly.
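A small sketch of these points. The variable name fitness and its values are hypothetical.

```python
print(type(52))         # the type of a literal value

fitness = 'average'     # hypothetical variable holding a string
print(type(fitness))

fitness = 3.5           # the same name now labels a float value
print(type(fitness))    # the reported type follows the value, not the name
```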
OUTPUT
<class 'int'>
OUTPUT
<class 'str'>
Types control what operations (or methods) can be performed on a given value.
- A value’s type determines what the program can do to it.
OUTPUT
2
ERROR
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-2-67f5626a1e07> in <module>()
----> 1 print('hello' - 'h')
TypeError: unsupported operand type(s) for -: 'str' and 'str'
You can use the + and * operators on strings.
- “Adding” character strings concatenates them.
OUTPUT
Ahmed Walsh
- Multiplying a character string by an integer N creates a new string that consists of that character string repeated N times.
- Since multiplication is repeated addition.
- There are more ways that traditional math operators will work on other data types. There isn’t a perfect formula for figuring out what they do, so experimentation is valuable.
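Both operators can be tried with a sketch like the following. The variable names are assumptions, and the values are chosen to match the outputs in this section.

```python
full_name = 'Ahmed' + ' ' + 'Walsh'   # concatenation with +
separator = '=' * 10                  # repetition with *

print(full_name)
print(separator)
print(len(full_name))                 # the space counts as a character
```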
OUTPUT
==========
Strings have a length (but numbers don’t).
- The built-in function len counts the number of characters in a string.
OUTPUT
11
- But numbers don’t have a length (not even zero).
ERROR
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-3-f769e8e8097d> in <module>()
----> 1 print(len(52))
TypeError: object of type 'int' has no len()
Must convert numbers to strings or vice versa when operating on them.
- Cannot add numbers and strings.
ERROR
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-fe4f54a023c6> in <module>()
----> 1 print(1 + '2')
TypeError: unsupported operand type(s) for +: 'int' and 'str'
- Not allowed because it’s ambiguous: should 1 + '2' be 3 or '12'?
- Some types can be converted to other types by using the type name as a function.
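The two possible readings of 1 + '2' can each be spelled out with an explicit conversion. This is a minimal sketch; the variable names are hypothetical.

```python
as_number = 1 + int('2')   # convert the string to an integer, then add
as_text = str(1) + '2'     # convert the integer to a string, then concatenate

print(as_number)
print(as_text)
```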
OUTPUT
3
12
Can mix integers and floats freely in operations.
- Integers and floating-point numbers can be mixed in arithmetic.
- Python automatically converts integers to floats as needed.
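A sketch of mixed arithmetic, with values assumed to match the output in this section:

```python
half = 1 / 2.0             # the integer 1 is converted to a float
three_squared = 3.0 ** 2   # the integer exponent works with a float base

print('half is', half)
print('three squared is', three_squared)
```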
OUTPUT
half is 0.5
three squared is 9.0
Variables only change value when something is assigned to them.
- If we make one cell in a spreadsheet depend on another, and update the latter, the former updates automatically.
- This does not happen in programming languages.
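The contrast with a spreadsheet can be seen in a few lines. The starting value of first is an assumption consistent with the output in this section.

```python
first = 1
second = 5 * first   # computed once, from the value first has right now
first = 2            # reassigning first does not recompute second

print('first is', first, 'and second is', second)
```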
OUTPUT
first is 2 and second is 5
- The computer reads the value of first when doing the multiplication, creates a new value, and assigns it to second.
- After that, second does not remember where it came from.
Choose a Type
What type of value (integer, floating point number, or character string) would you use to represent each of the following? Try to come up with more than one good answer for each problem. For example, in # 1, when would counting days with a floating point variable make more sense than using an integer?
- Number of days since the start of the year.
- Time elapsed since the start of the year.
- Call number of a book.
- Standard book loan period.
- Number of reference queries in a year.
- Average library classes taught per semester.
- Integer
- Float
- String
- Integer
- Integer
- Float
Division Types
There are three different types of division:
- ‘Normal’ division (aka floating-point division) is what most people may be familiar with: 5 / 2 = 2.5
- Floor division, which cuts out the part after the period: 5 // 2 = 2
- Modulo division, which only keeps the remainder after division: 5 % 2 = 1
In Python 3, the / operator performs floating-point division, the // operator performs floor division, and the % (or modulo) operator calculates the modulo division:
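The three operators can be compared side by side. The variable names here are hypothetical.

```python
floor_result = 5 // 3     # floor division
float_result = 5 / 3      # floating-point division
modulo_result = 5 % 3     # remainder after division

print('5 // 3:', floor_result)
print('5 / 3:', float_result)
print('5 % 3:', modulo_result)
```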
OUTPUT
5 // 3: 1
5 / 3: 1.6666666666666667
5 % 3: 2
If num_students is the number of students enrolled in a course (let’s say 600), and num_per_class is the number that can attend a single class (let’s say 42), write an expression that calculates the number of classes needed to teach everyone.
Depending on requirements it might be important to detect when the number of students per class doesn’t divide the number of students evenly. Detect it with the % operator and test if the remainder that it returns is greater than 0.
PYTHON
num_students = 600
num_per_class = 42
num_classes = num_students // num_per_class
remainder = num_students % num_per_class
print(num_students, 'students,', num_per_class, 'per class')
print(num_classes, 'full classes, plus an extra class with only', remainder, 'students')
OUTPUT
600 students, 42 per class
14 full classes, plus an extra class with only 12 students
Strings to Numbers
Where reasonable, float() will convert a string to a floating point number, and int() will convert a floating point number to an integer:
OUTPUT
string to float: 3.4
float to int: 3
Note: conversion is sometimes also called typecasting.
If the conversion doesn’t make sense, however, Python reports an error:
ERROR
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-df3b790bf0a2> in <module>()
----> 1 print("string to float:", float("Hello world!"))
ValueError: could not convert string to float: 'Hello world!'
Given this information, what do you expect the following program to do?
What does it actually do?
Why do you think it does that?
What do you expect this program to do? It would not be so unreasonable to expect the Python int function to convert the string “3.4” to 3.4 and then, with an additional type conversion, to 3. After all, Python performs a lot of other magic - isn’t that part of its charm?
However, Python raises an error. Why? To be consistent, possibly. If you want Python to perform two consecutive typecasts, you must write both conversions explicitly in code.
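The failure described above can be observed without stopping the program by catching the error. This is a sketch; conversion_failed is a hypothetical name.

```python
try:
    int("3.4")                 # a string with a decimal point is not an integer literal
    conversion_failed = False
except ValueError as error:
    conversion_failed = True
    print("ValueError:", error)
```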
PYTHON
num_as_string = "3.4"
num_as_float = float(num_as_string)
num_as_int = int(num_as_float)
print(num_as_int)
OUTPUT
3
We could also write it in a single line like this:
int(float("3.4"))
Answer: 1 and 4.
1. is correct
2. gives 2.1
3. gives an error because we cannot convert text to int directly
4. is correct
5. gives 2 (as an integer not as a float)
6. gives an error because second is a string.
Key Points
- Every value has a type.
- Use the built-in function type to find the type of a value.
- Types control what operations can be done on values.
- Strings can be added and multiplied.
- Strings have a length (but numbers don’t).
- Must convert numbers to strings or vice versa when operating on them.
- Can mix integers and floats freely in operations.
- Variables only change value when something is assigned to them.
Content from Built-in Functions and Help
Overview
Questions
- How can I use built-in functions?
- How can I find out what they do?
- What kind of errors can occur in programs?
Objectives
- Explain the purpose of functions.
- Correctly call built-in Python functions.
- Correctly nest calls to built-in functions.
- Use help to display documentation for built-in functions.
- Correctly describe situations in which SyntaxError and NameError occur.
Use comments to add documentation to programs.
A function may take zero or more arguments.
- We have seen some functions already — now let’s take a closer look.
- An argument is a value passed into a function.
- Any arguments you want to pass into a function must go into the ()
  - print("I am an argument and must go here.")
- You must always use parentheses, because this is how Python knows you are calling a function.
- You leave them empty if you don’t want or need to pass any arguments in.
- len takes exactly one.
- int, str, and float create a new value from an existing one.
- print takes zero or more.
- print() prints a blank line.
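A sketch of print called with and without arguments; the strings are assumed from the output in this section.

```python
print('before')
print()           # no arguments: only the line ending is printed
print('after')
```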
OUTPUT
before
after
Commonly-used built-in functions include max, min, and round.
- Use max to find the largest value of one or more values.
- Use min to find the smallest.
- Both work on character strings as well as numbers.
  - “Larger” and “smaller” use (0-9, A-Z, a-z) to compare letters.
  - This means that:
    - 'a' is smaller than 'b'
    - 'A' is smaller than 'a'
    - '0' is smaller than 'a'
  - This is useful for ordering alphabetically.
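The ordering rules above can be checked with a short sketch. The variable names and sample values are hypothetical.

```python
largest_number = max(1, 2, 3)
largest_letter = max('a', 'A', '0')    # lower-case letters sort last
smallest_letter = min('a', 'A', '0')   # digits sort first
ordered = sorted(['b', 'A', 'a', '0'])

print(largest_number, largest_letter, smallest_letter)
print(ordered)
```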
OUTPUT
3
a
A
Functions may only work for certain (combinations of) arguments.
- max and min must be given at least one argument.
- And they must be given things that can meaningfully be compared.
ERROR
TypeError: unorderable types: str() > int()
Functions may have default values for some arguments.
- round will round off a floating-point number.
- By default, rounds to zero decimal places.
OUTPUT
4
- We can specify the number of decimal places we want.
OUTPUT
3.7
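The two outputs above are consistent with calls like the following. The input 3.712 is an assumption chosen to reproduce them.

```python
rounded_default = round(3.712)        # no second argument: zero decimal places
rounded_one_place = round(3.712, 1)   # second argument: number of decimal places

print(rounded_default)
print(rounded_one_place)
```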
Use the built-in function help to get help for a function.
- Every built-in function has online documentation.
OUTPUT
Help on built-in function round in module builtins:
round(...)
round(number[, ndigits]) -> number
Round a number to a given precision in decimal digits (default 0 digits).
This returns an int when called with one argument, otherwise the
same type as the number. ndigits may be negative.
Python reports a syntax error when grammar rules (that’s Python grammar, not English grammar) have been violated.
- You’ve seen errors when you try to use a function incorrectly.
- Can also have errors when you use punctuation incorrectly.
- Python checks the grammar of a whole script (or notebook cell) before running any of it: cells that have already run are unaffected, but a script or cell that contains a syntax error stops with an error before any of its lines are executed.
ERROR
SyntaxError: EOL while scanning string literal
ERROR
SyntaxError: invalid syntax
- Look more closely at the error message:
ERROR
File "<ipython-input-6-d1cc229bf815>", line 1
print ("hello world"
^
SyntaxError: unexpected EOF while parsing
- The message indicates a problem on the first line of the input (“line 1”).
  - In this case the “ipython-input” section of the file name tells us that we are working with input into IPython.
  - The -6- part of the filename indicates that the error occurred in cell 6 of our Notebook.
- Next is the problematic line of code, indicating the problem with a ^ pointer.
Python reports a runtime error when something goes wrong while a program is executing.
ERROR
NameError: name 'aege' is not defined
- Fix syntax errors by reading the source and runtime errors by tracing execution.
Every function returns something.
- Every function call produces some result.
- If the function doesn’t have a useful result to return, it usually returns the special value None.
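Capturing the result of print shows this behaviour. This is a minimal sketch consistent with the output in this section.

```python
result = print('example')              # print displays its argument and returns None
print('result of print is', result)
```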
OUTPUT
example
result of print is None
ping
tin
4
TypeError: '>' not supported between instances of 'str' and 'int'
name[len(name) - 1]
Key Points
- Use comments to add documentation to programs.
- A function may take zero or more arguments.
- Commonly-used built-in functions include max, min, and round.
- Functions may only work for certain (combinations of) arguments.
- Functions may have default values for some arguments.
- Use the built-in function help to get help for a function.
- Every function returns something.
- Python reports a syntax error when it can’t understand the source of a program.
- Python reports a runtime error when something goes wrong while a program is executing.
- Fix syntax errors by reading the source code, and runtime errors by tracing the program’s execution.
Content from Morning Coffee
Reflection exercise
Over coffee, reflect on and discuss the following:
- What are the different kinds of errors Python will report?
- Did the code always produce the results you expected? If not, why?
- Is there something we can do to prevent errors when we write code?
Content from Libraries
Overview
Questions
- How can I extend the capabilities of Python?
- How can I use software that other people have written?
- How can I find out what that software does?
Objectives
- Explain what software libraries are and why programmers create and use them.
- Write programs that import and use libraries from Python’s standard library.
- Find and read documentation for standard libraries interactively (in the interpreter) and online.
Most of the power of a programming language is in its (software) libraries.
- A (software) library is a collection of files (called modules) that contains functions for use by other programs.
  - May also contain data values (e.g., numerical constants) and other things.
  - A library’s contents are supposed to be related, but there’s no way to enforce that.
- The Python standard library is an extensive suite of modules that comes with Python itself.
- Many additional libraries are available from PyPI (the Python Package Index).
- We will see later how to write new libraries.
A program must import a library module before using it.
- Use import to load a library module into a program’s memory.
- Then refer to things from the module as module_name.thing_name.
  - Python uses . to mean “part of”.
- Using string, one of the modules in the standard library:
PYTHON
import string
print('The lower ascii letters are', string.ascii_lowercase)
print(string.capwords('capitalise this sentence please.'))
OUTPUT
The lower ascii letters are abcdefghijklmnopqrstuvwxyz
Capitalise This Sentence Please.
- You have to refer to each item with the module’s name.
  - string.capwords(ascii_lowercase) won’t work: the reference to ascii_lowercase doesn’t somehow “inherit” the function’s reference to string.
Use help to learn about the contents of a library module.
- Works just like help for a function.
OUTPUT
Help on module string:
NAME
string - A collection of string constants.
MODULE REFERENCE
https://docs.python.org/3.6/library/string
The following documentation is automatically generated from the Python
source files. It may be incomplete, incorrect or include features that
are considered implementation detail and may vary between Python
implementations. When in doubt, consult the module reference at the
location listed above.
DESCRIPTION
Public module variables:
whitespace -- a string containing all ASCII whitespace
ascii_lowercase -- a string containing all ASCII lowercase letters
ascii_uppercase -- a string containing all ASCII uppercase letters
ascii_letters -- a string containing all ASCII letters
digits -- a string containing all ASCII decimal digits
hexdigits -- a string containing all ASCII hexadecimal digits
octdigits -- a string containing all ASCII octal digits
punctuation -- a string containing all ASCII punctuation characters
printable -- a string containing all ASCII characters considered printable
CLASSES
builtins.object
Formatter
Template
⋮ ⋮ ⋮
Import specific items from a library module to shorten programs.
- Use from ... import ... to load only specific items from a library module.
- Then refer to them directly without library name as prefix.
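A sketch of this form of import, using the standard library's string module and its ascii_letters constant:

```python
from string import ascii_letters

# No "string." prefix is needed for a name imported this way.
print('The ASCII letters are', ascii_letters)
```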
OUTPUT
The ASCII letters are abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
Create an alias for a library module when importing it to shorten programs.
- Use import ... as ... to give a library a short alias while importing it.
- Then refer to items in the library using that shortened name.
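A sketch of an aliased import. The alias s and the variable name capitalised are assumptions.

```python
import string as s

# Items are reached through the alias instead of the full module name.
capitalised = s.capwords('capitalise this sentence again please.')
print(capitalised)
```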
OUTPUT
Capitalise This Sentence Again Please.
- Commonly used for libraries that are frequently used or have long names.
  - E.g., The pandas library is often aliased as pd.
- But can make programs harder to understand, since readers must learn your program’s aliases.
- Using help(os) we see that we’ve got os.getcwd() which returns a string representing the current working directory.
Locating the Right Module
Given the variables year, month and day, how would you generate a date in the standard ISO format:
- Which standard library module could help you?
- Which function would you select from that module?
- Try to write a program that uses the function.
The datetime module seems like it could help you.
You could use date(year, month, day).isoformat() to convert your date:
or more compactly:
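One possible version, assuming hypothetical values for year, month and day:

```python
import datetime

year, month, day = 2023, 5, 8    # hypothetical values

# date() takes its arguments in the order year, month, day.
iso_date = datetime.date(year, month, day).isoformat()
print(iso_date)
```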
Importing the os module (import os) can be written as
Since you just wrote the code and are familiar with it, you might actually find the first version easier to read. But when trying to read a huge piece of code written by someone else, or when getting back to your own huge piece of code after several months, non-abbreviated names are often easier, except where there are clear abbreviation conventions.
A2) Importing digits from string provides the digits methods
B3) Importing string provides methods such as ascii_uppercase, but requires the string. syntax.
C1) Importing string with the alias s allows s.digits
Most likely you find this version easier to read since it’s less dense. The main reason not to use this form of import is to avoid name clashes. For instance, you wouldn’t import degrees this way if you also wanted to use the name degrees for a variable or function of your own. Or if you were to also import a function named degrees from another library.
- The date object takes arguments in the order year, month, day, so 13 is an invalid value for month.
- You get an error of type “ValueError”, indicating that the object received an inappropriate argument value. The additional message “month must be in 1..12” makes it clearer what the problem is.
Key Points
- Most of the power of a programming language is in its libraries.
- A program must import a library module in order to use it.
- Use help to learn about the contents of a library module.
- Import specific items from a library to shorten programs.
- Create an alias for a library when importing it to shorten programs.
Content from Lunch
FIXME: describe what to reflect on.
Content from Lists
Overview
Questions
- How can I store multiple values?
Objectives
- Explain why programs need collections of values.
- Write programs that create flat lists, index them, slice them, and modify them through assignment and method calls.
A list stores many values in a single structure.
- Scenario: You have set up an Arduino to do temperature measurements in a storage room for rare books.
- Doing calculations with a hundred variables called temperature_001, temperature_002, etc., would be at least as slow as doing them by hand.
- Use a list to store many values together.
  - Contained within square brackets [...].
  - Values separated by commas ,.
- Use len to find out how many values are in a list.
PYTHON
temperatures = [17.3, 17.5, 17.7, 17.5, 17.6]
print('temperatures:', temperatures)
print('length:', len(temperatures))
OUTPUT
temperatures: [17.3, 17.5, 17.7, 17.5, 17.6]
length: 5
Use an item’s index to fetch it from a list.
- Just like strings.
PYTHON
print('zeroth item of temperatures:', temperatures[0])
print('fourth item of temperatures:', temperatures[4])
OUTPUT
zeroth item of temperatures: 17.3
fourth item of temperatures: 17.6
Lists’ values can be replaced by assigning to them.
- Use an index expression on the left of assignment to replace a value.
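A sketch of such an assignment. The replacement value 16.5 is taken from the output in this section, and the list is re-created here so the example stands on its own.

```python
temperatures = [17.3, 17.5, 17.7, 17.5, 17.6]

temperatures[0] = 16.5    # replace the value at index 0 in place
print('temperatures is now:', temperatures)
```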
OUTPUT
temperatures is now: [16.5, 17.5, 17.7, 17.5, 17.6]
Appending items to a list lengthens it.
- Use list_name.append to add items to the end of a list.
PYTHON
print('temperatures is initially:', temperatures)
temperatures.append(17.9)
temperatures.append(18.2)
print('temperatures has become:', temperatures)
OUTPUT
temperatures is initially: [16.5, 17.5, 17.7, 17.5, 17.6]
temperatures has become: [16.5, 17.5, 17.7, 17.5, 17.6, 17.9, 18.2]
- append is a method of lists.
  - Like a function, but tied to a particular object.
- Use object_name.method_name to call methods.
  - Deliberately resembles the way we refer to things in a library.
- We will meet other methods of lists as we go along.
  - Use help(list) for a preview.
Use del to remove items from a list entirely.
- del list_name[index] removes an item from a list and shortens the list.
- Not a function or a method, but a statement in the language.
PYTHON
primes = [2, 3, 5, 7, 11]
print('primes before removing last item:', primes)
del primes[4]
print('primes after removing last item:', primes)
OUTPUT
primes before removing last item: [2, 3, 5, 7, 11]
primes after removing last item: [2, 3, 5, 7]
The empty list contains no values.
- Use [] on its own to represent a list that doesn’t contain any values.
  - “The zero of lists.”
- Helpful as a starting point for collecting values (which we will see in the next episode).
Lists may contain values of different types.
- A single list may contain numbers, strings, and anything else.
Character strings can be indexed like lists.
- Get single characters from a character string using indexes in square brackets.
PYTHON
element = 'carbon'
print('zeroth character:', element[0])
print('third character:', element[3])
OUTPUT
zeroth character: c
third character: b
Character strings are immutable.
- Cannot change the characters in a string after it has been created.
- Immutable: cannot be changed after creation.
- In contrast, lists are mutable: they can be modified in place.
- Python considers the string to be a single value with parts, not a collection of values.
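The difference between the two kinds of collection can be seen in a short sketch. The list letters is a hypothetical helper built from the same characters.

```python
element = 'carbon'

try:
    element[0] = 'C'            # strings do not support item assignment
except TypeError as error:
    print('TypeError:', error)

letters = list(element)         # a list of the same characters is mutable
letters[0] = 'C'
print(letters)
print(element)                  # the string itself is unchanged
```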
ERROR
TypeError: 'str' object does not support item assignment
- Lists and character strings are both collections.
Indexing beyond the end of the collection is an error.
- Python reports an IndexError if we attempt to access a value that doesn’t exist.
  - This is a kind of runtime error.
  - Cannot be detected as the code is parsed because the index might be calculated based on data.
ERROR
IndexError: string index out of range
The list’s length would be equal to high - low.
- It creates a list of the characters in 'some string' as elements.
- It creates a string composed of x and y, separated by a hyphen character (-).
Working With the End
What does the following program print?
- How does Python interpret a negative index?
- If a list or string has N elements, what is the most negative index that can safely be used with it, and what location does that index represent?
- If values is a list, what does del values[-1] do?
- How can you display all elements but the last one without changing values? (Hint: you will need to combine slicing and negative indexing.)
OUTPUT
m
- A negative index begins at the final element.
- -(N) corresponds to the first index, which is the [0] index.
- It removes the final element of the list.
- You could do the following:
print(values[0:-1])
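The answers above can be verified with a short sketch. The string 'helium' and the list values are assumptions; 'helium' is one string whose last character matches the output shown.

```python
element = 'helium'               # assumed example string
values = [1, 2, 3, 4]            # hypothetical list

print(element[-1])               # the final character
print(element[-len(element)])    # most negative safe index, same as element[0]
print(values[0:-1])              # all but the last item; values is unchanged

del values[-1]                   # removes the final item in place
print(values)
```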
OUTPUT
furn
eniroulf
- stride indicates both the number of steps, and from which end: positive starts from first element, negative from the last element.
- element[1::2]
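A sketch of slicing with a stride. The string 'fluorine' is an assumption that reproduces the two outputs above.

```python
element = 'fluorine'     # assumed example string

print(element[::2])      # every second character, starting from the first
print(element[::-1])     # every character, walking backwards from the last
print(element[1::2])     # every second character, starting at index 1
```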
OUTPUT
lithium
''
There is no 20th index, so the entire string is captured.
There is no element after the -1 index.
Program A:
OUTPUT
letters is ['g', 'o', 'l', 'd'] and result is ['d', 'g', 'l', 'o']
Program B:
OUTPUT
letters is ['d', 'g', 'l', 'o'] and result is None
sorted(letters) returns a sorted copy of the list and leaves letters unchanged, while letters.sort() sorts the list in place and returns None. Thus, in Program B letters itself ends up sorted and result is None.
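The difference can be checked side by side. This is a small sketch; the names result and in_place are chosen here for illustration.

```python
letters = list('gold')

result = sorted(letters)      # new sorted list; letters is untouched
print(letters, result)

in_place = letters.sort()     # sorts letters itself and returns None
print(letters, in_place)
```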
Program A:
OUTPUT
new is ['D', 'o', 'l', 'd'] and old is ['D', 'o', 'l', 'd']
Program B:
OUTPUT
new is ['D', 'o', 'l', 'd'] and old is ['g', 'o', 'l', 'd']
new = old makes new refer to the same list as old, so a change made through either name is visible through both, whereas new = old[:] uses a slice to create a copy of old, so changing new leaves old untouched.
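A short sketch of the same idea, using the hypothetical names alias and copy, shows that plain assignment shares one list while a full slice creates a second one:

```python
old = list('gold')
alias = old        # second name for the same list object
copy = old[:]      # the slice creates a new list with the same items

alias[0] = 'D'     # changes the one list that old and alias share

print(alias is old, copy is old)
print(old, copy)
```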
Key Points
- A list stores many values in a single structure.
- Use an item’s index to fetch it from a list.
- Lists’ values can be replaced by assigning to them.
- Appending items to a list lengthens it.
- Use del to remove items from a list entirely.
- The empty list contains no values.
- Lists may contain values of different types.
- Character strings can be indexed like lists.
- Character strings are immutable.
- Indexing beyond the end of the collection is an error.
Content from For Loops
Overview
Questions
- How can I make a program do many things?
Objectives
- Explain what for loops are normally used for.
- Trace the execution of a simple (unnested) loop and correctly state the values of variables in each iteration.
- Write for loops that use the Accumulator pattern to aggregate values.
A for loop executes commands once for each value in a collection.
- Doing calculations on the values in a list one by one is as painful as working with temperature_001, temperature_002, etc.
- A for loop tells Python to execute some statements once for each value in a list, a character string, or some other collection.
- “for each thing in this group, do these operations”
- This for loop is equivalent to:
- And the for loop’s output is:
OUTPUT
2
3
5
The first line of the for loop must end with a colon, and the body must be indented.
- The colon at the end of the first line signals the start of a block of statements.
- Python uses indentation rather than {} or begin/end to show nesting.
  - Any consistent indentation is legal, but almost everyone uses four spaces.
ERROR
IndentationError: expected an indented block
- Indentation is always meaningful in Python.
ERROR
File "<ipython-input-7-f65f2962bf9c>", line 2
lastName="Smith"
^
IndentationError: unexpected indent
- This error can be fixed by removing the extra spaces at the beginning of the second line.
A for loop is made up of a collection, a loop variable, and a body.
- The collection, [2, 3, 5], is what the loop is being run on.
- The body, print(number), specifies what to do for each value in the collection.
- The loop variable, number, is what changes for each iteration of the loop.
  - The “current thing”.
Loop variable names follow the normal variable name conventions.
- Loop variables will:
  - Be created on demand during the course of each loop.
  - Persist after the loop finishes.
- Use a new variable name to avoid overwriting a data collection you need to keep for later.
- Loop variables are often used in the course of the loop.
  - So give them a meaningful name you’ll understand as the body code in your loop grows.
  - Example: for single_letter in ['A', 'B', 'C', 'D']: instead of for asdf in ['A', 'B', 'C', 'D']:
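As a small sketch of the points above (the list seen is introduced here only for illustration), the loop variable is created by the loop, takes each value in turn, and is still available once the loop has finished:

```python
collection = [2, 3, 5]
seen = []

for number in collection:     # number is the loop variable
    print(number)             # the body runs once per value
    seen.append(number)

# The loop variable persists and holds the last value it was given.
print('after the loop, number is', number)
```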
The body of a loop can contain many statements.
- But no loop should be more than a few lines long.
- Hard for human beings to keep larger chunks of code in mind.
OUTPUT
2 4 8
3 9 27
5 25 125
Use range to iterate over a sequence of numbers.
- The built-in function range produces a sequence of numbers.
  - Not a list: the numbers are produced on demand to make looping over large ranges more efficient.
- range(N) is the numbers 0..N-1
  - Exactly the legal indices of a list or character string of length N
OUTPUT
a range is not a list: range(0, 3)
0
1
2
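Because range(N) yields exactly the legal indices of a collection of length N, it can be combined with len. The following is a minimal sketch with a hypothetical string named word:

```python
word = 'tin'
indices = range(len(word))

print(indices)                # a range object, not a list
print(list(indices))          # converting shows the numbers it produces

for i in indices:
    print(i, word[i])
```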
Or use range to repeat an action an arbitrary number of times.
- You don’t actually have to use the loop variable’s value.
- Use this structure to simply repeat an action some number of times.
  - That number of times goes into the range function.
OUTPUT
Again!
Again!
Again!
Again!
Again!
The Accumulator pattern turns many values into one.
- A common pattern in programs is to:
- Initialize an accumulator variable to zero, the empty string, or the empty list.
- Update the variable with values from a collection.
PYTHON
# Sum the first 10 integers.
total = 0
for number in range(10):
    total = total + (number + 1)
print(total)
OUTPUT
55
- Read total = total + (number + 1) as:
  - Add 1 to the current value of the loop variable number.
  - Add that to the current value of the accumulator variable total.
  - Assign that to total, replacing the current value.
- We have to add number + 1 because range produces 0..9, not 1..10.
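The same pattern works with other starting values and other update operations. As a sketch, assuming we want the product of the first five positive integers rather than a sum, the accumulator starts at 1 instead of 0:

```python
# Multiply the first 5 positive integers together.
product = 1                           # starting value for a product
for number in range(5):
    product = product * (number + 1)  # range gives 0..4, so add 1
print(product)
```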
It is a syntax error. The problem has to do with the placement of the code, not its logic.
result is an empty string because we use it to build or accumulate on our reverse string. char is the loop variable for original. Each time through the loop char takes on one value from original. Use char with result to control the order of the string. Our loop code should look like this:
PYTHON
original = "tin"
result = ""
for char in original:
    result = char + result
print(result)
OUTPUT
nit
If you were to expand out the loop the iterations would look something like this:
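One possible way to write that expansion out by hand, as a sketch in which each assignment stands for one pass through the loop:

```python
original = "tin"
result = ""

# First iteration: char is "t"
result = "t" + result     # result is now "t"
# Second iteration: char is "i"
result = "i" + result     # result is now "it"
# Third iteration: char is "n"
result = "n" + result     # result is now "nit"

print(result)
```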
Practice Accumulating
Fill in the blanks in each of the programs below to produce the indicated result.
PYTHON
# Total length of the strings in the list: ["red", "green", "blue"] => 12
total = 0
for word in ["red", "green", "blue"]:
    ____ = ____ + len(word)
print(total)
PYTHON
# List of word lengths: ["red", "green", "blue"] => [3, 5, 4]
lengths = ____
for word in ["red", "green", "blue"]:
    lengths.____(____)
print(lengths)
Identifying Variable Name Errors
- Read the code below and try to identify what the errors are without running it.
- Run the code and read the error message. What type of NameError do you think this is? Is it a string with no quotes, a misspelled variable, or a variable that should have been defined but was not?
- Fix the error.
- Repeat steps 2 and 3, until you have fixed all the errors.
It is an index error:
ERROR
IndexError: list index out of range
The problem is that 4 points to an item that doesn’t exist in the list. Remember that the index of the first item of a list in Python is 0.
Replace seasons[4] with seasons[0], seasons[1], seasons[2] or seasons[3] to have the different items of the list printed.
Key Points
- A for loop executes commands once for each value in a collection.
- The first line of the for loop must end with a colon, and the body must be indented.
- Indentation is always meaningful in Python.
- A for loop is made up of a collection, a loop variable, and a body.
- Loop variables can be called anything (but it is strongly advised to give the loop variable a meaningful name).
- The body of a loop can contain many statements.
- Use range to iterate over a sequence of numbers.
- The Accumulator pattern turns many values into one.
Content from Looping Over Data Sets
Overview
Questions
- How can I process many data sets with a single command?
Objectives
- Be able to read and write globbing expressions that match sets of files.
- Use glob to create lists of files.
- Write for loops to perform operations on files given their names in a list.
Use a for loop to process files given a list of their names.
- A filename is just a character string.
- And lists can contain character strings.
PYTHON
import pandas
for filename in ['data/gapminder_gdp_africa.csv', 'data/gapminder_gdp_asia.csv']:
    data = pandas.read_csv(filename, index_col='country')
    print(filename, data.min())
OUTPUT
data/gapminder_gdp_africa.csv gdpPercap_1952 298.846212
gdpPercap_1957 335.997115
gdpPercap_1962 355.203227
gdpPercap_1967 412.977514
⋮ ⋮ ⋮
gdpPercap_1997 312.188423
gdpPercap_2002 241.165877
gdpPercap_2007 277.551859
dtype: float64
data/gapminder_gdp_asia.csv gdpPercap_1952 331
gdpPercap_1957 350
gdpPercap_1962 388
gdpPercap_1967 349
⋮ ⋮ ⋮
gdpPercap_1997 415
gdpPercap_2002 611
gdpPercap_2007 944
dtype: float64
Use glob.glob to find sets of files whose names match a pattern.
- In Unix, the term “globbing” means “matching a set of files with a pattern”.
- The most common patterns are:
  - * meaning “match zero or more characters”
  - ? meaning “match exactly one character”
- Python contains the glob library to provide pattern matching functionality.
- The glob library contains a function also called glob to match file patterns.
- E.g., glob.glob('*.txt') matches all files in the current directory whose names end with .txt.
- Result is a (possibly empty) list of character strings.
OUTPUT
all csv files in data directory: ['data/gapminder_all.csv', 'data/gapminder_gdp_africa.csv', \
'data/gapminder_gdp_americas.csv', 'data/gapminder_gdp_asia.csv', 'data/gapminder_gdp_europe.csv', \
'data/gapminder_gdp_oceania.csv']
OUTPUT
all PDB files: []
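To see how * and ? differ without depending on the lesson’s data directory, here is a self-contained sketch that creates a few empty files in a temporary folder. The file names are made up for illustration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create some empty, hypothetical files to match against.
    for name in ['run_1.csv', 'run_2.csv', 'run_10.csv', 'notes.txt']:
        with open(os.path.join(folder, name), 'w'):
            pass

    # * matches zero or more characters, ? matches exactly one.
    star = [os.path.basename(path)
            for path in glob.glob(os.path.join(folder, 'run_*.csv'))]
    question = [os.path.basename(path)
                for path in glob.glob(os.path.join(folder, 'run_?.csv'))]

print('run_*.csv matches:', sorted(star))
print('run_?.csv matches:', sorted(question))
```

glob.glob does not promise any particular order, so the results are sorted before printing.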
Use glob and for to process batches of files.
- Helps a lot if the files are named and stored systematically and consistently so that simple patterns will find the right data.
PYTHON
import glob
import pandas
for filename in glob.glob('data/*.csv'):
    data = pandas.read_csv(filename)
    print(filename, data['gdpPercap_1952'].min())
OUTPUT
data/gapminder_all.csv 298.8462121
data/gapminder_gdp_africa.csv 298.8462121
data/gapminder_gdp_americas.csv 1397.717137
data/gapminder_gdp_asia.csv 331.0
data/gapminder_gdp_europe.csv 973.5331948
data/gapminder_gdp_oceania.csv 10039.59564
- This includes all data, as well as per-region data.
- Use a more specific pattern in the exercises to exclude the whole data set.
- But note that the minimum of the entire data set is also the minimum of one of the data sets, which is a nice check on correctness.
Solution
1 is not matched by the glob pattern.
Minimum File Size
Modify this program so that it prints the number of records in the file that has the fewest records.
PYTHON
import glob
import pandas
fewest = ____
for filename in glob.glob('data/*.csv'):
    dataframe = pandas.____(filename)
    fewest = min(____, dataframe.shape[0])
print('smallest file has', fewest, 'records')
Notice that the shape attribute holds a tuple with the number of rows and columns of the data frame.
Content from Writing Functions
Overview
Questions
- How can I create my own functions?
Objectives
- Explain and identify the difference between function definition and function call.
- Write a function that takes a small, fixed number of arguments and produces a single result.
Break programs down into functions to make them easier to understand.
- Human beings can only keep a few items in working memory at a time.
- Understand larger/more complicated ideas by understanding and combining pieces.
  - Components in a machine.
  - Lemmas when proving theorems.
- Functions serve the same purpose in programs.
- Encapsulate complexity so that we can treat it as a single “thing”.
- Also enables re-use.
- Write one time, use many times.
Define a function using def with a name, parameters, and a block of code.
- Begin the definition of a new function with def.
- Followed by the name of the function.
  - Must obey the same rules as variable names.
- Then parameters in parentheses.
  - Empty parentheses if the function doesn’t take any inputs.
  - We will discuss this in detail in a moment.
- Then a colon.
- Then an indented block of code.
Defining a function does not run it.
- Defining a function does not run it.
- Like assigning a value to a variable.
- Must call the function to execute the code it contains.
- The commands in the def block are read and stored, but not actually executed until the function is called later on.
  - Imagine getting a recipe card and keeping it in your kitchen. You can cook it anytime, but you haven’t completed any of the steps until you start that cooking process.
- This means that Python won’t complain about problems until you call the function. More specifically, just because the definition of a function runs without error doesn’t mean that there won’t be errors when it executes later.
OUTPUT
Hello!
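The following sketch, built around a hypothetical function named divide_by_zero, illustrates the point that a definition can run without complaint even though calling the function fails:

```python
def divide_by_zero():
    return 1 / 0          # fine to define; fails only when executed

# The definition above ran without any error.
defined = callable(divide_by_zero)

try:
    divide_by_zero()      # the body runs now, and the error appears
except ZeroDivisionError as error:
    outcome = 'call failed: ' + str(error)

print(defined, outcome)
```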
Arguments in call are matched to parameters in definition.
- Functions are most useful when they can operate on different data.
- Specify parameters when defining a function.
- These become variables when the function is executed.
- Are assigned the arguments in the call (i.e., the values passed to the function).
PYTHON
def print_date(year, month, day):
    joined = str(year) + '/' + str(month) + '/' + str(day)
    print(joined)

print_date(1871, 3, 19)
OUTPUT
1871/3/19
- Via Twitter: () contains the ingredients for the function while the body contains the recipe.
Functions may return a result to their caller using return.
- Use return ... to give a value back to the caller.
- May occur anywhere in the function.
- But functions are easier to understand if return occurs:
  - At the start to handle special cases.
  - At the very end, with a final result.
OUTPUT
2.6666666666666665
OUTPUT
None
- Remember: every function returns something.
- A function that doesn’t explicitly return a value automatically returns None.
OUTPUT
1871/3/19
result of call is: None
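A compact sketch of both behaviours, using illustrative functions named average and shout: an early return handles the special case, a final return gives the result, and a function with no return statement hands back None.

```python
def average(values):
    if len(values) == 0:
        return None                   # special case handled at the start
    return sum(values) / len(values)  # final result at the very end

def shout(text):
    print(text.upper())               # prints, but has no return statement

print('average of actual values:', average([1, 3, 4]))
print('average of empty list:', average([]))

returned = shout('hello')
print('result of call is:', returned)
```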
OUTPUT
pressure is 22.5
Each line of Python code is executed in order, regardless of whether that line is a variable assignment or a call to a function (which may itself call other functions). In this case, the call to print on the second line will not execute until print_date on the first line has completed.
Calling by Name
What does this short program print?
PYTHON
def print_date(year, month, day):
    joined = str(year) + '/' + str(month) + '/' + str(day)
    print(joined)

print_date(day=1, month=2, year=2003)
- When have you seen a function call like this before?
- When and why is it useful to call functions this way?
The program prints:
OUTPUT
2003/2/1
It is useful to call a function with named arguments to ensure that the values of each argument are assigned to the intended argument in the function. This allows the order of arguments to be specified independently of how they are defined in the function itself.
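As a sketch of that independence, the variant of print_date below also returns the joined string (an addition made here only so the results can be compared); positional, named, and mixed calls all give the same date:

```python
def print_date(year, month, day):
    joined = str(year) + '/' + str(month) + '/' + str(day)
    print(joined)
    return joined     # added for this sketch so calls can be compared

positional = print_date(2003, 2, 1)
by_name = print_date(day=1, month=2, year=2003)
mixed = print_date(2003, day=1, month=2)   # positional arguments come first
```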
Encapsulation of an If/Print Block
The code below will run on a label-printer for chicken eggs. A digital scale will report a chicken egg mass (in grams) to the computer and then the computer will print a label.
Please re-write the code so that the if-block is folded into a function.
PYTHON
import random
for i in range(10):
    # simulating the mass of a chicken egg
    # the (random) mass will be 70 +/- 20 grams
    mass = 70 + 20.0 * (2.0 * random.random() - 1.0)
    print(mass)
    # egg sizing machinery prints a label
    if mass >= 85:
        print("jumbo")
    elif mass >= 70:
        print("large")
    elif mass < 70 and mass >= 55:
        print("medium")
    else:
        print("small")
The simplified program follows. What function definition will make it functional?
PYTHON
# revised version
import random
for i in range(10):
    # simulating the mass of a chicken egg
    # the (random) mass will be 70 +/- 20 grams
    mass = 70 + 20.0 * (2.0 * random.random() - 1.0)
    print(mass, print_egg_label(mass))
- Create a function definition for print_egg_label() that will work with the revised program above. Note, the function’s return value will be significant. Sample output might be 71.23 large.
- A dirty egg might have a mass of more than 90 grams, and a spoiled or broken egg will probably have a mass that’s less than 50 grams. Modify your print_egg_label() function to account for these error conditions. Sample output could be 25 too light, probably spoiled.
Encapsulating Data Analysis
Assume that the following code has been executed:
PYTHON
import pandas
df = pandas.read_csv('gapminder_gdp_asia.csv', index_col=0)
# .ix has been removed from pandas; .loc selects rows by label
japan = df.loc['Japan']
- Complete the statements below to obtain the average GDP for Japan across the years reported for the 1980s.
PYTHON
year = 1983
gdp_decade = 'gdpPercap_' + str(year // ____)
avg = (japan.loc[gdp_decade + ____] + japan.loc[gdp_decade + ____]) / 2
- Abstract the code above into a single function.
PYTHON
def avg_gdp_in_decade(country, continent, year):
    df = pandas.read_csv('gapminder_gdp_' + ____ + '.csv', delimiter=',', index_col=0)
    ____
    ____
    ____
    return avg
- How would you generalize this function if you did not know beforehand which specific years occurred as columns in the data? For instance, what if we also had data from years ending in 1 and 9 for each decade? (Hint: use the columns to filter out the ones that correspond to the decade, instead of enumerating them in the code.)
PYTHON
year = 1983
gdp_decade = 'gdpPercap_' + str(year // 10)
avg = (japan.loc[gdp_decade + '2'] + japan.loc[gdp_decade + '7']) / 2
PYTHON
def avg_gdp_in_decade(country, continent, year):
    df = pandas.read_csv('gapminder_gdp_' + continent + '.csv', index_col=0)
    c = df.loc[country]
    gdp_decade = 'gdpPercap_' + str(year // 10)
    avg = (c.loc[gdp_decade + '2'] + c.loc[gdp_decade + '7']) / 2
    return avg
- We need to loop over the reported years to obtain the average for the relevant ones in the data.
PYTHON
def avg_gdp_in_decade(country, continent, year):
    df = pandas.read_csv('gapminder_gdp_' + continent + '.csv', index_col=0)
    c = df.loc[country]
    gdp_decade = 'gdpPercap_' + str(year // 10)
    total = 0.0
    num_years = 0
    for yr_header in c.index:  # c's index contains reported years
        if yr_header.startswith(gdp_decade):
            total = total + c.loc[yr_header]
            num_years = num_years + 1
    return total / num_years
Key Points
- Break programs down into functions to make them easier to understand.
- Define a function using def with a name, parameters, and a block of code.
- Defining a function does not run it.
- Arguments in call are matched to parameters in definition.
- Functions may return a result to their caller using return.
Content from Variable Scope
Overview
Questions
- How do function calls actually work?
- How can I determine where errors occurred?
Objectives
- Identify local and global variables.
- Identify parameters as local variables.
- Read a traceback and determine the file, function, and line number on which the error occurred, the type of error, and the error message.
The scope of a variable is the part of a program that can ‘see’ that variable.
- There are only so many sensible names for variables.
- People using functions shouldn’t have to worry about what variable names the author of the function used.
- People writing functions shouldn’t have to worry about what variable names the function’s caller uses.
- The part of a program in which a variable is visible is called its scope.
- pressure is a global variable.
  - Defined outside any particular function.
  - Visible everywhere.
- t and temperature are local variables in adjust.
  - Defined in the function.
  - Not visible in the main program.
  - Remember: a function parameter is a variable that is automatically assigned a value when the function is called.
OUTPUT
adjusted: 0.01238691049085659
ERROR
Traceback (most recent call last):
File "/Users/swcarpentry/foo.py", line 8, in <module>
print('temperature after call:', temperature)
NameError: name 'temperature' is not defined
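A sketch consistent with the names used above (pressure, adjust, t, temperature) is shown here. The numeric values are assumptions chosen for illustration, and the NameError is caught so that the script keeps running.

```python
pressure = 103.9                          # global: defined outside any function

def adjust(t):
    temperature = t * 1.43 / pressure     # t and temperature are local to adjust
    return temperature

adjusted = adjust(0.9)
print('adjusted:', adjusted)

try:
    print('temperature after call:', temperature)
except NameError as error:
    message = str(error)                  # temperature is not visible out here
    print('NameError:', message)
```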
The parentheses and colon, (): , are missing after the function name in the definition, and the print statements are not consistently aligned via whitespace.
ERROR
File "<stdin>", line 1
def another_function
^
SyntaxError: invalid syntax
ERROR
File "<stdin>", line 1
print("Syntax errors are annoying.")
^
IndentationError: unexpected indent
ERROR
File "<stdin>", line 1
print("But at least Python tells us about them!")
^
IndentationError: unexpected indent
Working function:
Reading Error Messages
Read the traceback below, and identify the following:
- How many levels does the traceback have?
- What is the file name where the error occurred?
- What is the function name where the error occurred?
- On which line number in this function did the error occur?
- What is the type of error?
- What is the error message?
ERROR
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-2-e4c4cbafeeb5> in <module>()
1 import errors_02
----> 2 errors_02.print_friday_message()
/Users/ghopper/thesis/code/errors_02.py in print_friday_message()
13
14 def print_friday_message():
---> 15 print_message("Friday")
/Users/ghopper/thesis/code/errors_02.py in print_message(day)
9 "sunday": "Aw, the weekend is almost over."
10 }
---> 11 print(messages[day])
12
13
KeyError: 'Friday'
- 3 levels, since there are 3 arrows
- The file is errors_02.py
- The function is print_message()
- Line 11
- It is a KeyError
- There isn’t really a message; you’re supposed to infer that Friday is not a key in messages.
Content from Afternoon Coffee
FIXME: describe what to reflect on.
Content from Conditionals
Overview
Questions
- How can programs do different things for different data?
Objectives
- Correctly write programs that use if and else statements and simple Boolean expressions (without logical operators).
- Trace the execution of unnested conditionals and conditionals inside loops.
Use if statements to control whether or not a block of code is executed.
- An if statement (more properly called a conditional statement) controls whether some block of code is executed or not.
- Structure is similar to a for statement:
  - First line opens with if and ends with a colon
  - Body containing one or more statements is indented (usually by 4 spaces)
PYTHON
mass = 3.54
if mass > 3.0:
    print(mass, 'is larger')

mass = 2.07
if mass > 3.0:
    print(mass, 'is larger')
OUTPUT
3.54 is larger
Conditionals are often used inside loops.
- Not much point using a conditional when we know the value (as above).
- But useful when we have a collection to process.
OUTPUT
3.54 is larger
9.22 is larger
Use else to execute a block of code when an if condition is not true.
- else can be used following an if.
- Allows us to specify an alternative to execute when the if branch isn’t taken.
PYTHON
masses = [3.54, 2.07, 9.22, 1.86, 1.71]
for mass in masses:
    if mass > 3.0:
        print(mass, 'is larger')
    else:
        print(mass, 'is smaller')
OUTPUT
3.54 is larger
2.07 is smaller
9.22 is larger
1.86 is smaller
1.71 is smaller
Use elif to specify additional tests.
- May want to provide several alternative choices, each with its own test.
- Use elif (short for “else if”) and a condition to specify these.
- Always associated with an if.
- Must come before the else (which is the “catch all”).
PYTHON
masses = [3.54, 2.07, 9.22, 1.86, 1.71]
for mass in masses:
    if mass > 9.0:
        print(mass, 'is HUGE')
    elif mass > 3.0:
        print(mass, 'is larger')
    else:
        print(mass, 'is smaller')
OUTPUT
3.54 is larger
2.07 is smaller
9.22 is HUGE
1.86 is smaller
1.71 is smaller
Conditions are tested once, in order.
- Python steps through the branches of the conditional in order, testing each in turn.
- So ordering matters.
PYTHON
grade = 85
if grade >= 70:
    print('grade is C')
elif grade >= 80:
    print('grade is B')
elif grade >= 90:
    print('grade is A')
OUTPUT
grade is C
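One way to get the intended result, shown as a sketch, is to test the most demanding condition first so that a grade of 85 is not captured by the C branch. The variable label and the final else branch are additions for illustration.

```python
grade = 85
if grade >= 90:
    label = 'A'
elif grade >= 80:
    label = 'B'
elif grade >= 70:
    label = 'C'
else:
    label = 'below C'
print('grade is', label)
```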
- Does not automatically go back and re-evaluate if values change.
PYTHON
velocity = 10.0
if velocity > 20.0:
    print('moving too fast')
else:
    print('adjusting velocity')
    velocity = 50.0
OUTPUT
adjusting velocity
- Often use conditionals in a loop to “evolve” the values of variables.
PYTHON
velocity = 10.0
for i in range(5):  # execute the loop 5 times
    print(i, ':', velocity)
    if velocity > 20.0:
        print('moving too fast')
        velocity = velocity - 5.0
    else:
        print('moving too slow')
        velocity = velocity + 10.0
print('final velocity:', velocity)
OUTPUT
0 : 10.0
moving too slow
1 : 20.0
moving too slow
2 : 30.0
moving too fast
3 : 25.0
moving too fast
4 : 20.0
moving too slow
final velocity: 30.0
Create a table showing variables’ values to trace a program’s execution.
i | 0 | . | 1 | . | 2 | . | 3 | . | 4 | . |
velocity | 10.0 | 20.0 | . | 30.0 | . | 25.0 | . | 20.0 | . | 30.0 |
- The program must have a print statement outside the body of the loop to show the final value of velocity, since its value is updated by the last iteration of the loop.
Compound Relations Using and, or, and Parentheses
Often, you want some combination of things to be true. You can combine relations within a conditional using and and or. Continuing the example above, suppose you have
PYTHON
mass     = [ 3.54,  2.07,  9.22,  1.86,  1.71]
velocity = [10.00, 20.00, 30.00, 25.00, 20.00]

for i in range(5):
    if mass[i] > 5 and velocity[i] > 20:
        print("Fast heavy object. Duck!")
    elif mass[i] > 2 and mass[i] <= 5 and velocity[i] <= 20:
        print("Normal traffic")
    elif mass[i] <= 2 and velocity[i] <= 20:
        print("Slow light object. Ignore it")
    else:
        print("Whoa! Something is up with the data. Check it")
Just like with arithmetic, you can and should use parentheses whenever there is possible ambiguity. A good general rule is to always use parentheses when mixing and and or in the same condition. That is, instead of:
PYTHON
if mass[i] <= 2 or mass[i] >= 5 and velocity[i] > 20:
write one of these:
PYTHON
if (mass[i] <= 2 or mass[i] >= 5) and velocity[i] > 20:
if mass[i] <= 2 or (mass[i] >= 5 and velocity[i] > 20):
so it is perfectly clear to a reader (and to Python) what you really mean.
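The two parenthesized forms are not interchangeable. In this sketch, with a single made-up mass and velocity, Python evaluates and before or, so the unparenthesized condition behaves like the second form:

```python
mass = 1.5
velocity = 10.0

unparenthesized = mass <= 2 or mass >= 5 and velocity > 20
or_first = (mass <= 2 or mass >= 5) and velocity > 20
and_first = mass <= 2 or (mass >= 5 and velocity > 20)

print(unparenthesized, or_first, and_first)
```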
OUTPUT
25.0
Trimming Values
Fill in the blanks so that this program creates a new list containing zeroes where the original list’s values were negative and ones where the original list’s values were positive.
PYTHON
original = [-1.5, 0.2, 0.4, 0.0, -1.3, 0.4]
result = ____
for value in original:
    if ____:
        result.append(0)
    else:
        ____
print(result)
OUTPUT
[0, 1, 1, 1, 0, 1]
Using Functions With Conditionals in Pandas
Functions will often contain conditionals. Here is a short example that will indicate which quartile the argument is in based on hand-coded values for the quartile cut points.
PYTHON
def calculate_life_quartile(exp):
    if exp < 58.41:
        # This observation is in the first quartile
        return 1
    elif exp >= 58.41 and exp < 67.05:
        # This observation is in the second quartile
        return 2
    elif exp >= 67.05 and exp < 71.70:
        # This observation is in the third quartile
        return 3
    elif exp >= 71.70:
        # This observation is in the fourth quartile
        return 4
    else:
        # This observation has bad data
        return None

calculate_life_quartile(62.5)
OUTPUT
2
That function would typically be used within a for loop, but Pandas has a different, more efficient way of doing the same thing, and that is by applying a function to a dataframe or a portion of a dataframe. Here is an example, using the definition above.
PYTHON
import pandas as pd
data = pd.read_csv('Americas-data.csv')
data['life_qrtl'] = data['lifeExp'].apply(calculate_life_quartile)
There is a lot in that last line, so let’s take it piece by piece. On the right side of the = we start with data['lifeExp'], which is the column in the dataframe called data labeled lifeExp. We use the apply() method to do what it says: apply calculate_life_quartile to the value of this column for every row in the dataframe.
Key Points
- Use if statements to control whether or not a block of code is executed.
- Conditionals are often used inside loops.
- Use else to execute a block of code when an if condition is not true.
- Use elif to specify additional tests.
- Create a table showing variables’ values to trace a program’s execution.
Content from Programming Style
Overview
Questions
- How can I make my programs more readable?
- How do most programmers format their code?
- How can programs check their own operation?
Objectives
- Provide sound justifications for basic rules of coding style.
- Refactor one-page programs to make them more readable and justify the changes.
- Use Python community coding standards (PEP-8).
Follow standard Python style in your code.
- PEP8: a style guide for Python that discusses topics such as how you should name variables, how you should use indentation in your code, how you should structure your import statements, etc. Adhering to PEP8 makes it easier for other Python developers to read and understand your code, and to understand what their contributions should look like. The PEP8 application and Python library can check your code for compliance with PEP8.
Use assertions to check for internal errors.
Assertions are a powerful method for making sure that the context in which your code is executing is as you expect.
PYTHON
def calc_bulk_density(mass, volume):
    '''Return dry bulk density = powder mass / powder volume.'''
    assert volume > 0
    return mass / volume
If the assertion is False, the Python interpreter raises an AssertionError runtime exception. The source code for the expression that failed will be displayed as part of the error message. To ignore assertions in your code run the interpreter with the ‘-O’ (optimize) switch. Assertions should contain only basic checks and never change the state of the program. For example, an assertion should never contain an assignment.
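As a sketch of what a failing assertion looks like from the caller’s side, the variant below adds an optional message after the condition (the message text is an assumption made for illustration) and catches the resulting AssertionError. It assumes the interpreter is run without the -O switch.

```python
def calc_bulk_density(mass, volume):
    '''Return dry bulk density = powder mass / powder volume.'''
    # The text after the comma becomes the error message.
    assert volume > 0, 'volume must be positive'
    return mass / volume

density = calc_bulk_density(10.0, 4.0)    # passes the check

try:
    calc_bulk_density(10.0, 0)            # fails the check
except AssertionError as error:
    failure = str(error)

print(density, failure)
```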
Use docstrings to provide online help.
- If the first thing in a function is a character string that is not assigned to a variable, Python attaches it to the function as the online help.
- Called a docstring (short for “documentation string”).
PYTHON
def average(values):
    "Return average of values, or None if no values are supplied."

    if len(values) == 0:
        return None
    # divide by the number of values, not by average() itself
    return sum(values) / len(values)

help(average)
OUTPUT
Help on function average in module __main__:
average(values)
Return average of values, or None if no values are supplied.
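The docstring is stored on the function itself, so it can also be read directly. This sketch uses the __doc__ attribute, which is where Python keeps the text that help displays.

```python
def average(values):
    "Return average of values, or None if no values are supplied."
    if len(values) == 0:
        return None
    return sum(values) / len(values)

# help(average) formats this same text for display.
print(average.__doc__)
```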
What Will Be Shown?
Highlight the lines in the code below that will be available as online help. Are there lines that should be made available, but won’t be? Will any lines produce a syntax error or a runtime error?
PYTHON
"Find maximum edit distance between multiple sequences."
# This finds the maximum distance between all sequences.

def overall_max(sequences):
    '''Determine overall maximum edit distance.'''

    highest = 0
    for left in sequences:
        for right in sequences:
            '''Avoid checking sequence against itself.'''
            if left != right:
                this = edit_distance(left, right)
                highest = max(highest, this)

    # Report.
    return highest
These two lines will show up in online help:
This line will not be made available because it is not in the
docstring format:
"Find maximum edit distance between multiple sequences."
Docstrings should reside within the function definition, using three
single quotes.
There is one runtime error:
ERROR
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "<module2>", line 11, in overall_max
NameError: global name 'edit_distance' is not defined
The edit_distance function has not been defined.
Clean Up This Code
- Read this short program and try to predict what it does.
- Run it: how accurate was your prediction?
- Refactor the program to make it more readable. Remember to run it after each change to ensure its behavior hasn’t changed.
- Compare your rewrite with your neighbor’s. What did you do the same? What did you do differently, and why?
Here’s one solution.
PYTHON
def string_machine(input_string, iterations):
    """
    Takes input_string and generates a new string with -'s and *'s
    corresponding to characters that have identical adjacent characters
    or not, respectively. Iterates through this procedure with the resultant
    strings for the supplied number of iterations.
    """
    print(input_string)
    old = input_string
    for i in range(iterations):
        new = ''
        # iterate through characters in previous string
        for j in range(len(old)):
            left = j - 1
            right = (j + 1) % len(old)  # ensure right index wraps around
            if old[left] == old[right]:
                new += '-'
            else:
                new += '*'
        print(new)
        # store new string as old
        old = new

string_machine('et cetera', 10)
Content from Wrap-Up
Overview
Questions
- What have we learned?
- What else is out there and where do I find it?
Objectives
- Name and locate scientific Python community sites for software, workshops, and help.
Leslie Lamport once said, “Writing is nature’s way of showing you how sloppy your thinking is.” The same is true of programming: many things that seem obvious when we’re thinking about them turn out to be anything but when we have to explain them precisely.
Python supports a large community within and outwith research.
The Python 3 documentation covers the core language and the standard library.
PyCon is the largest annual conference for the Python community.
SciPy is a rich collection of scientific utilities. It is also the name of a series of annual conferences.
Jupyter is the home of the Jupyter Notebook.
Pandas is the home of the Pandas data library.
Stack Overflow’s general Python section can be helpful, as can the sections on NumPy, SciPy, Pandas, and other topics.
Content from Feedback
Overview
Questions
- How did the class go?
Objectives
- Gather feedback on the class
Gather feedback from participants.