Introduction to Python

Igor Tatarnikov, Laura Porta

Schedule

  • Aims
  • Why learn Python?
  • Basics
    • Variables
    • Data types
    • Loops
    • Conditional statements
    • List comprehensions
  • How to work with Python? (using IDEs, environments, etc…)
  • Writing your first Python script
  • Loading and saving data

Schedule

  • Recap and Q&A
  • Functions and methods
  • Classes and objects
  • Errors and exceptions
  • Integrated Development Environments (IDEs)
  • Virtual environments
  • Modules and packages
  • Organising your code
  • Documenting your code

Aims

  • Introduce Python
  • Introduce some basic programming concepts
  • Show you one (our) way of doing things
    • Not necessarily the best or only way
  • Give you the confidence to start doing it yourself
  • Prepare you for other courses

Please ask any questions at any time!

Installation

Install miniforge

Click here to go to the download page

Add Miniforge to PATH when prompted during installation.

Try the command below in your terminal (Mac/Linux) or Miniforge Prompt (Windows):

conda --version

Windows troubleshooting

  • Did you have a previous version of Anaconda installed? You might need to uninstall it first.
  • conda not found? Run conda init in your terminal.

Install an IDE

  • Visual Studio Code - customisable, extensible, open source
  • PyCharm - opinionated defaults, not fully free (free for educational use)

Install a text editor (macOS)

Install git

Why learn Python?

Why learn Python?

Popularity

  • Python is the most popular programming language for data science\(^1\)

Free and open source

  • Unlike MATLAB, IGOR, etc…
  • Anyone can use your code
    • 9.7M open Python repositories\(^2\) on GitHub

Versatile

Not just for data analysis / machine learning

  • Visualisation
  • Web development
  • Data acquisition

Why learn Python?

Packages for everything!

NumPy: arrays

Pandas: dataframes

SciPy: scientific computing

Scikit-image: image analysis

PyTorch: machine learning

Matplotlib: plotting

People have spent lots of time optimising these packages! Don’t reinvent the wheel!

Interactive Python

Terminal

  • A text based interface to interact with your operating system
  • You can run programs, manage files and folders, install software, etc.
  • Many names:
    • Command Prompt
    • PowerShell
    • Terminal
    • Shell
    • Git Bash
    • Anaconda/Miniforge Prompt

Terminal

  • Typically arguments are separated by spaces
  • Named arguments are preceded by -- or - (double or single dash)
  • Help is usually available with -h, --help or man <command>
  • Cancel a running command with Ctrl+C (on Mac/Linux, Ctrl+Z suspends it instead of cancelling it)
  • Tab key to autocomplete commands and file names
  • Up and Down keys to cycle through command history
  • q (typically) to quit multi-page outputs
  • python is a program that can be run from the terminal

Ways of working with Python

  • REPL (Read-Eval-Print Loop), “interactive Python”, “Python console”
  • Jupyter notebooks (.ipynb)
  • Scripts (.py)

Using interactive Python (REPL)

Demo time! 🧑🏻‍💻👩🏻‍💻

Basics of Python

Variables

  • A variable is a name that refers to a value
  • You can use variables to store data in memory
  • You can change the data stored in a variable at any time
  • Allows you to reuse data without having to retype it

Variables

a = 1
print("The value of a is:")
print(a)
The value of a is:
1


b = 2
c = a + b
print("The value of c is:")
print(c)
The value of c is:
3

Variables

b = 2
a = b
print("The value of a is now:")
print(a)
The value of a is now:
2


x, y = 10, 20

print("The value of x is:")
print(x)
print("The value of y is:")
print(y)
The value of x is:
10
The value of y is:
20

Data types

  • Different kinds of data are stored in different ways in memory
  • You can use the type() function to find out what type of data a variable contains
print(type(1))
print(type(1.0))
print(type("Hello"))
<class 'int'>
<class 'float'>
<class 'str'>

Strings

  • A string is a sequence of characters
  • Strings are enclosed in single or double quotes
  • Used to represent text

Strings

a = "I want to learn "
b = 'Python!'

print(type(a))

print(a + b)
<class 'str'>
I want to learn Python!


  • Algebraic operations can have different meanings for different data types
    • Try a*3 in the code above
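
A minimal sketch of the exercise above, reusing the slide's variable names: with strings, + joins and * repeats, so neither does arithmetic.

```python
a = "I want to learn "
b = "Python!"

# + joins (concatenates) two strings
joined = a + b
print(joined)

# * with an integer repeats the string that many times
repeated = a * 3
print(repeated)
```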

f-strings

  • f-strings are a way to format strings
  • Use f before the string
  • Use {variable} to insert variables into the string
a = 42
print(f"The value of a is: {a}")
The value of a is: 42

f-strings

You can also use f-strings to format numbers

print(f"A rounded number: {8.333333333333333:.2f}")
A rounded number: 8.33

And compute values in the string

print(f"7 / 3 is: {7 / 3:.2f}")
7 / 3 is: 2.33

Question

  • What is the type of…?
    • 100
    • 'Dog'
    • 3.14
    • print
    • False
    • '50'
    • None
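
One way to check your answers is to pass each value to type(). The comments show what a standard Python 3 interpreter reports.

```python
print(type(100))     # <class 'int'>
print(type('Dog'))   # <class 'str'>
print(type(3.14))    # <class 'float'>
print(type(print))   # <class 'builtin_function_or_method'>
print(type(False))   # <class 'bool'>
print(type('50'))    # <class 'str'> (the quotes make it text, not a number)
print(type(None))    # <class 'NoneType'>
```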

Question

  • Can you add an int and a float?
print(1 + 2.1)
3.1
  • What about an int and a str?
print(1 + '2')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[11], line 1
----> 1 print(1 + '2')

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Lists

  • Ordered collection of values
  • Enclosed in square brackets []
  • Any data type, including mixed types
  • Indexed from 0 with square brackets []
  • Mutable (can be changed)
my_list = [1, 2.0, 'three', 'dog']
print(type(my_list))
<class 'list'>


first = my_list[0]
second = my_list[1]
last = my_list[-1]
print(first, second, last)
1 2.0 dog


my_list.append('cat')
print(my_list)
[1, 2.0, 'three', 'dog', 'cat']


my_list[-1] = 'new_element'
print(my_list)
[1, 2.0, 'three', 'dog', 'new_element']

Question

  • Find the third element of my_list
  • Change the second element to 3.0
  • What happens if you request my_list[5]?
print(my_list)
[1, 2.0, 'three', 'dog', 'new_element']
print(my_list[2])
three
my_list[1] = 3.0
print(my_list)
[1, 3.0, 'three', 'dog', 'new_element']
print(my_list[5])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[19], line 1
----> 1 print(my_list[5])

IndexError: list index out of range

Tuples

  • Ordered collection of values
  • Enclosed in parentheses ()
  • Any data type, including mixed types
  • Immutable (cannot be changed)
  • When would you use a tuple instead of a list?
my_tuple = (1, 2.0, 'cat', 'dog')

print(type(my_tuple))
print(my_tuple[2])
<class 'tuple'>
cat


my_tuple[2] = 'new_element'
print(my_tuple)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[21], line 1
----> 1 my_tuple[2] = 'new_element'
      2 print(my_tuple)

TypeError: 'tuple' object does not support item assignment

Unpacking

  • Unpack a list or tuple into multiple variables
  • Convenient for reassigning variables
my_tuple = (1, 2, 3, 4, 5)

a, b, c, d, e = my_tuple
print(a, b, c, d, e)
1 2 3 4 5
a, *b, c = my_tuple
print(a, b, c)
print(type(b))
1 [2, 3, 4] 5
<class 'list'>
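
As a small sketch of "reassigning variables", unpacking can swap two values without a temporary variable:

```python
a = 1
b = 2

# The right-hand side is packed into a tuple (2, 1),
# which is then unpacked into a and b
a, b = b, a

print(a, b)
```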

Question

  • How would you turn my_list into a tuple?
  • How would you turn my_tuple into a list?
my_tuple = (1, 2, 3, 4, 5)
my_list = [1, 2.0, 'three', 'dog']


my_list_as_tuple = tuple(my_list)
print(type(my_list_as_tuple))
<class 'tuple'>
my_tuple_as_list = list(my_tuple)
print(type(my_tuple_as_list))
<class 'list'>

Dictionaries

  • Collection of key-value pairs
  • Enclosed in curly braces {}
  • Keys are unique and immutable
my_dict = {
    'a_number': 1,
    'number_list': [1, 2, 3],
    'a_string': 'string',
    5.0: 'a float key'
}

print(my_dict.keys())
print(my_dict.values())
print(my_dict.items())
print(my_dict['a_string'])
print(my_dict[5.0])
dict_keys(['a_number', 'number_list', 'a_string', 5.0])
dict_values([1, [1, 2, 3], 'string', 'a float key'])
dict_items([('a_number', 1), ('number_list', [1, 2, 3]), ('a_string', 'string'), (5.0, 'a float key')])
string
a float key
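
A short sketch of what "unique and immutable" keys mean in practice. The keys and values below are made up for illustration. Assigning to an existing key replaces its value, and a tuple can be a key because it is immutable (a list cannot).

```python
my_dict = {'a_number': 1, 'a_string': 'string'}

# Assigning to a new key adds an entry
my_dict['new_key'] = 'new value'

# Assigning to an existing key replaces its value (keys are unique)
my_dict['a_number'] = 2

# `in` checks whether a key exists
has_string = 'a_string' in my_dict
has_missing = 'missing' in my_dict
print(has_string, has_missing)

# A tuple is immutable, so it can be used as a key
my_dict[(0, 1)] = 'a tuple key'

print(my_dict)
```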

Question

  • What does my_dict['a_number'] return?
my_dict = {
    'a_number': 1,
    'number_list': [1, 2, 3],
    'a_string': 'string',
    'a_number': 2
}


print(my_dict['a_number'])
2

Conditional statements

  • Execute code only if a condition is met
  • Use if, elif (else if), and else
porridge_temp = 45

if porridge_temp < 40:
    print("Too cold!")
elif porridge_temp > 50:
    print("Too hot!")
else:
    print("Just right!")
Just right!

Question

  • Write a loop that goes through the numbers 0 to 10
  • For each number, print whether it is even or odd
    • Hint: use the modulus operator % and == to check for evenness
for i in range(11):
    if i % 2 == 0:
        print(f"{i} is Even")
    else:
        print(f"{i} is Odd")
0 is Even
1 is Odd
2 is Even
3 is Odd
4 is Even
5 is Odd
6 is Even
7 is Odd
8 is Even
9 is Odd
10 is Even

Using and and or

  • and and or are logical operators
  • Used to combine conditions
print(True and False)
print(True or False)
False
True
amount_of_rain = 0
amount_of_uv_rays = 10

if amount_of_rain > 10 and amount_of_uv_rays > 0:
    print("Take umbrella and sunscreen!")
else:
    print("Enjoy the sun!")
Enjoy the sun!
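
The example above only uses and. As a sketch with made-up variables (is_raining, is_sunny), or is True when at least one condition holds, and not flips a condition:

```python
is_raining = False
is_sunny = True

# `or` is True if at least one of the conditions is True
if is_raining or is_sunny:
    advice = "Check what to bring before you leave."
else:
    advice = "Nothing special needed."
print(advice)

# `not` flips True to False and False to True
sunscreen_only = is_sunny and not is_raining
print(sunscreen_only)
```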

For loops

  • Do the same thing multiple times
    • E.g. analyse N images
  • Watch out for indentation!
    • Python uses indentation to define blocks of code
    • Usually 2-4 spaces (or a tab)
for i in [0, 1, 2]:
    print(i)
0
1
2


for i in range(5):
    print(i)
0
1
2
3
4

For loops

  • You can loop over any iterable
  • Use _ if you don’t need the loop variable
  • enumerate() gives you the index and the value
  • range() generates a sequence of numbers
    • range(n) from 0 to n-1
    • range(a, b) from a to b-1
    • range(a, b, step) from a to b-1 with jumps of step
my_list = ['cat', 'dog', 'rabbit']
for animal in my_list:
    print("My pet is a: " + animal)
My pet is a: cat
My pet is a: dog
My pet is a: rabbit


for _ in range(3):
    print("Hello!")
Hello!
Hello!
Hello!


for index, animal in enumerate(my_list):
    print(f"My pet #{index} is a: {animal}")
My pet #0 is a: cat
My pet #1 is a: dog
My pet #2 is a: rabbit
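
The three forms of range() listed above can be compared by turning each one into a list. This is only a quick sketch, and the variable names are illustrative:

```python
# range(n): from 0 to n-1
up_to_five = list(range(5))
print(up_to_five)        # [0, 1, 2, 3, 4]

# range(a, b): from a to b-1
two_to_five = list(range(2, 6))
print(two_to_five)       # [2, 3, 4, 5]

# range(a, b, step): from a, in jumps of step, stopping before b
every_third = list(range(0, 10, 3))
print(every_third)       # [0, 3, 6, 9]
```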

Break and continue statements

  • break out of a loop
  • continue to the next iteration
for i in range(10):
    if i == 5:
        break
    print(i)
0
1
2
3
4
for i in range(10):
    if i == 5:
        continue
    print(i)
0
1
2
3
4
6
7
8
9

Question

  • Create a list of integers from 0 to 5
  • Use a loop to find the sum of the squares of the integers in the list
  • Print the result
numbers = [0, 1, 2, 3, 4, 5]
sum_of_squares = 0

for number in numbers:
    sum_of_squares = sum_of_squares + number**2

print(sum_of_squares)
55


numbers = [0, 1, 2, 3, 4, 5]
sum_of_squares = 0

for number in numbers:
    sum_of_squares += number**2

print(sum_of_squares)
55

While loops

  • What if you don’t know when you should stop iterating?
  • Use while!
  • Does something while a condition is met
expected_result = 10
result = 0
while not expected_result == result:
    print("Not there yet...")
    print(f"Result is {result}")
    result += 1
print("We got there!")
Not there yet...
Result is 0
Not there yet...
Result is 1
Not there yet...
Result is 2
Not there yet...
Result is 3
Not there yet...
Result is 4
Not there yet...
Result is 5
Not there yet...
Result is 6
Not there yet...
Result is 7
Not there yet...
Result is 8
Not there yet...
Result is 9
We got there!

Question

  • Write a loop that only prints numbers divisible by 3 until you get to 30
  • But skip printing if the number is divisible by 5
  • Use a while loop
number = 0
while number < 30:
    if number % 3 == 0 and number % 5 != 0:
        print(number)
    number += 1
3
6
9
12
18
21
24
27

List comprehensions

  • Concise way to create lists
  • [expression for item in iterable]
  • Can include conditions
  • [expression for item in iterable if condition]

Classic:

squares = []
for num in range(5):
    squares.append(num**2)
print(squares)
[0, 1, 4, 9, 16]

List comprehension:

squares = [num**2 for num in range(5)]
print(squares)
[0, 1, 4, 9, 16]

With condition:

large_squares = [num**2 for num in range(10) if num > 3]
print(large_squares)
[16, 25, 36, 49, 64, 81]

Question

  • Create a list of the squares of the even numbers from 1 to 10 using a list comprehension
even_squares = [num**2 for num in range(1, 11) if num % 2 == 0]
print(even_squares)
[4, 16, 36, 64, 100]

Writing your first Python script

Ways of working with Python

  • REPL (Read-Eval-Print Loop), “interactive Python”, “Python console”
  • Jupyter notebooks (.ipynb)
  • Scripts (.py)

Writing your first Python script

Demo time! 🧑🏻‍💻👩🏻‍💻

Loading and saving data

Writing files

  • Python can open files in various modes:
    • ‘r’ - read (default)
    • ‘w’ - write (create or overwrite)
    • ‘a’ - append (adds to the end of the file)
  • Always close the file after you’re done
  • Use with to do this automatically
new_file = open('example.txt', 'w')
new_file.write('Hello, world!\n')
new_file.close()


with open('example.txt', 'a') as new_file:
    new_file.write("Appended line.\n")
    new_file.write('Hello, world!\n')

Reading files

  • Open file in read mode to get content
  • Files can also be read line by line
with open('example.txt', 'r') as f:
    contents = f.read()

print(contents)
Hello, world!
Appended line.
Hello, world!
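
Reading line by line is not shown above. A minimal sketch follows. It assumes a small throwaway file (the name lines_example.txt is made up) so that it runs on its own.

```python
# Create a small file so the example is self-contained
with open('lines_example.txt', 'w') as f:
    f.write('first line\n')
    f.write('second line\n')

lines = []
with open('lines_example.txt', 'r') as f:
    # Looping over the file object gives one line at a time
    for line in f:
        # strip() removes the newline character at the end of the line
        lines.append(line.strip())

print(lines)
```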

Demo

  • Let’s write a script that saves the below data to a comma-separated file
column_labels = "sample_id,speed,distance"
samples = [(0, 12, 53), (1, 7, 23), (2, 15, 30)]

Demo

column_labels = "sample_id,speed,distance"
samples = [(0, 12, 53), (1, 7, 23), (2, 15, 30)]

with open("out.csv", "w") as data_file:
    data_file.write(column_labels)
    data_file.write("\n")

    for sample in samples:
        for value in sample:
            data_file.write(str(value) + ",")
        data_file.write("\n")
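
The inner loop above writes a comma after every value, so each data row ends with a trailing comma. A possible variation, sketched below, builds each row with str.join so that commas appear only between values. The file name out_joined.csv is made up here so that out.csv is left untouched.

```python
column_labels = "sample_id,speed,distance"
samples = [(0, 12, 53), (1, 7, 23), (2, 15, 30)]

with open("out_joined.csv", "w") as data_file:
    data_file.write(column_labels + "\n")

    for sample in samples:
        # Convert each value to a string, then join them with commas
        row = ",".join([str(value) for value in sample])
        data_file.write(row + "\n")

# Read the file back to check what was written
with open("out_joined.csv", "r") as data_file:
    written = data_file.read()

print(written)
```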

Functions

Functions

  • Rectified Linear Unit (ReLU)
    • f(x) = max(0, x)

x = 5
if x > 0:
    y = x
else:
    y = 0
print(y)
5

Functions

  • Time consuming and error prone to repeat code
  • Functions allow you to:
    • Reuse code
    • Break problems into smaller pieces
  • Scope defined by indentation
  • Defined using the def keyword
  • Can take inputs (arguments) and return outputs
def relu(x):
    if x > 0:
        return x
    else:
        return 0

y = relu(0.2)
print(y)

print(relu(5))
print(relu(-3))
0.2
5
0

Exercise

  • Write a function to check that a password is at least 8 characters long
  • The function should take a string as input and return True if the password is long enough, and False otherwise
  • You can use the built-in len() function to get the length of a string
def is_valid_password(password):
    # Your code here
    pass

Exercise

def is_valid_password(password):
    if len(password) >= 8:
        return True
    else:
        return False

print(is_valid_password("longpassword"))
print(is_valid_password("short"))
True
False

Arguments

  • Functions can take multiple arguments
  • Positional arguments
    • Must be given in the correct order
  • Keyword arguments
    • Can be given in any order
    • Use the syntax name=value
    • Must come after positional arguments
  • Default arguments
    • Have a default value if not provided
    • Must come after non-default arguments
def print_sum(num1, num2):
    my_sum = num1 + num2
    print(my_sum)

print_sum(3, 5)  # Positional arguments
print_sum(num2=5, num1=3)  # Keyword arguments
8
8

Arguments

def list_animals(first="dog", second="cat", third="penguin"):
    print(f"First animal: {first}")
    print(f"Second animal: {second}")
    print(f"Third animal: {third}")

list_animals()
First animal: dog
Second animal: cat
Third animal: penguin


list_animals(second="elephant", first="cow")
First animal: cow
Second animal: elephant
Third animal: penguin


list_animals(second="lion", "tiger")
  Cell In[62], line 1
    list_animals(second="lion", "tiger")
                                       ^
SyntaxError: positional argument follows keyword argument

Arguments

def list_animals(first="dog", second, third="penguin"):
    print(f"First animal: {first}")
    print(f"Second animal: {second}")
    print(f"Third animal: {third}")

list_animals()
  Cell In[63], line 1
    def list_animals(first="dog", second, third="penguin"):
                                  ^
SyntaxError: parameter without a default follows parameter with a default

Exercise

  • Write a function that takes two strings as arguments and prints the longer of the two strings
  • One of the arguments should have a default value of an empty string
def longer_string(str1, str2=""):
    # Your code here
    pass

Exercise

def longer_string(str1, str2=""):
    if len(str1) > len(str2):
        long_str = str1
    else:
        long_str = str2

    print(long_str)

longer_string("apple", "banana")
longer_string("apple")
banana
apple

Using * and **

  • * and ** are unpacking operators, useful to unpack tuples, lists and dictionaries
  • You might have seen the syntax *args and **kwargs before
  • This is just a convention to indicate that the function takes a variable number of arguments
def my_func(*args, **kwargs):
    print(f"Unpacking positional arguments: {args}")
    print(f"Unpacking keyword arguments: {kwargs}")

my_list = [1, 2, 3, 4, 5]
my_dict = {'a': 1, 'b': 2, 'c': 3}

my_func(*my_list)

my_func(**my_dict)
Unpacking positional arguments: (1, 2, 3, 4, 5)
Unpacking keyword arguments: {}
Unpacking positional arguments: ()
Unpacking keyword arguments: {'a': 1, 'b': 2, 'c': 3}

Using * and **

def my_func(*args, **kwargs):
    print(f"Unpacking positional arguments: {args}")
    print(f"Unpacking keyword arguments: {kwargs}")

my_func(1, 'a', 3.14)
my_func(name="John", age=30)
my_func(1, 2, 3, name="Jane", city="New York")
Unpacking positional arguments: (1, 'a', 3.14)
Unpacking keyword arguments: {}
Unpacking positional arguments: ()
Unpacking keyword arguments: {'name': 'John', 'age': 30}
Unpacking positional arguments: (1, 2, 3)
Unpacking keyword arguments: {'name': 'Jane', 'city': 'New York'}

How are *args and **kwargs useful?

  • *args and **kwargs are useful when you don’t know which arguments will be passed to the function
  • Or when you wrap another function and want to pass all arguments to the wrapped function
def my_func(**kwargs):
    if 'name' in kwargs:
        print(f"Hello, {kwargs['name']}!")
    else:
        print("Hello, world!")

my_func(name="John")
my_func()
my_func(name="Jane")
Hello, John!
Hello, world!
Hello, Jane!
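
The wrapping case mentioned above could look like the sketch below. Both functions (add_all and add_all_and_report) are hypothetical and exist only to show arguments being passed through unchanged.

```python
def add_all(*numbers, start=0):
    # Add up any number of positional arguments, beginning from start
    total = start
    for number in numbers:
        total += number
    return total

def add_all_and_report(*args, **kwargs):
    # Forward every positional and keyword argument to add_all() unchanged
    result = add_all(*args, **kwargs)
    print(f"add_all called with {args} and {kwargs}: result {result}")
    return result

total = add_all_and_report(1, 2, 3, start=10)
```

The wrapper does not need to know which arguments add_all accepts, so it keeps working if add_all gains new arguments later.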

Return values

  • Functions can return one or more values
  • Use the return keyword
  • A function can have multiple return statements but only one will be executed
  • If no return statement is given, the function returns None
def a_func(a, b):
    print(a + b)

def b_func(a, b):
    return a + b


a_func(10, 20)
30


b_func(10, 20)
30


result = b_func(10, 20)
print(result)
30
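
To see the last bullet in action, the sketch below stores the result of both functions. a_func prints the sum but has no return statement, so the stored value is None.

```python
def a_func(a, b):
    print(a + b)  # prints the sum, but returns nothing

def b_func(a, b):
    return a + b  # hands the sum back to the caller

printed_only = a_func(10, 20)  # 30 is printed during the call
returned = b_func(10, 20)

print(printed_only)
print(returned)
```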

Return values

def c_func(a, b, c):
    sum1 = a + b
    sum_all = a + b + c

    return sum1, sum_all


result1, result2 = c_func(1, 2, 3)
print(result1)
print(result2)
3
6


result = c_func(1, 2, 3)
print(type(result))
print(result)
<class 'tuple'>
(3, 6)


_, result2 = c_func(1, 2, 3)
print(result2)
6

Classes and objects

Objects

  • Everything in Python is an object
    • Integer, float, string, list, functions…
  • Objects have attributes and methods
    • Attributes: properties of the object
    • Methods: functions that belong to the object
a_string = "penguin"
print(a_string.capitalize())
Penguin
file_obj = open("example.txt", "a")
print(file_obj.name)
print(file_obj.mode)
file_obj.close()
example.txt
a

Classes

  • A class is a blueprint for creating objects
  • Defined using the class keyword
  • Can have attributes and methods (functions)
  • __init__ method is called when an object is created
  • self refers to the instance of the class
class Animal():
    def __init__(self, species):
        self.species = species

    def greet(self):
        print(f"Hello, I am a {self.species}.")

pingu = Animal("penguin")
print(type(pingu))
print(pingu.species)
pingu.greet()
<class '__main__.Animal'>
penguin
Hello, I am a penguin.
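
As a sketch of what "self refers to the instance" means, two objects made from the same class keep their own attribute values. The second animal (rex) is made up for illustration.

```python
class Animal():
    def __init__(self, species):
        self.species = species

    def greet(self):
        print(f"Hello, I am a {self.species}.")

pingu = Animal("penguin")
rex = Animal("dog")

# Each instance has its own species attribute
pingu.greet()
rex.greet()

# Changing an attribute on one instance does not affect the other
rex.species = "wolf"
rex.greet()
pingu.greet()
```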

Exercise

  • Try adding a new attribute noise to the Animal class
  • Add a method make_noise that prints the noise of the animal
class Animal():
    # Your code here
    pass

pingu = Animal("penguin", "noot")
pingu.make_noise() # Output: noot

Exercise

class Animal():
    def __init__(self, species, noise):
        self.species = species
        self.noise = noise

    def greet(self):
        print(f"{self.noise}, I am a {self.species}.")

    def make_noise(self):
        print(self.noise)

pingu = Animal("penguin", "noot")
pingu.make_noise()
noot

Errors and exceptions

Errors and Exceptions

  • You’ve probably seen an error already
  • If not, try these. Why don’t they work?
import london
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[81], line 1
----> 1 import london

ModuleNotFoundError: No module named 'london'
len(print)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[82], line 1
----> 1 len(print)

TypeError: object of type 'builtin_function_or_method' has no len()
print("Hello"
  Cell In[83], line 1
    print("Hello"
                 ^
SyntaxError: incomplete input

SyntaxError

  • Something is wrong with the structure of your code, so the code cannot run
  • What’s wrong below?
print hello
  Cell In[84], line 1
    print hello
    ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print(...)?
for i in [0, 1, 2]
    print(i)
  Cell In[85], line 1
    for i in [0, 1, 2]
                      ^
SyntaxError: expected ':'
"hello" = some_string
  Cell In[86], line 1
    "hello" = some_string
    ^
SyntaxError: cannot assign to literal here. Maybe you meant '==' instead of '='?

Exceptions

  • Something went wrong while your code was running
  • What’s wrong in these examples?
1 / 0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[87], line 1
----> 1 1 / 0

ZeroDivisionError: division by zero
giraffe * 10
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[88], line 1
----> 1 giraffe * 10

NameError: name 'giraffe' is not defined
a_list = [1, 2, 3]
a_list[5]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[89], line 2
      1 a_list = [1, 2, 3]
----> 2 a_list[5]

IndexError: list index out of range

Tracebacks

  • Help you find the source of the error
  • Read from the bottom up
  • Look for the last line that is your code
  • Debugging manifesto 🐛

Tracebacks

def divide_0(x):
    return x / 0

def call_func(x):
    y = divide_0(x)

z = call_func(10)
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[90], line 7
      4 def call_func(x):
      5     y = divide_0(x)
----> 7 z = call_func(10)

Cell In[90], line 5, in call_func(x)
      4 def call_func(x):
----> 5     y = divide_0(x)

Cell In[90], line 2, in divide_0(x)
      1 def divide_0(x):
----> 2     return x / 0

ZeroDivisionError: division by zero

Tracebacks

def none_function():
    return None

def get_file_name():
    return none_function()

file_name = none_function()
open_file = open(file_name)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[91], line 8
      5     return none_function()
      7 file_name = none_function()
----> 8 open_file = open(file_name)

File ~/.local/lib/python3.12/site-packages/IPython/core/interactiveshell.py:343, in _modified_open(file, *args, **kwargs)
    336 if file in {0, 1, 2}:
    337     raise ValueError(
    338         f"IPython won't let you open fd={file} by default "
    339         "as it is likely to crash IPython. If you know what you are doing, "
    340         "you can use builtins' open."
    341     )
--> 343 return io_open(file, *args, **kwargs)

TypeError: expected str, bytes or os.PathLike object, not NoneType

Handling exceptions

  • What if you know an error might happen?
def divide(x, y):
    print(x / y) # This might raise an error

divide(10, 2)
divide(10, 0)
5.0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[92], line 5
      2     print(x / y) # This might raise an error
      4 divide(10, 2)
----> 5 divide(10, 0)

Cell In[92], line 2, in divide(x, y)
      1 def divide(x, y):
----> 2     print(x / y)

ZeroDivisionError: division by zero

Handling exceptions

  • What if you know an error might happen?
  • You can handle it with try and except
def safe_divide(x, y):
    try:
        value = x / y
        print(f"Result: {value}")
    except:
        print("Y cannot be zero!")
safe_divide(10, 2)
safe_divide(10, 0)
Result: 5.0
Y cannot be zero!

Handling exceptions

  • What’s the problem with this code?
safe_divide("10", 2)
Y cannot be zero!
  • Don’t make your except too general!
def better_safe_divide(x, y):
    try:
        value = x / y
        print(f"Result: {value}")
    except ZeroDivisionError:
        print("Y cannot be zero!")
    except TypeError:
        print("Both x and y must be numbers!")

better_safe_divide(10, 2)
better_safe_divide(10, 0)
better_safe_divide("10", 2)
Result: 5.0
Y cannot be zero!
Both x and y must be numbers!

Exercise

  • Write a function to return the length of the input
  • If the object has no length, return None
def safe_len(x):
    # Your code here
    pass

Exercise

def safe_len(x):
    try:
        return len(x)
    except TypeError:
        return None

print(safe_len("hello"))
print(safe_len(42))
print(safe_len([1, 2, 3]))
5
None
3

IDEs

IDEs: Integrated Development Environments

  • Used by most developers
  • Allows you to switch between the three main ways of working with Python
  • Within the same program you can:
    • Edit: text editor 📝
    • Organise: file explorer 📁
    • Run: console 💻

Example IDEs

You can click on the name to go to the download page.

Install the one you like most!
Igor prefers PyCharm, Laura prefers VS Code.

Using IDEs

Virtual environments

Scientific Python ecosystem

Virtual environments

  • Many packages are continuously developed 🔁
    • This means there will be breaking changes
    • Some packages will require specific versions of other packages
  • Similar with Python versions
    • New versions of Python may introduce new features or deprecate old ones
  • How can we have two versions of the same package installed at the same time?
  • Virtual environments! 🎉

Virtual environments

  • A virtual environment is an isolated Python installation
    • Each environment has its own Python version
    • Each environment has its own set of installed packages
  • Popular tools to manage virtual environments:

Conda environments

  • Conda is a package manager and virtual environment manager
  • Separates Python environments from each other (and from system Python)
  • Can also manage non-Python packages (e.g. R, C libraries, etc…)
  • Reproducible environments that can be shared with:
    • Collaborators
    • Readers of your paper
    • Future you (HPC, etc…)

Environment management

In your terminal (use Miniforge Prompt on Windows):


conda create --name python-intro python=3.13 notebook


  • Creates a new environment named python-intro
  • Installs Python 3.13 and Jupyter Notebook in that environment

Environment management

In your terminal (use Miniforge Prompt on Windows):


conda activate python-intro


  • Remember to activate the environment before using it!

Environment management

In your terminal (use Miniforge Prompt on Windows):


conda deactivate


  • Deactivates the current environment

Environment management

In your terminal (use Miniforge Prompt on Windows):


conda env list


  • Lists all conda environments on your system

Environment management

In your terminal (use Miniforge Prompt on Windows):


conda remove --name python-intro --all


  • Deletes the python-intro environment and all its packages

Modules and packages

Standard library

  • A module is a file containing Python code
  • A package is a collection of modules
  • The standard library is a set of modules that come with Python
  • It provides a wide range of functionality
    • File I/O
    • Time and date handling
    • Math and statistics
  • Accessed using the import and from keywords
import os

print(os.getcwd())
/home/runner/work/course-intro-python/course-intro-python
import math

print(math.sqrt(16))
4.0
from pathlib import Path
print(Path.home())
/home/runner
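
The examples above cover file paths and basic maths. As a further sketch of the "time and date handling" and "math and statistics" bullets, the standard library also includes the statistics and datetime modules. The numbers and dates below are made up.

```python
import statistics
from datetime import date, timedelta

# Mean of a list of numbers
measurements = [2.0, 4.0, 6.0]
mean_value = statistics.mean(measurements)
print(mean_value)

# Date arithmetic: one day after 31 January 2024
start = date(2024, 1, 31)
next_day = start + timedelta(days=1)
print(next_day)
```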

Installing a third-party package

Important

  • In the terminal, not in the Python console!
  • Remember to activate the conda environment first
conda activate python-intro
pip install pandas

Using packages

  • pandas is a popular package for data science
  • Based on DataFrames (tables) and Series (columns)
  • You can access it using the import keyword
import pandas

data_frame = pandas.read_csv("out.csv")

print(data_frame)
   sample_id  speed  distance
0         12     53       NaN
1          7     23       NaN
2         15     30       NaN

Importing packages

  • You can import all available functions in a package
  • You can import a specific function from a package
    • Use from ... import ...
  • You can also rename the package or function using as
    • Common convention for pandas is pd
import pandas

df = pandas.read_csv("out.csv")


from pandas import read_csv

df = read_csv("out.csv")


import pandas as pd

df = pd.read_csv("out.csv")

Installing packages

Two main ways to install packages:

pip

  • Python’s built-in package manager
  • Uses the Python Package Index (PyPI)
pip install package_name

conda

  • Uses multiple channels (conda-forge is best, avoid defaults)
  • Can install non-Python dependencies
conda install package_name

Exercise

In your conda environment, using pip:

  • Install numpy
  • Uninstall numpy
  • Install numpy version 1.26.4
  • Update the installation of numpy
  • Install matplotlib and scikit-image in one command
  • List all installed packages with pip list
pip install numpy


pip uninstall numpy


pip install numpy==1.26.4


pip install numpy -U


pip install matplotlib scikit-image

Exercise

In your conda environment, using conda:

  • Install scipy
  • Install scipy version 1.11.1
  • Update the installation of scipy
  • List all installed packages with conda list
conda install scipy


conda install scipy=1.11.1


conda update scipy

Exercise

  • Find and install a package for opening Word documents in Python
pip install python-docx

Organising your code

Structuring your code

  • As your code gets more complex, you will want to organise it into multiple files and folders
  • This makes it easier to find and reuse code
  • Python modules and packages help you do this
  • You can also use version control (e.g. Git) to manage changes to your code (covered in the good practices lecture)

Exercise

  • Create a file my_funcs.py with the functions square_add_10 and print_twice
def square_add_10(x):
    y = x**2
    y = y + 10
    return y

def print_twice(a_string):
    print(a_string)
    print(a_string)

Exercise

  • In a new file analysis.py, import the functions from my_funcs.py and use them
  • Modularity promotes reuse!
from my_funcs import square_add_10, print_twice
result = square_add_10(5)
print("Result:", result)

print_twice("Hello, world!")
Result: 35
Hello, world!
Hello, world!

Structuring your code

  • As your codebase grows, consider organising files into directories
from utils.my_funcs import square_add_10, print_twice
result = square_add_10(5)
print("Result:", result)
print_twice("Hello, world!")
Result: 35
Hello, world!
Hello, world!
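
The folder layout behind an import such as `utils.my_funcs` can be sketched in a self-contained way. The example below builds a throwaway project in a temporary directory and imports from it. The folder name `demo_utils` is hypothetical (chosen to avoid clashing with any real `utils` module), and adding the folder to `sys.path` by hand is only needed because nothing is being run as a script from inside that folder.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway project folder: <tmp>/demo_utils/my_funcs.py
project_dir = Path(tempfile.mkdtemp())
package_dir = project_dir / "demo_utils"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")  # marks the folder as a package
(package_dir / "my_funcs.py").write_text(
    "def square_add_10(x):\n"
    "    return x**2 + 10\n"
)

# Python searches the folders listed in sys.path when importing.
# Running a script adds the script's own folder automatically;
# here the temporary project folder is added by hand.
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()

from demo_utils.my_funcs import square_add_10

print(square_add_10(5))  # 35
```

In a real project you would simply create the folder and files in your editor and run `analysis.py` from the project's top-level directory.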

Organisation

  • Organise your code like you would any other project
  • Use meaningful names for files and directories
  • Group related functionality together
.
├── python_project/
│   ├── analysis/
│   │   ├── align.py
│   │   ├── fft.py
│   │   ├── preprocessing.py
│   │   └── register.py
│   ├── IO/
│   │   ├── load.py
│   │   ├── save.py
│   │   ├── tools
│   │   ├── settings.py
│   │   └── system.py
│   └── visualisation/
│       ├── preprocess.py
│       └── visualisation.py
└── README.md

Documenting your code

Documenting your code

  • Aim to need as little documentation as possible
  • Which is easier to understand?
    • Use meaningful pronounceable names
f = 378
c = 22
fpc = f / c


foci_number = 378
cell_number = 22
foci_per_cell = foci_number / cell_number

Documenting your code

  • Each function should do one thing
  • Give functions descriptive names
  • This is a form of documentation too!
def analyse_well(well_data):
    ...

def save_analysed_well(well_data):
    ...

def process_single_well(well_data):
    analysed = analyse_well(well_data)
    save_analysed_well(analysed)

def process_plate(plate_data):
    for well_data in plate_data:
        process_single_well(well_data)

Comments

  • Sometimes the code isn’t self-explanatory
  • Use comments to explain why, not what
  • Use # for single-line comments
  • Use triple quotes """ for docstrings (Python has no separate multi-line comment syntax; start each comment line with #)
  • What’s wrong with this example?
def analyse_image(image):
    # run gaussian blur
    image = gaussian(image)
    # run otsu thresholding
    image = otsu(image)
    # run watershed
    image = watershed(image)
    return image

Comments

def analyse_image(image):
    # run gaussian blur
    image = gaussian(image)
    # run otsu thresholding
    image = otsu(image)
    # run watershed
    image = watershed(image)
    return image


def analyse_image(image):
    # remove noise
    image = gaussian(image)
    # binarise cells
    image = otsu(image)
    # split merged cells
    image = watershed(image)
    return image

Docstrings

  • Use docstrings to describe what a function does and how to use it
  • Detail inputs and outputs
  • Use triple quotes """ for docstrings
def add_numbers(num_0, num_1):
    """Adds two numbers together

    Parameters
    ----------
    num_0 : float
        First number to be added

    num_1 : float
        Second number to be added

    Returns
    -------
    float
        Sum of numbers
    """
    return num_0 + num_1
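
A docstring is more than a comment: Python stores it on the function, so tools (and you) can read it back. A short sketch, reusing the `add_numbers` example and the standard library's `inspect` module:

```python
import inspect


def add_numbers(num_0, num_1):
    """Add two numbers together.

    Parameters
    ----------
    num_0 : float
        First number to be added
    num_1 : float
        Second number to be added

    Returns
    -------
    float
        Sum of numbers
    """
    return num_0 + num_1


# The docstring is stored on the function itself
print(inspect.getdoc(add_numbers))

# In an interactive session, help(add_numbers) shows the same text
print(add_numbers(1.5, 2.5))  # 4.0
```

IDEs typically use the same information to show a tooltip when you hover over a function call.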

README files

  • Single file, usually README.md or README.txt
  • Outlines most important information
    • What the project does
    • How to install
    • How to use
  • Very useful for others (and future you!)
My awesome Python package
Adam Tyson 2025-01-01
code@adamltyson.com

Analyses data from X, Y, Z

To install:
...
Data requirements:
...
To run:
...
Output:
...
Troubleshooting:
...

Next steps

Next steps

  • We’ve covered some of the basics
  • Next step: practice!
    • Mess around with code
    • Solve problems
    • Google stuff
    • Contact us
    • Take additional courses

Further resources

Troubleshooting tips

  • Make sure the correct environment is activated
    • which python or which pip to check (where python in the Windows command prompt)
  • Check the scope of your variable
  • Pay attention to whether an operation works in place or returns a new object
    • Especially with pandas and numpy
  • Structure your code as a package as soon as you can
    • pyproject.toml → more on this in future sessions!
  • Look out for deep vs shallow copies
    • Especially with collections (lists, dicts, sets)
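
The difference between working in place and returning a new object can be seen with plain lists. This is a small sketch using built-ins only; NumPy has a similar pair in `numpy.sort(a)` (returns a sorted copy) and `a.sort()` (sorts in place).

```python
speeds = [53, 23, 30]

# sorted() returns a new list and leaves the original untouched
ordered = sorted(speeds)
print(speeds)   # [53, 23, 30]
print(ordered)  # [23, 30, 53]

# list.sort() changes the list in place and returns None
result = speeds.sort()
print(speeds)  # [23, 30, 53]
print(result)  # None
```

A common mistake is writing `speeds = speeds.sort()`, which replaces the list with `None`.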

Question

  • What does this print?
a = [1, 2, 3]
b = a
b[0] = 42

print(a[0])
42
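
In the question above, `b = a` gives the same list a second name rather than making a copy. A short sketch of how to get an independent copy, and of where a shallow copy stops being enough, using the standard library's `copy` module:

```python
import copy

a = [1, 2, 3]
b = a.copy()  # a new list, so changing b leaves a alone
b[0] = 42
print(a[0])  # 1

# A shallow copy only copies the outer list; inner lists are still shared
nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)
deep = copy.deepcopy(nested)

nested[0][0] = 42
print(shallow[0][0])  # 42 (inner list is shared with nested)
print(deep[0][0])     # 1  (fully independent copy)
```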

Troubleshooting tips

  • Read the error messages!
    • Google is your friend
    • LLMs can help too
    • Ask a colleague or contact us
  • Use a debugger (e.g. pdb, or IDE built-in debuggers)
  • Write tests for your code (e.g. using pytest)
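
As a rough idea of what a test looks like, here is a sketch for the `square_add_10` function from earlier. The function is repeated so the example is self-contained; in a real project you would import it from `my_funcs.py` instead. The file name `test_my_funcs.py` is an assumption for illustration.

```python
def square_add_10(x):
    """Square x and add 10."""
    return x**2 + 10


# pytest collects functions whose names start with "test_"
def test_square_add_10_positive():
    assert square_add_10(5) == 35


def test_square_add_10_zero():
    assert square_add_10(0) == 10


def test_square_add_10_negative():
    assert square_add_10(-3) == 19
```

Saved as `test_my_funcs.py`, these can be run with `pytest test_my_funcs.py` once pytest is installed. Each test is just a function containing plain `assert` statements.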