5.2. Optimization Micro-benchmarking

5.2.1. Evaluation

  • Fresh start of Python process

  • Clean memory before start

  • Same data

  • Same starting conditions: CPU load, RAM usage, disk I/O (check with iostat)

  • Do not include Python interpreter startup time in the measurement

  • Verify that you measure the code you intend to measure
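
The checklist above can be sketched with the standard-library `timeit` module. This is a minimal illustration, not a prescribed method: `DATA`, `with_loop`, `with_comprehension` and `measure` are hypothetical names chosen for the example. The timer starts inside an already running process, so interpreter startup is not measured, and both candidates receive the same input.

```python
import statistics
import timeit

# Hypothetical workload: the same input data is used for every run
DATA = list(range(10_000))


def with_loop(data):
    result = []
    for x in data:
        result.append(x * 2)
    return result


def with_comprehension(data):
    return [x * 2 for x in data]


def measure(func, repeat=5, number=100):
    """Return per-call timings in seconds, one value per repetition."""
    # The lambda adds a small constant call overhead to every candidate
    timer = timeit.Timer(lambda: func(DATA))
    return [t / number for t in timer.repeat(repeat=repeat, number=number)]


# Check what you measure: both candidates must produce the same result
assert with_loop(DATA) == with_comprehension(DATA)

timings = {
    func.__name__: measure(func)
    for func in (with_loop, with_comprehension)
}

for name, values in timings.items():
    print(f'{name}: min={min(values):.6f}s, '
          f'median={statistics.median(values):.6f}s')
```

The minimum is usually the most stable figure to compare, because slower repetitions tend to reflect interference from other processes rather than the code under test.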

5.2.2. PyPerformance

  • pip install pyperformance

  • pyperformance run -b myfile.py - run only the benchmark selected with -b (here: myfile.py)

$ python3.12 -m venv venv-py312
$ venv-py312/bin/pip install pyperformance
$ venv-py312/bin/pyperformance run -b myfile.py
$ python3.13 -m venv venv-py313
$ venv-py313/bin/pip install pyperformance
$ venv-py313/bin/pyperformance run -b myfile.py
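
The same idea of comparing two interpreters can be tried without pyperformance. The sketch below is a stdlib-only stand-in, not pyperformance's own API: `benchmark`, `collect` and `report` are hypothetical names, and the workload is a placeholder for the code under test. Each run tags its timing with the interpreter version, so output from different virtual environments can be compared side by side.

```python
import json
import sys
import timeit


def benchmark():
    # Hypothetical workload, stands in for the code under test
    return sorted(str(x) for x in range(1_000))


def collect(repeat=5, number=50):
    """Time the workload and tag the result with the interpreter version."""
    runs = timeit.repeat(benchmark, repeat=repeat, number=number)
    return {
        'python': '.'.join(map(str, sys.version_info[:3])),
        'best_seconds_per_call': min(runs) / number,
    }


report = collect()
print(json.dumps(report))
```

Assuming the script is saved as `bench.py` (a hypothetical filename), it could be run once with `venv-py312/bin/python bench.py` and once with `venv-py313/bin/python bench.py`, and the two JSON lines compared.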

5.2.3. References

5.2.4. Assignments

# %% About
# - Name: About EntryTest Endswith
# - Difficulty: easy
# - Lines: 6
# - Minutes: 5

# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author

# %% English
# 1. Collect email addresses with domain listed in `DOMAINS`
# 2. Define variable `result: list[str]` with the result
# 3. Search for emails only in `DATA` -> `rows`
# 4. Run doctests - all must succeed

# %% Polish
# 1. Zbierz adresy email z domeną wylistowaną w `DOMAINS`
# 2. Zdefiniuj zmienną `result: list[str]` z wynikiem
# 3. Szukaj adresów email tylko w `DATA` -> `rows`
# 4. Uruchom doctesty - wszystkie muszą się powieść

# %% Expected
# >>> result
# ['alice@example.com',
#  'bob@example.com',
#  'carol@example.com',
#  'mallory@example.net']

# %% Why
# - Check if you can filter data
# - Check if you know string methods
# - Check if you know how to iterate over `list[dict]`

# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0

>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'

>>> assert 'result' in globals(), \
'Variable `result` is not defined; assign result of your program to it.'

>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'

>>> assert type(result) is list, \
'Variable `result` has an invalid type; expected: `list`.'

>>> assert len(result) > 0, \
'Variable `result` has an invalid length; expected more than zero elements.'

>>> assert all(type(x) is str for x in result), \
'Variable `result` has elements of an invalid type; all items should be: `str`.'

>>> from pprint import pprint
>>> result = sorted(result)
>>> pprint(result)
['alice@example.com',
 'bob@example.com',
 'carol@example.com',
 'mallory@example.net']
"""

# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`

# %% Imports

# %% Types
result: list[str]

# %% Data
DATA = {
    'database': 'myapp',
    'table': 'users',
    'rows': [
        {'username': 'alice', 'email': 'alice@example.com'},
        {'username': 'bob', 'email': 'bob@example.com'},
        {'username': 'carol', 'email': 'carol@example.com'},
        {'username': 'dave', 'email': 'dave@example.org'},
        {'username': 'eve', 'email': 'eve@example.org'},
        {'username': 'mallory', 'email': 'mallory@example.net'},
    ]
}

DOMAINS = ('example.com', 'example.net')

# %% Result
result = ...

# %% About
# - Name: About EntryTest ToListDict
# - Difficulty: easy
# - Lines: 2
# - Minutes: 5

# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author

# %% English
# 1. Convert `DATA` from `list[tuple]` to `list[dict]`
# 2. First row has column names (keys in result `dict`)
# 3. Define variable `result` with the result
# 4. Run doctests - all must succeed

# %% Polish
# 1. Przekonwertuj `DATA` z `list[tuple]` do `list[dict]`
# 2. Pierwszy wiersz ma nazwy kolumn (klucze w wynikowym `dict`)
# 3. Zdefiniuj zmienną `result` z wynikiem
# 4. Uruchom doctesty - wszystkie muszą się powieść

# %% Expected
# >>> result
# [{'firstname': 'Alice', 'lastname': 'Apricot', 'age': 30},
#  {'firstname': 'Bob', 'lastname': 'Blackthorn', 'age': 31},
#  {'firstname': 'Carol', 'lastname': 'Corn', 'age': 32},
#  {'firstname': 'Dave', 'lastname': 'Durian', 'age': 33},
#  {'firstname': 'Eve', 'lastname': 'Elderberry', 'age': 34},
#  {'firstname': 'Mallory', 'lastname': 'Melon', 'age': 15}]

# %% Why
# - Convert data from `list[tuple]` to `list[dict]`
# - `list[tuple]` is used to represent CSV data
# - `list[tuple]` is used to represent database rows
# - `list[dict]` is used to represent JSON data
# - CSV is the most popular format in data science

# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0

>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'

>>> assert 'result' in globals(), \
'Variable `result` is not defined; assign result of your program to it.'

>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> result = list(result)
>>> assert type(result) is list, \
'Variable `result` has an invalid type; expected: `list`.'

>>> assert len(result) > 0, \
'Variable `result` has an invalid length; expected more than zero elements.'

>>> assert all(type(x) is dict for x in result), \
'Variable `result` has elements of an invalid type; all items should be: `dict`.'

>>> from pprint import pprint
>>> pprint(result, sort_dicts=False)
[{'firstname': 'Alice', 'lastname': 'Apricot', 'age': 30},
 {'firstname': 'Bob', 'lastname': 'Blackthorn', 'age': 31},
 {'firstname': 'Carol', 'lastname': 'Corn', 'age': 32},
 {'firstname': 'Dave', 'lastname': 'Durian', 'age': 33},
 {'firstname': 'Eve', 'lastname': 'Elderberry', 'age': 34},
 {'firstname': 'Mallory', 'lastname': 'Melon', 'age': 15}]
"""

# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`

# %% Imports

# %% Types
result: list[dict[str, str | int]]

# %% Data
DATA = [
    ('firstname', 'lastname', 'age'),
    ('Alice', 'Apricot', 30),
    ('Bob', 'Blackthorn', 31),
    ('Carol', 'Corn', 32),
    ('Dave', 'Durian', 33),
    ('Eve', 'Elderberry', 34),
    ('Mallory', 'Melon', 15),
]


# %% Result
result = ...