8.1. Recap About
8.1.1. Assignments
# %% About
# - Name: Recap About Sample
# - Difficulty: easy
# - Lines: 4
# - Minutes: 5
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Read data from `DATA` as `df: pd.DataFrame`
# 2. Shuffle all rows into random order
# 3. Reset index without leaving a backup of the old one
# 4. Define `result` with last 10 rows
# 5. Run doctests - all must succeed
# %% Polish
# 1. Wczytaj dane z `DATA` jako `df: pd.DataFrame`
# 2. Ustaw wszystkie wiersze w losowej kolejności
# 3. Zresetuj index nie pozostawiając kopii zapasowej starego
# 4. Zdefiniuj `result` z ostatnimi 10 wierszami
# 5. Uruchom doctesty - wszystkie muszą się powieść
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'
>>> assert 'result' in globals(), \
'Variable `result` is not defined; assign result of your program to it.'
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is pd.DataFrame, \
'Variable `result` has an invalid type; expected: `pd.DataFrame`.'
>>> pd.set_option('display.max_columns', 50)
>>> pd.set_option('display.max_rows', 200)
>>> pd.set_option('display.width', 500)
>>> pd.set_option('display.memory_usage', 'deep')
>>> pd.set_option('display.precision', 4)
>>> result # doctest: +NORMALIZE_WHITESPACE
Name Country Gender Flights Total Flights Total Flight Time (ddd:hh:mm)
557 Thomas Marshburn, M.D. United States Man STS-127 (2009), Soyuz TMA-07M (2012) 2 161:07:03
558 Michael Baker United States Man STS-43 (1991), STS-52 (1992), STS-68 (1994), S... 4 040:03:04
559 Rick Husband United States Man STS-96 (1999), STS-107 (2003) 2 025:13:33
560 Svetlana Savitskaya Soviet Union Woman Soyuz T-7 (1982), Soyuz T-12 (1984) 2 019:17:07
561 Charles "Pete" Conrad United States Man Gemini 5 (1965), Gemini 11 (1966), Apollo 12 (... 4 049:03:38
562 Lawrence J. DeLucas United States Man STS-50 (1992) 1 013:19:30
563 Aleksandr Laveykin Soviet Union Man Soyuz TM-2 (1987) 1 174:03:25
564 Owen Garriott United States Man Skylab 3 (1973), STS-9 (1983) 2 069:17:56
565 Ivan Vagner Russia Man Soyuz MS-16 (2020) 1 145:04:14
566 Yuri Malenchenko Russia Man Soyuz TM-19 (1994), STS-106 (2000), Soyuz TMA-... 6 826:09:22
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
import pandas as pd
import numpy as np
# %% Types
result: pd.DataFrame
# %% Data
np.random.seed(0)
DATA = 'https://python3.info/_static/astro-database.csv'
# %% Result
result = ...
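The shuffle, index reset and tail steps can be sketched on a toy frame. This is a minimal illustration, not the reference solution: the column names and values below are made up, and `df.sample(frac=1.0)` is one common way to shuffle rows (it draws from NumPy's global random state when no `random_state` is passed, which is why the seed matters).

```python
import numpy as np
import pandas as pd

# Toy data standing in for the real CSV (names and values are illustrative).
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Carol', 'Dave', 'Eve'],
    'flights': [1, 2, 3, 4, 5],
})

np.random.seed(0)  # makes the shuffle reproducible, like the seed in the task

shuffled = (
    df
    .sample(frac=1.0)          # draw 100% of the rows in random order
    .reset_index(drop=True)    # new 0..n-1 index; the old one is discarded
)

last_three = shuffled.tail(3)  # the final three rows of the shuffled frame
```

Passing `drop=True` is what prevents the old index from being kept as an extra `index` column.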
# %% About
# - Name: Recap About Sample
# - Difficulty: easy
# - Lines: 5
# - Minutes: 5
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Read data from `DATA` as `df: pd.DataFrame`
# 2. The "Order" column:
# - gives the order in which the astronaut/cosmonaut flew in space
# - sometimes several people flew on the same spacecraft; their numbers should be equal, but the data contains `NaN` instead
# - fill in the missing values using `df.ffill()`
# 3. Shuffle all rows into random order
# 4. Reset index without leaving a backup copy of the old one
# 5. Run doctests - all must succeed
# %% Polish
# 1. Wczytaj dane z `DATA` jako `df: pd.DataFrame`
# 2. W danych kolumna "Order":
# - określa kolejność astronauty/kosmonauty w kosmosie
# - Czasami kilka osób leciało tym samym statkiem i ich numery powinny być takie same, a w danych jest `NaN`.
# - Wypełnij brakujące indeksy stosując `df.ffill()`
# 3. Ustaw wszystkie wiersze w losowej kolejności
# 4. Zresetuj index nie pozostawiając kopii zapasowej starego
# 5. Uruchom doctesty - wszystkie muszą się powieść
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'
>>> assert 'result' in globals(), \
'Variable `result` is not defined; assign result of your program to it.'
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is pd.DataFrame, \
'Variable `result` has an invalid type; expected: `pd.DataFrame`.'
>>> pd.set_option('display.max_columns', 50)
>>> pd.set_option('display.max_rows', 200)
>>> pd.set_option('display.width', 500)
>>> pd.set_option('display.memory_usage', 'deep')
>>> pd.set_option('display.precision', 4)
>>> result # doctest: +NORMALIZE_WHITESPACE
Order Astronaut Type Date Spacecraft
0 244 Donald McMonagle Orbital 28 April 1991 STS-39
1 93 Georgi Ivanov Orbital 10 April 1979 Soyuz 33
2 387 Rick Husband Orbital 27 May 1999 STS-96
3 185 William Pailes Orbital 3 October 1985 51-J
4 390 Jeffrey Ashby Orbital 23 July 1999 STS-93
.. ... ... ... ... ...
578 277 Franco Malerba Orbital 31 July 1992 STS-46
579 10 Leroy Cooper Orbital 15 May 1963 Faith 7
580 359 Carlos Noriega Orbital 15 May 1997 STS-84
581 192 Rodolfo Neri Vela Orbital 27 November 1985 61-B
582 559 David Saint-Jacques Orbital 3 December 2018 Soyuz MS-11
<BLANKLINE>
[583 rows x 5 columns]
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
import pandas as pd
import numpy as np
# %% Types
result: pd.DataFrame
# %% Data
np.random.seed(0)
DATA = 'https://python3.info/_static/astro-order.csv'
# %% Result
result = ...
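Forward filling can be sketched on a toy frame. The values below are invented for illustration; the sketch assumes the missing entries are `NaN` and that each should take the last valid number above it, which is what `ffill()` does.

```python
import numpy as np
import pandas as pd

# Toy data: crew members of the same flight have NaN instead of a number.
df = pd.DataFrame({
    'Order': [1.0, 2.0, np.nan, 3.0, np.nan, np.nan],
    'Astronaut': ['A', 'B', 'C', 'D', 'E', 'F'],
})

# Propagate the last valid value downwards, then drop the float dtype
# that the NaN values forced on the column.
df['Order'] = df['Order'].ffill().astype(int)
```

Forward filling depends on row order, so in a pipeline like the one in this task it has to run before the rows are shuffled.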
# %% About
# - Name: Recap About Phones
# - Difficulty: easy
# - Lines: 5
# - Minutes: 8
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Read data from `DATA` as `df: pd.DataFrame`
# 2. Compute the total duration of all phone calls for each calendar month
# 3. Run doctests - all must succeed
# %% Polish
# 1. Wczytaj dane z `DATA` jako `df: pd.DataFrame`
# 2. Podaj informacje o łącznej długości wszystkich połączeń telefonicznych dla każdego miesiąca kalendarzowego
# 3. Uruchom doctesty - wszystkie muszą się powieść
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'
>>> assert 'result' in globals(), \
'Variable `result` is not defined; assign result of your program to it.'
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is pd.Series, \
'Variable `result` has an invalid type; expected: `pd.Series`.'
>>> pd.set_option('display.max_columns', 50)
>>> pd.set_option('display.max_rows', 200)
>>> pd.set_option('display.width', 500)
>>> pd.set_option('display.memory_usage', 'deep')
>>> pd.set_option('display.precision', 4)
>>> result # doctest: +NORMALIZE_WHITESPACE
year month
1999 10 16309.0
11 16780.0
12 14861.0
2000 1 18705.0
2 11019.0
3 14647.0
Name: duration, dtype: float64
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
import pandas as pd
# %% Types
result: pd.Series
# %% Data
DATA = 'https://python3.info/_static/phones-pl.csv'
# %% Result
result = ...
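Grouping by calendar month can be sketched as follows. The column names `datetime` and `duration` are assumptions made for this toy frame; the real CSV may name or format them differently.

```python
import pandas as pd

# Toy call log (column names and values are illustrative assumptions).
df = pd.DataFrame({
    'datetime': pd.to_datetime([
        '1999-10-01 12:00', '1999-10-15 08:30',
        '1999-11-02 09:00', '2000-01-05 18:45',
    ]),
    'duration': [10.0, 20.0, 5.0, 7.5],
})

# Two grouping keys derived from the timestamp; their names become
# the names of the resulting index levels.
year = df['datetime'].dt.year.rename('year')
month = df['datetime'].dt.month.rename('month')

total = df.groupby([year, month])['duration'].sum()
```

Grouping by both year and month keeps October 1999 separate from October of any other year, which grouping by month alone would not.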
# %% About
# - Name: Recap About FemaleTop
# - Difficulty: medium
# - Lines: 5
# - Minutes: 8
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Read data from `DATA` as `df: pd.DataFrame`
# 2. Which nationality has the most female flight time in space?
# 3. Sort the result in descending order
# 4. Run doctests - all must succeed
# %% Polish
# 1. Wczytaj dane z `DATA` jako `df: pd.DataFrame`
# 2. Który kraj ma największy nalot kobiet w kosmosie?
# 3. Posortuj wynik malejąco
# 4. Uruchom doctesty - wszystkie muszą się powieść
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'
>>> assert 'result' in globals(), \
'Variable `result` is not defined; assign result of your program to it.'
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is pd.Series, \
'Variable `result` has an invalid type; expected: `pd.Series`.'
>>> pd.set_option('display.max_columns', 50)
>>> pd.set_option('display.max_rows', 200)
>>> pd.set_option('display.width', 500)
>>> pd.set_option('display.memory_usage', 'deep')
>>> pd.set_option('display.precision', 4)
>>> result # doctest: +NORMALIZE_WHITESPACE
Nationality
American 124
Russian 6
Canadian 3
Japanese 3
Chinese 2
French 2
British 1
Italian 1
South Korean 1
Name: Flights, dtype: int64
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
import pandas as pd
# %% Types
result: pd.Series
# %% Data
DATA = 'https://python3.info/_static/astro-gender.csv'
# %% Result
result = ...
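A filter, group and sort pipeline can be sketched on toy data. The column names `Nationality`, `Gender` and `Flights`, and the label `'Female'`, are assumptions for this illustration; check the actual CSV before reusing them.

```python
import pandas as pd

# Toy data (column names and category labels are illustrative assumptions).
df = pd.DataFrame({
    'Nationality': ['American', 'Russian', 'American', 'French', 'Russian'],
    'Gender': ['Female', 'Female', 'Male', 'Female', 'Male'],
    'Flights': [3, 2, 5, 1, 4],
})

women = df[df['Gender'] == 'Female']   # boolean mask keeps matching rows

ranking = (
    women
    .groupby('Nationality')['Flights']  # one group per nationality
    .sum()                              # total per group
    .sort_values(ascending=False)       # largest total first
)
```

Selecting a single column after `groupby()` is what makes the result a `pd.Series` rather than a `pd.DataFrame`.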
# %% About
# - Name: Recap About AstronautTop10
# - Difficulty: medium
# - Lines: 5
# - Minutes: 13
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Read data from `DATA`
# 2. Create ranking of astronauts with most flights
# 3. Define `result: pd.DataFrame` with top 9
# 4. Sort by `flights` (descending) and `name` (ascending)
# 5. Run doctests - all must succeed
# %% Polish
# 1. Wczytaj dane z `DATA`
# 2. Stwórz ranking astronautów z największą liczbą lotów
# 3. Zdefiniuj `result: pd.DataFrame` z top 9
# 4. Posortuj po `flights` (malejąco) i `name` (rosnąco)
# 5. Uruchom doctesty - wszystkie muszą się powieść
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'
>>> assert 'result' in globals(), \
'Variable `result` is not defined; assign result of your program to it.'
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is pd.DataFrame, \
'Variable `result` has an invalid type; expected: `pd.DataFrame`.'
>>> pd.set_option('display.max_columns', 50)
>>> pd.set_option('display.max_rows', 200)
>>> pd.set_option('display.width', 500)
>>> pd.set_option('display.memory_usage', 'deep')
>>> pd.set_option('display.precision', 4)
>>> result.reset_index(drop=True)
name flights
0 Chang-Diaz, Franklin R. 7
1 Ross, Jerry L. 7
2 Brown, Curtis L., Jr. 6
3 Foale, C. Michael 6
4 Krikalev, Sergei 6
5 Malenchenko, Yuri 6
6 Musgrave, Franklin Story 6
7 Wetherbee, James D. 6
8 Young, John W. 6
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
import pandas as pd
# %% Types
result: pd.DataFrame
# %% Data
DATA = 'https://python3.info/_static/astro-selection.csv'
# %% Result
result = ...
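Sorting by two keys in opposite directions can be sketched on toy data. The frame below is invented and already holds one flight count per person; how those counts are derived from the real CSV is left open here.

```python
import pandas as pd

# Toy ranking input (names and counts are illustrative).
counts = pd.DataFrame({
    'name': ['Young', 'Ross', 'Foale', 'Chang-Diaz', 'Brown'],
    'flights': [6, 7, 6, 7, 6],
})

top3 = (
    counts
    # `ascending` takes one flag per sort key:
    # most flights first, ties broken alphabetically by name.
    .sort_values(['flights', 'name'], ascending=[False, True])
    .head(3)
    .reset_index(drop=True)
)
```

Without the second key, the order of rows with equal `flights` is not guaranteed, so a doctest comparing exact output could fail.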
# FIXME: this task is too difficult; move it to a case study
# %% About
# - Name: Recap About EVA
# - Difficulty: medium
# - Lines: 13
# - Minutes: 21
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Read data from `DATA` as `df: pd.DataFrame`
# 2. Create top 10 ranking of astronauts with the most time spent on EVA (ExtraVehicular Activity)
# 3. Run doctests - all must succeed
# %% Polish
# 1. Wczytaj dane z `DATA` jako `df: pd.DataFrame`
# 2. Stwórz ranking top 10 astronautów z największym czasem EVA (Spacerów kosmicznych)
# 3. Uruchom doctesty - wszystkie muszą się powieść
# %% Hints
# - Note that the file delimiter is a semicolon ";" (not a comma)
# - Parse CSV and replace newlines inside fields with `","`
# - Split names into separate columns for each spacewalker (first, second, third)
# - Split names into separate rows for each spacewalker (use ffill)
# - Split times into separate columns (hours, minutes)
# - `pd.Series.str.split()` with `expand=True`
# - `pd.DataFrame.melt()`
# - `pd.DataFrame.set_index()`
# - `pd.Series.astype()`
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'
>>> assert 'result' in globals(), \
'Variable `result` is not defined; assign result of your program to it.'
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is pd.DataFrame, \
'Variable `result` has an invalid type; expected: `pd.DataFrame`.'
>>> pd.set_option('display.max_columns', 50)
>>> pd.set_option('display.max_rows', 200)
>>> pd.set_option('display.width', 500)
>>> pd.set_option('display.memory_usage', 'deep')
>>> pd.set_option('display.precision', 4)
>>> result # doctest: +NORMALIZE_WHITESPACE
Duration
Astronaut
Anatoliy Solovyov 3 days 06:48:00
Michael Lopez-Alegria 2 days 19:40:00
Peggy Whitson 2 days 12:21:00
Fyodor Yurchikhin 2 days 11:29:00
Jerry Ross 2 days 10:38:00
John Grunsfeld 2 days 10:30:00
Richard Mastracchio 2 days 05:04:00
Sunita Williams 2 days 02:40:00
Stephen Smith 2 days 01:48:00
Edward Fincke 2 days 00:36:00
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
import pandas as pd
# %% Types
result: pd.DataFrame
# %% Data
DATA = 'https://python3.info/_static/astro-eva.csv'
# %% Result
result = ...
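Several of the hinted calls (`str.split(expand=True)`, `melt()`, `astype()`) can be combined as in the sketch below. It runs on invented data: the column names `Spacewalkers` and `Duration`, the `', '` name separator and the `H:MM` time format are all assumptions, and the real file needs additional cleaning that is not shown.

```python
import pandas as pd

# Toy spacewalk log (layout and values are illustrative assumptions).
df = pd.DataFrame({
    'Spacewalkers': ['Ann Lee, Bob Ray', 'Ann Lee', 'Bob Ray, Cy Orr'],
    'Duration': ['6:30', '1:15', '2:00'],
})

# One column per spacewalker; rows with fewer names get missing values.
names = df['Spacewalkers'].str.split(', ', expand=True).add_prefix('walker_')

# 'H:MM' -> integer hours and minutes -> a single Timedelta column.
parts = df['Duration'].str.split(':', expand=True).astype(int)
duration = pd.to_timedelta(parts[0], unit='h') + pd.to_timedelta(parts[1], unit='m')

# Wide to long: one row per (spacewalk, astronaut) pair.
tidy = (
    names
    .assign(Duration=duration)
    .melt(id_vars='Duration', value_name='Astronaut')
    .dropna(subset=['Astronaut'])
)

ranking = (
    tidy
    .groupby('Astronaut')[['Duration']]   # double brackets keep a DataFrame
    .sum()
    .sort_values('Duration', ascending=False)
)
```

Each participant is credited with the full duration of a shared spacewalk, because `melt()` repeats the `Duration` value for every name column.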