Commit 02893925 authored by Michael Wimmer

add the info on structuring python code

parent d8104a89
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Looking back at what we learned\n",
"\n",
"1. Writing simple programs by combining building blocks\n",
"2. Working with files and scripts using shell\n",
"3. Working with advanced libraries\n",
"4. Searching for answers to your coding problems online\n",
"5. Keeping track of a project using git"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"By the way: don't worry if you do not feel comfortable with one of these topics: most of these cannot be learned in a day!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Structuring (python) code\n",
"\n",
"We have learned the basics of python by now, and have seen how to use various python concepts and powerful libraries from ipython notebooks. This is a perfectly viable way to work, as long as projects are not too complicated. \n",
"\n",
"Real-life projects however usually turn out to be more complex, with code accumulating over time. In this case structuring your code is vital."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![](http://imgs.xkcd.com/comics/goto.png)\n",
"\n",
"(Image © Randall Munroe, XKCD, https://xkcd.com/292/, CC-BY-NC v2.5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Task: what distinguishes good software project from a bad one?\n",
"\n",
"1. ???\n",
"2. ???"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Bonus question: how to learn about structuring your code?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Python modules\n",
"\n",
"### Collecting commonly used code\n",
"\n",
"Typically, in your project you will develop some core functionality that you will use over and over again - for example some function\n",
"\n",
"```python\n",
"def do_some_fancy_stuff(arguments):\n",
" ...\n",
"```\n",
"\n",
"You then use this function for many purposes. Over time, you will have several notebooks using it, and then you would need to copy the function and all related code to each notebook!\n",
"\n",
"This is not practical, and error prone.\n",
"\n",
"In this case you should put the function and all related code into a python `module`. This means you put the code in some text file with a name ending in ``.py``, let's say\n",
"``module.py``. Then you can use this function in any notebook you want by importing the module:\n",
"\n",
"```python\n",
"import module\n",
"\n",
"module.do_some_fancy_stuff(...)\n",
"```\n",
" \n",
"Alternatively, you can also write\n",
"\n",
"```python\n",
"from module import do_some_fancy_stuff\n",
"\n",
"do_some_fancy_stuff(...)\n",
"```\n",
"\n",
"but you already saw that syntax in the basic introduction in ``from math import ...``.\n",
"\n",
"### Separation in different namespaces\n",
"\n",
"Another advantage of modules is that it helps you to avoid errors that may arise if too much code gets intermixed. Say you have code like this:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"c = 1\n",
"\n",
"def f(x):\n",
" return c * x"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The function ``f(x)`` depends on a global variable ``c``. Now say you write a lot of code inbetween, and you define the variable ``c`` for some other purpose overwriting it by accident (you forgot it was even used before). Then you will change the behavior of ``f(x)``! Try that below"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"f(1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"c = 2\n",
"f(1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you separate the code into a module ``function.py`` which reads\n",
"\n",
"```python\n",
"c = 1\n",
"\n",
"def f(x):\n",
" return c * x\n",
"```\n",
" \n",
"and you use it as\n",
"\n",
"```python\n",
"import function\n",
"\n",
"function.f(1)\n",
"\n",
"c = 2\n",
"\n",
"function.f(1)\n",
"```\n",
"\n",
"then no problem arises. ``f`` uses the variable ``c`` from the namespace of the module, whereas you used the variable ``c`` from the namespace of your notebook. You could stil change ``c`` in the module, but then you need to write ``module.c = 2`` - which makes it immediately clear what you do.\n",
"\n",
"Namespaces are a powerful concept in python. They also apply for example to functions - if you write"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"c = 1\n",
"\n",
"def f(x):\n",
" c = 2\n",
" return c * x"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"c = 1\n",
"print(f(1))\n",
"c = 3\n",
"print(f(1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"then the variable ``c`` within the definition of ``f(x)`` is used. Note however that in this case you cannot access the variable c as ``f.c`` (try what happens if you do that). \n",
"\n",
"This was a simple example of nested namespaces. If you want to know more, then google it!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Docstrings\n",
"\n",
"If your code becomes more and more complex, you need to add documentation. A convenient way in python is to add documentation directly to the function by wrting a string directly after the function definition:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def f(x, y):\n",
" \"\"\"This is a function that does bla bla ...\n",
" \n",
" Parameters\n",
" ----------\n",
" x : value for bla bla\n",
" y : value for bla bla\n",
" \n",
" Returns\n",
" -------\n",
" z : some value that depends on x and y\n",
" \"\"\"\n",
" pass"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can then access the documentation from the ipython notebook simply by writing\n",
"\n",
"- ``f`` and hitting `SHIFT + TAB` to show the first part of the docstring\n",
"- ``f?`` or ``help(f)`` to get the full docstring\n",
"\n",
"You can also add a docstring to the top of a module file."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Testing\n",
"\n",
"If the code is complex, changes of this code become hard to check as well. Imagine a function you wrote being used in 50 different places.\n",
"\n",
"\n",
"It is very easy and common for everybody, even the most experienced programmer, to introduce bugs in your code while working on it. Often those bugs may not affect what you are doing now, but break some stuff you did before! On many occasions you can catch these problems by writing tests alongside your code.\n",
"\n",
"(In principle, there are several frameworks for keeping track of tests in python - we are using ``pytest``, which is the a common and user-friendly option)\n",
"\n",
"What you have to do is simple: When you wrote some code in a module, add another python file that starts with ``test_``, and add a function to it that starts with ``test_`` (that's easy to remember, right;). For example, in the module ``module.py`` you might have:\n",
"\n",
"```python\n",
"def add_together(x, y):\n",
" return x + y\n",
"```\n",
"\n",
"In ``test_module.py`` you would write:\n",
"\n",
"```python\n",
"import module\n",
"\n",
"def test_add_together():\n",
" assert add_together(1, 2) == 3\n",
" assert add_together(\"abc\", \"def\") == \"abcdef\"\n",
"```\n",
" \n",
"We introduced a new statement, ``assert``. Let's check here in the notebook, what it does:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%writefile module.py"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"assert 1 == 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Nothing happens in this case. But now let's see what happens if we assert a statement that is not true:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"assert 1 < 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this case, an ``AssertionError`` is raised.\n",
"\n",
"The key is to introduce test functions that raise an ``AssertionError`` if something goes wrong.\n",
"\n",
"So now you have a python file with lots of tests. You can run them all automatically by calling from the command line \n",
"\n",
"```\n",
"py.test\n",
"```\n",
" \n",
"within the folder containing the modules and tests. It will run all the tests you ever wrote, and show you all failures!\n",
"\n",
"To make these tests useful, you would want to run them as often as possible. When you design them, try to use them on as small problems as possible, so that they run *fast*. In this way, you will often run them, and catch many errors."
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"# Object-oriented programming and classes\n",
"\n",
"**Don't use this unless you are completely sure you cannot avoid using it.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Everything in Python is an *object* of some *class*:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"type(1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def f(x):\n",
" return x\n",
"\n",
"type(f)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"type(print)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"type(None)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"type(np.array([1]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Classes** are a way of defining your own types of objects with custom properties and behavior.\n",
"\n",
"Minimal example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class Fruit: # The way to define a class\n",
" pass # Empty statement to avoid a syntax error (try removing this)\n",
"\n",
"green_apple_from_Jumbo = Fruit() # Looks like we are calling a function\n",
"type(green_apple_from_Jumbo)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Objects have *attributes*, these are variables associated with the object."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"green_apple_from_Jumbo.tasty = \"maybe yes\" # assign an attribute\n",
"\n",
"print(green_apple_from_Jumbo.tasty) # access the attribute"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Objects may also have *methods*. These are functions that can modify the state of the object."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"class Fruit:\n",
" def __init__(self, taste): # The function used when the object is created.\n",
" self.taste = taste\n",
" \n",
" def eat(self):\n",
" if self.taste == \"tasty\":\n",
" print('Wow, tasty!')\n",
" else:\n",
" print('Not tasty :-(')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"green_apple_from_Jumbo = Fruit() # What will happen now?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"red_apple_from_Jumbo = Fruit(taste='tasty')\n",
"red_apple_from_Jumbo.eat() # Note that \"self\" is not passed as an argument!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One final piece to know about objects is *inheritance*: making a new class from an old one."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class Apple(Fruit):\n",
" def __init__(self, taste, color):\n",
" self.taste = taste\n",
" self.color = color\n",
"\n",
"green_apple = Apple('tasty', 'green')\n",
"green_apple.eat() # Because green_apple is also a Fruit!"
]
},
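{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ``Apple`` class above repeats the assignment of ``taste`` that ``Fruit`` already performs. A common alternative, sketched here with the same toy classes, is to call the parent initializer through ``super()``. Note one deliberate deviation, made purely to keep the sketch easy to check: ``eat`` returns its message instead of printing it.\n",
"\n",
"```python\n",
"class Fruit:\n",
"    def __init__(self, taste):\n",
"        self.taste = taste\n",
"\n",
"    def eat(self):\n",
"        # Returns the message instead of printing it (differs from the cell above)\n",
"        if self.taste == 'tasty':\n",
"            return 'Wow, tasty!'\n",
"        return 'Not tasty :-('\n",
"\n",
"\n",
"class Apple(Fruit):\n",
"    def __init__(self, taste, color):\n",
"        super().__init__(taste)  # let Fruit set up its own attributes\n",
"        self.color = color\n",
"\n",
"\n",
"green_apple = Apple('tasty', 'green')\n",
"print(green_apple.eat(), green_apple.color)\n",
"```"
]
},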
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"type(green_apple)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"green_apple is an Apple:\", isinstance(green_apple, Apple))\n",
"print(\"green_apple is an Fruit:\", isinstance(green_apple, Fruit))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
../day1_morning/helpers.py
\ No newline at end of file
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Code Refactoring\n",
"\n",
"> Code refactoring is the process of restructuring existing computer code—changing the factoring—**without changing its external behavior**. ([Wikipedia](https://en.wikipedia.org/wiki/Code_refactoring))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When tackling a problem you will often not have an idea, initially, of what you're really doing.\n",
"\n",
"This lack of understanding will be reflected in the code that you write to solve the problem.\n",
"\n",
"**Refactoring** a piece of code after an initial implementation (and after verifying, with tests, that it works) has 2 benefits:\n",
"\n",
"+ Looking critically at the code can help you understand the problem better\n",
"+ When you have a better understanding of the problem, your code can better reflect the structure of the problem. This will usually make it easier to read."
]
},
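{
"cell_type": "markdown",
"metadata": {},
"source": [
"A toy illustration of the idea, not taken from the course exercises: the two functions below (hypothetical names) classify a number in the same way, but the second version states the structure of the problem more directly. The loop at the end plays the role of the tests that confirm the external behavior did not change.\n",
"\n",
"```python\n",
"def sign_before(x):\n",
"    # First attempt: nested branches written while exploring the problem\n",
"    if x > 0:\n",
"        result = 1\n",
"    else:\n",
"        if x < 0:\n",
"            result = -1\n",
"        else:\n",
"            result = 0\n",
"    return result\n",
"\n",
"\n",
"def sign_after(x):\n",
"    # Refactored: one flat sequence of cases, one return per case\n",
"    if x > 0:\n",
"        return 1\n",
"    if x < 0:\n",
"        return -1\n",
"    return 0\n",
"\n",
"\n",
"# What makes this a refactoring: same outputs for the same inputs\n",
"for value in [-5, -1, 0, 1, 7]:\n",
"    assert sign_before(value) == sign_after(value)\n",
"```"
]
},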
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a concrete example let's take the exercise on `if/else` from day 1."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### If, if, if, ....\n",
"\n",
"In this exercise you will be making geometric pictures using `if`-clauses. To this end you will program a python function `f(x, y)` that takes inputs `x` and `y` ranging from 0 to 10, and that returns a number from 0 to 5. This function can then be plotted in a color plot.\n",
"\n",
"The color plot we have prepared for you in a helper function (you will learn in day 2 about how to plot things in python):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from helpers import plot_function"
]
},
{
"cell_type": "markdown",