{ "metadata": { "name": "", "signature": "sha256:1f45a5dd53b812656e47525bfb478333af88adc5d6d0090b42076189c4ee0ecc" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Introduction to programming 3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Clyde Fare and Jo\u00e3o Pedro Malhado, Imperial College London (contact: [python@imperial.ac.uk](mailto:python@imperial.ac.uk))\n", "\n", "This notebook is licensed under a [Creative Commons Attribution 4.0 (CC-by) license](http://creativecommons.org/licenses/by/4.0/)." ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Overview" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the last workshop of this series we finished covering the most important building blocks of the Python programming language, which are indeed common to many other programming languages (with only syntactic variations). We are thus now equipped to tackle all sorts of problems, and the major challenge in programming is to break down a given problem into simple steps using this language we have learned. The best way to achieve proficiency at doing this is by actual practice.\n", "\n", "In the first part of this workshop we will be learning how to crate new functions, which is a way to organize our code as we aim to tackle more complex problems.\n", "\n", "In the second part of the workshop we shall not be introducing new language elements, and instead focus on a few examples and problem some common themes that appear in attempting to solve them." ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Packaging more complex instructions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As mentioned above, we are now equiped to tackle most problems in programming and build complex program from the relatively simple pieces of language we covered. However, it is convenient, and in practical terms even essential, to provide structure to our programs by grouping instructions together in order to achieve a more complex task.\n", "To perform more complex operations on data we will be using two mechanisms made available by the Python language: *functions* and *methods*.\n", "\n", "So far we have already come across some simple functions, such as type(), len(), float(), print() or range(). To make use of a function, the general syntax is:\n", "\n", " function_name(function_arguments)\n", " \n", "and we will obtain the result of the function operation on the argument. In the example\n", " \n", " len(\"count me in\")" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "len is the function name, which as we have seen counts the number of characters in a string; \"count me in\" is the string the function is operating on; the result of the operation is on the Out[] cell. Naturaly we can store the result of the function on a variable\n", "\n", " letter_count=len(\"count me in\")" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ " letter_count" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Functions can accept more than one argument, in which case they are separated by commas.\n", "\n", " days=365\n", " print(\"A year has\",days,\"days\")" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The print functions is being called with 3 arguments, two strings and one variable which is a number. Note once more that print does not give an output, it is just printing to the screen all the arguments separated by spaces. It is thus not very useful to assign the result of print to a variable\n", "\n", " unresult=print(\"pointless\")" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ " unresult" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are a number of inbuilt functions in python to perform different tasks, these are listed in the [official documentation](http://docs.python.org/3/library/functions.html), we will make use of a few relevant ones but shall otherwise not insist on them. Much more useful will be to build our own custom made functions. We will see how to do this presently, but first we will look at another mechanism we haven't encoutered yet: *methods*.\n", "\n", "Methods are similar to functions but they are more tightly related to the objects they act on. For example, a pertinent operation when operating on strings would be to give the same string all in capital letters. This can be done via a specific string method.\n", "First let us define the string variable\n", "\n", " a=\"i shall be big\"" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To check what specific methods are available to strings, we can write the variable name followed by a dot '.' and press the tab key on the keyboad\n", "\n", " a." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the drop-down list all methods that can be applied to variable *a* are displayed. The method *.upper()* is the one that will perform our intended task\n", "\n", " a.upper()" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The general syntax is thus\n", "\n", " variable_name.method_name(optional_arguments)\n", "\n", "During this course we shall not be defining our own methods, but we will be using some useful pre-existing ones." ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Defining new functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Any moderatly useful program can become big, encompassing hundreds or thousands of lines of code. In order to work with such codes in a manageble way, it is important to group code functionality in logical pieces that perform a specific task, and that can be reused whenever needed.\n", "Creating our own functions is a way of packaging such functionality.\n", "\n", "As with variables above, there is an analogy between functions in programming and in mathematics.\n", "A mathematical function is an operation that receives a number as an argument and returns another number based on the value of the argument. When we write the mathematical expression\n", "\n", "$$f(x)=x^2-1 ,$$\n", "\n", "we are defining an operation named $f$, which for a given value of the argument $x$ will give back a result $x^2-1$. Once more, $x$ takes one particular value at the time, it can be any value, but only one each time.\n", "\n", "While in mathematics, functions mostly receive numbers and return numbers, in programming the concept of a function is largely extended. As we've seen above for example, functions can return strings, or receive strings as arguments and return numbers, or package any other type of instruction. One simple idea remains: functions are operations that receive arguments and return a result based on the arguments it has received.\n", "\n", "Let us look at a practical example and how we would convert it into a function. Let us consider the case where we have two strings, str1 and str2, and we want to form one string from these two joined by a dash '-'.\n", "\n", " str1=\"post\"\n", " str2=\"meridiem\"\n", " \n", "How would you accomplish the task in this case?" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This task was relativelly simple, but if we want to repeat it for 300 other pairs of strings, even for such a simple task, we could benefit from encapsulating this operation into a function. We will call our function *merge*\n", "\n", " def merge(a,b):\n", " \"Merge two strings with a dash.\"\n", " c= a + '-' + b\n", " return c" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will analyse how the function definition works, but first you should use the function merge we just defined to accomplish the same result as above." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Step 1: function signature" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The function *signature* corresponds to the first line in the function definition and specifies some important information\n", "\n", " def merge(a,b):\n", "\n", "It starts with the keyword *def*, which tells Python that we are *defining* a function. Then comes a space, and the name of our function, in this case *merge*. After, an open parenthesis, the comma-separated function arguments. In this case the function has two arguments we are calling *a* and *b*. After the arguments, a close parenthesis, and a colon.\n", "\n", "The number of arguments depends highly on what the function does. We can even have a function with no arguments\n", "\n", " def function_name():\n", "\n", "or one argument\n", "\n", " def function_name(x1):\n", "\n", "or any other number of arguments.\n", "\n", "An important point to note is that the argument labels *a* and *b* we chose for our function are not variables we may have defined outside the function. They will be \"names\" used solely inside the function, just as the $x$ in $f(x)=x^2$ is just a label that could be called $w$ as in $f(w)=w^2$." ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Step 1.5: document string" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After our signature line we can add a piece of text enclosed in quotes explaining to the human reader what the function is doing. This is calles a *docstring*.\n", "\n", " def merge(a,b):\n", " \"Merge two strings with a dash.\"\n", " \n", "This is a matter of best practice. The docstring has no effect on the operation of the function, but when functions get very complicated and many lines long, or we start having many functions to work with, it is very useful to have a short explanation text.\n", "\n", "The docstring is shown when the question mark is used to gather information about the function\n", "\n", " merge?" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Step 2: do useful work inside the function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Underneath the function signature and document string we do the useful work. As with if statements and loops before, **everything inside the function is indented**, so Python knows that it is a part of the function.\n", "\n", " def merge(a,b):\n", " \"Merge two strings with a dash.\"\n", " c= a + '-' + b\n", " \n", "In the function body we can use the function arguments *a* and *b* just as we have used variables. In this case we are storing the result of the operation *a + '-' + b* into the local variable *c*. It is important to note that both the function arguments *a* and *b*, as well as variables defined inside the function, are not afected by the values of the variables of the same name defined outside the function\n", "\n", " a=1\n", " b=2\n", " c=3\n", " merge('no','entry')" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can check what value the variable c at the moment\n", "\n", " c" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see that even though there is a variable *c* inside the function *merge*, which when we call the function will assume a given value, that variable is local to the function and is isolated from the global variable with the same name *c*, which conserves the same value as before." ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Step 3: return something" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Although it is possible to create functions which produce no output (the function *print* is an example of this), most functions yield an output of one form or the other, such that it can be assigned to a variable.\n", "\n", "The output of a function is controled by the *return* keyword\n", "\n", " def merge(a,b):\n", " \"Merge two strings with a dash.\"\n", " c= a + '-' + b\n", " return c\n", "\n", "In the case of the function *merge*, the output will be the value of the local variable *c* defined in the body of the function.\n", "\n", "We could have defined the function *merge* without making use of the local variable *c* and giving exactly the same output\n", "\n", " def merge(a,b):\n", " \"Merge two strings with a dash.\"\n", " return a + '-' + b\n", " \n", "\n", "Once the *return* statement of the function is executed, we are done with the function -- we don't get to do any more work. That means if we have a function like this:\n", "\n", " def merge(a,b):\n", " \"Merge two strings with a dash.\"\n", " c= a + '-' + b\n", " return c\n", " print(\"This is meant to be really clever!\")\n", " \n", "The print statement will never be executed. Try it below." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we covered how to define functions, define the function $f(x)=x^2-1$ we considered above. Do not forget to give an appropriate docstring." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use the function to produce a list with the results of the function $f$ for integer values of the argument between 0 and 10." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Summary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Even simple programs can quickly grow to hundreds or thousands of code lines. Structuring programs using functions and methods is essential to make programs scallable and manageable.\n", "\n", "Functions in programming are similar to functions in mathematics, except that they are not limited to dealing with numbers. Functions are operations that receive arguments and return a result based on the arguments it has received." ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Exercises" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In these exercises you will write functions that perform a particular task. For the exercises we are using this docstring to specify how we want the function to behave. It shows examples of the function being used along with what output we expect to be produced.\n", "\n", "The exercises consist in completing the body of the function (bellow the docstring) such as to achieve the desired functionality. You should also test the functions on seperate cells to check they are working as intended." ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Make HTML" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Web pages are written in hypertext markup language ([HTML](http://www.w3schools.com/tags/default.asp)), which consists in enclosing the text of the documents between tags that specify its function in the document. For example, \"<p>A paragraph of text</p>\" causes the string \"A paragraph of text\" to be identified as a paragraph in the document. In this example, the \"p\" tag surrounds the string \"A paragraph of text\" starting with <p> and ending with </p>. Given a string *tag* and a string *word*, return the word with HTML tags around it as \"<tag>word</tag>\". " ] }, { "cell_type": "code", "collapsed": false, "input": [ "def make_HTML(tag, word):\n", " r'''\n", " >>> make_HTML('p', 'A paragraph')\n", " '
A paragraph
'\n", " >>> make_HTML('h1', 'Title of the document')\n", " 'variable'\n",
" '''"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Stretch string"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Given a string and an integer *stretch_number*, return a string in which each character of the original string is repeated stretch_number times."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def stretch_string(string, stretch_number):\n",
" r'''\n",
" >>> stretch_string('banana', 2)\n",
" 'bbaannaannaa'\n",
" >>> stretch_string('apple', 3)\n",
" 'aaappppppllleee'\n",
" >>> stretch_string('fig', 4)\n",
" 'ffffiiiigggg'\n",
" '''"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Remove vowels"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Given a lower-case string, return that string with all of the vowels removed."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def remove_vowels(string):\n",
" r'''\n",
" >>> remove_vowels('bookkeeper')\n",
" 'bkkpr'\n",
" >>> remove_vowels('almanac')\n",
" 'lmnc'\n",
" >>> remove_vowels('syzygy')\n",
" 'syzygy'\n",
" '''"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Mean"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The NumPy package has the function mean() to calculate the mean value given a list (or array), but we shall do our own."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def our_mean(data):\n",
" r'''\n",
" >>> our_mean([1,2,3])\n",
" 2.0\n",
" >>> our_mean([1,10,100])\n",
" 37.0\n",
" >>> our_mean([0.12,-5,1/137,-3.14159265])\n",
" -2.0035733449817519\n",
" '''"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Standard deviation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use your function our_mean() you produced above, to create a function that gives the sample standard deviation of a list. As a reminder, the standard deviation can be calculated as\n",
"\n",
"$$s_x=\\sqrt{\\frac{\\sum_{i=1}^N (x_i-\\bar{x})^2}{N-1}}$$\n",
"\n",
"where $\\bar{x}$ is the sample mean."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def our_std(data):\n",
" r'''\n",
" >>> our_std([1,2,3])\n",
" 1.0\n",
" >>> our_std([1,10,100])\n",
" 54.74486277268398\n",
" >>> our_std([0.12,-5,1/137,-3.14159265])\n",
" 2.5051169675281404\n",
" '''"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Spelling bee"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Given the correct spelling of a word, and the contestant's guessed spelling, return the string \"correct\" if the guess is spelled exactly correctly, \"almost\" if there are at most two mistakes in the spelling, and \"wrong\" if three or more letters are different. You can assume that all guesses are the correct length."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def spelling_bee(correct, guess):\n",
" r'''\n",
" >>> spelling_bee('hello', 'hello')\n",
" 'correct'\n",
" >>> spelling_bee('python', 'pithon')\n",
" 'almost'\n",
" >>> spelling_bee('echolocation', 'eckolocashun')\n",
" 'wrong'\n",
" '''"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Nested loops"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In what follows we will not be introducing new functionality, but will be applying what we've learned to more complex exemples.\n",
"\n",
"Data will often take the form of lists of lists (which can be seen as 2D or higher dimension lists), and we might be interested in doing operations on, or using every element of these lists. Such operations often require running more than one loop simultaneously. Let us look at the example case of the following table\n",
"\n",
" table=[[1,1.2,'a'],\n",
" [2,2.3,'b'],\n",
" [3,3.4,'c']]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, we will print *x*2* for every *x* element of *table* (note that this operation exist for both numbers and strings but with different effects). To do this we need 2 loops:\n",
"\n",
"* The first will loop through the elements of *table*: i.e. each of the inner lists.\n",
"* The second will loop through the elements of the inner lists.\n",
"\n",
"It will look like the following:\n",
"\n",
" for row in table:\n",
" for x in row:\n",
" print(2*x)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are using loop variable names that are indicative of what these variables are in the algorithm. The first for loop is \"picking out\" the rows of the table. For each value of *row*, the second loop is \"picking out\" individual elements of *row*.\n",
"\n",
"It is perhaps helpful to introduce some print functions to better understand what the code is doing. It is important to note indentation and which print statement belongs to which loop body to understand when it is being executed\n",
"\n",
" for row in table:\n",
" print('row=',row)\n",
" for x in row:\n",
" print('x=',x)\n",
" print(2*x)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If it is not clear how the nested loops are working, please ask a demonstrator.\n",
"\n",
"We could obtain exactly the same results by using indexes to extract the elements of *table*.\n",
"\n",
" for i in range(3):\n",
" for j in range(3):\n",
" print(2*table[i][j])"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Adding some print statements to make things clearer\n",
"\n",
" for i in range(3):\n",
" print(\"i=\",i)\n",
" for j in range(3):\n",
" print(\"j=\",j)\n",
" print(\"table[i][j]=\",table[i][j])\n",
" print(2*table[i][j])"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the variables we are looping over are integers running from 0 to 2. In this case we know the shape of the array and can explicitly set the loop bounds with range(3). In cases when we don't know, or don't want to know, the size of the table we are dealing with, we can write a more generic form\n",
"\n",
" for i in range(len(table)):\n",
" for j in range(len(table[i])):\n",
" print(2*table[i][j])"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have just shown that we can loop over list elements or list indexes to obtain the same result. The first way is often simpler and more natural in Python, while the second way is more common in most other programming languages.\n",
"\n",
"There are some cases when using indexes can be handy. For example, until now we have been printing the result of the operations following each row \"horizontally\", and moving to the next row \"below\" it. Write some code using indexes that will perform the same operation but following down the columns of *table*."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So far we have been performing operations on every element of the list and printing the result to the screen. This is not a particularly useful operation, and we often want to make a transformation to the full table and keep it. We can do this by building a new table *table2*\n",
"\n",
" table2=[]\n",
" for row in table:\n",
" new_row=[]\n",
" for x in row:\n",
" new_row=new_row+[2*x]\n",
" table2=table2+[new_row]\n",
" \n",
" table2"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that we have defined 2 result variables, *table2* and *new_row*, each updated on a different loop. *table2* is initialized to an empty list and is updated at the end of the body of the first loop. *new_row* is initialized to an empty list **in each cycle** of the first loop, build up inside the second loop, and appended to *table2* after the second loop has finished.\n",
"\n",
"This is a comon construction when building new tables, it is important that you understand how it works. If it is not completelly clear, try adding some print statements to the code above."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Although very common in 2D lists, nested loops also occur in other contexts.\n",
"As an example, we will now consider how to generate a list of all the prime numbers below 100. We will approach the problem in the simplest, albeit not very efficient, way: we will loop through every number between 1 and 100 and test if each of them is a prime number. To test if each number is a prime we need a second loop where we test if the number is devisable by any of the numbers smaller than itself.\n",
"\n",
"We will provide the outline of the code and let you fill the gaps to make it functional."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"primes=[]\n",
"for i in range(1,100):\n",
" is_prime=True\n",
" for j in range():\n",
" if % ==0:\n",
" is_prime=\n",
" if :\n",
" primes=\n",
" \n",
"primes"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Although common, this type of nested loops can get relatively complicated to read and understand. It is often possible to avoid them by defining a function that will do the work being done in the inner loop seperately.\n",
"\n",
"In the case of the list of prime numbers, we could make our lives easier if we define a function *isprime()* that will return a boolean indicating if the argument is a prime number. Define such a function below.\n",
"\n",
"(If you are feeling confident, you can define your function using a while loop instead of a for loop. Note that if a number is devisible, for example by 2, we already know the number is not a prime, so no need to test division by other numbers. This will make the program more efficient as it will do fewer loop steps.)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use the function isprime() to retrieve the list of the prime numbers below 100"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"Sorting"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sorting the elements of a list is another operation that requires the use of nested loops. This is an important operation in many contexts and although simple to state, there are [many different algorithms](https://en.wikipedia.org/wiki/Sorting_algorithm) to achieve this task (performance can be a concern for big or frequent operations, think about sorting Twitter message by time).\n",
"\n",
"Although Python has the built-in function *sorted()* and the list method *.sort()* to do this operation, we will be looking at a variation of the simple Insert Sort algorithm in order to practice more general implementations.\n",
"\n",
"We will build a new ordered list by taking elements of the original list and inserting them into the new ordered list in such a way that our new list is always in the right order. Let us try to picture how it would work:\n",
"\n",
"*Step 0:* initial state\n",
"\n",
" original_list=[3,1,2]\n",
" ordered_list=[]\n",
" \n",
"*Step 1:* we pick the first element of the orginial_list and place it in the ordered_list\n",
"\n",
" |\n",
" original_list=[3,1,2]\n",
" ordered_list=[3]\n",
"\n",
"*Step 2:* select the second element of original_list and place it in position 0 of the ordered_list\n",
"\n",
" |\n",
" original_list=[3,1,2]\n",
" ordered_list=[1,3]\n",
"\n",
"*Step 3:* select the third element of the original_list and place it in position 1 of the ordered_list\n",
"\n",
" |\n",
" original_list[3,1,2]\n",
" ordered_list[1,2,3]\n",
"\n",
"Remember from the first workshop how to insert an element in a list in a specific position? (You can also use the *.insert()* list method to do this operation.)\n",
"\n",
" ordered_list=[1,2,4,5,6]\n",
" ordered_list[2:2]=[3]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Perhaps it is already apparent where the nested loop structure is comming from in the algorithm above:\n",
"\n",
"* First loop to select the elements of the original list in turn.\n",
"* Second loop to check in which position to insert the new element in the ordered list.\n",
"\n",
"Instead of trying to implement the algorithm all in one piece of code, we will split out the task of inserting elements in the correct position of an ordered list. Define the function *insert_in_order(num,ordered_list)* which will receive as arguments a number and an ordered list, and will return an ordered list where the number passed will be in the correct position. \n",
"\n",
"E.g. executing insert_in_order with arguments 3, [1,2,4,5] will return [1,2,3,4,5]"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Test is your insert_in_order() function\n",
"\n",
" olist=list(range(0,20,3))\n",
" insert_in_order(8,olist) "
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now define a function *insert_sort(unordered_list)* that receives an unordered list as argument and returns an ordered list with the same elements, that uses the *insert_in_order()* function defined above."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Does it work?\n",
"\n",
" insert_sort([91, 80, 34, 97, 9, 93, 50, 49, 6, 46])"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Distilling alcohols"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The list *organic_mix* defined below groups a bunch of strings with chemical fomulae for simple organic molecules"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"organic_mix=['CH2CH2','CH3CHCH2','(NH2)2CO','CH2ClCH2Cl','CH2CHCl','C6H6','C6H5CH2CH3','CH3O(CH3)3',\n",
" 'C6H5CHCH2','CH3OH','CH2O','C6H4(CH3)2','CH3CH2OH','C6H5OH','C6H3(OH)2Br','CH2OHCH2OH',\n",
" 'CH3CHCH3CHOHCH3','C6H3(OH)2CH2CHNH2CO2H','CH3CHOHCH2CH2OH','C6H8(OH)2','CHCl3,CH3CO2H',\n",
" 'CH3OCH3','CH3COCH3','CH3CH2CH2CH2OH']"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first goal is to create a function *distill()*. This function will take as an argument a list of chemical formulae. It will return a list containing the alcohols present in the original list.\n",
"\n",
"(Note that in the chemical formulae above, carboxylic acid groups are written as 'CO2H'. If you want a more challenging problem you can change the chemical formulae of acids in the list to read like 'COOH')\n",
"\n",
"Before writing your function, think about:\n",
"\n",
"* What type of result should your function return?\n",
"* How do you \"build\" such result?\n",
"* Do you need a loop? What should the loop variable be?\n",
"* What kind of tests (*if* statements) do you need to do while building your result?"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will go further in defining a function called *frac_distill()*, that will return a list of two lists. The first of such lists will contain the formulae of simple alcohols, while the second list will contain the diols.\n",
"\n",
"One way to approach this problem is:\n",
"\n",
"* Define an auxiliar function *isdiol()* which will identify if a given formula is a diol.\n",
"* Use the function *distill()* to generate an intermediary list with all alcohols.\n",
"* Use the function *isdiol()* to create two lists for simple alcohols and diols.\n",
"\n",
"(In identifying diols we also want to look for the cases written as '(OH)2'. In terms of the implementation of isdiol() you will probably want to use indexes and check 2 characters in the string at a time.)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Summary"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the second part of this workshop we have looked at slightly more complex structures in programming, in particular nested loops. We have seen that nested loops often occur in operations with lists and can often be \"decomposed\" by separating the inner loops into an auxiliar function."
]
}
],
"metadata": {}
}
]
}