# Making Python run faster: a case study

December 25, 2016As a langauge, Python was originally designed for faster development cycles, and not for execution speed. This principle remains very much alive today, and although it has garnered the language a bad reputation for its sluggishness, there are a few very common antipatterns that you can avoid to make your Python run faster. In this post, I will demonstrate some of these ideas by exploring solutions to a very simple algorithm: calculating the average (or, to be more precise, the arithmetic mean) out of a list of integers.

For our experiment, we’ll time our functions on three lists of different lenghts, as the size of the list will affect our solution’s runtime. We can start by generating some test data:

With our three test lists, we can start writing up solutions. Notice that all our functions will start with an `if lst`

check, since we’ll only return a value if a list isn’t empty.

### The naive solution

Let’s start simple. We iterate through the indices of the list, and we keep track of two values as we go: the running sum, and the running count. When we get to the end, we return the sum over the count. That is the *very* definition of how to calculate an average.

This gives us the right solution, and the following runtimes:

List Length | 10 | 100 | 1000 |
---|---|---|---|

Runtime (µs) | 1.483 | 9.789 | 129.695 |

We can surely do better, though.

### Still naive, but more pythonic

If you work with Python long enough, or if you peruse questions on it on Stack Overflow, you’ll see the word *pythonic* pop up again, and again. Generally, it refers to concise and readable patterns that the community tends to follow. Most times, their popularity is based on actual efficiencies, and well deserved, even if their users don’t even know that’s the case. The word can take the place of “clean,” or “idiomatic,” but ultimately has no real meaning.

What makes the answer above not pythonic, you ask? The iteration pattern. If you come from a language like C/C++, or even Java, you’re probably used to processing iterables via indexing, leading to code like:

While this case looks simple, in more complex algorithms it can be hard to keep track of the i’s and j’s to grab objects out of our list by an index. In Python, most cases of `range(len(lst))`

are a mistake that we can avoid via the iterator protocol, or perhaps with `enumerate`

if indexed access is unavoidable. This helps us not only make our code more readable, but also more efficient:

As you can see below, we get a nice performance bump, especially as our input list grows:

List Length | 10 | 100 | 1000 |
---|---|---|---|

Runtime (µs) | 0.956 | 7.647 | 97.220 |

### The preferred way, falling into C

One of the virtues of Python, is that under the hood it is written in very efficient C code. While rolling our own functions tends to be fun, in most cases there is some built-in method that we can call to do the work for us. Knowing that such functions exist, and thinking of them when solving problems is half the battle when writing Python. In this case, its the `sum`

and `len`

functions that we want to invoke:

Since Python’s `len`

implementation on lists takes constant time, the time complexity is dominated by the `O(N)`

operation required to sum, iterating over the numbers only once. (Thanks to Leon Sasson for reminding me that `len`

is `O(1)`

)

List Length | 10 | 100 | 1000 |
---|---|---|---|

Runtime (µs) | 0.409 | 1.041 | 7.348 |

This change gives us more than a performance bump. The code is an order of magnitude faster relative to the previous solution and it gives us a considerably simpler answer.

### Using other libraries

It is important to note that there are other libraries that can provide you with a way to compute the mean of a series. In Python 3 there is the `statistics.mean`

function, which uses the Kahan Summation algorithm to reduce numerical errors at the expense of 50 to 100 times more computation time than our preferred solution, depending on list size.

Additionally you could also use `numpy.mean`

which has its own kind of overhead due to its array based operations, but can be very efficient when computing very large datasets thanks to other optimizations.

Neither of these are really comparable to the other `mean`

functions we’ve written, as they serve different, and very specific, purposes. However, since `numpy.mean`

and `statistics.mean`

*do* compute the mean, innocent programmers might use the less performant but specialized variants unknowingly for no reason.

### So what’s the best choice?

Obviously, the measured runtime will vary depending on the machine the tests run on. What we actually care about are the ratios. In the table below I have normalized the values in each column to the most efficient version for a given input list, and also included the `numpy`

and `statistics`

functions for reference:

Solution | Length 10 | Length 100 | Length 1000 |
---|---|---|---|

`mean1` | 3.6 | 9.4 | 17.7 |

`mean2` | 2.3 | 7.3 | 13.2 |

`mean3` | 1.0 | 1.0 | 1.0 |

`numpy.mean` | 25.9 | 15.8 | 9.4 |

`statistics.mean` | 45.9 | 65.5 | 79.2 |

While all the solutions proposed run in linear `O(N)`

time, there are clear winners and losers. If your algorithms don’t require the extra precision, nor the array functionality, you should go with the `mean3`

version described above.

I would encourage you to play with the code, and generate your own results, by tweaking my script, which you can find here.

Can you think of other more efficient, more clever, ways to calculate the mean of a list with Python? Let me know!

**Update (1/5/17):** Leon Sasson correctly pointed out that `len`

is `O(1)`

on Python lists. This alters my explanation for `mean3`

’s performance, but not its expected time complexity.

*All tests were run on Python 3.5.2 on a mid-2014 MBPr with 16GB of RAM and a 2.5 GHz Intel Core i7 processor.**Image from “Reptiles and birds. A popular account of the various orders” by Figuier, Louis, 1819-1894 Gillmore, Parker. - Licensed under Public Domain, via Internet Archive Book Images*

Want to see more articles like this? Sign up below: