Gevent is an alternative approach to parallelisation, and it brings coroutines to pre-Python 3.5 code. Under the hood it relies on small, independent pseudo-threads called “greenlets”, but it also spawns some real threads for internal needs. The overall memory footprint is very similar to multithreading.
Since the release of Python 3.5, native coroutines with the async/await syntax are available through the asyncio module, which is part of the standard Python library. To take advantage of asyncio I used aiohttp instead of requests. aiohttp is an async equivalent of requests with the same functionality and a similar API.
In general, the need to swap libraries is a point to consider before starting a project in async, although most of the popular IO-related packages, such as psycopg2, have their equivalents in the async world.
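The asyncio approach described above can be sketched roughly as follows. This is a minimal illustration, not the script used for the measurements: `fake_fetch` is a hypothetical stand-in that simulates network latency with `asyncio.sleep`, where a real script would await an aiohttp request instead, and the URLs are placeholders.

```python
import asyncio
import time


async def fake_fetch(url: str, delay: float = 0.1) -> str:
    """Hypothetical stand-in for an HTTP request; it only simulates latency."""
    await asyncio.sleep(delay)  # yields control to the event loop while "waiting"
    return f"response from {url}"


async def fetch_all(urls: list[str]) -> list[str]:
    # Schedule all coroutines at once; gather returns results in input order.
    return await asyncio.gather(*(fake_fetch(url) for url in urls))


def run_demo() -> tuple[list[str], float]:
    urls = [f"https://example.com/{i}" for i in range(10)]  # placeholder URLs
    start = time.perf_counter()
    results = asyncio.run(fetch_all(urls))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_demo()
    print(f"{len(results)} responses in {elapsed:.2f}s")
```

All ten simulated requests wait concurrently on a single thread, so the total time should be close to one delay rather than ten.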
With asyncio, memory usage is significantly lower compared to the previous methods. It’s very close to a single-threaded version of the script without parallelisation.
So should we start using asyncio?
Parallelism is a very efficient way of speeding up an application that has a lot of IO operations. In my case, there was a ~40% speed increase compared to sequential processing. Once the code runs in parallel, the choice of parallel method makes very little difference to speed: an IO operation depends heavily on the performance of other systems (e.g. network latency, disk speed), so the execution time difference between the parallel methods is negligible.
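To make the effect of overlapping IO waits concrete, here is a small sketch that compares sequential execution with a ThreadPoolExecutor. The `slow_io` function is a hypothetical task in which `time.sleep` stands in for a network or disk wait, so the speedup it shows comes purely from the simulated waits overlapping and says nothing about the ~40% figure above, which depends on the real workload.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_io(task_id: int, delay: float = 0.05) -> int:
    """Hypothetical IO-bound task; time.sleep stands in for a real wait."""
    time.sleep(delay)
    return task_id


def run_sequential(n: int) -> float:
    start = time.perf_counter()
    for i in range(n):
        slow_io(i)
    return time.perf_counter() - start


def run_threaded(n: int, workers: int = 8) -> float:
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so that we wait for every result.
        list(pool.map(slow_io, range(n)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {run_sequential(8):.2f}s")
    print(f"threaded:   {run_threaded(8):.2f}s")
```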
ThreadPoolExecutor and Gevent are very powerful tools that can speed up an existing application. One major advantage is that in most cases they require only minor changes in the codebase. When it comes to overall performance, the best performing tool is asyncio with its coroutines. The memory footprint is much lower compared to the other parallel methods, without impacting the overall speed. It comes with a price though: the codebase and its dependencies have to be specifically designed for use with asyncio. This is something that has to be considered when moving a codebase to coroutines.
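The requirement that dependencies be designed for asyncio can be illustrated with a small sketch. Both task functions are hypothetical: `blocking_task` imitates a library that is not async-aware by calling `time.sleep`, while `async_task` imitates an async-aware one by awaiting `asyncio.sleep`. Under these assumptions, the blocking version should run its calls one after another even though they are scheduled together.

```python
import asyncio
import time


async def blocking_task(delay: float) -> None:
    # Blocking call: the event loop cannot run anything else meanwhile.
    time.sleep(delay)


async def async_task(delay: float) -> None:
    # Cooperative wait: other coroutines run in the meantime.
    await asyncio.sleep(delay)


async def timed(task, n: int, delay: float) -> float:
    start = time.perf_counter()
    await asyncio.gather(*(task(delay) for _ in range(n)))
    return time.perf_counter() - start


def compare(n: int = 5, delay: float = 0.05) -> tuple[float, float]:
    blocking = asyncio.run(timed(blocking_task, n, delay))
    cooperative = asyncio.run(timed(async_task, n, delay))
    return blocking, cooperative


if __name__ == "__main__":
    blocking, cooperative = compare()
    print(f"blocking calls:    {blocking:.2f}s")
    print(f"cooperative calls: {cooperative:.2f}s")
```

This is why a blocking HTTP or database client dropped into a coroutine gives up the benefit of the event loop, and why an async equivalent is needed.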
At Kiwi.com we use asyncio in high-performance APIs where we want to achieve speed with a low memory footprint on our infrastructure. An example of an “asyncio service” running at Kiwi.com is our public API for geographical location data. You can try the service yourself, and the documentation is available here.