Working with Python's 3rd Party Libraries the Right Way
Although Python's standard library provides a great set of functionality, sooner or later you will need to make use of third party libraries.
And this is nothing to be ashamed of! Can you imagine building a webserver from scratch? Or making a port of a database driver? Or, maybe, coming up with an image manipulation tool? I'm not saying that this is impossible; all I'm saying is that it would be a whole new project by itself.
Third party libraries are welcome because they keep you from reinventing the wheel. They free up time for you to focus on what really matters: finishing and delivering your application.
At the time of writing, the Python Package Index (PyPI) holds 62788 packages that address different purposes, and it is most likely that there is already something there that will do the heavy lifting for you.
There are different ways to install and make use of those packages and here I'm going to show you the way it works for me.
A quick dive into Python modules
The Python module system is one of the most awesome features of the Python programming language. Unlike in other popular languages, a Python module is a namespace and, at runtime, an object. To understand how to use third party packages, it is important to grasp the basics of how Python's modules work.
Say you are writing a silly script that outputs a random number between 0 and 10. After a quick Google search you find out that the standard library provides a module called random that will do exactly what you want. Then you would start by doing something like:
    # file: silly.py

    r = random.randint(0, 10)
    print(r)
Once you attempt to run this code, you'll get the following output:
    $ python silly.py
    Traceback (most recent call last):
      File "silly.py", line 3, in <module>
        r = random.randint(0, 10)
    NameError: name 'random' is not defined
What happened? As the uncaught error suggests, you are trying to call the function randint on a name, random, that is not bound in the current context.
Alright. Let's fix it:
    # file: silly.py
    import random

    r = random.randint(0, 10)
    print(r)
And now, if we run our script, everything works as expected:
    $ python silly.py
    4
It generated a 4, but if you run it several times you'll notice that each run outputs a random number between 0 and 10, just as we wanted in the first place.
That's cool, but what we are really interested in is how that happened. Where did import random get what was needed to make our code work?
To understand that, let's run another command line tool:
    $ whereis python
    python: /usr/bin/python3.4m /usr/bin/python /usr/bin/python3.4 /usr/bin/python2.7 /usr/bin/python2.7-config /etc/python /etc/python3.4 /etc/python2.7 /usr/lib/python3.4 /usr/lib/python2.7 /usr/bin/X11/python3.4m /usr/bin/X11/python /usr/bin/X11/python3.4 /usr/bin/X11/python2.7 /usr/bin/X11/python2.7-config /usr/local/lib/python3.4 /usr/local/lib/python2.7 /usr/include/python3.4m /usr/include/python2.7 /usr/share/python /usr/share/man/man1/python.1.gz
Note: Your output might differ because of the Python versions installed on your computer.
The whereis program locates the binary, source, and manual page files for a command, which in our case is python.
Now, if you snoop into one of your Python lib/ directories (in our case, /usr/local/lib/python2.7/, since we have been using the system's default Python installation), you will find that it contains all the standard library modules, including our already familiar random. That is where Python goes looking for standard library modules when we use an import statement in our code.
Note: In contrast with the standard library modules, and for the sake of organization, third party modules live in special directories called dist-packages/, located right under the lib/ directory we just talked about.
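The lookup described above can also be inspected from within Python itself. The following is a minimal sketch, assuming a Python 3 interpreter; the paths it prints depend entirely on your own installation, so none are shown here.

```python
import random
import sys
import types

# At runtime a module is an ordinary object...
print(isinstance(random, types.ModuleType))  # True

# ...and a namespace: its names are reached through attribute access.
print(callable(random.randint))  # True

# The file the module was loaded from, somewhere under a lib/ directory.
print(random.__file__)

# The directories that are searched, in order, when an import statement runs.
for directory in sys.path:
    print(directory)
```

Third party packages become importable for the same reason: their directory appears in sys.path.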
pip is a recursive acronym and stands for "pip installs packages". It has become the official Python package manager and it is used basically to install and uninstall packages from the Python Package Index.
If you are using a recent version of Python (2.7.9+ or 3.4+), pip already comes bundled with the Python installation.
To check if pip is installed on your machine, type the following command in a terminal:
    $ pip --version
    pip 6.1.1 from /usr/local/lib/python2.7/dist-packages (python 2.7)
If it is installed, your output should be similar to the one above. However, don't fear if it isn't: there's a one-liner to fix this problem:
$ curl -s https://bootstrap.pypa.io/get-pip.py | sudo python /dev/stdin
OK! Now that you have your package manager installed, let's use it to download and install the packages needed to create a website using the almighty Django framework:
$ sudo pip install django
It is as simple as that!
Once you've done that, you'll be able to follow Django's official tutorial.
However, although it works, it is a terrible idea to bloat your Python installation with third party libraries, for the following reasons:
- You may have more than one project that depends on different versions of the same library. It is impossible to have different versions of the same library under the same Python installation.
- You may want to verify the exact dependencies of a particular project in order to run it somewhere else. If you install everything under the same Python installation, it is impossible to track which project is using which dependency.
That being said, I highly recommend that you use virtualenv as the foundation of your Python projects, as you will see in the next section.
Now that you know the wrong way to use pip, let's uninstall the django package we installed before and learn how to do it the right way. In a terminal type:
$ sudo pip uninstall django
Done. As you can see, uninstalling packages is as easy as installing them with the aid of pip.
For more commands and further information on pip, please refer to the pip documentation.
Sandboxing projects with 'virtualenv'
By now, you should have understood why it is important (and sane) to separate the dependencies of your projects and that no Python package should be installed under the main Python installation.
Except that this is a lie. There are three packages that should be installed under the main Python installation. One is already there: pip. The other two are virtualenv and virtualenvwrapper. We'll take a look at virtualenvwrapper in the next section, but, right now, let's concern ourselves with virtualenv.
What virtualenv does is basically two things:
- It creates a new instance of your main Python installation in a particular directory.
- It provides tools for you to activate and deactivate these instances in a way that whenever they are activated, they have precedence on your system's PATH. In other words, it means that if you activated your virtualenv and attempted to run any Python binary, it is going to look for it in the new instance's directory first.
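Both points can be checked from Python. This is only a sketch, assuming Python 3: in_virtualenv and first_python_on_path are hypothetical helper names, and the prefix comparison relies on the attributes that virtualenv and the standard library's venv module set on sys.

```python
import shutil
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and the
    # standard library's venv) make sys.base_prefix differ from sys.prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


def first_python_on_path():
    """Return the path of the first 'python' found on PATH, or None."""
    return shutil.which("python")


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtualenv:", in_virtualenv())
    print("first python on PATH:", first_python_on_path())
```

Run once with the environment activated and once without: the first python on PATH should move into the environment's directory while it is active.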
And that, my friend, is how you will be sandboxing your projects! We begin by installing virtualenv in your main Python installation:
$ sudo pip install virtualenv
Note: This step doesn't need to be repeated for every project. You'll do it only once.
Now that we have virtualenv available, let's make a directory for our sandboxed django project and cd into it:
    $ mkdir -p $HOME/Workspace/my-sandboxed-django-proj
    $ cd $HOME/Workspace/my-sandboxed-django-proj
And next we create a virtualenv for this project:
    $ virtualenv .venv
    New python executable in .venv/bin/python
    Installing setuptools, pip...done.
The directory structure should look like this:
    my-sandboxed-django-proj/
    └── .venv/
        ├── bin/
        ├── include/
        ├── lib/
        └── local/
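A similar directory can be created from Python. Note that this sketch swaps in the standard library's venv module (Python 3.3+) rather than virtualenv itself, so the resulting layout differs slightly (for example, there is no local/ directory). create_sandbox is a hypothetical helper name.

```python
import os
import tempfile
import venv


def create_sandbox(project_dir, name=".venv"):
    """Create a virtual environment called `name` inside `project_dir`."""
    env_dir = os.path.join(project_dir, name)
    # with_pip=False keeps the sketch fast and offline; pass True to
    # also install pip into the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    # A throwaway directory stands in for the project directory.
    with tempfile.TemporaryDirectory() as project_dir:
        env_dir = create_sandbox(project_dir)
        print(sorted(os.listdir(env_dir)))
```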
Now, the only remaining thing we have to do before we install the django package is to activate our newly created virtual environment:
$ source .venv/bin/activate
And that's it. If everything worked, you should notice that your prompt has changed to something like:
    (.venv)$
Great! Now you are safe to install your third party libraries and code your Django application. How do you install the django package in your newly activated virtualenv? Easy, with pip:
(.venv)$ pip install django
Note: Attention here! Notice that we did not use sudo in front of our command. If we had, the package would have been installed in the main Python installation directory, even though we had the virtualenv activated.
Alright, now you can import all the third party libraries you need, code your Django application and conquer the world!
Once you are done using your virtualenv you can deactivate it by typing:
    (.venv)$ deactivate
Smooth and simple. Now everything goes back to how it was in the last section. If you attempt to install something with pip, it is going to be installed in the main Python installation, and your prompt should have gone back to its usual look.
And we are done with virtualenv. There's not a lot more to be said here. If you want more information, visit the virtualenv documentation.
The convenient 'virtualenvwrapper'
Lazy programmers might find repeating all of the above steps for every project a burden, so why not create a script to do it for you?
You don't need to: virtualenvwrapper is a collection of shell functions that will ease your life with sandboxed Python environments. It includes:
- A centralized way of maintaining your virtualenvironments
- Activation of a virtualenvironment from anywhere in the directory tree
- Temporary virtualenvironments to make quick tests
Isn't it awesome? Let's configure it. We start by installing it.
$ sudo pip install virtualenvwrapper
Note: The truth is that you didn't need to install virtualenv beforehand in order to install virtualenvwrapper. When you use pip to install a package, it automatically installs all of its dependencies recursively. Since virtualenv is a dependency of virtualenvwrapper, the previous command would suffice.
Note: As with virtualenv, this step doesn't need to be repeated for every project. You'll do it only once.
The last configuration step, before we have access to all the utilities of virtualenvwrapper, is to tell it where to save our virtualenvironments ($WORKON_HOME) and to autostart it whenever we open a new terminal. We do that by appending the following lines to our $HOME/.bashrc file (or similar):
    # file: $HOME/.bashrc
    # ... other content goes here ...
    export WORKON_HOME=$HOME/.venvs
    export VIRTUALENVWRAPPER_SCRIPT=/usr/local/bin/virtualenvwrapper.sh
    source $VIRTUALENVWRAPPER_SCRIPT
Now restart your terminal or type:
$ source $HOME/.bashrc
Are you ready to see its joy?
So you want to create a new virtualenvironment? No problem! Go ahead and type:
    $ mkvirtualenv my-new-virtualenv
    New python executable in my-new-virtualenv/bin/python
    Installing setuptools, pip...done.
Whoa! Your prompt should have changed to the following, and you already know what that means:
    (my-new-virtualenv)$
That's right. You activated your virtualenv with one single command. Deactivating it uses the same command as for virtualenv; just type:
    (my-new-virtualenv)$ deactivate
And what if you want it active once again? No problem, here you go:
$ workon my-new-virtualenv
Simple? Oh! You don't want that virtualenvironment anymore? Easy:
    $ rmvirtualenv my-new-virtualenv
    Removing my-new-virtualenv...
That's it. Gone.
Want to try out a library that might become part of your project? OK. How about you make a temporary environment:
    $ mktmpenv
    New python executable in tmp-ffe1f7f2b5e5102f/bin/python
    Installing setuptools, pip...done.
    This is a temporary environment. It will be deleted when you run 'deactivate'.
Yes. You should be in a temporary environment that will autodestruct once you deactivate it!
There are many commands provided by virtualenvwrapper, and you can check them out in the virtualenvwrapper documentation. Here I just presented some that I use constantly.
Once you have a virtualenvironment activated, installing requirements is as straightforward as usual: just use pip as we did in the previous sections.
The virtualenvironments are all held under the $WORKON_HOME that you specified in $HOME/.bashrc, so if you need to, just cd into it and do whatever you have to do.
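Because every environment is just a directory under $WORKON_HOME, listing them is easy to sketch in Python. list_virtualenvs is a hypothetical helper; it assumes an environment can be recognised by its activate script, and it falls back to ~/.virtualenvs (virtualenvwrapper's default location) when WORKON_HOME is not set.

```python
import os


def list_virtualenvs(workon_home=None):
    """Return the names of the environments stored under workon_home."""
    if workon_home is None:
        # Assumption: fall back to virtualenvwrapper's default location.
        workon_home = os.environ.get(
            "WORKON_HOME", os.path.expanduser("~/.virtualenvs")
        )
    if not os.path.isdir(workon_home):
        return []
    names = []
    for entry in sorted(os.listdir(workon_home)):
        env_dir = os.path.join(workon_home, entry)
        # An environment has an activate script in bin/ (Scripts/ on Windows).
        if any(
            os.path.isfile(os.path.join(env_dir, sub, "activate"))
            for sub in ("bin", "Scripts")
        ):
            names.append(entry)
    return names


if __name__ == "__main__":
    print(list_virtualenvs())
```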
Freezing and consuming frozen requirements
If you want to share your code with other people and make it less of a pain for them to run it, it is good practice to freeze (write) a list of your project's dependencies in a requirements.txt file.
In order to do so, we can use one feature of pip. Just activate the virtualenvironment with your dependencies and then type:
(my-virtualenv)$ pip freeze > requirements.txt
Note: The name of the output file could be anything, but requirements.txt is the usual choice, and adopting it keeps your project consistent with what the community expects.
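What pip freeze writes can be approximated from Python itself. This is a rough sketch, assuming Python 3.8 or newer for importlib.metadata; frozen_requirements is a hypothetical helper, and the real pip freeze handles cases (such as editable installs) that this ignores.

```python
from importlib.metadata import distributions


def frozen_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    pins = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        # Skip distributions whose metadata is incomplete.
        if name and version:
            pins.add("{}=={}".format(name, version))
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    for line in frozen_requirements():
        print(line)
```

Run inside an activated environment, it lists only what that environment contains.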
Suppose that your virtualenvironment 'my-virtualenv' contained a MySQL-python installation. The resulting requirements.txt would then list it on a line of the form <package>==<version>, where the value on the left of the equality is the package and the value on the right is its version.
And that's it. Once someone gets their hands on your code, the first thing they would do is create a new virtualenvironment and then type:
(new-virtualenv)$ pip install -r requirements.txt
And that would install the right version of all the dependencies needed for them to run your code.
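To make the file format concrete, here is a small sketch that reads pinned requirements into a dictionary. parse_pinned_requirements is a hypothetical helper that only understands simple name==version lines and comments; real requirements files allow more than that. The sample content is made up for illustration.

```python
def parse_pinned_requirements(text):
    """Map package names to versions for lines shaped like name==version."""
    pinned = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue  # skip blanks and anything that is not a simple pin
        name, version = line.split("==", 1)
        pinned[name.strip()] = version.strip()
    return pinned


# Hypothetical file content, for illustration only.
sample = """\
# requirements.txt
django==1.6.11
example-package==0.1.0
"""

print(parse_pinned_requirements(sample))
# {'django': '1.6.11', 'example-package': '0.1.0'}
```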
Installing specific versions of 3rd party libraries
You can tell pip to install a specific version of a library by typing the command as follows:
$ pip install django==1.6.11
That would install Django version 1.6.11. Notice that the corresponding version must be available on the Python Package Index.
Note: If you do not specify a version, pip will install the latest stable release available with a matching package name.
Using different Python versions
It might be the case that you have multiple Python versions installed on your machine.
If that is the case, both virtualenv and virtualenvwrapper provide ways for you to specify which Python version the target virtualenvironment is going to use.
For virtualenv, use:
$ virtualenv <path to virtualenv> -p <path to python executable>
And for virtualenvwrapper you can also use:
$ mkvirtualenv <virtualenv name> -p <path to python executable>
If you do not specify a version, the Python version used to install virtualenvwrapper will be used.
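To find a path to pass to -p, you can ask Python which interpreters are on your PATH. This is a minimal sketch, assuming Python 3; find_interpreters is a hypothetical helper and the default names are only examples, so adjust them to the versions you expect to have installed.

```python
import shutil
import sys


def find_interpreters(names=("python2.7", "python3")):
    """Map each interpreter name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("running under:", sys.executable, sys.version_info[:3])
    for name, path in find_interpreters().items():
        print(name, "->", path)
```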
Although there are other sandboxing projects for Python out there, such as Pyenv, I think the ones I described here are the most popular and you could (and should) even use them in production.
I hope this entry helps people who are starting to dive into the Pythonic world get to know a little bit more about the Python ecosystem and avoid the mistakes I made when I was starting out myself.
If you have a question or a suggestion, please feel more than welcome to use the comment section below.