Setting Up a Python Programming Environment

Want to learn how to set up a programming environment for Python using open source tools? Then read on!

Over the last 20 years or so I have drifted in and out of different Python programming fads. In hindsight, it is clear there is no “default” setup. Folks stumble accidentally into different configurations, post about it, and eventually you get patterns and drift.

As of 2022, here is my preferred setup (and how to work it).

Operating System

First, I like an Ubuntu base. I’m now on 22.04 LTS and generally upgrade when the LTS versions become available.

IDE

PyCharm Community Edition (CE) is my Integrated Development Environment (IDE) of choice. It’s free and widely supported/documented. It’s generally a toss-up between PyCharm and VS Code for Python development (I’m not hardcore enough for Emacs- or Vim-only editing). Having used both, my gut prefers PyCharm – I find it a little easier to use and a little more focused.

Virtual Environments vs Conda Environments vs Docker Containers

This is the biggest headache – what do you code into?

Virtual Environments

I started out with virtual environments. This was the classic Python approach. It all got slightly easier once venv was built into the standard library (from Python 3.3).

Positives:

  • No external dependencies or configurations.
  • Simplify everything with a requirements.txt file and pip install.
  • Lots of supporting documentation and how-tos on the web.
  • Good support for cloud hosting.

Negatives:

  • The separate venv folder in your coding folder can get in the way.
  • Difficult to keep track of different environments.
  • A bit more rough-and-ready with more complexity for things like setting environment variables.
  • There was a time when many of the major libraries that required some form of hardware support needed more than a pip install (e.g. machine learning libraries; anything using images, video or audio – OpenCV I see you; stuff requiring Graphics Processing Unit – GPU – support). This meant you needed more than a virtual environment to get things running.

I’ve found that there are sometimes problems initialising a new Python environment using PyCharm so I suggest creating a venv folder manually using the terminal:

python3 -m venv venv

Then, in PyCharm, select your interpreter as an existing virtual environment and point it at the Python binary in the bin directory of the venv folder.
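If you would rather script that step than type it, the standard library exposes the same machinery as the terminal command. The following is a minimal sketch, not the only way to do it; the function name create_venv is made up for illustration, and it assumes the usual layout of bin/ on Linux and macOS and Scripts\ on Windows.

```python
import sys
import venv
from pathlib import Path


def create_venv(target: Path, with_pip: bool = True) -> Path:
    """Create a virtual environment at `target` and return its interpreter path.

    `create_venv` is a hypothetical helper name; `venv.EnvBuilder` is the
    standard-library API that `python3 -m venv` uses under the hood.
    """
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(target)
    # The interpreter lives in Scripts\ on Windows and bin/ elsewhere.
    if sys.platform == "win32":
        return target / "Scripts" / "python.exe"
    return target / "bin" / "python"
```

Calling create_venv(Path("venv")) from the project folder should give you a path you can paste into the PyCharm interpreter dialog. Use Path.absolute() rather than Path.resolve() if you want the full path, because on Linux the venv interpreter is usually a symlink and resolve() would follow it back to the system Python.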

Remember to activate your environment before working in it (source venv/bin/activate) and deactivate it following use (deactivate) – or just close and reload your terminal window in PyCharm.
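It is easy to forget whether an environment is active. A small check like the one below can go at the top of a script to tell you. This is a sketch with made-up function names; it relies on the documented behaviour that sys.prefix and sys.base_prefix differ inside a venv. It will not recognise a conda environment, which works differently.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


def describe_environment() -> str:
    """Return a one-line summary of which interpreter is in use."""
    kind = "virtual environment" if in_virtual_environment() else "base interpreter"
    return f"{kind}: {sys.prefix}"
```

Printing describe_environment() in the PyCharm terminal is a quick way to confirm that the interpreter you selected is the one actually running.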

Conda Environments

The last negative above drove many people to conda or Anaconda as a Python environment manager. Plain conda is often preferred over the full Anaconda distribution as it is lighter-weight and so easier to port into a production environment. Installing packages with conda also installs compatible binaries, so you can build environments from a combination of traditional (non-Python) package installations and Python package installations.

Positives:

  • Much easier to obtain hardware support and working environments, especially for machine learning and media processing libraries.
  • Better interface for environment management – set environment variables, export and clone environments, list all environments on your system etc.

Negatives:

  • You end up with a mix of conda managed packages and pip managed packages.
  • Need to use conda-forge for many main libraries. Supported packages take a long time to work their way into default support.
  • Need to install conda within any production environment. This has been a real problem with Amazon Web Services (AWS) and Azure where their “web app” environments generally just have support for the pip install of a requirements.txt file.
  • I haven’t yet managed to easily export an environment and start it up on a different machine. Even though it “should” work, in practice it is a big headache: it’s unclear what the best default environment specification is, packages seem specifically tied to the host environment configuration, and a looser coupling raises dependency errors.
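To see how mixed an environment has become, you can group installed packages by the tool that installed them. The sketch below reads the INSTALLER file that installers normally write into each package's metadata folder. It assumes conda-built packages record "conda" there and pip records "pip", which is typical but not guaranteed; anything without the file is reported as "unknown". The function name is made up for illustration.

```python
from collections import defaultdict
from importlib.metadata import distributions


def packages_by_installer() -> dict[str, list[str]]:
    """Group installed distributions by the content of their INSTALLER file."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for dist in distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue  # skip broken or incomplete metadata folders
        # read_text returns None when the file is missing.
        installer = (dist.read_text("INSTALLER") or "").strip() or "unknown"
        grouped[installer].append(name)
    return {installer: sorted(names) for installer, names in grouped.items()}
```

Running this inside a conda environment and printing the sizes of each group gives a rough picture of how much of the environment pip has taken over, which is worth knowing before trying to export it.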

As Python, Linux and machine learning libraries have matured, I’m finding fewer problems with hardware support for standard pip installations. This and the lack of cloud support for conda are driving me away from conda towards venv or Docker.

Docker Environments

The lack of widespread support for conda environments, the difficulty of porting environments to different machines, and the need to build complex web applications with message brokers, web servers and backend databases lead us all to container services such as Docker.

As well as providing a monolithic container for easy running of a complex service, you can also code within the Docker environment you are running. This can keep things nicely compartmentalised.

Positives:

  • Sometimes supported by cloud hosting providers.
  • Easy porting of environment to different machines and across different operating systems.
  • GitHub’s Codespaces is going this way.

Negatives:

  • Still a pain in the arse to get fast GPU and bare-metal support (e.g. for machine learning libraries).
  • Often requires sudo/administrative privileges to get things working (despite what the docs say).
  • Easy to find yourself running out of disk space, even on terabyte storage devices.
  • New GUI makes things slightly easier but is run as a user process, leading to a split between administrative and user-run containers.
  • Difficult to remember all the command-line kung-fu and the compose file syntax.
  • Security issues.

PyCharm CE has a Docker plugin that can be installed via Settings > Plugins. See this guide to set it up. You can also connect to a Docker repository. PyCharm provides a nice interface to manage Docker containers without having to use the official Docker GUI or the command line. Unfortunately, though, you need to pay for the Professional edition to use a Docker container as a remote Python interpreter.

One nice thing about coding up Dockerfiles within PyCharm is you can run the files from within PyCharm and manage the container. For web applications, you often need to expose ports. This isn’t obvious but is fairly simple. Select the Dockerfile, then click on “Run” in the menu and “Edit Configurations…”. If you have set up Docker you should have that as a run option. Then click on “Modify options” and select the “Bind ports” option. You then have a “Bind Ports” input box where you can enter a port mapping (e.g. 8000 to 8000).
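For the port binding to do anything, the application inside the container has to listen on the container port, and on all interfaces rather than just localhost. Here is a minimal sketch of such an app using only the standard library. The handler, the response text and the make_server helper are invented for illustration, and port 8000 simply matches the example mapping above.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answer every GET request with a short plain-text body."""

    def do_GET(self) -> None:
        body = b"hello from the container\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host: str = "0.0.0.0", port: int = 8000) -> ThreadingHTTPServer:
    """Build a server bound to `host:port` (hypothetical helper).

    Binding to 0.0.0.0 matters in a container: a server bound only to
    127.0.0.1 is not reachable through a published port.
    """
    return ThreadingHTTPServer((host, port), HelloHandler)
```

With make_server().serve_forever() as the container's start command and 8000 bound to 8000, the app should answer at http://localhost:8000/ on the host.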

Conclusions

If you are building a small Python package use a virtual environment for development if you can.

If you are doing local research and development and don’t need to port your code to a production environment or cloud hosting site, use a conda environment.

If you are building any kind of web app, use Docker.
