Chained to Complexity: Python Dependency Management

Nov 19, 2024 | Programming, Python

Dependency management in software development is often akin to playing an elaborate game of Jenga where everyone involved is on their third IPA: every block you move risks toppling a tower that only gets wobblier over time.

Complexity Unchained

Python prides itself on simplicity and readability, but as soon as you introduce a library with dependencies (and its dependencies’ dependencies), the situation starts to spiral. Let’s say you install a simple library like requests. You’re not just installing requests; you’re also grabbing its dependencies, such as urllib3 and charset-normalizer (chardet in older releases). Now, imagine installing another package that also depends on urllib3, but it requires a different version. An environment can hold only one version of urllib3 at a time, so either the install fails or one of the two packages ends up running against a version it did not ask for. Welcome to dependency hell.
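The conflict described above can be sketched with nothing more than tuple comparison. This is a simplified illustration, not how pip's resolver actually works: the version ranges below are made up for the example rather than taken from any real release, and real specifiers also cover pre-releases, exclusions and environment markers.

```python
from typing import Optional, Tuple

Version = Tuple[int, ...]
Range = Tuple[Version, Version]  # (inclusive lower bound, exclusive upper bound)


def parse(version: str) -> Version:
    """Turn '1.26.4' into (1, 26, 4) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def intersect(a: Range, b: Range) -> Optional[Range]:
    """Return the version range both constraints accept, or None if there is none."""
    lower = max(a[0], b[0])
    upper = min(a[1], b[1])
    return (lower, upper) if lower < upper else None


# Hypothetical constraints that three packages might place on urllib3.
needs_1x = (parse("1.21"), parse("2.0"))        # urllib3>=1.21,<2.0
needs_2x = (parse("2.0"), parse("3.0"))         # urllib3>=2.0,<3.0
needs_late_1x = (parse("1.25"), parse("1.27"))  # urllib3>=1.25,<1.27

print(intersect(needs_1x, needs_2x))       # None: no single version satisfies both
print(intersect(needs_1x, needs_late_1x))  # ((1, 25), (1, 27)): there is an overlap
```

When the intersection is empty, no amount of reinstalling will help; one of the two packages has to move to a release with different constraints.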

This issue isn’t unique to Python, but Python’s lack of built-in version locking at the global level exacerbates the problem. You might think, “I’ll just pin versions in my requirements.txt file,” but that only solves part of the puzzle: pinning the packages you install directly still leaves their own dependencies free to float. The reality is that the chain of dependencies is often opaque. Libraries you rely on may change their requirements without warning, and the ripple effect can break your entire environment.
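One way to make the chain a little less opaque is to ask an installed package what it declares. The sketch below uses the standard library's importlib.metadata; the helper name declared_requirements is my own, and requests is used only as an example of a package that may or may not be installed on your machine.

```python
from importlib import metadata
from typing import List


def declared_requirements(dist_name: str) -> List[str]:
    """Return the requirement strings an installed distribution declares.

    Returns an empty list if the distribution is not installed or if it
    declares no dependencies.
    """
    try:
        return list(metadata.requires(dist_name) or [])
    except metadata.PackageNotFoundError:
        return []


if __name__ == "__main__":
    # "requests" is only an example; any installed distribution name works.
    for requirement in declared_requirements("requests"):
        print(requirement)
```

The output is whatever the installed release declares, so running it before and after an upgrade is a quick way to spot a requirement that changed underneath you.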

Chasing Compatibility

Consider a real-world example I recently encountered. I was working on a small ETL script that relied on pandas for data manipulation and openpyxl for Excel handling. I installed them with a simple pip3 install pandas openpyxl, expecting smooth sailing. But beneath the surface, there were hidden conflicts:

  • pandas depended on numpy, but the version installed didn’t play nicely with my system’s architecture.
  • openpyxl brought along a version of jdcal that clashed with another library I had installed.

Resolving these issues wasn’t just a matter of updating one package. Each fix cascaded into new problems, leading me to dig into release notes, GitHub issues, and obscure documentation to understand what broke and why.
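Before digging through release notes, it helps to write down exactly what you are running. Here is a rough sketch of the kind of report I mean, using only the standard library. The function name environment_report is hypothetical, and the package list is just the set from this example.

```python
import platform
import sys
from importlib import metadata
from typing import Dict, Iterable, Optional


def environment_report(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Collect interpreter, architecture and installed package versions."""
    report: Dict[str, Optional[str]] = {
        "python": platform.python_version(),
        "machine": platform.machine(),  # CPU architecture as reported by the OS
        "platform": sys.platform,
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


if __name__ == "__main__":
    for key, value in environment_report(["pandas", "numpy", "openpyxl", "jdcal"]).items():
        print(f"{key}: {value}")
```

Pasting this output into a GitHub issue or a search box tends to get you to the relevant thread faster than the traceback alone.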

Virtual Environments Aren’t a Panacea

One of the most common recommendations for managing dependencies in Python is to use virtual environments. Tools like venv or virtualenv help you isolate project dependencies, ensuring one project’s requirements don’t interfere with another’s. While this is good practice, it doesn’t eliminate the challenge of managing the dependency chain itself.

Even in an isolated environment, you’re still at the mercy of how dependencies interact with one another. Without carefully curating your requirements.txt, you might inadvertently update a package that introduces breaking changes further down the chain. Tools like pip freeze can help lock down your current state, but they don’t necessarily prevent you from starting with a broken chain in the first place.
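Curating a requirements.txt by hand is easier with a quick check for entries that are not pinned to an exact version. The following is a deliberately naive sketch: it only understands simple lines, it skips option lines such as -r or -e, and it does not parse URLs or every specifier pip accepts. The sample file contents and version numbers are illustrative only.

```python
from typing import List


def unpinned_requirements(requirements_text: str) -> List[str]:
    """Return requirement lines that are not pinned with an exact '==' version."""
    loose = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or line.startswith("-"):
            continue  # blank lines, comments, and options like -r / -e
        line = line.split(" #", 1)[0].strip()  # drop trailing comments
        spec = line.split(";", 1)[0].strip()   # ignore environment markers
        if "==" not in spec or "*" in spec:
            loose.append(line)  # a range, a bare name, or a wildcard pin
    return loose


# Illustrative file contents, not a recommendation of specific versions.
sample = """
# direct dependencies
pandas==2.2.3
openpyxl>=3.1
requests
numpy==1.26.*
"""

print(unpinned_requirements(sample))
# ['openpyxl>=3.1', 'requests', 'numpy==1.26.*']
```

Anything this flags can drift the next time the environment is rebuilt, even inside a virtual environment.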

Strategies for Survival

Here are some tips I’ve learned (sometimes the hard way) for managing dependency chains effectively:

  1. Pin Everything: Use pip freeze > requirements.txt to lock versions after you’ve tested your environment. This prevents future installations from pulling in unexpected updates.
  2. Use Dependency Scanners: Tools like pipdeptree or poetry show --tree print your dependency chain as a tree, so you can see which package pulled in what. Understanding the chain is the first step to managing it.
  3. Regular Audits: Dependency management isn’t a set-it-and-forget-it task. Schedule regular audits of your dependencies, especially for projects that aren’t frequently updated.
  4. Isolate Projects: Always use virtual environments or tools like Docker to ensure clean, isolated environments for each project.
  5. Leverage Tools Like Poetry: While pip3 is ubiquitous, tools like poetry or pip-tools provide additional features for dependency resolution and conflict management.
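If you are curious what a dependency scanner does under the hood, the idea behind tip 2 can be approximated with the standard library. This is a rough sketch and not how pipdeptree is implemented: the function names are my own, requirement names are extracted with a simple regular expression instead of a full parser, and dependencies that only apply to optional extras are skipped.

```python
import re
from importlib import metadata
from typing import Dict, List, Optional, Set

_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalize(name: str) -> str:
    """Normalize a distribution name so 'Typing_Extensions' matches 'typing-extensions'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def direct_dependencies(dist_name: str) -> List[str]:
    """Names of the distributions that dist_name declares as requirements."""
    try:
        requirements = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []  # not installed, so nothing to inspect
    names = set()
    for requirement in requirements:
        if re.search(r"extra\s*==", requirement):
            continue  # only needed when an optional extra is requested
        match = _NAME.match(requirement)
        if match:
            names.add(normalize(match.group(1)))
    return sorted(names)


def dependency_tree(root: str, seen: Optional[Set[str]] = None) -> Dict[str, dict]:
    """Nested dict of dependencies; packages already visited are not expanded again."""
    seen = {normalize(root)} if seen is None else seen
    tree: Dict[str, dict] = {}
    for child in direct_dependencies(root):
        if child in seen:
            tree[child] = {}  # shown, but not re-expanded (also guards against cycles)
            continue
        seen.add(child)
        tree[child] = dependency_tree(child, seen)
    return tree


def render(tree: Dict[str, dict], indent: int = 0) -> List[str]:
    """Flatten the nested dict into indented lines for printing."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * indent + name)
        lines.extend(render(children, indent + 1))
    return lines


if __name__ == "__main__":
    # "pandas" is just an example root; use any package installed in your environment.
    print("\n".join(render(dependency_tree("pandas"))))
```

For real projects I would still reach for pipdeptree or poetry, which handle version constraints and conflicts that this sketch ignores.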

The Human Factor

The most challenging part of dependency management isn’t technical but human. The more libraries you rely on, the more you’re at the mercy of other developers’ choices and priorities. Open-source maintainers might deprecate a feature you depend on or introduce breaking changes without sufficient documentation. These aren’t just edge cases—they’re inevitable realities in a rapidly evolving ecosystem.

Conclusion

Dependency management with pip3 can be frustrating, but it’s also a reminder of the interconnected nature of software development. While tools and strategies can help mitigate the pain, the key is adopting a mindset of vigilance and adaptability. Like spelunking, you need the right tools, a good map, and a willingness to get your hands dirty. And sometimes, you just have to accept that the chain will break—you’ll learn more from fixing it than you ever did from installing it. Dependency management: it’s not glamorous, but it’s the glue holding our Python projects together. Keep your requirements locked, your chains inspected, and your sanity intact—well, as intact as it can be. 

Got data locked in a legacy or proprietary system? Take a look at my ETL and automation solution Alice.

 

 

About Me

Hi! I’m Mike! I’m a software engineer who codes at The Mad Botter INC. You might know me from Coder Radio or The Mike Dominick Show.  Drop me a line if you’re interested in having some custom mobile or web development done.

© 2024 Copyright Michael Dominick | All rights reserved