Finding Dead Python Files with Snakefood

Koby Bass
2 min read · Dec 12, 2018


And cleaning them up.

TL;DR: Download snakefood and look at this gist.

Recently I started working on a Python project with a large and mature codebase. As the codebase evolved, some parts were left behind, unused.

To clean things up, I wanted to find all unused Python modules in the project.

Vulture is a great tool for finding unused code inside your files, but I wanted to find whole files that weren’t being used. For this, I needed a dependency analysis tool.

Coverage shows which code is executed when we run our project, but we have many entry points and code flows, so exercising all of them is impractical.

Snakefood is a command-line tool that analyses dependencies and can create graphs from them. I decided to use its dependency analysis to find which files weren’t imported by any of our code.

A summary is available in this gist, and I explain the details below.

How It Works

To generate a dependency file, we run sfood with the internal flag -i on our directory. The internal flag excludes any files outside of our project. Since this can take some time, we save the results to a temporary file:

sfood -i <dir> > /tmp/out.deps

The dependency file contains many dependency lines, like the following:

# file main.py depends on common/redis.py
(('/tmp', 'project/main.py'), ('/tmp', 'project/common/redis.py'))
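
Each dependency line happens to be a valid Python literal, so it can also be parsed directly instead of being split on quote characters. A minimal sketch, assuming the line format shown above with plain ASCII quotes; `parse_dependency` is a hypothetical helper name, not part of snakefood:

```python
import ast

def parse_dependency(line):
    """Return (source_file, target_file) from one dependency line."""
    # Each side of the pair is a (root, filename) tuple; keep the filenames.
    (_, source), (_, target) = ast.literal_eval(line)
    return source, target

line = "(('/tmp', 'project/main.py'), ('/tmp', 'project/common/redis.py'))"
print(parse_dependency(line))
# → ('project/main.py', 'project/common/redis.py')
```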

We’re interested in which files are being depended on. Using pipes, we can build a command that extracts them:

cat /tmp/out.deps | \
grep -v test | \
cut -d"'" -f8 | \
sort | \
uniq > /tmp/required.txt

  • Drop any lines that mention tests.
  • Split each dependency line by ' and take the 8th field, the filename being depended on.
  • Sort the filenames and keep only the unique ones.
  • Save the result for later comparison.
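
The same steps can be sketched in Python, which avoids counting quote-delimited fields. This is an illustrative equivalent rather than the contents of the gist: `required_files` is a hypothetical helper, `project/worker.py` and `project/test_main.py` are made-up sample files, and the sketch assumes a dependency target may be `None` for a file that depends on nothing, in which case it is skipped.

```python
import ast

def required_files(dep_lines):
    """Collect the unique files that some other file depends on."""
    required = set()
    for line in dep_lines:
        line = line.strip()
        # Mirror `grep -v test`: drop any line that mentions "test".
        if not line or "test" in line:
            continue
        _source, (_root, target) = ast.literal_eval(line)
        if target is not None:  # assumption: a target may be None
            required.add(target)
    # Mirror `sort | uniq`.
    return sorted(required)

sample = [
    "(('/tmp', 'project/main.py'), ('/tmp', 'project/common/redis.py'))",
    "(('/tmp', 'project/worker.py'), ('/tmp', 'project/common/redis.py'))",
    "(('/tmp', 'project/test_main.py'), ('/tmp', 'project/main.py'))",
    "(('/tmp', 'project/common/redis.py'), (None, None))",
]
print(required_files(sample))
# → ['project/common/redis.py']
```

Note that the test file's dependency on main.py is dropped along with the test line, which is why the entry point later shows up as "unused".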

We can generate a list of all our Python files using some bash scripting:

# strip the leading ./ so the paths match those in required.txt
find . -name '*.py' | \
sed 's|^\./||' | \
grep -v "__init__.py" | \
grep -v "test" | \
sort > /tmp/modules.txt
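
For comparison, a rough Python equivalent of this listing step, using only the standard library. `list_modules` is a hypothetical helper; it returns paths relative to the given root so they line up with the filenames in the dependency file, assuming the root is the same directory that sfood reports.

```python
from pathlib import Path

def list_modules(root):
    """Return sorted, root-relative paths of non-test .py files."""
    modules = []
    for path in Path(root).rglob("*.py"):
        rel = path.relative_to(root).as_posix()
        # Roughly mirror the two `grep -v` filters of the shell version.
        if path.name == "__init__.py" or "test" in rel:
            continue
        modules.append(rel)
    return sorted(modules)

if __name__ == "__main__":
    for module in list_modules("."):
        print(module)
```

Unlike the grep filter, the "test" check here only looks at the path below the root, so a project that lives under a directory whose name contains "test" is not filtered out wholesale.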

Comparing those files, we get what we wanted, the list of files that nothing depends on:

> diff /tmp/modules.txt /tmp/required.txt
project/main.py
project/common/unused.py

Filtering out our entry point project/main.py and framework files, we get our list of dead files.
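
The comparison and the final filtering amount to a set difference. A minimal sketch, assuming the two lists have already been loaded with matching relative paths; `dead_files` and its `entry_points` parameter are hypothetical names used for illustration:

```python
def dead_files(modules, required, entry_points=()):
    """Return modules that are neither depended on nor entry points."""
    return sorted(set(modules) - set(required) - set(entry_points))

modules = [
    "project/common/redis.py",
    "project/common/unused.py",
    "project/main.py",
]
required = ["project/common/redis.py"]

print(dead_files(modules, required, entry_points=["project/main.py"]))
# → ['project/common/unused.py']
```

A one-directional difference also sidesteps a quirk of plain diff, which prints change markers and would list any line that appears only in the required file as well.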

EOF

That’s it. I hope this helps someone; there wasn’t much information about this when I was searching. Comment below if you know a better way, I’ll be glad to hear it.
