Sunday, August 17, 2025

Introducing: topnfiles

Surprisingly often, I need to find the n most or least recently changed files. For example:

  • In a large and deep directory structure of configuration files, where has the most recent change activity happened?
  • In a nested directory of (many) uploaded files, where are the oldest files?

There doesn't seem to be a good existing command-line tool for this. Some (six) Unix tools can be chained together; let's call that the chained find+sort method. But on a file system with lots of files, the chained find+sort method is not as efficient as it could be with regard to CPU and memory consumption.

So I've had fun with Rust and have written a little tool called topnfiles. Benchmarks are good. It uses very little memory: its memory use is proportional only to the amount of output (unlike the find+sort method, where it is proportional to the number of files considered). I've also added some functionality that is hard to get with the chained find+sort method, such as JSON output.
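To illustrate how memory can stay proportional to the output size, here is a minimal sketch of one common technique: a min-heap capped at n entries. This is not necessarily how topnfiles works internally; the names TopN, offer, into_newest_first and walk are made up for this illustration, and the sketch uses only the Rust standard library.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Hypothetical helper: remembers only the `n` newest (mtime, path) pairs.
/// The heap never grows beyond `n` entries, however many files are offered.
pub struct TopN {
    n: usize,
    // `Reverse` turns the max-heap into a min-heap, so `peek()` returns
    // the oldest of the entries currently kept.
    heap: BinaryHeap<Reverse<(SystemTime, PathBuf)>>,
}

impl TopN {
    pub fn new(n: usize) -> Self {
        TopN { n, heap: BinaryHeap::new() }
    }

    /// Consider one file; keep it only if it belongs among the `n` newest.
    pub fn offer(&mut self, mtime: SystemTime, path: PathBuf) {
        if self.n == 0 {
            return;
        }
        if self.heap.len() < self.n {
            self.heap.push(Reverse((mtime, path)));
            return;
        }
        // Heap is full: replace the oldest kept entry only if this one is newer.
        let replace = match self.heap.peek() {
            Some(Reverse((oldest, _))) => mtime > *oldest,
            None => false,
        };
        if replace {
            self.heap.pop();
            self.heap.push(Reverse((mtime, path)));
        }
    }

    /// Return the kept entries, newest first.
    pub fn into_newest_first(self) -> Vec<(SystemTime, PathBuf)> {
        // Ascending order of `Reverse<T>` is descending order of `T`.
        self.heap
            .into_sorted_vec()
            .into_iter()
            .map(|Reverse(entry)| entry)
            .collect()
    }
}

/// Hypothetical walker: recursively offers every regular file below `dir`.
/// Symlinks are not followed; the first I/O error aborts the walk.
pub fn walk(dir: &Path, top: &mut TopN) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            walk(&entry.path(), top)?;
        } else if file_type.is_file() {
            top.offer(entry.metadata()?.modified()?, entry.path());
        }
    }
    Ok(())
}
```

To use it, create a TopN::new(10), call walk(Path::new("/usr"), &mut top), and print the paths from top.into_newest_first(). Each file costs at most one comparison and occasionally a heap update, while the find+sort approach has to hold the whole listing before it can sort it. Apart from the heap, the sketch only needs memory for the recursion, which grows with directory depth rather than file count.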

Example: finding the most recently changed files in /usr on a Debian 13 server:

$ topnfiles --exclude "/__pycache__/" /usr
/usr/lib/modules/6.12.38+deb13-amd64/modules.builtin.bin
/usr/lib/modules/6.12.38+deb13-amd64/modules.builtin.alias.bin
/usr/lib/modules/6.12.38+deb13-amd64/modules.devname
/usr/lib/x86_64-linux-gnu/graphviz/config6a
/usr/bin/topnfiles
/usr/share/doc/topnfiles/LICENSE.txt
/usr/share/man/man8/topnfiles.8.gz
/usr/share/doc/topnfiles/CHANGELOG.txt
/usr/share/doc/topnfiles/README.md
/usr/share/doc/topnfiles/copyright