Jörg Rödel's Blog

Linux and Confidential Computing

15 Mar 2024

Introducing Flocc: The Fast Lines-Of-Code Counter

Today I am writing about a project that has been cooking for years on my disk and in my personal GitHub space. Flocc, the Fast Lines-Of-Code Counter, scans a code base and prints detailed statistics about lines of code, comments, and blank lines for each programming or markup language it finds. It can count in checked-out directory trees as well as in git revisions, and it supports a detailed JSON output mode.

Why Another Lines-of-Code Tool?

This is a valid question, since there are already a couple of tools out there that do the same. Before I started with flocc I did some experiments with other tools, but found their performance rather disappointing. So I sat down and hacked up a little tool in C++, which then evolved into what I am introducing today.

The pre-existing tools I found had a throughput of around 100,000 LOC per second, while flocc easily reaches 3,000,000 LOC per second, roughly 30 times as much. For example, a run on a recent Linux kernel source tree (6.8) on my Ryzen 7 2700X gives me the output below. Note that the tool is single-threaded and that the source was on disk but already in the page cache:

Results for .:
  Scanned 80006 unique files (80426 total)
  T=12.477s (6412.2 files/s,  3009243.6 lines/s)
                    Files       Code        Comment     Blank
  --------------------------------------------------------------------
  ASN.1             16          445         87          128         
  Assembler         1312        237630      87058       47303       
  Awk               8           880         105         170         
  C                 33544       17695328    2663689     3416259     
  C++               1           1591        66          271         
  C/C++ Header      24408       7385181     1428458     728244
  [...]
  YAML              3899        344052      17820       71194       
  Yacc              10          4951        413         705         
  --------------------------------------------------------------------
  Total             80426       28464454    4337084     4744795     
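
As a quick sanity check, the lines/s figure can be reproduced from the totals in the run above. The sketch below simply adds up the Code, Comment and Blank totals and divides by the reported run time; it suggests that the throughput counts every line read, not only lines of code. The variable names are only for illustration.

```shell
# Totals and run time copied from the output above.
code=28464454; comment=4337084; blank=4744795; seconds=12.477

# All lines read: code + comments + blank lines.
total=$((code + comment + blank))

# Shell arithmetic is integer-only, so use awk for the division.
rate=$(awk -v n="$total" -v t="$seconds" 'BEGIN { printf "%.1f", n / t }')

echo "total lines: $total"   # 37546333
echo "lines/s:     $rate"    # 3009243.6
```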

Installation

Flocc can be installed via the package manager of your Linux distribution (only on openSUSE Tumbleweed so far) or by compiling it from source. The repository is hosted on GitHub. There are some build dependencies:

  • C++ compiler
  • git
  • libgit2 development packages for supporting the git mode
  • Perl, mainly for pod2man to generate the man page

Once these are installed, you can build and install flocc as a normal user:

$ git clone https://github.com/joergroedel/flocc
$ cd flocc
$ make -j`nproc` all
$ make install

And done. This will install flocc into the ~/bin/ directory.
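
Whether ~/bin is on your PATH depends on your distribution and shell setup, so it may be worth checking before you try to run flocc. A minimal sketch, where dir_on_path is just a helper name made up for this example:

```shell
# Return success if the given directory is listed in PATH.
dir_on_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if dir_on_path "$HOME/bin"; then
    echo "$HOME/bin is on PATH"
else
    echo "$HOME/bin is not on PATH; you could add it, e.g. in ~/.profile:"
    echo '  export PATH="$HOME/bin:$PATH"'
fi
```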

On openSUSE Tumbleweed you can just do:

$ sudo zypper install flocc

Usage

Flocc comes with a man page and a small set of parameters. When invoked without any parameters, it starts counting in the current directory and all of its subdirectories.

$ flocc

It can also count in another directory passed as a parameter:

$ flocc ~/src/linux/

JSON Output

Instead of writing results in human-readable form to standard output, the tool can also create a JSON file with detailed results. The file contains per-file and per-directory results for automatic processing by other tools. Beware that it can get pretty big: a JSON output file for the Linux kernel is currently around 6.7 MiB in size.

To create a JSON file, use the --json parameter:

$ flocc --json output.json

This will write all results to the output.json file instead of standard output.
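
Before feeding the file to other tools, a quick look at its size and validity can help. The sketch below makes no assumptions about the structure of flocc's JSON output; json_report is a helper name invented for this example, and the validity check only runs if python3 happens to be installed.

```shell
# Print the size of a results file in MiB and, if python3 is available,
# check that it parses as JSON.
json_report() {
    file=$1
    bytes=$(wc -c < "$file") || return 1
    mib=$(awk -v b="$bytes" 'BEGIN { printf "%.1f", b / (1024 * 1024) }')
    echo "$file: $mib MiB"
    if command -v python3 >/dev/null 2>&1; then
        python3 -m json.tool "$file" >/dev/null && echo "$file: valid JSON"
    fi
}

# Only report on output.json if it has already been created.
if [ -f output.json ]; then
    json_report output.json
fi
```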

Git Mode

Flocc can count not only in checked-out directory trees, but also in git revisions. That is helpful if you want to compare the current lines-of-code count with older versions of a source tree. For that, flocc uses libgit2 to traverse a git tree object hierarchy. To use it, pass the --git parameter:

$ flocc --git v6.7

Instead of the checked-out revision, this does all counting on the git tree referenced by the v6.7 tag. This mode is slower than reading files from disk: in git mode you only get around 60% of the lines-per-second performance.
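
To compare two revisions, one simple approach is to save the human-readable output for each and diff the results. The sketch below relies only on the --git parameter shown above; count_rev and compare_revs are helper names made up for this example. I assume here that the git mode output has the same layout as the directory output shown earlier, in which case the timing line will always show up as a difference.

```shell
# Print the counts for one git revision.
count_rev() {
    flocc --git "$1"
}

# Save the output for two revisions and show where they differ.
# Returns 0 if the outputs are identical and 1 if they differ.
compare_revs() {
    old=$(mktemp) && new=$(mktemp) || return 1
    count_rev "$1" > "$old" && count_rev "$2" > "$new" && diff -u "$old" "$new"
    status=$?
    rm -f "$old" "$new"
    return "$status"
}

# Example, run from inside a kernel checkout with flocc installed:
#   compare_revs v6.7 v6.8
```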

Conclusions

The tool has been in the making for a few years, but it still has some rough edges. Over the next months I will work on those to make it even better and create a seamless experience for its users. I am always grateful for feedback about how the tool is used and what could be better.

If you play with flocc, please send me your feedback, bug reports, and feature requests by dropping an email to joro@8bytes.org or by reporting them on GitHub Issues. Until then, don’t forget to have a lot of fun!