Post Tenebras Lux

Today I got an interesting little challenge: how do I see the output of a command and at the same time get its size in bytes, without modifying the program?

The most pedestrian way I can think of is:

  1. Redirect the output of the command to a file.
  2. Dump the file to the terminal (e.g. with cat).
  3. Get the size of the file with wc --bytes.

As simple as:

$ foo > output_of_foo
$ cat output_of_foo
$ wc --bytes output_of_foo

That works, but it isn’t as convenient as I needed. How can I do all of that in one go? Chaining the three commands into a one-liner like foo > output_of_foo; cat output_of_foo; wc -c output_of_foo doesn’t count.

Could tee do that? tee copies the standard input (stdin) to both the standard output (stdout) and to one or more files. For example, echo Hi | tee a b c shows the string Hi (the input for tee) in the terminal (standard output) and saves it in the three files a, b and c, overwriting their content if they already exist.
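To try this without clobbering real files, here is a minimal sketch that runs tee in a scratch directory. The variable name scratch and the use of mktemp are choices made for this illustration; -a is tee's standard option for appending instead of overwriting.

```shell
# Scratch directory so that no existing a, b or c gets overwritten.
scratch=$(mktemp -d)

# Prints "Hi" and writes the same line to the three files.
echo Hi | tee "$scratch/a" "$scratch/b" "$scratch/c"

# Without -a tee overwrites its files; with -a it appends to them.
echo Again | tee -a "$scratch/a"

# a now holds two lines, b and c still hold one.
cat "$scratch/a"
wc -c "$scratch/a" "$scratch/b" "$scratch/c"
```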

This seems promising. But… foo | tee | wc --bytes is the same as not using tee at all: with no files specified, tee just passes its input along the pipe to wc, and only the count reaches the terminal.

We could hack our way around with foo | tee /dev/stderr | wc --bytes: this way the output of foo goes to both stdout and stderr (standard error)! The pipe to wc carries only stdout, leaving stderr on the terminal:

$ echo Hi | tee /dev/stderr | wc --bytes
Hi
3

Note the three bytes in there: H, i and \n. Use echo -n to drop the trailing newline at the end of the string. Beware of side effects, though: the copy travels on stderr, so anything that redirects or captures stderr takes it along. You’ve been warned.
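To see which stream carries what, here is a small sketch that sends stderr to a scratch file instead of the terminal. The variable names copy, size and text are made up for this example, and printf is used as a portable stand-in for echo -n.

```shell
# Scratch file standing in for the terminal (the name "copy" is arbitrary).
copy=$(mktemp)

# Inside the braces, stderr points at the scratch file: the count on stdout
# is captured in $size, while tee's extra copy lands in the file.
size=$( { echo Hi | tee /dev/stderr | wc -c; } 2>"$copy" )
text=$(cat "$copy")

echo "size: $size"
echo "copy: $text"

# printf adds no newline of its own, so this counts only H and i.
printf 'Hi' | wc -c

rm -f "$copy"
```

The count and the copy really do arrive on different streams, which is exactly why redirecting stderr elsewhere in a script can swallow the text.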

That gets the job done. But can we avoid stderr?

If you are running Bash, you can use some nasty process substitution as well. With this obscenity we can turn the input of wc into a filename for our tee:

$ echo Hi | tee >(wc --bytes)
Hi
3

And no stderr involved 🙂
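Process substitution is not part of plain POSIX sh, so it won’t work in every shell. A rough equivalent is to create the named pipe by hand. This is only a sketch, assuming mkfifo and mktemp are available; show_and_count, fifo_dir and the pipe path are names invented for this example.

```shell
# show_and_count: hypothetical helper that runs a command, shows its output
# and prints the size of that output in bytes.
show_and_count() {
    fifo_dir=$(mktemp -d) || return 1
    mkfifo "$fifo_dir/pipe" || return 1

    # Reader: wc counts whatever arrives on the named pipe, in the background.
    wc -c < "$fifo_dir/pipe" &

    # Writer: tee shows the output and feeds a copy into the pipe.
    "$@" | tee "$fifo_dir/pipe"

    # Wait for wc to print its count, then clean up.
    wait
    rm -r "$fifo_dir"
}

show_and_count echo Hi
```

The explicit wait means the count is printed before the function returns, so the output stays in a predictable order: first the text, then the size.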

With this neat trick we can feed the output of one command to any number of following processes, for example:

$ echo Hi | tee >(wc --bytes) >(wc --lines) >(md5sum)
Hi
31ebdfce8b77ac49d7f5506dd1495830  -
1
3

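Note the order of the output: the first line contains the input string Hi\n, which tee writes to stdout, then comes the MD5 checksum (output from md5sum), then the number of lines (from wc --lines) and then the number of bytes. It looks as if tee served the filenames from right to left, but that order is not guaranteed: the substituted processes run concurrently and each one prints its result only when it reaches the end of its input, so the last three lines may come out in a different order on another run.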