Process a text file line by line in Bash

One of the most common tasks when working with Bash scripts and the Linux shell is text processing: filtering, selecting, transforming, and so on.

Often, the text comes from files such as CSV or log files. If you are an experienced user who does this on a daily basis, typing these kinds of command chains often feels like it comes from “muscle memory” more than from your brain. Most of the time, though, you need only parts of each line, like “the 5th to the 7th field” or a regular expression match; these are usually quite easy to extract using a combination of the well-known tools awk, grep, cut or sed.
But how do you iterate (loop) over each line of a file in Bash and use the whole line in your processing, for example to keep only the lines matching a specific pattern, or to split the script of a play into separate files per role?

This is not really a hard task. But unlike the processing of parts of lines mentioned above, handling whole lines is not something I need very often in my daily work. That’s why it gives me a hard time in those rare cases when I do need it: it is not stored in my very own “muscle memory”, which forces me to crawl through my memory castle and dig for it (or Google for it 😅).

The solution

Sure thing, there are plenty of ways to solve this, including not using Bash in the first place but a general-purpose programming language like Python 🐍. But to me, the following approach has proven to be the most effective and the easiest to remember:
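A minimal sketch of the loop, pieced together from the explanation that follows. The variable name input_file and the two sample lines are assumptions, there only so that the snippet runs on its own:

```bash
#!/usr/bin/env bash

# Placeholder input so the sketch runs on its own (an assumption);
# point input_file at your real file instead.
input_file="$(mktemp)"
printf 'first line\nsecond line\n' > "$input_file"

# Read the file line by line; each line ends up unmodified in ${line}.
while IFS="" read -r line || [[ -n "$line" ]]; do
    echo "${line}"
done < "$input_file"
```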

Explanation:

  • IFS=""
    prevents leading/trailing whitespace from being trimmed.
  • -r
    prevents backslash escapes from being interpreted.
  • || [[ -n "$line" ]]
    prevents the last line from being ignored if it doesn’t end with a newline (\n), since read returns a non-zero exit code when it encounters EOF.

Instead of the echo line, you can of course do whatever you like with the ${line} variable!
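To see what each of the three pieces contributes, it can help to run a naive loop and the full loop side by side. This is only a small demonstration: the function names naive_loop and full_loop and the sample input are made up for illustration.

```bash
#!/usr/bin/env bash

# Naive variant: default IFS, no -r, no check for an unterminated last line.
# (The missing -r is intentional here; it is what the demo is about.)
naive_loop() {
    local line
    while read line; do
        printf '[%s]\n' "$line"
    done < "$1"
}

# Full variant: the loop with all three safeguards in place.
full_loop() {
    local line
    while IFS="" read -r line || [[ -n "$line" ]]; do
        printf '[%s]\n' "$line"
    done < "$1"
}

# Made-up sample input: leading spaces, a backslash,
# and a last line that is not terminated by a newline.
demo_file="$(mktemp)"
printf '  two leading spaces\nback\\slash\nno newline at the end' > "$demo_file"

echo "--- naive loop ---"
naive_loop "$demo_file"
echo "--- full loop ---"
full_loop "$demo_file"

rm -f "$demo_file"
```

The naive loop prints only two lines, with the indentation and the backslash lost and the last line missing. The full loop prints all three lines exactly as they appear in the file.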

An Example

The following example uses this sketch.txt file, which contains the text of a Monty Python sketch. It uses the proposed solution to split the content into separate files, one per role:
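A minimal sketch of such a script, assuming the output file names listed further down. The function name split_by_role is made up, and sending every line that belongs to neither role to garbage_lines.txt is an assumption as well:

```bash
#!/usr/bin/env bash

# Sort each line of the given file into one output file per role.
# Assumption: every line matching neither role counts as "garbage".
split_by_role() {
    local input_file="$1"
    local line

    # Start with empty output files so repeated runs do not pile up duplicates.
    : > man_lines.txt
    : > otherman_lines.txt
    : > garbage_lines.txt

    while IFS="" read -r line || [[ -n "$line" ]]; do
        # printf is used instead of echo so that a line such as "-n"
        # is written out literally instead of being taken as an option.
        case "$line" in
            "Man:"*)       printf '%s\n' "$line" >> man_lines.txt ;;
            "Other Man:"*) printf '%s\n' "$line" >> otherman_lines.txt ;;
            *)             printf '%s\n' "$line" >> garbage_lines.txt ;;
        esac
    done < "$input_file"
}

# Process sketch.txt from the current directory when it is present.
if [[ -f sketch.txt ]]; then
    split_by_role sketch.txt
fi
```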

When you save this as “sketch-process.sh”, set its execution bit, and put it into the same folder as the sketch.txt file mentioned above, you will end up with 3 more files after executing “sketch-process.sh”:

  1. garbage_lines.txt
    … containing the “(pause)” lines
  2. man_lines.txt
    … containing the lines starting with “Man:”
  3. otherman_lines.txt
    … containing the lines starting with “Other Man:”

I hope you liked this article and that it turns out to be helpful for some of you! Let me know in the comments ✌

Born in 1982, Marc Richter has been an IT enthusiast since 1994. He became addicted when he first got his hands on his family’s PC and has never stopped investigating and exploring new things since.
He is married to Jennifer Richter and proud father of two wonderful children, Lotta and Linus.
His current professional focus is DevOps and Python development.
