Process a text file line by line in Bash

One of the most common tasks when working with Bash scripts and the Linux shell is text processing: filtering, selecting, transforming and so on.

Often, these texts come from text files like CSV files, log files, and so on. If you are an experienced user who does this on a daily basis, typing these kinds of command chains often feels like it comes from “muscle memory” more than from your brain. Most of the time, though, you need only parts of these lines, like “the 5th to the 7th field” or a regular expression match; these are usually quite easy to catch using a combination of the well-known tools awk, grep, cut or sed.
But how do you iterate (loop) over each line of a file in Bash and use that value for your processing? For example, if you want to preserve only lines matching a specific pattern, or divide the script of a play into separate files per role?

This is not really a hard task. But unlike the previously mentioned processing of only parts of those lines, handling the whole line is not something I commonly need in my daily work. That’s why it gives me a hard time in those rare cases when I do need it: it is not stored in my very own “muscle memory”, and it forces me to crawl through my memory castle and dig for it (or Google for it 😅).

The solution

Sure thing, there are plenty of ways to solve this, including not using Bash in the first place but a full programming language like Python 🐍. But to me, the following approach has proven to be the most effective and the easiest to remember:
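A minimal sketch of that loop, consistent with the explanation that follows. The function name print_lines and its file argument are only placeholders I picked for illustration:

```bash
#!/usr/bin/env bash

# Print every line of the file given as first argument, unchanged.
# `print_lines` is a hypothetical name chosen for this sketch.
print_lines() {
    local file=$1 line
    while IFS="" read -r line || [[ -n "$line" ]]; do
        echo "$line"
    done < "$file"
}
```

Called as `print_lines sketch.txt`, it should simply reproduce the file on standard output.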

Explanation:

  • IFS=""
    prevents leading/trailing whitespace from being trimmed.
  • -r
    prevents backslash escapes from being interpreted.
  • || [[ -n "$line" ]]
    prevents the last line from being ignored if it doesn’t end with a newline (\n), since read returns a non-zero exit code when it encounters EOF.

Instead of the “echo” line, you can do whatever you like with the ${line} variable, of course!
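To see what each of these three pieces buys you, the following sketch runs a naive loop and the robust variant over the same input. The function names and the sample text are made up for illustration:

```bash
#!/usr/bin/env bash

# Naive loop: trims whitespace, interprets backslashes, and skips a final
# line that lacks a trailing newline.
naive_lines() {
    local line
    while read line; do
        printf '[%s]\n' "$line"
    done < "$1"
}

# Robust loop: keeps every line exactly as it is stored in the file.
robust_lines() {
    local line
    while IFS="" read -r line || [[ -n "$line" ]]; do
        printf '[%s]\n' "$line"
    done < "$1"
}

# Sample input (made up): a padded first line containing a backslash,
# and a last line without a trailing newline.
sample=$(mktemp)
printf '%s\n%s' '  C:\temp  ' 'last line' > "$sample"

naive_lines "$sample"    # prints: [C:temp]
robust_lines "$sample"   # prints: [  C:\temp  ] and [last line]
rm -f "$sample"
```

The naive variant loses the padding, the backslash and the whole last line; the robust one keeps all three.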

An Example

The following example uses this sketch.txt file, which contains the text of a Monty Python sketch. It uses the proposed solution to split its content into a separate file for each of the two roles:
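A sketch of such a script, under the assumption that spoken lines in sketch.txt start with “Man:” or “Other Man:” and that everything else (such as “(pause)”) counts as garbage. The function name split_sketch is my own invention; the output file names are the ones listed further down:

```bash
#!/usr/bin/env bash

# Split a sketch file into one file per role.
# Assumption: spoken lines start with "Man:" or "Other Man:"; anything else
# (for example "(pause)") is treated as garbage.
split_sketch() {
    local input=$1 line
    # Start with empty output files so reruns do not append to old results.
    : > man_lines.txt
    : > otherman_lines.txt
    : > garbage_lines.txt
    while IFS="" read -r line || [[ -n "$line" ]]; do
        case $line in
            "Man:"*)       echo "$line" >> man_lines.txt ;;
            "Other Man:"*) echo "$line" >> otherman_lines.txt ;;
            *)             echo "$line" >> garbage_lines.txt ;;
        esac
    done < "$input"
}

# Process the file named on the command line, defaulting to sketch.txt in
# the current directory; do nothing if it cannot be read.
input=${1:-sketch.txt}
if [[ -r $input ]]; then
    split_sketch "$input"
fi
```

A case statement keeps the pattern matching readable; the same loop would work with grep or [[ ... =~ ... ]] tests if the rules get more complex.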

When you save this as “sketch-process.sh”, set its execution bit and put it into the same folder as the aforementioned sketch.txt file, you will end up with another 3 files after you have executed “sketch-process.sh”:

  1. garbage_lines.txt
    … containing the “(pause)” lines
  2. man_lines.txt
    … containing the lines starting with “Man:”
  3. otherman_lines.txt
    … containing the lines starting with “Other Man:”

I hope you liked this article and that it turns out to be helpful for some! Let me know in the comments ✌

Fix missing Google calendars in Evolution / CalDAV

Google Calendar and Evolution Sync

My wife and I both use Google calendars to organize our daily schedules. We also share these calendars with each other to see each other’s appointments. This way, we do not clash with each other’s plans by accepting appointments in the same time slots.

Recently, we found that not all of her calendars were offered to me in Evolution. After some digging, I found the solution and I’m going to explain it in this article.

Since some people in charge (note that I did not mention “we” 😤) decided to switch to Office365, I need to use Evolution PIM on my Linux machine to have an at least slightly enjoyable Exchange experience. The other solutions, like DavMail for example, basically worked but proved to be too error-prone and slow for my taste; more often than not, mails I archived in Thunderbird with DavMail in between showed up again after some minutes, and any action took ages.
However, somehow I could not select all of the foreign “shared” calendars my wife shared with me; they simply were not listed in the selection dialogue.

How to solve this

I finally found the solution to this issue here: It turned out that it’s not Evolution’s fault; instead, Google does not advertise (list) these calendars by default. You first have to change this (quite confusing) default setting at the following location:

https://www.google.com/calendar/syncselect

Please mind the lower list on that page; this should be a complete list of calendars shared with you. Select those you want to be able to access in CalDAV-based clients (including Evolution) and save your selection. Your changes should take effect immediately.

I hope this is helpful to some. Please let me know in the comments ✌

Fixed my most popular Docker Image

Today, I’d like to announce that my most popular Docker image derjudge/confluence, a batteries-included solution to get Atlassian Confluence up and running with a mature database (PostgreSQL) as storage backend in seconds, has been fixed and updated.

  • It ships with the most recent version of Confluence now, which is 6.8.1.
    The image had not been updated since Confluence version 6.0.2 … sorry for that!
  • PostgreSQL version was updated to 9.6.
    Since Atlassian has decided to finally support this version, I declared it to be the version of choice in my image, too. It was set to be 9.4 before.
  • Java version was updated to 1.8.0_162.
    This was 1.8.0_112 before.
  • Underlying Debian release was updated to “stretch”.
    This was “jessie” before.

I have to admit this had not really received much love recently … but in my defense: I do not currently use it for hosting Confluence myself. Not that I do not like to taste my own poison, but the infrastructure my hosting is built on does not need it at the moment. So I do not really notice if anything breaks (which was the case with PostgreSQL not launching recently).
Also, nobody got in touch with me to tell me something was wrong; the first note on this issue (PostgreSQL not working) I received by mail on 2018-04-12 at 11:04 CEST (thank you, Michael Bykovski from //SEIBERT/MEDIA!); on 2018-04-13 at 20:13 CEST the fix had been made, the updates listed above were applied, a new image tag was created for this new release and the image was built successfully.

I wonder a bit why nobody has done so before: The image has 50K pulls (WOW, thank you!! ✌), the PostgreSQL issue seems to have been in there since December 2016 (!) and both my e-mail address and the link to my source repository, which has an issue reporting feature, are prominently available on the image’s Docker Hub page.
Guys: I can only fix things I know of, so:

Please utilize the tools offered to get in touch!

I hope I’ll find the time to push newer versions more often and proactively in the future. If I miss something: Feel free, and actually invited, to poke me! 😉

More recent Python in Enterprise Linux like CentOS and RHEL

This article describes what “Enterprise Linux” is and how to add a more recent version of Python to it than those available in the base package repository.

What is “Enterprise Linux”?

General definition

CentOS and RedHat Enterprise Linux (RHEL) are both counted among the so-called “Enterprise Linux” systems. This term is an artificial noun with different meanings. In general, it describes Linux distributions which are targeted at the commercial market, thus putting a strong focus on reliability and long lifecycles.
CentOS, RHEL and SUSE Linux Enterprise Server (SLES) usually maintain a release for 10 years; RHEL and SLES even offer extended support contracts for additional years of support. That means these distributions support a version for at least twice as long as Ubuntu LTS versions do (which usually is ~5 years).

The biggest strength of this kind of distribution is often also one of its biggest downsides: If you want a more recent version of any of the software it contains, you are often out of luck. More recent versions (if any are available at all) usually come from 3rd party repositories. One of the most famous ones for CentOS is EPEL (Extra Packages for Enterprise Linux), which ports a lot of “had expected that to be available”-like packages from Fedora to Enterprise Linux.
But the more repositories you add, the more unpredictable and unreliable the core becomes.

RedHat based definition

But there’s also another meaning: The term “Enterprise Linux” has also become established as a term to group distributions which are based on blueprints from RedHat Enterprise Linux or built like it. Some sources (like EPEL, for example) refer to this kind of distribution this way. This list normally includes (but is not limited to):

RedHat Family Tree

This does not mean in any way that other distributions, not based on RHEL, are not enterprise-class or enterprise-ready!

What are Enterprise Linux distributions used for?

Enterprise Linux distributions are often used in large-scale IT environments with several hundred or thousand hosts. In this kind of environment, reproducibility (by orchestration/automation), reliability, compatibility and hardened concepts and versions are key aspects.

In large orchestrated IT environments, CentOS is often chosen as the base distribution since, being an Enterprise Linux distribution, it prioritizes being a rock-stable “enterprise-class” platform over delivering the latest upstream versions of its software selection. It also aims at being binary compatible with RedHat Enterprise Linux (RHEL) while being free of charge and only community supported.

Talking about CentOS …

 

I will stick to CentOS here since this is the Enterprise Linux I use the most. But since we are talking about “Enterprise Linux” here, the following should largely apply to similar distributions as well.

At the time of this writing, CentOS 7 is the latest release of the distribution; it was released in 07/2014. It will receive full updates until Q4/2020 and stay maintained (provided with Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) only) until 07/2024.
CentOS 6 is also still around; released in 07/2011, its maintenance will continue until 11/2020.
CentOS 5 (released in 04/2007) reached the end of its maintenance timeframe in 03/2017 and is considered unsupported.
Thus, I will only consider CentOS 7 and 6 here.
For details, please see CentOS Product Specifications and Red Hat Enterprise Linux Life Cycle.

Talking about Python …

System Default and base repo

CentOS uses Python a lot for its command line tools. In fact, its primary package manager yum depends heavily on Python 2 (2.7.5 in CentOS 7, 2.6.6 in CentOS 6, provided by the package named “python” from the “base” repo), and uninstalling it by force is a reliable way to render your package management useless. This version of Python also comes with some modules installed which are hard to find in the most common locations like PyPI, including:

  • yum-metadata-parser
  • slip.dbus

In short: You do not want to mess around with this system interpreter! 💣

What about Python 3?

Python 3 is not available in the base repositories of either CentOS 6 or 7.

Python 3.0 was released in 12/2008.
CentOS 6 was released in 07/2011.
CentOS 7 was released in 07/2014.

Normally one would consider that to be a fair amount of time to add any release of a major technology to even an Enterprise Linux, especially with the Python 2 End of Life in sight for 2020. But as you can see, it hasn’t been done yet. Remember what I said about the downsides of an Enterprise Linux? Here’s an example 😉
But you still have choices.

Using EPEL

The most convenient way to get Python 3 for any supported Enterprise Linux is by adding the EPEL repository to your system. This also has the benefit that its usage is quite common, so the risk of ending up with an overly customized system is not that high.
Also, EPEL’s Guidelines and Policies aim to not interfere with any base package and have quite a strict upgrade policy.
Finally, the project is quite close to RedHat’s own development, since it was born out of Fedora, which is sponsored by RedHat, and it is aimed at use with their enterprise distribution as well.

EPEL provides Python 3.4.5 for both CentOS 6 and 7.

If that satisfies your needs, this is quite a low hanging fruit. You add EPEL and install Python 3 from it like this:
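As a sketch, and assuming the package names epel-release (which CentOS ships in its own “extras” repository) and python34 (EPEL’s name for its Python 3.4 build), the two steps can be wrapped like this. The function name and the optional command prefix are my own additions for illustration:

```bash
#!/usr/bin/env bash

# Enable EPEL and install Python 3.4 from it.
# Any arguments are used as a command prefix, e.g. `sudo` for a real run
# or `echo` for a dry run that only prints the commands.
# Assumption: the packages are named epel-release and python34.
install_python3_from_epel() {
    "$@" yum install -y epel-release
    "$@" yum install -y python34
}
```

`install_python3_from_epel echo` only prints the two yum commands, while `install_python3_from_epel sudo` executes them. Afterwards the interpreter should be available as python3.4, next to the untouched system Python 2.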

Using IUS

IUS (Inline with Upstream Stable) is a project which is sponsored by Rackspace. It aims at providing more recent versions of some major key software packages, including Python.

Its goals and philosophy are very close to those of EPEL in not interfering with base packages. Its naming convention makes sure that even if an equivalent package ever shows up in base, it will never conflict with those provided by IUS.
It considers itself a “SafeRepo” and compares itself with EPEL here; feel free to read this resource if you want to learn more.

IUS provides several versions of Python:

  • 3.4.7 (package python34u)
  • 3.5.4 (package python35u)
  • 3.6.4 (package python36u)

If this satisfies your needs, this is also quite easy to achieve by issuing the following:
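IUS is enabled by installing its release package first. Because the location of that RPM depends on your distribution release, this sketch takes it as a parameter instead of assuming a URL; look it up on the IUS project page. The function name and the optional command prefix are hypothetical, while the package names are the ones listed above:

```bash
#!/usr/bin/env bash

# Enable IUS and install one of its Python packages.
#   $1  URL or path of the IUS release RPM (not assumed here, look it up)
#   $2  package to install, e.g. python34u, python35u or python36u
#   $3… optional command prefix, e.g. `sudo`, or `echo` for a dry run
install_python_from_ius() {
    local release_rpm=$1 package=$2
    shift 2
    "$@" yum install -y "$release_rpm"
    "$@" yum install -y "$package"
}
```

For example, `install_python_from_ius "$IUS_RELEASE_RPM" python36u sudo` would install Python 3.6, with IUS_RELEASE_RPM being a variable you set to the release RPM location yourself.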

Installing from source

This is by far the most flexible approach, but also the most cumbersome one. Also, this is not an “Enterprise Linux” specific task but can be done on any Linux system in more or less the same way.
Please consider that this is something of an anti-pattern if you think about the goal and philosophy behind Enterprise Linux distributions.

By compiling from source, you can freely decide which version of Python you want to use. But at the same time, you sacrifice all benefits package managers have to offer, including:

  • Ease of installation
  • Package QA and reviewing workflow
  • Being supplied with security updates
  • Integration with and availability to the package manager, which is often integrated into other orchestration tools

This approach will not be described here since that has already been done countless times (see here, or here, or any of the other ~973,000 results from a Google search) and since it is not really an Enterprise Linux approach.

Hello World!

Time to start a new Blog with a “Hello World!” post!

Let’s start with some technical, personal and historical background about my IT journey so far (even though no one will be interested enough in this to read the whole article. And you know why I do it anyway? Because I decide so; it’s only up to me to decide what is put here and what isn’t, and there’s nothing you can do about it 😛).
You digital-native-social-network-scum can search for some “Dislike 👎 or Report 📢” buttons as long as you like. This page was made by elders for elders (quoth the 35-year-old author), who went through the dark ages of the Internet, including connection breakdowns because some other member of the household picked up the phone and interrupted your dial-up connection.

My first steps into IT

I started to explore computers in 1994/1995 …