Regular Expressions and Stream Editor
Even if you are not an expert with UNIX commands, you have probably encountered grep or sed for searching, replacing, or adding text. grep only searches, while sed can also transform the text it matches, and both accept options that can speed up your workflow.
grep stands for "global regular expression print". It is a command-line utility originally written for UNIX, and it is also available on Unix-like operating systems such as Linux and macOS.
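As a small illustration of what grep does, the sketch below pipes a few invented lines into it; the sample words and the variable names matches and starts are assumptions made for this example only.

```shell
# Print only the lines that contain "an" (the input is invented sample data).
matches=$(printf 'apple\nbanana\ncherry\nmango\n' | grep 'an')
echo "$matches"

# -E enables extended regular expressions: keep lines starting with a or c.
starts=$(printf 'apple\nbanana\ncherry\nmango\n' | grep -E '^(a|c)')
echo "$starts"
```

The first command prints banana and mango, and the second prints apple and cherry. In both cases grep only reports matching lines and changes nothing.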
sed is a non-interactive stream editor: it reads its input line by line, applies the editing commands you give it, and writes the result to standard output. By default it does not modify your files unless you explicitly instruct it to do so. First developed in 1973 by Lee E. McMahon of Bell Labs, sed is designed for text transformations.
With that background, let us now take a look at what a typical sed command looks like.
sed -r 's/REGEX/TEXT/' file.txt
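REGEX in this command is the regular expression describing the text pattern you are trying to match, and TEXT is what replaces it. The -r option enables extended regular expressions in GNU sed (-E is an equivalent spelling). Only the first match on each line is replaced, the output is printed to the console, and file.txt itself is left unchanged.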
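As a minimal sketch of this form, the following pipes one made-up line into sed instead of reading file.txt; the sample text and the variable name result are assumptions for illustration, and GNU sed is assumed for -r.

```shell
# Replace the first run of digits on the line with the word NUMBER.
# The input line is invented sample data, piped in instead of read from a file.
result=$(printf 'order 66 shipped\n' | sed -r 's/[0-9]+/NUMBER/')
echo "$result"
```

Here [0-9]+ plays the role of REGEX and NUMBER the role of TEXT, so the command prints "order NUMBER shipped".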
Now imagine we have a list of names, one per line, with the last name first and the first name second, separated by a comma (for example Doe,John). If we want to print each name as first name then last name, separated by a space, we could use the following command:
sed -r 's/^(.*),(.*)$/\2 \1/' names.txt
The first .* expression matches zero or more characters in front of the comma, here the last name, and wrapping it in parentheses saves the matched text so that we can refer to it later. The second parenthesized .* does the same for the first name after the comma. In the replacement, the escape character \ followed by a number n refers to the text saved by the nth pair of parentheses, so \2 \1 prints the second group, a space, and then the first group. If you parenthesize more expressions, you can refer to them in the same way.
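To see the swap on concrete data, here is a short sketch that builds a throwaway names.txt; the two names and the variable names tmpdir and swapped are hypothetical, and GNU sed is assumed for -r.

```shell
# Build a small sample names.txt in a temporary directory (hypothetical data).
tmpdir=$(mktemp -d)
printf 'Doe,John\nSmith,Anna\n' > "$tmpdir/names.txt"

# Group 1 is the text before the comma, group 2 the text after it.
swapped=$(sed -r 's/^(.*),(.*)$/\2 \1/' "$tmpdir/names.txt")
echo "$swapped"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

This prints "John Doe" and "Anna Smith" on separate lines, while the sample file keeps its original content until it is deleted.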
Inconsistent formats can also be handled with sed. The * used in the example above means zero or more repetitions of the preceding character or group. There is also ?, which means zero or one repetition, and +, which means one or more.
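The sketch below shows how these quantifiers can absorb small inconsistencies; the sample lines and the variable names names and colors are invented for the example, and GNU sed is assumed for -r.

```shell
# Sample lines with zero, one, or several spaces after the comma (invented data).
# " *" absorbs any number of spaces, and ".+" requires at least one character.
names=$(printf 'Doe,John\nSmith, Anna\nLee,   Bo\n' | sed -r 's/^(.+), *(.+)$/\2 \1/')
echo "$names"

# "u?" makes the u optional, so both spellings end up as "color".
colors=$(printf 'color\ncolour\n' | sed -r 's/colou?r/color/')
echo "$colors"
```

All three name lines come out in the same "First Last" shape regardless of the spacing after the comma.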
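Sometimes we do want our revisions to modify the file itself, and remember that by default sed only prints its result and leaves the file untouched. This can easily be changed by adding the -i (in-place) option to the command. Note that a capital I placed after the third / does something different: in GNU sed it makes the match case-insensitive.
sed -i -r 's/^(.*),(.*)$/\2 \1/' names.txt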
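A cautious way to try in-place editing is on a throwaway copy, as in this sketch. It assumes GNU sed (BSD and macOS sed expect a backup suffix argument after -i), and the sample names and the variable names tmpdir and edited are hypothetical.

```shell
# Work on a throwaway file so no real data is touched (hypothetical data).
tmpdir=$(mktemp -d)
printf 'Doe,John\nSmith,Anna\n' > "$tmpdir/names.txt"

# -i rewrites the file itself; nothing is printed to the console.
sed -i -r 's/^(.*),(.*)$/\2 \1/' "$tmpdir/names.txt"

# Read the file back to confirm that its content has changed.
edited=$(cat "$tmpdir/names.txt")
echo "$edited"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

With GNU sed, writing -i.bak instead of -i keeps the original content in a names.txt.bak file next to the edited one, which is a safer habit for files that matter.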
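Flags appended after the third / set other conditions. For example, g performs a global replacement, changing every match on a line rather than only the first, and it can be combined with the -i option to update the file in place, like the following. (Because this particular pattern is anchored with ^ and $, it can match only once per line, so g makes no visible difference here.)
sed -i -r 's/^(.*),(.*)$/\2 \1/g' names.txt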
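The effect of g is easier to see with a pattern that can match several times on one line. The following sketch uses an invented input line, and first_only and all are hypothetical variable names.

```shell
# Without g, only the first comma on the line is replaced.
first_only=$(printf 'a,b,c\n' | sed -r 's/,/ /')
echo "$first_only"

# With g, every comma on the line is replaced.
all=$(printf 'a,b,c\n' | sed -r 's/,/ /g')
echo "$all"
```

The first command prints "a b,c" and the second prints "a b c".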
Originally published at http://xinyix.wordpress.com on February 18, 2021.