### Learning BASH: Text Processing - HEAD & TAIL

BASH continues to surprise me with its amazing collection of simple yet extremely useful commands. They can give you a huge boost in speed and control while working. No doubt bash, along with editors like VIM, is one of the developer's favorite combinations. Gradually, you will feel the invention of the mouse was a waste, since you can pretty much control everything with just your keyboard.

Today we continue with more commands that are related to Text Processing.

The head and tail commands are used to get the contents of a file starting from the top or the bottom. Unlike the CAT command, which displays the whole content of a file, these commands give you control over how much you want to see.

Syntax: head filename, or tail filename

Note: By default, head shows the first 10 lines of a file and tail shows the last 10.

Let's say I have a text file like this.

$ cat numbers.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

Let's run the head command without any arguments other than the filename.

$ head numbers.txt
1
2
3
4
5
6
7
8
9
10


## Problem: Display first 5 lines of the file provided.

$ head -n 5 numbers.txt
1
2
3
4
5

$ tail -n 5 numbers.txt
16
17
18
19
20


The -n argument lets us specify the number of lines we want to see, counting from the first line for head and from the last line for tail.
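If you want to try -n without creating a file first, you can pipe generated input into head and tail. This is only a small sketch: it assumes the seq command is available (it ships with GNU coreutils), and the helper names first_lines and last_lines are made up for illustration.

```shell
# first_lines N: print the first N lines of standard input.
first_lines() {
    head -n "$1"
}

# last_lines N: print the last N lines of standard input.
last_lines() {
    tail -n "$1"
}

# seq 1 20 prints the numbers 1 to 20, one per line (same content as numbers.txt).
seq 1 20 | first_lines 3   # prints 1, 2, 3
seq 1 20 | last_lines 3    # prints 18, 19, 20
```

Both helpers read standard input because head and tail do so when no file name is given.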

Problem: Display the first 12 characters of a file or a line of text.

$ echo "This is a test line with many characters" | head -c 12
This is a te

$ echo "This is a test line with many characters" | tail -c 12
characters


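Here, the -c argument is a byte count, which for plain ASCII text is the same as the character count. Note that echo appends a newline, and tail counts it as one of the 12 bytes. As the head help says,

-c, --bytes=[-]NUM       print the first NUM bytes of each file; with the leading '-', print all but the
last NUM bytes of each file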
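To see exactly which bytes -c picks, it helps to take the trailing newline out of the picture. The sketch below uses printf instead of echo for that reason. The variable name text is just for illustration, and head -c is assumed to be available (GNU and BSD head both have it, but it is not required by POSIX).

```shell
# printf '%s' does not append a newline, so every counted byte is visible text.
text="This is a test line with many characters"

printf '%s' "$text" | head -c 12   # first 12 bytes: This is a te
echo                               # finish the line
printf '%s' "$text" | tail -c 12   # last 12 bytes: y characters
echo
```

With echo instead of printf, the last 12 bytes would be a space, the word characters and the newline.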

Note: The head and tail commands do not accept a range, so you can't directly display lines n1 to n2, say the 5th to the 10th line. You can still do that by combining tail and head. We will see that later.

TIP:

One of the most popular uses of the tail command is to monitor changes to a file, for example a log file that records each activity in a piece of software or on a website.

$ tail -f log.txt

The -f stands for "follow": tail keeps the file open and prints new lines as they are appended.

Problem: Display the lines from the given file between line 10 and 15.

Solution: head gives us lines from the first line up to line n, and tail gives us n lines counting from the bottom. Since I want the numbers IN BETWEEN 10 and 15, the expected answer is 11, 12, 13, 14.

In plain words, this is our plan:

• First, get all the lines up to (15 - 1), i.e. 1 to 14.
• Then remove lines 1 to 10 from that; the rest is our answer.

$ head -n 14 numbers.txt | tail -n 4
11
12
13
14

Taking the first 15 lines and then the last 5 of those (head -n 15 numbers.txt | tail -n 5) would also include line 15.
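The head-then-tail idea can be wrapped in a small function so you don't have to redo the arithmetic each time. This is just a sketch: print_lines is a made-up name, both ends of the range are treated as inclusive, and it assumes the file has at least LAST lines. seq and mktemp are only used to build a throwaway demo file.

```shell
# print_lines FILE FIRST LAST: print lines FIRST..LAST (inclusive) of FILE.
print_lines() {
    file=$1
    first=$2
    last=$3
    count=$((last - first + 1))      # how many lines the range contains
    # Keep everything up to LAST, then keep only the final COUNT lines of that.
    head -n "$last" "$file" | tail -n "$count"
}

demo=$(mktemp)                       # temporary stand-in for numbers.txt
seq 1 20 > "$demo"
print_lines "$demo" 11 14            # prints 11, 12, 13, 14
rm -f "$demo"
```

If the file is shorter than LAST, tail will count back from the real end of the file, so the result will start earlier than FIRST.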


### Learning BASH: Text Processing - Cut Command

Text processing tools in Bash are a huge topic, so we will take it one command at a time.

## CUT COMMAND

You might think CUT means moving a file from location A to location B. But as the link here says, the cut command in Unix (or Linux) is used to select sections of text from each line of files. You can use the cut command to select fields or columns from a line by specifying a delimiter, or you can select a portion of text by specifying a range of characters. Basically, the cut command slices a line and extracts the text.

The definition of the CUT command in Linux itself says:

Print selected parts of lines from each FILE to standard output.

I created a text file, foo.txt (I am on Windows running Cygwin), and added a few lines.

This is the first line
This is the second
And this is not the last line
Finally we end
Good Bye

The linux help says:

N         N'th byte, character or field, counted from 1
N-       from N'th byte, character or field, to end of line
N-M   from N'th to M'th (included) byte, character or field
-M      from first to M'th (included) byte, character or field
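Here is a quick sketch of the four forms from the help text, applied to the first line of the sample file with the -c (character) option. The variable name line is just for illustration.

```shell
line="This is the first line"

echo "$line" | cut -c3       # N:   the 3rd character            -> i
echo "$line" | cut -c19-     # N-:  19th character to the end    -> line
echo "$line" | cut -c6-7     # N-M: 6th to 7th character         -> is
echo "$line" | cut -c-4      # -M:  first to 4th character       -> This
```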

Problem: Give me the first letter of every line.

Solution:

$ cut -c1 foo.txt
T
T
A
F
G

Analysis: -c1 means column one (1), or position 1. That's the N'th character from the help text above.

Note: Column numbering starts from 1, NOT zero (0).

Problem: Show me the first three characters of each line.

Solution:

$ cut -c1-3 foo.txt
Thi
Thi
And
Fin
Goo

$ cut -c-3 foo.txt
Thi
Thi
And
Fin
Goo

Those are two ways to do it. There may be more, but these are the easiest ways, I suppose. You can specify a RANGE: the two examples use the N-M and -M forms respectively.

Problem: Get the 3rd character of each line in a file. The file name is given as input by the user.

Solution:

read file
cut -c3 "$file"


Note that there are other ways to do this. If the text itself arrives on standard input, this is enough:

cut -c3


Note:  cut reads from standard input if the argument is "-" or absent.

## Using Delimiters

The -d option of the cut command specifies the delimiter, and the -f option specifies the field position(s).

$ cut -d' ' -f-3 foo.txt
This is the
This is the
And this is
Finally we end
Good Bye


Note: The -d option needs a delimiter to be specified. The -f option tells cut which field positions to print. Here I have used the first to the third position.

In the above example, my delimiter is a single space, and I want to see everything up to the 3rd occurrence of a space, i.e. the first three fields.
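A few more field selections may help to see how -d and -f work together. The record below is made up for illustration. The last command shows one thing worth knowing: cut treats every single delimiter as a separator, so two spaces in a row produce an empty field between them.

```shell
record="alice:x:1000:1000:/home/alice"   # hypothetical colon-separated record

echo "$record" | cut -d':' -f1       # first field        -> alice
echo "$record" | cut -d':' -f3-4     # fields 3 to 4      -> 1000:1000
echo "$record" | cut -d':' -f5-      # field 5 to the end -> /home/alice

# Two consecutive spaces: the 2nd field is the empty string between them.
echo "a  b" | cut -d' ' -f2          # prints an empty line
```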

Another example:

$ cat foo.txt
Hi:I:Am:Groot

$ cut -d':' -f1-3 foo.txt
Hi:I:Am

Problem: Given a sentence, identify and display its fourth word. Assume that the space (' ') is the only delimiter between words.

Solution:

cut -d' ' -f4


The same works for the colon example:

$ cat foo.txt
Hi:I:Am:Groot

$ cut -d':' -f4 foo.txt
Groot

Problem: Given a tab delimited file with several columns (tsv format), print the fields from the second field to the last field.

Solution:

cut -f2-

Note: The default delimiter is tab. So you DON'T need to specify a delimiter at all if the problem asks for a tab delimiter.

__________________________________________________________________________________________________

Reference: I took the problems from my favorite code competition site, Hacker Rank. Visit this link to practice more problems. Solve the first 9 problems, which are based on the CUT command for bash. Best of luck.

### Learn BASH with me in 5 mins

I just started learning Linux bash today. My first impression is that it is a language with all the basic capabilities of a young high-level language. Maybe I am right, maybe wrong. Time will tell. We will keep going and keep discovering gradually.

Let's start with the usual protocol of learning a language.

## The HELLO WORLD program.

How to print things in the shell: this is the first thing everyone wants to know while learning any language. Anything that is not a variable is printed as is, and we print/echo it using the famous ECHO command.

$ echo hello world
hello world



Printing a number.



$ echo 1
1

Printing a string with double quotes

$ echo "my name is arindam"
my name is arindam

Printing a string with single quotes

$ echo 'my name is Arindam'
my name is Arindam

Printing a number with quotes

$ echo '1'
1


## Creating  Variables and recalling them.

So how can we store things? How do we recall a stored value, and how do we change it?

X=999

Note: There should be no spaces around the assignment operator (=). Also, the assignment statement prints nothing when it is executed.

$ X=999
$ echo $X
999
$ $X
bash: 999: command not found

A simple = sign works great for assigning values, but the spacing around it is important: there must be none, otherwise you will get an error. To recall the value inside a variable, use the $ sign.
If you don't use the echo command and try to print the value with just the $ sign, bash tries to run the value as a command and fails, as the last line above shows (people coming from languages like Python would understand why someone would try such a thing).

## Saving Strings in variable

$ X=arin

$ echo $X
arin

$ x="hi world"
$ x=hi world


You can store a single word without using quotes. But if the value contains spaces, then you need to use quotes. Otherwise bash breaks "x=hi world" into two parts: the assignment x=hi and a command named world. Obviously this doesn't work.
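A short sketch of the quoting rules, including the difference between double and single quotes when you recall a variable. The variable names greeting and word are just for illustration.

```shell
greeting="hi world"        # quotes keep the space inside the value
word=arin                  # a single word needs no quotes

echo "$greeting"           # hi world
echo "$word"               # arin

# Double quotes expand variables; single quotes keep the text literal.
echo "value: $word"        # value: arin
echo 'value: $word'        # value: $word
```

Quoting the variable when you use it ("$greeting") is a good habit, because it keeps a value with spaces together as one argument.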

## Dynamically changing value

$ X=99
$ echo $X
99

$ echo $((X+1))
100


What happened here? I wanted to use the variable X and get an incremented value of the same.
You need to use double parentheses, $((...)), in these cases. Note that this won't change the value of X to the new value.
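To make that last point concrete, here is a small sketch showing that $((...)) only computes a value, and that you have to assign the result back if you want X to change.

```shell
X=99
echo $((X+1))    # prints 100; X itself is still 99
echo "$X"        # prints 99

X=$((X+1))       # assign the result back to actually change X
echo "$X"        # prints 100
```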

## Using Bash as a calculator

$ echo $((X*2))+$X
198+99
$ echo $(($((X*2))+$X))
297

This is probably overkill, but if you need to do it, this is how you can. From the first command, you can observe that bash evaluates each section separately and just substitutes the values in place, leaving the + as plain text. The second does the job because we asked bash to evaluate the whole equation using $((equation)).
I think this looks messy and risky but, just for example's sake, it works.
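A tidier way to get the same result, as far as I can tell, is to put the whole equation inside a single $((...)). Inside the double parentheses you can use variable names without the $ sign and group terms with ordinary parentheses.

```shell
X=99
echo $((X*2 + X))          # 297, one expansion instead of two nested ones
echo $(( (X + 1) * 2 ))    # 200, parentheses group as in ordinary arithmetic
```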

Another way is to use the "expr" command. Whatever is mentioned after it becomes the expression to be solved/interpreted.

$ echo $(expr 5 + 5)
10

$ n=4
$ echo $(expr 4 \* $n)
16

Note: The * must be escaped as \*, otherwise the shell expands it to the file names in the current directory before expr sees it.

## Iterations and LOOPS

Everybody loves loops. Iterations are a part of every language. Bash provides the omnipresent FOR loop and WHILE loop.

$ for num in 1 2 3
> do
> echo $num
> done
1
2
3

So here, we looped over a list of numbers.

X=1
while [ $X -le 99 ]
do
echo $X
X=$((X+2))
done


Here, I have printed the odd numbers from 1 to 99.
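The same loop shape works for accumulating a result instead of printing each value. As a small sketch, this adds up the odd numbers from 1 to 99. The variable name sum is just for illustration.

```shell
X=1
sum=0
while [ "$X" -le 99 ]
do
    sum=$((sum + X))   # add the current odd number
    X=$((X + 2))       # jump to the next odd number
done
echo "$sum"            # 2500
```

There are 50 odd numbers in that range, and their sum is 50 * 50 = 2500.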

## Accepting input

Another common feature of any language is accepting input from the user. We have the 'read' command for it.

read name
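Here is a minimal sketch of read in action. The function name greet is made up, and the input is piped in so the example runs without anyone typing. In an interactive shell you would just run read name and type the value.

```shell
# greet: read one line from standard input and print a greeting.
greet() {
    read -r name          # -r keeps backslashes in the input literal
    echo "Hello, $name"
}

echo "arindam" | greet    # Hello, arindam
```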


Now, what if the end limit of your range is inside a variable? You might think, "I'll just do {1..$N}." Sorry, that doesn't work, because bash expands braces before it expands variables. There is a better way to do this. If you know C syntax, then you must be familiar with this.

$ n=4
$ for ((i=1;i<=n;i++)); do echo $i; done
1
2
3
4

## Ranges with Step

We want to add a step value. We can do it as {start..end..step}

$ for num in {1..10..2}; do echo $num; done
1
3
5
7
9

## If condition with comparison operators

if [ $A -gt $B ]
then
echo $(($A-$B))
else
echo $(($B-$A))
fi

There are many operators available for integers: -eq, -ne, -gt, -ge, -lt and -le. For string comparisons, the operators are different. Example:

$ if [ 'Y' == 'Y' ]; then echo YES; else echo NO; fi
YES
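To put the two kinds of comparison side by side, here is a small sketch with two made-up helper functions, abs_diff and same_text. It uses a single = for the string test, which is the portable spelling. Bash also accepts == inside [ ].

```shell
# abs_diff A B: print the absolute difference of two integers.
abs_diff() {
    if [ "$1" -gt "$2" ]
    then
        echo $(($1 - $2))
    else
        echo $(($2 - $1))
    fi
}

# same_text A B: print YES if the two strings are equal, NO otherwise.
same_text() {
    if [ "$1" = "$2" ]; then echo YES; else echo NO; fi
}

abs_diff 3 10        # 7
same_text Y Y        # YES
same_text Y N        # NO
```

Quoting "$1" and "$2" keeps the test from breaking when an argument is empty or contains spaces.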