grep vs sed vs awk: what a proficient/advanced Linux shell user should know

How many times have we used grep to narrow a search on a Linux file system? Almost every average Linux user knows grep and its basic feature set. To recap: the name comes from g/re/p, the ed command that globally searches for a regular expression and prints the matching lines. The name is practically a manifesto.

The Linux ecosystem has two other very useful and powerful tools for pattern searching: sed, which stands for stream editor, and awk, which is named after its creators, Aho, Weinberger and Kernighan.


Given the three, what is the main difference between them, and what is each one best used for? Both questions are answered below.

  • grep. A fast and powerful pattern search tool that is easily combined with other filters to refine results and customize the display, although its main aim remains searching for matches. Its main use is narrowing search results to the lines that match the given pattern.
  • sed. A fast stream editor, able to search for a pattern and apply the given transformations and/or commands; still easy to combine into sophisticated filters, but serving a different aim: modifying the text in the stream. Its main use is editing a stream in memory according to the given pattern.
  • awk. A loosely typed programming language for stream processing, where the basic unit is the string (an array of characters) that can be i. matched, ii. substituted and iii. otherwise manipulated; most of the time there is no real need to combine awk with other filters, since its reporting capabilities are very powerful (the printf built-in function formats output text as in C). Its main use is performing fine-grained (variables can be defined and modified incrementally) and programmatic (flow control statements) manipulations of the input stream.

According to the above definitions, the three tools serve different purposes, can still be used in combination and, as said, all work by matching patterns. The difference between sed and awk may still not be clear, so let’s clarify with examples.


Input Data

total 68
-rw-rw-r--. 1 pmaresca pmaresca 49 Mar 21 20:34 blanks
-rw-rw-r--. 1 pmaresca pmaresca 36257 Mar 22 20:05 commands
-rw-rw-r--. 1 pmaresca pmaresca 79 Mar 20 23:18 json
-rw-rw-r--. 1 pmaresca pmaresca 37 Mar 21 20:44 keyvalue
-rw-rw-r--. 1 pmaresca pmaresca 873 Mar 21 22:51 menu_json
-rw-rw-r--. 1 pmaresca pmaresca 85 Mar 22 18:41 phones
-rw-rw-r--. 1 pmaresca pmaresca 16 Mar 21 19:01 sum
-rw-rw-r--. 1 pmaresca pmaresca 67 Mar 22 18:31 telephones
-rw-rw-r--. 1 pmaresca pmaresca 199 Mar 22 14:21 test

Processing – Take the ‘ls’ output and grep for a pattern ‘b.+s’

 ls -l | grep -E 'b.+s' 

Output Data

-rw-rw-r--. 1 pmaresca pmaresca 49 Mar 21 20:34 blanks
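
The same search can be varied with a few common grep options. The following is only a sketch: the `listing` function is a hypothetical stand-in for the 'ls -l' output above (three of its lines), and `-o` is an extension available in GNU and BSD grep rather than a POSIX option.

```shell
# Hypothetical helper: prints three lines copied from the listing above.
listing() {
  printf '%s\n' \
    '-rw-rw-r--. 1 pmaresca pmaresca 49 Mar 21 20:34 blanks' \
    '-rw-rw-r--. 1 pmaresca pmaresca 85 Mar 22 18:41 phones' \
    '-rw-rw-r--. 1 pmaresca pmaresca 16 Mar 21 19:01 sum'
}

listing | grep -E 'b.+s'      # print the lines that match
listing | grep -cE 'b.+s'     # count the matching lines instead of printing them
listing | grep -vE 'b.+s'     # invert the match: print the lines that do NOT match
listing | grep -oE 'b.+s$'    # print only the matched text (GNU/BSD extension)
```

In every variant grep only selects what to show; it never changes the text, which is where sed comes in.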


Input Data – ‘phones’


Processing – take some US phone numbers as input and split each of them into i. Area, ii. Second and iii. Third


 sed -e 's/\(^.*)\)\(.*-\)\(.*$\)/Area: \1 Second: \2 Third: \3/g' phones 

Output Data

Area: (555) Second: 555- Third: 1212
Area: (555) Second: 555- Third: 1213
Area: (555) Second: 555- Third: 1214
Area: (666) Second: 555- Third: 1215
Area: (666) Second: 555- Third: 1216
Area: (777) Second: 555- Third: 1217
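
The content of 'phones' is not shown above, so the sketch below assumes lines of the form (555)555-1212; both the sample lines and the function names are hypothetical. It uses `-E` (extended regular expressions, supported by GNU and BSD sed) so that groups need no backslashes, and explicit digit classes instead of `.*` so that only well-formed numbers are rewritten.

```shell
# Assumed input format; these sample lines are illustrative, not the real file.
phones_sample() {
  printf '%s\n' '(555)555-1212' '(666)555-1215' 'not a phone number'
}

# Three capture groups: area code with parentheses, prefix with dash, last four digits.
# Lines that do not match the pattern pass through unchanged.
split_phones() {
  sed -E 's/^(\([0-9]{3}\))([0-9]{3}-)([0-9]{4})$/Area: \1 Second: \2 Third: \3/'
}

phones_sample | split_phones
```

The stricter pattern is a design choice: with `.*` any line containing a ')' and a '-' would be rewritten, while here unrelated lines are left alone.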


Input Data – ‘menu_json’

{"menu": {
   "header": "SVG Viewer",
   "items": [
     {"id": "Open"},
     {"id": "OpenNew", "label": "Open New"},
     {"id": "ZoomIn", "label": "Zoom In"},
     {"id": "ZoomOut", "label": "Zoom Out"},
     {"id": "OriginalView", "label": "Original View"},
     {"id": "Quality"}, 
     {"id": "Pause"},
     {"id": "Mute"},
     {"id": "Find", "label": "Find..."},
     {"id": "FindAgain", "label": "Find Again"},
     {"id": "Copy"},
     {"id": "CopyAgain", "label": "Copy Again"},
     {"id": "CopySVG", "label": "Copy SVG"},
     {"id": "ViewSVG", "label": "View SVG"},
     {"id": "ViewSource", "label": "View Source"},
     {"id": "SaveAs", "label": "Save As"},
     {"id": "Help"},
     {"id": "About", "label": "About Adobe CVG Viewer..."}

Processing – take the menu data as input, extract the IDs (the first value of each item) and build a set of shell exports

 awk 'BEGIN { sum = 0 }
 /id/ {
     sum += 1
     gsub(/[",}]/, "")                      # drop quotes, commas and closing braces
     sub(/[{]id:/, "export VAR_" sum "=")   # turn the key into the export prefix
     printf("%s %s\"%s\"\n", $1, $2, $3)    # $3 is the first word of the id value
 }
 END { print "Total", sum }' menu_json

Output Data

export VAR_1="Open"
export VAR_2="OpenNew"
export VAR_3="ZoomIn"
export VAR_4="ZoomOut"
export VAR_5="OriginalView"
export VAR_6="Quality"
export VAR_7="Pause"
export VAR_8="Mute"
export VAR_9="Find"
export VAR_10="FindAgain"
export VAR_11="Copy"
export VAR_12="CopyAgain"
export VAR_13="CopySVG"
export VAR_14="ViewSVG"
export VAR_15="ViewSource"
export VAR_16="SaveAs"
export VAR_17="Help"
export VAR_18="About"
Total 18
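
The same extraction can also be sketched by letting awk split each line on the double quote, which avoids the gsub/sub clean-up. This is an alternative under stated assumptions, not the command used above: it assumes every item sits on its own line with "id" as its first key, and `menu_sample` and `ids_to_exports` are hypothetical names.

```shell
# Hypothetical excerpt of menu_json; only the line layout matters here.
menu_sample() {
  printf '%s\n' \
    '   "header": "SVG Viewer",' \
    '     {"id": "Open"},' \
    '     {"id": "OpenNew", "label": "Open New"},'
}

# With the double quote as field separator, $2 is the first key and $4 its value.
ids_to_exports() {
  awk -F'"' '$2 == "id" { n++; printf "export VAR_%d=\"%s\"\n", n, $4 }
             END        { print "Total", n + 0 }'
}

menu_sample | ids_to_exports
```

Testing `$2 == "id"` instead of matching /id/ anywhere on the line also keeps a line such as a hypothetical "width": ... entry from being counted by accident.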


To conclude this short post: from the examples above, awk’s capabilities shine. It is a programming language, with an admittedly awkward syntax, that allows advanced in-memory modifications and powerful reporting. As seen, sed can modify the text, but it is not designed for programmatic processing the way awk is; this is precisely what makes it a powerful stream editor, to be used as such.
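
The contrast can be sketched on a single input. The `listing` function is a hypothetical stand-in for three lines of the 'ls -l' output used earlier, and `size_report` is an illustrative name: sed rewrites the text of each line, while awk treats the columns as data, accumulates a total and formats a report.

```shell
# Hypothetical helper: prints three lines copied from the earlier listing.
listing() {
  printf '%s\n' \
    '-rw-rw-r--. 1 pmaresca pmaresca 49 Mar 21 20:34 blanks' \
    '-rw-rw-r--. 1 pmaresca pmaresca 85 Mar 22 18:41 phones' \
    '-rw-rw-r--. 1 pmaresca pmaresca 16 Mar 21 19:01 sum'
}

# sed: edit each line as text (here, replace the owner and group name).
listing | sed 's/pmaresca/user/g'

# awk: column 5 is the size and column 9 the file name; sum the sizes and report.
size_report() {
  awk '{ total += $5; printf "%-10s %6d\n", $9, $5 }
       END { printf "%-10s %6d\n", "TOTAL", total }'
}

listing | size_report
```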


