Regular Expressions in Linux Explained with Examples

Regular expressions (regexps) are one of the more advanced concepts we need in order to write efficient shell scripts and administer systems effectively.

  • For easier understanding, regular expressions can be divided into 3 types.
  1. Basic Regular expressions
  2. Interval Regular expressions (Use option -E for grep and -r for sed)
  3. Extended Regular expressions (Use option -E for grep and -r for sed)
  • What is a Regular expression?

A regular expression is a pattern that describes a set of strings; tools use it to find matching text in a given string.

  • Which commands/programming languages support regular expressions?

vi, rename, grep, sed, awk, Perl, Python, etc. (tr understands only character sets and ranges, not full regular expressions.)

BASIC REGULAR EXPRESSIONS

  • This set contains the most basic regular expressions, which do not require any extra options to use.
  • These regular expressions were developed a long time ago.

^ – Caret: matches the beginning of a line

$ – Matches the end of a line

* – Matches 0 or more occurrences of the previous character

. – Matches any single character

[] – Matches any one character from a set or range

[^char] – Matches any one character which is not in the set

\<word\> – Matches a whole word

\ – Escape character
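
The ^ and . operators from the list above can be tried safely on inline text. The following is only a sketch: the sample lines and variable names are made up for illustration, and printf stands in for a real file.

```shell
# Hypothetical sample lines; printf stands in for a real file.
sample='start here
restart
cat
cut
ct'

# ^ anchors the match to the beginning of the line.
starts=$(printf '%s\n' "$sample" | grep '^start')
echo "$starts"

# . matches exactly one character; -x makes grep match the whole line.
dot=$(printf '%s\n' "$sample" | grep -x 'c.t')
echo "$dot"
```

Here restart is skipped because start is not at the beginning of the line, and ct is skipped because . needs one character between c and t.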


$ REGULAR EXPRESSION

  • Match all the files whose names end with sh

bash code
ls -l | grep 'sh$'

As $ indicates the end of the line, the above command lists all the files whose names end with sh.

How about finding lines in a file which end with dead?

bash code
grep 'dead$' filename

How about finding empty lines in a file?

bash code
grep '^$' filename
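
To try the $ patterns without a real file, a sketch like the one below can help. The sample text and variable names are hypothetical; only the two grep patterns come from the examples above.

```shell
# Hypothetical sample text; the fourth line is empty on purpose.
sample='he is dead
deadline is near
the battery is dead

not dead yet'

# Lines that end with "dead".
ends_with_dead=$(printf '%s\n' "$sample" | grep 'dead$')
echo "$ends_with_dead"

# Number of empty lines: ^ followed directly by $ leaves no room for any character.
empty_count=$(printf '%s\n' "$sample" | grep -c '^$')
echo "$empty_count"
```

The lines "deadline is near" and "not dead yet" contain dead but do not end with it, so they are not printed.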

* REGULAR EXPRESSION

Example: Match all files which have twt, twet, tweet, etc. in the file name.

bash code
ls -l | grep 'twe*t'

How about searching a file for the word apple where it has been misspelled as ale, aple, appple, apppple, and so on? To find all of these patterns:

bash code
grep 'ap*le' filename

Note that the above pattern also matches ale, because * means 0 or more occurrences of the previous character.
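
The effect of "0 or more" can be checked on a small word list. This is a sketch with made-up words and variable names; -x is used so that each pattern has to match the whole line.

```shell
# Hypothetical word list, one word per line.
words='ale
aple
apple
appple
able
axle'

# p* allows zero p characters, so ale is matched too.
zero_or_more=$(printf '%s\n' "$words" | grep -x 'ap*le')
echo "$zero_or_more"

# Writing pp* requires at least one p, which leaves ale out.
one_or_more=$(printf '%s\n' "$words" | grep -x 'app*le')
echo "$one_or_more"
```

Writing the character once before the starred copy is the usual way to say "1 or more" when only basic regular expressions are available.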

[^CHAR] REGULAR EXPRESSION

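Example: Match all the file names which do not have a, b, or c in them

bash code
ls | grep '^[^abc]*$'

[^abc] matches any single character other than a, b, or c. Anchored with ^ and $ and repeated with *, it lists only the file names which contain none of those characters. Without the anchors, grep '[^abc]' prints every name that has at least one character other than a, b, or c.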
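
A negated set matches one character at a time, so it helps to compare what it selects with and without anchors. The sketch below uses made-up names in place of ls output, and the variable names are hypothetical.

```shell
# Hypothetical file names standing in for the output of ls.
names='abc
cab
xyz
data
notes'

# Unanchored: any name with at least one character other than a, b, or c.
has_other=$(printf '%s\n' "$names" | grep '[^abc]')
echo "$has_other"

# Anchored and repeated: every character must be something other than a, b, or c.
none_abc=$(printf '%s\n' "$names" | grep '^[^abc]*$')
echo "$none_abc"

# Same result by inverting a plain set with -v.
inverted=$(printf '%s\n' "$names" | grep -v '[abc]')
echo "$inverted"
```

Note that data is printed by the first command even though it contains a, because d and t are enough to satisfy the unanchored [^abc].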

\<WORD\> REGULAR EXPRESSION

Example: Search for the word abc on its own; abcxyz or readabc should not appear in the output

bash code
grep '\<abc\>' filename
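
Whole-word matching can be tried on inline text as follows. This sketch assumes GNU grep, where \< and \> mark the start and end of a word; the widely supported -w option is shown as an alternative. The sample lines and variable names are made up.

```shell
# Hypothetical sample lines.
sample='abc
abcxyz
readabc
an abc here'

# \< and \> match at the start and end of a word (assumes GNU grep).
word_hits=$(printf '%s\n' "$sample" | grep '\<abc\>')
echo "$word_hits"

# -w asks grep to match whole words only.
w_hits=$(printf '%s\n' "$sample" | grep -w 'abc')
echo "$w_hits"
```

Both commands print only the lines where abc stands alone as a word.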

ESCAPE REGULAR EXPRESSION

Example: Find lines in a file which contain [. As [ is a special character, we have to escape it

bash code

grep "\[" filename

or

grep '[[]' filename
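
The ways of matching a literal [ can be compared side by side. The sketch below uses made-up sample lines and variable names; the -F option, which tells grep to treat the pattern as a fixed string, is an addition to the two forms shown above.

```shell
# Hypothetical sample lines.
sample='file[1].txt
plain.txt
array[0]'

# Backslash removes the special meaning of [.
escaped=$(printf '%s\n' "$sample" | grep '\[')
echo "$escaped"

# Inside a bracket set, [ is an ordinary character.
in_set=$(printf '%s\n' "$sample" | grep '[[]')
echo "$in_set"

# -F disables regular expression handling entirely.
fixed=$(printf '%s\n' "$sample" | grep -F '[')
echo "$fixed"
```

All three commands select the same two lines; -F is convenient when the search text contains several special characters.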

[] SQUARE BRACES/BRACKETS REGULAR EXPRESSION

Example: Find all the files which contain a single digit between a and x in the file name

bash code
ls -l | grep 'a[0-9]x'

This will match file names such as
a0xsdf
asda1xsdfas
..
..
asdfdsara9xsdf
etc.
  • [0-9] matches exactly one digit, so a name matches wherever a is followed by a single digit and then x.
  • Some more examples of the range operator:
  • [a-z] – Matches any single character between a and z.
  • [A-Z] – Matches any single character between A and Z.
  • [0-9] – Matches any single character between 0 and 9.
  • [a-zA-Z0-9] – Matches any single character which is either a to z, A to Z, or 0 to 9.
  • [!@#$%^] – Matches any one of the characters !, @, #, $, %, or ^.
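
That a bracket expression stands for exactly one character can be checked with a few names. This is a sketch only: the names abx and a12x and the variable names are made up, while the other names come from the list above.

```shell
# Sample file names; abx and a12x are hypothetical additions.
names='a0xsdf
asda1xsdfas
abx
a12x
asdfdsara9xsdf'

# a, then exactly one digit, then x, anywhere in the name.
hits=$(printf '%s\n' "$names" | grep 'a[0-9]x')
echo "$hits"
```

abx is skipped because b is not a digit, and a12x is skipped because two digits sit between a and x while [0-9] accounts for only one.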

 
