How to exclude patterns, files and directories with grep

Since 1974, the grep command has been helping people find lines in files. But sometimes grep is too thorough. Here are some ways to tell grep to ignore different things.

The grep command

The grep command searches text files for lines that match the search patterns you provide on the command line. The power of grep lies in its use of regular expressions. They let you describe what you are looking for instead of having to define it explicitly.

The birth of grep predates Linux. It was developed in the early 1970s for Unix. It takes its name from the g/re/p key sequence in the ed line editor (pronounced “ee-dee”, by the way). This stood for global, regular expression search, print matching lines.

grep is known, perhaps notorious, for being thorough and single-minded. Sometimes it searches files or directories that you’d rather it didn’t waste time on, because the results can make it impossible to see the forest for the trees.

Of course, there are ways to control grep. You can tell it to ignore patterns, files, and directories so that grep searches faster and you’re not inundated with meaningless false positives.

Pattern exclusion

To search with grep, you can pipe input to it from some other process, such as cat, or you can provide a filename as the last command-line argument.

We are using a short file containing the text of the poem Jabberwocky by Lewis Carroll. In these two examples, we are looking for lines that match the search term “Jabberwock”.

cat jabberwocky.txt | grep "Jabberwock"

grep "Jabberwock" jabberwocky.txt

Lines that contain a match for the search term are listed for us, with the matching text in each line highlighted in red. This is a straightforward search. But what if we want to exclude the lines containing the word “Jabberwock” and print the rest?

We can achieve this with the -v (invert match) option. This lists the lines that do not match the search term.

grep -v "Jabberwock" jabberwocky.txt

Lines that do not contain “Jabberwock” are displayed in the terminal window.
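
As a minimal sketch of the difference between a normal and an inverted match, the snippet below builds a hypothetical four-line sample file (standing in for jabberwocky.txt, which is not reproduced here) and runs both searches. The variable names and the temporary directory are illustrative assumptions.

```shell
# Hypothetical four-line sample standing in for jabberwocky.txt.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'Beware the Jabberwock, my son!' \
  'The jaws that bite, the claws that catch!' \
  'Beware the Jubjub bird, and shun' \
  'The frumious Bandersnatch!' > "$tmpdir/sample.txt"

# Lines that match the pattern.
matched=$(grep "Jabberwock" "$tmpdir/sample.txt")

# Lines that do NOT match the pattern.
inverted=$(grep -v "Jabberwock" "$tmpdir/sample.txt")

printf 'matched:\n%s\n\ninverted:\n%s\n' "$matched" "$inverted"
rm -rf "$tmpdir"
```

Between them, the two result sets account for every line in the file exactly once.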

We can exclude as many terms as we wish. Let’s filter out all lines containing “Jabberwock” and all lines containing “and”. For this we use the -e (expression) option. We need to repeat it for each search pattern.

grep -v -e "Jabberwock" -e "and" jabberwocky.txt

There is a corresponding drop in the number of lines in the output.

If we use the -E (extended regular expressions) option, we can combine the search patterns with “|”, which in this context does not indicate a pipe. It is the logical OR operator.

grep -Ev "Jabberwock|and" jabberwocky.txt

We get exactly the same output as with the previous, longer command.
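
To check that the two forms really are equivalent, here is a small sketch on a hypothetical sample file (an assumption, not the article's jabberwocky.txt). Note that “and” is matched as a substring, so the line containing “Bandersnatch” is excluded as well.

```shell
# Hypothetical sample file; only the second line survives both filters.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'Beware the Jabberwock, my son!' \
  'The jaws that bite, the claws that catch!' \
  'Beware the Jubjub bird, and shun' \
  'The frumious Bandersnatch!' > "$tmpdir/sample.txt"

# One -e per pattern.
with_e=$(grep -v -e "Jabberwock" -e "and" "$tmpdir/sample.txt")

# Extended regex with alternation.
with_E=$(grep -Ev "Jabberwock|and" "$tmpdir/sample.txt")

printf '%s\n' "$with_e"
rm -rf "$tmpdir"
```

If whole-word matching is what you want, the -w option restricts matches to complete words.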

The command format is the same if you want to use a regex pattern instead of an explicit search term. This command excludes all lines that start with any letter from the set “ACHT”.

grep -Ev "^[ACHT]" jabberwocky.txt
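
The square brackets matter here. A rough sketch on a hypothetical six-line sample shows the difference: with brackets, the pattern matches any one of the four letters at the start of a line. Without them, it would only match the literal text “ACHT” at the start of a line, so nothing would be excluded.

```shell
# Hypothetical sample; four lines start with A, C, H or T, two do not.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'And the mome raths outgrabe.' \
  'Come to my arms, my beamish boy!' \
  'He took his vorpal sword in hand:' \
  'The vorpal blade went snicker-snack!' \
  'One, two! One, two! And through and through' \
  'Beware the Jabberwock, my son!' > "$tmpdir/sample.txt"

# Bracket expression: drop lines starting with A, C, H or T.
kept=$(grep -Ev "^[ACHT]" "$tmpdir/sample.txt")

# No brackets: only a line starting with the literal text ACHT would be dropped.
# -c counts the lines that are kept.
literal_count=$(grep -Ecv "^ACHT" "$tmpdir/sample.txt")

printf 'kept with brackets:\n%s\n\nlines kept without brackets: %s\n' "$kept" "$literal_count"
rm -rf "$tmpdir"
```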

To see lines that contain one pattern but not another, we can pipe grep into grep. We will search for all lines containing the word “Jabberwock” and then filter out any lines that also contain the word “slain”.

grep "Jabberwock" jabberwocky.txt | grep -v "slain"
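
A minimal sketch of this two-stage filter, using a hypothetical four-line sample: the first grep selects lines, and the second removes a subset of them.

```shell
# Hypothetical sample: three lines mention the Jabberwock, one of them also says "slain".
tmpdir=$(mktemp -d)
printf '%s\n' \
  'Beware the Jabberwock, my son!' \
  'The Jabberwock, with eyes of flame,' \
  'And hast thou slain the Jabberwock?' \
  'O frabjous day! Callooh! Callay!' > "$tmpdir/sample.txt"

# Stage 1 keeps lines with "Jabberwock"; stage 2 drops those that also contain "slain".
result=$(grep "Jabberwock" "$tmpdir/sample.txt" | grep -v "slain")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```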

File exclusion

We can ask grep to search for a string or pattern in a collection of files. You could list each file on the command line, but with many files that approach doesn’t scale.

grep "vorpal" verse-1.txt verse-2.txt verse-3.txt verse-4.txt verse-5.txt verse-6.txt

Note that the name of the file containing the matched line appears at the beginning of each line of output.

To reduce typing, we can use wildcards. But this can be counterproductive. At first, it seems to work.

grep "vorpal" *.txt

However, there are other TXT files in this directory that have nothing to do with the poem. If we search for the word “sword” with the same command structure, we will get a lot of false positives.

grep "sword" *.txt

The results we want are swamped by a flood of false positives from the other TXT files.

The word “vorpal” did not match anything in those other files, but “sword” is contained in the word “password”, so it was found many times in some pseudo-log files.

We need to exclude these files. To do this, we use the --exclude option. To exclude a single file named “vol-log-1.txt” we would use this command:

grep --exclude=vol-log-1.txt "sword" *.txt

In this case, we want to exclude multiple log files with names starting with “vol”. The syntax we need is:

grep --exclude=vol*.txt "sword" *.txt
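
The effect of --exclude can be sketched with three hypothetical files in a temporary directory. This assumes GNU grep, where --exclude also applies to files named on the command line. The -l option is used here only to keep the output short: it prints the names of files that contain a match rather than the matching lines. The glob is quoted so the shell passes it to grep untouched.

```shell
# Hypothetical files: one verse and two pseudo-log files that mention "password".
tmpdir=$(mktemp -d)
printf 'He took his vorpal sword in hand:\n' > "$tmpdir/verse-1.txt"
printf 'password accepted for user dave\n' > "$tmpdir/vol-log-1.txt"
printf 'password rejected for user mary\n' > "$tmpdir/vol-log-2.txt"

# Without --exclude, the log files match because "password" contains "sword".
unfiltered=$(cd "$tmpdir" && grep -l "sword" *.txt)

# With --exclude, files whose names match vol*.txt are skipped.
filtered=$(cd "$tmpdir" && grep -l --exclude='vol*.txt' "sword" *.txt)

printf 'without --exclude:\n%s\n\nwith --exclude:\n%s\n' "$unfiltered" "$filtered"
rm -rf "$tmpdir"
```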

When we use the -R (dereference-recursive) option, grep will search entire directory trees for us. By default, it searches all the files in those locations. There may well be several types of files that we want to exclude.

Beneath the current directory on this test machine, there are subdirectories containing log files, CSV files, and MD files. These are all types of text files that we want to exclude. We could use a --exclude option for each file type, but we can achieve what we want more efficiently by grouping the file types.

This command excludes all files with .csv or .md extensions, as well as all .txt files whose names begin with “vol” or “log”.

grep -R --exclude=*.{csv,md} --exclude={vol*,log*}.txt "sword" /home/dave/data/
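
The grouping in the curly braces is expanded by the shell before grep ever runs, so grep simply receives several separate --exclude options. The sketch below assumes bash is installed (brace expansion is a bash feature, not part of POSIX sh) and that no files in the current directory happen to match these patterns. It uses printf to show the arguments grep would receive.

```shell
# printf prints each expanded argument on its own line.
expanded=$(bash -c 'printf "%s\n" --exclude=*.{csv,md} --exclude={vol*,log*}.txt')

printf '%s\n' "$expanded"
# --exclude=*.csv
# --exclude=*.md
# --exclude=vol*.txt
# --exclude=log*.txt
```

In a shell without brace expansion, the same effect can be had by writing out the four --exclude options by hand.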

Excluding directories

If the files we want to ignore are contained in directories, and those directories don’t contain the files we want to find, we can exclude those directories entirely.

The concept is very similar to excluding files, except we use the --exclude-dir option and name the directories to ignore.

grep -R --exclude-dir=backup "vorpal" /home/dave/data

We’ve excluded the “backup” directory, but are still looking in another directory named “backup2”.

Not surprisingly, we can use the --exclude-dir option multiple times in a single command. Note that the excluded directories must be given relative to the directory where the search starts. Do not use an absolute path from the root of the file system.

grep -R --exclude-dir=backup --exclude-dir=backup2 "vorpal" /home/dave/data

We can also use groupings. We can achieve the same result more concisely:

grep -R --exclude-dir={backup,backup2} "vorpal" /home/dave/data
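
Here is a rough sketch of directory exclusion on a hypothetical tree (data/poems, data/backup and data/backup2, all assumptions for illustration). It shows that excluding “backup” alone still leaves “backup2” in the search, and that naming both removes them. The -l option prints only file names, and sort makes the order predictable.

```shell
# Hypothetical directory tree with the same verse in three places.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/data/poems" "$tmpdir/data/backup" "$tmpdir/data/backup2"
for d in poems backup backup2; do
  printf 'The vorpal blade went snicker-snack!\n' > "$tmpdir/data/$d/verse.txt"
done

# Excluding only "backup": "backup2" is a different name, so it is still searched.
one_excluded=$(cd "$tmpdir" && grep -R -l --exclude-dir=backup "vorpal" data | sort)

# Excluding both directories by name.
both_excluded=$(cd "$tmpdir" && grep -R -l --exclude-dir=backup --exclude-dir=backup2 "vorpal" data | sort)

printf 'backup excluded:\n%s\n\nboth excluded:\n%s\n' "$one_excluded" "$both_excluded"
rm -rf "$tmpdir"
```

In bash, the brace form --exclude-dir={backup,backup2} expands to the same two options used in the second command.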

You can combine file and directory exclusions in one command. If you want to exclude all the files in a directory and also exclude certain file types from the directories that are searched, use this syntax:

grep -R --exclude=*.{csv,md} --exclude-dir=backup/archive "frumious" /home/dave/data
