Score:9

Grep ignore array of patterns

kg flag

Since I learned some bash syntax, I have been very enthusiastic about using it in daily life. A famous command is grep. In case one wishes to grep for something but ignore several files, the commands below MAY work.

grep_ignore=("token_a", "token_b")
grep -rnw . -e "token2" | grep -v <(printf '%s\n' "${grep_ignore[@]}")

How to reproduce:

  1. Create a dummy folder and enter it: mkdir dummy && cd dummy

  2. Create files:

    a. file_token_a.txt: echo "token1 token2" > file_token_a.txt

    b. file_token_b.txt: echo "token1 token3" > file_token_b.txt

    c. file_token_c.txt: echo "token2 token3" > file_token_c.txt

Command run:

grep_ignore=("token_a", "token_b")
grep -rnw . -e "token2" | grep -v <(printf '%s\n' "${grep_ignore[@]}")

Expected output:

./file_token_c.txt:1:token2 token3

Actual output:

./file_token_c.txt:1:token2 token3
./file_token_a.txt:1:token1 token2
Score:12
hr flag

There are two issues with your attempt:

  1. your array construction has an erroneous comma, which makes the first pattern "token_a," (with a trailing comma) instead of "token_a"

  2. <(printf '%s\n' "${grep_ignore[@]}") is being passed to grep -v as a pattern consisting of the process substitution's file descriptor path, such as /dev/fd/63 (see note 1 below), rather than as a list of patterns. To have patterns read from a file (or process substitution), you need to make it an argument to the -f option
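The first issue can be checked directly by printing the array elements. The following is a minimal sketch; the variable names with_comma and without_comma are made up for illustration:

```shell
#!/usr/bin/env bash
# In a bash array, whitespace separates elements; a comma is just another
# character and becomes part of the element it is attached to.
with_comma=("token_a", "token_b")
without_comma=("token_a" "token_b")

# Wrap each element in brackets so stray characters are easy to spot.
with_comma_elems=$(printf '[%s]' "${with_comma[@]}")
without_comma_elems=$(printf '[%s]' "${without_comma[@]}")

echo "$with_comma_elems"      # [token_a,][token_b]
echo "$without_comma_elems"   # [token_a][token_b]
```

The trailing comma in the first element is why the file_token_a.txt line would not be filtered out even once -f is used.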

Correcting for these:

grep_ignore=("token_a" "token_b")

then

$ grep -rnw . -e "token2" | grep -vFf <(printf '%s\n' "${grep_ignore[@]}")
./file_token_c.txt:1:token2 token3

(the -F says to treat the array elements as fixed strings rather than regular expressions).
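One consequence of -F is that glob characters such as * are matched literally, so the elements should be plain substrings rather than file-name globs. The sketch below illustrates this on two sample output lines; the variable names (hits, plain, globs) are invented for the example:

```shell
#!/usr/bin/env bash
# Sample "path:line:text" lines, as the first grep would print them.
hits=$'./file_token_a.txt:1:token1 token2\n./file_token_c.txt:1:token2 token3'

plain=("token_a" "token_b")        # literal substrings
globs=("*token_a*" "*token_b*")    # file-name globs, not suitable for -F

# With -F, each element is matched as a literal substring of the whole line.
kept_plain=$(printf '%s\n' "$hits" | grep -vFf <(printf '%s\n' "${plain[@]}"))

# The asterisks are literal under -F, so no line matches and none is dropped.
kept_globs=$(printf '%s\n' "$hits" | grep -vFf <(printf '%s\n' "${globs[@]}"))

echo "$kept_plain"    # only the file_token_c.txt line
echo "$kept_globs"    # both lines, unfiltered
```

Note also that the second grep filters whole output lines, so a hit whose text (not just its path) contains an ignored token would be dropped as well.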


Alternatively, at least in GNU grep, you can use --exclude (and --include) to limit the match to specific file subsets to avoid the second grep altogether. So using your example above:

$ grep -rnw . -e "token2"
./file_token_a.txt:1:token1 token2
./file_token_c.txt:1:token2 token3

but given an array of filename patterns (note the elements are separated by whitespace not commas):

grep_ignore=("*token_a*" "*token_b*")

then

$ grep -rnw . -e "token2" "${grep_ignore[@]/#/--exclude=}"
./file_token_c.txt:1:token2 token3

where the array parameter expansion ${grep_ignore[@]/#/--exclude=} expands as follows:

$ printf '%s\n' "${grep_ignore[@]/#/--exclude=}"
--exclude=*token_a*
--exclude=*token_b*

Alternatively you could use a brace expansion instead of an array:

grep -rnw . -e "token2" --exclude={"*token_a*","*token_b*"}
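To see what the brace expansion produces before handing it to grep, it can be printed with printf in the same way as the array expansion. A minimal sketch (the variable name expanded is only for illustration):

```shell
#!/usr/bin/env bash
# The braces and the comma must stay unquoted for brace expansion to happen;
# the patterns inside are quoted so the shell does not glob them.
expanded=$(printf '%s\n' --exclude={"*token_a*","*token_b*"})

echo "$expanded"
# --exclude=*token_a*
# --exclude=*token_b*
```

Each alternative inside the braces becomes its own --exclude= argument, which is the same result the array expansion gives.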

  1. You can see this by enabling command tracing with set -x, which makes the shell print each command after expansion, for example:

     $ grep -rnw . -e "token2" | grep -v <(printf '%s\n' "${grep_ignore[@]}")
     + grep --color=auto -rnw . -e token2
     + grep --color=auto -v /dev/fd/63
     ++ printf '%s\n'
     ./file_token_a.txt:1:token1 token2
     ./file_token_c.txt:1:token2 token3
    

    Note how the grep command has become grep --color=auto -v /dev/fd/63? You can further confirm that it's treating /dev/fd/63 as a pattern rather than a pseudo-file as follows:

     printf '%s\n' /dev/fd/{61..65} | 
       grep -v <(printf '%s\n' "${grep_ignore[@]}")
    

    (you'll see that /dev/fd/63 gets filtered out).
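The same point can be shown without set -x. In the sketch below, echo receives the process substitution as an ordinary argument and prints its path, while cat opens that path and reads the content. The exact path is system dependent (typically /dev/fd/N on Linux), and the variable names are made up for the example:

```shell
#!/usr/bin/env bash
# echo does not open its arguments, so it prints the substituted path itself.
sub_path=$(echo <(printf 'token_a\n'))

# cat opens the path and reads what printf wrote, as grep -f would.
sub_content=$(cat <(printf 'token_a\n'))

echo "$sub_path"      # a path such as /dev/fd/63
echo "$sub_content"   # token_a
```

Without -f, grep is in the same position as echo here: it only ever sees the path string.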

Bruno Peixoto avatar
kg flag
Very nice, sir! :-)
hr flag
@BrunoPeixoto you're welcome - I just added a note about brace expansion (which you may find more intuitive than the array expansion)
Bruno Peixoto avatar
kg flag
I found this documentation for brace expansion: https://www.gnu.org/software/bash/manual/html_node/Brace-Expansion.html
pierrely avatar
cn flag
@Bruno, then consider upvoting the answer; I just did
Bruno Peixoto avatar
kg flag
Where is the answer you provided? I cannot see it
hr flag
@BrunoPeixoto I realized your original approach would work - with a couple of corrections. Please see the edited answer.
Bruno Peixoto avatar
kg flag
I am very annoyed with close-enough-solutions, although most of mine are of these kinds. ;-P
Bruno Peixoto avatar
kg flag
You divide the answer into 3 blocks. Block 3 assumes I know what "set -x" is and does not provide the expected output. There are still opportunities to improve the answer.
Bruno Peixoto avatar
kg flag
@steeldriver you can use project root "https://github.com/quivero/prego" to test the commands below. The array values are not filtered out. :-( `grep_ignores=( "*node_modules*" "*.git*" "*package-lock*" "*codecov*" "*scripts*" )` `grep -rnw . -e "@babel/cli" | grep -vFf <(printf '%s\n' "${grep_ignores[@]}")`
Bruno Peixoto avatar
kg flag
Some holy enlightenment made me change `grep -vFf` to `grep -vEf`. It worked like a charm!
Score:1
za flag
grep -rnw token2 *.txt | grep -E -v "(token_a|token_b)"

seems to me to be a simpler approach than handling arrays.

Use grep with -E for extended regular expressions, so you can use the alternation (OR) operator, as in "(token_a|token_b)".
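If the tokens already live in an array, the alternation pattern can be built from it instead of being typed by hand. This is only a sketch; the names pattern, hits and kept are invented for the example, and the tokens are assumed to contain no regex metacharacters:

```shell
#!/usr/bin/env bash
grep_ignore=("token_a" "token_b")

# "${array[*]}" joins the elements with the first character of IFS;
# the command substitution runs in a subshell, so IFS is only changed there.
pattern=$( IFS='|'; echo "(${grep_ignore[*]})" )

# Sample "path:line:text" lines, as the first grep would print them.
hits=$'./file_token_a.txt:1:token1 token2\n./file_token_c.txt:1:token2 token3'
kept=$(printf '%s\n' "$hits" | grep -E -v "$pattern")

echo "$pattern"   # (token_a|token_b)
echo "$kept"      # only the file_token_c.txt line
```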

Bruno Peixoto avatar
kg flag
The answer is a great start! We can go from it to expansions `{}`. Thanks!