Score:0

Script to compare two files and remove matching lines from file A

I have two files. file A:

Modern Tales: Age Of Invention Playstation 4, 1935, 3478-3480
Modern Tales: Age Of Invention Xbox One, 3074
Moero Chronicle Switch, 6667, 12400, 28910, 29900, 29901, 29920
MOHAA Counter Intelligence, 
MOHAA MatchWatch, 950, 5500-5699, 13000-14000
MOHAA Missionary, 
Mokoko X Switch, 6667, 12400, 28910, 29900, 29901, 29920
Mokoko X Xbox One, 3074

and file B:

MIX Game Server
MOHAA Counter Intelligence
MOHAA MatchWatch
MOHAA Missionary
MOHAA Reverend
MoneyWorks Datacentre
Monsoon Vulkano
Moonlight Game Streaming
MY-IPCAM Anywhere

How can I remove entire lines from file A whose first column is identical to a line in file B? The final file should be:

Modern Tales: Age Of Invention Playstation 4, 1935, 3478-3480
Modern Tales: Age Of Invention Xbox One, 3074
Moero Chronicle Switch, 6667, 12400, 28910, 29900, 29901, 29920
Mokoko X Switch, 6667, 12400, 28910, 29900, 29901, 29920
Mokoko X Xbox One, 3074
Raffa avatar
jp flag
I see you asked the opposite of this question here: https://askubuntu.com/q/1459425/968501 … Just remove the `-v` option in this answer https://askubuntu.com/a/1459415/968501 if that answer worked for you in this question … or remove the `!` in the `awk` solution in this answer https://askubuntu.com/a/1459419/968501. Either one should solve your new question :-)
Score:2

You can use grep for this task. To remove the lines from fileA.txt that match lines in fileB.txt, run:

grep -v -w -F -f fileB.txt fileA.txt > fileA_filtered.txt

fileA_filtered.txt will contain the lines from fileA.txt that do not contain any line of fileB.txt as a whole-word match.
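
A minimal sketch of the command above on data shaped like the question's. The temporary directory `workdir` and the extra `Some Other Game` line are made up for illustration. It also shows a caveat: a pattern can match anywhere in a line, not only in the first column.

```shell
#!/bin/sh
# Work in a throwaway directory (hypothetical; any directory works).
workdir=$(mktemp -d)

cat > "$workdir/fileA.txt" <<'EOF'
Modern Tales: Age Of Invention Xbox One, 3074
MOHAA MatchWatch, 950, 5500-5699, 13000-14000
MOHAA Missionary,
Mokoko X Xbox One, 3074
Some Other Game, see MOHAA Missionary
EOF

cat > "$workdir/fileB.txt" <<'EOF'
MOHAA MatchWatch
MOHAA Missionary
EOF

# -F: treat patterns as fixed strings, -w: match whole words only,
# -f: read patterns from a file, -v: print the lines that do NOT match.
grep -v -w -F -f "$workdir/fileB.txt" "$workdir/fileA.txt" > "$workdir/fileA_filtered.txt"

cat "$workdir/fileA_filtered.txt"
```

Only the `Modern Tales` and `Mokoko X` lines remain. The made-up `Some Other Game` line is dropped as well, because `MOHAA Missionary` appears later in that line.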

Raffa avatar
jp flag
Do you actually need `-w` when matching a fixed string using `grep`?
hr flag
@Raffa it does seem to impose regex-style word boundary assertions even in fixed-string mode - try `echo foo | grep -Fw oo` for example
Raffa avatar
jp flag
@steeldriver It does indeed, but that kind of defeats the purpose of `-F`, and it works the same (*or at least gives the same end result*) without it ... i.e. `echo foo | grep -w oo`
Raffa avatar
jp flag
@steeldriver After testing (*judging by user CPU time when processing very large files*), it appears that `-F` and `-Fw` use very similar CPU time, while `-w` alone uses significantly more ... Therefore I assume using `-Fw` can be both harmless and justified in some cases IMHO :-)
Score:1

If you really want to match on the first comma-separated field rather than anywhere in the line you can do so with awk:

$ awk -F, 'NR==FNR{a[$0]; next} !($1 in a)' file_B file_A
Modern Tales: Age Of Invention Playstation 4, 1935, 3478-3480
Modern Tales: Age Of Invention Xbox One, 3074
Moero Chronicle Switch, 6667, 12400, 28910, 29900, 29901, 29920
Mokoko X Switch, 6667, 12400, 28910, 29900, 29901, 29920
Mokoko X Xbox One, 3074
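
To change file A itself, as the question asks, one possible sketch is to write the awk output to a temporary file and then move it over the original. The sample data, the `workdir` temporary directory and the `file_A.tmp` name are assumptions for illustration.

```shell
#!/bin/sh
# Throwaway directory with small sample files (hypothetical data).
workdir=$(mktemp -d)

printf '%s\n' \
  'Moero Chronicle Switch, 6667, 12400' \
  'MOHAA Counter Intelligence, ' \
  'MOHAA MatchWatch, 950, 5500-5699' \
  'Mokoko X Xbox One, 3074' > "$workdir/file_A"

printf '%s\n' \
  'MOHAA Counter Intelligence' \
  'MOHAA MatchWatch' \
  'MOHAA Reverend' > "$workdir/file_B"

# While reading the first file (file_B), NR==FNR holds: remember each line.
# While reading file_A, print a line only if its first comma-separated
# field was not seen in file_B.
# Replace file_A only if awk succeeded.
awk -F, 'NR==FNR{a[$0]; next} !($1 in a)' "$workdir/file_B" "$workdir/file_A" \
  > "$workdir/file_A.tmp" && mv "$workdir/file_A.tmp" "$workdir/file_A"

cat "$workdir/file_A"
```

One thing to watch: if `file_B` is empty, `NR==FNR` is also true for every line of `file_A`, so nothing is printed and this sketch would replace `file_A` with an empty file. Check for that case first if it can occur.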

Alternatively, since your files appear to be pre-sorted on the first field, you could use the join command:

$ join -t, -j 1 -v 1 file_A file_B
Modern Tales: Age Of Invention Playstation 4, 1935, 3478-3480
Modern Tales: Age Of Invention Xbox One, 3074
Moero Chronicle Switch, 6667, 12400, 28910, 29900, 29901, 29920
Mokoko X Switch, 6667, 12400, 28910, 29900, 29901, 29920
Mokoko X Xbox One, 3074
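
`join` expects both inputs to be sorted on the join field, in the same collation order that `join` itself uses. If that is not guaranteed, a sketch like the following sorts copies first under a fixed `LC_ALL=C` locale. The sample data, the `workdir` temporary directory and the `A.sorted`, `B.sorted` and `result` names are assumptions for illustration.

```shell
#!/bin/sh
# Throwaway directory with deliberately unsorted sample files (hypothetical data).
workdir=$(mktemp -d)

printf '%s\n' \
  'Mokoko X Xbox One, 3074' \
  'MOHAA MatchWatch, 950, 5500-5699' \
  'Moero Chronicle Switch, 6667, 12400' \
  'MOHAA Missionary, ' > "$workdir/file_A"

printf '%s\n' \
  'MOHAA Missionary' \
  'MY-IPCAM Anywhere' \
  'MOHAA MatchWatch' > "$workdir/file_B"

# Sort both files on the first comma-separated field, in the C locale,
# so that sort and join agree on the ordering.
LC_ALL=C sort -t, -k1,1 "$workdir/file_A" > "$workdir/A.sorted"
LC_ALL=C sort -t, -k1,1 "$workdir/file_B" > "$workdir/B.sorted"

# -v 1: print only the lines of the first file with no match in the second.
LC_ALL=C join -t, -j 1 -v 1 "$workdir/A.sorted" "$workdir/B.sorted" > "$workdir/result"

cat "$workdir/result"
```

The result comes out in C-locale order, which may differ from the order of the original file A.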