Score:-1

How to filter rows in a CSV file with bash based on two conditions?


I'm doing a project in which I have to parse a CSV file (the Indian Liver Patient Dataset), and I'm trying to change the position of one column: the second-to-last column must become the last one. I'm following this approach, but I don't know if it is the right one:

while IFS="," read -r col1 col2 col9 col8 col
do
    echo "$col1, $col2, $col9, $col8"
done < <(cut -d "," --fields=1,2,9,8 csvfile)

I also need to separate the "Male" and "Female" rows (col2) and show only the rows where col9 = 3. The desired output is:

Women
38,Female,3, 5.6
38,Female,3, 5.6
32,Female,3, 6

and so on

Men
72,Male,3, 7.4
60,Male,3, 6.3
33,Male,3, 5.4

and so on

How can I do that without using grep or awk?

muru
Why can't you use grep or awk?
tucomax
Project rules. I can't change that.
muru
What project rules allow `cut` but not `awk`?
Score:0

I agree with muru: disallowing the best-suited tools is not optimal, though the rule probably has its purpose. I don't think it's possible to do this in a single loop, at least not without sorting the file first or dropping the header. With an associative array, it's possible to simulate "group by": the key is Female or Male, and the fields of every matching row are "serialized" into its value. In the first loop, _ is used to skip fields; the second loop iterates over the keys and formats the output.

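#!/bin/bash

# A collects the matching rows per gender; B maps each gender to its heading.
declare -A A=()
declare -A B=([Male]=Men [Female]=Women)

# Read columns 1, 2, 8 and 9 into a, b, c and d; every _ discards a field.
while IFS=, read -r a b _ _ _ _ _ c d _ ; do
    # Keep only rows whose column 9 is 3, stored in the order col1 col2 col9 col8.
    [[ $d = 3 ]] && A[$b]+=" $a $b $d $c"
done < file.csv

for e in "${!A[@]}"; do
    # Print the heading, preceded by a blank line for every group after the first.
    printf '%s%s\n' "$nl" "${B[$e]}"
    # ${A[$e]} is left unquoted on purpose: word splitting yields four fields
    # per row, and printf reuses the format for each group of four.
    printf '%s, %s, %s, %s\n' ${A[$e]}; nl=$'\n'
done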
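
The second printf relies on the fact that printf reuses its format string until all arguments are consumed. Here is a minimal sketch of just that behaviour, assuming a hypothetical two-row value in the same shape as the strings stored in the array (the variable name rows and its contents are only for illustration):

```shell
#!/bin/bash

# Hypothetical sample: two rows flattened into one space-separated string,
# in the same shape as the values stored in the associative array.
rows=" 38 Female 3 5.6 32 Female 3 6"

# $rows is deliberately unquoted so that it splits into eight words;
# printf consumes four per pass and reuses the format for the rest.
out=$(printf '%s, %s, %s, %s\n' $rows)
printf '%s\n' "$out"
```

Each pass over the format prints one row, so no inner loop is needed. This only works as long as no field contains whitespace or glob characters, which seems to hold for this dataset.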
tucomax
This answer has all the features I was looking for. Thank you, I can see my mistakes now.
Score:0

I'd stick an IF statement around the echo and append to separate files.

Before the read loop starts:

# quietly erase CSV files left over from a previous run
rm -f col2eq8.csv col2noteq8.csv

Inside your read loop:

# if $col2 equals 8
if [[ "$col2" -eq 8 ]]
then
  # then re-order columns and append to col2eq8.csv file
  echo "$col1, $col2, $col9, $col8" >> col2eq8.csv
else
  # else re-order columns and append to col2noteq8.csv
  echo "$col1, $col2, $col9, $col8" >> col2noteq8.csv
fi

Change the two echo commands to get just the fields you want in the order you want.

If you need to separate the rows based on other columns, change the "$col2" -eq 8 test to whatever condition you want.

For other bash-only CSV manipulations, see Bash CSV parsing.
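
Putting the pieces together for the Male/Female case, a rough sketch could look like the following. It assumes the dataset's 11-column layout with a header row. The sample rows, the abbreviated header names and the temporary files are made up for illustration and are not taken from the real file. The output format copies the one shown in the question.

```shell
#!/bin/bash

# Hypothetical sample data: a header plus four invented rows, 11 columns each.
csv=$(mktemp)
cat > "$csv" <<'EOF'
Age,Gender,TB,DB,Alkphos,Sgpt,Sgot,TP,ALB,AG_Ratio,Dataset
38,Female,0.9,0.3,168,28,34,5.6,3,0.9,1
72,Male,3.9,2,195,27,59,7.4,3,0.9,1
45,Male,1.1,0.4,200,30,40,6.8,3.3,1,2
32,Female,0.7,0.2,150,20,25,6,3,1,1
EOF

women=$(mktemp)   # rows where col2 is Female and col9 is 3
men=$(mktemp)     # rows where col2 is Male and col9 is 3

# Read columns 1, 2, 8 and 9; each _ discards a field we do not need.
while IFS=, read -r col1 col2 _ _ _ _ _ col8 col9 _; do
    # String comparison, because col9 can hold decimals such as 3.3.
    if [ "$col2" = "Female" ] && [ "$col9" = "3" ]; then
        echo "$col1,$col2,$col9, $col8" >> "$women"
    elif [ "$col2" = "Male" ] && [ "$col9" = "3" ]; then
        echo "$col1,$col2,$col9, $col8" >> "$men"
    fi
done < "$csv"

# Print each group under its heading, separated by a blank line.
out=$(echo "Women"; cat "$women"; echo; echo "Men"; cat "$men")
printf '%s\n' "$out"

rm -f "$csv" "$women" "$men"
```

The header row is skipped without special handling because its second field is neither Male nor Female. For the real file, replace "$csv" with the path to your CSV and drop the sample-data block.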

tucomax
Thank you for your answer. The thing is that I made a mistake: it should be col2 instead of col8. Every row in that column is either Male or Female, and I have to separate them, putting all the Males together and all the Females together. Finally, I have to show only the rows where col9 equals 3.
You can vary your if statements, e.g. col2 is Male and (&&) col9 is 3: if [[ "$col2" == "Male" && "$col9" == 3 ]] (a string comparison, since -eq raises an error on non-integer values such as 3.3)
tucomax
This is a great answer too. It helped me a lot with the task.