Score:0

How to sort files into directories and subdirectories by name?


I'm new to Ubuntu and also to the programming world, and I don't really know how to do this!

I have a folder with lots of files named as follows:

000_S_001_mpc_asd.json
000_S_001_mpc_asd.nii
000_S_001_mpc_asd_aa.nii
011_S_001_mpc_asd.json
011_S_001_mpc_asd.nii
011_S_001_mpc_asd_aa.nii
000_S_002_mpc_asd.json
000_S_002_mpc_asd.nii
000_S_002_mpc_asd_aa.nii
000_S_001_dtd_rty.bval
000_S_001_dtd_rty.bvec
000_S_001_dtd_rty.nii
000_S_001_dtd_rty.json
011_S_001_dtd_rty.bval
011_S_001_dtd_rty.bvec
011_S_001_dtd_rty.nii
011_S_001_dtd_rty.json
000_S_002_dtd_rty.bval
000_S_002_dtd_rty.bvec
000_S_002_dtd_rty.nii
000_S_002_dtd_rty.json
011_S_001_flf_lkj.json
011_S_001_flf_lkj.nii
011_S_001_flf_lkj_aa.nii
000_S_001_flf_lkj.json
000_S_001_flf_lkj.nii
000_S_001_flf_lkj_aa.nii
000_S_002_flf_lkj.nii
000_S_002_flf_lkj_aa.nii

Let's say xxx_S_xxx is the principal name, and the rest of the file's name gives secondary information (let's call it a secondary name).

I would like to find a specific keyword in the secondary name (for example mpc, dtd or flf) and make a folder with that name, then make subfolders inside it named after the principal name of each file, and put the respective files into those subfolders. Probably an image will explain better what I'm trying to say.

So for example, the output for the names I gave you above would look like this:

Desired output:

Is this possible to do from the terminal? I would appreciate your help.

My OS is Ubuntu 20.04 LTS

Score:2

In cases where the target directory structure does not exist, you can chop the path into pieces and reassemble them in the desired order.

#!/bin/bash

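while IFS=_ read -r a b c d e; do
    # a still carries the "files/" prefix, so strip it for the directory name
    dest="target/$d/${a##*/}_${b}_${c}"
    mkdir -p "$dest" && mv -t "$dest" "${a}_${b}_${c}_${d}_${e}"
done < <(printf '%s\n' files/*)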
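To see what the loop does with a single name, the split can be tried on its own. This is only a sketch and not part of the answer's script; the function name `split_name` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not from the answer): show how IFS=_ splits one path
# and which target directory the loop would build from it.
split_name() {
    local a b c d e
    # With five variables, the last one (e) receives the rest of the line,
    # including any further underscores.
    IFS=_ read -r a b c d e <<< "$1"
    # ${a##*/} strips everything up to the last slash, i.e. the "files/" prefix.
    printf '%s\n' "target/$d/${a##*/}_${b}_${c}"
}

split_name 'files/000_S_001_mpc_asd_aa.nii'
```

For the sample name this prints `target/mpc/000_S_001`. Note that the approach takes the keyword from the fourth underscore-separated field, so it assumes the keyword is always in that position.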

Or use a regular expression, where the array BASH_REMATCH records the matched parts. The first array member (index 0) contains the part that matches the whole regular expression. Substrings matched by parenthesized subexpressions are assigned to the following members.

#!/bin/bash

for i in files/*; do
    if [[ $i =~ ^files/([0-9]+_S_[0-9]+)_(mpc|dtd|flf) ]]; then
        # BASH_REMATCH[1] is the principal name, BASH_REMATCH[2] the keyword
        dest="target/${BASH_REMATCH[2]}/${BASH_REMATCH[1]}"
        mkdir -p "$dest" && mv -t "$dest" "$i"
    fi
done
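The contents of BASH_REMATCH can be checked on one path before moving anything. The following is a small sketch under the same pattern as above; `target_for` is a hypothetical helper name, not something from the answer.

```shell
#!/bin/bash
# Hypothetical helper (not from the answer): print the target directory
# that the regular expression would pick for one path.
target_for() {
    # Keeping the pattern in a variable avoids quoting surprises inside [[ ]].
    local re='^files/([0-9]+_S_[0-9]+)_(mpc|dtd|flf)'
    if [[ $1 =~ $re ]]; then
        # [0] = whole match, [1] = principal name, [2] = keyword
        printf '%s\n' "target/${BASH_REMATCH[2]}/${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

target_for 'files/011_S_001_flf_lkj_aa.nii'
target_for 'files/notes.txt' || echo 'no match: files/notes.txt'
```

The first call prints `target/flf/011_S_001`; the second path does not match the pattern, so the fallback message is printed and the file would simply be left where it is.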

You can also divide the process into two steps: first create the directory structure, using awk in combination with xargs to keep the number of mkdir calls low, and then use, e.g., mmv for the renaming.

#!/bin/bash

# Stop here if the directory is missing, so nothing runs in the wrong place.
builtin cd files || exit 1

printf %s\\0 * | \
awk -F _ ' \
    BEGIN{ RS = ORS = "\0" } { printf("../target/%s/%s_%s_%s\0", $4, $1, $2, $3) } \
' | xargs -0 mkdir -p

# Remember we have changed our working directory.
mmv -m '*_S_*_*_*' '../target/#3/#1_S_#2/#1_S_#2_#3_#4'
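Before running the two steps for real, the directory list can be previewed. This is a sketch that reuses the same awk field layout but with newline-separated names (so it assumes no file name contains a newline) and prints the directories instead of creating them; `plan_dirs` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: list the directories that step one would create.
# plan_dirs reads file names from standard input, one per line.
plan_dirs() {
    # Fields split on "_": $1_$2_$3 is the principal name, $4 the keyword.
    awk -F _ '{ printf("../target/%s/%s_%s_%s\n", $4, $1, $2, $3) }' | sort -u
}

printf '%s\n' 000_S_001_mpc_asd.json 000_S_001_mpc_asd.nii 011_S_001_dtd_rty.bval |
    plan_dirs
```

For these three names the preview lists `../target/dtd/011_S_001` and `../target/mpc/000_S_001`, each once, which is what `mkdir -p` would receive.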
Score:0

I'm not quite sure how to do it purely in the terminal without it getting too difficult to read, but I think you could get the result you are looking for, or at least get started, with something along these lines:

Edit: updated with info from the comment. Also swapped secondary and primary since those were backwards.

Edit 2: realized that while the secondary name no longer relied on placement, the principal name still relied on the secondary's placement.

#!/bin/bash

input_directory="/path/to/your/data"
output_directory="/path/to/your/output"

# stop if the input directory does not exist
cd "$input_directory" || exit 1
for file in *; do

    if [ ! -f "$file" ]; then
        continue;
    fi

    # place your secondary names inside "(mpc|flf|dtd)" separated by '|'
    # head -n 1 keeps only the first match if a pattern matches more than once
    secondary=$(echo "$file" | grep -o -E "(mpc|flf|dtd)" | head -n 1);
    principal=$(echo "$file" | grep -o -E "([0-9]+_S_[0-9]+)" | head -n 1);

    # skip over and alert that a secondary match was not found for a file
    if [ -z "$secondary" ]; then
        echo "No secondary match found!! Skipping $file";
        continue;
    fi

    destination="${output_directory}/$secondary/$principal"

    # create the directories if they don't exist
    if [ ! -d "$destination" ]; then
        mkdir -p "$destination";
    fi

    # uncomment to move the files to the new directories if the test output
    # from echo is correct
    #mv "$file" "$destination"

    # test to print result of moving the files
    from=$(readlink -f "$file")
    echo "$from -> $destination/$file"
done

grep -o -E "(mpc|flf|dtd)" searches the filename for one of the secondary name keywords (mpc, dtd or flf) and saves the matched word to the secondary variable.

grep -o -E "([0-9]+_S_[0-9]+)" works the same way, but looks for the xxx_S_xxx pattern.

It can be run as: bash script.sh

The input_directory and output_directory variables will need to be filled in with the correct paths. Also, the fields "(mpc|flf|dtd)" in the grep statement can be filled in with other secondaries.
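The two grep calls can be tried on one file name before running the whole script. The following is a minimal sketch; `extract_parts` is a hypothetical helper, and the sample name deliberately has the keyword in a different position to show that placement does not matter.

```shell
#!/bin/bash
# Sketch: what the two grep calls extract from one file name.
# extract_parts is a hypothetical helper, not part of the script above.
extract_parts() {
    local secondary principal
    # -o prints only the matching part; head -n 1 keeps the first match.
    secondary=$(printf '%s\n' "$1" | grep -o -E '(mpc|flf|dtd)' | head -n 1)
    principal=$(printf '%s\n' "$1" | grep -o -E '[0-9]+_S_[0-9]+' | head -n 1)
    printf '%s/%s\n' "$secondary" "$principal"
}

extract_parts '000_S_001_asd_mpc.json'
```

This prints `mpc/000_S_001`, the same relative path the script appends to output_directory.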

Al_Mt: I can't get anything, probably I'm doing something wrong. I saved this code as a .sh file and run it from terminal ./code.sh, am I wrong?
wizardpurple: @Al_Mt I added an example run command. It won't create the directories, or move/copy the files over. It will just print a sort of test result. I wasn't sure where you would want it output, if you wanted the files moved or copied, etc.
Al_Mt: Thank you @wizardpurple. Well I really don't know anything about bash or shell so I just tried to run your code. I'd like to move the files to the new corresponding directories. I think is quite accurate, the only thing is that sometimes the element of the secondary name I want to get is not in position 4th but it can vary, so I'd prefer get the specific words I want (a kind of find mpc, dtd or flf) instead of a position. Could you help me with that?
wizardpurple: @Al_Mt the secondary shouldn't rely on placement any more. It should only print a test to check for correctness until the line with `mv` is uncommented.
Al_Mt: this is of great help!! Thank you so much!