A tool to split a tree into subtree groups

Let's say I have a text file with an exported list of filenames (e.g. files that were created too long ago or that are too large). The list is pretty huge (100k+ files). I don't have direct filesystem access to the files (e.g. they are in a cluster). A sample list would be:

/cluster/team-1/file001
/cluster/team-1/file002
/cluster/team-1/file003
/cluster/team-2/subteam-A/dump
/cluster/team-2/subteam-B/exportlist.txt
/cluster/team-3/2021/11/05/dump
/cluster/team-3/2021/11/04/dump
/cluster/team-3/2021/10/30/dump
/cluster/team-3/2021/09/30/dump
/cluster/team-4/project-foo/x
/cluster/team-4/project-foo/y
/cluster/team-4/project-foo/z
/cluster/team-4/project-bar/i
/cluster/team-4/project-bar/j
/cluster/team-4/project-bar/k

I would like to generate a list of prefixes that group those files logically. There seems to be some structure in the tree, but probably nothing that can be reasonably automated. From the sample above, one grouping would be:

/cluster/team-1/*
/cluster/team-2/subteam-A/*
/cluster/team-2/subteam-B/*
/cluster/team-3/*
/cluster/team-4/project-foo/*
/cluster/team-4/project-bar/*

With such a list of groups I can then tackle each group separately (e.g. inspect those files or reach out to the team that owns them).
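To make the goal concrete, here is a minimal Python sketch, not a finished tool. It assumes the exported list has already been read into a list of strings. The function names `count_per_directory` and `group_by_prefix` are made up for illustration. The first counts how many files sit under each directory, which may help when picking candidate prefixes by hand. The second assigns every file to the longest matching prefix from a hand-picked list and reports anything left over. Choosing the prefixes themselves stays manual.

```python
from collections import Counter, defaultdict


def count_per_directory(paths):
    """Count how many files live under each ancestor directory."""
    counts = Counter()
    for path in paths:
        parts = path.strip("/").split("/")
        # Every proper prefix of the path is an ancestor directory.
        for depth in range(1, len(parts)):
            counts["/" + "/".join(parts[:depth]) + "/"] += 1
    return counts


def group_by_prefix(paths, prefixes):
    """Assign each path to the longest matching prefix.

    Prefixes may be written as "/a/b/*" or "/a/b"; both are normalised
    to "/a/b/" so that "/a/b" does not accidentally match "/a/bc/file".
    """
    dirs = sorted(
        {p.rstrip("*").rstrip("/") + "/" for p in prefixes},
        key=len,
        reverse=True,  # longest first, so the most specific group wins
    )
    groups = defaultdict(list)
    ungrouped = []
    for path in paths:
        for d in dirs:
            if path.startswith(d):
                groups[d + "*"].append(path)
                break
        else:
            ungrouped.append(path)  # no prefix covers this file yet
    return dict(groups), ungrouped
```

A possible workflow would be to look at the counts, write down a few prefixes, run `group_by_prefix`, and repeat until the ungrouped list is empty.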

What tool would you use to create such a mapping?

What I have tried so far:

  • vim -- it requires a lot of ad-hoc macros/searches to remove files that are in the same group. It can get the job done, but it does not feel like the right tool.
  • fzf -- it allows selecting multiple files/directories, which are then printed to the output, and it has fuzzy search. It fits well if you are searching for something rather than selecting a grouping.
  • broot -- it is great for exploring a tree structure. However, it does not allow reading the list of entries from a text file instead of the filesystem.

Ideally there would be a tool similar to Mercurial's interactive commit selection:

Screenshot of hg commit -i

There you can select a whole file, a chunk, or individual lines. A similar tool could help select directories, which would then become the groupings.
