Score:1

Ubuntu

Variables for constructing patterns for use with sed

Fatipati

3/1/23, 1:33 PM

I have written a bash function to print sections of text enclosed between lines matching ## mode: org and ## # End of org in a file, with an empty line between sections. Before the ##, there can be any number of spaces.

Here is an example of a file to extract information from.

file: test.sh

## mode: org
## * Using case statement
## # End of org
case $arg in
 ("V")
   echo "Author"
   ;;
 (*)
   ## mode: org
   ## ** Silent Error Reporting Mode (SERM) in getopts
   ## *** Detects warnings without printing built-in messages.
   ## *** Enabled by colon {:} as first character in shortopts.
   ## # End of org
   break
   ;;
esac

The desired output would be

Code:

* Using case statement

** Silent Error Reporting Mode (SERM) in getopts
*** Detects warnings without printing built-in messages.
*** Enabled by colon {:} as first character in shortopts.

Here is the function I am using

capture-org ()
{
  local efile="$1"

  local begsec="## mode: org"
  local endsec="## # End of org"

  sed -n "/^[[:space:]]*${begsec}$/,/^[[:space:]]*${endsec}$/s/ *//p" "$efile" |
   sed 's/^'"${begsec}"'$/\n'"${begsec}"'/' |
   sed '/^'"${begsec}"'$/d' | sed '/^'"${endsec}"'$/d' | cut -c 3-
}

I would like to simplify the function, using variables to construct the patterns, but I need some help combining the commands so that I do not have to call sed so many times.

Perhaps using awk would be a better strategy.

capture-org ()
{
  local efile="$1"

  local begsec='^[[:space:]]*## mode: org$'
  local endsec='^[[:space:]]*## # End of org$'

  sed -n "/${begsec}/,/${endsec}/s/ *//p" "$efile" |
   sed 's/^## # End of org$/## # End of org\n/' |
   sed '/^## mode: org$/d' | sed '/^## # End of org$/d' | cut -c 3-
}

bash

sed

Score:1

Ubuntu

terdon

3/1/23, 5:45 PM

I would indeed use something more sophisticated for this. Like awk:

$ awk -v start="$begsec" -v end="$endsec" \
    '{ 
        if($0~start){want=1; next} 
        if($0~end){want=0; print ""; next} 
        sub(/^[[:space:]]*#+[[:space:]]*/,""); 
     } want' file
* Using case statement

** Silent Error Reporting Mode (SERM) in getopts
*** Detects warnings without printing built-in messages.
*** Enabled by colon {:} as first character in shortopts.

Or, using your last function there as a template:

capture-rec ()
{

  local begsec='## mode: org'
  local endsec='## # End of org'

  awk -v start="$begsec" -v end="$endsec" \
    '{ 
        if($0~start){want=1; next} 
        if($0~end){want=0; print ""; next} 
        sub(/^[[:space:]]*#+[[:space:]]*/,""); 
     } want' "$1"
}

One caveat that may be important is that this doesn't require $begsec and $endsec to be the only things on the line other than leading whitespace, as your approach did; it simply searches for them anywhere on the line. I am assuming this isn't a very big deal considering what you are looking for, but if it is, you can use this instead, which removes whitespace at the beginning and end of the line before matching:

capture-rec ()
{

  local begsec='## mode: org'
  local endsec='## # End of org'

    awk -v start="$begsec" -v end="$endsec" \
    '{ 
        sub(/^[[:space:]]*/,"");
        sub(/[[:space:]]*$/,"");
        if($0==start){ want=1; next} 
        if($0==end){   want=0; print ""; next} 
        sub(/^#+[[:space:]]*/,""); 
     } want' "$1"
}
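
To check the whole-line approach against the case raised in the comments (running it on a file that also contains the `local begsec=...` assignments), a small self-test along these lines may help. This is only a sketch: the name `capture_rec_strict` and the sample file contents are made up for illustration, and `[ \t]` is used instead of `\s` on the assumption that the installed awk may not be GNU awk.

```shell
#!/usr/bin/env bash
# Hypothetical self-test; capture_rec_strict is an illustrative name,
# not something from the original answer.
capture_rec_strict() {
  local begsec='## mode: org'
  local endsec='## # End of org'

  awk -v start="$begsec" -v end="$endsec" '
    {
      line = $0
      # Trim leading and trailing blanks before comparing.
      sub(/^[ \t]+/, "", line)
      sub(/[ \t]+$/, "", line)
    }
    line == start { want = 1; next }            # marker must be the whole line
    line == end   { want = 0; print ""; next }  # blank line closes a section
    want {
      sub(/^#+[ \t]*/, "", line)                # drop the leading comment prefix
      print line
    }
  ' "$1"
}

# Sample input (made up) that also contains the marker text inside assignments.
sample=$(mktemp)
cat > "$sample" <<'EOF'
local begsec='## mode: org'
echo "should not be printed"
local endsec='## # End of org'
## mode: org
## * Using case statement
## # End of org
   ## mode: org
   ## ** Nested heading
   ## # End of org
EOF

capture_rec_strict "$sample"
rm -f "$sample"
```

Because the markers are compared with `==` after trimming, the two `local ...` lines in the sample do not open or close a section, so only the two headings (each followed by an empty line) should be printed.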

Fatipati

3/1/23, 7:29 PM

It is printing other parts of the file preceding the first matched line

Fatipati

3/1/23, 7:47 PM

I need them to be the only things on the line other than leading whitespace, because they would get picked up if I run the commands on the same file (picking up `local begsec='## mode: org'` and `local endsec='## # End of org'`).

terdon

3/2/23, 10:47 AM

@Fatipati see the updated answer for a version that works only on lines where the begsec or endsec is the whole line except for leading and trailing whitespace. If that doesn't work as expected, please show me a file it fails on so I can understand why.

Score:0

Ubuntu

elmclose

3/1/23, 8:53 PM

This single line command in your script might do the job:

awk '/## mode/{flag=1;next} /## # End/{flag=0} flag' extract.txt | tr -d '#' | awk '$1=$1'

* Using case statement
** Silent Error Reporting Mode (SERM) in getopts
*** Detects warnings without printing built-in messages.
*** Enabled by colon {:} as first character in shortopts.

The second awk rebuilds each line from its fields, which removes leading spaces (and also collapses runs of whitespace into single spaces).

Fatipati

3/2/23, 6:28 AM

As mentioned, I am getting more lines printed than just those between the `begin` and `end` parts. This is because I am passing the same file that contains the code, and thus the function `capture-org`.

Fatipati

3/2/23, 6:52 AM

I need to match exactly as with the patterns `^[[:space:]]*## mode: org$` and `^[[:space:]]*## # End of org$`.
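
For the exact-match requirement with sed itself, the several sed calls could probably be folded into one invocation by building the anchored patterns from variables. The following is a rough sketch, not a tested drop-in replacement: the name `capture_org_sed` and the variables `begpat` and `endpat` are illustrative, and it assumes the marker text contains no regex metacharacters or `/`.

```shell
# Hypothetical helper; capture_org_sed, begpat and endpat are illustrative names.
capture_org_sed() {
  local efile="$1"

  # Literal marker text (assumed free of regex metacharacters and "/").
  local begsec='## mode: org'
  local endsec='## # End of org'

  # Anchored patterns: optional leading whitespace, then the marker alone.
  local begpat="^[[:space:]]*${begsec}\$"
  local endpat="^[[:space:]]*${endsec}\$"

  sed -n "/${begpat}/,/${endpat}/{
    /${begpat}/d
    /${endpat}/{
      s/.*//p
      d
    }
    s/^[[:space:]]*##[[:space:]]*//p
  }" "$efile"
}
```

Inside the range, the opening marker is dropped, the closing marker is turned into an empty line, and every other line loses its leading whitespace and `##` prefix. Since both patterns are anchored, a line such as `local begsec='## mode: org'` in the scanned file should not start a section.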
