Score:1

Variables for constructing patterns for use with sed

jp flag

I have written a bash function to print sections of text enclosed between lines matching ## mode: org and ## # End of org in a file, with an empty line between sections. Before the ##, there can be any number of spaces.

Here is an example of a file to extract information from.

file: test.sh

## mode: org
## * Using case statement
## # End of org
case $arg in
 ("V")
   echo "Author"
   ;;
 (*)
   ## mode: org
   ## ** Silent Error Reporting Mode (SERM) in getopts
   ## *** Detects warnings without printing built-in messages.
   ## *** Enabled by colon {:} as first character in shortopts.
   ## # End of org
   break
   ;;
esac

The desired output would be

Code:

* Using case statement

** Silent Error Reporting Mode (SERM) in getopts
*** Detects warnings without printing built-in messages.
*** Enabled by colon {:} as first character in shortopts.

Here is the function I am using

capture-org ()
{
  local efile="$1"

  local begsec="## mode: org"
  local endsec="## # End of org"

  sed -n "/^[[:space:]]*${begsec}$/,/^[[:space:]]*${endsec}$/s/ *//p" "$efile" |
   sed 's/^'"${begsec}"'$/\n'"${begsec}"'/' |
   sed '/^'"${begsec}"'$/d' | sed '/^'"${endsec}"'$/d' | cut -c 3-
}

I would like to simplify the function by using variables to construct the patterns, but I need some help combining the commands so that I do not have to call sed so many times.

Perhaps using awk would be a better strategy.

capture-org ()
{
  local efile="$1"

  local begsec='^[[:space:]]*## mode: org$'
  local endsec='^[[:space:]]*## # End of org$'

  sed -n "/${begsec}/,/${endsec}/s/ *//p" "$efile" |
   sed 's/^## # End of org$/## # End of org\n/' |
   sed '/^## mode: org$/d' | sed '/^## # End of org$/d' | cut -c 3-
}
Score:1
cn flag

I would indeed use something more sophisticated for this. Like awk:

$ awk -v start="$begsec" -v end="$endsec" \
    '{ 
        if($0~start){want=1; next} 
        if($0~end){want=0; print ""; next} 
        gsub(/\s*#+\s*/,""); 
     } want' file
* Using case statement

** Silent Error Reporting Mode (SERM) in getopts
*** Detects warnings without printing built-in messages.
*** Enabled by colon {:} as first character in shortopts.

Or, using your last function there as a template:

capture-rec ()
{

  local begsec='## mode: org'
  local endsec='## # End of org'

  awk -v start="$begsec" -v end="$endsec" \
    '{ 
        if($0~start){want=1; next} 
        if($0~end){want=0; print ""; next} 
        gsub(/\s*#+\s*/,""); 
     } want' "$1"
}

One caveat that may be important: unlike your approach, this does not require $begsec and $endsec to be the only things on the line apart from leading whitespace. It simply searches for them anywhere on the line. I am assuming this isn't a big deal given what you are looking for, but if it is, you can use the following instead, which removes whitespace at the beginning and end of each line before matching:

capture-rec ()
{

  local begsec='## mode: org'
  local endsec='## # End of org'

  awk -v start="$begsec" -v end="$endsec" \
    '{ 
        sub(/^[[:space:]]*/,"");
        sub(/[[:space:]]*$/,"");
        if($0==start){ want=1; next} 
        if($0==end){   want=0; print ""; next} 
        gsub(/\s*#+\s*/,""); 
     } want' "$1"
}
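
A possible variant of the function above, offered as a sketch rather than a drop-in replacement. It assumes an awk that understands POSIX character classes, so `[[:space:]]` is used in place of the GNU-specific `\s`, and it strips only the comment marker at the start of the line, so a `#` inside the heading text survives. The name `capture_org_awk` is made up for this example.

```shell
# Hypothetical variant: exact marker match, anchored stripping of the "## " prefix.
capture_org_awk ()
{
  awk -v start='## mode: org' -v end='## # End of org' '
    {
      line = $0
      sub(/^[[:space:]]+/, "", line)   # drop leading whitespace
      sub(/[[:space:]]+$/, "", line)   # drop trailing whitespace
    }
    line == start { want = 1; next }                      # section begins
    line == end   { if (want) print ""; want = 0; next }  # section ends
    want {
      sub(/^#+[[:space:]]?/, "", line) # remove only the leading comment marker
      print line
    }
  ' "$1"
}
```

Because the markers are compared with `==` after trimming, a line such as `local begsec='## mode: org'` is not treated as a section start.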
jp flag
It is printing other parts of the file preceding the first matched line
jp flag
I need them to be the only things on the line other than leading whitespace, because they would get picked up if I run the commands on the same file (picking up `local begsec='## mode: org'` and `local endsec='## # End of org'`).
terdon avatar
cn flag
@Fatipati see updated answer for a version that works only on lines where the beg- and endsec is the whole line except for leading and trailing whitespace. If that doesn't work as expected, please show me a file it fails on so I can understand why.
Score:0
cn flag

This single line command in your script might do the job:

awk '/## mode/{flag=1;next} /## # End/{flag=0} flag' extract.txt | tr -d '#' | awk '$1=$1'

* Using case statement
** Silent Error Reporting Mode (SERM) in getopts
*** Detects warnings without printing built-in messages.
*** Enabled by colon {:} as first character in shortopts.

The second awk rebuilds each line from its fields, which removes leading spaces (and also collapses any runs of whitespace inside the line).

jp flag
As mentioned, I am getting more lines printed than those between the `begin` and `end` parts. This is because I am passing the same file that contains the code, and thus the function `capture-org`.
jp flag
I need to match exactly as with the patterns `^[[:space:]]*## mode: org$` and `^[[:space:]]*## # End of org$`.
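
For the exact-match requirement, one way to keep the two anchored patterns in variables and still call sed only once is sketched below. This is an illustration, not the code from either answer: the function name `capture_org_sed` is made up, and it assumes the patterns contain no `/`, since they are interpolated into `/.../` addresses.

```shell
# Hypothetical single-sed version built from the two anchored patterns.
capture_org_sed ()
{
  local beg='^[[:space:]]*## mode: org$'
  local end='^[[:space:]]*## # End of org$'

  # Inside the begin/end range: drop the begin line, turn the end line
  # into an empty separator line, and strip the leading "## " elsewhere.
  sed -n "/${beg}/,/${end}/{
    /${beg}/d
    /${end}/{s/.*//p;d;}
    s/^[[:space:]]*## \{0,1\}//p
  }" "$1"
}
```

Called as `capture_org_sed test.sh`, this prints each section followed by an empty line. Lines such as `local begsec='## mode: org'` do not match the anchored patterns, so the function can be run on the file that defines it.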