Filter blocks of log-output from a log stream

Here's my task:

I've got a source stream of live log output from a messaging process. Much of the output is irrelevant to me, but there are sections I want to collect and evaluate separately. Each block starts with "---BEGIN Request---" at the end of a line that begins with date/time, hostname and process[pid]: and, accordingly, ends with "---END Request---" at the end of another line. What lies between these two markers is what I want to capture.

My attempts with sed on a log excerpt file failed. I tried to remove everything outside the blocks I care about, but I still got every single line. Maybe someone can spot my mistake:

sed -r '/---END Request---$/{
   $!{ N 
     s/---END Request---.?\n([^:]+: )---BEGIN Request---$/---END Request---\n\1---BEGIN Request---/
     t sub-hit
     :sub-miss
     P          
     D          
     :sub-hit
   }    
 }' sample.log

I think awk could be an alternative tool here, but I have not looked into its performance on live log streams.

There's always someone with a solution using Python or another language. I'm open to that, but keep in mind that I plan to use this on a log stream, not on static text files.

Here's my simplified sample log excerpt for testing. I've anonymised and removed some things.

Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: PUT /token/connect HTTP/2.0
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Host: host-230-17-17-10
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Accept: */*
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Accept-Encoding: gzip, deflate, br
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Accept-Language: de-de
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Cache-Control: no-cache
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Content-Length: 306
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Content-Type: text/xml
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: User-Agent: TokenHandler/3.2
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: <?xml version="1.0" encoding="UTF-8"?>
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: <!DOCTYPE and so on. Intentionally cut short here for askubuntu
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---END Request---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---END Response---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: transport=http method=PUT status=200 proto=HTTP/2.0 host=10.17.17.240 user_agent=TokenHandler/3.2 path=/token/connect
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: POST /v3/token/033aaed70bdce765ace3223a5dc5 HTTP/1.1
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Host: host-230-17-17-10
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Authorization: Basic bWljcm9tZG06MjVuWjdWV3BjMkZaalRkZlRNVTNzaWdyS2xwZlRsVQ==
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Length: 0
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: level=info component=tknzr method=add udid=033aaed70bdce765ace3223a5dc5 err=null took=145.419185ms
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Type: application/json; charset=utf-8
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: {
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]:   "status": "success",
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]:   "notification_id": "FC88CDE8-D3AD-4607-602F-6005E70E83E2"
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: }
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Response---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: transport=http method=POST status=200 proto=HTTP/1.1 host=10.17.17.230 user_agent= path=/v3/token/033aaed70bdce765ace3223a5dc5
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: PUT /token/connect HTTP/2.0
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Host: host-230-17-17-10
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Accept: */*
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Accept-Encoding: gzip, deflate, br
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Accept-Language: en-US,en;q=0.9
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Cache-Control: no-cache
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Length: 306
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Type: text/xml
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: User-Agent: TokenHandler/3.2
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: <?xml version="1.0" encoding="UTF-8"?>
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: <!DOCTYPE and so on. Intentionally cut short here for askubuntu
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Response---
Answer:

On a stream (live log) you can use the GNU sed option -u (unbuffered), which flushes the output more often instead of holding lines back in a buffer.
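As a minimal sketch of how -u could be combined with a range address for the task in the question, assuming GNU sed and that both markers always sit at the end of their lines as in the sample: the function name filter_requests is made up for illustration, and the way you feed it the stream is up to you.

```shell
# Hypothetical helper (name is an assumption, not from the question):
# print only the lines from a "---BEGIN Request---" marker through the
# next "---END Request---" marker, suppressing everything else.
#   -n  suppress automatic printing
#   -u  unbuffered operation (GNU sed), so lines appear promptly on a stream
filter_requests() {
    sed -u -n '/---BEGIN Request---$/,/---END Request---$/p'
}

# Usage on the static excerpt:
#   filter_requests < sample.log
# Usage on a live stream (the source command is a placeholder):
#   some_log_source | filter_requests
```

The marker lines themselves are included in the output. Response blocks are skipped because the range only opens on a line ending in "---BEGIN Request---".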

You can additionally append | cut -c55-10000 to the command to cut off the date, hostname, etc.

Comment (Andreas):
Hi Denis, I like the unbuffered parameter. Your idea of using cut with fixed positions won't work if the pid (process ID) in the square brackets has more or fewer than 4 digits. I already know I would use `cut -d':' -f 4-` to achieve that.
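One way to avoid depending on the pid width at all is to strip the prefix inside the same sed call, which also keeps a second, possibly buffering, process out of the pipeline. This is only a sketch: it assumes GNU sed and that the first "]" on every line is the one closing the pid, as in the sample log. The name filter_and_strip is invented for illustration.

```shell
# Hypothetical helper (name is an assumption): keep only Request blocks
# and remove the "date host process[pid]: " prefix from each kept line.
# The range addresses are tested against the untouched line, so the
# substitution inside the braces does not interfere with block detection.
filter_and_strip() {
    sed -u -n '/---BEGIN Request---$/,/---END Request---$/{
        s/^[^]]*]: //
        p
    }'
}

# Usage:
#   filter_and_strip < sample.log
```

Unlike `cut -d':' -f 4-`, this leaves no leading space in front of the payload, and it does not care how many digits the pid has.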