Here's my task:
I've got a source stream of live log output from a messaging process. Most of the output is irrelevant to me, but there are sections I want to collect and evaluate separately.
Each block starts with a line ending in "---BEGIN Request---"; the line itself begins with date / time, hostname and process[pid]: . Likewise, a block ends with another line ending in "---END Request---". Everything in between these two lines is what I want to capture.
My attempts with sed on a log excerpt file failed. My approach was to remove everything outside the blocks I care about, but I still got every single line back. Maybe someone can spot my mistake:
sed -r '/---END Request---$/{
$!{ N
s/---END Request---.?\n([^:]+: )---BEGIN Request---$/---END Request---\n\1---BEGIN Request---/
t sub-hit
:sub-miss
P
D
:sub-hit
}
}' sample.log
I think awk could be an alternative tool here, but I have not looked into its performance on live log streams.
There's always someone with a solution in Python or some other language. I'm open to that, but keep in mind that I plan to use this on a log stream and not on static text files.
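To make the goal concrete, here is a minimal Python sketch of the filtering I'm after. It is only an illustration under my own assumptions, not a finished solution: the function name `request_blocks` and the two regular expressions are made up for this example, and it assumes the markers always sit at the very end of a line, as described above.

```python
import re
from typing import Iterable, Iterator, List

# Assumption: the markers appear at the end of a line, after the
# "date host process[pid]: " prefix, exactly as in the sample log.
BEGIN = re.compile(r"---BEGIN Request---\s*$")
END = re.compile(r"---END Request---\s*$")


def request_blocks(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield the lines between each BEGIN/END Request pair as one list."""
    block = None  # None means we are currently outside a request block
    for line in lines:
        line = line.rstrip("\n")
        if BEGIN.search(line):
            block = []  # start a new block; the marker line is dropped
        elif END.search(line):
            if block is not None:
                yield block  # emit the block as soon as it is complete
            block = None
        elif block is not None:
            block.append(line)


# Small demo on lines taken from the sample log.
sample = [
    "Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---",
    "Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Length: 0",
    "Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Request---",
    "Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---",
    "Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK",
    "Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Response---",
]

for block in request_blocks(sample):
    print("\n".join(block))
```

In the demo only the Content-Length line is printed, because the Response block and the marker lines are skipped. Since the function accepts any iterable of lines, passing `sys.stdin` instead of `sample` should let it work on piped input as well; I have not measured how it behaves under a busy live stream, and output would probably need flushing after each block.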
Here's my simplified sample log excerpt for testing. I've anonymised and removed some things.
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: PUT /token/connect HTTP/2.0
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Host: host-230-17-17-10
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Accept: */*
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Accept-Encoding: gzip, deflate, br
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Accept-Language: de-de
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Cache-Control: no-cache
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Content-Length: 306
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Content-Type: text/xml
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: User-Agent: TokenHandler/3.2
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: <?xml version="1.0" encoding="UTF-8"?>
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: <!DOCTYPE and so on. Intentionally cut short here for askubuntu
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---END Request---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: ---END Response---
Jan 20 14:20:47 host-230-17-17-10 tokenhandler[4230]: transport=http method=PUT status=200 proto=HTTP/2.0 host=10.17.17.240 user_agent=TokenHandler/3.2 path=/token/connect
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: POST /v3/token/033aaed70bdce765ace3223a5dc5 HTTP/1.1
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Host: host-230-17-17-10
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Authorization: Basic bWljcm9tZG06MjVuWjdWV3BjMkZaalRkZlRNVTNzaWdyS2xwZlRsVQ==
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Length: 0
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: level=info component=tknzr method=add udid=033aaed70bdce765ace3223a5dc5 err=null took=145.419185ms
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Type: application/json; charset=utf-8
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: {
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: "status": "success",
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: "notification_id": "FC88CDE8-D3AD-4607-602F-6005E70E83E2"
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: }
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Response---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: transport=http method=POST status=200 proto=HTTP/1.1 host=10.17.17.230 user_agent= path=/v3/token/033aaed70bdce765ace3223a5dc5
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: PUT /token/connect HTTP/2.0
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Host: host-230-17-17-10
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Accept: */*
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Accept-Encoding: gzip, deflate, br
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Accept-Language: en-US,en;q=0.9
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Cache-Control: no-cache
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Length: 306
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Content-Type: text/xml
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: User-Agent: TokenHandler/3.2
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: <?xml version="1.0" encoding="UTF-8"?>
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: <!DOCTYPE and so on. Intentionally cut short here for askubuntu
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Request---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---BEGIN Response---
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: HTTP/1.1 200 OK
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: Connection: close
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: [1B blob data]
Jan 20 14:20:48 host-230-17-17-10 tokenhandler[4230]: ---END Response---