Score:0

How to calculate logs per second?

jp flag

I have a log management system in which the ClickHouse database is used alongside metrico/qryn. rsyslog on all the servers (Debian 11) sends the logs gathered from the system and applications to Promtail for labeling, and Promtail then ships them to qryn. qryn inserts the logs into the ClickHouse DB and exposes an API compatible with LogQL.

I'm trying to find a relationship between ClickHouse's system utilization and the data sent to and received from ClickHouse. Hence, I need to calculate the logs per second in order to estimate the system utilization per log per second, so that in a bigger environment I would already know what limits to set for ClickHouse RAM and CPU usage.

My question is how to calculate this logs-per-second figure. I only have one instance of ClickHouse, but I have no idea how to calculate it.

The method doesn't really matter; I just need to know the logs per second. It could be at the system level, at the DB level, using a third-party application, etc.

Any help is appreciated, thanks in advance.

Romeo Ninov avatar
in flag
Do you store the logs somewhere as files (text)? Also, please provide an example of the format of these logs.
x_Skipper_x avatar
jp flag
These logs are all gathered in `/var/log/syslog` on the central `rsyslog` instance, and the ClickHouse data is accessible. As for the format, they all follow Syslog RFC 5424.
Romeo Ninov avatar
in flag
You can try something like: `awk '{print $1" "$2" "$3}' /var/log/syslog|uniq -c`
x_Skipper_x avatar
jp flag
@RomeoNinov I am quite familiar with the `awk` command; however, I cannot wrap my mind around how the command you provided would get me the logs per second. Would you be so kind as to explain?
Romeo Ninov avatar
in flag
If you run the command, you will see a count and a timestamp in the output. And because every timestamp has one-second precision, you will get the count of events for every second, which is the number of log records per second.
Score:1
pn flag

In LogQL you simply query your logs with:

count_over_time({label="labelValue"} [1s])

That will give you the count of logs selected per 1s interval.
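
For example, you could run that query against qryn's Loki-compatible range-query endpoint with curl. The host, port, time range, and label selector below are only placeholders; adjust them to your own qryn setup:

# Rough sketch: count logs per 1s step over the last 5 minutes.
# Assumes qryn listens on localhost:3100 and accepts the usual Loki-style
# /loki/api/v1/query_range parameters (start/end in Unix nanoseconds).
curl -G 'http://localhost:3100/loki/api/v1/query_range' \
  --data-urlencode 'query=count_over_time({label="labelValue"} [1s])' \
  --data-urlencode "start=$(date -d '5 minutes ago' +%s)000000000" \
  --data-urlencode "end=$(date +%s)000000000" \
  --data-urlencode 'step=1'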

--

I love seeing our project being used in the wild. :)

x_Skipper_x avatar
jp flag
Thanks for recommending this method, as a matter of fact, this was suggested in an official issue I opened [here](https://github.com/metrico/qryn/issues/222#issuecomment-1274457727). You and I have had discussions in some other issues on the same repo, glad to see you here too.
x_Skipper_x avatar
jp flag
Please forgive me for not accepting your method as the answer, since the previous one works just as well.
Score:1
in flag

You can use this combination of awk and uniq to count the records for every second:

awk '{print $1" "$2" "$3}' /var/log/syslog|uniq -c

for sample data like this:

[root@rhel01 ~]# tail /var/log/messages
Oct 10 08:51:29 rhel01 systemd[1]: Started Update UTMP about System Runlevel Changes.
Oct 10 08:51:29 rhel01 systemd[1]: Started Process archive logs.
Oct 10 08:51:29 rhel01 systemd[1]: Started pmlogger farm service.
Oct 10 08:51:29 rhel01 systemd[1]: Started Half-hourly check of pmlogger farm instances.
Oct 10 08:51:29 rhel01 systemd[1]: Reached target Timers.
Oct 10 08:51:29 rhel01 systemd[1]: Startup finished in 1.715s (kernel) + 5.822s (initrd) + 12.949s (userspace) = 20.486s.
Oct 10 08:51:30 rhel01 systemd[1]: NetworkManager-dispatcher.service: Succeeded.
Oct 10 08:51:32 rhel01 pcp-pmie[2130]: High per CPU processor utilization 99%util[cpu0]@rhel01 99%util[cpu1]@rhel01
Oct 10 08:51:33 rhel01 su[3183]: (to root) romeo on pts/0
Oct 10 08:51:34 rhel01 systemd[1]: pmlogger_daily.service: Succeeded.
[root@rhel01 ~]# tail /var/log/messages|awk '{print $1" "$2" "$3}'|uniq -c
      4 Oct 10 08:51:29
      1 Oct 10 08:51:30
      1 Oct 10 08:51:32
      1 Oct 10 08:51:33
      1 Oct 10 08:51:34
      1 Oct 10 08:51:43
      1 Oct 10 08:51:50

If your feed is delayed (records are not well sorted by time), you will need to use the sort command first:

sort /var/log/messages|awk '{print $1" "$2" "$3}'|uniq -c
<snip>
      6 Sep 23 09:28:15
      3 Sep 23 09:30:15
      3 Sep 23 09:40:15
      3 Sep 23 09:50:05

Or use only awk with an associative array:

awk '{a=$1" "$2" "$3;b[a]+=1} END {for (i in b) print b[i]","i}' /var/log/messages
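
If you also want a single figure for sizing, you can extend the same idea to report the average and peak rate. This is just a sketch, again assuming the first three fields are the syslog timestamp with one-second precision:

awk '
  # key on the timestamp (first three fields), count records per second
  { key = $1" "$2" "$3; if (!(key in cnt)) secs++; cnt[key]++; total++ }
  END {
    for (t in cnt) if (cnt[t] > peak) peak = cnt[t]
    if (secs) {
      print "seconds observed:    " secs
      # average over seconds that had at least one record
      print "average logs/second: " total/secs
      print "peak logs/second:    " peak
    }
  }' /var/log/messages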
x_Skipper_x avatar
jp flag
Thanks for the explanation @RomeoNinov. I now realize that I was not familiar with the `-c` switch of the `uniq` command; you've answered my question as well as taught me something new.