Score:2

What may cause delay / lag in TCP / HTTP requests?

in flag

We have a device that sends POST requests or TCP messages at fixed intervals, with a JSON payload, over the internet to our server, which runs a Node application at another location.

The JSON payload has a timestamp and some other values. We compare that timestamp to the server's timestamp to calculate the time difference, which we call lag.
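
For illustration, the lag calculation described above could look roughly like the sketch below. The function name `computeLagMs` and the field name `timestamp` are assumptions, as is the device sending epoch milliseconds. Note that a value computed this way contains both the network delay and any offset between the two clocks.

```javascript
// Hypothetical helper: assumes payload.timestamp holds the device's
// clock reading in epoch milliseconds at send time.
function computeLagMs(payload, receivedAtMs = Date.now()) {
  // Server receive time minus device send time.
  // This mixes transit delay with any device/server clock offset.
  return receivedAtMs - payload.timestamp;
}
```

For example, `computeLagMs({ timestamp: 1000 }, 1150)` gives 150.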

The interval is 100ms. Initially, the lag is under 200ms. When we let the system run for a few days, the lag increases: after 2-3 days it is 2000-3000ms, and it keeps growing, reaching around 6000ms after 6 days.

Once we restart the server process, the lag is back to normal, so I assume the sender is OK. This happens with both the POST request implementation and the TCP message implementation.

Does anyone have any idea why this may happen, or how to narrow down the problem?

sh flag
Ben
As AlexD indicates, server and client clocks can differ. To check timings, it's best to take begin/end times on the same machine.
jp flag
Could you try manually triggering/timing the message on day 1 vs. day 3 and comparing the performance? You'd be running it manually, so you'd want to ignore the timestamps/logging in favor of visually watching the performance. This would help narrow down whether the problem is due to synchronization (if the manual run is fast) or actual lag (if the manual run is slow).
VL-80
cn flag
If possible, set up a test device-server pair in a controlled environment, close to each other, and see if the problem persists. This lets you eliminate the network aspect and should help narrow down the issue if you suspect that routing over the Internet contributes to the problem.
Ben Voigt
pl flag
You should also measure the round-trip time. Since this takes two measurements from a single clock source, any clock mismatch / error / skew cannot be confused with delay / latency.
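
One way to separate the two effects, sketched below, is the four-timestamp calculation that NTP uses. It assumes the server can echo back its own receive and send times, which the original setup may not do; the function and field names are hypothetical. The offset estimate also assumes the outbound and return paths take about the same time.

```javascript
// t0: client send time (client clock)
// t1: server receive time (server clock)
// t2: server reply time (server clock)
// t3: client receive time (client clock)
// All values in milliseconds.
function estimateRttAndOffset({ t0, t1, t2, t3 }) {
  // Round trip measured on the client clock, minus server processing time
  // measured on the server clock. Each difference uses a single clock.
  const rttMs = (t3 - t0) - (t2 - t1);
  // Estimated server clock minus client clock, assuming symmetric paths.
  const offsetMs = ((t1 - t0) + (t2 - t3)) / 2;
  return { rttMs, offsetMs };
}
```

If `rttMs` stays low while `offsetMs` grows over days, the clocks are drifting apart rather than the messages being delayed.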
in flag
The client is a device that has its own OS. I have tried to emulate the device, but was not able to reproduce the problem. Round-trip time is fairly low, usually under 20ms. Since the reproduction takes days, the process of finding this is relatively slow.
Score:6
jp flag

Are the clocks on both the server and the sender synchronized? Clock drift of 1 second per day isn't unusual.
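
As a rough check against the numbers in the question (about 6000ms of added lag after 6 days), the implied drift rate can be worked out as below. The helper name `driftRate` is made up for this sketch, and the inputs are the approximate figures quoted by the asker.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000; // 86,400,000 ms

// Converts an observed lag increase over some number of days into
// a drift rate in ms/day and in parts per million (ppm).
function driftRate(lagIncreaseMs, elapsedDays) {
  const msPerDay = lagIncreaseMs / elapsedDays;
  const ppm = (msPerDay / MS_PER_DAY) * 1e6;
  return { msPerDay, ppm };
}

const { msPerDay, ppm } = driftRate(6000, 6);
console.log(msPerDay, ppm.toFixed(1)); // 1000 ms/day, about 11.6 ppm
```

So the observed growth is about 1 second per day, which matches the drift rate mentioned above.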

cn flag
`Once we restart the server process the lag is back to normal` suggests otherwise.
jp flag
@rtaft it is possible that they are using some imprecise clock (`setInterval`?) within their service process, which drifts over time. Or they are restarting the process by restarting the server, which runs `ntpdate` at boot but doesn't run the `ntpd` service.
jp flag
@AlexD: It shouldn't be due to imprecise interval, since they're sending the server timestamp in the payload. Interval drift would also drift the included timestamp. However, clock drift is a very real possibility, especially if OP is using a virtual/cloud server rather than a dedicated machine.
jp flag
@Brian, we don't know how they are calculating their timestamps. They can get system time once at the start and then add intervals.
in flag
We get the date via JavaScript `new Date()` on the server. Clock drift is unlikely, but I'll keep it in mind.
Score:3
cn flag

This would indicate what's known as a resource leak. Something in your application is slowing down processing over time as a queue or memory or some other resource builds up.

I'd recommend adding instrumentation to your Node.js code to track any internal queues, memory allocated, database lookups, etc. Log all the things so you can start to identify where the bottleneck is.
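
A minimal sketch of such instrumentation, using only Node.js built-ins (`process.memoryUsage()` and `monitorEventLoopDelay` from `perf_hooks`), might look like this. The `pendingMessages` array is a hypothetical stand-in for whatever queue your application actually keeps, and the one-minute logging interval is an arbitrary choice.

```javascript
const { monitorEventLoopDelay } = require("perf_hooks");

// Samples how late the event loop runs; values are reported in nanoseconds.
const loopDelay = monitorEventLoopDelay({ resolution: 20 });
loopDelay.enable();

// Hypothetical: replace with the application's real internal queue.
const pendingMessages = [];

// Collects one set of readings that can be logged and compared over days.
function snapshot() {
  const mem = process.memoryUsage();
  return {
    time: new Date().toISOString(),
    rssMb: mem.rss / 1048576,
    heapUsedMb: mem.heapUsed / 1048576,
    queueLength: pendingMessages.length,
    loopDelayMeanMs: loopDelay.mean / 1e6, // NaN until samples exist
    loopDelayMaxMs: loopDelay.max / 1e6,
  };
}

// Logs a snapshot periodically, then resets the histogram so each
// log line covers only the most recent interval.
function startLogging(intervalMs = 60000) {
  const timer = setInterval(() => {
    console.log(JSON.stringify(snapshot()));
    loopDelay.reset();
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for logging
  return timer;
}
```

Calling `startLogging()` once at startup would produce one JSON line per minute. A steadily growing `queueLength`, `heapUsedMb`, or `loopDelayMeanMs` over several days would point at the server process rather than at the network or the clocks.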

If you're running in the cloud there are good tools to help instrument code, such as AWS X-Ray.

in flag
Thanks. I will double-check my code.