Keeping connections open obviously consumes resources, even though they're just TCP connections sitting there, waiting for a response.
It is also obvious that they don't consume as much compared to, say, a process making database queries and transforming results to create a response.
My issue is that when I calculate the potential resource consumption of raising timeout values, what I take into account seems to yield an optimistic result, and I'd like to know if I'm missing anything.
I explore this by assuming the operation ends successfully and with identical results either way. Imagine we either raise the timeout values or magically get the external server to respond faster, so that we can isolate the effects of keeping the connection open for longer. I can then separately compare those effects to the effects of the operation not being completed.
Things I consider are:
- Memory consumption: I find this to be the most concerning one. However, when the open connection spends almost all its time waiting for a payload, it has allocated the minimum amount of memory of its life cycle.
- CPU usage: Next to zero. The process itself just checks if the response is there yet. The OS has to maintain the connection, but I can't think of any significant calculations involved.
- Concurrent open connections: Some light research says the limit is around 0.3 million. If my number gets nowhere near that, we're cool.
- Socket count: Again, if my system is nowhere near the limit, it's ok
- Port count: Same: We know the limit, we know if we're pushing it.
- Max number of open files: We can test how much we need for each process and calculate accordingly
- Keeping other services busy: We may be keeping a database connection busy, an inbound connection waiting for our process etc.
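To make the estimate behind these items concrete, here is a rough sketch of the arithmetic. It uses Little's law (average number of waiting connections = arrival rate × average wait time). The request rate, wait times, and the 64 KiB per idle connection are made-up placeholder figures, not measurements from my system, and the function names are only for illustration.

```python
def estimate_waiting_load(requests_per_second, avg_wait_seconds, bytes_per_idle_connection):
    """Estimate the steady-state cost of connections that sit waiting.

    Little's law: average concurrent connections = arrival rate * average wait.
    Returns (concurrent_connections, memory_bytes).
    """
    concurrent = requests_per_second * avg_wait_seconds
    memory_bytes = concurrent * bytes_per_idle_connection
    return concurrent, memory_bytes


def open_file_soft_limit():
    """Return this process's soft limit on open files, or None if unavailable."""
    try:
        import resource  # standard library, Unix-only
    except ImportError:
        return None
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


# Hypothetical figures: 50 requests/s, 64 KiB held per idle connection.
PER_CONNECTION = 64 * 1024
before = estimate_waiting_load(50, 5, PER_CONNECTION)   # average wait of 5 s
after = estimate_waiting_load(50, 30, PER_CONNECTION)   # average wait of 30 s

for label, (concurrent, memory_bytes) in (("before", before), ("after", after)):
    print(f"{label}: {concurrent} connections, {memory_bytes / 2**20:.2f} MiB")

# Each open socket also counts against the open-file limit of the process.
print("open-file soft limit:", open_file_soft_limit())
```

The concurrent connection count is the number to hold against the socket, port, and open-file limits above. The per-connection memory figure is the part I'd actually have to measure.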
This is all I can come up with as drawbacks. In my case, no item besides memory is even worth calculating, and memory consumption is not affected that much anyway.
When I compare this to not being able to complete the operation, the scale tips way further towards raising the timeouts. In our specific situation, if the operation fails, the next request will take even longer trying to compensate, which more or less brings back all the drawbacks of high timeouts. With raised timeouts, the total time spent waiting would actually be lower.
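To put rough numbers on that comparison, here is a small sketch. All durations are invented for illustration, and it assumes the compensating follow-up request eventually completes. If the follow-up also timed out, the short-timeout total would only grow.

```python
def total_wait_with_short_timeout(timeout_seconds, compensating_seconds):
    """Connection-open time when the first attempt is abandoned at the timeout
    and a follow-up request then does the compensating (longer) work."""
    return timeout_seconds + compensating_seconds


def total_wait_with_long_timeout(operation_seconds):
    """Connection-open time when the timeout is long enough for one attempt."""
    return operation_seconds


# Hypothetical figures: the operation needs 20 s, the current timeout is 10 s,
# and the compensating follow-up request needs 25 s.
short = total_wait_with_short_timeout(timeout_seconds=10, compensating_seconds=25)
long_ = total_wait_with_long_timeout(operation_seconds=20)

print(f"short timeout: {short} s waiting in total")
print(f"long timeout:  {long_} s waiting in total")
```

Under these assumptions the failed attempt's waiting time is pure overhead on top of the longer follow-up.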
This became a light debate with my boss, who seems to think this issue is (or I am) beneath him and not worth arguing about. I couldn't get anything constructive out of that conversation, so I came here.
Are there any notable aspects that I've missed? Am I underestimating any of my items?