Score:0

How does HTTPS certificate presentation work, exactly?

I'm troubleshooting an issue with a SaaS vendor. To be clear, this question isn't "how do I fix it?", nor is it "what exactly is causing this problem?" -- rather, it's "how do these technologies work, such that this combination of symptoms is possible?" I have a support ticket open with the vendor already (and I am less-than-patiently waiting for it to be escalated to someone sufficiently capable). The purpose of this question is to expand my own understanding of how these things (typically?) work, and what variables might be in play that I have not considered.

Said vendor provides a "domain customization" feature, where you can access their service via a domain you control. (You provide them with a private key, plus a certificate with its chain, and you add a CNAME entry that points at a domain under the vendor's control.)

I have two "tenants" with this vendor -- one for development purposes and one for production. Both are configured with custom domains. They use the exact same certificate and private key; the certificate has the production domain as its CN, and both the dev and prod domains listed as SANs.

The production tenant works perfectly, so I know the cert is correct. However, when I visit the dev tenant in my browser, I get a certificate error about 80-90% of the time. When I investigate, my browser reports that the cert being presented is valid but is not mine; it belongs to the vendor (and, as such, does not list my domain as CN or SAN). I've tried different browsers. I've tried curl. I've tried remoting in to various servers I have access to and checking from there. My coworkers, who are distributed around the US, have tried as well. The behavior does not seem to vary with client software, client hardware, geographic location, or network configuration.
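
For readers who want to measure the failure rate rather than eyeball it, the repeated checks described above could be scripted. This is only a sketch under stated assumptions: the helper names (`fetch_cert`, `covers`, `sample`) are made up for illustration, the wildcard matching is deliberately simplified, and it relies on the fallback certificate chaining to a trusted root (consistent with the browser reporting it as valid), because Python's `ssl` module only exposes the parsed certificate fields when chain verification succeeds. The hostname is sent via SNI, which is normally what a server uses to choose which certificate to present.

```python
import socket
import ssl
from collections import Counter


def san_dns_names(cert):
    """DNS entries from a cert dict as returned by SSLSocket.getpeercert()."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def name_matches(pattern, hostname):
    """Simplified match: exact, or '*' covering exactly the leftmost label."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if pattern.startswith("*."):
        _, _, rest = hostname.partition(".")
        return rest != "" and rest == pattern[2:]
    return pattern == hostname


def covers(cert, hostname):
    """True if any SAN DNS entry in the cert matches the hostname."""
    return any(name_matches(p, hostname) for p in san_dns_names(cert))


def fetch_cert(hostname, port=443, timeout=5.0):
    """Open one TLS connection (with SNI) and return the presented cert dict."""
    ctx = ssl.create_default_context()
    # Keep chain verification on, but skip the built-in hostname check so a
    # mismatched certificate can be inspected instead of aborting the handshake.
    ctx.check_hostname = False
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()


def sample(hostname, attempts=20):
    """Connect repeatedly and count how often the cert covers the hostname."""
    tally = Counter()
    for _ in range(attempts):
        cert = fetch_cert(hostname)
        tally["match" if covers(cert, hostname) else "mismatch"] += 1
    return tally
```

Calling `sample("dev.example.com", attempts=50)` (a placeholder hostname) returns a `Counter` of matches and mismatches. Each attempt opens a fresh connection, so a round-robin pool with some stale members should show up as a mix of both.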

At that point in my process, I'm thinking "ok, they have a pool of servers behind a load balancer, and some of the servers don't have the correct cert and so they're presenting something else in some sort of fallback." Sure, fine, makes sense.

But then I tried DigiCert's Website Security tool, where you can enter a domain and it will evaluate the correctness of your certificate (among other things). Using that tool, I cannot reproduce the intermittent behavior; instead, it fails every time.

As a software engineer with a few decades of experience building web sites and services in various stacks, I have a reasonably good understanding of DNS, HTTPS, TLS certificates, webserver configuration, network routing, load balancing, and so on. But I'm stumped as to how the DigiCert validator could be seeing different behavior than I see myself.

My first thought was a DNS propagation delay, but DNSChecker indicates no such issue. Next, I considered that DigiCert might be caching something on their end, but that would make for a glaring flaw in their tool. In both cases, the likelihood of that explanation has diminished now that this behavior has persisted for a few days.

So, my question, for those out there with more expertise in this than I: what possible explanations are there for the DigiCert tool's experience being different than every other client I've tried?

(Apologies if this isn't the right SE site for a question like this. It seemed a better choice than Network Engineering, and as I perused my options I didn't see any others that looked right.)

Massimo
What about *asking the vendor*? We can't possibly know how their systems work.
JakeRobb
@Massimo I have a support ticket open with them already, and I am somewhat impatiently waiting for it to get escalated in their support hierarchy. I fully understand that nobody here is going to be able to tell me exactly what is happening with full confidence. As I said in the question: what possible explanations are there?
Score:2

I agree with your guess: different certificates being deployed on different systems would explain the inconsistent behavior. Depending on whatever affinity rules the vendor uses to distribute requests, it could also explain why a particular client (such as DigiCert's testing tool) consistently gets the wrong result.

However, trying to guess what's happening is completely useless here; you should report this issue to the vendor's support and ask them to fix it.
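
To make the affinity point concrete, here is a toy model, not a description of the vendor's actual setup. The backend names, the choice of SHA-256, and the assumption that only one of five backends received the uploaded certificate are all hypothetical. It contrasts plain round-robin, where every client sees intermittent failures, with source-IP hashing, where a given client is pinned to one backend and therefore sees the same result every time.

```python
import hashlib
from itertools import cycle

# Hypothetical pool: only one backend received the customer's certificate.
BACKENDS = ["backend-a", "backend-b", "backend-c", "backend-d", "backend-e"]
HAS_CUSTOM_CERT = {"backend-a"}


def presented_cert(backend):
    """Which certificate a backend would present in this toy model."""
    return "customer" if backend in HAS_CUSTOM_CERT else "vendor-fallback"


def pick_by_source_ip(client_ip, backends=BACKENDS):
    """Affinity rule: hash the client IP so it always maps to the same backend."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def round_robin_results(requests, backends=BACKENDS):
    """No affinity: successive requests walk through the pool in order."""
    rotation = cycle(backends)
    return [presented_cert(next(rotation)) for _ in range(requests)]
```

Under these assumptions, `round_robin_results(10)` yields the customer certificate on 2 of 10 requests, while `presented_cert(pick_by_source_ip(ip))` never changes for a fixed `ip`. A real deployment could combine several such rules at different layers, which is one way different clients could see different failure patterns.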

JakeRobb
I've clarified the question a bit. The intent is not to diagnose the issue, but to seek possible explanations that can expand my understanding of the technologies at play. You make a good point that the load balancing system could be more complex than I anticipated. My experience is with round-robin and load-based balancers; it didn't occur to me that there might also be additional rules in play. I can't think of a reasonable rule that would yield behavior in which some clients simply never hit some servers. Can you?
Massimo
Yes, definitely. If this vendor is anything more than a small local company, they will be running a distributed service where it's entirely possible for various systems to fall out of sync with each other. If the service is distributed geographically, different clients will hit different endpoints based on their locations, and they will get different results depending on which endpoint their requests hit.
JakeRobb
They are pretty big and widely distributed, but all of the services _involved in my account_ are in a single AWS region.
JakeRobb
I've accepted this answer. FWIW, resolution came when I re-uploaded my cert. I conclude that something failed in whatever process they have that distributes certs across the appropriate servers, and that apparently they do not have a system in place to detect or mitigate such failures.