Score:0

How does DNS work with websites that have large traffic and multiple servers?

Based on what I learned from CS50x, each computer on the internet has its own unique public IP address. Some computers are used on the client-side to access websites, while other computers are used as servers that respond to requests (by providing data from their computers' databases, or more generally, fulfilling some "service(s)" for the client). When a client visits a website, they must make a request through their address bar for the specific server whose service they want to access, which they typically do by typing in a domain name (which was created for the purpose of not having to make users remember the IP addresses of websites that they visit). However, the domain name must then be "converted" to an IP address for the request to be processed, which the DNS is responsible for. Up until this point, there is one point that confuses me:

Why is it necessary for the domain names to be converted to IP addresses for a server to be accessed by the client? My hypothesis is that it has to do with the fact that a single domain name can "map to" multiple computers and hence IP addresses and so it is necessary to identify each computer separately. Also, perhaps it allows for "nodes" (computer networks) on the "global" network to be identified in a more standardized way (as computers that do not act as web servers do not have domain names). In any case, a more thorough explanation would be much appreciated.

Now, I also learned that some websites cannot be hosted on a single computer, as they have too much traffic, which would make them too slow to run on only one computer. This implies that the website would have multiple IP addresses used to host it. In that case, how would DNS convert a domain name into an IP address? If DNS chooses to return one IP address over another when the domain name is given to it, how would the client obtain access to the part of the website that is not "on that particular computer server"? What I mean by that is let's say we're considering a website with a huge database such as Yahoo Finance. There is likely more than one computer server hosting the website. Let's say that each computer stores a chunk of the database (as the database is too large for one computer to store in its entirety). Then, if a client makes a request to the server using a domain name, how would the DNS know to return the IP address of the particular computer that contains the information that a user is looking for?

Score:1

Why is it necessary for the domain names to be converted to IP addresses for a server to be accessed by the client?

Because the Internet works on IP addresses - either IPv4 or IPv6. It consists of a large number of IP networks that are interconnected by gateways.

There's no way you can use DNS names for routing. Any name needs to be translated into an IP address before you can send a request. You can consider DNS as an add-on or overlay. Basically, it's only there for us humans, so we don't have to remember IP addresses (not entirely true, but let's leave it at that here).
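
To make the "translate first, then send" step concrete, here is a minimal sketch using Python's standard `socket` module. The helper name `resolve` is made up for illustration; `socket.getaddrinfo` simply asks the operating system's resolver, which is roughly what a browser does before it opens a connection.

```python
import socket


def resolve(name, port=80):
    """Return the distinct IP addresses the OS resolver reports for `name`."""
    infos = socket.getaddrinfo(name, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (ip, port) for IPv4 and (ip, port, flow, scope) for IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling `resolve("example.com")` on a machine with Internet access would return whatever addresses are currently published for that name; the packets themselves only ever carry those addresses, never the name.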

In that case, how would DNS convert a domain name into an IP address?

There's a large number of ways to do load balancing. You can resolve a single DNS name to multiple IP addresses, use a load balancer in front of a server cluster, distribute web data objects across multiple named servers, distribute servers with the same IP addresses geographically (anycast), and many more, in almost any combination.
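
As an illustration of the first option only (one name, several addresses), here is a toy model of round-robin DNS. The class name and the addresses (taken from the documentation range 192.0.2.0/24) are assumptions for the sketch, and real name servers and resolvers may order or reorder answers differently.

```python
from collections import deque


class RoundRobinRecord:
    """Toy model of one name with several address records, answered in rotating order."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        """Return every address, then rotate so the next answer starts with a different one."""
        current = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return current
```

Because every answer contains all addresses and only the order changes, clients that pick the first entry end up spread across the servers over time.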

If DNS chooses to return one IP address over another when the domain name is given to it, how would the client obtain access to the part of the website that is not "on that particular computer server"?

DNS simply provides records for name resolution; the client has to make sense of them. If a name resolves to multiple IP addresses, the client usually just picks the first one or a random one. The server(s) behind each address must be able to fulfill any reasonable request. There's no "dunno, ask another one" reply; the client only tries another address if it is unable to make a connection.
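
The client-side behaviour described here can be sketched roughly as follows. `connect_to_any` and its `try_connect` parameter are hypothetical names; the connect function is passed in so the logic can be shown without touching the network.

```python
def connect_to_any(addresses, try_connect):
    """Try each address in order and return (ip, connection) for the first that works.

    `try_connect` is any callable that takes an IP string and either returns
    a connection object or raises OSError when the connection fails.
    """
    last_error = None
    for ip in addresses:
        try:
            return ip, try_connect(ip)
        except OSError as error:
            last_error = error  # this address is unreachable, move on to the next one
    raise ConnectionError(f"no address reachable: {addresses}") from last_error
```

In a real client, `try_connect` could be something like `lambda ip: socket.create_connection((ip, 80), timeout=3)`. Note that the fallback only happens on a failed connection; a server that answers is expected to handle the request itself.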

Distributed databases are something completely different altogether. Basically, they are not queried/used by the client directly but by the server backend.
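
To show why DNS does not need to know where a piece of data lives, here is a sketch of a backend picking a database shard on its own. Hash-based sharding is just one common scheme chosen for illustration, and the host names in `SHARDS` are invented; nothing here describes how any particular site does it.

```python
import hashlib

# Hypothetical internal database hosts; the client never sees these names.
SHARDS = ["db-0.internal", "db-1.internal", "db-2.internal"]


def shard_for(key, shards=SHARDS):
    """Map a record key (for example a ticker symbol) to one backend shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Any web server that receives the request can run `shard_for("AAPL")` and query the right database, so every address returned by DNS is equally able to serve the page.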

Score:0

Why is it necessary for the domain names to be converted to IP addresses for a server to be accessed by the client? My hypothesis is that it has to do with the fact that a single domain name can "map to" multiple computers and hence IP addresses and so it is necessary to identify each computer separately. Also, perhaps it allows for "nodes" (computer networks) on the "global" network to be identified in a more standardized way (as computers that do not act as web servers do not have domain names). In any case, a more thorough explanation would be much appreciated.

It doesn't have anything to do with domain names being able to map to different IPs. That's just how the Internet is built, like @Zac67 says.

Every computer that is on the Internet has a public IP address assigned to it. For a computer to send a packet of data to another, it has to know the IP address of the computer it is sending the packet to (the IP is stored inside every packet sent).

When you access a website via a domain name, for example, you are basically sending packets to it that say "hey, give me the data for this webpage", so you need to know its IP address to do so in the first place.

In that case, how would DNS convert a domain name into an IP address?

Name resolution works on several levels. You have a lookup table on your own computer (the hosts file; try to find it on your computer). Your ISP will provide a set of DNS servers for you too, and you can also specify your own DNS servers on your computer or router if you know how.

Some of these lookup tables take precedence over others. Your local hosts file, for example, normally takes precedence over everything else (so you can use it to point a name wherever you want).
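
The precedence idea can be sketched like this, assuming the usual default order (hosts file first, DNS second; some systems let you configure this differently). `parse_hosts`, `lookup` and the `dns_lookup` callable are hypothetical names used only for the example.

```python
def parse_hosts(text):
    """Parse hosts-file text ("IP name [alias ...]" per line) into a name -> IP table."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)  # the first entry for a name wins
    return table


def lookup(name, hosts_table, dns_lookup):
    """Resolve `name`, letting a local hosts entry win over the DNS answer."""
    name = name.lower()
    if name in hosts_table:
        return hosts_table[name]
    return dns_lookup(name)
```

With an entry such as `192.0.2.10 example.test` in the hosts text, `lookup("example.test", ...)` returns `192.0.2.10` without ever asking DNS, which is exactly how a local override redirects a name.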

If DNS chooses to return one IP address over another when the domain name is given to it, how would the client obtain access to the part of the website that is not "on that particular computer server"?

This is quite complex. Usually, big websites split up their computers into application servers (which don't store any data themselves) and CDN servers / databases (which store data such as photos or, say, your personal data). The DNS usually points to the application servers, which run the code needed to generate the webpage. That code then 1) accesses data from the databases to serve, and 2) tells the client (your browser) to retrieve media from the CDN servers.

These databases and CDN servers have ways to synchronise their data across different servers, and they are specially built for that. If you want to see an example of a CDN, just look at the URL for any image on Facebook.

Then, if a client makes a request to the server using a domain name, how would the DNS know to return the IP address of the particular computer that contains the information that a user is looking for?

See the above point.
