Today, many computers run useful processes that other people access from their own devices. Examples of such processes are the Google search engine, the Amazon online store, your favorite flight operator’s booking service and so on. How does your computer know where these processes are running?
Where is my coffee?
You are in Coffee Land on a business visit and feel like drinking coffee after a long day at work. You want to go to a restaurant. After spending some time consulting your colleagues, you decide to walk into Quality Coffee.
You are told Quality Coffee is at No 4, Coffee Bean Avenue, Roasted County, Coffee Land. You go there, order coffee, drink it and leave satisfied (possibly after paying in case you don’t want to be arrested).
With some reflection, you can see that the address is more useful to you when you need to navigate to the correct shop and order coffee. However, the product that you sought was Quality Coffee. When you later talk about this to your other colleagues back home, you would say you visited Quality Coffee and not No 4,…. The shop can move to a different place, but the service that you experienced is tied to the brand and it would remain the same.
Here we have two pieces of information - Name and Address. You seek the services of Name. You need to know its current Address to actually get the services you need.
The Name is generally the brand associated with the service. In the computer world, the address would be the IP Address.
Recall that an IP Address is the unique identifier for a machine. There are 2 versions of it: IPv4 and IPv6. An IPv4 address looks like this: 10.0.0.1.
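To make the two versions concrete, here is a small sketch using Python's standard `ipaddress` module. The IPv6 address is an assumption added for illustration (it comes from the range reserved for documentation) and is not part of the original example.

```python
import ipaddress

# The IPv4 address from the text, plus an illustrative IPv6 address.
# 2001:db8::/32 is reserved for documentation; it is an assumption here.
v4 = ipaddress.ip_address("10.0.0.1")
v6 = ipaddress.ip_address("2001:db8::1")

for addr in (v4, v6):
    # .version is 4 or 6; .max_prefixlen is the address size in bits
    print(f"{addr} is IPv{addr.version} ({addr.max_prefixlen} bits)")
```

An IPv4 address is 32 bits long, while an IPv6 address is 128 bits long, which is why the two look so different when written down.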
OK, let’s get to YouTube
Getting to YouTube is similar to walking into a restaurant to order coffee. YouTube is the service provider with the brand. The restaurant is your browser. Coffee is the video (or the service provided by the service provider.)
To provide this service,
- Software engineers at YouTube run programs on their computers that make video content searchable and playable.
- They then advertise their IP address to everyone.
- You go to the browser and type in the IP address of YouTube.
- You are presented with a web page, which is an interface for searching and playing videos.
I hate memorizing numbers!
Everyone does!
As of writing this article, one of the IP addresses of the YouTube service is 74.125.68.93.
It is impractical to remember the IP address of every service you use. These IP addresses also change far more often than real-world addresses do. It would be a nightmare for both the YouTube engineers and the end users to keep track of this.
Enter the Domain Name System, or DNS for short. DNS solves the problem of remembering arbitrary numbers for online services by giving each service a more memorable Domain Name such as www.youtube.com. The engineers at YouTube can simply say: if you want to search and watch videos, head over to YouTube (the name or brand). And hey, you can find YouTube at www.youtube.com (the address).
For most people, www.youtube.com is more memorable than 74.125.68.93. This also solves the problem of moving IP addresses. When the IP address changes to, let’s say, 74.125.68.100, we can simply update the domain name system so that the domain name www.youtube.com points to the new address. This change is silent and there is no impact on what the end user types in their browser.
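The indirection described above can be sketched with a plain dictionary. This is a toy model only: the `directory` dictionary and the `lookup` function are hypothetical names for illustration, and real DNS is far more elaborate.

```python
# A toy "domain name system": just a dictionary from names to addresses.
# The addresses are the ones quoted in the text.
directory = {"www.youtube.com": "74.125.68.93"}


def lookup(name: str) -> str:
    """Return the address currently registered for a name."""
    return directory[name]


before = lookup("www.youtube.com")

# The service moves: only the directory entry changes.
directory["www.youtube.com"] = "74.125.68.100"

after = lookup("www.youtube.com")

# The user asked for the same name both times.
print(before, "->", after)
```

The caller never needs to know that the address changed, because it only ever refers to the name.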
I still have some questions
If you’ve come this far, you should have a grasp of what a domain name is and know that there is a system called the domain name system that translates domain names to IP addresses. If you are curious to know how the domain name system is set up to do this translation, read on!
When you open your browser and type in www.youtube.com, the browser first queries the domain name system to find out the IP address. This process is called DNS address resolution. Once the address is resolved, the browser establishes a connection and can send and receive meaningful data (in this case, watch videos).
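As a rough illustration of the resolution step, the sketch below asks the operating system's resolver for the addresses of a host using `socket.getaddrinfo` from Python's standard library. The helper name `lookup_addresses` and the choice of port 443 are assumptions made for this example; a browser's actual lookup path may differ.

```python
import socket


def lookup_addresses(hostname: str) -> list[str]:
    """Ask the operating system's resolver for the addresses of a host."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})


# An IP literal needs no lookup, so this call works even offline.
print(lookup_addresses("127.0.0.1"))
```

With a network connection, `lookup_addresses("www.youtube.com")` would return whatever addresses your resolver currently reports. They will likely differ from the ones quoted in this article.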
It might seem straightforward to maintain a single directory that maps each domain name to its IP address. But this poses a few challenges.
- Scalability: There are a lot of websites, and querying a single directory can be computationally expensive. If a lot of users are querying to resolve the address of www.youtube.com, this can negatively impact other users who are trying to resolve less frequently queried domain names such as www.mysite.com. The owner of www.youtube.com should somehow be accountable for the traffic that queries for their domain bring to the domain name system. Sites can also have multiple sub-domains such as www.blog.mysite.com, so there isn’t a strict one-to-one mapping between domain names and website owners.
- Security: Coordinating updates from different domain name owners securely is challenging. For example, I should not be able to update the address of www.yoursite.com while updating www.mysite.com.
- Conditional resolution: Often, a service is hosted in multiple locations for better user experience (think multiple branches of Quality Coffee set up in different cities so that the brand is more accessible). The YouTube IP addresses shown above are probably the ones on a computer physically closest to my house so that my computer has a fast connection to YouTube. So the central directory would now have to keep track of different locations and determine where the lookup request is coming from.
- Cost: Solving any of the above problems would require more money to set up this domain name system directory: more disk space, computing power (to serve domain name queries), power supply, cooling systems and so on.
Delegated resolution
Instead of storing all the domain names in a single directory (called a DNS server), there are separate DNS servers for each sub-domain level. As an example, imagine a separate DNS server for youtube.com that serves all DNS requests for names ending in youtube.com, for example:
www.youtube.com
gaming.youtube.com
about.youtube.com
photography.youtube.com
This distributes the DNS traffic and it is now up to the owners of the domain name to securely manage and scale their DNS servers.
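One way to picture delegation is as a lookup of the longest registered suffix of a name. The sketch below is a toy model: the `ZONE_SERVERS` table, the server names in it and the `responsible_server` function are all hypothetical and exist only for illustration.

```python
from typing import Optional

# Hypothetical table: which DNS server is responsible for which domain.
# The server names are made up for illustration.
ZONE_SERVERS = {
    "youtube.com": "dns.youtube.example",
    "mysite.com": "dns.mysite.example",
}


def responsible_server(name: str) -> Optional[str]:
    """Return the server for the longest registered suffix of `name`."""
    labels = name.lower().rstrip(".").split(".")
    # Try the full name first, then drop labels from the left.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in ZONE_SERVERS:
            return ZONE_SERVERS[suffix]
    return None


for host in ("www.youtube.com", "gaming.youtube.com", "www.blog.mysite.com"):
    print(host, "->", responsible_server(host))
```

Matching whole labels rather than raw string endings matters here: a name like notyoutube.com ends with the characters "youtube.com" but does not belong to that domain.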
But there is still a problem. How do I figure out which DNS server is responsible for youtube.com?
There is a set of reserved suffixes with which a domain name can end, for example com, in, us, org and net. These are called Top Level Domains (TLDs).
Each one of these has its own DNS server operated by an organization or a country. The owner of youtube would set up a DNS server that resolves the IP addresses of all their sub-domains ending in youtube.com. Then they would get hold of the operator of the com domain and ask them to add a record for youtube pointing to the DNS server that they just set up.
So if I know the DNS server of com, I can reach the DNS server for youtube.com. But wait, how do I figure out the DNS server for com?
All the top-level domains are listed in another set of DNS servers called the Root DNS Servers. The root zone they serve is managed by the Internet Assigned Numbers Authority (IANA), while the servers themselves are operated by a number of independent organizations. The list of root DNS server addresses ships with DNS resolver software, so it does not need to be looked up.
Now a typical name resolution works like this:
- You enter www.youtube.com in the browser.
- Your browser consults the Root DNS server to find where the com DNS server is.
- Your browser consults the com DNS server to find out where youtube is.
- Your browser consults the youtube DNS server to find out where www.youtube.com is.
- Your browser gets the IP address and makes a connection to the website.
Notice how the domain name is parsed from right to left, from the more generic to the more specific. If you think about it, this is similar to how we parse location addresses. When you needed to know where No 4, Coffee Bean Avenue, Roasted County, Coffee Land is, you would parse it from right to left as:
- Go to Coffee Land.
- Find Roasted County inside Coffee Land.
- Find Coffee Bean Avenue inside Roasted County.
- Find No 4 inside Coffee Bean Avenue.
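The right-to-left walk through the domain name can be sketched with nested dictionaries standing in for the DNS servers. This is a toy model under stated assumptions: the three dictionaries and the `resolve_from_root` function are hypothetical, and real DNS servers exchange network messages rather than dictionary lookups.

```python
# Toy model of the DNS hierarchy. Each "server" is a dict that either
# refers us to a more specific server or holds the final address.
# The structure is illustrative; the address is the one quoted in the text.
YOUTUBE_SERVER = {"www": "74.125.68.93"}
COM_SERVER = {"youtube": YOUTUBE_SERVER}
ROOT_SERVER = {"com": COM_SERVER}


def resolve_from_root(name: str) -> tuple[str, list[str]]:
    """Walk from the root towards the address, one label at a time."""
    labels = name.rstrip(".").split(".")
    current = ROOT_SERVER
    path = []
    # Right to left: "com", then "youtube", then "www".
    for label in reversed(labels):
        path.append(label)
        current = current[label]  # KeyError if the name does not exist
    if not isinstance(current, str):
        raise LookupError(f"{name} names a server, not a host")
    return current, path


address, path = resolve_from_root("www.youtube.com")
print(" -> ".join(path), "=>", address)
```

Each step only needs to know about the next, more specific level, just as finding Roasted County only requires being in Coffee Land first.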
If you are still here, you should definitely head over to YouTube and listen to this and relax.