Caching is the process of storing copies of files in a cache, or temporary storage location, so that they can be accessed more quickly. Technically, a cache is any temporary storage location for copies of files or data, but the term is often used in reference to Internet technologies. Web browsers cache HTML files, JavaScript, and images in order to load websites more quickly, while DNS servers cache DNS records for faster lookups and CDN servers cache content to reduce latency.
To understand how caches work, consider real-world caches of food and other supplies. When explorer Roald Amundsen returned from the South Pole in 1912, he and his men subsisted on the caches of food they had stored along the way. This was much more efficient than waiting for supplies to be delivered from their base camp as they traveled. Caches on the Internet serve a similar purpose: they temporarily store the ‘supplies’, or content, that users need for their journey across the web.
What does a browser cache do?
Every time a user loads a webpage, their browser has to download quite a lot of data in order to display that webpage. To shorten page load times, browsers cache most of the content that appears on the webpage, saving a copy of the webpage’s content on the device’s hard drive. This way, the next time the user loads the page, most of the content is already stored locally and the page will load much more quickly.
Browsers store these files until their time to live (TTL) expires or until the hard drive cache is full. (TTL is an indication of how long content should be cached.) Users can also clear their browser cache if desired.
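The two limits described above (a TTL per item and a maximum cache size) can be illustrated with a small sketch. This is not how any particular browser implements its cache; the class name TTLCache, the max_entries limit, and the choice to evict the least recently used entry when the cache is full are all assumptions made for illustration.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Toy cache: entries expire after their TTL, and the least recently
    used entry is evicted when the cache is full (an assumed policy)."""

    def __init__(self, max_entries, clock=time.monotonic):
        self.max_entries = max_entries
        self.clock = clock  # injectable so expiry can be simulated
        self._entries = OrderedDict()  # key -> (value, expires_at)

    def put(self, key, value, ttl_seconds):
        self._entries.pop(key, None)  # replace any older copy
        if len(self._entries) >= self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        self._entries[key] = (value, self.clock() + ttl_seconds)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None  # never cached, evicted, or cleared
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # TTL expired: drop the stale copy
            return None
        self._entries.move_to_end(key)  # mark as recently used
        return value

    def clear(self):
        """Equivalent of the user clearing the cache."""
        self._entries.clear()
```

A call such as cache.put("/logo.png", image_bytes, ttl_seconds=3600) would keep the image for up to an hour, unless it is evicted earlier to make room or the user clears the cache.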
What does clearing a browser cache accomplish?
Once a browser cache is cleared, every webpage that loads will load as if it were the first time the user has visited the page. If something loaded incorrectly the first time and was cached, clearing the cache can allow it to load correctly. However, clearing one’s browser cache can also temporarily slow page load times, because the browser must download all of the content again.
What is CDN caching?
A CDN, or content delivery network, caches content (such as images, videos, or webpages) in proxy servers that are located closer to end users than origin servers. (A proxy server is a server that receives requests from clients and passes them along to other servers.) Because the servers are closer to the user making the request, a CDN is able to deliver content more quickly.
Think of a CDN as being like a chain of grocery stores: Instead of going all the way to the farms where food is grown, which could be hundreds of miles away, shoppers go to their local grocery store, which still requires some travel but is much closer. Because grocery stores stock food from faraway farms, grocery shopping takes minutes instead of days. Similarly, CDN caches ‘stock’ the content that appears on the Internet so that webpages load much more quickly.
When a user requests content from a website that uses a CDN, the CDN fetches that content from an origin server and then saves a copy of it for future requests. Cached content generally remains in the CDN cache as long as users continue to request it and its TTL has not expired.
What is a CDN cache hit? What is a cache miss?
A cache hit occurs when a client device makes a request to the cache for content, and the cache has that content saved. A cache miss occurs when the cache does not have the requested content.
A cache hit means that the content will be able to load much more quickly, since the CDN can immediately deliver it to the end user. In the case of a cache miss, a CDN server will pass the request along to the origin server, then cache the content once the origin server responds, so that subsequent requests will result in a cache hit.
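The hit and miss paths described above can be sketched in a few lines. This is a simplified illustration, not a real CDN implementation: EdgeCache and fake_origin are hypothetical names, the origin is a plain function standing in for a network request, and expiry is left out to keep the focus on hits and misses.

```python
class EdgeCache:
    """Toy CDN cache that fetches from the origin only on a miss."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # stand-in for an origin request
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            self.hits += 1
            return self._store[url], "HIT"  # served directly from the cache
        self.misses += 1
        body = self.fetch_from_origin(url)  # pass the request to the origin
        self._store[url] = body  # cache it so the next request is a hit
        return body, "MISS"


origin_calls = []


def fake_origin(url):
    """Hypothetical origin server: records each request it receives."""
    origin_calls.append(url)
    return f"content of {url}"


edge = EdgeCache(fake_origin)
first = edge.get("/index.html")   # miss: the origin is contacted
second = edge.get("/index.html")  # hit: the origin is not contacted again
```

After both requests, origin_calls contains a single entry, which is the point of the cache: only the first request reaches the origin server.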
Where are CDN caching servers located?
CDN caching servers are located in data centers all over the globe. Cloudflare has CDN servers in 300 cities spread throughout the world in order to be as close as possible to the end users accessing the content. A location where CDN servers are present is also called a point of presence (PoP).
How long does cached data remain in a CDN server?
When origin servers respond to CDN servers with the requested content, they attach the content’s TTL as well, letting the CDN servers know how long to store it. The TTL is carried in the HTTP headers of the response (for example, the max-age directive of the Cache-Control header gives the duration in seconds). When the TTL expires, the cache treats the content as stale and removes or refreshes it. Some CDNs will also purge files from the cache early if the content is not requested for a while, or if a CDN customer manually purges certain content.
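As an illustration of reading a TTL from a response header, the sketch below extracts a duration from a Cache-Control header value. It is deliberately simplified: the function name ttl_from_cache_control is hypothetical, only the no-store, s-maxage, and max-age directives are considered, and a real cache also takes into account other directives and headers that are ignored here.

```python
def ttl_from_cache_control(header_value):
    """Return a TTL in seconds from a Cache-Control value, or None if absent.

    Simplified sketch: only no-store, s-maxage, and max-age are handled.
    """
    directives = {}
    for part in header_value.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value.strip('"')

    if "no-store" in directives:
        return 0  # the response must not be cached at all

    # s-maxage applies to shared caches such as CDNs and takes priority
    # over max-age when both are present.
    for name in ("s-maxage", "max-age"):
        value = directives.get(name, "")
        if value.isdigit():
            return int(value)
    return None  # no explicit TTL in this header


one_hour = ttl_from_cache_control("public, max-age=3600")
```

Here one_hour is 3600, meaning the content may be kept for an hour before it is considered expired.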
How do other kinds of caching work?
DNS caching takes place on DNS servers. The servers store recent DNS lookups in their cache so that they do not have to query nameservers and can instantly reply with the IP address of a domain.
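The same idea can be sketched for DNS. In the example below, CachingResolver is a hypothetical class and upstream_lookup is an assumed function that stands in for querying nameservers and returns an IP address together with the record's TTL in seconds; it is not a real DNS client.

```python
import time


class CachingResolver:
    """Toy DNS cache: answers from memory until a record's TTL expires."""

    def __init__(self, upstream_lookup, clock=time.monotonic):
        # upstream_lookup is a hypothetical callable: name -> (ip, ttl_seconds)
        self.upstream_lookup = upstream_lookup
        self.clock = clock  # injectable so expiry can be simulated
        self._records = {}  # name -> (ip, expires_at)

    def resolve(self, name):
        cached = self._records.get(name)
        if cached is not None and self.clock() < cached[1]:
            return cached[0]  # answered from the cache, no nameserver query
        ip, ttl_seconds = self.upstream_lookup(name)  # query the nameservers
        self._records[name] = (ip, self.clock() + ttl_seconds)
        return ip
```

Repeated calls to resolve for the same domain within the TTL return the stored address without contacting the upstream nameservers again.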
Search engines may cache webpages that frequently appear in search results, so that users can still view a page even if the website hosting it is temporarily down or unable to respond.
How does Cloudflare use caching?
Cloudflare offers a CDN with 300 PoPs distributed internationally. Cloudflare offers free CDN caching services, while paid CDN customers are able to customize how their content is cached. The network is Anycast, meaning the same content can be delivered from any of these data centers. A user in London and a user in Sydney can both view the same content loaded from CDN servers only a few miles away.
source: https://www.cloudflare.com/learning/cdn/what-is-caching/