Centrally hosted content must travel over many networks to reach end users.
Every network adds latency and decreases reliability.
Specific reasons include:
- Peering Point Congestion: Peering points are where different networks exchange traffic. They tend to become bottlenecks because network operators make most of their money elsewhere.
- Inefficient Routing Protocols: Internet routing protocols try to minimize the number of hops but do not take into account the latency or reliability of each hop.
- Unreliable Networks: Some networks are more reliable than others. Undersea cables and regional networks in Southeast Asia and the Middle East are called out as troublesome.
- TCP is slow when packet loss is high: HTTP runs over TCP, and the points above make packet loss common. Each lost packet triggers a retransmission, and TCP's congestion control throttles the sending rate in response.
- Scalability: Origin servers, that is, the servers that originate content, get slower as more traffic is sent to them.
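To make the TCP point concrete, here is a rough sketch using the well-known Mathis approximation, which bounds steady-state TCP throughput by (MSS / RTT) * (C / sqrt(p)). The model is not taken from the source, and the segment size, round-trip times, and loss rates below are illustrative assumptions only.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Approximate upper bound on steady-state TCP throughput, in bits per second.

    Uses the Mathis approximation with C = sqrt(3/2). This is a simplified
    model, not a measurement.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    c = math.sqrt(3 / 2)
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


# Assumed numbers: a nearby user (10 ms RTT, 0.01% loss) versus a
# distant user (200 ms RTT, 1% loss), both with a 1460-byte segment size.
near_bps = mathis_throughput_bps(1460, 0.010, 0.0001)
far_bps = mathis_throughput_bps(1460, 0.200, 0.01)

print(f"near: {near_bps / 1e6:.1f} Mbit/s")
print(f"far:  {far_bps / 1e6:.2f} Mbit/s")
print(f"ratio: {near_bps / far_bps:.0f}x")
```

Under this model, throughput falls in proportion to round-trip time and to the square root of the loss rate, so distance and loss compound each other.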
For these reasons, the time to download a four-gigabyte video over the internet ranges from 12 minutes within 100 miles of the origin to 20 hours on the other side of the globe.
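Those two download times can be turned into effective throughput with a little arithmetic. The sketch below assumes "four gigabyte" means 4 x 10^9 bytes; the function name is made up for illustration.

```python
def effective_throughput_mbps(size_bytes: int, seconds: float) -> float:
    """Average throughput in megabits per second for a completed download."""
    return size_bytes * 8 / seconds / 1e6


VIDEO_BYTES = 4 * 10**9  # assumption: decimal gigabytes

near_mbps = effective_throughput_mbps(VIDEO_BYTES, 12 * 60)      # 12 minutes
far_mbps = effective_throughput_mbps(VIDEO_BYTES, 20 * 60 * 60)  # 20 hours

print(f"near the origin: {near_mbps:.1f} Mbit/s")
print(f"across the globe: {far_mbps:.2f} Mbit/s")
print(f"slowdown: {near_mbps / far_mbps:.0f}x")
```

With these figures the distant user sees roughly one hundredth of the throughput of the nearby user.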
Akamai originally addressed this problem with the first commercial Content Delivery Network (CDN), which cached static site content at the edges of the network.
More specifically, Akamai's static content CDN:
- Controls DNS servers so that hostnames resolve to Akamai caches close to the user.
- Serves content locally from those caches on a cache hit, avoiding the issues associated with traversing the internet. On a cache miss, the cache proxies the request to the origin server.
- Originally consisted of just DNS servers and caching HTTP proxies. On a cache miss, the HTTP proxy simply forwarded the HTTP request to the origin server and saved the origin server's HTTP response to replay later.
- Favors many smaller proxying sites over fewer, beefier sites in order to get as close to the user as possible. The closer Akamai gets the content to the user, the less latency is affected by the issues with traversing the internet: fewer hops mean less inter-network peering congestion, and so on.
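The DNS-plus-caching-proxy design described above can be sketched in a few lines. This is a toy model, not Akamai's implementation: `EdgeCache`, `resolve`, and the one-dimensional "position" used to pick the nearest edge are all hypothetical, and a real cache would also honor expiry and HTTP cache-control headers.

```python
from typing import Callable, Dict, List


class EdgeCache:
    """A caching HTTP proxy at one edge site (hypothetical model)."""

    def __init__(self, position: float, fetch_from_origin: Callable[[str], bytes]) -> None:
        self.position = position
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Cache hit: serve locally, no trip across the internet.
            self.hits += 1
            return self._store[url]
        # Cache miss: forward to the origin and save the response for replay.
        self.misses += 1
        body = self._fetch_from_origin(url)
        self._store[url] = body
        return body


def resolve(user_position: float, edges: List[EdgeCache]) -> EdgeCache:
    """Stand-in for the DNS step: map a user to the closest edge cache."""
    return min(edges, key=lambda edge: abs(edge.position - user_position))


origin_requests: List[str] = []


def origin(url: str) -> bytes:
    """Stand-in origin server that records every request it receives."""
    origin_requests.append(url)
    return f"content of {url}".encode()


edges = [EdgeCache(0.0, origin), EdgeCache(100.0, origin)]

# Two users near the second edge request the same article.
first = resolve(98.0, edges).get("/news/article.html")
second = resolve(103.0, edges).get("/news/article.html")

print(len(origin_requests), edges[1].misses, edges[1].hits)  # prints "1 1 1"
```

The origin sees a single request even though two users fetched the article, which is the effect the static CDN relies on.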
The main limitation of the classical content delivery network is that it's restricted to static content. News articles that are read by thousands of users work well with classical CDNs. Your Facebook page, which is read only by you, does not.
As content becomes more dynamic and more personalized, more sophisticated approaches become necessary.
So how has Akamai adapted to changes in internet content?
For static content, they've done two big things.
First, they have added support for edge caching of video streams. Video streams are still fundamentally static content, but the protocols are different.
Second, they have added support for multi-level caches to further reduce load on origin servers. Instead of going straight to the origin on a miss, edge caches hit intermediate caches, which in turn hit origin servers.
But how does Akamai handle dynamic or non-cacheable content? Here Akamai employs two strategies.
First, they try to speed up the link between the non-cacheable origin and the user. This basically boils down to better routing protocols that minimize hop latency and maximize hop reliability.
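As a rough illustration of why hop count and latency can disagree, the sketch below runs Dijkstra's algorithm over a tiny made-up topology with two different cost functions. The node names and latencies are invented, and this is not a description of how Akamai's routing overlay or any real routing protocol is implemented.

```python
import heapq
from typing import Callable, Dict, List, Tuple

# Adjacency map: node -> {neighbor: link latency in milliseconds}.
Graph = Dict[str, Dict[str, float]]


def best_path(
    graph: Graph, src: str, dst: str, cost: Callable[[float], float]
) -> Tuple[float, List[str]]:
    """Dijkstra's shortest path, where cost() maps a link latency to a weight."""
    queue: List[Tuple[float, str, List[str]]] = [(0.0, src, [src])]
    visited = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == dst:
            return total, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, latency in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (total + cost(latency), neighbor, path + [neighbor]))
    raise ValueError(f"no path from {src} to {dst}")


# Hypothetical topology: one slow direct link, or three fast links via relays.
network: Graph = {
    "user": {"origin": 300.0, "relay1": 20.0},
    "relay1": {"relay2": 20.0},
    "relay2": {"origin": 20.0},
    "origin": {},
}

fewest_hops = best_path(network, "user", "origin", cost=lambda _latency: 1.0)
lowest_latency = best_path(network, "user", "origin", cost=lambda latency: latency)

print(fewest_hops)     # (1.0, ['user', 'origin'])
print(lowest_latency)  # (60.0, ['user', 'relay1', 'relay2', 'origin'])
```

The hop-minimizing choice takes the single 300 ms link, while the latency-aware choice takes three hops totalling 60 ms. A reliability term, such as a penalty for lossy links, could be folded into the cost function in the same way.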
Second, they "push application logic" out from the origin server to the edge. This is a limited form of edge serving of dynamic content. It's particularly useful for applications that generate dynamic views of relatively static data. Applications that have to modify data still have to traverse the internet for writes. If eventual consistency is not appropriate, applications also have to traverse the internet for reads.
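The trade-off in that paragraph can be sketched with a toy edge application that renders personalized views from a local replica and forwards writes to the origin. `Origin`, `EdgeApp`, and the refresh step are hypothetical names for illustration, not Akamai's API.

```python
from typing import Dict


class Origin:
    """Stand-in for the centralized source of truth."""

    def __init__(self) -> None:
        self.data: Dict[str, int] = {}
        self.writes = 0

    def write(self, key: str, value: int) -> None:
        self.data[key] = value
        self.writes += 1

    def snapshot(self) -> Dict[str, int]:
        return dict(self.data)


class EdgeApp:
    """Application logic running at the edge over a local, possibly stale replica."""

    def __init__(self, origin: Origin) -> None:
        self._origin = origin
        self._replica = origin.snapshot()

    def render(self, user: str, item: str) -> str:
        # Dynamic, per-user view built from relatively static data: no origin trip.
        return f"Hello {user}, {item} costs {self._replica[item]}"

    def update(self, item: str, price: int) -> None:
        # Writes still have to traverse the internet to the origin.
        self._origin.write(item, price)

    def refresh(self) -> None:
        # Periodic re-sync; until it runs, reads are only eventually consistent.
        self._replica = self._origin.snapshot()


origin = Origin()
origin.write("widget", 10)
edge = EdgeApp(origin)

before = edge.render("ann", "widget")
edge.update("widget", 12)
stale = edge.render("ann", "widget")   # replica has not been refreshed yet
edge.refresh()
after = edge.render("ann", "widget")

print(before, stale, after, sep="\n")
```

An application that cannot tolerate the stale read in the middle would have to read from the origin instead, which is the case where reads also traverse the internet.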
- As internet content becomes more dynamic and personalized, content delivery networks like Akamai can either accelerate delivery of dynamic content or die. If you could pick any cloud computing company or platform to solve this problem, which company would you pick? Answers like 'Google because they have better programmers' are not appropriate. This is a technology question.
- How are other cloud platforms like Amazon EC2 and Google AppEngine similar to Akamai?
- If applications hosted on Akamai's edge have to share state or update a centralized 'source of truth', what are their options?