Facebook’s detour through China and Korea
Many of you will remember the story from about a year ago, when we reported that a Chinese network was announcing a significant part of the prefixes on the Internet. Networks affected by that incident included big names such as dell.com and cnn.com, as well as U.S. government (.gov) and military (.mil) sites, including those for the Senate.
This week a similar story appeared. Although the number of networks involved was low, it did affect one of the world's most popular websites: Facebook!
Barrett Lyon reported on his blog that he noticed AT&T routing traffic to Facebook through a Chinese network (China Telecom, AS4134). In this blog post we will take a closer look at what exactly happened.
The raw data
By analyzing the raw data we can indeed see BGP announcements for several of Facebook's prefixes with rather long and odd-looking ASpaths. We also see several US-based providers, such as AT&T, that selected a path through SK Telecom (Hanaro Telecom South Korea) and China Telecom.
It all seems to have started on Tuesday, March 22, at 07:15:02 GMT. That's when we see the first announcement for one of Facebook's networks routed through the Korean provider SK Telecom. The last announcement via SK Telecom was on Tuesday, 22 Mar 2011, at 16:11:02 GMT, so the incident lasted about 9 hours in total.
The impact of this incident varied per provider. Some networks didn't use the path through Korea at all, some used it for only a few of Facebook's prefixes, and some used it for (almost) all of Facebook's prefixes.
Two Japanese providers that peer directly with SK Telecom (Korea) routed Facebook networks through AS9318, SK Telecom (Hanaro Telecom South Korea).
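As a quick check of the "about 9 hours" figure, the two timestamps above can be subtracted directly. This is only a small sketch; the variable names are ours and are not part of the original data.

```python
from datetime import datetime, timezone

# Timestamps of the first and last announcements via SK Telecom,
# as reported above (GMT, treated here as UTC).
first_seen = datetime(2011, 3, 22, 7, 15, 2, tzinfo=timezone.utc)
last_seen = datetime(2011, 3, 22, 16, 11, 2, tzinfo=timezone.utc)

duration = last_seen - first_seen
print(duration)  # 8:56:00
```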
The providers affected by this were:
Internet Initiative Japan Inc. (IIJ) ASpath: 2497 9318 32934 32934 32934
KDDI ASpath: 7660 2516 9318 32934 32934 32934
Several other large providers, such as Savvis, AT&T, Tinet, KPN, Telia, Qwest, AboveNet and Telecom Italia, were also affected (this list is not exhaustive; there are several more). In these cases, however, only two prefixes were affected.
All of these providers learned this announcement via AS4134, which is China Telecom. In all cases Facebook's AS (32934) appeared three times in the ASpath. Different peers in different locations saw the announcements; the common part of the ASpath was:
4134 9318 32934 32934 32934
This shows that the announcements were not spoofed by China Telecom, as other networks (IIJ and KDDI) learned the same path directly from SK Telecom as well.
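The reasoning above can be sketched as a simple check for a contiguous segment inside an ASpath. The helper names (`parse_path`, `contains_segment`) are hypothetical and only for illustration; the paths themselves are copied from this post.

```python
def parse_path(text):
    """Turn a space-separated ASpath string into a list of AS numbers."""
    return [int(asn) for asn in text.split()]


def contains_segment(path, segment):
    """Return True if `segment` occurs as a contiguous run inside `path`."""
    n = len(segment)
    return any(path[i:i + n] == segment for i in range(len(path) - n + 1))


# The part of the path originated via SK Telecom, and the same part
# as re-announced by China Telecom (AS4134).
LEAKED = parse_path("9318 32934 32934 32934")
VIA_CHINA = parse_path("4134 9318 32934 32934 32934")

observed = {
    "IIJ": parse_path("2497 9318 32934 32934 32934"),
    "KDDI": parse_path("7660 2516 9318 32934 32934 32934"),
    "AT&T": parse_path("7018 4134 9318 32934 32934 32934"),
}

for name, path in observed.items():
    print(name,
          "via SK Telecom:", contains_segment(path, LEAKED),
          "via China Telecom:", contains_segment(path, VIA_CHINA))
```

IIJ and KDDI match the SK Telecom segment without AS4134 in the path, which is what tells us China Telecom was passing the route on rather than inventing it.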
What exactly did AT&T see at the time of the incident?
This is a snapshot of the AT&T routing table on March 22nd at 8 AM, showing Facebook networks only. Note that only 2 networks were routed through Korea and China.
|184.108.40.206/21||7018 3356 32934 32934|
|220.127.116.11/21||7018 3356 32934 32934|
|18.104.22.168/24||7018 3549 32787 32934|
|22.214.171.124/21||7018 3356 32934 32934|
|126.96.36.199/21||7018 3356 32934 32934|
|188.8.131.52/20||7018 4134 9318 32934 32934 32934|
|184.108.40.206/24||7018 2914 32934 32934|
|220.127.116.11/20||7018 3356 32934 32934|
|18.104.22.168/24||7018 4134 9318 32934 32934 32934|
|22.214.171.124/22||7018 3356 32934 32934|
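To make the snapshot easier to read, here is a rough sketch that tallies the paths by the first AS after AT&T (7018). The ASpaths are copied from the table above with the prefixes left out; the function name `upstream_counts` is our own.

```python
from collections import Counter

# ASpaths from the AT&T snapshot above, in table order (prefixes omitted).
SNAPSHOT_PATHS = [
    "7018 3356 32934 32934",
    "7018 3356 32934 32934",
    "7018 3549 32787 32934",
    "7018 3356 32934 32934",
    "7018 3356 32934 32934",
    "7018 4134 9318 32934 32934 32934",
    "7018 2914 32934 32934",
    "7018 3356 32934 32934",
    "7018 4134 9318 32934 32934 32934",
    "7018 3356 32934 32934",
]


def upstream_counts(paths):
    """Count paths by the AS that AT&T (first AS in the path) hands traffic to."""
    return Counter(int(path.split()[1]) for path in paths)


counts = upstream_counts(SNAPSHOT_PATHS)
for asn, total in counts.most_common():
    print(f"AS{asn}: {total} prefix(es)")
```

With this data, AS3356 (Level3) carries 6 of the 10 prefixes and AS4134 (China Telecom) carries 2.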
As can be seen from the snapshot above, the majority of the prefixes (as seen from AT&T) are routed through Level3 (3356). So what happened to the other 2 prefixes? Why was the path to those networks not through Level3? To answer that question we need to know the path to these networks from Level3's perspective. When looking at the same data source we see the following for 126.96.36.199/20:
|Level3||3356 32934 32934 32934 32934 32934|
|Verizon Business||701 3356 32934 32934 32934 32934 32934|
|Sprint||1239 3356 32934 32934 32934 32934 32934|
This is interesting: it shows that Level3 learned the Facebook prefix with a rather long ASpath in which Facebook's AS appears 5 times. This explains why AT&T chose the path via China and Korea. Even though that path is long (6 ASes as seen from AT&T), it was shorter than the heavily prepended path via Level3, which would be 7 ASes long from AT&T's perspective. As a result, providers that peer with China Telecom and had this shorter alternative would have chosen the path via China.
Global Crossing, another large global transit provider, showed the same behavior as Level3: an ASpath containing Facebook's AS32934 5 times.
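The comparison can be sketched in a few lines. Note the assumptions: the Level3 path as AT&T would see it is derived here by putting 7018 in front of Level3's own path, and we only compare ASpath length. A real BGP decision process looks at other attributes first (local preference in particular), so this sketch only applies if those are equal. The function names are ours.

```python
def as_path_length(path):
    """Number of ASes in a space-separated ASpath, prepends included."""
    return len(path.split())


def origin_repeats(path):
    """How many times the origin AS appears in a row at the end of the path."""
    hops = path.split()
    origin = hops[-1]
    count = 0
    for asn in reversed(hops):
        if asn != origin:
            break
        count += 1
    return count


def prefer_shortest(candidates):
    """Pick the candidate with the shortest ASpath (other attributes assumed equal)."""
    return min(candidates, key=as_path_length)


# Path AT&T actually selected, from the snapshot above.
VIA_CHINA = "7018 4134 9318 32934 32934 32934"
# Assumption: Level3's path with AT&T's own AS (7018) in front of it.
VIA_LEVEL3 = "7018 3356 32934 32934 32934 32934 32934"

for label, path in (("via China", VIA_CHINA), ("via Level3", VIA_LEVEL3)):
    print(label, "length:", as_path_length(path),
          "origin repeats:", origin_repeats(path))

print("selected:", prefer_shortest([VIA_LEVEL3, VIA_CHINA]))
```

The path via China comes out at 6 ASes against 7 via Level3, so a selection based on path length alone picks the detour.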
What could be the reason?
For some reason the paths via providers such as Level3 and Global Crossing, transit networks that are commonly used by the affected providers, were advertised with a very long ASpath. As a result, the providers that have a peering agreement with China Telecom (such as AT&T) started to use the shorter path via China and Korea.
SK Telecom is normally not one of Facebook's transit providers, but most likely just a regular peer. SK Telecom then announced these Facebook prefixes to its peers (IIJ, KDDI and China Telecom), which started to use them.
China Telecom did not filter these Facebook announcements learned via SK Telecom and re-announced them happily to its peers. As a result traffic from several major global providers started using the path through China and Korea to reach some of Facebook’s networks.
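To illustrate the kind of filtering that was missing, here is a minimal sketch of an origin check on routes received from a peer. Everything in it is hypothetical: the `allowed_origins` set is an assumed stand-in for whatever a network would accept from SK Telecom (for example its own AS and its customers), not real policy data from China Telecom or anyone else.

```python
def accept_from_peer(as_path, allowed_origins):
    """Accept a route from a peer only if its origin AS is one we expect from that peer."""
    origin = as_path[-1]
    return origin in allowed_origins


# Hypothetical: origins we would expect SK Telecom (AS9318) to announce to a peer.
# For this sketch we assume only its own AS.
SK_TELECOM_ALLOWED = {9318}

# The Facebook route as announced by SK Telecom during the incident.
leaked_route = [9318, 32934, 32934, 32934]
# A route originated by SK Telecom itself.
own_route = [9318]

print("leaked route accepted:", accept_from_peer(leaked_route, SK_TELECOM_ALLOWED))
print("own route accepted:", accept_from_peer(own_route, SK_TELECOM_ALLOWED))
```

With a check along these lines, the Facebook route (origin AS32934) would have been dropped on the peering session instead of being re-announced.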
Why providers such as Level3 and Global Crossing did not have a shorter path to these two Facebook networks is unknown, but it likely had to do with Facebook's BGP setup or infrastructure.
Keep in mind…
It's likely that both China Telecom and SK Telecom have a presence in the US, so it's not necessarily the case that traffic was actually routed through China. If someone has a traceroute from AT&T or any of the other networks taken at the time of the incident, please post it in the comments; this can help us determine how the actual traffic flowed.
Facebook has a lot of data and does all kinds of CDN (content delivery) tricks to get data to its users. Depending on your location you get a different DNS result and are sent to a different server. This might be the reason why there were no reports of a widespread outage.