Occidental College
Information Resources
Yesterday's Internet Outage
November 21, 2011
On Sunday, November 20th, Oxy experienced an Internet outage lasting about two hours. It was first reported to ITS at around 10:30am by Campus Safety, which was relaying a message from the Library staff. So, what happened? Short answer: there was a problem with some parts of the Internet in Southern California, and Oxy happened to be among the institutions that were affected. The longer answer is available after the break.
Where was the problem?
The problem occurred in the Southern California segment of the CalREN network. CalREN maintains a Los Angeles area hub, reflected on the map of the CalREN backbone. A bad piece of fiber equipment in that hub disrupted the stability of the network. CENIC, the group that maintains CalREN, temporarily routed all traffic through Northern California hubs until the faulty equipment was replaced. According to CENIC, the problems began at 8:07am and were not fully resolved until 7:00pm, though Oxy was only affected for a two-hour period from about 10:30am to 12:30pm.
Wait, CENIC isn't our ISP, is it?
That is correct; our actual ISP is the Los Nettos Regional Network. However, Los Nettos peers with CENIC to pass traffic over to its CalREN network. Peering is a process where ISPs and large networks exchange Internet traffic with other ISPs and Internet backbone providers. The detailed hows and whys are fairly complex, but here's a good explanation for those who are interested. The end result is that even though we aren't strictly CENIC customers, we are still affected by their problems. Back in August, for example, Oxy experienced problems accessing this blog site because Level3 Networks, one of the major Internet backbone providers, was having a problem with its link between Los Angeles and Dallas.
What did this problem look like?
Traffic on the Internet typically traverses a number of different networks, and within those networks a number of different hops, as packets try to get from point A to point B. Here, for example, is what the path to Moodle looked like at about noon yesterday:

   1   48 ms   90 ms   86 ms  134.69.3.5
   2   <1 ms   <1 ms   <1 ms  134.69.33.11
   3   <1 ms   <1 ms   <1 ms  65.214.150.234
   4    1 ms    1 ms    1 ms  130.152.182.106
   5  161 ms   38 ms  233 ms  130.152.181.188
   6    1 ms    1 ms    1 ms  137.164.23.225
   7    2 ms    1 ms    1 ms  137.164.46.107
   8    1 ms    1 ms    1 ms  137.164.46.118
   9     *       *       *    Request timed out.
  10     *      2 ms    2 ms  137.164.130.54
  11     *      2 ms   11 ms  10.54.54.2
  12   28 ms     *     37 ms  72.52.92.37
  13   41 ms   41 ms     *    72.52.92.5
  14     *     41 ms   41 ms  184.105.213.34
  15   52 ms     *     52 ms  184.105.250.62
  16   52 ms   52 ms   52 ms  204.13.103.5
  17     *     52 ms     *    204.13.102.81
  18   52 ms     *     52 ms  204.13.102.81

Hops 1 through 3 are Oxy's network. Hops 4 and 5 are our ISP, Los Nettos. Hops 6 through 10 are CalREN, and you can see at hop #9 that we have a problem. For the sake of completeness, hops 11 through 15 belong to Hurricane Electric, another major Internet backbone provider. Hop 16 belongs to Arsalon, the company that physically hosts our Moodle server, and hops 17 and 18 are our actual moodle.oxy.edu server. Asterisks ("*" symbols) denote probes that received no reply from that hop. At around 12:45pm, when the problem started to resolve itself, the path had changed to look like this:

   1   76 ms   93 ms   95 ms  134.69.3.5
   2   <1 ms   <1 ms   <1 ms  134.69.33.11
   3   <1 ms   <1 ms   <1 ms  65.214.150.234
   4    1 ms    1 ms    1 ms  130.152.182.106
   5    2 ms    1 ms    1 ms  130.152.181.188
   6    1 ms    1 ms    1 ms  137.164.23.225
   7    2 ms    2 ms    1 ms  137.164.46.107
   8   10 ms    9 ms    9 ms  137.164.46.95
   9    9 ms    9 ms    9 ms  137.164.46.205
  10   11 ms   11 ms   11 ms  137.164.131.61
  11   11 ms   11 ms   11 ms  137.164.130.58
  12   24 ms   18 ms   11 ms  72.52.92.70
  13   39 ms   33 ms   41 ms  10.105.213.106
  14   46 ms   49 ms   46 ms  72.52.92.5
  15   46 ms   47 ms   47 ms  184.105.213.34
  16   58 ms   57 ms   57 ms  184.105.250.62
  17   57 ms   57 ms   57 ms  204.13.103.5
  18   58 ms   58 ms   58 ms  204.13.102.81

Much better. There are no more asterisks, but if you look carefully, you'll notice that the IP addresses are different starting at hop #8, one hop before the timeout we saw at noon. This is because, by this point, Internet traffic from Oxy to our Moodle server was taking a different path, most likely to route around the faulty equipment. As of this morning, though, the original path has been restored and is working properly.
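For readers who like to automate this kind of comparison, here is a minimal Python sketch that reads two traces in the Windows "tracert" layout, lists the hops that dropped probes, and reports the first hop where the two paths differ. The function names (parse_trace, lossy_hops, first_divergence) are made up for this illustration, the parser assumes exactly the line layout seen in this post, and the sample data is just hops 7 through 10 from each of the two traces.

```python
import re

# One hop line: hop number, three probe results ("<1 ms", "12 ms" or "*"),
# then either an IP address or the text "Request timed out."
HOP_RE = re.compile(
    r"^\s*(\d+)\s+"
    r"(\*|<?\d+ ms)\s+(\*|<?\d+ ms)\s+(\*|<?\d+ ms)\s+"
    r"(.+?)\s*$"
)

def parse_trace(text):
    """Return a list of (hop_number, lost_probes, address_or_None) tuples."""
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # skip blank lines and anything that is not a hop line
        lost = sum(1 for probe in match.group(2, 3, 4) if probe == "*")
        address = match.group(5)
        if address.startswith("Request timed out"):
            address = None  # the hop never answered, so no address is known
        hops.append((int(match.group(1)), lost, address))
    return hops

def lossy_hops(hops):
    """Hop numbers where at least one of the three probes got no reply."""
    return [number for number, lost, _ in hops if lost > 0]

def first_divergence(trace_a, trace_b):
    """First hop number whose address differs between two traces, or None."""
    for (number, _, addr_a), (_, _, addr_b) in zip(trace_a, trace_b):
        if addr_a != addr_b:
            return number
    return None

# Hops 7-10 from the noon trace and from the 12:45pm trace.
NOON = """
  7    2 ms    1 ms    1 ms  137.164.46.107
  8    1 ms    1 ms    1 ms  137.164.46.118
  9     *       *       *    Request timed out.
 10     *      2 ms    2 ms  137.164.130.54
"""
LATER = """
  7    2 ms    2 ms    1 ms  137.164.46.107
  8   10 ms    9 ms    9 ms  137.164.46.95
  9    9 ms    9 ms    9 ms  137.164.46.205
 10   11 ms   11 ms   11 ms  137.164.131.61
"""

print(lossy_hops(parse_trace(NOON)))                         # [9, 10]
print(first_divergence(parse_trace(NOON), parse_trace(LATER)))  # 8
```

A lost probe on its own does not prove that a hop is broken, since some routers simply deprioritize replies to traceroute probes, so treat the result as a hint about where to look rather than a diagnosis.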
How do I find out what path I'm using to get to a site on the Internet?
If you're curious, you can always run a traceroute. On Windows, open the Command Prompt and use the command "tracert"; on OS X, open a Terminal window and use the command "traceroute". You can specify either an IP address (134.69.3.95) or a server name (www.oxy.edu) to trace to. If you want to find out more about who owns a particular hop, a good place to start is the ARIN WHOIS service: just paste the IP address you're interested in into the search box at the upper-right corner of that page. Here is Oxy's information, for example.
- Info Center: (323) 259-2640
- Technology Helpdesk: (323) 259-2880, helpdesk@oxy.edu
- IR Operations Offices: (323) 259-2832
- Information Resources VP/CIO: (323) 259-1451
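
A footnote for the scripting-minded: the traceroute commands described above can also be launched from a short Python script. This is only a sketch. The helper names build_trace_command and run_trace are invented for this example, and it assumes the "tracert" or "traceroute" tool is already installed and on your path.

```python
import platform
import subprocess

def build_trace_command(target, system=None):
    """Return the command list for tracing to target on the given OS.

    system is an OS name as reported by platform.system(), for example
    "Windows" or "Darwin" (OS X); it defaults to the current machine.
    """
    system = system or platform.system()
    # Windows calls the tool "tracert"; OS X and other Unix-like
    # systems call it "traceroute".
    tool = "tracert" if system == "Windows" else "traceroute"
    return [tool, target]

def run_trace(target):
    """Run the trace and return its text output (needs network access)."""
    result = subprocess.run(
        build_trace_command(target), capture_output=True, text=True
    )
    return result.stdout
```

Calling run_trace("www.oxy.edu") returns the same text you would see in the Command Prompt or Terminal window, and it can take a minute or so to finish while the tool waits on hops that do not reply.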
