OpenDNS, DNS, and Anycast

I wanted to let everyone know about OpenDNS. They provide publicly accessible DNS servers at the following addresses:

208.67.222.222
208.67.220.220

This is a fantastic service, and leads me to a discussion on the topic of Anycast and DNS.

Anycast is one of the types of traffic flows. Remember the other types:

  1. Unicast: used for peer to peer communication (1 to 1). Use IP classes A, B, and C here.
  2. Multicast: used for one to many communication (1 to n, where n indicates the multicast subscribers). Multicast addresses are class D, range 224.0.0.0 to 239.255.255.255, or 224.0.0.0/4 in CIDR notation.
  3. Broadcast: used for one to ALL communication (1 to N, where N indicates all systems within a broadcast domain). The destination is the all-ones IP address, i.e., 255.255.255.255.

Anycast is similar but doesn’t have a specific IPv4 address like Multicast and Broadcast. I would opine that it’s 1 to x, where x is the closest (from a routing perspective) Unicast address.
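To make the point concrete, here is a minimal Python sketch that sorts IPv4 addresses into the three types listed above. The function name `classify` is my own, and it only checks the limited broadcast address (not per-subnet directed broadcasts). Note that an Anycast address cannot be told apart from a Unicast one just by looking at it.

```python
import ipaddress

# The all-ones limited broadcast address from the list above.
LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def classify(addr: str) -> str:
    """Classify an IPv4 address by traffic type (hypothetical helper)."""
    ip = ipaddress.IPv4Address(addr)
    if ip == LIMITED_BROADCAST:
        return "broadcast"
    if ip.is_multicast:  # true for anything inside 224.0.0.0/4
        return "multicast"
    # Anycast lives in ordinary unicast space, so this is the best we can say.
    return "unicast (possibly anycast)"


for address in ("208.67.222.222", "224.0.0.5", "255.255.255.255"):
    print(address, "->", classify(address))
```

Whether 208.67.222.222 is really Anycast is a property of how it is routed, not of the address itself.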

This will work as long as your application supports it. DNS is one of those applications because it’s UDP-based: you simply fire off a query to the nearest DNS server and accept any reply that comes back. In fact, the top-level domain (TLD) servers are anchored in Anycast IP addresses. I suspect that the OpenDNS servers are also using Anycast, with back-end replication and zone transfers being performed using the ‘real’ IP addresses of the servers. RFC3258 outlines how the TLD Anycast service is set up, but basically it works like this:

The IP addresses you want to use for Anycast are advertised in several places within your network. If you have a global network, you might use 10.0.0.0/8 in Europe as well as China and Canada. If you are in China trying to reach this address, the network will route you to the closest 10.0.0.0/8 advertisement point. The nice thing is, if that route is no longer advertised in a particular location, the network will route you to the next closest one. Keep in mind that the service is not necessarily tied to advertisement of this prefix within your network, so some additional mechanism will need to be in place to avoid blackholing traffic.
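The behaviour described above can be sketched in a few lines of Python. This is only a toy model: the site names, the cost numbers, and the `nearest_site` helper are all made up for illustration, and real routers pick the best path using their routing protocol metrics rather than a dictionary.

```python
from typing import Dict, Optional, Set


def nearest_site(costs: Dict[str, int], advertising: Set[str]) -> Optional[str]:
    """Return the advertising site with the lowest routing cost, or None."""
    candidates = {site: cost for site, cost in costs.items() if site in advertising}
    if not candidates:
        return None  # nobody advertises the prefix any more
    return min(candidates, key=candidates.get)


# Hypothetical routing costs as seen by a client in China.
costs_from_china = {"china": 10, "europe": 80, "canada": 120}
advertising = {"china", "europe", "canada"}

print(nearest_site(costs_from_china, advertising))  # china

# The China site withdraws its advertisement; traffic moves to the next closest.
advertising.discard("china")
print(nearest_site(costs_from_china, advertising))  # europe
```

The blackholing risk shows up here too: if the China site keeps advertising the prefix while its DNS service is dead, it still wins the selection. Withdrawing the route when a health check fails is one way to provide the additional mechanism mentioned above.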

MPLS notes

I took a class this past week for some more formalized MPLS training. I will be posting some notes to remind myself of key ideas.

There are a few key take-aways that helped me understand the how and why of MPLS. I knew the basics, like how MPLS uses ‘tags’ instead of layer 3 addresses to switch packets. A few years ago this was an important differentiator because if the routers could avoid a layer 3 lookup they could switch packets much faster. This advantage has become obsolete because most lookups are done in hardware, or at least installed in hardware after an initial layer 3 lookup that is done in software.

Anyway, besides that I didn’t realize the benefits to service providers and I’ll try to outline them below:

MPLS allows providers to keep external routes at the edges of their networks.

Why is this important? iBGP requires a full mesh of peers, and if you don’t run BGP on all transit links you’ll wind up blackholing traffic. Let’s say you learn about the prefix 192.168.1.0/24 from one of your customers via eBGP. You advertise this to your iBGP neighbor across the cloud, which is maybe 3 routers in diameter. Another customer then wants to send traffic to this network which you have advertised. A packet with the destination of 192.168.1.1 comes into your network and hits the second hop, which does not have a route to 192.168.1.1 in its table. The route isn’t there because you haven’t redistributed it into your IGP.

There are two solutions to this problem. You can advertise the eBGP-learned routes into your IGP, or you can run iBGP across all your transit routers. Neither solution scales particularly well. With redistribution, your IGP will have to scale to the number of routes your customers will have, potentially hundreds of thousands. There isn’t an IGP that can handle this. The other solution, a full iBGP mesh, requires [n*(n-1)]/2 peering relationships. You can use route reflectors, but this is difficult to scale as well.
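To get a feel for how quickly the full-mesh number grows, here is a quick Python calculation of the [n*(n-1)]/2 formula. The function name and the router counts are just examples I picked.

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions needed to fully mesh n routers: n*(n-1)/2."""
    return n * (n - 1) // 2


for routers in (4, 10, 100):
    print(routers, "routers ->", full_mesh_sessions(routers), "sessions")
# 4 routers -> 6 sessions
# 10 routers -> 45 sessions
# 100 routers -> 4950 sessions
```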

The solution MPLS provides is elegant. You simply run an IGP like you normally would, run MPLS on top of your network, and peer using the loopback interfaces of your edge BGP speakers. MPLS builds a forwarding table based on the loopbacks, so when a packet comes into your network bound for 192.168.1.1, your edge BGP router does a layer 3 lookup, finds the next hop of the remote iBGP peer loopback, and encapsulates it inside an MPLS packet. The intermediate routers then use MPLS tags to switch the packet to the remote iBGP router.
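Here is a rough Python sketch of that forwarding behaviour, as I understand it. Everything in it is invented for illustration: the router names (PE1, P1, P2, PE2), the loopback 10.255.0.3, and the tag values 100 and 200. It also ignores how the tags get distributed in the first place.

```python
import ipaddress
from typing import List, Optional

# Only the edge router holds the customer (BGP) route: prefix -> remote edge loopback.
BGP_TABLE = {ipaddress.ip_network("192.168.1.0/24"): "10.255.0.3"}

# Tag the ingress edge router pushes to reach each loopback (hypothetical values).
INGRESS_TAGS = {"10.255.0.3": 100}

# Transit routers only know tags: incoming tag -> (outgoing tag or None to pop, next router).
TAG_TABLES = {
    "P1": {100: (200, "P2")},
    "P2": {200: (None, "PE2")},  # last transit hop removes the tag
}


def forward(dst: str) -> Optional[List[str]]:
    """Return the routers a packet for dst traverses, or None if there is no route."""
    ip = ipaddress.ip_address(dst)
    # The layer 3 lookup happens once, at the ingress edge router.
    matches = [net for net in BGP_TABLE if ip in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)  # longest prefix match
    tag = INGRESS_TAGS[BGP_TABLE[best]]
    path, router = ["PE1"], "P1"
    # Transit routers switch on the tag alone and never consult BGP_TABLE.
    while tag is not None:
        path.append(router)
        tag, router = TAG_TABLES[router][tag]
    path.append(router)
    return path


print(forward("192.168.1.1"))  # ['PE1', 'P1', 'P2', 'PE2']
```

The thing to notice is that P1 and P2 have no entry for 192.168.1.0/24 anywhere, yet the packet still gets across.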

Think about that and read it again. MPLS removes the requirement of full mesh, and removes the requirement to redistribute routes into your cloud. This permits service providers to scale to a very large network.

In short: MPLS permits SP networks to keep their routing tables at the edges of their networks. Your transit routers don’t have to carry your edge routes and they also don’t have to worry about convergence.

More to follow…

Term: Fate Sharing

The term “fate sharing” appears in the CCDE written outline and was unfamiliar to me. I’ve done a bit of research and here’s what I found.

David D. Clark defines the term fate-sharing in his paper “The Design Philosophy of the DARPA Internet Protocols” which can be found here. It’s a short paper that is well worth the effort of reading, as it explains why we have a datagram based Internet today.

Clark describes fate-sharing as the property of a system where state information is shared between the communicating entities, and if an entity is lost, all state is lost with the entity. Think of it like this: if you have a TCP connection open to a web server, and that web server dies, your TCP connection dies with it.

Think of the alternative to fate-sharing: if intermediate entities had to maintain state about the communication, you wind up with the difficult problem of replicating that state, and it doesn’t necessarily obviate all failures – only n-1 failures, where n is the number of replicated copies of that state information.

This model pushes resiliency further up the stack. Think of how your web browser handles failures at the TCP layer: the page can simply be refreshed, and your transaction will still (sometimes) take place on a different server. The fate-sharing philosophy also necessarily creates the datagram-based communication model we have today.

The alternative to a fate-sharing system would be one in which the state is part of the infrastructure. An example of this could be a Permanent Virtual Circuit (PVC) that 1. contains state of the communication, and 2. maintains this state in the event of a failure. This model, if applied to a protocol that is transported over the Internet, would make some details of networking easier. For example: if you set up a permanent call that reroutes, the network can guarantee a particular quality of service, and the communicating systems wouldn’t need to worry about data arriving out-of-order, etc.

Now, I don’t claim to understand all this, but that’s what I’ve been able to gather. If there are smarter folks out there please comment.

The other way I’ve heard fate-sharing described is in Russ White’s 2007 Networkers presentation – Deploying IGPs. He describes fate sharing as the coupling of hardware and software states. A good example for hosts would be virtualization: if the hardware fails, all of the virtual machines share the fate of the hardware. In networking terms, if a dot1q trunk or DWDM link fails, all the devices using that path exclusively fail as well.

Now the key question: How does this term apply to good network design?

A good network designer has to consider that with the datagram model of TCP/IP, reliable data transfer is the responsibility of the protocol, not necessarily the network. This may seem obvious, but it also leaves us with some problems. QoS is not necessarily enforced at all points, so RSVP was developed. Now RSVP starts to push some state information onto the network, which becomes complicated when traffic is rerouted. This runs counter to the datagram model of TCP/IP, which means designers must balance the ease of engineering fate-sharing systems (because we don’t need to guarantee delivery of datagrams) against the relative complexity (and the lack of good management tools) of systems that embed state in the network.

My Roadmap

CCDE Roadmap

CCNA certification (complete)

CCDA certification (complete)

CCIP certification:

  • training
    • CTT-TAC MPLS Online bootcamp
    • CTT-TAC QoS Online bootcamp
  • tests
  • reading list
    • MPLS and VPN architectures (Pepelnjak / Guichard)
    • Routing TCP/IP volume 2 (Doyle / Carroll)
  • Networkers presentations
    • QoS (2007)
    • CAC (Call Admission Control) (2007)

CCDP certification:

  • training
    • ? Not sure what is available
  • tests
    • 642-892 Composite (as above)
    • 642-873 ARCH
  • reading list
    • Top-Down Network Design (Oppenheimer)
    • SRND guides:
      • routed access layer with ospf/eigrp
    • Patterns in Network Architecture (Day)

CCDE written exam:

  • reading list
    • Optimal Routing Design (White)
  • Networkers presentations:
    • Routed Fast Convergence (Retana) (2007)
    • Deploying IGPs (White/Retana) (2007)

roadmap for study

Ok, so you want to get the CCDE certification. Where to begin?

Let me share with you my path, and some tips where you might begin.

I’ve been involved with networking since around 1994, when we had no real network on the University of Dayton campus, and I was a student worker. I ran around installing SMC Ethernet NICs in computers and getting them connected to either 10base2 coax or “new” 10baseT twisted-pair networks. Sometimes I had to modify the users’ autoexec.bat files so they would be prompted to either “load CD-Rom drivers” or “connect to the network”. Imagine having to choose! The reason was, most of these older machines couldn’t load all of the drivers into low memory.

Regardless, in around 2000 I completed the CCNA and CCDA, later doing the CCNP and finally taking and passing the CCIE practical to become CCIE #7966 in August of 2001. Suffice it to say, I have forgotten most of what I learned back then, and that which I remember is mostly irrelevant. The two-day lab had IPX and DLSW on it; not much of that is to be expected on the CCDE.

Anyway, I’ve taken this as a foundation and will be building on this to take the CCDE exams. Here’s what I’ve chosen to concentrate on.

I’ll go take the 642-642 QoS test. Since QoS is a major topic for the written CCDE, I think this will warrant a high amount of study. I’ve been avoiding QoS like the plague for most of my career so this will be hard for me.

I’ll also be taking the 642-691 (MPLS+BGP) test. I’ve very little MPLS experience, and little real-world BGP experience, so this will also require some heavy reading and lab time. I’m going to read the BGP sections of Doyle’s Routing TCP/IP Volume 2, MPLS and VPN Architectures, and Internet Routing Architectures (rereading).

Rounding out the CCIP certification will require the 642-892 (composite) test. This is pretty much foundational Cisco study. It goes over the major RPs and switching technologies. I should be able to take this cold, but for those of you needing some refresher, I highly recommend Doyle’s Routing TCP/IP volume 1 and reading the major documentation on switching technologies from Cisco’s site.

I’ll also take 642-873 (Designing Cisco Network Service Architectures) to complete the requirements for the CCDP. I don’t think I’ll be getting the Cisco press book that they have for this exam, mainly because it appears out of date. A better preparation would most likely be “Top Down Network Design” by Oppenheimer. Even if it doesn’t address 642-873 directly, I think this will most likely be the best path for the goal of passing the CCDE written. I also think the SRND guides will help quite a bit with this exam.

MPLS and OSPF

I’m beginning to learn a bit about MPLS and have just completed some MPLS OSPF setup. I must say, it seems a bit more complicated than it’s worth. Ok, so by default, even links in the same area are advertised as type 3 summary LSAs. I’m going to work on the sham-link which is supposed to alleviate this particular issue.

Regardless, here are a few tidbits for studying: make sure you can originate a default route from one CE router and have it picked up as a type 5 LSA in other areas. There are a few tricks to get this working.

(correction: they’re type 3 summary LSAs. I originally said they were type 2)

CCDE Written Outline Topic 5 : Security

Topic 5 : Security is not fleshed out as far as the other four topics, so I thought I would tackle it first.

  • Explain the impact of security availability design in the characteristics of a network. What does this mean? Let’s dig into the subtopics and see if we can find an explanation.
    • OOB Access – out-of-band access to devices. If your network goes down or if a device is unreachable, you may need some way of remotely logging into the device. A good example would be a modem connected to the AUX port on a router.
    • Decoupling – This probably refers to the separation of control/data planes in routed networks.
    • Paul Baran Model – according to Wikipedia, Paul Baran was one of the thought leaders in distributed networking as an answer to reliability: building networks that could withstand nuclear attack, etc. His work brings some mathematical rigor to communications networks.
    • Compartmentalization – this probably relates to Schneier’s book Beyond Fear where he states that:

      All systems have a weakest link, and there are several general strategies for securing systems despite their vulnerabilities. Defense in depth ensures that no single vulnerability can compromise security. Compartmentalization ensures that a single vulnerability cannot compromise security entirely. And choke points reduce the number of potential vulnerabilities by allowing the defender to concentrate his defenses. In general, tried and true countermeasures are preferable to innovations, and simpler overlapping countermeasures are preferable to highly complex stand-alone systems. However, because attackers inevitably develop new attacks, reassessment and innovation must be ongoing.

I’m a huge fan of Bruce Schneier. I highly recommend crypto-gram and Beyond Fear.

Another issue Schneier talks about is ‘brittleness’:

Brittleness refers to the way a system fails. Microsoft Windows is a brittle system. A small insecurity breaks the entire system, and often the entire network. The credit-card system is resilient. It can tolerate all sorts of insecurities and still work profitably.

  • Use available tools in a network security design to address identity, monitoring and correlation aspects.
    • SNMP: This falls under the ‘monitoring’ requirement. Keep in mind that SNMP is by default not very secure, and you should be using SNMPv3 if at all possible.
    • NetFlow: You can use records generated by NetFlow to look for all sorts of security events in your network. Normally the volume of data generated is too large to review by hand and you’ll have to use a third-party tool to analyze it. NetFlow records are exported over UDP (9996 is a commonly used collector port), so designing a system that can accept all of the NetFlow records without dropping any is essential if you’re to use it for auditing.
    • Syslog: Obviously, syslog is something you should have enabled in your network. It runs on udp as well, so all the usual udp rules apply. It’s also unencrypted by default.
    • RMON: I’ve not used much RMON in the past, but this falls under application classification/utilization. Third-party tools are best for RMON probes and analysis.
    • DNS: DNS can help with correlation. If, for example, all of your routers and switches are in DNS, the source IP addresses in records like Syslog and NetFlow will resolve to device names in your logs/reports, provided your tools are configured to do the lookups.
    • Radius/AAA: Authentication/Authorization/Accounting is a requirement for any large-scale network. You’ll have to audit the logs for events in this as well.
    • Full Packet Classifiers: They probably refer to NBAR (network based application recognition). It is a tool built in to the routers and switches that will classify your application based on its behavior. It can, for example, classify P2P applications. It does increase the load on your infrastructure, so be careful when implementing it. NBAR can be used to classify and then police/shape applications like P2P, etc.
  • Explain the impact of control plane design decisions on the security of a network; implement security mechanisms to protect the control plane.
    • Use and impact of addressing: This may refer to the concept of infrastructure hiding, where you assign addresses to your devices that are unreachable from outside your network. You could assign RFC1918 addresses to all of your loopbacks and refuse to NAT/advertise these networks. This does not automatically hide the infrastructure addresses from your internal users and devices, so you would have to apply inbound filters to prevent access. You can use control-plane policing (CoPP) for this.
    • Use and impact of area (flooding domain/summary points) placement.
    • Route/Topology/Link Hiding
    • Adjacency Protection (MD5, GTSM, etc.): you should be using MD5 to authenticate links between adjacent neighbors. All of the major dynamic routing protocols support MD5. GTSM stands for Generic TTL Security Mechanism. Defined in RFC3682, it outlines the use of the TTL as a way to ensure your updates are coming from directly-attached neighbors. If you receive an update with a TTL lower than expected (GTSM senders set the TTL to 255), it did not come from a directly-attached neighbor and should be discarded.
    • Route Validation: probably a manual process, anyone have any ideas?
    • Route Filtering: filter updates from your neighbors that you don’t want. Or just allow those that you do want.
    • Routing Plan: You need to know where your packets will route in steady state.
    • Other routing techniques: unsure of what they mean here.
  • Explain the impact of data plane design decisions on the security of a network; implement security mechanisms to protect the data plane.
    • Infrastructure Protection: Think COPP
    • Policy Enforcement (QoS, BCP38): Probably just want to read BCP38
  • Prepare and explain security incident preparation and response strategies in a network.
    • Reaction Tools (Identification and Classification): IDS/IPS
    • Traceback Tools: not Cisco tracebacks, look here.
    • Remotely-Triggered Black Holes (RTBH) (destination, source, rate limit, etc.): good whitepaper here.
    • Sink Holes: paper here.
    • Reactive ACLs: this may refer to installation of ACLs by a third-party IDS/IPS tool.
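Of the items above, the GTSM check under Adjacency Protection is simple enough to sketch in Python. This is my own illustration of the idea, not how any router implements it: the function name and the `trust_radius` parameter (how many router hops away a legitimate neighbor may be) are assumptions for the example. The idea is that the sender sets the TTL to 255 and every router in between decrements it, so a packet that arrives with a lower TTL than expected has traveled further than a real neighbor would.

```python
GTSM_SEND_TTL = 255  # GTSM senders transmit with the maximum TTL


def gtsm_accept(received_ttl: int, trust_radius: int = 0) -> bool:
    """Return True if the packet could have come from within trust_radius hops.

    trust_radius=0 means directly attached (no router in between), so the
    TTL must still be 255 on arrival. Hypothetical helper for illustration.
    """
    if not 0 <= received_ttl <= 255:
        raise ValueError("TTL must be between 0 and 255")
    return received_ttl >= GTSM_SEND_TTL - trust_radius


print(gtsm_accept(255))                  # True: directly attached neighbor
print(gtsm_accept(254))                  # False: crossed at least one router
print(gtsm_accept(254, trust_radius=1))  # True: one router hop is allowed
```

A remote attacker cannot fake this because they have no way to make a packet arrive with a TTL higher than 255 minus the number of routers it crossed.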

CCDE Written Outline.. breaking it down.

The CCDE written, as published by Cisco, contains 5 major headings for study.

1. IP Routing.
Note: no IPX or AppleTalk or anything else. IPv6 basics is a subtopic under ‘generic routing and addressing concepts’. Other than that, no big surprises. I suppose we’ll be concentrating on best practices rather than odd, non-obvious configurations.

2. Tunneling
There is mention of tunneling non-IP protocols, specifically NetWare IPX.

3. QoS
Nothing specifically relating to Voice/VoIP, just generic requirements for VoIP.

4. Management
This section is much lighter than the others.

5. Security
This section is the lightest of all (only 5 subheadings).

The Beginning…

Ok, so Cisco recently announced the CCDE certification.

Those old timers out there will remember the ill-fated CCIE+Design certification that apparently NOBODY passed. I wonder if this will suffer the same fate. The problem is how does one test against something that is considered a dark art? Networking is not something that falls easily under an Engineering discipline. Networking is not something that can be quantified or easily measured. People talk of 5+nines availability but even this cannot be easily defined or quantified – 5 nines for what? Applications? Ping? What do you measure? The whole network or just critical pieces of it? Availability is one thing, but reliability is another.
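For what it’s worth, the arithmetic behind the “nines” is easy enough to do in Python. This sketch assumes a 365-day year, and the function name is mine. It converts a number of nines into the downtime allowed per year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of N nines (e.g. 5 -> 99.999%)."""
    unavailability = 10.0 ** -nines
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5):
    print(f"{n} nines -> {downtime_minutes_per_year(n):.2f} minutes/year")
# 3 nines -> 525.60 minutes/year
# 4 nines -> 52.56 minutes/year
# 5 nines -> 5.26 minutes/year
```

The math is the easy part. It still doesn’t tell you what you are measuring those five minutes against, which is exactly the problem.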