Named Data Networking - Boffin Alert
September 8, 2014
· Posted by Greg Lloyd
On Sep 4, 2014 the Named Data Networking project announced a new consortium to carry the concepts of Named Data Networking (NDN) forward in the commercial world. If this doesn't sound exciting, try The Register's take: DEATH TO TCP/IP cry Cisco, Intel, US gov and boffins galore. What if you could use the internet to access content securely and efficiently, where anything you want is identified by name rather than by its internet address? The NDN concept is technically sweet, gaining traction, and is wonderfully explained and motivated in a video by its principal inventor and instigator Van Jacobson. Read on for the video, a few quotes, reference links, and a few thoughts on what NDN could mean for the Internet of Things, Apple, Google and work on the Web. Short version: Bring popcorn.
For a short non-technical introduction, see Wade Roush's Sep 2012 piece on Van Jacobson and Content Centric Networking, The Next Internet? Inside PARC’s Vision of Content Centric Networking. Background: Jacobson's work on CCN begot the NDN project, where he is now a Principal Investigator. A few quotes from Roush's story:
The fundamental idea behind Content Centric Networking is that to retrieve a piece of data, you should only have to care about what you want, not where it’s stored. Rather than transmitting a request for a specific file on a specific server, a CCN-based browser or device would simply broadcast its interest in that file, and the nearest machine with an authentic copy would respond. File names in a CCN world look superficially similar to URLs (for example, /parc.com/van/can/417.vcf/v3/s0/Ox3fdc96a4…) but the data in a name is used to establish the file’s authenticity and provenance, not to indicate location.
It’s easy to see how much sense this makes compared to the current client-server model. Say I’m using my Apple TV box to browse my Flickr photo collection on my big-screen TV. To get each photo, the Apple TV has to connect to Flickr, which is hosted on some remote data center owned by Yahoo—it could be in Utah or North Carolina, for all I know. The request has to travel from the Apple TV over my Wi-Fi network, into Comcast’s servers, then across the Internet core, and finally to Yahoo. Then the photos, which amount to several megabytes each, have to travel all the way back through the network to my TV.
But the photos on Flickr are just copies of the originals, which are stored on my camera and on my laptop, about 15 feet away from my TV. It would be much smarter and more economical if the Apple TV could simply ask for each photo by name—that is, if it could broadcast its interest in the photo to the network. My laptop could respond, and I could keep browsing without the requests or the data ever leaving my apartment. (In Jacobson’s scheme, file names can include encrypted sections that bar users without the proper keys from retrieving them, meaning that security and rights management are built into the address system from the start.)
“The simplest explanation is that you replace the concept of the IP address as the defining entity in the network with the name of the content,” says Lunt. “Now all the talk in the network is about ‘Have you seen this content?’ and ‘Who needs this content?’ as opposed to ‘What is the routing path to particular terminus in the network?’ It’s a simple idea, but it makes a lot of things possible...
“One of the things that’s intriguing about not having to go to the source is that you could start to think about implementing applications differently,” Lunt says. “You could build apps that don’t have any notion of a server at all. So you could have Twitter without Twitter or Facebook without Facebook—that is, without having to have a major investment in hosting content, because the network is caching it all over the place.”
Such architectures might give users more control over privacy and security of their data, and let them share their own data across devices without having to go through proprietary services like Apple’s iCloud, PARC executives say.
“What Apple is trying to do with iCloud is to say: You shouldn’t have to care which device you got an app on, or which device you took a photo on, whether it was your iPad or iPhone or MacBook Air. You just want your content to be on the other devices when you want it,” says Steve Hoover, CEO of PARC. “That validates our vision. But the way they are solving that puts more load on the network than it needs to, and it requires consumer lock-in. So Apple may be a user of this [CCN] technology one day, because it will make it easier. On the other hand, they could also hate it, because it will make it a lot easier for other people to provide that capability of getting the content whenever you want.”
In my opinion, one of the technically sweetest characteristics of NDN is its relationship to current TCP/IP and networking protocols (quotes from NDN Architecture: Motivation and Details):
Like IP, NDN is a “universal overlay”: NDN can run over anything, including IP, and anything can run over NDN, including IP. IP infrastructure services that have taken decades to evolve, such as DNS naming conventions and namespace administration or inter-domain routing policies and conventions, can be readily used by NDN. Indeed, because NDN’s hierarchically structured names are semantically compatible with IP’s hierarchically structured addresses, the core IP routing protocols, BGP, IS-IS and OSPF, can be used as-is to deploy NDN in parallel with and over IP. Thus NDN’s advantages in content distribution, application-friendly communication, robust security, and mobility support can be realized incrementally and relatively painlessly...
Communication in NDN is driven by the receiving end, i.e., the data consumer. To receive data, a consumer sends out an Interest packet, which carries a name that identifies the desired data (see Figure 2). A router remembers the interface from which the request comes in, and then forwards the Interest packet by looking up the name in its Forwarding Information Base (FIB), which is populated by a name-based routing protocol. Once the Interest reaches a node that has the requested data, a Data packet is sent back, which carries both the name and the content of the data, together with a signature by the producer’s key (Figure 2). This Data packet follows in reverse the path taken by the Interest to get back to the consumer. Note that neither Interest nor Data packets carry any host or interface addresses (such as IP addresses); Interest packets are routed towards data producers based on the names carried in the Interest packets, and Data packets are returned based on the state information set up by the Interests at each router hop (Figure 3).
The router stores in a Pending Interest Table (PIT) all the Interests waiting for returning Data packets. When multiple Interests for the same data are received from downstream, only the first one is sent upstream towards the data source. Each PIT entry contains the name of the Interest and a set of interfaces from which the Interests for the same name have been received. When a Data packet arrives, the router finds the matching PIT entry and forwards the data to all the interfaces listed in the PIT entry. The router then removes the corresponding PIT entry, and caches the Data in the Content Store. Because an NDN Data packet is meaningful independent of where it comes from or where it may be forwarded to, the router can cache it to satisfy future requests. Because one Data satisfies one Interest across each hop, an NDN network achieves hop-by-hop flow balance...
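To make the Interest and Data flow quoted above more concrete, here is a minimal Python sketch of a single router holding a FIB, a PIT and a Content Store. It is my own illustration, not code from the NDN project: the class name ToyRouter, the face labels and the returned tuples are all hypothetical. It also makes two simplifying assumptions that the quoted text does not spell out: names are matched exactly, and Data that arrives without a pending Interest is dropped.

```python
from dataclasses import dataclass, field


def components(name):
    """Split '/parc/videos/x' into ('parc', 'videos', 'x')."""
    return tuple(part for part in name.split("/") if part)


@dataclass
class ToyRouter:
    """Hypothetical single-router model of the quoted forwarding steps."""
    fib: dict = field(default_factory=dict)            # name prefix (tuple) -> upstream face
    pit: dict = field(default_factory=dict)            # name -> set of downstream faces
    content_store: dict = field(default_factory=dict)  # name -> cached content

    def fib_lookup(self, name):
        # Longest-prefix match on whole name components.
        parts = components(name)
        for length in range(len(parts), 0, -1):
            face = self.fib.get(parts[:length])
            if face is not None:
                return face
        return None

    def on_interest(self, name, in_face):
        # 1. Cached Data satisfies the Interest right away.
        if name in self.content_store:
            return ("data", in_face, self.content_store[name])
        # 2. Same name already pending: record the face, send nothing upstream.
        if name in self.pit:
            self.pit[name].add(in_face)
            return ("aggregated", None, None)
        # 3. Otherwise consult the FIB and forward upstream.
        out_face = self.fib_lookup(name)
        if out_face is None:
            return ("drop", None, None)
        self.pit[name] = {in_face}
        return ("forward", out_face, None)

    def on_data(self, name, content):
        # Data follows the PIT state back downstream, then the entry is removed.
        faces = self.pit.pop(name, None)
        if faces is None:
            return []  # assumption: unsolicited Data is dropped
        self.content_store[name] = content
        return sorted(faces)


router = ToyRouter(fib={("parc",): "uplink"})
name = "/parc/videos/WidgetA.mpg/1/3"
print(router.on_interest(name, "face-a"))   # first Interest goes upstream
print(router.on_interest(name, "face-b"))   # second one is aggregated in the PIT
print(router.on_data(name, b"segment"))     # Data fans out to both faces
print(router.on_interest(name, "face-c"))   # later Interest is served from cache
```

Note that no host address appears anywhere: the only state is names and the local faces they arrived on, which is the point of the quoted passage.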
NDN design assumes hierarchically structured names, e.g., a video produced by PARC may have the name /parc/videos/WidgetA.mpg, where ‘/’ indicates a boundary between name components (it is not part of the name). This hierarchical structure is useful for applications to represent relationships between pieces of data. For example, segment 3 of version 1 of the video might be named /parc/videos/WidgetA.mpg/1/3. The hierarchy also enables routing to scale. While it may be theoretically possible to route on flat names (see ROFL), it is the hierarchical structure of IP addresses that enables aggregation, which is essential in scaling today’s routing system. Common structures necessary to allow programs to operate over NDN names can be achieved by conventions agreed between data producers and consumers, e.g., name conventions indicating versioning and segmentation.
Name conventions are specific to applications but opaque to the network, i.e., routers do not know the meaning of a name (although they see the boundaries between components in a name). This allows each application to choose the naming scheme that fits its needs and allows the naming schemes to evolve independently from the network.
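The naming passage above can be illustrated with a few lines of Python. This is a sketch under my own assumptions, not an NDN library: the helper names (components, segment_name, is_prefix) are hypothetical, and the "version then segment" layout simply mirrors the /parc/videos/WidgetA.mpg/1/3 example in the quote. The application decides what the trailing components mean; the network-side check only compares whole components.

```python
def components(name):
    """Split a name on '/' boundaries; the '/' itself is not part of the name."""
    return tuple(part for part in name.split("/") if part)


def to_name(parts):
    """Join components back into the familiar slash-separated form."""
    return "/" + "/".join(parts)


def segment_name(base, version, segment):
    """Application-level convention (assumed here): base/version/segment."""
    return to_name(components(base) + (str(version), str(segment)))


def is_prefix(prefix, name):
    """Router-style check: match on whole components, not on characters."""
    p, n = components(prefix), components(name)
    return n[:len(p)] == p


video = "/parc/videos/WidgetA.mpg"
seg = segment_name(video, version=1, segment=3)
print(seg)
print(is_prefix("/parc/videos", seg))  # a component-wise prefix
print(is_prefix("/parc/vid", seg))     # a string prefix, but not a name prefix
```

The last two lines show why component boundaries matter: /parc/vid is a character prefix of the video's name, but it is not a prefix in the hierarchy, so it would not aggregate with /parc/videos in a routing table.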
I haven't quoted from short sections on Data Centric Security, Routing and Forwarding, Intelligent Data Plane, Caching, or Intellectual Property Approach and open source. You should read NDN Motivation & Details, then much more from named-data.net if either your head exploded, or you are jumping up and down in your seat with questions and objections.
Much of this is QED Marketing - I told you how it works, not what it means for you. Here are a few thoughts:
1) Secure efficient transport of content crossing many boundaries is a hard problem, getting harder as the number of people, things, and places on the Web grows, and as people look for a seamless and trusted way to deal with things they care about at home and at work. For example, how could Apple (or Google) leverage NDN to deliver on an internet of your things? How might players other than the giants leverage NDN to compete?
2) NDN offers the possibility of doing a lot of the hard work at the network level, which is a win if it offers an economic benefit to those who pay for the fabric of the internet, and opportunities to invent and grow scalable businesses more effectively. For example, what could change if Amazon offered NDN as an Amazon Web Service?
3) NDN might offer an appropriate secure, flexible framework for connecting people to content at work. Businesses use siloed applications for transactional data for good reasons: they are simpler to build, (potentially) more secure, and (potentially) more flexible than old style monolithic business applications if they become sources of content linked together at a higher level of an application stack. NDN might be a great protocol to build flexible, secure, extensible business applications connecting people to the content they want - and are allowed to use.
With respect to the network issues, I'm a fan, not an expert, but the NDN proposal seems to share many of the (relatively) simple, scalable, decentralized characteristics that fueled the growth of the Web and evolution of TCP/IP. NDN seems to be most attractive for big content, particularly where multicast style delivery and caching can deliver big bandwidth and responsiveness improvements, but it looks like a lot of thought has gone into efficient localized delivery. Likewise, management of a very large, frequently changing name space is a challenge, which also seems to have gotten a lot of intelligent attention.
With Cisco and Huawei on board as founding industrial partners of the NDN Consortium, you can bet that a lot of caching routers can be sold, and NDN routing technology will take the fast track if there's economic payback for NDN, which will drive better payback, faster adoption, etc.
The good thing is the program has advanced to the stage where many of these questions can be answered by experiment - we shall see.
Will the NDN Consortium take off? Will Google, Apple and Microsoft jump in? Or will NDN join the queue of technically sweet solutions that never really get off the ground? I'm optimistic that NDN has the right technical characteristics and pedigree, with smart experienced people leading the charge. With the Internet of Things and secure content distribution efficiencies as economic drivers, I hope we'll all benefit from NDN's content delivery model as the next stage of the Web's evolution. If you're not in the battle, bring popcorn and watch - it should be a good show.
Named Data Networking Architecture: Motivation & Details The best short technical overview I've found of the objectives and approach of the Named Data Networking project. Read the overview to get a quick idea of how content is named, the NDN security and caching model, how NDN works over (or under) TCP, scaling issues, and more.
A New Way to Look at Networking - Van Jacobson's Aug 2006 Google Tech talk on TCP and Content Centric Networking (CCN). CCN is the title of Jacobson's Xerox PARC project, which became "the single biggest internal project at PARC." CCN led to the formation of the Named Data Networking project as a National Science Foundation funded Future Internet Architecture program in Sep 2010. Jacobson is currently a Principal Investigator of the NDN project. See Van Jacobson speaks on Content Centric Networking for a longer (three hour) and slightly earlier version of Jacobson's CCN talk presented as a Future Internet short course, including slides.
Reinventing the Web II (Aug 2014) The Web won vs "better" models by turning permanence into a decentralized economic decision. Why isn't the Web a reliable and useful long term store for the links and content people independently create? What can we do to fix that? Who benefits from creating spaces with stable, permanently addressable content? Who pays? What incentives can make Web scale permanent, stable content with reliable bidirectional links and other goodies as common and useful as Web search over the entire flakey, decentralized and wildly successful Web? NDN is the sweetest and most credible global technical approach I've seen.
Continuity and Intertwingled Work (Jun 2014) A level above an Internet of Things: seamless experience across devices for you, your family, your health and trusted service providers, at home and at work.
Intertwingled Work (Jul 2010) No one Web service or collection of Web servers contains everything people need, but we get along using search and creative services that link content across wildly different sources. The same principle applies when you want to link and work across wildly diverse siloed systems of record and transactional databases.
Thought Vectors - Ted Nelson: Art not Technology (Jul 2014) Ted Nelson should be smiling - but I won't hazard a guess. From what I see, everything in NDN seems compatible with, if not influenced by, the Docuverse, Tumbler, and fine grain content addressable network architecture that Nelson described in detail in his 1987 book Literary Machines. I believe NDN provides secure, scalable, fine grain, and upwards compatible networking that could connect the front end and back end Xanadu architecture that Nelson describes in Literary Machines. I'll follow up on this with a separate Boffin alert.