Fri, Jun 29, 2007 |
DNS (domain name system) is a distributed, coherent, reliable, autonomous, hierarchical database, the first and only one of its kind. Created in the 1980s when the Internet was still young but overrunning its original system for translating host names into IP addresses, DNS is one of the foundation technologies that made the worldwide Internet (and the World Wide Web) possible. Yet this did not all happen smoothly, and DNS technology has been periodically refreshed and refined. Though it's still possible to describe DNS in simple terms, the underlying details are by now quite sublime. This article explores the supposed and true definitions of DNS (both the system and the protocol) and shows some of the tension between these two definitions through the lens of the Internet protocol development philosophy.

Simplified View

The DNS namespace has a tree structure, where every node has a parent except the root node, which is its own parent. Nodes have labels that are from 1 to 63 characters long, except the root node whose label is empty. A domain is a node in context, and a fully qualified domain name has a presentation form that is just the node names, bottom up, with each followed by a period (.). For example, www.google.com is the fully qualified name of a node whose name is www, whose parent is google, whose grandparent is com, and whose great-grandparent is the DNS root.

Nodes are grouped together into zones, the apex of each being called a start of authority and the bottom edges being called delegation points if other zones exist below them, or leaf nodes if not. Zones are served by authority servers that are either primary (if the zone data comes to them from outside the DNS) or secondary (if their zone data comes to them from primary servers via a zone transfer procedure). For example, root, org, acm.org, and hq.acm.org are separate zones of administrative authority.

Every node can have RRs (resource records) that contain the actual content of DNS.
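As a rough illustration of the naming rules in this simplified view, the Python sketch below splits a presentation-form name into labels and lists the node's ancestors up to the root. The function names split_fqdn and ancestors are hypothetical, and the sketch assumes plain labels with no escaped periods.

```python
def split_fqdn(name: str) -> list[str]:
    """Split a presentation-form name into labels, leaf label first."""
    # A trailing period stands for the empty root label; drop it before splitting.
    if name.endswith("."):
        name = name[:-1]
    labels = name.split(".") if name else []
    for label in labels:
        # Every label except the root's is 1 to 63 characters long.
        if not 1 <= len(label) <= 63:
            raise ValueError(f"bad label length: {label!r}")
    return labels


def ancestors(name: str) -> list[str]:
    """List the node and each of its ancestors, ending at the root (".")."""
    labels = split_fqdn(name)
    chain = [".".join(labels[i:]) + "." for i in range(len(labels))]
    chain.append(".")
    return chain
```

Under these assumptions, ancestors("www.google.com.") returns ["www.google.com.", "google.com.", "com.", "."], which is the parent, grandparent, and great-grandparent chain from the example above.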
Depending on its name, type, and data, an RR can map a host name to an IP address or vice versa, or describe the mail servers for a domain, or serve a growing variety of other purposes. Every RR has a name, class, type, TTL (time to live), and data. TTL is measured in seconds and begins to decrement whenever an RR is transmitted from an authority server. This TTL eventually ticks down to zero inside intermediate caching servers; thus, the authoritative server's stated TTL puts an upper bound on the reuse lifetime of an RR.
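The TTL behavior can be sketched as a small cache that stores each RR with an absolute expiry time and hands back only the time remaining. This is a minimal Python illustration, not how any particular server does it: the class names CachedRR and RRCache, the injectable clock, and the sample address are all assumptions made for the example.

```python
import time
from dataclasses import dataclass


@dataclass
class CachedRR:
    name: str
    rtype: str
    data: str
    expires_at: float  # clock reading at which the TTL reaches zero


class RRCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so the countdown can be simulated
        self._store = {}

    def put(self, name, rtype, data, ttl):
        # The authority's stated TTL (in seconds) fixes the expiry time.
        self._store[(name, rtype)] = CachedRR(name, rtype, data, self._clock() + ttl)

    def get(self, name, rtype):
        """Return (data, remaining_ttl), or None if absent or expired."""
        rr = self._store.get((name, rtype))
        if rr is None:
            return None
        remaining = rr.expires_at - self._clock()
        if remaining <= 0:
            # The TTL has ticked down to zero; the record may not be reused.
            del self._store[(name, rtype)]
            return None
        return rr.data, int(remaining)
```

With a simulated clock, a record stored with a TTL of 300 seconds comes back with 180 seconds remaining after 120 seconds have passed, and is gone once the full 300 seconds have elapsed.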
DNS clients are most often found inside the runtime libraries of TCP/IP initiators. These runtime libraries are called resolvers and most often will not have caches of their own (thus, they are stub resolvers). Stub resolvers request recursive service from their designated upstream full resolvers. A full resolver is capable of caching data for reuse, and of surfing the zone hierarchy to locate a DNS RR no matter where in the namespace it is located or on which authority servers it may be stored.

This view of DNS might not sound "simplified." Take heart, it's actually oversimplified. Read on to find out what's really going on in there.

Actual View

The character set of a DNS label is modified US-ASCII. It's eight-bit clean except for the values 0x41 to 0x5A (uppercase letters) and the values 0x61 to 0x7A (lowercase letters). These are considered equivalent ranges for the purpose of searching and matching, but their distinctions are retained on the wire and in presentation, in support of possible mixed-case English language trademarks encoded as domain names. Therefore, out of the 256 possible values that an octet can contain, only 230 are unique. In practice, only printable US-ASCII letters and numbers are used, and sometimes a hyphen for internal punctuation. Internationalization of the DNS protocols has been ongoing for 10 years and has virtually no market presence thus far.

Label case is supposed to be preserved when DNS data is cached and forwarded, in support of possible trademarks. The internal data structures that are universally used to support DNS caching, however, keep only one copy of each label. For example, there will be only one com TLD (top-level domain) label in a DNS cache, even though millions of other domain names can be stored "under" that TLD. The net effect here is that if the first .com domain you encounter uses all uppercase letters for its TLD domain label, then all other .com domains you encounter will appear the same way.
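The interaction between case-insensitive matching and single-copy label storage can be shown with a toy cache. The Python sketch below is only an illustration under stated assumptions: fold, Node, and NameCache are hypothetical names, and real caches use their own data structures, but the first-spelling-wins effect is the same.

```python
def fold(label: bytes) -> bytes:
    """Map octets 0x41-0x5A onto 0x61-0x7A; leave every other octet alone."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in label)


class Node:
    def __init__(self, label: bytes):
        self.label = label   # spelling exactly as first heard
        self.children = {}   # folded label -> Node


class NameCache:
    """Toy name tree that keeps one stored copy of each label."""

    def __init__(self):
        self.root = Node(b"")

    def insert(self, name: str) -> str:
        """Insert a dotted name; return it as the cache would present it."""
        node = self.root
        spelled = []
        # Walk from the TLD down to the leaf label.
        for text in reversed(name.rstrip(".").split(".")):
            label = text.encode("ascii")
            # Matching uses the folded form; the first spelling stored wins.
            node = node.children.setdefault(fold(label), Node(label))
            spelled.append(node.label.decode("ascii"))
        return ".".join(reversed(spelled))
```

In this sketch, inserting "VIX.COM" and then "example.com" into the same NameCache returns "example.COM" for the second name, because the COM node already exists with its uppercase spelling. Folding all 256 single-octet values also leaves exactly 230 distinct results.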
Therefore, if you cache VIX.COM first, then you will cache example.COM, even if you really did hear example.com next.

The trailing period (.) of a fully qualified domain name can be omitted in presentation, which can mean either that the name is not fully qualified and has to be searched in the default context or that it is fully qualified. For example, if you are inside ACM world headquarters and point your Web browser at internal, your resolver library will likely assume that you mean internal.hq.acm.org. Disambiguation is either by application convention or by actual searching (maybe you really meant internal.acm.org). One common application convention is, "If there are other period characters in the domain name, then assume that the name is fully qualified." (So, if you were in the San Francisco office, your resolver library might assume that internal means internal.acm.org but it would never guess that internal.hq means internal.hq.acm.org.)

RRs used to describe downward delegations must be present both at the bottom edge of the parent (delegating) zone and at the apex of the child (delegated) zone. These records are expected to be identical, but differences are common and the meaning of such differences is undefined. The system is very robust in the face of this and other undefined conditions, and protocol agents are prepared to retry pretty hard - and try every possible data path - before giving up. (Thus are local configuration errors transformed into silent resource drains on the world at large.)
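The trailing-period convention quoted above can be written down as a short candidate-generation routine. The Python sketch below is one possible reading of that convention, not a description of any particular resolver library; the function name candidates and the search lists shown are assumptions for illustration.

```python
def candidates(name: str, search_list: list[str]) -> list[str]:
    """Return the fully qualified names to try, in order."""
    if name.endswith("."):
        # An explicit trailing period means the name is already fully qualified.
        return [name]
    if "." in name:
        # Convention: any interior period means "assume fully qualified".
        return [name + "."]
    # A bare label is searched in each default context in turn.
    return [f"{name}.{suffix}." for suffix in search_list]
```

With a hypothetical headquarters search list of ["hq.acm.org", "acm.org"], candidates("internal", ...) yields internal.hq.acm.org. followed by internal.acm.org., while candidates("internal.hq", ...) yields only internal.hq. and never reaches internal.hq.acm.org.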
by Paul Vixie, Internet Systems Consortium
© ACM, Inc. All rights reserved. |