RSS (file format)

From Wikipedia, the free encyclopedia.

To discuss RSS syndication feeds from Wikipedia, visit Wikipedia:Syndication.

RSS is a family of XML file formats for Web syndication used by, among other things, news websites and weblogs. Depending on the version, the abbreviation stands for Rich Site Summary (RSS 0.91), RDF Site Summary (RSS 0.9 and 1.0), or Really Simple Syndication (RSS 2.0).

The technology behind RSS allows Internet users to subscribe to websites that have provided RSS feeds; these are typically sites that change or add content regularly. To use this technology, site owners create or obtain specialized software (such as a content management system) which, in the machine-readable XML format, presents new articles in a list, giving a line or two of each article and a link to the full article or post. Unlike subscriptions to many printed newspapers and magazines, most RSS subscriptions are free.

The RSS formats provide web content or summaries of web content together with links to the full versions of the content, and other metadata. This information is delivered as an XML file called an RSS feed, webfeed, RSS stream, or RSS channel. In addition to facilitating syndication, RSS allows a website's frequent readers to track updates on the site using an aggregator.
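As an illustration of the structure described above, the following is a minimal sketch of reading such a feed with Python's standard library. The feed shown is a made-up RSS 2.0 document (the site name and URLs are hypothetical), and `read_items` is an illustrative helper, not part of any RSS specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the site name and URLs are invented for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.org/</link>
    <description>Latest stories from Example News</description>
    <item>
      <title>First story</title>
      <link>http://example.org/stories/1</link>
      <description>A line or two summarising the first story.</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.org/stories/2</link>
      <description>A line or two summarising the second story.</description>
    </item>
  </channel>
</rss>
"""


def read_items(feed_xml):
    """Return a list of (title, link, summary) tuples from an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.findall("./channel/item"):
        items.append((
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("description", default=""),
        ))
    return items


for title, link, summary in read_items(SAMPLE_FEED):
    print(f"{title}: {link}")
```

Each item carries a short summary and a link to the full article, which is the pattern the formats are built around.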

Usage

RSS is widely used by the weblog community to share the latest entries' headlines or their full text, and even attached multimedia files. (See podcasting, vodcasting, broadcatching, photocasting, picturecasting, screencasting, vlogging, and MP3 blogs.) In mid-2000, use of RSS spread to many major news organizations, including Reuters, CNN, and the BBC. Under various usage agreements, these providers allow other websites to incorporate their "syndicated" headline or headline-and-short-summary feeds. RSS is now used for many purposes, including marketing, bug reports, or any other activity involving periodic updates or publications.

A program known as a feed reader or aggregator can check RSS-enabled webpages on behalf of a user and display any updated articles that it finds. It is now common to find RSS feeds on major Web sites, as well as many smaller ones.
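The checking step can be sketched roughly as follows. This is only an illustration: it assumes that an item's link is a good enough identifier for "already seen", which real readers may not rely on, and the `new_items` helper and sample feeds are hypothetical.

```python
import xml.etree.ElementTree as ET


def new_items(feed_xml, seen_links):
    """Return (title, link) pairs not seen before and record them in seen_links."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        # Assumption: the link identifies the item; unseen links are treated as updates.
        if link and link not in seen_links:
            seen_links.add(link)
            fresh.append((item.findtext("title", default=""), link))
    return fresh


# Two hypothetical polls of the same feed; the second contains one extra item.
POLL_1 = """<rss version="2.0"><channel><title>Example</title>
<item><title>Old story</title><link>http://example.org/1</link></item>
</channel></rss>"""
POLL_2 = """<rss version="2.0"><channel><title>Example</title>
<item><title>New story</title><link>http://example.org/2</link></item>
<item><title>Old story</title><link>http://example.org/1</link></item>
</channel></rss>"""

seen = set()
print(new_items(POLL_1, seen))  # first poll: every item is new
print(new_items(POLL_2, seen))  # second poll: only the added item is reported
```

A real aggregator would fetch the feed over HTTP on a schedule and persist the set of seen items between runs; both are omitted here to keep the sketch self-contained.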

Client-side readers and aggregators are typically constructed as standalone programs or extensions to existing programs like web browsers. Such programs are available for various operating systems. See list of news aggregators.

Web-based feed readers and news aggregators require no software installation and make the user's "feeds" available on any computer with Web access. Some aggregators syndicate (combine) RSS feeds into new feeds, for example by taking all football-related items from several sports feeds and providing a new football feed. There are also search engines for content published via RSS feeds, such as Feedster, Blogdigger and Plazoo.
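The football example could be approximated as below. This is a sketch under simple assumptions: items are matched by a case-insensitive keyword search over title and description, the input feeds are in the RSS 2.* style without namespaces, and the feed contents, URLs and the `combine_feeds` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET


def combine_feeds(feed_xmls, keyword, title):
    """Build a new RSS 2.0 feed holding every item that mentions keyword."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = "http://example.org/combined"  # made-up URL
    ET.SubElement(channel, "description").text = f"Items mentioning {keyword}"
    for feed_xml in feed_xmls:
        for item in ET.fromstring(feed_xml).iter("item"):
            text = (item.findtext("title", default="") + " "
                    + item.findtext("description", default=""))
            if keyword.lower() in text.lower():
                channel.append(item)  # reuse the original item element as-is
    return ET.tostring(rss, encoding="unicode")


# Two hypothetical sports feeds.
SPORTS_A = """<rss version="2.0"><channel><title>Sports A</title>
<item><title>Football cup final tonight</title><link>http://example.org/a/1</link></item>
<item><title>Tennis open begins</title><link>http://example.org/a/2</link></item>
</channel></rss>"""
SPORTS_B = """<rss version="2.0"><channel><title>Sports B</title>
<item><title>Transfer news</title><description>Football club signs striker</description>
<link>http://example.org/b/1</link></item>
</channel></rss>"""

combined = combine_feeds([SPORTS_A, SPORTS_B], "football", "Football only")
titles = [i.findtext("title") for i in ET.fromstring(combined).iter("item")]
print(titles)
```

Keyword matching is a deliberately crude stand-in for whatever selection rule an actual aggregator applies.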

On Web pages, RSS feeds are typically linked with an orange rectangle, optionally bearing the letters XML or RSS.

History

Before RSS, several similar formats already existed for syndication, but none achieved widespread popularity or are still in common use today, as most were envisioned to work only with a single service. For example, in 1997 Microsoft created Channel Definition Format for the Active Channel feature of Internet Explorer 4.0, which became mildly popular. Dave Winer also designed his own XML syndication format for use on his Scripting News weblog, which was also introduced in 1997 [1].

RDF Site Summary, the first version of RSS, was created by Dan Libby of Netscape in March 1999 for use on the My Netscape portal. This version became known as RSS 0.9. In July 1999, responding to comments and suggestions, Libby produced a prototype tentatively named RSS 0.91 [2] (RSS standing for Rich Site Summary), that simplified the format and incorporated parts of Winer's scriptingNews format. This they considered an interim measure, with Libby suggesting an RSS 1.0-like format through the so-called Futures Document [3].

Soon afterwards, Netscape lost interest in RSS/XML, leaving the format without an owner, just as it was becoming widely used. A working group and mailing list, RSS-DEV, was set up by various users and XML world notables to continue its development. At the same time, Winer unilaterally posted a modified version of the RSS 0.91 specification to the UserLand website, since it was already in use in their products. He claimed the RSS 0.91 specification was the property of his company, UserLand Software.[4] Since neither side had any official claim on the name or the format, arguments raged whenever either side claimed RSS as its own, creating what became known as the RSS fork.

The RSS-DEV group went on to produce RSS 1.0 in December 2000. Like RSS 0.9 (but not 0.91) this was based on the RDF specifications, but was more modular, with many of the terms coming from standard metadata vocabularies such as Dublin Core.

Nineteen days later, Winer released RSS 0.92 on his own, a minor and supposedly compatible set of changes to RSS 0.91. In April 2002, he published a draft of RSS 0.93 which was almost identical to 0.92. A draft RSS 0.94 surfaced in August, reverting the changes made in 0.93, and adding a type attribute to the description element.

In September 2002, Winer released a final successor to RSS 0.92, known as RSS 2.0 and emphasizing "Really Simple Syndication" as the meaning of the three-letter abbreviation. The RSS 2.0 spec removed the type attribute added in RSS 0.94 and allowed people to add extension elements using XML namespaces. Several versions of RSS 2.0 were released, but the version number of the document model was not changed. In 2003, Winer and UserLand Software assigned ownership of the RSS 2.0 specification to Winer's then workplace, Harvard's Berkman Center for Internet & Society.

Winer was criticized for unilaterally creating a new format and raising the version number. In response, RSS 1.0 coauthor Aaron Swartz published RSS 3.0, a non-XML textual format. The format was possibly intended as a parody and only a few implementations were ever made.

In January 2005, Sean B. Palmer and Christopher Schmidt produced a preliminary draft of RSS 1.1. [5] It was intended as a bugfix for 1.0, removing little-used features, simplifying the syntax and improving the specification based on the more recent RDF specifications. As of July 2005, RSS 1.1 had amounted to little more than an academic exercise.

In August 2005, Jonathan Avidan launched his own project [6] to create an "RSS 3", though apparently without backing from anyone in the RSS industry, and the project failed to take off. Sean B. Palmer and Morbus Iff, claiming to be acting on behalf of Aaron Swartz, sent a cease-and-desist notice for abuse of the RSS 3 name. [7]

Incompatibilities

As noted above, there are several different versions of RSS, falling into two major branches. The RDF, or RSS 1.* branch includes the following versions:

  • RSS 0.90 was the original Netscape RSS version. This RSS was called RDF Site Summary, but was based on an early working draft of the RDF standard, and was not compatible with the final RDF Recommendation.
  • RSS 1.0 and 1.1 are an open format by the "RSS-DEV Working Group", again standing for RDF Site Summary. RSS 1.0 is an RDF format like RSS 0.90, but not fully compatible with it, since 1.0 is based on the final RDF 1.0 Recommendation.

The RSS 2.* branch (initially UserLand, now Harvard) includes the following versions:

  • RSS 0.91 is the simplified RSS version released by Netscape, and also the version number of the simplified version championed by Dave Winer from UserLand Software. The Netscape version was now called Rich Site Summary; it was no longer an RDF format, but was relatively easy to use. It remains the most common RSS variant.
  • RSS 0.92 through 0.94 are expansions of the RSS 0.91 format, which are mostly compatible with each other and with Winer's version of RSS 0.91, but are not compatible with RSS 0.90. In all UserLand RSS 0.9x specifications, RSS was no longer an acronym.
  • RSS 2.0.1 has the internal version number 2.0. RSS 2.0.1 was proclaimed to be "frozen", but was still updated shortly after release without changing the version number. RSS now stood for Really Simple Syndication. The major change in this version is an explicit extension mechanism using XML Namespaces.

For the most part, later versions in each branch are backward-compatible with earlier versions (aside from non-conformant RDF syntax in 0.90), and both branches include properly documented extension mechanisms using XML Namespaces, either directly (in the 2.* branch) or through RDF (in the 1.* branch). Most syndication software supports both branches. Mark Pilgrim's article "The Myth of RSS Compatibility" discusses RSS version compatibility in more detail.
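Software that supports both branches has to tell them apart first. One plausible way, sketched below, is to look at the root element: feeds in the 2.* branch conventionally use an `rss` root with a `version` attribute, while feeds in the RDF branch use an `rdf:RDF` root. These conventions are not spelled out in this article and are stated here as an assumption; `detect_branch` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the RDF syntax, used by the root element of RDF-based feeds.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def detect_branch(feed_xml):
    """Guess which RSS branch a document belongs to from its root element."""
    root = ET.fromstring(feed_xml)
    if root.tag == "rss":
        return "2.* branch, version " + root.get("version", "unknown")
    if root.tag == f"{{{RDF_NS}}}RDF":
        return "1.* branch (RDF)"
    return "unknown"


# Hypothetical minimal documents, one per branch.
RSS_2 = '<rss version="2.0"><channel><title>Example</title></channel></rss>'
RSS_1 = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/"><title>Example</title></channel>
</rdf:RDF>"""

print(detect_branch(RSS_2))
print(detect_branch(RSS_1))
```

Once the branch is known, a reader can pick the matching element names and namespaces for the rest of the parse.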

The extension mechanisms make it possible for each branch to track innovations in the other. For example, the RSS 2.* branch was the first to support enclosures, making it the current leading choice for podcasting, and as of mid-2005 it is the format supported for that use by iTunes and other podcasting software; however, an enclosure extension, mod_enclosure [8], is now available for the RSS 1.* branch. Likewise, the RSS 2.* core specification does not support providing full text in addition to a synopsis, but the RSS 1.* markup can be (and often is) used as an extension. There are also several common outside extension packages available, including a new proposal from Microsoft for use in Internet Explorer 7.
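The two mechanisms mentioned above can be seen side by side in a small sketch. It assumes that the full-text extension meant is the `content:encoded` element of the RSS 1.0 content module, used here inside an RSS 2.0 feed through an XML namespace; the feed, its URLs and the `describe_items` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS 1.0 content module (assumed to be the extension meant above).
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"

# Hypothetical podcast feed with an enclosure and a namespaced full-text element.
FEED = """<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example Podcast</title>
    <link>http://example.org/</link>
    <description>A made-up podcast feed</description>
    <item>
      <title>Episode 1</title>
      <description>Short synopsis of the episode.</description>
      <content:encoded>Full text of the show notes for episode 1.</content:encoded>
      <enclosure url="http://example.org/audio/ep1.mp3" length="12345678" type="audio/mpeg"/>
    </item>
  </channel>
</rss>"""


def describe_items(feed_xml):
    """Collect the synopsis, optional full text and optional enclosure of each item."""
    results = []
    for item in ET.fromstring(feed_xml).iter("item"):
        enclosure = item.find("enclosure")
        results.append({
            "title": item.findtext("title", default=""),
            "synopsis": item.findtext("description", default=""),
            # None when the feed does not use the content extension.
            "full_text": item.findtext(f"{{{CONTENT_NS}}}encoded"),
            "media_url": enclosure.get("url") if enclosure is not None else None,
            "media_type": enclosure.get("type") if enclosure is not None else None,
        })
    return results


for entry in describe_items(FEED):
    print(entry["title"], entry["media_url"], entry["media_type"])
```

A podcasting client would download the file named by the enclosure; a reader that does not know the content extension can simply ignore the namespaced element and fall back to the synopsis.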

The most serious compatibility problem is with HTML markup. UserLand's RSS reader, generally considered the reference implementation, did not originally filter out HTML markup from feeds. As a result, publishers began placing HTML markup into the titles and descriptions of items in their RSS feeds. This behaviour has become widely expected of readers, to the point of becoming a de facto standard, though there is still some inconsistency in how software handles this markup, particularly in titles. The RSS 2.0 specification was later updated to include examples of entity-encoded HTML; however, all prior plain-text usages remain valid.
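To make the problem concrete, here is a rough sketch of how a reader might turn an entity-encoded HTML description into plain text. It assumes the description really does contain HTML, which a reader cannot know for certain, and it strips tags with a simple regular expression rather than a real HTML parser; the item and the `description_as_text` helper are hypothetical.

```python
import html
import re
import xml.etree.ElementTree as ET

# Hypothetical item whose description carries entity-encoded HTML.
ITEM_XML = (
    "<item><title>Example story</title>"
    "<description>&lt;p&gt;Read the &lt;b&gt;full&lt;/b&gt; story.&lt;/p&gt;</description>"
    "</item>"
)


def description_as_text(item_xml):
    """Return the item's description with HTML tags and entities removed."""
    item = ET.fromstring(item_xml)
    # The XML parser undoes one level of encoding, leaving raw HTML markup.
    markup = item.findtext("description", default="")
    # Crude tag stripping; assumes the text was meant as HTML, not literal text.
    no_tags = re.sub(r"<[^>]+>", "", markup)
    return html.unescape(no_tags).strip()


print(description_as_text(ITEM_XML))
```

The same code applied to a plain-text description that happens to contain an angle-bracketed phrase would silently delete it, which is the kind of inconsistency described above.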

Atom

In reaction to perceived deficiencies in both RSS branches (and because RSS 2.0 is frozen with the intention that future work be done under a different name), a third group started a new syndication specification, Atom, in June 2003, and their work was later adopted by the Internet Engineering Task Force (IETF).

The relative benefits of Atom and the two RSS branches are currently a subject of heated debate within the Web-syndication community. Supporters claim that Atom improves on both RSS branches by relying more heavily on standard XML features, by supporting autodiscovery, and by specifying a payload container that can handle many different kinds of content unambiguously. Opponents claim that Atom unnecessarily introduces a third branch of syndication specifications, further confusing the marketplace.

For a comparison of Atom 1.0 to RSS 2.0 from the point of view of an Atom supporter, see Tim Bray's article [9].
