A Syndication Project

Design Principles

This URL: http://www.iangraham.org/projects/news/issues-0.html

Created: 24 September, 2000
Last Update: 2 October, 2000

Author(s): Ian Graham

Overall Message Design

The message has to start with some sort of root element, so I've chosen the name syndicationMessage, which seems to summarize pretty well what we're talking about. This element contains a syndicated data message, and consequently can contain five broadly defined types of information:

  1. metadata about the service that aggregated the message
  2. metadata relevant to the syndication process
  3. metadata about each piece of data content in the message. This will have both human language-independent and language-dependent parts.
  4. metadata about possible relationships between the different data content parts in the message
  5. the data content parts themselves
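
As a rough illustration of the five categories above, the following Python sketch builds an empty message skeleton. Only the root name syndicationMessage comes from this document; every child element name is a placeholder chosen for the example, not a proposed part of the format.

```python
import xml.etree.ElementTree as ET

# Hypothetical child element names, one per category in the list above.
# Only the root name "syndicationMessage" is taken from this document.
SECTION_NAMES = [
    "aggregatorMetadata",   # 1. the service that aggregated the message
    "syndicationMetadata",  # 2. the syndication process
    "contentMetadata",      # 3. each piece of data content
    "relationalMetadata",   # 4. relationships between content parts
    "content",              # 5. the data content parts themselves
]


def build_skeleton():
    """Return an empty syndicationMessage with one placeholder child per category."""
    root = ET.Element("syndicationMessage")
    for name in SECTION_NAMES:
        ET.SubElement(root, name)
    return root


if __name__ == "__main__":
    # Serialize the skeleton to a string so its shape can be inspected.
    print(ET.tostring(build_skeleton(), encoding="unicode"))
```

A real message would presumably repeat the content-related elements once per data part; the flat layout here is only meant to show the five slots side by side.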

This document describes the specific types of information that are most desired.

Note that the message should carry only the information needed to manage a collection of syndicated data and/or data feeds, and nothing else. In other words, a syndication message should not provide any information about its data content, or about the service that provided or delivered the data, that is not needed to manage the syndication process. Additional information required to fulfill those roles would be either

1. Metadata about the Aggregator

The following information about the aggregator (source of the message) is often required by the consumer.

2. Metadata and the Syndication Process

The syndication process itself needs certain pieces of information so that it knows how and when it can use the data content of the message. The following properties summarize the 'lowest common denominator' information typically needed to manage syndicated data:

3a. Language-Independent Metadata for the Data Content

The syndication process will also need to know certain things about the data content, and/or the creator of the data content (who may be different from the aggregator). This information would include:

3b. Language-Dependent Metadata for the Data Content

The syndication process also needs to know certain language-dependent things about the data content. The following items are the 'lowest common denominator' items that are common to most current syndication processes:

Each message could have more than one set of language-dependent metadata, so that the syndication tool could appropriately manage the data in systems using different human languages.
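
To make the idea of several language-dependent metadata sets concrete, here is a minimal Python sketch of how a syndication tool might pick the set matching its own language. The element names item, languageMetadata and title are assumptions made up for the example, as is the choice of the standard xml:lang attribute to label each set; this document does not define any of them.

```python
import xml.etree.ElementTree as ET

# xml:lang belongs to the reserved XML namespace; ElementTree exposes it under this key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment: "item", "languageMetadata" and "title" are placeholder names.
SAMPLE = """<item>
  <languageMetadata xml:lang="en"><title>News of the day</title></languageMetadata>
  <languageMetadata xml:lang="fr"><title>Nouvelles du jour</title></languageMetadata>
</item>"""


def pick_metadata(item, preferred, fallback="en"):
    """Return the metadata set for the preferred language, else the fallback, else None."""
    by_lang = {m.get(XML_LANG): m for m in item.findall("languageMetadata")}
    chosen = by_lang.get(preferred)
    if chosen is None:
        chosen = by_lang.get(fallback)
    return chosen


if __name__ == "__main__":
    item = ET.fromstring(SAMPLE)
    print(pick_metadata(item, "fr").findtext("title"))
```

Falling back to a default language is one possible policy; a tool could equally reject a message that lacks metadata in its own language.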

4. Relational Metadata

It is easy to imagine a message containing related pieces of data: for example a Web page and the associated image files, a collection of XML documents representing different language variants of the same text content, or a sequence of news items related to the same general topic. It would be useful if the syndication mechanism allows for an easy way of including such relational information.

There are several efforts underway to provide a framework for such metadata (site maps, etc.). These all generally use RDF (expressed in XML) to encode the information. There is also no set of common, simple relationships that could easily be chosen for a 'basic' syndication format. It may thus make sense to define a syndication message format that 'allows' for such metadata, but that does not define any specific mechanism for providing the relational metadata.
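
One way to read "allows but does not define" is that a consumer treats the relational metadata as an opaque block: it finds the block if present, hands it on untouched, and works equally well when the block is absent. The Python sketch below shows that reading only. The element names relationalMetadata, content and link are hypothetical, and nothing here implies a particular encoding (RDF or otherwise) for the block's contents.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: "relationalMetadata", "content" and "link" are placeholder names.
SAMPLE = """<syndicationMessage>
  <relationalMetadata><link from="a" to="b"/></relationalMetadata>
  <content id="a"/>
  <content id="b"/>
</syndicationMessage>"""


def split_message(message):
    """Return (content_parts, relational_block) without interpreting the block."""
    relational = message.find("relationalMetadata")  # None when the slot is absent
    parts = message.findall("content")
    return parts, relational


if __name__ == "__main__":
    parts, relational = split_message(ET.fromstring(SAMPLE))
    print(len(parts), relational is not None)
```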

5. The Data Itself

We need to know a fair bit about the data content itself. Such as