Metadata

From Wikipedia, the free encyclopedia.

Metadata is also a U.S. trademark of The Metadata Company

Metadata (Greek: meta- + Latin: data "information"), literally "data about data", is information that describes another set of data. A common example is a library catalog card, which contains data about the contents and location of a book: it is data about the data in the book that the card refers to. Metadata also commonly records the source or author of the described dataset, how the dataset should be accessed, and its limitations.

Other machine-generated data about data, such as the inverted index created by a free-text search engine, is generally not considered metadata. Another important type of data about data is the links or relationships among data. Some metadata schemes attempt to embrace this concept (such as the Dublin Core element link). Since metadata is itself data, it is possible to have "metadata of the metadata of data".

Metadata that is embedded in the content it describes is called embedded metadata. A data repository, by contrast, typically stores the metadata separately from the data.

Uses

Metadata has become important on the World Wide Web because of the need to find useful information in the mass of information available. Manually created metadata adds value because it ensures consistency: if one webpage about a topic contains a word or phrase, then all webpages about that topic should contain that same word or phrase. It also ensures variety, so that if one topic has two names, each of these names will be used. For example, an article about sport utility vehicles would also be given the metadata keywords ‘4 wheel drives’, ‘4WDs’ and ‘four wheel drives’, as this is how they are known in some countries.
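As a rough illustration of keyword metadata on a webpage, the sketch below reads the keywords from an HTML `<meta name="keywords">` element using Python's standard `html.parser` module. The page content, the `KeywordExtractor` class and the `extract_keywords` function are hypothetical names made up for this example; real pages may carry their keywords in other ways.

```python
from html.parser import HTMLParser


class KeywordExtractor(HTMLParser):
    """Collects the comma-separated values of <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values may be None when the attribute has no value.
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            self.keywords += [k.strip() for k in content.split(",") if k.strip()]


def extract_keywords(html):
    """Return the list of keywords declared in the page's metadata."""
    parser = KeywordExtractor()
    parser.feed(html)
    return parser.keywords


# Hypothetical page about sport utility vehicles, tagged with regional synonyms.
page = """<html><head>
<title>Sport utility vehicles</title>
<meta name="keywords" content="SUV, 4 wheel drives, 4WDs, four wheel drives">
</head><body>...</body></html>"""

print(extract_keywords(page))
```

A search tool that indexes these keywords could then match a query for "4WDs" even though the visible article text only says "sport utility vehicles".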

Sources of metadata for audio CDs include the MusicBrainz project and AMG's All Music Guide. Similarly, MP3 files have metadata tags in a format called ID3.
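To make the idea of an embedded tag concrete, here is a minimal sketch that reads the oldest and simplest variant, ID3v1, which occupies the last 128 bytes of a file and begins with the marker `TAG`. The function name `parse_id3v1` and the sample bytes are made up for this example, and the later ID3v2 format (stored at the start of the file) is not handled.

```python
import struct


def parse_id3v1(data):
    """Return a dict of ID3v1 fields, or None if no ID3v1 tag is present."""
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return None
    # Layout after the 3-byte marker: title(30) artist(30) album(30)
    # year(4) comment(30) genre(1).
    title, artist, album, year, comment, genre = struct.unpack(
        "3x30s30s30s4s30sB", data[-128:]
    )

    def text(raw):
        # Fields are padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,  # numeric genre code
    }


# Synthetic sample: some stand-in "audio" bytes followed by a hand-built tag.
sample = (
    b"\x00" * 64
    + b"TAG"
    + b"Example Title".ljust(30, b"\x00")
    + b"Example Artist".ljust(30, b"\x00")
    + b"Example Album".ljust(30, b"\x00")
    + b"1999"
    + b"".ljust(30, b"\x00")
    + bytes([17])
)

tag = parse_id3v1(sample)
print(tag)
```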

Metadata is more properly called an ontology or schema when it is structured into a hierarchical arrangement. Both terms describe “what exists” for some purpose or to enable some action. For instance, the arrangement of subject headings in a library catalog serves not only as a guide to finding books on a particular subject in the stacks, but also as a guide to what subjects “exist” in the library’s own ontology and how more specialized topics are related to or derived from the more general subject headings.

Metadata is frequently stored in a central location and used to help organizations standardize their data. This information is typically stored in a Metadata Registry.

Types

Relational database metadata

Relational databases use tables to store their own metadata. Each relational database has its own methods for storing metadata. Examples of relational metadata include:

  • a table of all tables in the database, with their names, sizes and the number of rows in each table
  • tables of the columns in each database, recording which tables they are used in and the type of data stored in each column

For an example, see Oracle metadata.
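As a small, self-contained illustration of a database describing itself, the sketch below uses SQLite through Python's standard `sqlite3` module rather than Oracle. SQLite keeps its table catalog in a table named `sqlite_master` and reports column details through `PRAGMA table_info`. The `book` and `loan` tables and the `describe` helper are invented for this example, and other database systems expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.execute("CREATE TABLE loan (book_id INTEGER, due TEXT)")
conn.execute("INSERT INTO book (title, author) VALUES ('Example', 'A. Author')")


def describe(conn):
    """Return {table: {"columns": [(name, type), ...], "rows": count}}."""
    # sqlite_master is SQLite's own "table of all tables".
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    catalog = {}
    for table in tables:
        # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = [
            (row[1], row[2])
            for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        rows = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        catalog[table] = {"columns": columns, "rows": rows}
    return catalog


catalog = describe(conn)
for name, info in catalog.items():
    print(name, info)
```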

Data warehouse metadata

Kimball1 lists the following types of metadata in a data warehouse (See also [1]):

File system metadata

Nearly all file systems keep metadata about files out-of-band. Some systems keep metadata in directory entries; others keep it in specialized structures such as inodes, or even in the name of a file. Metadata can range from simple timestamps, mode bits, and other special-purpose information used by the implementation itself, to icons and free-text comments, to arbitrary attribute-value pairs.
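The simpler kinds of file metadata mentioned above (timestamps, mode bits, inode numbers) can be read in Python with `os.stat`. The following is a minimal sketch; the `file_metadata` helper and the temporary file are made up for the example, and the exact fields available vary by operating system and file system.

```python
import os
import stat
import tempfile
import time


def file_metadata(path):
    """Return a few common metadata fields for one file."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        # Mode bits rendered in the familiar "-rw-r--r--" style.
        "mode": stat.filemode(info.st_mode),
        "modified": time.strftime(
            "%Y-%m-%d %H:%M:%S", time.localtime(info.st_mtime)
        ),
        "inode": info.st_ino,
    }


# Create a throwaway file so the example does not depend on any real path.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w") as handle:
        handle.write("hello")
    meta = file_metadata(path)

print(meta)
```

None of these values is stored in the file's contents; the file itself holds only the five bytes that were written.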

With more complex and open-ended metadata, it becomes useful to search for files based on the metadata contents. The Unix find utility was an early example, although it is inefficient when scanning hundreds of thousands of files on a modern computer system. Apple Computer's current version of its Mac OS X operating system (Tiger) supports cataloging and searching file metadata through a feature known as Spotlight. Microsoft Windows (Vista) is expected to include similar functionality via the WinFS file system.
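A search by metadata in the style of find can be sketched as a walk over a directory tree that tests each file's metadata in turn. The helper name `find_larger_than` and the temporary files below are assumptions made for this example. Because nothing is indexed, every query has to visit every file again, which is why this approach scales poorly to very large trees.

```python
import os
import tempfile


def find_larger_than(root, min_bytes):
    """Return paths under root whose size exceeds min_bytes."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat reads the entry's own metadata without following symlinks.
            if os.lstat(path).st_size > min_bytes:
                matches.append(path)
    return sorted(matches)


# Build a small throwaway tree: one tiny file and one larger file.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    with open(os.path.join(root, "small.txt"), "wb") as handle:
        handle.write(b"x" * 10)
    with open(os.path.join(root, "sub", "large.bin"), "wb") as handle:
        handle.write(b"x" * 5000)
    results = [os.path.basename(p) for p in find_larger_than(root, 1000)]

print(results)
```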

Program metadata

Most executable file formats include metadata describing issues that need to be considered by the runtime or operating system when executing the program.

In DOS, the COM file format includes no such metadata, but the EXE file format does, as does the Windows PE format. This metadata can include the company that published the program, the date the program was created, the version number and more.
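As one concrete case, a Windows PE file starts with the DOS `MZ` marker, stores the offset of its PE header at byte 0x3C, and records a build timestamp in the COFF header that follows the `PE\0\0` signature. The sketch below reads only that timestamp. The function name `pe_timestamp` and the hand-built byte buffer are made up for the example, and details such as company name and version number live in other structures that are not parsed here.

```python
import struct
from datetime import datetime, timezone


def pe_timestamp(data):
    """Return the COFF TimeDateStamp of a PE image, or None if not a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # The 4-byte little-endian offset of the PE header is stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return None
    if len(data) < pe_offset + 12:
        return None
    # COFF header begins right after the signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    _machine, _sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return timestamp


# Synthetic header only, not a runnable program.
image = bytearray(0x80)
image[0:2] = b"MZ"
struct.pack_into("<I", image, 0x3C, 0x40)
image[0x40:0x44] = b"PE\x00\x00"
struct.pack_into("<HHI", image, 0x44, 0x14C, 1, 915148800)

stamp = pe_timestamp(bytes(image))
print(stamp, datetime.fromtimestamp(stamp, tz=timezone.utc).isoformat())
```

A DOS COM file, by contrast, is a raw memory image with no header, so the same function returns `None` for it.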

In the Microsoft .NET executable format, extra metadata is included to allow reflection at runtime.

Application programs such as Microsoft Word and other Microsoft Office products also save metadata in their document files. This metadata can contain the name of the person who created the file (obtained from the operating system), the name of the person who last edited the file, how many times the file has been printed, and even how many revisions have been made to the file.

For a list of executable formats, see object file.

References

1 Ralph Kimball, The Data Warehouse Lifecycle Toolkit, Wiley, 1998

2 Guy V Tozer, Metadata Management for Information Control and Business Success, Artech House, 1999
