Search engine

From Wikipedia, the free encyclopedia

A search engine is an information retrieval system designed to help find information stored on a computer system. Search engines help to minimize the time required to find information and the amount of information which must be consulted, akin to other techniques for managing information overload.

The most popular form of search engine is the Web search engine, which searches for information on the public World Wide Web. Other kinds include enterprise search engines, which search intranets; desktop search engines; and mobile search engines.

How search engines work

Querying

Main article: Search query

Search engines provide an interface to a group of items that enables users to specify criteria about an item of interest and have the engine find the matching items within the group.

In the most popular form of search, items are documents or web pages and the criteria are words or concepts that the documents may contain[1].

A search engine user can express a query in several varieties of syntax. Some methods are formalized and require a strict logical or algebraic syntax. Other approaches are less strict and allow for a less precisely defined query. One form of less restricted query syntax is referred to as natural language search, a term typically used to describe web search engines that apply natural language processing of some form. For example, instead of one or two words, a query could consist of an English sentence or paragraph. A natural language search engine then parses the query into words and evaluates searches for these words. This places less burden on the user to formulate a specific query using restrictive, and sometimes difficult to learn, syntax.

A second definition of natural language search describes how the search engine performs indexing, independent of the query syntax. This approach requires a semantic understanding of the query in order to disambiguate the text.
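The first sense of natural language search described above, in which a sentence-like query is parsed into words that are then searched for, can be sketched roughly as follows. This is a minimal illustration, not how any particular engine works: the stop-word list, the function names `parse_query` and `matching_documents`, and the rule that a document must contain every term are all assumptions made for the example.

```python
import re

# Hypothetical stop-word list for illustration; real engines use larger, tuned lists.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for", "to",
              "is", "are", "what", "which", "how", "do", "i"}


def parse_query(query: str) -> list[str]:
    """Split a free-text query into lowercase terms, dropping stop words."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return [w for w in words if w not in STOP_WORDS]


def matching_documents(query: str, documents: dict[str, str]) -> list[str]:
    """Return the ids of documents that contain every term of the parsed query."""
    terms = parse_query(query)
    if not terms:
        # A query made only of stop words matches nothing in this sketch.
        return []
    results = []
    for doc_id, text in documents.items():
        doc_words = set(re.findall(r"[a-z0-9]+", text.lower()))
        if all(term in doc_words for term in terms):
            results.append(doc_id)
    return results
```

With this sketch, `parse_query("What is the capital of France?")` yields the terms `capital` and `france`, so the user does not need to know any special query syntax.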

Traditional search engines tend to use a non-linguistic model of language, and the hypothesis is that natural language search will provide better results, that is, results that more accurately and efficiently support a user's need[citation needed].

Ranking

A Boolean search for an item within a group of items returns either the exactly matching item or nothing. This is a traditional, strict search method in which the desired item and the actual item must be exactly equal. In practice, it is often more useful to apply a looser measure of similarity between the desired item(s) and the items in the group being searched.

For example, instead of finding only the exact book in a library, a library search engine may return a list of 'similar' books, with the exact book listed first.

The items that meet the criteria specified by the query are typically sorted, or ranked, so as to place the most 'relevant' items first. Placing the most relevant items first reduces the time users need to determine whether one or more of the resulting items are sufficiently similar to the query. It has become common knowledge through the use of Web search engines that the further down the list of matching items a user browses, the less relevant the items become.
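The difference between an exact Boolean match and a ranked list of similar items can be illustrated with a small sketch. The similarity measure used here, Jaccard overlap between sets of words, is an assumption chosen for simplicity and is not one named by this article; the function names are likewise hypothetical.

```python
def exact_match(query: str, titles: list[str]) -> list[str]:
    """Boolean-style search: return a title only if it equals the query."""
    return [t for t in titles if t.lower() == query.lower()]


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of words common to both sets, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def ranked_search(query: str, titles: list[str]) -> list[str]:
    """Return titles sharing at least one word with the query, best match first."""
    query_words = set(query.lower().split())
    scored = [(jaccard(query_words, set(t.lower().split())), t) for t in titles]
    scored = [(score, t) for score, t in scored if score > 0]
    # sort() is stable, so titles with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored]
```

For a catalogue holding "Information Retrieval", "Modern Information Retrieval" and "Cooking Basics", the query "information retrieval" returns one title from `exact_match`, while `ranked_search` returns the exact title first, followed by the similar one.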

Indexing

Main article: Index (search engine)

To provide a set of matching items quickly, a search engine will typically collect information, or metadata, about the group of items under consideration beforehand. For example, a library search engine may determine the author of each book automatically and add the author name to a description of each book. Users can then search for books by the author's name. Other metadata in this example might include the book title, the number of pages in the book, the date it was published, and so forth.

The metadata collected about each item is typically stored on a computer in the form of an index. The index typically requires less computer storage than the items themselves and gives the search engine a way to calculate the relevance, or similarity, between the query and the set of items.
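As a rough sketch of the library example, the following builds a small index from book metadata ahead of time and then answers queries from the index alone. The catalogue records, the field names and the functions `build_index` and `search` are hypothetical, and the index shown (a mapping from each word to the books whose title or author contains it) is only one simple way to organise such metadata.

```python
from collections import defaultdict

# Hypothetical catalogue records; the field names are assumptions for this sketch.
BOOKS = [
    {"title": "Gardening Basics", "author": "Jane Doe", "pages": 120},
    {"title": "Advanced Gardening", "author": "John Roe", "pages": 340},
    {"title": "Cooking at Home", "author": "Jane Doe", "pages": 200},
]


def build_index(books: list[dict]) -> dict[str, set[int]]:
    """Map each word of a book's title and author to the positions of matching books."""
    index = defaultdict(set)
    for book_id, book in enumerate(books):
        for field in ("title", "author"):
            for word in book[field].lower().split():
                index[word].add(book_id)
    return dict(index)


def search(index: dict[str, set[int]], books: list[dict], query: str) -> list[str]:
    """Return titles of books whose indexed metadata contains every query word."""
    words = query.lower().split()
    if not words:
        return []
    # Intersect the sets of book positions recorded for each query word.
    ids = set.intersection(*(index.get(w, set()) for w in words))
    return [books[i]["title"] for i in sorted(ids)]
```

Because the index is built beforehand, a search for an author's name only looks up a few dictionary entries instead of scanning every book record.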

Helpful features

  1. Spell checker
  2. Highlighter

A spell checker saves users time in typing correct words by offering auto-correct options that can be enabled or disabled. A highlighter, such as a yellow line marker, highlights particular search terms in the results. Highlighting can also be used for copying and editing.
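Both features can be approximated in a few lines. The sketch below is only an illustration under stated assumptions: spelling suggestions come from the standard-library `difflib.get_close_matches` against a small hypothetical vocabulary, and highlighting wraps each match in a plain-text marker instead of a coloured background. The function names are invented for this example.

```python
import difflib
import re


def suggest_spelling(word: str, vocabulary: list[str]) -> str:
    """Return the closest known word, or the input unchanged if nothing is close."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.6)
    return matches[0] if matches else word


def highlight(text: str, term: str, marker: str = "**") -> str:
    """Wrap every case-insensitive occurrence of term in the marker."""
    if not term:
        return text
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    return pattern.sub(lambda m: f"{marker}{m.group(0)}{marker}", text)
```

For instance, `suggest_spelling("serch", ["search", "engine", "index"])` suggests `search`, and `highlight("Search engines index the Web", "index")` marks the word `index` in the result text.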

Predictive search engines study consumer or end-user patterns when presenting results. Variants of search engines present personalized data.

References

  1. Voorhees, E.M. Natural Language Processing and Information Retrieval. National Institute of Standards and Technology. March 2000.