Information Retrieval (IR)


Information retrieval is essentially a matter of deciding which documents in a collection should be retrieved to satisfy a user’s need for information. The retrieval decision is made by comparing the terms of the query with the index terms (important words or phrases) appearing in the document itself. The decision may be binary (retrieve/reject), or it may involve estimating the degree of relevance that the document has to the query.
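The two kinds of decision described above can be illustrated with a small sketch. It assumes that index terms are simply the lowercase, whitespace-separated words of a document and that relevance is estimated as the fraction of query terms found among them. The function names and the 0.5 threshold are hypothetical choices for illustration, not part of any particular IR system.

```python
def index_terms(text):
    """Treat each lowercase, whitespace-separated word as an index term (a simplification)."""
    return set(text.lower().split())


def relevance(query, document):
    """Estimate relevance as the fraction of query terms that appear in the document."""
    query_terms = index_terms(query)
    if not query_terms:
        return 0.0
    return len(query_terms & index_terms(document)) / len(query_terms)


def retrieve(query, document, threshold=0.5):
    """Binary decision: retrieve the document when its estimated relevance reaches the threshold."""
    return relevance(query, document) >= threshold


doc = "boolean queries are matched against index terms"
print(relevance("index terms", doc))    # prints 1.0
print(retrieve("ranking index", doc))   # prints True (relevance is 0.5)
```

The same score serves both purposes: used directly it gives a degree of relevance, and compared with a threshold it gives a retrieve/reject decision.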

System                        Data Object         Primary Operation          Database Size
IR (Information Retrieval)    document            (probabilistic) retrieval  small to very large
(relational) DBMS             table               (deterministic) retrieval  small to very large
AI (Artificial Intelligence)  logical statements  inference                  usually small

Retrieval Algorithms
Retrieval algorithms retrieve documents or text whose information content is relevant to a user’s information need. To decide the order in which to display documents to the user, the search engine uses an algorithm to rank the pages that contain the keywords. For example, it may count the number of times a keyword appears on each page and list the pages with higher counts first.
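The keyword-counting idea can be sketched as follows. This is a minimal illustration, assuming pages are plain strings and words are separated by whitespace; `keyword_count` and `rank_pages` are hypothetical names, and real search engines use far more elaborate ranking signals.

```python
from collections import Counter


def keyword_count(keyword, page_text):
    """Count how many times the keyword occurs as a whole word in the page."""
    counts = Counter(page_text.lower().split())
    return counts[keyword.lower()]


def rank_pages(keyword, pages):
    """Return names of pages containing the keyword, highest count first.

    `pages` maps a page name to its text. Ties are broken by page name.
    """
    scored = [(keyword_count(keyword, text), name) for name, text in pages.items()]
    matching = [(count, name) for count, name in scored if count > 0]
    matching.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in matching]


pages = {
    "a": "search engines rank pages",
    "b": "a search index makes search fast",
    "c": "databases store tables",
}
print(rank_pages("search", pages))  # prints ['b', 'a']
```

Page "c" is left out because it never mentions the keyword, and "b" comes first because the keyword appears in it twice.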

Query Processing
Query processing is the activity of analyzing a query and comparing it to indexes to find relevant items. A user enters one or more keywords, optionally combined with Boolean operators such as “and,” “or,” or “not,” into a search engine, which then looks for the keywords in its index of Web pages.
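A Boolean query of this kind can be sketched with an inverted index, which maps each term to the set of pages containing it. The example assumes whitespace tokenization and a query of the form "left operator right"; `build_index` and `boolean_query` are hypothetical helpers, and "not" is interpreted here as "pages containing the left term but not the right one".

```python
def build_index(pages):
    """Build an inverted index: term -> set of page names containing that term."""
    index = {}
    for name, text in pages.items():
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(name)
    return index


def boolean_query(index, left, operator, right):
    """Evaluate a two-term Boolean query against the inverted index."""
    left_pages = index.get(left.lower(), set())
    right_pages = index.get(right.lower(), set())
    if operator == "and":
        return left_pages & right_pages
    if operator == "or":
        return left_pages | right_pages
    if operator == "not":
        return left_pages - right_pages
    raise ValueError(f"unsupported operator: {operator}")


pages = {
    "p1": "information retrieval systems",
    "p2": "database retrieval",
    "p3": "information theory",
}
index = build_index(pages)
print(sorted(boolean_query(index, "information", "and", "retrieval")))  # prints ['p1']
print(sorted(boolean_query(index, "information", "not", "retrieval")))  # prints ['p3']
```

Because the index is built once, each query reduces to a few set operations instead of a scan of every page.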



