From indexes to intelligent retrieval
Information retrieval has been a fundamental human need since the dawn of recorded knowledge. For the past 70 years, retrieval systems have operated under the same core paradigm:
- First, a user frames an information need as a query.
- They then submit this query to the retrieval system.
- Finally, the system returns references to documents that may satisfy the information need:
  - References may be rank-ordered by decreasing relevance
  - Results may contain relevant excerpts from each document (known as snippets)
While this paradigm has remained constant, its implementation and user experience have changed dramatically. Early information retrieval systems relied on manual indexing and basic keyword matching. The advent of computerized indexing in the 1960s introduced the inverted index, a data structure that maps each word to the list of documents containing it. This lexical approach powered the first generation...
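To make the inverted index concrete, here is a minimal sketch in Python. The function names (`tokenize`, `build_inverted_index`, `search`), the three sample documents, and the choice of boolean AND matching are illustrative assumptions, not a description of any particular historical system.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs: dict[int, str]) -> dict[str, list[int]]:
    """Map each word to the sorted list of IDs of documents containing it."""
    postings: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            postings[token].add(doc_id)
    return {word: sorted(ids) for word, ids in postings.items()}


def search(index: dict[str, list[int]], query: str) -> list[int]:
    """Return IDs of documents that contain every query word (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical sample collection, invented for illustration only.
docs = {
    1: "Manual indexing of library catalogs",
    2: "Keyword matching with an inverted index",
    3: "Computerized indexing and keyword search",
}

index = build_inverted_index(docs)
print(index["indexing"])                  # [1, 3]
print(search(index, "keyword indexing"))  # [3]
```

This sketch matches exact word forms only: "index" and "indexing" are separate entries, so document 2 is not returned for the query "indexing". It also returns matches unranked and without snippets, which is the basic keyword matching that purely lexical systems start from.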