Information retrieval

History of Information Retrieval

  • Vannevar Bush: "As We May Think"

    "In view of the extent of present-day interests, the problem is not so much that we publish too much, but rather that publication has far exceeded our present ability to make real use of it (...) Professionally, our methods of transmitting and reviewing the results of scientific research are generations old and by now totally inadequate for their purpose ...
  • Period I: start of the use of the computer in IR

  • Calvin Mooers introduced the term "information retrieval"

  • H. P. Luhn: proposed to use words as units of indexing

    He proposed using words as indexing units for documents and measuring the overlap of words as a retrieval criterion.
  • Cranfield Institute of Technology: marked the beginning of IR as an empirical discipline

    At the Cranfield Institute of Technology and other associated institutions, tests began that marked the beginning of information retrieval as an empirical discipline. These tests strongly influenced the evolution of the field, and with them an evaluation methodology was developed that IR systems still use today.
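The Cranfield-style evaluation methodology rests on relevance judgments, from which set-based measures such as precision and recall are computed. A minimal sketch (a toy illustration, not the original Cranfield procedure):

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall over relevance judgments.

    retrieved: document ids returned by the system.
    relevant:  document ids judged relevant for the query.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    found = retrieved & relevant
    precision = len(found) / len(retrieved) if retrieved else 0.0
    recall = len(found) / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, a system that returns two documents of which one is relevant, out of two relevant documents in total, scores 0.5 on both measures.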
  • Period II: the 1960s

  • Gerard Salton: VSM, TF, cosine similarity

    The group (at Harvard and Cornell universities), led by Salton, produced numerous technical reports establishing ideas and concepts that remain important research areas today: the formalization of algorithms to rank documents with respect to a query, an approach in which documents and queries are represented as vectors in an n-dimensional space, and the measurement of the similarity between a document vector and the query vector as the cosine of the angle between the two vectors.
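The cosine measure described above can be sketched in a few lines over raw term-count vectors (a toy illustration with no term weighting, not Salton's SMART system):

```python
import math
from collections import Counter

def cosine_similarity(doc, query):
    """Cosine of the angle between two term-count vectors."""
    d = Counter(doc.lower().split())
    q = Counter(query.lower().split())
    dot = sum(d[t] * q[t] for t in q)          # inner product
    norm_d = math.sqrt(sum(c * c for c in d.values()))
    norm_q = math.sqrt(sum(c * c for c in q.values()))
    return dot / (norm_d * norm_q) if norm_d and norm_q else 0.0
```

Identical texts score 1.0 (angle of zero), and texts with no terms in common score 0.0 (orthogonal vectors).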
  • Period III: the 1970s

    One of the key developments of this period was Luhn's term-frequency (TF) weighting (based on the occurrence of words within a document), complemented by the work of Spärck Jones on the occurrence of words across the documents of a collection. Likewise, Salton synthesized the results of his group's work on vectors to produce the vector space model.
  • Karen Spärck Jones: TF-IDF

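Combining Luhn's term frequency with Spärck Jones's inverse document frequency gives the classic TF-IDF weight. A minimal sketch using raw TF and a logarithmic IDF (many weighting variants exist):

```python
import math
from collections import Counter

def tf_idf(term, doc_tokens, corpus):
    """TF-IDF weight of a term in one document of a corpus.

    corpus: list of token lists. A term occurring in every
    document gets IDF 0, i.e. it carries no discriminating power.
    """
    tf = Counter(doc_tokens)[term]
    df = sum(1 for d in corpus if term in d)   # document frequency
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf
```

A term that appears in only one document of the collection is weighted higher than one that appears everywhere.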
  • Robertson & Spärck Jones: probabilistic model

    An alternative means of modeling IR systems extended the idea of Maron et al. [86] of using probability theory. Robertson defined the probability ranking principle, which determines how best to rank documents based on probabilistic measures with respect to the defined evaluation measures.
  • Martin F Porter: stemming

    New algorithms were created for stemming, the process of matching words to their lexical variants. Although such algorithms had been known since the 1960s, they improved substantially with the contributions of Porter and other authors, and they are still in use today.
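The idea behind suffix-stripping stemmers can be illustrated with a deliberately simplified sketch. This is not Porter's actual algorithm, which applies several rule phases guarded by measure-based conditions; the suffix list and minimum-stem length here are arbitrary choices for illustration:

```python
def toy_stem(word):
    """Strip the first matching suffix, keeping a stem of at
    least three letters (a toy illustration, not Porter's rules)."""
    for suffix in ("ational", "ization", "ness", "ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word
```

So "connected" and "cats" reduce to "connect" and "cat", letting an index match lexical variants of the same word.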
  • Period IV: the 1980s to the mid-1990s

  • Deerwester et al.: LSI

  • Text Retrieval Conference (TREC)

    An initiative of Voorhees and Harman: an annual exercise in which numerous international research groups collaborate to build test collections larger than those that existed before. With the large text collections available through TREC, many old techniques were modified and new ones were developed, and continue to be developed, for effective retrieval.
  • Robertson et al.: BM25
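BM25 combines an IDF component with term-frequency saturation and document-length normalization. A minimal sketch of the Okapi BM25 score with the usual free parameters k1 and b (one common IDF variant among several):

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_tokens, corpus, k1=1.2, b=0.75):
    """Okapi BM25 score of one document for a query.

    corpus: list of token lists, used for document frequencies
    and the average document length.
    """
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    freqs = Counter(doc_tokens)
    score = 0.0
    for t in query_terms:
        df = sum(1 for d in corpus if t in d)
        if df == 0:
            continue
        idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
        f = freqs[t]
        # TF saturates as f grows; b scales length normalization
        norm = f + k1 * (1 - b + b * len(doc_tokens) / avgdl)
        score += idf * f * (k1 + 1) / norm
    return score
```

A document containing a query term outscores one that does not, and repeated occurrences yield diminishing returns rather than linear growth.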

  • Period V: the mid-1990s to the present

  • Brin & Page: PageRank

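PageRank scores pages by how much rank flows to them from the pages that link to them, computed by power iteration with a damping factor. A minimal sketch over an adjacency dict (a toy illustration, not Google's production computation):

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a link graph.

    links: dict mapping each page to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}
        for p, outlinks in links.items():
            if outlinks:
                share = damping * rank[p] / len(outlinks)
                for q in outlinks:      # rank flows along out-links
                    new[q] += share
            else:                       # dangling page: spread uniformly
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank
```

The ranks form a probability distribution (they sum to 1), and pages that receive links from many or highly ranked pages end up with higher scores.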
  • Ponte & Croft: language model
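Ponte and Croft's language-modeling approach ranks documents by the probability that a document's language model generates the query. A minimal sketch using unigram models with Jelinek-Mercer smoothing against the collection model (the smoothing weight lam is an arbitrary choice here for illustration):

```python
import math
from collections import Counter

def query_log_likelihood(query_terms, doc_tokens, corpus, lam=0.5):
    """Log-probability that a document's unigram language model
    generates the query, smoothed with the collection model."""
    doc_counts = Counter(doc_tokens)
    coll_counts = Counter(t for d in corpus for t in d)
    coll_len = sum(coll_counts.values())
    score = 0.0
    for t in query_terms:
        p_doc = doc_counts[t] / len(doc_tokens)   # document model
        p_coll = coll_counts[t] / coll_len        # collection model
        p = lam * p_doc + (1 - lam) * p_coll      # smoothed mixture
        if p == 0:
            return float("-inf")
        score += math.log(p)
    return score
```

Smoothing keeps a document from scoring zero just because it misses one query term, while documents actually containing the term still rank higher.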

  • Jon M. Kleinberg: HITS

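Kleinberg's HITS assigns each page two mutually reinforcing scores: a good authority is pointed to by good hubs, and a good hub points to good authorities. A minimal sketch by iterated updates with normalization (a toy illustration on an adjacency dict):

```python
def hits(links, iterations=50):
    """Hub and authority scores on a link graph.

    links: dict mapping each page to the list of pages it links to.
    """
    pages = set(links) | {q for out in links.values() for q in out}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # authority: sum of hub scores of pages linking to it
        auth = {p: sum(hub[q] for q in links if p in links[q])
                for p in pages}
        # hub: sum of authority scores of pages it links to
        hub = {p: sum(auth[q] for q in links.get(p, []))
               for p in pages}
        # normalize so the scores stay bounded
        na = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        nh = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {p: v / na for p, v in auth.items()}
        hub = {p: v / nh for p, v in hub.items()}
    return hub, auth
```

On a graph where two pages both link to a third, the third emerges as the authority and the other two as hubs.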