Latent semantic analysis - Wikipedia, the free encyclopedia |
 | | Latent semantic analysis (LSA) is a technique in natural language processing, in particular in vectorial semantics, invented in 1990 [1] by Scott Deerwester, Susan Dumais, George Furnas, Thomas Landauer, and Richard Harshman. |
 | | LSA uses a term-document matrix which describes the occurrences of terms in documents; it is a sparse matrix whose rows correspond to documents and whose columns correspond to terms, typically stemmed words that appear in the documents. |
 | | A typical example of the weighting of the elements of the matrix is tf-idf: each element is proportional to the number of times the term appears in the document, with rare terms upweighted to reflect their relative importance. |
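The weighting described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `tfidf_matrix`, the whitespace tokenizer, and the toy documents are all assumptions made for the example, and the inverse document frequency is taken as log(N / df), which is only one of several common variants. No stemming is applied. Rows correspond to documents and columns to terms, following the layout described in the text.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Build a tf-idf weighted matrix: one row per document, one column per term.

    Assumptions for this sketch: lowercase whitespace tokenization, no stemming,
    and idf = log(N / df), where N is the number of documents and df is the
    number of documents containing the term.
    """
    tokenized = [doc.lower().split() for doc in docs]
    terms = sorted({t for tokens in tokenized for t in tokens})
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs at least once.
    df = Counter(t for tokens in tokenized for t in set(tokens))

    matrix = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts for this document
        matrix.append([tf[t] * math.log(n_docs / df[t]) for t in terms])
    return terms, matrix


# Hypothetical toy corpus, used only to exercise the function.
docs = ["the cat sat", "the dog sat", "the cat chased the dog"]
terms, matrix = tfidf_matrix(docs)

for doc, row in zip(docs, matrix):
    print(doc, [round(w, 3) for w in row])
```

In this toy corpus "the" occurs in every document, so its weight is zero everywhere, while "chased" occurs in only one document and receives the largest weight in that row. Most entries are zero, which is why the matrix is usually stored in a sparse format.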
| en.wikipedia.org /wiki/Latent_semantic_indexing (630 words) |
|