Latent Semantic Indexing

Google is using it increasingly (as of Thursday, July 20, 2006). That is, Google is depending more on semantic analysis to rank results, rather than on simple keyword matching and link analysis. You can even do a "semantic" search on a term by prefixing it with "~". For example, searches on "~word" and "word" return different results.

General references on LSI:

"Latent Semantic Indexing (LSI) is a novel, patented information retrieval method developed at Telcordia Technologies, Inc. By using statistical algorithms, LSI can retrieve relevant documents even when they do not share any words with a query."

http://www.cs.utk.edu/~lsi/

http://www-psych.nmsu.edu/~pfoltz/cois/filtering-cois.html

Papers by Lillian Lee, who has a wonderfully playful website and good taste in literature.

Dr. Edel Garcia writes to say: "You might want to know of a Tutorial Series on LSI and its algorithm, SVD, available at http://www.miislita.com, debunking the many SEO myths about the subject." By "SEO" I believe he is referring to "Search Engine Optimization", an industry that attempts to tell people how to get their pages ranked prominently on Google. It's true that searches on this subject return a lot of garbage results from SEO outfits, making it harder to find the interesting research. (August 16, 2006)
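To make the claim that LSI "can retrieve relevant documents even when they do not share any words with a query" concrete, here is a minimal sketch of the idea using a truncated SVD in Python with NumPy. It is not the patented Telcordia implementation, and it says nothing about what Google actually does. The toy corpus, the function names, the raw term counts (no weighting), and the choice of k=2 are all assumptions made for illustration. Documents and the query are both projected onto the top k left singular vectors of the term-document matrix and then compared by cosine similarity, which is one common simplification among several conventions.

```python
import numpy as np

# Toy corpus (hypothetical, for illustration only).
DOCS = [
    "car engine repair",
    "automobile engine repair",
    "automobile dealer",
    "banana fruit salad",
    "fruit juice",
]


def build_term_doc_matrix(docs):
    """Return (vocab, A) where A[i, j] counts term i in document j."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.split():
            A[index[word], j] += 1.0
    return vocab, A


def query_vector(query, vocab):
    """Represent the query as term counts; unknown words are ignored."""
    index = {word: i for i, word in enumerate(vocab)}
    q = np.zeros(len(vocab))
    for word in query.split():
        if word in index:
            q[index[word]] += 1.0
    return q


def cosine_to_columns(vec, matrix):
    """Cosine similarity between vec and each column of matrix."""
    dots = matrix.T @ vec
    norms = np.linalg.norm(matrix, axis=0) * np.linalg.norm(vec)
    # Zero-norm vectors get similarity 0 instead of a division by zero.
    return np.divide(dots, norms, out=np.zeros_like(dots), where=norms > 0)


def keyword_scores(query, docs):
    """Baseline: cosine similarity in the raw term space."""
    vocab, A = build_term_doc_matrix(docs)
    return cosine_to_columns(query_vector(query, vocab), A)


def lsi_scores(query, docs, k=2):
    """Cosine similarity after projecting onto the top-k left singular vectors."""
    vocab, A = build_term_doc_matrix(docs)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]  # basis of the k-dimensional "latent" space
    q_k = U_k.T @ query_vector(query, vocab)
    docs_k = U_k.T @ A  # each column is one document in latent space
    return cosine_to_columns(q_k, docs_k)


if __name__ == "__main__":
    for name, scores in [("keyword", keyword_scores("car", DOCS)),
                         ("lsi", lsi_scores("car", DOCS))]:
        print(name, np.round(scores, 3))
```

On this corpus, the query "car" gets a keyword score of zero against "automobile dealer", because the two share no words. After the rank-2 projection the same document should score close to 1, because "car" and "automobile" both co-occur with "engine" and "repair", while the two fruit documents stay near zero. With only two latent dimensions the corpus collapses into two topics, so all three vehicle documents score about the same. A real collection would need term weighting and a much larger k.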