
An Introduction to Latent Semantic Analysis
Thomas K. Landauer, Peter W. Foltz, and Darrell Laham, 1998
Wednesday, August 18, 2004

Though many have believed that its popularity stems only from having a wonderful name, Latent Semantic Analysis (LSA) turns out to be both surprisingly useful and possibly an accurate representation of what goes on inside our heads. Landauer et al. show this by summarizing a large body of research comparing LSA with humans on tasks such as categorization, estimating coherence, semantic priming and even scoring essays (!?).

LSA takes as input a matrix representing the occurrence of, for example, words in phrases or phrases in documents or, most broadly, things in collections. It uses singular value decomposition (SVD) to factor this matrix into three: one representing the rows, one the columns and one diagonal matrix of "weights" (the singular values). This representation can then be compressed by keeping only the largest weights and dropping the rest, which reduces the number of dimensions. The "distance" between words/phrases/things is then measured in this compressed analogue of the original matrix. The decomposition and compression steps force the matrix to reveal the hidden connections between the things (hence, Latent Semantics).
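To make the decompose-then-compress idea concrete, here is a minimal sketch in plain Python. Everything in it is my own illustration rather than anything from the paper: the five-word, three-document matrix is made up, the function names are hypothetical, and instead of a real SVD routine it uses power iteration with deflation on the term-by-term matrix A·Aᵀ, which is only sensible for toy sizes. The point is that "car" and "automobile" never appear in the same document, yet both appear with "engine", and keeping just two dimensions is enough to pull them together.

```python
import math

# Toy term-by-document counts (made up for illustration).
# Documents: d1 = "car engine", d2 = "automobile engine", d3 = "flower petal"
TERMS = ["car", "automobile", "engine", "flower", "petal"]
A = [
    [1, 0, 0],  # car
    [0, 1, 0],  # automobile
    [1, 1, 0],  # engine
    [0, 0, 1],  # flower
    [0, 0, 1],  # petal
]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def top_eigenpairs(sym, k, iters=500):
    """Top-k (eigenvalue, eigenvector) pairs of a symmetric positive
    semidefinite matrix, by power iteration with deflation."""
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        v = [1.0 / math.sqrt(n)] * n
        for _ in range(iters):
            w = matvec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        lam = sum(a * b for a, b in zip(v, matvec(work, v)))
        pairs.append((lam, v))
        # Deflate: remove the component just found so the next pass
        # converges to the next-largest eigenvalue.
        for i in range(n):
            for j in range(n):
                work[i][j] -= lam * v[i] * v[j]
    return pairs

def lsa_term_vectors(matrix, k):
    """Rank-k term coordinates U_k * S_k, where the singular values are
    the square roots of the eigenvalues of A * A^T."""
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix]
            for r1 in matrix]
    pairs = top_eigenpairs(gram, k)
    return [[math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in pairs]
            for i in range(len(matrix))]

vecs = dict(zip(TERMS, lsa_term_vectors(A, k=2)))

raw_sim = cosine(A[0], A[1])                            # car vs automobile, raw rows
lsa_sim = cosine(vecs["car"], vecs["automobile"])       # same pair, compressed
other_sim = cosine(vecs["car"], vecs["flower"])         # unrelated pair, compressed

print("raw:", raw_sim, "lsa:", lsa_sim, "car/flower:", other_sim)
```

In the raw matrix the two words share no documents, so their similarity is zero. After compression to two dimensions they land on essentially the same vector, while "car" and "flower" stay unrelated. For anything bigger than a toy, a proper SVD from a numerical library would be the better choice.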

As the authors say, you can treat LSA as a useful technique regardless of whether or not you believe the larger claim that it (or something very close to it) is actually how our brains function. They do, however, present an impressive array of evidence that LSA matches human performance pretty darn well.

Perhaps the most surprising part of LSA is that it works so well without taking syntax into account: all LSA looks at is inclusion of things within groups. The order of these things doesn't matter. I'd be interested in finding domains where LSA failed because syntax really was important. It would also be fun to look for incremental algorithms (and/or ones that could be reasonably implemented in wetware). In any case, it's a technique I want to add to my toolbox (Lisp programs coming someday).
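A tiny sketch of what ignoring order means in practice, assuming the simplest possible tokenizer (lowercase and split on whitespace, my choice, not the paper's). The helper name `bag_of_words` is hypothetical. Two sentences with opposite meanings produce exactly the same counts, so they would contribute identical columns to the matrix LSA starts from.

```python
from collections import Counter

def bag_of_words(text):
    """Count word occurrences, discarding word order entirely."""
    return Counter(text.lower().split())

a = bag_of_words("dog bites man")
b = bag_of_words("man bites dog")

print(a == b)  # True: same words, same counts, order is gone
```

This is the kind of case where I'd expect a purely inclusion-based method to struggle, since who bit whom lives entirely in the word order.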


Copyright -- Gary Warren King, 2004 - 2006