opening it up with Common Lisp

Data Mining in Social Networks
Wednesday, April 28, 2004

This paper by David Jensen and Jennifer Neville provides a lightweight overview of some of the issues in analyzing relational data. It includes a taxonomy of criteria by which to judge datasets and tools (e.g., network size, connectivity, relational dependence, and so forth) and highlights how concentrated linkages, degree disparity, and relational autocorrelation can lead to biased feature selection and spurious correlation.

Although several technologies are mentioned, the example running throughout the paper is Jensen and Neville's own QGraph query language and Relational Probability Tree (RPT) as applied to the Internet Movie Database. I've not read the details of RPT, but it appears to be similar to decision trees except that the input to the algorithm is a set of not necessarily isomorphic graphs instead of a set of attribute vectors. The user must decide on the query that pulls these graphs from the dataset and on the attribute to be learned. The RPT algorithm then must build the decision nodes based on attributes of the graphs in the input set. These can consist of "regular" attributes, composites (e.g., link counts), statistical relations (e.g., (> (mean birth-year) X)), and inequalities (e.g., ((> (proportion birth-year) X) Y)). This is a big class of possible decisions, and it's not clear from the paper how the candidates are generated or chosen.
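
To make the kinds of decisions listed above a little more concrete, here is a minimal Python sketch of how each one might be evaluated on a single subgraph. It is only my reading of the description, not the RPT algorithm itself: the toy data, the dictionary layout, and the function names (link_count, mean_exceeds, proportion_exceeds) are all invented for illustration, and I am assuming that the "proportion" test means "the share of linked values above X is greater than Y".

```python
from statistics import mean

# Hypothetical toy data: each subgraph is one movie plus its linked actors.
# The layout and the values are made up for illustration only.
subgraphs = [
    {"title": "A", "actors": [{"birth_year": 1950},
                              {"birth_year": 1960},
                              {"birth_year": 1975}]},
    {"title": "B", "actors": [{"birth_year": 1980},
                              {"birth_year": 1985}]},
]


def link_count(subgraph, link="actors"):
    """Composite feature: how many objects are linked to the root."""
    return len(subgraph[link])


def mean_exceeds(subgraph, attr, x, link="actors"):
    """Statistical relation: is the mean of attr over linked objects > x?"""
    values = [obj[attr] for obj in subgraph[link]]
    return bool(values) and mean(values) > x


def proportion_exceeds(subgraph, attr, x, y, link="actors"):
    """Assumed reading of the proportion test:
    is the share of linked objects with attr > x itself greater than y?"""
    values = [obj[attr] for obj in subgraph[link]]
    if not values:
        return False
    share = sum(1 for v in values if v > x) / len(values)
    return share > y


# One fixed-length feature tuple per subgraph, whatever its shape.
features = [
    (link_count(g),
     mean_exceeds(g, "birth_year", 1970),
     proportion_exceeds(g, "birth_year", 1970, 0.5))
    for g in subgraphs
]

for g, f in zip(subgraphs, features):
    print(g["title"], f)
```

The point of the sketch is that aggregation turns subgraphs with different numbers of linked objects into comparable values that a tree node can split on. What it does not show is the hard part: how the thresholds X and Y, and the attributes to aggregate, are searched over.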

In summary, it's an easy read that doesn't require much background to understand. On the other hand, one is left wishing that more details were covered.



Copyright -- Gary Warren King, 2004 - 2006