opening it up with Common Lisp

Favorite weblogs

Lisp Related

Bill Clementson

Finding Lisp


Planet Lisp



Talking Points Memo

This Modern World

Working for Change

Other home


Recent Readings

Book review: Darwinia
Reviewed: Friday, August 11, 2006

Summer reading: Spin
Reviewed: Saturday, August 5, 2006

Reviewed: Tuesday, July 18, 2006

the Omnivoire's Delimma
Reviewed: Wednesday, July 12, 2006

the Golem's Eye
Reviewed: Wednesday, May 31, 2006


On the Robustness of Centrality Measures under Conditions of Imperfect Data
Stephen P. Borgatti and Kathleen M. Carley and David Krackhardt, 0 , (Paper URL)
Thursday, December 23, 2004

Everyone is familiar with how polling (or sampling) can be used to estimate values for entire populations while examining only a tiny fraction of its members. This works very well (2000 and 2004 elections not withstanding) when the values in question are independent of the relationships among the members of the population. That is, we can estimate the average height of all 26-year olds by measuring only a few of them because the height of each 26-old is assumed to be independent of all others. The problem is that when the values we care about are relationships, sampling is no longer so easy. We can draw valid conclusions from samples of a population if and only if we have independence between each sample member but when we have relationships (graphs), independence is out the door (if for no other reason than that most real world graphs have friend of a friend structure: it's more likely that you my friends will know each other than not).

To pursue this, Borgatti, Carley and Krackhardt investigate one typical graph measure under varying noise conditions to see how the noise alters the sampling results. This is definitely useful and important research - not to mention a paper that practically writes itself! - and the authors do a good job explaining their methodology and results. The only serious limitation of their work is that they study only Erdos / Renyi random graphs (see here for a summary of Marc Newman's excellent SIAM survey on graph theoyr) and only random noise. As they themselves conclude:

A crucial limitation of this study is that we have studied only random error on random networks. This is appropriate as a first step in understanding how measurement error affects the calculation of network indices, but it should be clear the results could be quite different for practical settings in which (a) the data collection methodology makes systematic errors (such as more readily losing nodes with low degree), and (b) the networks themselves are not randomly constructed (as we expect for most human networks).

Erdos / Renyi random graphs are easy to analyze but by now everyone knows (yes, even your two year old) that they are very poor models for real networks - the ones we care about. If I have time, I'd like to follow up their results.

Home | About | Quotes | Recent | Archives

Copyright -- Gary Warren King, 2004 - 2006