opening it up with Common Lisp
Book review: Darwinia
Summer reading: Spin
the Omnivoire's Delimma
the Golem's Eye
A measure of betweenness centrality based on random walks
Network analysis is big business now a days because, suddenly, we see networks everywhere we look (the internet, power grid, food web, social network, genome to name a few). Mark Newman is one of the physicists who has brought serious mathematical chops to their analysis. In this paper, he investigates a new way to determine how "important" a vertex is to its network, it's centrality. The intuition for most centrality measures is obvious: a vertex is more central the more other vertexes depend on it to reach the rest of the network. Measuring it requires forming this intuition into an algorithm that provides it with precise meaning. So, for example, degree measures how many connections a vertex has, closeness the average shortest path distance between a vertex and every other vertex, and betweenness centrality is a measure of how many paths a vertex between other vertexes a vertex is on.
Different variants of betweenness centrality exist because there are different ways of determining paths. At one extreme, we can look at all and only the shortest paths between vertexes (shortest path betweenness); somewhere in the middle, we can look at flow betweenness: look at every path but count longer ones less. Both of these cases presuppose some measure of intelligence (so to speak) on whatever is choosing the paths. In some cases, this is very reasonable (news transmission comes to mind); in others, it is not (the spread of infectious disease). In response to this, Newman defines random-walk betweenness: make lots of random walks between all pairs of vertexes on a network and measure how often these pass through a given vertex. One surprise is that you can compute this without making all those random walks (he gives an O((m+n)n2) algorithm involving matrix inversion).
Newman goes on to compare this new measure with the existing ones on several different graphs: a few designed to highlight deficiencies in the existing measures, a graph is sexually transmitted disease in Colorado and a graph depicting family relationships in 15th century Florence (I told you networks were everywhere!).
Newman's writing is always concise and clear. His math is hard but not insurmountable. This is a paper well worth reading and the random-walk betweenness is definitely a keeper.
Copyright -- Gary Warren King, 2004 - 2006