tobo (nibot) wrote in lj_nifty,

community membership overlap

A few months ago, I became interested in membership overlap between groups of LiveJournal communities. Specifically, there are at least 40 communities related to the geographic area in which I live. I expected some communities to be approximate subsets of others (general interest ⊇ specific interest), but I was also interested in more complex relationships. I decided to compute, for every ordered pair of communities in my sample, the conditional probability that someone who is a member of community A is also a member of community B; that is, the fraction of A's members who also belong to B. For the sake of perversity (certainly not efficiency), I implemented this computation as a Unix shell script. The results are available as a big matrix. I'm interested in ideas about how to better arrange the rows/columns of the matrix to make groupings more obvious.
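
For anyone who wants to play with the idea, here is a minimal Python sketch of the same computation. It is not the original shell script. The function name overlap_matrix, the community names, and the member lists are all made up for illustration, and reporting an empty community as 0.0 is just an assumption.

```python
def overlap_matrix(members):
    """Return matrix[a][b] = P(member of b | member of a).

    `members` maps a community name to the set of its member usernames.
    """
    matrix = {}
    for a, a_members in members.items():
        row = {}
        for b, b_members in members.items():
            if a_members:
                # Fraction of a's members who also belong to b.
                row[b] = len(a_members & b_members) / len(a_members)
            else:
                # Assumption: an empty community is reported as 0.0.
                row[b] = 0.0
        matrix[a] = row
    return matrix


# Hypothetical sample data, not real communities or users.
members = {
    "city_general": {"ann", "bob", "cat", "dan"},
    "city_food": {"ann", "bob"},
    "city_bikes": {"cat", "eve"},
}

matrix = overlap_matrix(members)
```

The matrix is not symmetric, and that is the interesting part: in this toy data, matrix["city_food"]["city_general"] is 1.0 while matrix["city_general"]["city_food"] is 0.5, which is what an approximate subset relationship looks like.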


