Tuesday, May 22, 2007

Controversy Visualization

Alas, it was not our visualization of revert relationships, but someone got slashdotted for visualizing power struggles in Wikipedia:
Slashdot article.

"todd450 pointed us to a nifty visualization of Wikipedia and controversial articles in it. The image started with a network of 650,000 articles color coded to indicate activity. The original image is apparently 5' square, but the sample image they have is still pretty neat."

The original blog post was here.

Saturday, May 19, 2007

Ed Chi talking about Augmented Social Cognition at Google

Recently, I gave a talk over at Google about the efforts we're starting here at PARC on Augmented Social Cognition (ASC). The Google video archive of the talk can be found here:


Interestingly enough, someone at AOL Search has already blogged about it here.

Tuesday, May 15, 2007

Long Tail of user participation in Wikipedia

(Ed H. Chi; joint work with Niki Kittur, Bryan Pendleton, and Bongwon Suh)

As we were getting ready for the alt.CHI presentation last week at the CHI conference, I realized that the way we have been looking at the frequency of user edits in Wikipedia was not really getting at the root of the issue. What we really aspire to find out is "what processes are governing the users' participation in Wikipedia?"

In the alt.CHI paper, we discovered that around 2003-2004, administrators in Wikipedia were making around 50% of edits! It definitely seemed like the "power of the few" was at work in Wikipedia. Indeed, admins in Wikipedia have a great deal of power. They set policies, ban destructive users, help resolve disputes, and generally keep order within the system.

Moreover, when we analyzed the data using high-edit users (users with 10,000 edits or more), we got the same result. The algorithm was: (1) Over all Wikipedia edits for all time, find the users with 10,000 or more edits; (2) compute the total number of edits in month "x"; (3) compute the total number of edits made in month "x" by the users identified in step 1; (4) divide the result from step 3 by the number from step 2. Here is the graph:
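The four steps above can be sketched in Python. This is only an illustration, not the code we actually ran: the input format (one `(user, month)` pair per edit) and the function name `high_edit_share_by_month` are assumptions made for the example, and the threshold is treated as inclusive, following the "10,000 edits or more" wording.

```python
from collections import Counter


def high_edit_share_by_month(edits, min_edits=10_000):
    """Share of each month's edits made by lifetime high-edit users.

    `edits` is an iterable of (user, month) pairs, one pair per edit.
    This input format is an assumption made for the sketch.
    """
    edits = list(edits)

    # Step 1: over all time, find users with at least `min_edits` edits.
    lifetime = Counter(user for user, _ in edits)
    high_edit_users = {u for u, n in lifetime.items() if n >= min_edits}

    # Step 2: total number of edits in each month.
    total = Counter(month for _, month in edits)

    # Step 3: edits in each month made by the users from step 1.
    by_high = Counter(m for u, m in edits if u in high_edit_users)

    # Step 4: divide step 3 by step 2, month by month.
    return {m: by_high[m] / total[m] for m in sorted(total)}
```

Note that the set of high-edit users is fixed once, from the whole history, and then reused for every month.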

And when we computed the diff between all 58.5 million revisions of Wikipedia, we found that the number of words changed by admins (as a proportion of total words changed by everyone) followed the same pattern: it rose from about 10% to about 50%, then fell back to near 10%.

We discovered, as outlined in the alt.CHI paper, that users with a low number of edits are becoming a bigger part of the total population. From the above analysis, it seemed that users with a low number of edits were becoming more powerful in Wikipedia.

When I presented these results to the Computing Science Laboratory here at PARC late last year, David Goldberg suggested to me, "Why don't you do the other analysis? Compute how much work the top 1% of users (at any given moment in time) was doing?" The difference between this analysis and the analysis we did was somewhat subtle. The analysis we did was equivalent to understanding the work of the top 1% of users over the entire existence of Wikipedia, instead of the top 1% for that month. The algorithm here would be: (1) First, for a given month, rank all users according to the number of edits they made; (2) From the ranking of users for that month, take the top 1% of those users; (3) For that month, compute the total number of edits made; (4) For that month, compute the total number of edits made by the users found in step 2; (5) Divide the result from step 4 by the result from step 3. Here is the result from that analysis:
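A minimal Python sketch of this per-month analysis follows. It is an illustration under assumptions, not the code behind the graph: the input format (one `(user, month)` pair per edit), the function name `top_fraction_share_by_month`, and the rounding rule (round the top 1% up, and always keep at least one user) are all made up for the example. The share is computed as the top users' edits divided by all edits in that month.

```python
import math
from collections import Counter, defaultdict


def top_fraction_share_by_month(edits, fraction=0.01):
    """Share of each month's edits made by that month's most active users.

    `edits` is an iterable of (user, month) pairs, one pair per edit.
    This input format is an assumption made for the sketch.
    """
    # Count edits per user separately for every month.
    per_month = defaultdict(Counter)
    for user, month in edits:
        per_month[month][user] += 1

    shares = {}
    for month, counts in sorted(per_month.items()):
        # Step 1: rank that month's users by number of edits.
        ranked = sorted(counts.values(), reverse=True)
        # Step 2: take the top fraction (assumed rounding: up, minimum 1).
        k = max(1, math.ceil(fraction * len(ranked)))
        # Steps 3-5: top users' edits divided by all edits that month.
        shares[month] = sum(ranked[:k]) / sum(ranked)
    return shares
```

Unlike the earlier analysis, the set of top users is recomputed for every month, so a user who was very active in 2003 but inactive later does not count toward later months.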

This clearly showed a very different picture. So what's really going on? It was only this past week that I realized we could have summarized the result in a different way. We could instead plot the long tail distribution of user contributions:

In fact, plotted on a log-log plot (also known as a power law plot), here is what it looks like:

This arises partially because of the user turnover rate on Wikipedia:

So what this appears to mean is that there is a rather simple explanation for what's going on here. We have a long tail architecture of participation in Wikipedia. At any given moment in time, a few users are a lot more active than the rest of the population, but there is a long tail of other users who are contributing to the effort.
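One rough way to check for a long tail like this is to rank users by edit count and look at the slope on log-log axes, where a power law shows up as an approximately straight line. The sketch below is a hypothetical illustration rather than the analysis behind the plots: the function name `loglog_slope` is made up, and an ordinary least-squares fit on log-log data is only a crude diagnostic, not a rigorous power-law test.

```python
import math


def loglog_slope(edit_counts):
    """Least-squares slope of log10(edit count) against log10(rank).

    `edit_counts` holds one positive edit count per user; at least two
    users are needed. A slope near -1 would indicate a Zipf-like tail.
    """
    # Rank users from most to least active, ignoring users with no edits.
    ranked = sorted((c for c in edit_counts if c > 0), reverse=True)
    xs = [math.log10(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log10(count) for count in ranked]

    # Ordinary least-squares slope, computed by hand.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```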

Monday, May 14, 2007

Conflict and Power Structure in Wikipedia

We presented two papers at the CHI2007 conference. One paper was on the conflicts and coordination costs of Wikipedia. (Paper here.)

The other paper was an alt.chi paper on the power structure of Wikipedia. (Paper here and here.)

The room was absolutely packed (easily 200+ people there), and they were spilling out into the hallways! The picture above was found on Flickr.

Information Foraging Blog

Peter has started a Blog on Information Foraging.

Thursday, May 10, 2007

New Project forming at PARC

Augmented Social Cognition is a new project that just formed at the Palo Alto Research Center. Its mission is to understand and develop engineering models for systems that enhance a group's ability to remember, think, and reason.

Our intention is to conduct research in two main ways:
First, we are characterizing the various social web spaces, such as Wikipedia, del.icio.us, etc.
Second, we are building new social web applications based on the concepts of balancing interaction costs and participation levels. We are planning on extending information foraging theory to understand some of these economic models of behavior.

Tuesday, May 8, 2007

Augmented Social Cognition

  • Cognition: the ability to remember, think, and reason; the faculty of knowing.
  • Social Cognition: the ability of a group to remember, think, and reason; the construction of knowledge structures by a group.  (not quite the same as in the branch of psychology that studies the cognitive processes involved in social interaction, though included)
  • Augmented Social Cognition: Supported by systems, the enhancement of the ability of a group to remember, think, and reason; the system-supported construction of knowledge structures by a group.