Friday, August 7, 2009

PART 2: More details of changing editor resistance in Wikipedia

In the last week, we have received interesting press coverage in New Scientist (as well as Fast Company, Business Insider, and syndicated elsewhere) on our team's work on the Wikipedia growth rate and how it has plateaued, changing from an exponential growth model to one that looks more linear. Although this was not necessarily a new finding, it was really a teaser for some other observations from the Wikipedia data that are about to be published at the WikiSym 2009 conference in October.

The figure below shows how the slowdown in the growth of Wikipedia activity differs across editor classes. For each month, we first partition the editors into different classes based on their monthly editing frequency. We then compare the total edit activity of the different editor classes over time.

Monthly edits by user class (in thousands).

[Consistent with the power law, we classified users on an exponential scale, defining the classes of editors using powers of 10, e.g. 10^0, 10^1, 10^2. This resulted in five classes of users for each month: editors contributing 1 edit (i.e., 10^0), 2 to 9 edits (2-9 class), 10 to 99 edits (10-99 class), 100 to 999 edits (100-999 class), and 1000 or more edits (1000+ class).] Note that the classification of the editors was recalculated for each month.
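The classification described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the study: the function names and the assumed input format (one `(month, editor)` pair per edit) are hypothetical.

```python
from collections import Counter, defaultdict


def editor_class(monthly_edits):
    """Map an editor's edit count for one month to a power-of-10 class label."""
    if monthly_edits < 1:
        raise ValueError("an editor must have at least one edit in the month")
    if monthly_edits == 1:
        return "1"
    if monthly_edits < 10:
        return "2-9"
    if monthly_edits < 100:
        return "10-99"
    if monthly_edits < 1000:
        return "100-999"
    return "1000+"


def edits_by_class(edit_log):
    """Total edits per class per month.

    edit_log is assumed to be an iterable of (month, editor) pairs,
    one pair per edit. Classes are recomputed for every month, so the
    same editor can fall into different classes in different months.
    """
    per_editor = Counter(edit_log)  # (month, editor) -> number of edits
    totals = defaultdict(lambda: defaultdict(int))
    for (month, editor), count in per_editor.items():
        totals[month][editor_class(count)] += count
    return {month: dict(classes) for month, classes in totals.items()}
```

Because the counts are keyed by month as well as by editor, an editor who makes 12 edits in one month and 3 in the next contributes to the 10-99 class first and to the 2-9 class afterwards.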

Since the beginning of 2007, monthly edits have decreased slightly for the four lower-frequency classes. In contrast, only the highest-frequency class of editors (1000+ edits, dark blue line) shows an increase in monthly edits.

Another way to look at this data is to analyze the relative amount of activity for each editor class by transforming the data into percentages of the total edits. The figure below complements the figure above by showing the percentage of the total edit volume that each class contributes.

Monthly percentage of edits by each user class.
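The transformation into percentages is a simple normalization of each month's per-class totals. A minimal sketch, assuming a dictionary of edit counts per class for one month (the function name and input format are hypothetical, not taken from the study):

```python
def class_shares(edits_per_class):
    """Convert one month's edit counts per class into percentages of the total."""
    total = sum(edits_per_class.values())
    if total == 0:
        # No edits at all in this month: every class contributes 0 percent.
        return {cls: 0.0 for cls in edits_per_class}
    return {cls: 100.0 * count / total for cls, count in edits_per_class.items()}
```

For example, a month with made-up counts of 10, 30, and 60 edits in three classes yields shares of 10, 30, and 60 percent.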

The two highest-frequency classes of editors account for more than half of the total monthly edits (56% from 01/2005 to 08/2008). Furthermore, since 2005 the proportion of contributions by the highest-frequency editor class has increased slightly. In fact, the editors in the 1000+ class have kept producing at an increasing rate over the past four years (their average monthly edits per editor for the years 2005 to 2008 were 1740, 1859, 1869, and 2095, respectively).

We now focus on specific evidence about what might have contributed to this slowdown. A revert is the action of deleting a prior edit. The following figure shows the percentage of edits that were reverted (reverted edits) each month for each editor class. Note that edits related to vandalism and edits performed by robots are excluded.

Monthly ratio of reverted edits by editor class
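The ratio plotted here is the number of reverted edits divided by the number of edits, computed per editor class and for all classes combined. The sketch below shows that calculation under stated assumptions: the input is a hypothetical list of `(editor_class, was_reverted)` pairs for one month, with vandalism-related and robot edits already filtered out. How reverts are actually detected is not covered here.

```python
from collections import Counter


def reverted_percentage(edits):
    """Percentage of reverted edits per editor class for one month.

    edits is assumed to be an iterable of (editor_class, was_reverted)
    pairs, one per edit. The extra "all" key holds the percentage over
    every class combined (the dashed line in the figure).
    """
    total = Counter()
    reverted = Counter()
    for cls, was_reverted in edits:
        total[cls] += 1
        if was_reverted:
            reverted[cls] += 1
    result = {cls: 100.0 * reverted[cls] / total[cls] for cls in total}
    all_edits = sum(total.values())
    if all_edits:
        result["all"] = 100.0 * sum(reverted.values()) / all_edits
    return result
```

Note that the combined figure is weighted by edit volume, so it is dominated by the classes that produce the most edits rather than being a plain average of the per-class percentages.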

This illustrates two indicators of a growing resistance from the Wikipedia community to new content.

First, the total percentage of monthly reverted edits across all classes of editors (dashed black line) has increased steadily over the years: 2.9, 4.2, 4.9, and 5.8 percent of all edits for 2005 through 2008, respectively.

Second, and more interestingly, low-frequency or occasional editors experience visibly greater resistance than high-frequency editors [see the top two reddish lines, as compared to the other lines]. The disparity in the treatment of new edits from editors of different classes has been widening steadily over the years, at the expense of low-frequency editors.

We consider this evidence of growing resistance from the Wikipedia community to new content, especially when the edits come from occasional editors.