Who Does What: McClelland Professor of MIS Sudha Ram Dissects Collaboration Patterns on Wikipedia
“The more scientists collaborate, the more they make new discoveries,” says McClelland Professor of MIS Sudha Ram. It’s one of the central ideas behind the $50 million iPlant Collaborative, which aims to unite the international scientific community around solving plant biology’s “grand challenge” questions.
Ram’s role as a faculty advisor is to develop a cyberinfrastructure to facilitate collaboration. “We initially suggested wikis for this, but we faced a lot of resistance,” she says. Scientists expressed concerns ranging from lack of experience using the wikis to lack of incentive. “We wondered how we could make people collaborate,” she says. “So we looked at the English version of Wikipedia. There’s more than three million entries, and thousands of people contribute voluntarily on a daily basis.”
The public, she says, has never stopped criticizing the quality of Wikipedia articles, and critics never have trouble finding low-quality articles. But in recent years, Wikipedia has moved to monitor quality more closely by flagging low-quality entries, and some research indicates that the quality of entries is close to that found in conventional encyclopedias.
“Most of the existing research on Wikipedia is at the aggregate level, looking at total number of edits for an article, for example, or how many unique contributors participated in its creation,” Ram says. “What was missing was an explanation for why some articles are of high quality and others are not. We investigated the relationship between collaboration and data quality.”
Wikipedia has an internal quality rating system for entries, with featured articles at the top, followed by A-, B-, and C-level entries. Ram and co-author Jun Liu randomly selected 400 articles at each quality level and applied a data provenance model they developed in an earlier paper.
“We used data mining techniques and identified various patterns of collaboration based on the provenance or, more specifically, who does what to Wikipedia articles,” Ram says. “These collaboration patterns either help increase quality or are detrimental to data quality.”
Ram and Liu identified seven specific roles that Wikipedia contributors play. Starters, for example, create sentences but seldom engage in other actions; content justifiers create sentences and justify them with resources and links; copy editors contribute primarily through modifying existing sentences. Some users, the all-round contributors, perform many different functions.
“We then clustered the articles based on these roles and examined the collaboration patterns within each cluster to see what kind of quality resulted,” Ram says. “We found that all-round contributors dominated the best quality entries. In the entries with the lowest quality, starters and casual contributors dominated.”
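To make the idea of "who does what" concrete, here is a minimal Python sketch of how contributors might be assigned roles from counts of their actions, and how an article's dominant role could then be read off. It is an illustration only, not Ram and Liu's provenance model or their data mining procedure: the action labels (`create_sentence`, `add_link`, `modify_sentence`), the thresholds, and the rule used for casual contributors are all assumptions, and only the roles named in this article are covered.

```python
from collections import Counter

# Hypothetical action labels and cut-offs, chosen only for illustration.
# The article names the roles but does not give the rules behind them.
MIN_ACTIONS = 3  # assumed: fewer actions than this counts as "casual"


def assign_role(actions: Counter) -> str:
    """Map one contributor's action counts to a role name."""
    total = sum(actions.values())
    if total < MIN_ACTIONS:
        return "casual contributor"

    create = actions["create_sentence"] / total
    justify = actions["add_link"] / total
    modify = actions["modify_sentence"] / total

    if create >= 0.8:
        # Creates sentences but seldom does anything else.
        return "starter"
    if modify >= 0.6:
        # Contributes mainly by modifying existing sentences.
        return "copy editor"
    if create > 0 and justify > 0 and create + justify >= 0.8:
        # Creates sentences and backs them with links or resources.
        return "content justifier"
    # A mix of many kinds of action.
    return "all-round contributor"


def dominant_role(contributors: list[Counter]) -> str:
    """Return the most common role among an article's contributors."""
    roles = Counter(assign_role(c) for c in contributors)
    return roles.most_common(1)[0][0]


if __name__ == "__main__":
    article = [
        Counter(create_sentence=4, add_link=3, modify_sentence=3),
        Counter(create_sentence=5, add_link=4, modify_sentence=6),
        Counter(create_sentence=9, modify_sentence=1),
    ]
    print(dominant_role(article))
```

In this toy setup, an article whose contributors mostly spread their effort across creating, justifying, and modifying would be labeled as dominated by all-round contributors, which is the pattern Ram describes for the best-quality entries.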
To generate the best-quality entries, she says, people in many different roles must collaborate. “If we want scientists to be collaborative, we need to assign them to these roles and motivate them to police themselves and justify their contributions.”
Ram and Liu suggest that the results of this study should spark the design of software tools that can help improve quality. “A software tool could prompt contributors to justify their insertions by adding links,” she says, “and down the line, other software tools could encourage specific role setting and collaboration patterns to improve overall quality.”
Ram and Liu’s work was recognized with a Best Paper Award at the Workshop on Information Technology and Systems, held in conjunction with ICIS 2009.