Sins of omission? An exploratory evaluation of Wikipedia’s topical coverage

Alex Halavais & Derek Lackaff

Looking at authority/authoritativeness of Wikipedia articles. Accuracy is obviously an issue, but authority is more than that. It’s also about breadth, timeliness, etc. Encyclopedias are thought of as being authoritative, but what about the Wikipedia?

Halavais and Lackaff took three encyclopedias and compared entries in the encyclopedia with entries in the Wikipedia to measure breadth of coverage.

  • Lerner & Trigg (eds.) Encyclopedia of Physics
  • Preminger and Brogan (eds.) New Princeton Encyclopedia of Poetry and Poetics
  • Strazny, P. (ed.) Encyclopedia of Linguistics

Each encyclopedia has its own way of organizing and naming topics, so it’s difficult to find an exact match of topics between encyclopedias.

Between 18% (poetry) and 37% (physics) of articles found in the print encyclopedias were not found in the Wikipedia, so it still has a majority of them. Wikipedia covers the physical and natural sciences much better than the humanities. However, the physics encyclopedia covered more general topics while the poetry encyclopedia covered very specific topics (stanzas, iambic pentameter, etc.). These different approaches to covering topics affect the results as well (making their numbers kind of useless), but it is true that the sciences are covered much more than the humanities. Many of the topics in the poetry encyclopedia are covered in the Wikipedia, but as part of more general articles rather than in articles of their own. The sub-topics can easily be found because Wikipedia is searchable.
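To show why the matching rule matters so much, here is a rough sketch of my own (not the authors' method). The topic list and article texts are toy data I invented for illustration, and the function names are hypothetical. It compares the share of "missing" topics when a topic must match an article title against the share when a mention anywhere in an article counts.

```python
# Toy data, invented for illustration only; none of it comes from the study.
print_topics = ["iambic pentameter", "stanza", "sonnet", "quantum mechanics"]

# Hypothetical Wikipedia articles: title -> body text.
wiki_articles = {
    "Sonnet": "A sonnet is a poetic form ...",
    "Poetry": "Poems are often divided into stanzas; a stanza groups lines ...",
    "Metre (poetry)": "Iambic pentameter is a common metre in English verse ...",
    "Quantum mechanics": "Quantum mechanics is a fundamental theory ...",
}


def missing_by_title(topics, articles):
    """Topics that have no article with exactly the same title."""
    titles = {title.lower() for title in articles}
    return [t for t in topics if t.lower() not in titles]


def missing_anywhere(topics, articles):
    """Topics that appear in neither the title nor the text of any article."""
    texts = [(title + " " + body).lower() for title, body in articles.items()]
    return [t for t in topics if not any(t.lower() in text for text in texts)]


def missing_share(missing, topics):
    """Fraction of the print encyclopedia's topics counted as missing."""
    return len(missing) / len(topics)


strict = missing_by_title(print_topics, wiki_articles)
loose = missing_anywhere(print_topics, wiki_articles)
print("missing by title:", strict, missing_share(strict, print_topics))
print("missing anywhere:", loose, missing_share(loose, print_topics))
```

With this toy data the two specific poetry topics count as missing under the title rule but as covered once mentions inside general articles count, which is the granularity problem described above.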

Why is this useful? According to Halavais, the study identifies where Wikipedia is weak and needs to be better developed. But does it really? I don’t think these things can really be compared, given how the models for organizing and presenting topics differ.

Wikipedia presents a new model of authority. What is it? They never explain that. I suppose they mean the “peer review”/collective intelligence idea, but they never really discuss this.

I frankly don’t think this study of theirs looked at authority at all, but it did make the important point that it’s difficult to compare coverage. Two encyclopedias can cover the same things, but one may have general articles that each cover more topics, while the other has many more specific articles that each cover a very specific topic briefly.

The really important question to me is: How do you define authority?

Halavais was talking about how librarians see the Wikipedia in black or white, as either good or bad. I disagree and I think it is really an unfair characterization (yeah, all librarians think exactly the same way, right?). I think there are three issues that librarians are dealing with. 1) is it authoritative (and what does that mean)? 2) should it be used by students in their research? and 3) how is it being used? I think the answer to #1 is “in some cases” and the answer to #2 is “it depends.” It’s one thing to read the Wikipedia to get background info on a subject with the knowledge that the information may not be 100% accurate. It’s another thing to take a Wikipedia article completely “at its word” and use the information in a paper without looking for corroborating evidence. Perhaps part of the measurement of authority should consider what the article is being used for?

I get seriously annoyed with these negative characterizations of librarians as being completely against the Wikipedia. How about we’re critical? We’re skeptical. Which is what everyone should be. Grrr! More stereotypes. And Alex Halavais used to work at Buffalo’s School of Informatics!

Comparing Wikipedia and Britannica: The story behind Nature’s encyclopaedia story

by Jim Giles

Jim did the Nature study comparing the Wikipedia and Britannica. They used 50 pairs of articles — one from Wikipedia, one from Britannica.

They asked experts to evaluate two articles on one topic in a “blind study”, looking for inaccuracies and misleading statements. They were looking at the articles from the POV of the reader to see if the information would mislead him or her (which is a bit different from looking for factual inaccuracies).

Results: 3.9 errors/article in Wikipedia. 2.9 errors/article in Britannica. Very few major errors in either source. More detailed data analysis was not performed — it was a journalistic article, not a scholarly study.

In March 2006, Britannica wrote a rebuttal (PDF) saying that almost everything in Nature’s investigation was wrong and misleading. They also demanded a retraction.

Criticisms of the article included:

  • referee reports were not given to other sources – to protect the referees’ anonymity
  • some of the content came from the Britannica Student Encyclopedias – when doing a search, material from both sources comes up, and, furthermore, is it OK to have inaccuracies in a student edition?
  • Britannica material was edited – in a few cases, Nature combined two articles (like Ethanol and uses of ethanol) to make the article comparable to the Wikipedia article. They didn’t change content/facts.
  • reviewers made mistakes – Nature assumed that the experts were right, but it was a comparative exercise and the same expert evaluated both articles so they were equally likely to make a mistake with either source.

Thoughts on evaluating Wikipedia.

  • Is Britannica the best benchmark?
  • How should omissions be dealt with? Is omission as important as inaccuracy? It’s certainly an issue.
  • How important is timeliness?
  • Can style/clarity be evaluated?

What I got out of this is that the idea of authority is a really murky one and should take much more than accuracy into account.