Jessamyn said “learn this word: folksonomy” and I make it a point to always listen to Jessamyn. :) Actually, I’ve been hearing quite a lot about folksonomies lately, between my Theory of Information Retrieval class last semester and the recent discussion on Slashdot. Folksonomies are the taxonomic vocabularies that emerge from sites such as del.icio.us and flickr, where people create their own tags to describe what they are looking at.

Some argue that folksonomies are superior to controlled vocabulary for indexing and retrieval, and in some ways they are. Because they consist of the terms real people actually used to describe an item, folksonomies reflect the language people would use when searching for a concept or a concrete thing. Controlled index terms may not even be words that appear in a document, so the terms in a folksonomy may get closer to the actual meaning of the document, site, photo, etc. Controlled vocabularies also often map synonymous terms to one authorized term, even though those terms may not mean the same thing in different contexts. The other side of the coin is that, with folksonomies, users searching under one term will not retrieve documents that are indexed under other, similar terms. The problems of ambiguity, synonymy, and disambiguation of index terms all go back to the old natural language versus controlled vocabulary debate.

Personally, I’m kind of on the fence about it. I think folksonomies are great and I’m always a fan of bottom-up approaches to everything, but I’ll readily admit that there are certain advantages to controlled vocabularies. In my class last semester, I actually wrote about the idea of using folksonomies to make images searchable without requiring the expensive and time-consuming creation of controlled metadata. While doing the research, I read an interesting article called Towards data-adaptive and user-adaptive image retrieval by peer indexing that described how peer indexing could be used in image retrieval. It’s a really fascinating idea. Especially for images, from which metadata is difficult to extract, peer indexing (or a folksonomy) offers the possibility of improving image retrieval without increasing costs.
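
For the code-minded, here is a minimal Python sketch of the retrieval trade-off described above. Everything in it is made up for illustration: the file names, the tags, the `AUTHORIZED` mapping, and the function names are assumptions, not how del.icio.us, flickr, or any real system works. It indexes the same user tags twice, once as-is (folksonomy style) and once after mapping each tag to a single authorized term (controlled vocabulary style).

```python
from collections import defaultdict

# Hypothetical user-supplied tags, in the spirit of del.icio.us or flickr.
USER_TAGS = {
    "photo1.jpg": {"cat", "pets"},
    "photo2.jpg": {"kitty", "cute"},
    "photo3.jpg": {"nyc", "skyline"},
    "photo4.jpg": {"newyork", "cat"},
}

# Hypothetical controlled vocabulary: variant term -> authorized term.
AUTHORIZED = {
    "cat": "cats",
    "kitty": "cats",
    "nyc": "new york city",
    "newyork": "new york city",
}


def to_authorized(term):
    """Map a term to its authorized form; unknown terms pass through."""
    return AUTHORIZED.get(term, term)


def build_index(item_tags, normalize=None):
    """Build an inverted index: term -> set of items carrying that term."""
    index = defaultdict(set)
    for item, tags in item_tags.items():
        for tag in tags:
            term = normalize(tag) if normalize else tag
            index[term].add(item)
    return index


folk_index = build_index(USER_TAGS)
controlled_index = build_index(USER_TAGS, normalize=to_authorized)


def search_folksonomy(term):
    """Exact-match lookup on the raw tags people actually typed."""
    return sorted(folk_index.get(term, set()))


def search_controlled(term):
    """Lookup after mapping the query to its authorized term."""
    return sorted(controlled_index.get(to_authorized(term), set()))


print(search_folksonomy("kitty"))   # only items tagged exactly "kitty"
print(search_controlled("kitty"))   # every item mapped to "cats"
```

In this toy data, the folksonomy search for "kitty" misses the photos tagged "cat", while the controlled search finds all three. The flip side is that someone has to build and maintain `AUTHORIZED`, and any mapping that lumps two terms together loses whatever distinction the taggers meant by them.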

I think folksonomies are something we librarians should certainly be talking about. Jessamyn has some good links to sites that discuss folksonomies, and I have a few more. Clay Shirky wrote an article about the pros and cons of folksonomies (versus controlled vocabulary) in Many-to-Many. Adam Mathes wrote a paper for the University of Illinois School of Library and Information Science entitled Folksonomies – Cooperative Classification and Communication Through Shared Metadata, which generated a lot of buzz and was the subject of the Slashdot thread. It’s a long paper, but well worth reading for those interested in geeky things like indexing (like me). I would love to see more studies comparing the effectiveness of folksonomies and controlled vocabulary in information retrieval systems. Cool stuff!