Casey Bisson speaks! We all should listen.

As I was working on the book and it started getting longer and longer… and longer, it became clear to me that I was going to have to cut some topics that I’d planned to cover. My book proposal included a few things that one may not define as social software but that I thought fit into the larger dialogue about the Read/Write Web and the evolving view of how Web pages/applications/middleware should be designed for our users. As my word count got longer and longer, some of those things had to be left out. One of those was Web services. I had written a chapter on Web services and did an interview for the chapter with Casey Bisson, who does amazing things with library middleware (see his WPopac and an explanation). And while I was not particularly sad to see my stunningly inadequate description of Web services go by the wayside, I was very sad that people would not have the opportunity to read Casey’s insights into why our systems suck and what Web services could mean for libraries.

So after talking with Casey last Friday night, he e-mailed me and let me know that I could blog the interview. So here it is!

MF: How are our systems broken?

CB: Our library systems are broken in three ways: usability, findability, and remixability. We’ve been talking about usability for some time now; it’s the source of the hugely famous “lipstick on a pig” quote (proper thanks go to NCSU’s Andrew Pace for giving us the appropriate words), and it’s a reflection of how little we’ve used technology to improve the search experience since we created our systems as a near copy of paper-based card catalogs 30 or more years ago.

Part of this is that our patrons have growing expectations of self-service (and pre-established information search behaviors as a result of growing internet use), but our systems have developed in different directions. The challenge is to find ways to invest our online services with the value libraries bring to in-person services. Allowing comments (and pingbacks and trackbacks) in our catalogs allows us to move some of our discussions online (where they can be indexed and found by search engines).

In an online world driven by search engines, findability means making our services findable by patrons who don’t yet know how we can help them. Search engines offer a huge opportunity to meet needs of the growing number of internet users, but libraries are underrepresented there because our systems are largely invisible to their crawlers. Building systems that can easily be linked to means that our users can exchange those links wherever they are online. Students can link to their favorite book in their Facebook profile or in their LiveJournal, and we adults can use those links in our research, our reading groups, or elsewhere. Those links tell search engines and their crawlers what online resources are most valuable, and building systems that can be linked to and indexed makes libraries more findable online.

The third problem we must address is the remixability of our content. The biggest lesson of the past few years of development on the web may be that those who control or provide content don’t necessarily know best how it could or should be used. Every organization has limited resources and so cannot pursue every new idea about how to use or present the content they control. Enabling remixability by offering web services and employing broadly supported standards, like RSS and OpenSearch, opens the door to allow passionate individuals and creative organizations — the same forces that have given us most of what we now recognize as the web — to develop those new interfaces to that data. Amazon, in particular, has been enormously successful with this and now claims over 140,000 registered developers building applications that vary from ecommerce websites like MasterWish.com to personal library catalogs like LibraryThing.com. The technology that enables this is called web services, and they offer the rather revolutionary opportunity to separate the applications that store and maintain an organization’s data from the applications used to display and manipulate it.
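To make that separation concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the record fields, the function names `catalog_to_rss` and `render_text`, and the example.org URLs are hypothetical and not taken from any real ILS. One function plays the part of the application that holds the data and publishes it as plain RSS 2.0; the other is an independent display that knows only the feed format.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog records; a real ILS would supply these.
RECORDS = [
    {"title": "Moby-Dick", "author": "Herman Melville",
     "link": "http://catalog.example.org/record/1"},
    {"title": "Walden", "author": "Henry David Thoreau",
     "link": "http://catalog.example.org/record/2"},
]


def catalog_to_rss(records, feed_title="New titles"):
    """Data side: serialize records as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = "http://catalog.example.org/"
    ET.SubElement(channel, "description").text = "Sample catalog feed"
    for rec in records:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = rec["title"]
        ET.SubElement(item, "link").text = rec["link"]
        ET.SubElement(item, "description").text = "by " + rec["author"]
    return ET.tostring(rss, encoding="unicode")


def render_text(rss_xml):
    """Display side: knows only the feed format, not the catalog."""
    root = ET.fromstring(rss_xml)
    lines = []
    for item in root.iter("item"):
        lines.append("{} ({}) <{}>".format(
            item.findtext("title"),
            item.findtext("description"),
            item.findtext("link")))
    return lines


if __name__ == "__main__":
    for line in render_text(catalog_to_rss(RECORDS)):
        print(line)
```

Because `render_text` depends only on the feed, anyone could swap in a different display (a web page, a widget, a mashup) without touching the code that maintains the records.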

MF: What could web services mean for libraries?

CB: Libraries face the challenge of trying to integrate information from a growing number of applications and sources. Alongside the catalogs we maintain in our ILSs we’ve added licensed databases from a number of providers and many of us keep special collections or digital archives separate in specialized databases. Our reasons for this are clear to us, but difficult to explain to our patrons, who must learn not only how to select which search box is most appropriate for which question, but also how to use the extended features each vendor may (or may not) provide.

“Integration,” as it turns out, means more than simply putting a common header and footer on each page of each application; it means making all of those many applications work in similar ways. How frustrating must it be for the patron who learns that the “book bag” she has been maintaining in the OPAC isn’t connected to the “my research” collection in the database she’s been using, and more frustrating yet to learn that the bookmarks she’s been collecting in her browser all this time no longer work because none of our services allow bookmarking that way.

Web services offer a promising solution to this. By separating the data in our ILSs and databases from the OPACs and other applications we use to display and manipulate it, we create an opportunity to advance the state of each separately. A search and retrieval application could be built or offered by a vendor that allows us to search from all services (our ILS, databases, digital collections, etc.) using one interface and one set of features to learn. This search/retrieval app could include comments/pingbacks/trackbacks, browser bookmarkable permalinks, indexability by search engines, and all the other features we’re coming to expect in our software today, without regard for which database the content was housed in (so long as they’re all web services accessible).
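The single-interface idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Result` type, `make_memory_backend`, `federated_search`, and the example.org permalinks are hypothetical names, and the in-memory dictionaries stand in for real web-service clients to an ILS or a digital archive.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Result:
    source: str      # which backend the hit came from
    title: str
    permalink: str   # stable, bookmarkable URL for the item


# Each backend is wrapped as a function: query string -> list of Result.
SearchFn = Callable[[str], List[Result]]


def make_memory_backend(name: str, titles: Dict[str, str]) -> SearchFn:
    """Stand-in for a web-service client; matches on title substring."""
    def search(query: str) -> List[Result]:
        q = query.lower()
        return [Result(name, title, url)
                for title, url in titles.items() if q in title.lower()]
    return search


def federated_search(query: str, backends: List[SearchFn]) -> List[Result]:
    """Query every backend and return one combined, title-sorted list."""
    hits: List[Result] = []
    for backend in backends:
        hits.extend(backend(query))
    return sorted(hits, key=lambda r: (r.title.lower(), r.source))


ils = make_memory_backend("ils", {
    "Whale Watching": "http://library.example.org/ils/42",
})
archive = make_memory_backend("digital-archive", {
    "Letters of a Whaler": "http://library.example.org/archive/7",
    "Town Maps, 1890": "http://library.example.org/archive/9",
})

if __name__ == "__main__":
    for r in federated_search("whal", [ils, archive]):
        print(r.source, r.title, r.permalink)
```

The patron-facing code sees only `Result` objects, so features such as comments or permalinks could be added once, in one place, regardless of which backend holds the item.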

MF: What are the barriers to improving our library systems?

CB: Our efforts to improve our online services are burdened by the enormous costs of trying to work with systems that weren’t built for remixing. Allowing patrons to comment on our catalog items in the OPAC means waiting for vendors to offer that enhancement, hacking our catalog enough to bolt comments on, or completely replacing our OPACs. And if we do find a way to hack comments into our OPAC, we take on the costs of having to secure it, build an administration interface, and build a mechanism to combat spamming. The problems with building and maintaining a comment system aren’t unique to libraries, but the complexity of our existing systems makes it difficult to apply a generalized solution. So we end up investing development time on problems that are common to all online services, instead of problems that are unique to libraries. Worse, the work we invest with one system often can’t be easily applied to other systems because each has its own hurdles and barriers.

Another issue is the size of the library community and our ability to support programming efforts. Much of the success we’ve seen with online services outside the library community is based on standards that are uncommon within libraries. Though quite a number of the 140,000 registered Amazon developers could offer tips and advice (or even entire applications) to those within the library community, we can do little with them because our standards isolate us from their work. Outside libraries, Amazon’s web services are the de facto standard for exchanging bibliographic information, but within the much smaller community of library programmers, we must face dozens of metadata standards (MARC, MODS, METS, DC, the list goes on and on) and a handful of application interface standards (Z39.50, SRU/SRW, vendor-specific standards, etc.) to accomplish the same tasks.

The OpenSearch standard extends RSS with features necessary for searching, like a means to return suggested alternate searches (as for spelling corrections) or faceted search refinements. In the year since its release, over 300 OpenSearch targets have been announced and now it’s being integrated into Internet Explorer 7. The rapid development of OpenSearch reflects the general interest both inside and outside libraries to solve the difficulties of making information systems work together; the opportunity here is to find more common ground with those outside libraries and solve more problems together.
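As a rough illustration of what "extending RSS" looks like in practice, the Python sketch below reads the paging elements that OpenSearch adds to an ordinary RSS response. The sample feed is invented, the example.org URLs are placeholders, and the namespace URI and element names are those of the OpenSearch 1.1 specification as best understood here; the version current when this interview was given may have used a different namespace, so treat those details as assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace of OpenSearch 1.1 response elements (assumed; earlier
# versions of the spec used a different URI).
OS_NS = "http://a9.com/-/spec/opensearch/1.1/"

# Invented sample response: ordinary RSS plus OpenSearch paging elements.
SAMPLE = """<rss version="2.0"
     xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Catalog search: whales</title>
    <link>http://catalog.example.org/search?q=whales</link>
    <description>Sample results</description>
    <opensearch:totalResults>57</opensearch:totalResults>
    <opensearch:startIndex>1</opensearch:startIndex>
    <opensearch:itemsPerPage>2</opensearch:itemsPerPage>
    <item>
      <title>Moby-Dick</title>
      <link>http://catalog.example.org/record/1</link>
    </item>
    <item>
      <title>Whale Watching</title>
      <link>http://catalog.example.org/record/42</link>
    </item>
  </channel>
</rss>
"""


def parse_opensearch_rss(xml_text):
    """Return paging info and (title, link) pairs from an RSS response."""
    channel = ET.fromstring(xml_text).find("channel")

    def number(name):
        # OpenSearch elements live in their own XML namespace.
        return int(channel.findtext("{%s}%s" % (OS_NS, name)))

    return {
        "total": number("totalResults"),
        "start": number("startIndex"),
        "per_page": number("itemsPerPage"),
        "items": [(i.findtext("title"), i.findtext("link"))
                  for i in channel.findall("item")],
    }


if __name__ == "__main__":
    print(parse_opensearch_rss(SAMPLE))
```

A feed reader that knows nothing about OpenSearch would still display the two items, which is part of the appeal of building on RSS rather than inventing a new format.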

You just can’t help but love this guy!

10 Comments

  1. Dean C. Rowan

    There is a subtext to Mr. Bisson’s remarks, namely, if we had it all to do over again, we wouldn’t. Our standards turn out to have been wrong, our systems too costly and unwieldy–not to mention broken–and if we haven’t affirmatively tried to frustrate our users’ “expectations,” we have at least failed to anticipate them. This is an unfortunately ahistorical view of matters. By now, folks have read the NYT obituary of Henriette Avram, who “helped transform the gentle art of librarianship into the sleek new field of information science” in the pre-‘net mid-‘60s. Perhaps she is the culprit, but how might she have known better? By such lights, our own eagerness merely to tailor our services to increasingly entrenched expectations is passé. Why aren’t we interested instead in divining user needs in advance of their emergence? Sounds like science fiction or occult science, I suppose, but not more vaguely fantastic than finding “ways to invest our online services with the value libraries bring to in-person services.”

    Libraries have always been rightly leery about marketing and solicitations of sales. We have largely relied for better or worse on word of mouth…and now word of mouth has been technologically appropriated. So “the success we’ve seen with online services outside the library community” becomes the touchstone for success within the library community. There are times when this baffles me: for example, I find Amazon’s search engine opaque. Long ago, searching for available books about Paul de Man, I was swamped with Amazon product about Leonardo DiCaprio. The experience permanently colored my expectations of Amazon, I admit.

    Mr. Bisson is surely not the first to recognize the need to make variegated systems cooperate. This goes as well for web services as for ILSs, but it also fails to confront an almost brute fact: the more integrated and simplified the resource, the more abstract the result. Abstract results are good so long as we train our expectations to appreciate them. Our “systems,” the ones that “suck,” also happen to include online services invested with personal values, if by “systems” we mean an assemblage of things connected to form a complex unity, per the OED, more or less. Broadly construed in the spirit of convergence, the telephone is an online service, for instance. It works wonders when a library user deploys it to connect, not only to find out what the library holds, but also how the heck to render fruitful those integrated, shared interface, web based services that promise, but don’t uniformly deliver, ease of use and good results.

  2. It’s difficult to speak simply and directly on critical matters without also appearing to lay blame or disregard antecedents.

    Let me correct that: If we had the chance to do it over, I’d say let’s do it again. Let’s make all the progress (and all the mistakes), and re-learn all of the lessons that today’s online services now take for granted.

    But now that we’re here, we have to answer where we go next.

    My argument speaks of some new lessons that now come from outside libraries. Yes, some of those lessons are vaguely fantastic, but they’re what keep me busy. They keep a lot of us busy.

    Aside: I agree that Amazon’s search tools are rotten. Their success is elsewhere: in making recommendations and in making their content linkable/indexable. Often, when I want to find a book I just Google it. Amazon’s catalog page for the work is usually in the first page of results.

  3. Having spent some time this afternoon talking to Casey, I have to say I agree with the title of this post.

    Casey is dead right about what Amazon does right. Early on, however, Amazon was much praised for their book search. I’m not sure if they got worse or our expectations merely improved. I suspect it was the latter. I remember people saying “you just type something in and it finds it, even if you didn’t get it quite right.” Back when Amazon started, that was a bit of a novelty. Since Google, our expectations have jumped again.

    On the indexing, another point comes easily to mind. Casey notes that when you Google a book, the first result is generally the Amazon page. But look at the other pages and you’ll see most of THOSE will be Amazon Associates. Sometimes this is good; sometimes bad (all those AWS parasites, with none of their own content). Either way it demonstrates the enormous power of Amazon Web Services.

    Step back a second and notice how weird this is. Somehow, the largest, richest sources of bibliographic data rank poorly or are completely missing from the web. The Library of Congress doesn’t come up, nor Harvard, nor Yale, nor OCLC. You don’t even get the publishers’ page, the NYT review or anything of the sort. Amazon is completely and utterly dominant.

    Casey hit on this well, but here’s my re-statement of why:

    (1) Technology: To a web application developer, Z39.50 and MARC records are a huge pain. There are no “easy” guides to Z39.50. Getting MARC records out of their (multiple) strange character sets is a challenge and parsing MARC records a chore. And once you parse it, you need a librarian’s help to understand it. (I feel like I’ve been getting a sort of MLS-at-gunpoint developing LibraryThing.)

    By contrast, Amazon’s API takes simple URLs and delivers easy-to-understand XML (e.g., … ), in UTF-8. Experienced programmers know how to work AWS in minutes. Helper tools exist in every web programming language. There are places all over the web to talk about it. Thousands of web apps run on it. The fact that AWS can be used to generate Amazon Associate revenue is part of the story, but not the decisive one. There are lots of non-commercial uses of AWS too. LibraryThing is one of 17 book-cataloging sites (albeit by far the most successful one). All the others rely exclusively on AWS data.

    (2) Openness: Why is it that librarians, which are mostly government or non-profit operations, whose business is based on giving information away, make it hard to get their data, but Amazon, a private corporation, makes it easy? Libraries take great pains to get books into people’s hands–but not data. The only people who use Z39.50 connections are other libraries and scholars with expensive software like EndNote.

    Two obvious solutions:

    1. Permanent, simple links. The immediate and painful execution of anyone who develops a session-based OPAC.

    2. Quick APIs that deliver simple data. We fund the LC. Why can’t I send them an ISBN and get a title and author back? My local library has a big expensive OPAC solution. Why can’t I sent it an ISBN and get whether they have it? Why can’t I put the books I’ve taken out up on my blog? Why don’t my book return dates go right into my calendar program? Compared to everything else a library does, this stuff is trivial. The problem is attitudinal, not technical.

  4. Dean C. Rowan

    All I’m trying to contribute to this discussion is a degree of perspective as to the value of libraries (the baby) vis-à-vis their legacy of idiosyncratic, Byzantine protocols (the bathwater). It isn’t necessarily all that weird that “the largest, richest sources of bibliographic data rank poorly or are completely missing from the web.” Until certain large-scale library collection scanning projects cropped up, few cared much about piles of bibliographic data.

    Mr. Spalding has lodged a few rhetorical questions. For instance, “Why is it that librarians, which are mostly government or non-profit operations, whose business is based on giving information away, make it hard to get their data, but Amazon, a private corporation, makes it easy?” Short answer: investment capital. But that answer buys the premise that libraries make it hard to get their data, which is a succinct but misleading way of characterizing the situation. Sometimes data are hard to get. Take the statistical database purveyed by LexisNexis Academic Service. Even for those libraries who can afford the service, it’s simply harder to locate meaningful tabular arrangements of numbers than to find a large type edition of the latest Sue Grafton, and no amount of XML is likely to remedy this situation.

    He asks, “My local library has a big expensive OPAC solution. Why can’t I sent it an ISBN and get whether they have it?” You probably can, of course, but it might take some chatting with staff and fiddling around. Some libraries assume, rightly or wrongly, that most of their users just don’t care about ISBNs and that they might be confused by the option. There’s a degree of paternalism at work here, but it’s the same rationale for prescribing output of simple data.

    One can’t search LibraryThing by ISBN, and the FAQ fairly explains why. The explanation helps with the issues broached in these postings, too. The world of bibliographically expressed information—let alone tabular statistical data, feeds of news and opinion, or calendar content—is complex because, despite ISBN, books do not rigorously obey standards. “At some point, however, things get debatable,” writes Mr. Spalding in the FAQ. Precisely. A fitting retort to “Amazon is completely and utterly dominant,” or “The problem is attitudinal, not technical.” The debate results in compromise, cutting corners, prioritizing, and so forth, approaches libraries routinely pursue.

  5. Meredith and Casey: I find this stuff very thought-provoking, and the provoked thoughts are very important things to think. Granted, I’m a Gen X librarian intoxicated with technology, but I agree, we should be asking ourselves and each other: are our systems leading the way for patrons, or are we playing catch-up?

    As Dean says above, books (and other media) don’t obey standards. It seems to me, then, that we should be using any and all standards we can get our hands on to order information and make it easy to find. We shouldn’t abandon traditional information ordering standards, but why can’t we combine them with tagging and folksonomies? We shouldn’t drop everything to ape Amazon, but can’t we take the best parts of Amazon (and Google and Flickr and so on) and mash them up with traditional systems?

    Anyway, thanks for giving us all something to think about.
