When librarians are obstacles

Heading into the Open Ed Conference and especially the Mozilla Drumbeat Festival, I expected to be one of only a handful of librarians participating. Librarians haven’t been terribly involved or engaged with the open education movement, but our values and missions align so well that I expected to be welcomed by the professors and the edupunks as a peer and fellow traveller. Well, I got the first part right – I met only a couple of librarians all week – but the second, not so much. Imagine my surprise when the other two speakers in the session on libraries and the future of OER spent much of their time criticizing the ways in which librarians have engaged with open education, and lamenting the possibility of librarians being anything other than a liability.

Julià Minguillón, a computer science professor who spoke about digital preservation issues, described attempting to deposit an equation into his university library’s OER repository, only to be told that because his equation did not have a title, it could not be included in the collection. He then went on to criticize librarians’ obsession with the “useless” metadata of “author, title, date.” He argued that if we put librarians in charge of OER repositories (exactly the thing I argued for in my paper), we will sacrifice broad, immediate access in favor of long-term preservation and proper metadata schemas.

R. John Robertson gave a paper about the role libraries can play in supporting OER initiatives, but a significant portion of his presentation was given over to his concerns about librarian participation in this work. His experience with librarians is that they are so risk-averse that the merest hint of a copyright issue is likely to send them running for the hills. Like Minguillón, he had anecdotes to back up his worries about librarians as obstacles in the field of open education.

In a word: Blergh! How did this happen? Why, despite biannual New York Times articles about how modern and hip librarians have become, are we still perceived on our own campuses as fearful impediments to progress?

Okay, I know why. Some librarians are fearful impediments to progress. Some librarians allow perfect metadata to be the enemy of good access. Some libraries, as institutions, do not foster innovation and experimentation, and are deeply resistant to change. It’s so disappointing.

It probably says something about the job I’ve had for the last year and a half that I see this primarily as a failure of management. On the plane to Barcelona I read a column by Meredith Farkas in American Libraries called “Nurturing Innovation: Tips for Managers and Administrators.” She offers a number of excellent suggestions for ways to adjust the institutional culture at libraries to support and embrace innovation: Encourage staff to learn and play, give staff time to experiment with potential new initiatives, keep an open mind, develop a risk-tolerant culture. These suggestions kept coming back to me at Open Ed as I struggled to defend librarians and libraries against accusations of stodginess. I wanted to hand the article over to the people who complained about their uptight, change-resistant libraries and say, “It doesn’t have to be this way. Go talk to your Dean. Make it better.” I also, in that way that sometimes happens, added one more suggestion to the list, thinking it was from Farkas. It actually came from somewhere else, as part of a theme that developed at Open Ed: Library administrators must make some room in their budgets for failure.

Innovation and progress can’t happen without failure. It’s how we learn, as individuals and as institutions and as a species. Yes, library budgets are tight these days. Tighter than we ever thought they could get. With money so tight, and cuts so deep, it’s easy to think that now is not the time to take risks, but of course, now is exactly the time to take risks. How else will we be prepared to address the challenges that await us in next year’s budget cycle, and the one after that, and the one 15 years from now?

To use one relevant example: The current commercial scholarly publishing apparatus is choking us. We know this. Knowing this, we have two choices: We can invest in activities that could ease the financial pressure – open repositories, deposit mandates, awareness campaigns – or we can choke. In this case, many libraries are experimenting, and sometimes those experiments even fail. As Farkas points out, when our experiments fail we still learn something valuable from them, something that can set us on a path to succeed the next time.

It’s not enough simply to encourage our staff to experiment. We need to give them money to play with, to set up a repository or buy a license to a promising tool or hire an expert to train staff in something new. It doesn’t have to be a lot of money, but it does have to be relatively free from strings. And then, we need to make sure our experimenting staff share what they’ve learned with colleagues in other libraries, the successes and the failures. It’s how we will all evolve.

So to wrap up this meandering post with a tidy bow: Higher education is changing, and our campuses are full of people (many of whom were at Open Ed and Drumbeat) experimenting with new models, tools, and philosophies related to teaching, learning, and research. The primary responsibility of academic libraries is to support teaching, learning, and research, and so those experiments and the people conducting them are highly relevant to us. We must make sure that we remain relevant to them. If they see us as an obstacle it is only a matter of time before we become obsolete. We want those experimenters and innovators to view the library as both a resource for and a partner in their work, and we can do that by funding innovation among our own staff, expanding our definition of the library’s role on campus, and embracing the possibility of failure. If we neglect to do these things, we don’t just risk becoming obsolete, we guarantee it.

DLF Forum: Library of Congress and Flickr

Women at work on bomber, from the Library of Congress

Phil Michel and Michelle Springer from the Library of Congress presented on the LOC’s Flickr Pilot Project. The Library of Congress was the first cultural heritage institution to partner with Flickr to share photographic content and invite user participation and comments. With 15 institutions participating in what is now the Flickr Commons, it is an idea that caught on quickly and has been quite successful. I’ve been very excited about this project since its launch, and so I was motivated to clean up and blog my rather extensive notes on the session. For more information about the project, check out this LOC webcast.


The motivation for the project came from a desire to explore including user generated content (UGC) in LOC descriptive processes. Photos seemed like a good place to start because there is no language barrier, there was already a big collection of photos online, and because they’re fun.

Initial investigations showed that bringing tagging to LOC collections would have had high technical barriers if handled in-house. There was a desire to keep initial expenditures low, and so they started looking around for existing web 2.0 sites that were doing the things they wanted to do.

The project had three goals: Increase awareness of LOC collections; gain better understanding of social tagging; gain experience participating in the kinds of web communities that are interested in LOC materials.

There were a number of principles that guided the development of the pilot project: The involved content must already be available on the LOC site; the agreement with the third party site must be non-exclusive; access to the content must be free; there must be an option to control or exclude advertising on the account; LOC should be clearly identified as the source of the images; the site must allow LOC to remove and moderate user-supplied content to prevent inappropriate tags and comments; UGC must be clearly distinguishable from Library generated content; it must be possible to accurately convey copyright status.

Flickr had a great deal of appeal as a partner: It recently announced the upload of its 3 billionth picture, and has an active user community of over 23 million members. It had a pre-existing, vibrant community built around photography and a conversation that included notes, comments, and tags. From a technical standpoint, it also had APIs that allow for batch uploads and batch downloads of UGC, and a history of dealing with alternative copyright status (Creative Commons licenses).

Getting it off the ground

Flickr programmed the “No Known Restrictions” option especially for the LOC partnership, and it is now used by most of the institutions participating in The Commons. Every institution has its own page in its own webspace that explains exactly what they mean by the statement.

Some time and effort was required on the part of the General Counsel’s office to work with Flickr to create a modified Terms and Conditions agreement that could deal appropriately with the Library’s status as a government institution.

Technical process: Someone (I missed who – Flickr, LOC, or both) built a Java(?) app called Flickrj to push and pull content between the LOC databases and Flickr’s. They chose selected MARC fields whose content would go to Flickr along with the photos: The MARC 856 field was used as a unique machine tag value, and so was the Dublin Core identifier field.
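To make the machine tag idea a little more concrete, here is a minimal sketch of how a record's identifiers could be turned into Flickr-style machine tags, which follow the pattern namespace:predicate=value. This is not the Flickrj app or anything LOC showed. The function names, the `url_856` and `dc_identifier` record keys, and the `loc:url` tag name are all my own assumptions for illustration.

```python
import re

# Flickr machine tags follow the pattern namespace:predicate=value.
MACHINE_TAG = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def make_machine_tag(namespace, predicate, value):
    """Build a machine tag string and check that it has the expected shape."""
    tag = f"{namespace}:{predicate}={value}"
    if not MACHINE_TAG.match(tag):
        raise ValueError(f"not a valid machine tag: {tag!r}")
    return tag


def parse_machine_tag(tag):
    """Return (namespace, predicate, value), or None for an ordinary tag."""
    match = MACHINE_TAG.match(tag)
    return match.groups() if match else None


def tags_for_record(record):
    """Turn a catalog record into machine tags.

    `record` is a hypothetical dict; the keys below are assumptions,
    not LOC's actual field mapping.
    """
    tags = []
    if record.get("url_856"):
        # Hypothetical tag name for the value of the MARC 856 field.
        tags.append(make_machine_tag("loc", "url", record["url_856"]))
    if record.get("dc_identifier"):
        tags.append(make_machine_tag("dc", "identifier", record["dc_identifier"]))
    return tags
```

Because the identifier travels with the photo as a tag, anything pulled back from Flickr (comments, notes, user tags) can be matched to the right catalog record by parsing that tag.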

All together, getting the project off the ground took about 100 hours of work for technical staff.

The photos all went to Flickr in their rough state. LOC folks didn’t do any cropping, color fixing, or clean up of dust or scratches. Part of the curiosity was to see how the public would respond to the images in this rough form.

Startup investments: The Library of Congress purchased a $24.95 Flickr Pro account, which offers members unlimited uploads and stats about traffic to photos. The Pro account is an annual expense that will go on as long as the project does. All Commons member institutions have Pro accounts. There was no full-time staff assigned to the project, but it required General Counsel involvement, some big conference calls, and eight staffers who contributed about 20 hours each to collaborate with Flickr on development.

Launching and maintaining

This was the first project that LOC ever announced without a press release. There were announcements on the Library of Congress and Flickr blogs, and the organizers considered it a soft launch. Though it involved no mainstream press, there was an enormous initial response, totally out of proportion to what was expected. The result was some near-immediate revisions of plans for maintenance and direct staff involvement; the scale was too big to be as involved as they’d planned.

A number of LOC staff share responsibility for monitoring all new comments, notes, and tags. They use the Flickrj app to pull all the new UGC at once. It takes about 2 hours a week to moderate comments/notes/tags for spam and inappropriateness. Sometimes users call attention to these things before staff find them. There are very few problems with inappropriate tags or comments; the Flickr community is quite well-behaved. LOC staff don’t correct spelling or syntax or remove seemingly useless tags. Staff do accept group invitations from public group administrators, but they only join public, nudity/vulgarity-free groups, so monitoring the group invites also takes time.

Updates to the images themselves take 15-20 hours a week. These involve corrections to descriptive information, fixes in the LOC catalog, and occasionally image fixes. Sometimes the orientation of images is wrong. First they fix it on the LOC server, then they generate new derivatives, and then send corrected versions to Flickr. In general, they limit edits to very basic changes and real errors. Sometimes they’ll point people from the LOC catalog back to Flickr when large amounts of conversation, updating, and information-sharing are taking place for a particular photo.

During the Q&A, someone asked how time pressures have changed over the course of the project. Turns out, they haven’t exactly gone down, though they have shifted. When the project first launched, staff was checking the new comments and tags every 24 hours, and it was totally overwhelming. Efficiencies have come from the technical solutions, like the ability to batch download all new comments, notes, and tags. However, as the number of photos keeps going up, time demands on moderators continue to go up. Part of the time demand comes from the level of participation in the community, which is a steady stream; activity doesn’t stop on the older photos, so the rising total number of images leads to a rising total amount of new user generated content.


One of the main goals of the project was to drive more traffic to the LOC photo collections website, and it worked. People visit the LOC pages for higher resolution images, to get additional information, and to browse related collections. The organizers feel that the pilot has definitely achieved its goal of raising awareness of LOC photo collections.

An unexpected outcome: Major search engines are finding, exposing, and weighting LOC’s Flickr images in search results. Many of the photos rank very high in image searches. It’s an unforeseen way to further expose the content to the world.

Many of the LOC photos are also being embedded in blogs all over the web (including this one). When it happens via the “Blog this” function in Flickr, it’s easy for LOC to track it (and I imagine it’s trackable even when it happens in other ways).

The user involvement has been very interesting as a source of further study. There is a core group of about 20 commenters who provide historical research, fixes, comments, notes, etc. They’ll often support the information with citations, links to NYTimes archives and other external sites and archives. There are also 10 “power taggers” who have applied more than 3,000 tags each. One person was responsible for over 5,000 tags. The people at LOC did some work examining the different types of tags that people apply, and identified nine different categories: LC description based, new descriptive words, new subject words, emotional/aesthetic responses, personal knowledge/research, machine tags, variant forms, foreign language, and miscellaneous.

Users frequently post modern photos in the comments to show what the featured locations look like now. Sometimes people will go to the featured location and reenact the photos. There is quite a bit of playfulness and humor in much of the user involvement. Notes are a useful way to identify people in crowd shots and to transcribe text that appears in the photos. Some people also use notes to make jokes or silly comments, and while some people in the Flickr community have objected to the proliferation of notes, LOC has decided that for now the value of the function outweighs the irritation.

Conclusion: There has been a great response to the pilot and great user participation, and the organizers learned a lot. It stimulated conversation both between users and librarians and among librarians themselves. The project tapped into expertise that resides in communities of interest. It brought up issues related to presentation and engagement that can inform decisions about how materials are presented on the Library’s own web site. While there are some risks associated with jumping into the web 2.0 world, and you have to be willing to cede some control, the benefits and rewards have been terrific.

University of Michigan Library adopts Creative Commons licenses

I am thrilled to report that the University of Michigan Library has adopted Creative Commons licenses for Library-produced content.

From the press release:

The University of Michigan Library is adopting Creative Commons Attribution-Non-Commercial licenses for all works created by the Library for which the Regents of the University of Michigan hold the copyrights. These works include bibliographies, research guides, lesson plans, and technology tutorials. The Library believes that the adoption of Creative Commons licenses is perfectly aligned with our mission, “to contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge.”

Commented University Librarian Paul Courant, “Using Creative Commons licenses is another way the University Library can act on its commitment to the public good. By marking our copyrighted content as available for reuse, we offer the University community and the public a rich set of educational resources free from traditional permissions barriers.”

It is a proud week to be a Michigan librarian (see also this announcement about the new Hathi Trust shared digital repository, and this one about our shiny new Espresso Book Machine). It’s amazing to work in a library that has strongly committed to innovation without losing sight of a core value system centered around public service. I feel very lucky. Go blue!

UM receives grant for Copyright Review Management System

This is local news for me, but exciting and important on a national level (at least I like to think so).

The University of Michigan Library was just awarded a grant for over half a million dollars from the Institute of Museum and Library Services, to develop a copyright review management system which will improve the reliability of copyright status determinations.

Here are the details:

The University of Michigan Library will create a Copyright Review Management System (CRMS) to increase the reliability of copyright status determinations of books published in the United States from 1923 to 1963, and to help create a point of collaboration for other institutions. The system will aid in the process of making vast numbers of these books available online to the general public. Nearly half a million books were published in the United States between 1923 and 1963, and although many of these are likely to be in the public domain, individuals must manually check their copyright status. If a work is not in the public domain, it cannot be made accessible online. The CRMS will allow users to verify if the copyright status has been determined.

The project was inspired by the work that the Michigan Library is already doing to determine the copyright status of the thousands of books published between 1923 and 1963 that Google has digitized from our collections. Books published during that period are in the public domain if their copyrights were not renewed or if proper copyright notice was not included in the publication. Most digitization projects, including Google’s, block access to all books published after 1922 because their copyright status is unknown and difficult to determine. Michigan has a workflow in place that uses copyright renewal records and page images from the books to research the copyright status of those works, and to open up access to the ones that turn out to be in the public domain.
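The rule described above is simple enough to sketch in a few lines. To be clear, this is only my illustration of the logic as stated in this post, not Michigan’s actual workflow or the CRMS itself. The `Book` fields and the status labels are made up for the example, and a real determination depends on careful research into renewal records and page images that no boolean flag can capture.

```python
from dataclasses import dataclass


@dataclass
class Book:
    """A hypothetical record for a US-published book under review."""
    title: str
    pub_year: int
    has_copyright_notice: bool  # was a proper notice printed in the book?
    renewal_found: bool         # did a search of renewal records find a renewal?


def copyright_status(book):
    """Apply the 1923-1963 rule as described in the post (illustrative only)."""
    if not (1923 <= book.pub_year <= 1963):
        return "out of scope"
    if not book.has_copyright_notice:
        return "public domain (no notice)"
    if not book.renewal_found:
        return "public domain (not renewed)"
    return "in copyright"
```

For example, a 1940 book with a proper notice and no renewal on record comes back as "public domain (not renewed)", while the same book with a renewal on file stays "in copyright" and access stays blocked.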

The Copyright Review Management System will build on this work, and support efficient collaboration among institutions. It joins OCLC’s new Copyright Evidence Registry in the growing field of collaborative library copyright determination projects. My understanding is that Michigan is already sharing data with OCLC, and presumably our collaborators will as well. It’s nice when collaborative projects collaborate with each other.

Michigan’s project will raise the impact and usefulness of mass digitization projects by drastically increasing the number of digitized works that libraries can safely share with the public. In the absence of a reasonable orphan works bill, or even, dare I say it, some much-needed improvements in copyright law, it’s great to see libraries working to expand the known public domain and squeeze every last usable work from their massively digitized stacks.