Design, query and evaluate information retrieval systems.


This competency has existed for decades, but it once implied something entirely different: the management of analog aids and systems. Although libraries have long been pioneers of digital information retrieval, in times past it was often sufficient for librarians to avoid mastering such systems while a few exceptional minds in the literature led the way. In the contemporary age this is impossible: advanced information retrieval systems are now ubiquitous, and it is not enough simply to know how to use them. On the contrary, the contemporary librarian must be an architect of these systems, and must be a computer scientist. There is some resistance to this: dated interpretations of the librarian abound, and few students expect to become computer programmers while attending library school. What is true and immutable is that the future of the librarian is not as a laborer, and not as a book tender. Those days have passed. Today the laborious duties of the librarian (cataloging chief among them) are being competently performed by volunteer patrons. What good, then, is a librarian in this new age in which Web 2.0 (and soon 3.0) technologies dominate the way the user interacts with knowledge?

The librarian should become, and must become, a builder of information retrieval systems, digital ones in particular. Our expertise will pay dividends through our grounding in theory, in information retrieval concepts and in the proper characteristics and function of such systems. In other words, the contemporary librarian must not squander his time on old tasks, but must instead be the creator of smart, interactive, extensible and collaborative software. This last point carries added significance as the profession faces an overwhelming influx of new information in the digital age; librarians are no longer capable of handling it all. We must manage this information the same way the open source community does: through collaboration and volunteerism. That is to say, librarians must embrace the pillars of Library 2.0 and create systems in which their patrons collaborate on the colossal work that a small professional cadre cannot complete alone.

Yet before we depart on such grand endeavors we must be able to query and evaluate information retrieval systems. As information scientists we must have mastered the theory behind information retrieval, for if that theory does not guide the more ambitious Web 2.0 information retrieval systems, a disservice to the patron is not far off. Here again we glimpse the place of the librarian in the contemporary age: he is a guide and engineer who ensures that the quality of yesteryear’s systems is maintained, and only improved, by contemporary technologies. It follows that the information scientist must be familiar with the fundamentals by which information can be queried, manipulated and represented (which I address with my fifth piece of evidence below) before greater experiments consume us.

Furthermore, the information scientist should be able to analyze whether an information retrieval system is performing properly and efficiently by querying it. A search box that fails to return an article because the query is misspelled by a letter or two is of little use. Indeed, the superior information retrieval system is capable of heuristically interpreting the searcher’s intent and offering personalized, interactive feedback. Web 2.0 services are intelligent: they collect information about end-users in order to direct them to relevant content and affiliates. Amazon, for example, offers product recommendations based upon a user’s access history, often with striking accuracy and relevance. Web 2.0 services guide the user toward wise decisions about what information to retrieve, much as librarians did in the past, and they do so through dynamically informed, personalized scripts.
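To make the misspelling point concrete, here is a minimal sketch in Python of a search function that tolerates a typo or two. It is an illustration only, not how any particular library system works: the titles are invented, the `search` function and its `cutoff` value are my own assumptions, and the matching relies on the standard library's `difflib.get_close_matches`.

```python
import difflib

# Hypothetical catalog titles, invented purely for illustration.
TITLES = [
    "Introduction to Information Retrieval",
    "Cataloging and Classification",
    "Metadata Fundamentals",
    "Boolean Logic for Searchers",
]

def search(query, titles=TITLES, cutoff=0.6):
    """Return titles that contain the query or a word spelled close to it."""
    query = query.lower().strip()
    results = []
    for title in titles:
        lowered = title.lower()
        # Exact substring match first; otherwise compare the query against
        # each word of the title and accept a sufficiently similar one.
        if query in lowered or difflib.get_close_matches(
            query, lowered.split(), n=1, cutoff=cutoff
        ):
            results.append(title)
    return results

print(search("catalogng"))  # a query misspelled by one letter
```

A single-word query is assumed here; a real system would also need to handle phrases, ranking and much larger collections, where an index-based approach would replace this linear scan.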

Ultimately the librarian must be able to answer the popular complaint that library systems are not up to snuff when compared to private and open source alternatives. For some institutions this is quite a valid criticism, and it may account for decreased access: the user is getting what he or she needs elsewhere, even if it is not always (although it often is) up to the quality standards of a library institution. As a philosophical point I make no pretense of defending the old order; on the contrary, I attempt to migrate its wisdom into new vessels of emergent technology. This latter work is the proper and rightful business of the contemporary librarian.

Applied Work

As I consider this competency to be the heart and soul of the contemporary librarian, I have worked extensively to achieve mastery of it through my coursework at SJSU, as well as by my professional activities and even my recreation time.

The most obvious piece of applied work is this very portfolio. It is an information retrieval system, and I created it from scratch with no assistance. I host it myself; I configured the MySQL database which powers it, installed the WordPress software, templated it, customized the codebase to fit my vision and integrated it with my web host. That said, I have created information retrieval systems, ranging from websites to information databases, for as long as I can remember. I am currently working on a special collections digital repository for Yuba College, and I have been creating web services since I could first comprehend the code, at eight or nine years old.

My first official piece of evidence is a landmark paper I wrote on Web 2.0 information retrieval systems as they apply to library science and the profession at large. The paper examines the impact of “Web 2.0” technologies, design standards and information retrieval trends on the traditional library and its staff. It defines Web 2.0 and gives an overview of its proliferation as a popular means of information retrieval. The function, utility and nature of the librarian are discussed in contrast to these novel schemes of accessing, storing and classifying information. Lastly, the popular Web 2.0 encyclopedia Wikipedia is examined in order to demonstrate how these technologies and design trends are fundamentally changing the way the end-user seeks information. This speaks to my ability to evaluate information retrieval systems.

My second piece of evidence is an XML schema I developed for Ron Gilmour’s XML class. This schema is capable of validating PubMed XML records. It was a painful and massive undertaking, and it served as the culmination of the class and of my study of XML. Knowledge of XML is clearly a must for the contemporary librarian, and this work speaks to my ability to design information systems using XML.

My third piece of evidence is similar, with PHP in place of XML. This PHP code was written for Steve Perry’s PHP class; it is a program which queries a MySQL database and returns a list of bills. The user may insert additional bills, which are then saved to the database. As with XML, I consider proficiency with PHP essential for the contemporary librarian, and this work speaks to my ability to design information retrieval systems around the language.
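For readers who want a feel for what such a program involves, the following is a minimal sketch of the same idea, not the original PHP code. It is written in Python, with the standard library's `sqlite3` module standing in for MySQL so that it runs without a database server. The table layout and the function names `open_bills_db`, `add_bill` and `list_bills` are assumptions made for illustration.

```python
import sqlite3

def open_bills_db(path=":memory:"):
    """Open the database and create the (hypothetical) bills table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bills ("
        "id INTEGER PRIMARY KEY, payee TEXT NOT NULL, amount REAL NOT NULL)"
    )
    return conn

def add_bill(conn, payee, amount):
    """Insert one bill; the ? placeholders keep user input out of the SQL text."""
    with conn:  # commits on success, rolls back if an error is raised
        conn.execute(
            "INSERT INTO bills (payee, amount) VALUES (?, ?)", (payee, amount)
        )

def list_bills(conn):
    """Return every bill as a (payee, amount) tuple, oldest first."""
    return conn.execute(
        "SELECT payee, amount FROM bills ORDER BY id"
    ).fetchall()

conn = open_bills_db()
add_bill(conn, "Electric", 42.5)
add_bill(conn, "Water", 18.0)
print(list_bills(conn))
```

The parameterized queries are the design choice worth noting: whatever the language, user-supplied values should be passed separately from the SQL statement rather than pasted into it.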

My fourth piece of evidence is a group assignment report I furnished for Mary Bolin’s metadata class. In this project we were asked to create a DB/Text database with Textworks from the ground up, including a comprehensive metadata plan, a textbase, user model, indexing rules and record structure. I was responsible for a variety of facets of this project outside the scope of this area; for a full summary, see pages 19-20 of the report. In essence this work speaks to my ability to query and design information retrieval systems from the ground up.

My fifth piece of evidence is a midterm paper for LIBR 202. The paper covers basic concepts which are foundational to an understanding of information retrieval systems. Topics such as aggregation, discrimination and disambiguation, metadata theory, full text versus natural language, controlled vocabulary, classification, Boolean logic, descriptors, search logics and heuristics are defined, discussed and explored. Understanding these concepts is essential to evaluating information retrieval systems; to proceed without them is to be like a student of mathematics attempting calculus without a grounding in algebra: a fool’s errand.
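Of the concepts listed, Boolean logic is the easiest to show in a few lines. The sketch below, in Python, builds a small inverted index and answers AND, OR and NOT queries with set operations. It illustrates the general technique rather than anything taken from the paper; the sample records and function names are hypothetical.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def boolean_and(index, a, b):
    """Documents containing both terms (set intersection)."""
    return index.get(a, set()) & index.get(b, set())

def boolean_or(index, a, b):
    """Documents containing either term (set union)."""
    return index.get(a, set()) | index.get(b, set())

def boolean_not(index, a, b):
    """Documents containing term a but not term b (set difference)."""
    return index.get(a, set()) - index.get(b, set())

# Hypothetical records, invented for illustration.
docs = {
    1: "metadata theory and classification",
    2: "boolean logic and search heuristics",
    3: "controlled vocabulary and classification",
}
index = build_index(docs)
print(boolean_and(index, "classification", "metadata"))
print(boolean_not(index, "classification", "metadata"))
```

Terms are matched exactly as typed, so this sketch has none of the controlled vocabulary or disambiguation that the paper discusses; those are the layers a real system adds on top.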

My sixth piece of evidence is the website BMSwiki.com. BMS Wiki is a MediaWiki website which provides information on the Benchmark Sims modification for the computer game Falcon 4.0 by MicroProse. I created the wiki, assembled a team of editors, and established editorial guidelines, a user access plan, cataloging rules, subject categories and a graphic template. While it is a new project, it demonstrates my ability to design and maintain information retrieval systems.

“The Web 2.0 Paradigm: Impacts on Library Science Methodology and Professionalism” (.PDF)

PubMed XML Schema (.xsd) also on PasteBin

Bills PHP program (.zip containing a .php file) also via PasteBin

DB/Textworks Group Project (.PDF)

LIBR 202 Midterm (.PDF)

BMSWiki.com (external website)


