by Lisa Dallape Matson and David J. Bonski
ONLINE, November 1997
Copyright © Online Inc.
Editor's Note: A year ago, I received the call that every editor lives for: Out of the blue, an
interested reader picks up the phone and offers to write an article on a timely and
important topic. In this case, it was Lisa Matson, a librarian from the National Drug
Intelligence Center (NDIC). Since Lisa was in start-up mode at a new,
information-intensive government agency, she had been doing a lot of thinking about
digital libraries and electronic delivery of information to the desktop. She wanted to
answer the question "Do digital libraries still need librarians?" I assured her
that others would like to have an answer to that question, too, and we agreed that she
would author an article on the topic for ONLINE. When the manuscript came in several
months later, it came from an unfamiliar email address and had acquired a co-author.
Here's what happened in the intervening time.
--Susanne Bjørner
Lisa: Since June of 1995, Dave and I had each been working within our own separate
divisions at NDIC to improve access to information for our ultimate
"customers"--the intelligence research specialists. Because I had the unique
opportunity to start a "library" from scratch, it seemed prudent not to create a
totally paper-based collection, but to think about advances in digital library research
and how they could be applied at NDIC.
Dave: In my job, I had been defining an overall system architecture for a data tagging
project--a capability that will allow users to more effectively handle high value
information derived from multiple sources. I became aware that Lisa was doing some initial
work to define a concept for a digital library.
Lisa: Our two projects were more or less independent of each other and we only had
occasional dialog with each other regarding various issues related to them.
Dave: A number of aspects of these two projects seemed to have similarities. Work on
the separate projects required us to be in a closer and more intensive dialog with each
other than we otherwise would have been.
Lisa: Last December (1996), during one of our meetings, when we were discussing, among
other things, information industry standards like HTML, I used certain
"librarian-centric" terms to describe what I was thinking. Dave had to translate
what I said into engineering terms. What I called "controlled vocabulary," he
called--
Dave: "metadata"
Lisa: and "searching the literature"--
Dave: "data mining."
Lisa: And so on. We finally looked at each other and said "Gosh, we're living in
parallel universes!" We're trying to accomplish many of the same things--pushing the
same boulder up the same hill. We're just using different words to describe it!
Dave: Lisa's suggestion to co-author this article seemed eminently logical, since it
attempts to bring together the two domains of information (library) science and systems
engineering. The former contributes its specialties of the organization of information and
methods for bringing order out of chaos (cataloging and searching with efficiency); the
latter offers the potential for providing technologies that can intelligently assist the
librarian in performing the craft of librarianship.
WHAT IS A DIGITAL LIBRARY?
A century ago, the Carnegie Libraries springing
up in towns all across the United States drew a lot of attention. Today's new libraries
cannot be seen, but the Digital Library, even without an imposing marble edifice, is
transforming access to information. Not being a building, however, has its disadvantages;
the Digital Library remains an abstract and amorphous thing, even to library
professionals. Much of the discussion in the professional literature debates what digital
libraries really are, or proposes how to define them. Two of the most useful articles for
practitioners planning to build a digital library are William Saffady's "Digital
Library Concepts and Technologies for the Management of Library Collections" and
Peter Lyman's "What is a Digital Library?" [1,2].
Saffady provides practical techniques by which librarians and system engineers laboring
in the trenches can estimate the cost of creating a real digital library. He gives the
best summary we have seen in the literature of what the adjective digital adjacent to the
noun library really means. More precisely, he describes, with thorough scholarly
citations, the multiple meanings being bandied about.
To paraphrase the definitions that Saffady has noted over the course of 25 years, a
digital library may be considered to be any of these:
Machine-readable data files, often with scientific and technical applications
Components of the emerging National Information Infrastructure
Various online databases and CD-ROM information products
Computer storage devices on which information resides, such as optical disk jukeboxes or magnetic tape autoloaders
Computerized networked library systems
As practitioners today, we find his own definition to be the most useful one: A digital
library is a library that maintains all, or a substantial part, of its collection in
computer-processible form as an alternative, supplement, or complement to the conventional
printed and microfilm materials that currently dominate library collections.
Saffady's thoroughness and mastery of technology
are balanced by Lyman's more expansive scope. Lyman explains many ideas and changes in the
profession that practitioners have understood experientially. One of his most useful
points is that computers were originally created by engineers, and thus reflect a focus on
numeric problem solving--what he calls "masculine language of action." This
machine-dominated reality now has Web pages, a format Lyman sees as a new kind of
"living textuality." The language or content of the query--the need for the user
to find meaning, to answer the question, to continue thinking--has been and continues to
be librarianship in its best sense. These two noteworthy articles demonstrate a division
that exists generally in the literature. Commentators have tended to focus upon the
importance either of technology or of librarianship. As a result, the reading practitioner is
presented with a false dichotomy and, even worse, a sense that technology must take over
librarianship and the librarian. This fear misunderstands both roles.
Libraries, digital or otherwise, exist so that users can find information. We have
learned that the two sides, librarians and technologists, must have respect for the
professional knowledge that each brings, the confidence to contend for what one
understands, and the willingness and
integrity to cooperate to meet the needs of the users. There is no need to sacrifice the
past 100 years of the learning of librarians to new technology.
THE ROLES OF LIBRARIANS: The effects of technology
New technology is, of course, very
powerful and brings, in Neil Postman's phrase [3], "an imperialistic thrust" not
only into librarianship, but into everyday life. Not surprisingly, the new roles created
by changing technology have commanded attention and have made up a large part of recent
literature about librarians and digital libraries. Especially since the creation of the
USMARC record in the late 1960s, and the resulting proliferation of online catalogs,
librarians have been spurred by technological developments to become more efficient
organizers, indexers, abstractors, and archivers of the past. They have, in short, brought
their traditional skills--especially cataloging--to the service of the new technologies.
(Nevertheless, one can still find the old familiar card catalog in many government
libraries, long after it has disappeared from academic and public libraries, a discovery
that evokes mixed feelings, for librarians, of nostalgia and shock!)
The struggle of librarians to cope with the dynamic changes brought by technology is real.
In the course of this effort, multiple roles for the librarian have been proposed. They include:
The librarian as enhanced service provider, an actor who proactively provides selective
dissemination of information (SDI), or current awareness, services.
The librarian as guru of copyright or, in the electronic age, of licensing and
electronic redistribution. They may not have taught this in library school but, in fact,
it is a role that draws upon a traditional professional strength: the understanding of
what users and organizations actually want to do with information (how it will be used, by
whom, for how long).
The librarian as system interface designer, a role that makes use of experience with
how patrons request, use, and process information. Technology cannot yet replicate the
human interaction between practitioner and the user, or the ability to respond to the
episodic and eclectic nature of users and their queries. But as the design process moves
to the next step--the actual evaluation of the system--librarians and technologists have
many opportunities to cooperate on adjustments and enhancements of the systems.
Still, as Herbert White puts it, librarians "have had so many problems because we have
been willing to accept the status of bit players" instead of understanding the assets our
experience gives us [4].
Technologists have devised the term "metadata" to
describe the traditional organizing activities of librarians. That is, librarians create
data about data. But in some sense both technologists and librarians have missed the point
that this is nothing new--librarians have always been experts in it. Unfortunately,
though, librarians have often failed to take advantage of an exciting opportunity
technology has opened to them: working as full partners with system engineers in the
delivery of content to the user.
For their part, technologists tend to view the work of librarians, however excellent it
may be, as slow and expensive. The technological approach places emphasis on speed and
efficiency--on brute force rather than elegance to achieve results. Computer memory and
storage are cheap, and more computing horsepower makes for an economical and rapid method
of ingesting large volumes of text to make indices that can be used later to search for
document content.
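The brute-force indexing approach described here can be illustrated with a minimal sketch in Python. It builds an inverted index (a map from each word to the documents that contain it) over a tiny, made-up corpus. The function names and sample documents are hypothetical and are not drawn from any NDIC system.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical sample corpus, for illustration only.
documents = {
    "doc1": "Digital libraries need librarians.",
    "doc2": "Librarians create data about data.",
    "doc3": "Technology is a force multiplier for digital libraries.",
}

index = build_index(documents)
print(search(index, "digital libraries"))  # ['doc1', 'doc3']
print(search(index, "librarians"))         # ['doc1', 'doc2']
```

Note that an index of this kind matches words only as they are spelled: a search for "library" finds nothing here, because the documents say "libraries." That gap is the kind a librarian's controlled vocabulary is meant to close.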
THE NDIC: BALANCED EMPOWERMENT
The National Drug Intelligence Center (NDIC) is a new
government agency within the Department of Justice. Its mission is to collect information
on drug trafficking organizations and to develop recommendations to senior officials on
dismantling them or rendering them ineffective. The concept of a digital library and the
enabling technologies that can implement it are important to the mission of NDIC for three
reasons:
Since all of NDIC's data are provided by other Federal agencies, proper handling of the
data and accountability for their use are required.
Because NDIC is a new agency, an enormous potential exists to apply state-of-the-art
technology and thinking about data organization.
After only three years of operation, the NDIC's legacy repositories already hold a
large volume of data that must be effectively "mined" for relatively small
amounts of high-value information in order to provide the basis for the analytical work
that is performed at the Center.
In the operations of the NDIC, a Collection Management
Branch--a team of people who are a "special library" of sorts within an
information organization--includes librarians and technical information specialists
(TISs). The Branch gathers data and prepares it for analysis, while Intelligence Analysts
attempt to use this data to find trends, patterns, vulnerabilities, weaknesses--in short,
to use professional judgment to create a finished intelligence "product" that
makes connections between people, places, drug transactions, and other factors related to
counternarcotics work. Our Intelligence Analysts clamor for full empowerment through
technology, and want the world at their fingertips, meaning full desktop access to the
Internet, as well as all the relevant commercial databases and proprietary government
databases. The Collection Management Branch contends that an enormous degree of expertise
and skill, based on years of work in librarianship and information science, is necessary
to understand how data is organized and what the most efficient method of searching is.
It is very human to ask "who is right?" but any real solution must recognize
that each discipline can contribute to the entire enterprise in a meaningful way.
Although desktop access to the Internet may be an eventuality, in practical terms it is
inefficient for analysts who are not trained in information retrieval to search for
information with most search engines (including AltaVista, Lycos, and so on). Typing in
search criteria that yield half a million hits, which are then viewed ten at a time,
produces no information at all.
Where Information Science (IS) Meets Information Technology (IT)
A principal task at NDIC is the
collection of information from a universe of data. The technologist calls this "data
mining," the librarian calls it "searching the literature," and the analyst
says it's "asking critical questions." In all three cases, however, each request
for information must be presented to the technology at what is called the
"man-machine interface." In each case, use of technology must be made in a
highly precise manner, or "syntax." The precision demanded by technology places
an enormous restriction on human interaction with the machine. How does the human being
capture the meaning or intent behind the question that is asked? The librarian uses such
terms as linguistics, semantics, or context to describe ways to obtain information or
content. The technologist has developed precise mechanisms for accessing data; these include
Structured Query Language (SQL) for relational data (data that lends itself well to being
stored and viewed in table form), and the Z39.50 WAIS standard for accessing and retrieving
free text data. Early attempts at imbuing technology with "smarts," such as
artificial intelligence or expert system technology, have met with only limited success
since the imitation of human thought processes works only within extremely narrow fields
of specialization.
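To make the precision of this "syntax" concrete, here is a minimal sketch of an SQL query issued from Python against an in-memory SQLite database. The table, columns, and rows are invented for illustration and do not describe NDIC data; SQLite simply stands in for whatever relational system might be used.

```python
import sqlite3

# In-memory database; the table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reports ("
    "id INTEGER PRIMARY KEY, source TEXT, year INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO reports (source, year, title) VALUES (?, ?, ?)",
    [
        ("Agency A", 1995, "Regional trends"),
        ("Agency B", 1996, "Transport routes"),
        ("Agency A", 1996, "Annual summary"),
    ],
)

# The question "what did Agency A send us in 1996?" must be restated
# exactly in terms of the table and column names.
rows = conn.execute(
    "SELECT title FROM reports WHERE source = ? AND year = ? ORDER BY title",
    ("Agency A", 1996),
).fetchall()
print(rows)  # [('Annual summary',)]
conn.close()
```

The query succeeds only because it names the table and columns exactly as they were defined. A misspelled column name is an error, which is the restriction on human interaction that the text describes.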
Technology-based attempts at improving the interaction process between man and machine
fall into three primary areas of information access:
Finding It: Web-Enabled Technologies. NDIC has embarked on several initiatives to
exploit the state of the art in text-handling technologies and to deploy these
capabilities into the workplace. A key objective is to understand the human behaviors behind
information seeking and to implement information retrieval techniques that support each
behavior. Approaches being explored include exploiting the linguistic patterns that exist
within text and exploiting document annotations and structure.
Viewing It: Data Visualization and Data Mining. An organization that cannot access
mission-critical information in a timely manner will find itself "data rich but
information poor." Brute force methods of drilling down through multiple levels of
sub-directories are enormously inefficient and time-consuming; they produce, at best,
marginal results. The use of data visualization/data mining tools can greatly improve
access to information and should confirm the hypothesis that this type of cerebral
activity is largely visual and not procedural.
Storing It: SGML as a Standard. The business process at NDIC is not unlike that of a
publishing house; we bring information in, provide "value added" to it, and
provide output in the form of a finished intelligence product. Our challenge is to manage
content when it exists in many diverse native formats. Through the use of SGML as an
information standard, NDIC hopes to create an "information rich" environment
where information can be separated easily and quickly from data so that the value of our
products to the law enforcement community can be measurably improved.
The trend will be to continue to develop technology so that it mimics human behavior,
but at higher levels of fidelity. As the boundary between neurons and electrons becomes
seamless, the human knowledge worker will rely on a more sophisticated array of tools in
his or her toolkit.
"Intelligent assistants" can reduce the amount of repetitive, routine data
preprocessing that must be done. This has already happened on the automobile assembly line
with robotic agents. These agents may fairly replicate the lowest levels of human thought,
while freeing us to engage at the highest levels of productive, creative, and conceptual
thinking. Being able to collect facts and simply bundle them into a document is quite
different from extracting meaning from them in a carefully reasoned manner. Until the time
when we can have our own individual Star Trek-like Data humanoid, we must be satisfied
with using the force-multiplier effect of evolving technology to our advantage.
CONCLUSION
Digital libraries need librarians. Jaime Carbonell of Carnegie Mellon
University pointed the way when he wrote: "Advances in all these technologies are
underway, but are not yet coordinated and targeted at the task of creating a digital
librarian" [5]. That task must necessarily occur as a partnership--what Chun Wei Choo
calls an "information partnership." In a recent book [6], Choo envisions three
groups of specialists working together:
Domain experts, who are personally engaged in the act of creating and using knowledge
and through whose coordinated efforts the organization as a whole performs its role and
attains its goals;
Information technology experts, who have the specialized expertise to fashion the
organization's information infrastructure by building the applications and networks that
allow the organization to do its work with accuracy, reliability and speed; and
Information experts--the librarians--who have the skills, training and knowledge to
organize knowledge into systems and structures that facilitate the productive use of
information and knowledge resources.
Our experience in attempting to build a digital
library at NDIC has taught us that librarians and technologists are struggling together in
the evolution of a new profession. Librarians must gain the ability to live comfortably in
a new environment and to recognize that their role is much more than custodians of a
traditional edifice of knowledge. Technologists must come to see librarians as
"living metadata": breathing, thinking, creative. By itself, technology--however
marvelous or powerful, whatever its potential--is cold and sterile. It will remain so
unless someone adds the ability to bring the right information to the right user at the
right time. If technology is a great force multiplier, the digital librarian can be a
great force.
ACKNOWLEDGMENT
The authors acknowledge their indebtedness for conversations, suggestions, and ideas to
Robert Akscyn, Ed Leonard, and Mary and Joe Price.
REFERENCES
[1] William Saffady. "Digital Library Concepts and Technologies for the Management of Library Collections: An Analysis of Methods and Costs." Library Technology Reports 31 (May-June 1995): pp. 221+.
[2] Peter Lyman. "What Is a Digital Library? Technology, Intellectual Property, and the Public Interest." Daedalus 125 (Fall 1996): pp. 1-33.
[3] Neil Postman. Keynote speech at EDUCOM 93. Cincinnati, OH. General Session-Part I.
[4] Herbert S. White. "Our Retreat to Moscow and Beyond." Library Journal (August 1994): pp. 54-55.
[5] Jaime Carbonell. "Digital Librarians: Beyond the Digital Book Stack." IEEE Expert 11 (June 1996): pp. 11-13.
[6] Chun Wei Choo. Information Management for the Intelligent Organization: The Art of Scanning the Environment. Medford, NJ: Information Today, 1995. pp. 198-202.
Lisa Dallape Matson earned an MLS degree in the School of Information Sciences at the
University of Pittsburgh and has subsequently pursued related studies in computer science,
business, and legal research. Before joining the Department of Justice, Ms. Matson held
positions in the Blue Cross/Blue Shield Information Center and the Libraries of Juniata
College and the University of Pittsburgh at Johnstown.
David J. Bonski, a registered Professional Engineer, is a graduate of the University of
Pittsburgh and Virginia Polytechnic Institute and State University and has over twenty
years of professional public and private sector experience in applying the disciplines of
systems engineering and project management to the development of computer-based
intelligence gathering systems for the U.S. Government.
Copyright © 1997, Online Inc. All rights reserved.