
Reader,
take heart! (Publisher, be very, very afraid!) Understanding Google
by Richard T. Hull

TAA Executive
Director Richard T. Hull
The New York
Times Magazine for May 14, 2006, carries a long cover article, billed as
"a manifesto by Kevin Kelly," titled "What will happen to books?" The
full article should be required reading for every academic author.
Kelly chronicles
the work being done to "bring us a planetary source of all written material,"
providing "all the works of humankind to all the people of the world."
Scanning technology "will enable us to grab and read any book ever written
. . ., any article ever written in any newspaper, magazine or journal
. . ., every painting, photograph, film and piece of music produced
by all artists, present and past . . ., all radio and television broadcasts
. . ., [and] a copy of the billions of dead Web pages no longer online
and the tens of millions of blog posts now gone . . . --in short, the
entire works of humankind, from the beginning of recorded history, in
all languages, available to all people, all of the time."
"Corporations and
libraries around the world are now scanning about a million books per
year." And much of this work is being outsourced to India and China,
where scanning costs for a book are about one-third of those in the U.S.
But scanning is
only the first step in the revolution. Avid readers will link books
to other books. Tags, which are public annotations hung
on a file, page, picture or song, will enable others to search for that
file. This amounts to a reader-generated alternative to the Dewey Decimal
System.
Web surfing, which
we all do when we click on some paragraph or page, strengthens the
relationship between the endpoints of every link and the connections
suggested by each tag. The effect is a kind of social, democratic intelligence.
When books are
"deeply linked, you'll be able to click on the title in any bibliography
or any footnote and find the actual book referred to in the footnote."
"So what happens
when all the books in the world become a single liquid fabric of interconnected
words and ideas? Four things: First, works on the margins of popularity
will find a small audience larger than the near-zero audience they usually
have now. . . . (D)igital interlinking will lift the readership of almost
any title, no matter how esoteric.
"Second, the universal
library will deepen our grasp of history, as every original document
in the course of civilization is scanned and cross-linked.
"Third, the universal
library of all books will cultivate a new sense of authority. If you
can truly incorporate all texts -- past and present, multilingual --
on a particular subject, then you can have a clearer sense of what we
as a . . . species do know and don't know.
"Finally, the full,
complete universal library of all works becomes more than just a better
Ask Jeeves. Search on the Web becomes a new infrastructure for entirely
new functions and services."
Kelly traces the
evolution of intellectual property rights to control copying, noting that
the period of protected ownership has increased from 14 years (in 1790)
to 70 years beyond the author's life today. The obstacle to the emergence
of the digital
library is not the 15 percent of the world's 32 million cataloged books
that are in the public domain (most of the current scanning effort by
American libraries is directed toward digitizing these works), nor the
10 percent of all books actively in print. It is the 75 percent that
fall in between.
There is no catalog
of copyrighted works. Publishers don't have exhaustive lists of the
copyrights they own. The Library of Congress does not have such a catalog.
"The older, the more obscure the work, the less likely a publisher will
be able to tell you (that is, if the publisher still exists) whether
the copyright has reverted to the author, whether the author is alive
or dead, whether the copyright has been sold to another company, whether
the publisher still owns the copyright or whether it plans to resurrect
or scan it."
"The legal limbo
surrounding their status as copies prevents them from being digitized
. . . . And if they are not scanned, they in effect will disappear."
The year 2019 is
now the point at which the 70-year span of copyright beyond the life
of the creator, enacted by Congress in 1998, will begin to yield this
75 percent to the public domain.
Enter Google. "No
one was able to unravel the Gordian knot of copydom, until 2004, when
Google came up with a clever solution. In addition to scanning the 15
percent out-of-copyright public-domain books with their library partners
and the 10 percent in-print books with their publishing partners, Google
executives declared that they would also scan the 75 percent out-of-print
that no one else would touch."
"For out-of-copyright
books, Google would show the whole book, page by page. For the in-print
books, Google would work with publishers and let them decide what parts
of their books would be shown and under what conditions. For the dark
orphans, Google would show only limited snippets. And any copyright
holder (author or corporation) who could establish ownership of a supposed
orphan could ask Google to remove the snippets for any reason."
Kelly points out
two arguments against Google's strategy for the "dark orphans." The
first is that publishers of such works have accused Google of blatant
copyright infringement. Google is potentially making ad revenue from
the snippets that it makes searchable, and is doing so without either
having a contractual arrangement for sharing the revenue with copyright
holders or even obtaining permission before scanning the work. Ironically,
publishers have started caring "about these orphans now because
Google has shifted the economic equation: because of Book Search," out-of-print
books may have some renewed income potential, "and the publishers don't
want this potential revenue stream to slip away from them."
The second argument concerns Google's scan-first approach. Google maintains
"that it is nearly impossible to track down copyright holders of orphan
works, and so, it says, it must scan those books first and only afterward
honor any legitimate requests to remove the scan" that present themselves.
"It is up to you as an author to notify Google to scan or search your
copyrighted material." And the problem is ultimately precedent: if digitizing
works that are out of print becomes profitable, you as copyright holder
or author would have to be omniscient about who was doing so and "find
and notify each and every geek who scanned your work, if for some reason
you did not want it indexed." If you missed one, you might end up being
indexed anyway. And once that genie is out of the bottle, it will be
hard to cram it back in.
Kelly concludes
with the observation that authors and other creators of intellectual
property don't yet have alternatives to the copyright-based forms of
compensation for their labors. And it is clear to TAA readers that copyright-based
compensation has been greatly weakened as the doctrine of fair use and
the reselling of works without royalty payments have undermined the publishing
industry's financial model. As Kelly sums it up, "Search is a wholly
new concept, not foreseen in version 1.0 of our intellectual-property
law." He proposes "a new covenant: Copyrights must be counterbalanced
by copyduties. In exchange for public protection of a work's copies
(what we call copy-right), a creator has an obligation to allow that
work to be searched. No search, no copyright."
There is much here
to ponder.