Comments on: Google Print — the debate

By: Imre Simon

Imre Simon — Sun, 13 Nov 2005 10:51:52 +0000

In the 1965 book "Libraries of the Future", coordinated by Internet pioneer J.C.R. Licklider, the authors predicted, based on statistical projections, that around the year 2000 a computer system would be able to store the text of all the books ever written.

The Google Print program seems to be a realization of this insight.

Curiously, as far as I know, not even Licklider and his co-authors were able to foresee that such a landmark realization would open the doors to cross-pollinating the contents of all these books, for instance through the building and instantaneous use of global indices, which are an important part of the Google Print project.

Equally astonishing to me is the fact that they did not foresee either, as far as I know, that such a technological breakthrough might eventually be pre-empted by upholding and enforcing copyright arguments. Note that all the copyright principles were already in place at that time and that the ones we have now are not that much different from the ones of 1965. There would have been no need whatsoever for extrapolations!

A curiosity, anyway.

By: Michael Taus

Michael Taus — Fri, 11 Nov 2005 19:08:08 +0000

I just got around to reading the article in Wired -- Google's Tough Call. Good stuff, Lawrence.

I do have some additional thoughts that I'd like to share with the group. You can find them here.

Best,
Mike

By: Martin Bizzarro

Martin Bizzarro — Fri, 11 Nov 2005 01:44:50 +0000

Bonjour (!)

Being a full-time law student, in Montréal, I won't be able to attend this great evening. Any chance this could be "podcast" later on?

Merci and thanks for this great website!

By: Asma

Asma — Wed, 09 Nov 2005 16:29:51 +0000

Ditto. I'm in Los Angeles, and as much as I'd love to jump on a plane, a videopost would be FANTASTIC. Please let us know if this can be arranged?

By: Joseph Pietro Riolo

Joseph Pietro Riolo — Wed, 09 Nov 2005 07:25:47 +0000

To nate:

There is no difference between creating an index and
distributing that index. It is considered as a separate
work in the U.S. copyright law (see the definition of
"supplementary work" within the definition of "work made
for hire" in Section 101). I don't know if it has
been tested in any U.S. court. The U.S. copyright law
is pretty clear on this because it also defines
"derivative work" which is very different from
"supplementary work".

However, you can't recreate from an index an identical
copy of a work that still has copyright. It is similar
to trying to recreate an identical copy from snippets
from different sources as allowed by fair use. For
example, 100,000 people quote snippets from the latest
Harry Potter book as allowed by fair use (for school
assignments/homework, research, criticism, analysis,
and so on). Then, I collect snippets from their works
to create a work that is identical or substantially
similar to the Harry Potter book. The U.S. copyright
law does not permit this.

There is nothing wrong with creating an index of the
color at each position of an image. After all, it is
just an index. But, if someone tries to recreate
an image from the index, he crosses the line into
the zone of infringement.

Joseph Pietro Riolo
<josephpietrojeungriolo@gmail.com>
<riolo@voicenet.com>

Public domain notice: I put all of my expressions in this
comment in the public domain.

By: nate

nate — Tue, 08 Nov 2005 23:51:25 +0000

Joseph Pietro Riolo writes:
> Creating an index of words in a book is permissible by the U.S. copyright law.

Has this been tested anywhere? And is there a legal difference between creating that index and distributing that index? If the full text index includes the word order (as any index capable of phrase searching does), it's trivial to re-invert that index to produce the original text. If it is both legal to create and distribute that index, that's tantamount to saying that it's legal to make a copy of and distribute the work itself.
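As a rough illustration of the re-inversion point, here is a minimal Python sketch. It assumes a toy index that records the position of every whitespace-separated token. The function names `build_positional_index` and `rebuild_text` are hypothetical, and nothing here describes how Google Print or any real search engine actually stores its index.

```python
from collections import defaultdict


def build_positional_index(text):
    """Map each token to the list of positions where it occurs in the text."""
    index = defaultdict(list)
    for position, word in enumerate(text.split()):
        index[word].append(position)
    return dict(index)


def rebuild_text(index):
    """Re-invert the index: put every token back at its recorded positions."""
    total = sum(len(positions) for positions in index.values())
    words = [None] * total
    for word, positions in index.items():
        for position in positions:
            words[position] = word
    return " ".join(words)


# Toy example (assumed input, not taken from any real book).
original = "the cat sat on the mat"
index = build_positional_index(original)
rebuilt = rebuild_text(index)
```

In this toy case the rebuilt string matches the input word for word. An index that lowercases tokens, strips punctuation, or drops common words would give back only an approximation, so how much of the original can be recovered depends on what the index keeps.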

It's tempting to make a parallel between digital text and digital images. In 'Kelly v. Arriba Soft' (according to my layman's understanding) it was decided that thumbnails are allowed to be created and distributed but high resolution copies were not. Obviously (red flag word) if you take a copyrighted GIF and convert it into an (approximately) equal resolution TIFF, it does not suddenly lose its copyright. But essentially, apart from the lack of a widespread tool to automatically rebuild the document from the inverted index, building an index is just such a conversion to another format in that no information is lost. There is (in my mind at least) a nifty parallel between an index of the color at each position of an image and the location of each word within a document.

Does this mean that one is not actually allowed to build such an index? Although people seem to be offering the 'well, if building the index is illegal then web search itself is illegal' reasoning as an apparent reductio ad absurdum, I'm not so sure that it is absurd. I think there is going to need to be a significant clarification of the law here (either by the courts or the Congress), and probably one that eventually makes a much more explicit link between the medium in which an object is expressed, the degree of 'fair use' that one is allowed, and the purpose to which that 'use' is being put.

--nate

By: Peter Rock

Peter Rock — Tue, 08 Nov 2005 22:09:44 +0000

Thanks JPR. That was informative.

Yes, it is important to distinguish between ownership and control. It would seem as though there is no copyright available on such a database as it - as far as I understand - won't contain any arguable "creativity".

But as you point out, in European law there is an opportunity for control through sui generis. The control available under this regime is worrisome. That is, Google can claim infringement if one were to either make substantial copies of portions of the database for their own personal use (extraction) or distribute copies to others (re-utilization).

So let's say this project goes forth.

How do the copyrights of the authors relate to Google's sui generis database rights? Can Google simply license the database to individuals (effectively offering copies of the scanned material) yet exercise their "re-utilization" rights to prevent anyone else from competing in the searchable-database arena? Or do the copyrights affecting the data contained within the database trump the rights of sui generis thus forcing Google to basically enforce their rights whether they want to or not?

I can't help but believe that much of this headache is a result of trying to continue to apply the archaic notion of ALL RIGHTS RESERVED to a digital world. If only all works were under a creative commons license!

It is becoming clearer and clearer to me that the framers of Article I, Section 8, Clause 8 of the U.S. Constitution never intended perfect control over distribution. I suppose exclusive rights weren't much of an issue back then given the cost of reproduction with no opportunity to sell. If only they had imagined a digital world with the Internet. Perhaps they would have made this point clear and stated that 100% "exclusive" rights are not a fair bargain for the public - the supposed beneficiaries of copyright. The non-commercial right of creative commons licenses is, I believe, the key to striking a balance between free distribution and $$$ for authors. Sharing will not turn authors into beggars.

Where'd I put that Tylenol?

By: H.B.

H.B. — Tue, 08 Nov 2005 22:01:36 +0000

I clicked on the link and couldn’t find a place to purchase the tickets. I searched for “library,” “lessig,” “battle,” “books,” “google,” by 11/17/2005 date…Nothing. Any help? Am I missing something obvious?

By: Joseph Pietro Riolo

Joseph Pietro Riolo — Tue, 08 Nov 2005 20:58:31 +0000

To Peter Rock:

Google obviously will have control over the database.
After all, it is the one that has the database. The amount
of control that Google has over the database greatly
depends on how much control it wants through security
measures and/or license.

It is important to make a distinction between control
and ownership. Google can't claim ownership in the
database because the U.S. copyright law does not grant
copyright to any non-creative work. The database is
simply a list of all occurrences of words in books.
There is no creativity in selecting which words to
keep and which words to discard. It is like a telephone
book that attempts to list all phone numbers in an
area. So, if Google accidentally releases the database
to the public, people can copy the database as
much as they want to. In Europe, however, Google
can claim sui generis right in the database.

Google is not the only company that does it. Ancestry.com
is another company that controls its databases through
license even though much information in the database is
in the public domain.

It is not an issue of "fair use". Creating an index
of words in a book is permissible by the U.S. copyright
law. A database containing the index is permissible as well.
What is the issue is the snippet. Plaintiffs in the
lawsuits claim that snippets are not within the boundary
of fair use.

Joseph Pietro Riolo
<josephpietrojeungriolo@gmail.com>
<riolo@voicenet.com>

Public domain notice: I put all of my expressions in this
comment in the public domain.

By: Peter Rock

Peter Rock — Tue, 08 Nov 2005 18:04:31 +0000

The more I think about it, the aspect of Google Print that concerns me is - who will have control over the database created by this project? I REALLY don't feel it is right for Google to exclusively own it even if they are the ones creating it. Exclusive rights to such a database seem wrong to me.

Is the focus going to simply be on "fair use" and whether or not Google gets the green light?

I understand this is an important "fair use" issue, but the rights to the database seem - to me at least - to be a much more vital question that needs to be considered before Google gets the green light.

What am I missing? This is a very complex issue...my head hurts.