ERPANET Case Study: Project Gutenberg | Page 4

to be done before an eBook
can be digitised and made available.
Asset Value and Risk Exposure
Project Gutenberg exists to make literature and reference materials
freely accessible to the general public in a digitised format. As
mentioned above, Michael Hart believes that free access to literary
works is vital for enabling the sharing of knowledge, art, music and
culture.

Regulatory Environment
Project Gutenberg must adhere to U.S. laws governing operation as a
not-for-profit corporation. However, these regulations are not
sector-specific. Project Gutenberg must also be exceedingly careful to
respect U.S. copyright law regarding the works that it digitises and
makes available over the Internet. However, once a publication has been
verified as being in the public domain, there are no other legal
restrictions affecting Project Gutenberg.
Preservation Activity
Policies and Strategies
Project Gutenberg scans literary works and employs OCR technology
to create eBooks. In some cases, eBooks are typed in by hand. The
eBooks are then edited by a team of volunteer proof-readers. There are
procedures and guidelines available online for volunteers to consult
when scanning and editing texts for Project Gutenberg to ensure that all
eBooks follow a standard format. Once the eBook has been produced, it
is uploaded to two main servers. The eBook is made accessible via the
official Project Gutenberg website and the Internet Archive site and on
over thirty mirror sites around the world. As there are no access or
distribution issues, Project Gutenberg encourages users to save copies
of the eBooks to CD or DVD.

Project Gutenberg believes that generating a multitude of copies
(those stored on the main servers, those held on local servers through
mirror sites, and those downloaded to CD and DVD) will ensure that the
bit stream of the literary work is preserved for access. This embodies
the philosophy of the LOCKSS (Lots of Copies Keep Stuff Safe) strategy.
LOCKSS 'uses the caching technology of the web to collect pages of
journals as they are published, allowing libraries to take physical
custody of selected electronic titles they purchase' (11). LOCKSS was
inspired by the words of Thomas Jefferson, who said "let us save what
remains: not by vaults and locks which fence them from the public eye
and use in consigning them to the waste of time, but by such a
multiplication of copies, as shall place them beyond the reach of
accident." (12)
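The idea that many copies protect the bit stream can be illustrated with a small sketch. It is not the LOCKSS protocol and not a tool Project Gutenberg is known to use. It only assumes that copies of one eBook can be read as bytes, hashes each copy, and flags any copy that disagrees with the majority. The function names and the sample copies are hypothetical.

```python
import hashlib
from collections import Counter


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of one copy's bit stream."""
    return hashlib.sha256(data).hexdigest()


def audit_copies(copies: dict[str, bytes]) -> tuple[str, list[str]]:
    """Compare copies by digest.

    Returns the digest shared by most copies and the sorted names of
    the copies that differ from it. Expects at least one copy.
    """
    digests = {name: sha256_of(data) for name, data in copies.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    damaged = sorted(name for name, d in digests.items() if d != majority_digest)
    return majority_digest, damaged


# Hypothetical copies of one eBook held in three places.
copies = {
    "main-server": b"Alice was beginning to get very tired...\n",
    "mirror-site": b"Alice was beginning to get very tired...\n",
    "local-dvd": b"Alice was beginning to get very tirod...\n",  # one altered byte
}
digest, damaged = audit_copies(copies)
print(damaged)
```

With three or more copies, a single damaged copy can be identified and replaced from any copy that matches the majority digest. With only two copies, a mismatch can be detected but not attributed.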
Selection
Project Gutenberg aims to make digitised versions of popular literature
and reference materials in the public domain freely accessible to the
general public. Once copyright expires, publications can be freely
replicated and distributed. Many of these works are out of print. By
digitising out-of-print works, Project Gutenberg feels that it is
saving the publications from 'obscurity and ultimate oblivion' (13).
Broadly, the texts can be classified into three categories: light
literature (such as Alice in Wonderland), heavy literature (such as
Shakespeare and Dante) and references (such as Roget's Thesaurus).
Mathematical and scientific works, including the Human Genome, are
also made available. There are no real restrictions on what Project
Gutenberg will make accessible. As long as the material is in the public
domain, it can be digitised and submitted to Project Gutenberg.
However, Project Gutenberg aims to benefit the widest possible
audience and therefore prioritises the digitisation of popular
literature and reference materials over extremely specialised works.
Project Gutenberg already has texts in over 31 languages and is
especially keen to increase its multilingual holdings.
Preservation
Project Gutenberg already has numerous plain text files that are 20-30
years old. In that time, many file formats have come and gone while
plain text is still readable on virtually all computers. The use of plain
text will also help to insure against future obsolescence. All Project
Gutenberg eBooks are created as plain ASCII text files. This means
that people with 'Apples and Ataris all the way to the old homebrew
Z80 computers' (14) as well as Mac and UNIX users are all able to read
the text files. Any open format can be submitted, but the Project
Gutenberg team will also generate a plain ASCII (15) text file. Project
Gutenberg encourages users to create new formats from the plain text
files to suit their individual needs. Once the eBook has been generated
and edited by volunteers, it is uploaded to two main servers. The first is
the Project Gutenberg site itself and the other is the Internet Archive
site. From this point, mirror sites can download the files and store
redundant copies on their own servers.
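As an illustration of what "plain ASCII" means in practice, the following sketch reports every byte in a file's contents that falls outside the 7-bit ASCII range. It is only an example of the kind of check a volunteer could run before submitting a text. The function names are hypothetical and do not describe Project Gutenberg's own tooling.

```python
def non_ascii_positions(data: bytes) -> list[tuple[int, int]]:
    """Return (line_number, byte_value) for every byte above 0x7F."""
    problems = []
    for line_number, line in enumerate(data.splitlines(), start=1):
        for byte in line:
            if byte > 0x7F:  # outside the 7-bit ASCII range
                problems.append((line_number, byte))
    return problems


def is_plain_ascii(data: bytes) -> bool:
    """True when the data contains only 7-bit ASCII bytes."""
    return not non_ascii_positions(data)


# A pure ASCII sample and a Latin-1 sample containing an accented letter.
ascii_sample = b"Alice\r\nin Wonderland\r\n"
latin1_sample = "na\u00efve\n".encode("latin-1")

print(is_plain_ascii(ascii_sample))
print(non_ascii_positions(latin1_sample))
```

Reporting the line number of each offending byte makes it easier to find and transliterate the character by hand rather than silently dropping it.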
Project Gutenberg uses the unique eBook number as the file name.
Therefore, if the eBook is the 10,001st plain text file created, it will
be named 10001.txt. Project Gutenberg will accept as many open file
formats as volunteers are willing to submit, but will also generate a
plain