Sunday, July 15, 2012


FYI France : Europeana digital library Newspapers Project

Worth-a-visit -- for any fans or foes of Google Digital Libraries, or GoogleBooks, GoogleScholar, Google Books Library Project, or Amazon's Kindle, or Apple's iBooks Author or iTunes Producer, or Barnes & Noble's Nook, or the BnF's Gallica, or Project Gutenberg's Project Gutenberg, or Artelittera Téléchargement, or ebooks or Epub, etc. -- or the many online digital Theories Of Everything surrounding all & each -- it's Europeana's recent announcement --

"Press Release, The Hague, 26th of June 2012 : Launch of the Europeana Newspapers Project"

"A group of 17 European partner institutions have joined forces in the Europeana Newspapers project to, over the next 3 years, provide more than 18 million newspaper pages to the online service Europeana."

For any of us who have fought the various battles involved in newsprint -- from acid paper to indexing or more often the inadequacy or complete lack thereof, from storage questions to microfilm's manifold issues and complex secondary and tertiary intellectual property agendas, and above all the thorny lineages of news outfits, which change their names and swallow one another or get swallowed as often as the rest of us change our socks -- the sheer courage of such an announcement is impressive...

 

"Europeana is a single access point to millions of digitised books, paintings, films, museum objects and archival records sourced from throughout Europe. The Europeana Newspapers project is funded under the Competitiveness and Innovation Framework Program 2007-2013 of the European Commission with the aim of aggregation and refinement of newspaper content through The European Library.

"Each library participating in the project will distribute digitised newspapers and full-text via Europeana. The project aims to make the newspaper content directly accessible for users through a special interface within the content browser. This will be integrated into the Europeana portal and will allow queries of phrases or single words within the newspapers' texts. This goes far beyond the standard libraries catalogue search functions which usually allow the searching by date or title only."

As to that last, well... Indexation of newspapers anywhere in the past has ranged from none-at-all to erroneous -- even fine attempts have often run afoul of the "editions game", their indexing failing to account accurately for differences in an article's updated version, or even for whether it was published at all, across any given outfit's West Coast or Weekend or Local or National or European or Far East etc. "edition"... Researchers, and research librarians, tear their hair...
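For the technically curious, here is a minimal Python sketch of the contrast the press release draws, between a catalogue search by title or date and a full-text query for a phrase. It is an illustration only, not the Europeana interface: the sample pages, the field names and the functions `catalogue_search` and `phrase_search` are all invented for the purpose, and the index is a plain positional inverted index.

```python
import re
from collections import defaultdict

# Hypothetical sample pages, invented for illustration only.
# Two editions of the same title and date carry different text.
PAGES = [
    {"title": "The Daily Example", "date": "1912-04-16", "edition": "Morning",
     "text": "Great liner lost at sea. Rescue ships race to the scene."},
    {"title": "The Daily Example", "date": "1912-04-16", "edition": "Evening",
     "text": "Great liner lost at sea. Survivors reach port aboard rescue ships."},
    {"title": "The Weekly Sample", "date": "1912-04-20", "edition": "National",
     "text": "Inquiry opens into the loss of the great liner."},
]


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def build_index(pages):
    """Positional inverted index: word -> page id -> list of word positions."""
    index = defaultdict(lambda: defaultdict(list))
    for page_id, page in enumerate(pages):
        for pos, word in enumerate(tokenize(page["text"])):
            index[word][page_id].append(pos)
    return index


def catalogue_search(pages, title=None, date=None):
    """Catalogue-style search: match on title and/or date only."""
    return [i for i, p in enumerate(pages)
            if (title is None or p["title"] == title)
            and (date is None or p["date"] == date)]


def phrase_search(index, phrase):
    """Full-text search: ids of pages containing the words consecutively."""
    words = tokenize(phrase)
    if not words:
        return []
    hits = []
    for page_id, positions in index.get(words[0], {}).items():
        for start in positions:
            # Every later word must sit exactly k positions after the first.
            if all(start + k in index.get(w, {}).get(page_id, [])
                   for k, w in enumerate(words[1:], start=1)):
                hits.append(page_id)
                break
    return sorted(hits)


INDEX = build_index(PAGES)
print(catalogue_search(PAGES, date="1912-04-16"))   # [0, 1]
print(phrase_search(INDEX, "survivors reach port"))  # [1]
```

The date search cannot tell the Morning edition from the Evening one, while the phrase query finds only the edition that actually printed the words, provided each edition is indexed as its own page.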

 

"The project addresses challenges linked with digitised newspapers such as Optical Character Recognition (OCR), Optical Layout Recognition (OLR), article segmentation and page class recognition, and named entity recognition (NER). OCR is the electronic conversion of scanned images of handwritten, typewritten or printed text into machine-encoded text. OLR is concerned with the detection and separation of articles on a scanned page with more than one article. NER seeks to locate entities in the full text and to classify them according to standardised names for persons, locations, and organisations."

It will be fascinating to see what new tricks -- techniques & approaches & degrees of understanding -- the Europeans will bring to bear on these old problems, some of which are very old indeed. Language policy and publication have struggled with weird character sets and layout and naming conventions for millennia, in Europe: through several Ages of Incunabula and various publication formats -- manuscript, print, radio & movies & tv, and now digital -- the problems always have been not just technical, also legal & political & social, cultural. What improvements will the latest digital innovations bring -- what new wrinkles in new solutions to the very old problems?
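Of the techniques the press release names, NER is the easiest to show in miniature. What follows is a toy sketch, assuming a tiny hand-made gazetteer (a lookup table of known names), and it stands in for, rather than reproduces, whatever the project itself uses: production NER systems generally rely on statistical models trained on annotated text. The gazetteer entries and the function `find_entities` are hypothetical.

```python
import re

# Hypothetical gazetteer: surface form -> (entity class, standardised name).
# A real project would draw on authority files, not a hand-typed table.
GAZETTEER = {
    "berlin": ("LOCATION", "Berlin"),
    "den haag": ("LOCATION", "The Hague"),
    "the hague": ("LOCATION", "The Hague"),
    "staatsbibliothek zu berlin": ("ORGANISATION", "Berlin State Library"),
    "british library": ("ORGANISATION", "The British Library"),
    "julius caesar": ("PERSON", "Julius Caesar"),
}

# Longest gazetteer entry, counted in words.
MAX_WORDS = max(len(key.split()) for key in GAZETTEER)


def find_entities(text):
    """Scan left to right, preferring the longest gazetteer match at each word."""
    words = re.findall(r"\w+", text.lower())
    found = []
    i = 0
    while i < len(words):
        for n in range(min(MAX_WORDS, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in GAZETTEER:
                found.append(GAZETTEER[candidate])
                i += n  # skip past the matched words
                break
        else:
            i += 1  # no entry starts at this word
    return found


sample = "The Staatsbibliothek zu Berlin and the British Library met in The Hague."
for entity_class, name in find_entities(sample):
    print(entity_class, name)
```

Preferring the longest match keeps "Berlin" inside "Staatsbibliothek zu Berlin" from being tagged as a location in its own right, and mapping "Den Haag" and "The Hague" to one standardised name is the classification step the press release describes. OCR errors in the names, which this sketch ignores, are where the old problems return.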

 

"The project will also evaluate the quality of the refinement technologies and transform the local metadata into the Europeana Data Model standard in close collaboration with stakeholders from the public and private sector."

As I just mentioned, "The problems always have been not just technical, also legal & political & social, cultural..."

 

"The Europeana Newspapers project is co-ordinated by the Staatsbibliothek zu Berlin - Preußischer Kulturbesitz. Follow the advancements of the Europeana Newspapers project at www.europeana-newspapers.eu. For any further information please contact Hans-Jörg Lieder or Thorsten Siegmann at Staatsbibliothek zu Berlin, via info@europeana-newspapers.eu"

Project Partners:

  • Berlin State Library
  • National Library of the Netherlands
  • National Library of Estonia
  • Austrian National Library
  • University of Helsinki
  • National Library of Finland
  • Hamburg State and University Library
  • National Library of France
  • National Library of Poland
  • CCS Content Conversion Specialists GmbH
  • LIBER Foundation
  • National Library of Latvia
  • National Library of Turkey
  • University of Beograd
  • University of Innsbruck
  • Dr. Friedrich Tessmann Library
  • The British Library
  • University of Salford
  • The European Library

 

"Europeana is a multi-lingual online collection of millions of digitized items from European museums, libraries, archives and audiovisual collections. Currently Europeana gives integrated access to 23 million books, films, paintings, museum objects and archival documents from some 2,200 content providers from across Europe."

 

--oOo--

 

A Note:

Kudos to these librarians, and others, anywhere and everywhere, who undertake newspapers recon projects such as this one. As any exhausted & double-visioned microfilm or microfiche user will attest, newspaper research is a difficult task -- yet any historical researcher also knows its inestimable value: in research there is little hard evidence comparable to the immediacy of news reports and current events analysis.

Well-considered weighty tomes written many years later may get the history right. But history is not what people experience: they experience the news, the current events, with all their uncertainties, rumors, unverified reports, unlikely sets of always-complicated circumstances. Without some knowledge of these, and an appreciation of their significance in people's real lives, we cannot appreciate the significance of events in our own -- their importance, also very often their lack of importance. Preserving "the news", then, helps us greatly: we can better see the events of history through the eyes of those who were there, and that helps us better understand our own.

Any set of decision-maker "mémoires" can be used to support this -- all of them accounts from "the fog of war", valuable as much for their reminder that wartime gets foggy and decision-making takes place in times of uncertainty, as for their reminder that certainties and conclusions reached at the time very often look very odd later on -- from Julius Caesar's analysis of his invasions of Gaul, to Condoleezza Rice's reasons offered, in her lucid recent mémoires, for the US invasion of Iraq. Yet the mémoires make great reading, the weighty tomes more often not so -- as a famous anecdote explains,

    "Professional historians do not esteem William Shirer. His historical books are simplistic in interpretation, unbalanced in coverage, superficially researched and full of wrongheaded theories. Worst of all, they sell like crazy..."

      -- William Sheridan Allen, historian

So let's save the newspapers! We'll want to know what they said: what they told us, what we told others in them, how we all felt about it at the time -- all before we'd had a chance to think too much but nevertheless were forced to act.

 

Jack Kessler, kessler@well.com