Supriya Ghosh (Editor)

Internet Memory Foundation

Type: Non-profit foundation
Website: internetmemory.org/en
Industry: Web archiving and preservation
Founded: 2004 (as European Archive); 2010 (as Internet Memory)
Headquarters: Amsterdam, The Netherlands

The Internet Memory Foundation (formerly the European Archive Foundation) is a non-profit foundation whose purpose is archiving content of the World Wide Web. It supports projects and research that include the preservation and protection of digital media content in various forms to form a digital library of cultural content.

History

The European Archive Foundation was incorporated as a non-profit institution in Amsterdam in 2004. An announcement at the opening of the Cross Media Week in Amsterdam in September 2006 included a quote from Brewster Kahle, the founder of the Internet Archive. Julien Masanès was the foundation's first director. Operating from Amsterdam and Paris, the foundation said it would make public domain collections and web archives freely accessible. Masanès, previously at the Bibliothèque nationale de France, edited a book on web archiving in 2007. The Paris organization is called Internet Memory Research and operates a service known as ArchiveTheNet.

In December 2010, the Foundation changed its name to Internet Memory Foundation to express its goal of preserving internet content for current and future generations.

The foundation collaborates on its web archiving projects with many partners, including cultural and research institutions. These partners include the UK National Archives, the Max Planck Institute, Technische Universität Berlin, the University of Southampton, and the Institut Mines-Télécom. The foundation is also a member of the International Internet Preservation Consortium.

Research

The foundation is involved in research projects that aim to improve web crawling, data extraction, text mining, and preservation technologies in order to support the growth and use of web archives. Its projects are funded by the European Commission through the Seventh Framework Programme.

  • Scalable Preservation Environments (SCAPE, Project No. 270137) runs from February 2011 through July 2014. It is developing an open source, scalable preservation platform.
  • Large-scale, Cross-lingual Trend Mining and Summarization of Real-time Media Streams (TrendMiner, Project No. 287863) runs from November 2011 through October 2014. It aims to develop tools to mine social media, especially across multiple languages.
  • Collect-All ARchives to COmmunity MEMories (ARCOMEM, Project No. 270239) ran from January 2011 through December 2013. It studied the preservation of ephemeral web information, such as that used in social network sites.
  • The Web Archiving in Europe survey ran in December 2010. It assessed the state of web archiving projects across different European institutions.
  • Longitudinal Analytics of Web Archive data (LAWA, Project No. 258105) ran from September 2010 through August 2013. The project experimented with large-scale data analytics for use in the Future Internet Research and Experimentation project.
  • LivingKnowledge (Project No. 231126) ran from February 2009 through January 2012. The goal was to improve navigation and search in large multimodal datasets.
  • Living Web Archives (LiWA, Project No. 216267) ran from February 2008 through January 2011. LiWA developed web archiving methods and tools that aimed to capture a more accurate, "living" archive of the web.
Audio and video

Before focusing on web archiving, the European Archive Foundation collected one of the largest free online classical music collections (more than 800 pieces, from Mozart to Dvořák) and Public Information Films from the British Government, in collaboration with the Netherlands Institute for Sound and Vision and the UK National Archives.

Selective web collection

The foundation archived a snapshot of the Italian web domain, made in collaboration with the National Library of Italy; an archive of political websites of the 25 EU member states, captured during the European constitutional debate; and archives for the following institutions, among others:

  • The National Archives (United Kingdom)
  • National Library of Ireland
  • CERN, Organisation européenne pour la recherche nucléaire (Switzerland)
  • Parliament of the United Kingdom
  • Public Record Office of Northern Ireland
The web crawler used by the project is Heritrix version 3. Heritrix stores the resources it harvests in a “container” file, the ARC file (.arc). The ARC format was later extended into the Web ARChive file format (.warc), which was approved as an international standard in June 2009 (ISO 28500:2009).
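
To make the idea of a WARC “container” more concrete, the following is a minimal sketch in Python of how the records in an uncompressed .warc file could be read. It assumes the general record layout of the format: a version line such as WARC/1.0, a block of named header fields, a blank line, and then a content block whose size is given by the Content-Length field. The function names read_warc_records and build_sample, and all sample values, are hypothetical and chosen only for illustration; this is not the foundation's or Heritrix's own code.

```python
import io


def read_warc_records(stream):
    """Yield (headers, payload) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if line.strip() == b"":
            continue  # blank lines separate consecutive records
        version = line.strip().decode("utf-8")
        if not version.startswith("WARC/"):
            raise ValueError(f"expected a WARC version line, got {version!r}")
        headers = {"version": version}
        # Named fields ("Name: value") continue until the first blank line.
        while True:
            header_line = stream.readline()
            if header_line in (b"\r\n", b"\n", b""):
                break
            name, _, value = header_line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # The content block is exactly Content-Length bytes long.
        payload = stream.read(int(headers["Content-Length"]))
        yield headers, payload


def build_sample():
    """Build two identical, simplified records (illustrative values only)."""
    body = b"hello archive"
    record = b"".join([
        b"WARC/1.0\r\n",
        b"WARC-Type: resource\r\n",
        b"WARC-Target-URI: http://example.org/\r\n",
        b"WARC-Date: 2009-06-01T00:00:00Z\r\n",
        b"WARC-Record-ID: <urn:uuid:00000000-0000-0000-0000-000000000000>\r\n",
        b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n",
        b"\r\n",
        body,
        b"\r\n\r\n",
    ])
    return record * 2


if __name__ == "__main__":
    for headers, payload in read_warc_records(io.BytesIO(build_sample())):
        print(headers["WARC-Type"], headers["WARC-Target-URI"], len(payload))
```

In practice, WARC files are often stored gzip-compressed (.warc.gz), which this sketch does not handle, and a production tool would normally rely on a dedicated WARC library rather than a hand-written parser.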
