Puneet Varma (Editor)

Memento Project


Memento is a project funded by the United States National Digital Information Infrastructure and Preservation Program (NDIIPP) that aims to make Web-archived content more readily discoverable.

The project is being led by the Los Alamos National Laboratory and Old Dominion University.

Rather than expecting people to know about the growing number of Web archives, and to guess which archive might hold an older version of the resource they are looking for, Memento proposes to make archived content discoverable via the original URL that the searcher already knows. Essentially, Memento is an attempt to let users view any web page as it looked on a given date in the past.

Technical description

A variety of web archives exist, each collecting versions of web pages as they existed at particular points in time. Memento allows a user to move seamlessly between these archives in search of the archived version of a page that best matches the datetime they want.

Memento is defined in RFC 7089 as an implementation of the time dimension of content negotiation, as defined by Tim Berners-Lee in 1996. HTTP accomplishes negotiation of content via headers: for example, a client can send Accept to ask for a media type, Accept-Language to ask for a language, Accept-Charset to ask for a character set, and Accept-Encoding to ask for a content encoding. These headers allow clients and servers to find the content that the user desires.

Memento provides the Accept-Datetime request header so that clients can provide a date to the server, and the server can provide the best archived version of a page for that date. This is referred to as datetime negotiation.
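As a rough illustration of datetime negotiation from the client side, the sketch below builds an Accept-Datetime request header from a Python datetime. The helper name accept_datetime_header is hypothetical, and the example date is arbitrary; the only assumption about the wire format is that the value is an HTTP-style date expressed in GMT.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_header(when: datetime) -> dict[str, str]:
    """Return request headers asking for the version that best matches `when`.

    Hypothetical helper for illustration; not part of any Memento library.
    """
    # Convert to UTC so the value can be written with a "GMT" suffix.
    # Note: a naive datetime is interpreted as local time by astimezone().
    when_utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}


headers = accept_datetime_header(
    datetime(2007, 5, 31, 20, 35, 0, tzinfo=timezone.utc)
)
print(headers)  # {'Accept-Datetime': 'Thu, 31 May 2007 20:35:00 GMT'}
```

A client would send these headers along with an ordinary request to a server that supports datetime negotiation.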

To understand Memento fully, one must realize that the Last-Modified header provided by HTTP does not necessarily reflect when a particular version of a web page came into existence. Also, the Last-Modified header may not exist in some cases. To provide more information, the Memento-Datetime header has been introduced to indicate when a specific representation of a web page was observed on the web.
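To make the distinction concrete, the following sketch reads Memento-Datetime from a set of response headers and compares it with Last-Modified when both are present. The header values are invented for illustration, and the function name observed_datetime is hypothetical. A plain dict is used for brevity, so header names are matched case-sensitively here, unlike in real HTTP.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional


def observed_datetime(headers: dict[str, str]) -> Optional[datetime]:
    """Return when the representation was observed, or None if unknown.

    Only Memento-Datetime is consulted; Last-Modified is deliberately
    ignored because it may describe something else entirely.
    """
    value = headers.get("Memento-Datetime")
    return parsedate_to_datetime(value) if value else None


# Invented response headers for an archived page.
response_headers = {
    "Memento-Datetime": "Thu, 31 May 2007 20:35:00 GMT",
    "Last-Modified": "Fri, 22 Jun 2007 10:00:00 GMT",
}

observed = observed_datetime(response_headers)
last_modified = parsedate_to_datetime(response_headers["Last-Modified"])

# The two values answer different questions, so they need not agree.
print("observed on the web:", observed.isoformat())
print("differs from Last-Modified:", observed != last_modified)
```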

Memento finds the best archived web page for the datetime supplied by the user in a three-step process, which works as follows:

  1. The Memento client contacts the original resource to see if it will return information about a TimeGate (URI-G) in the Link header.
  2. The Memento client then sends the datetime desired by the user, in an Accept-Datetime request header, to the URI-G discovered in the previous step. Because most resources on the web do not yet return a URI-G, most Memento clients fall back on a predefined list of TimeGates for this step. The TimeGate responds with a 302 redirection status code and a Location header telling the client where to find the archived resource (URI-M).
  3. The Memento client then requests the archived resource (URI-M) as it would any other web page. The response for the URI-M contains a Memento-Datetime header indicating when it was observed on the web.

In this way, Memento utilizes the existing infrastructure of HTTP to accomplish the goals of finding the best archived web page based on a user's desired datetime and URI.
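The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference client: the fetch callable is a hypothetical stand-in for an HTTP client that does not follow redirects, DEFAULT_TIMEGATE is an invented fallback prefix, the Link header parser is deliberately simplified, and fake_fetch returns canned responses so the example runs without network access.

```python
import re
from typing import Callable, Optional

# Hypothetical transport: (url, request_headers) -> (status, response_headers).
Fetch = Callable[[str, dict[str, str]], tuple[int, dict[str, str]]]

# Invented fallback TimeGate prefix, used when no URI-G is advertised.
DEFAULT_TIMEGATE = "http://timegate.example.org/"


def find_timegate(link_header: str) -> Optional[str]:
    """Return the target of the first link whose rel includes 'timegate'."""
    # Simplified parsing: each link is "<target>" followed by its parameters.
    for target, params in re.findall(r"<([^>]*)>([^<]*)", link_header):
        rel = re.search(r'\brel="?([^";,]*)"?', params)
        if rel and "timegate" in rel.group(1).split():
            return target
    return None


def resolve_memento(
    uri_r: str, accept_datetime: str, fetch: Fetch
) -> tuple[str, Optional[str]]:
    """Return (URI-M, Memento-Datetime) for the original resource uri_r."""
    # Step 1: ask the original resource whether it advertises a TimeGate.
    _, headers = fetch(uri_r, {})
    uri_g = find_timegate(headers.get("Link", "")) or DEFAULT_TIMEGATE + uri_r

    # Step 2: submit the desired datetime to the TimeGate.
    status, headers = fetch(uri_g, {"Accept-Datetime": accept_datetime})
    if status != 302 or "Location" not in headers:
        raise RuntimeError(f"TimeGate did not redirect (status {status})")
    uri_m = headers["Location"]

    # Step 3: request the archived resource like any other web page.
    _, headers = fetch(uri_m, {})
    return uri_m, headers.get("Memento-Datetime")


def fake_fetch(url: str, request_headers: dict[str, str]) -> tuple[int, dict[str, str]]:
    """Canned responses standing in for the network (all values invented)."""
    if url == "http://a.example.org/":
        link = '<http://tg.example.org/http://a.example.org/>; rel="timegate"'
        return 200, {"Link": link}
    if url == "http://tg.example.org/http://a.example.org/":
        location = "http://archive.example.org/20070531/http://a.example.org/"
        return 302, {"Location": location}
    return 200, {"Memento-Datetime": "Thu, 31 May 2007 20:35:00 GMT"}


uri_m, memento_datetime = resolve_memento(
    "http://a.example.org/", "Fri, 01 Jun 2007 00:00:00 GMT", fake_fetch
)
print(uri_m)
print(memento_datetime)
```

Passing the transport in as a parameter keeps the negotiation logic separate from any particular HTTP library; a real client would supply a function that issues actual requests with automatic redirects disabled.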

References

Memento Project Wikipedia