Video synopsis (often abbreviated V.S.) is an approach to creating a short summary of a long video. It tracks and analyzes moving objects (also called events) and converts video streams into a database of objects and activities. The technology has specific applications in video surveillance, where, despite technological advances and the growing deployment of CCTV (closed-circuit television) cameras, viewing and analyzing recorded footage remains a costly, labor-intensive, and time-consuming task.


Technology overview

Video synopsis combines a visual summary of stored video together with an indexing mechanism.

When a summary is required, all objects from the target period are collected and shifted in time to create a much shorter synopsis video showing maximum activity. The synopsis clip is generated in real time, and objects and activities that originally occurred at different times are displayed simultaneously (see Figure 1 - Screen shots: Before and after Video Synopsis).

The process begins by detecting and tracking objects of interest. Each object is represented as a "tube" in the "space-time" volume formed by the video frames. Objects are detected and stored in a database in approximately real time.
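The tube representation can be illustrated with a small data structure. The following is a minimal sketch, assuming each object is tracked as one bounding box per consecutive frame; the names `Tube`, `Box`, and `tubes_active_in` are hypothetical and are not taken from any particular product or library.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical type for illustration: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Tube:
    """One tracked object: a sequence of boxes over consecutive frames."""
    object_id: int
    start_frame: int   # frame index of the first detection
    boxes: List[Box]   # one bounding box per consecutive frame

    @property
    def end_frame(self) -> int:
        """Index of the last frame in which the object appears (inclusive)."""
        return self.start_frame + len(self.boxes) - 1

    def box_at(self, frame: int) -> Optional[Box]:
        """Return the box at a given frame, or None if the object is absent."""
        if self.start_frame <= frame <= self.end_frame:
            return self.boxes[frame - self.start_frame]
        return None


def tubes_active_in(tubes: List[Tube], first_frame: int, last_frame: int) -> List[Tube]:
    """Select tubes that overlap the frame interval [first_frame, last_frame]."""
    return [t for t in tubes
            if t.start_frame <= last_frame and t.end_frame >= first_frame]
```

A query such as `tubes_active_in(all_tubes, 0, 50)` stands in for the database lookup that retrieves the objects belonging to a requested period.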

Following a request to summarize a time period, all objects from that period are extracted from the database and shifted in time to create a much shorter summary video containing maximum activity (see Figure 2 - Tube packing).
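The idea of tube packing can be sketched with a simple greedy heuristic. This is an illustrative stand-in, not the optimization used by any actual system: each tube is assumed to be a list of bounding boxes (one per frame), and each is given the earliest start frame in the synopsis at which it does not spatially collide with tubes already placed. The function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def collides(boxes_a: List[Box], start_a: int,
             boxes_b: List[Box], start_b: int) -> bool:
    """True if two tubes overlap spatially in any shared synopsis frame."""
    first = max(start_a, start_b)
    last = min(start_a + len(boxes_a), start_b + len(boxes_b))  # exclusive
    return any(boxes_overlap(boxes_a[f - start_a], boxes_b[f - start_b])
               for f in range(first, last))


def pack_tubes(tubes: List[List[Box]]) -> List[int]:
    """Greedily give each tube the earliest collision-free start frame."""
    placed = []  # (boxes, start) for tubes already positioned
    starts = []
    for boxes in tubes:
        start = 0
        # Slide the tube later until it no longer collides with any placed tube.
        while any(collides(boxes, start, other, s) for other, s in placed):
            start += 1
        placed.append((boxes, start))
        starts.append(start)
    return starts
```

In this sketch, two objects that cross the same region are played one after the other, while an object in a different part of the frame is shown at the same time as the first, which is how activities from different times end up displayed together.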

Real-time rendering is used to generate the summary video after the objects are re-timed. This allows the end user to control object/event density.
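One way to picture density control is as a cap on how many objects may be visible at once. The sketch below is a simplified assumption, not a description of any product's renderer: it ignores spatial position, treats each object only as a duration in frames, and uses a hypothetical `max_density` parameter to represent the user's setting.

```python
import heapq
from typing import List


def retime_with_density(durations: List[int], max_density: int) -> List[int]:
    """Assign start frames so at most max_density objects are visible at once."""
    if max_density < 1:
        raise ValueError("max_density must be at least 1")
    # Each "lane" holds the frame at which it next becomes free.
    lanes = [0] * max_density
    heapq.heapify(lanes)
    starts = []
    for duration in durations:
        start = heapq.heappop(lanes)  # earliest free lane
        starts.append(start)
        heapq.heappush(lanes, start + duration)
    return starts


def synopsis_length(durations: List[int], starts: List[int]) -> int:
    """Total number of frames in the resulting synopsis."""
    return max((s + d for s, d in zip(starts, durations)), default=0)
```

Raising `max_density` shortens the synopsis at the cost of a busier picture; lowering it has the opposite effect.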

Video synopsis technology was invented by Prof. Shmuel Peleg of The Hebrew University of Jerusalem, Israel, and is being developed under commercial license by BriefCam, Ltd. BriefCam licensed the technology from Yissum, which owns the patents registered for it.

Recent advances

Recent advances in video synopsis have produced methods that focus on collecting key points (or key frames) from the long uncut video and presenting them as a chain of "key" events that summarize it. This is only one of many methods used in the literature for this task. More recently, such event-driven methods have focused on relating objects across frames in a more semantic way, an approach that has been called story-driven video summarization. These methods have been shown to work well in egocentric settings, where the video is a point-of-view recording made by a single person or a group of people.
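Key-frame collection can be illustrated with a very basic difference threshold. This sketch is far simpler than the story-driven methods described above and is offered only as an assumption-laden example: frames are assumed to be equal-length lists of pixel intensities, and `threshold` is a hypothetical tuning parameter.

```python
from typing import List, Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute pixel difference between two equal-length frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(frames: List[Sequence[float]], threshold: float) -> List[int]:
    """Return indices of frames that differ enough from the last key frame."""
    if not frames:
        return []
    keys = [0]  # the first frame always starts the summary
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[keys[-1]]) > threshold:
            keys.append(i)
    return keys
```

The selected indices form the chain of "key" frames; a semantic method would replace the pixel difference with a measure of how objects and events relate across frames.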


