Samiksha Jaiswal (Editor)

EPUB

Filename extension: .epub
Magic number: PK 0x03 0x04
Internet media type: application/epub+zip
Developed by: International Digital Publishing Forum (IDPF)
Initial release: September 2007
Latest release: 3.1 (January 5, 2017)

EPUB is an e-book file format with the extension .epub that can be downloaded and read on devices like smartphones, tablets, computers, or e-readers. It is a technical standard published by the International Digital Publishing Forum (IDPF). The term is short for electronic publication and is sometimes styled ePub. EPUB became an official standard of the IDPF in September 2007, superseding the older Open eBook standard. The Book Industry Study Group endorses EPUB 3 as the format of choice for packaging content and has stated that the global book publishing industry should rally around a single standard. EPUB is the most widely supported vendor-independent XML-based (as opposed to PDF) e-book format; that is, it is supported by the largest number of hardware readers.

History

EPUB 2.0 was approved in October 2007, with a maintenance update (2.0.1) approved in September 2010. The EPUB 3.0 specification became effective in October 2011 and was superseded by a minor maintenance update (3.0.1) in June 2014. Major new features in EPUB 3 include MathML support and support for precise layout or specialized formatting (Fixed Layout Documents), such as for comic books. The current version of EPUB is 3.1, effective January 5, 2017. In this version the text of the specification was reorganized and cleaned up; the format supports remotely hosted resources and new font formats (WOFF 2.0 and SFNT), and relies more on plain HTML and CSS.

In May 2016, IDPF members approved a merger with the World Wide Web Consortium (W3C), "to fully align the publishing industry and core Web technology".

Features

The format and many readers support the following:

  • Reflowable documents: text is optimized for the particular display
  • Fixed-layout content: pre-paginated content can be useful for certain kinds of highly designed content, such as illustrated books intended only for larger screens (for example, tablets)
  • Like an HTML web site, the format supports inline raster and vector images, metadata, and CSS styling
  • Page bookmarking
  • Passage highlighting and notes
  • A library that stores books and can be searched
  • Resizable fonts, and changeable text and background colors
  • Support for a subset of MathML
  • Digital rights management (DRM) as an optional layer

The EPUB specification does not enforce or suggest a particular DRM scheme. This could affect the level of support for various DRM systems on devices and the portability of purchased e-books. Such DRM incompatibility may segment the EPUB format along the lines of DRM systems, undermining the advantages of a single standard format and confusing the consumer.

Adoption and market share

EPUB is also widely used by software readers such as iBooks on iOS and Google Books on Android, but notably not by Amazon Kindle e-readers. iBooks also supports the proprietary iBook format, which is based on the EPUB format but depends on code from the iBooks app to function.

As final product

In the commercial e-book market, where PDF is generally not sold by e-book retailers, EPUB's share is very low. The main EPUB producers are suppliers of public-domain and openly licensed content, such as Project Gutenberg, PubMed Central, SciELO, and many others. In that context, EPUB use is growing and replacing the PDF format.

Data interchange

EPUB is a popular input to the e-book creation process because it is an open format based on HTML, while Amazon's format is proprietary. EPUB is the "first step" content format in many production processes and supply chains.

Implementation

An EPUB file is a ZIP archive that contains, in effect, a website, including HTML files, images, CSS style sheets, and other assets. It also contains metadata. EPUB 3 is the latest major version. By using HTML5, publications can contain video, audio, and interactivity, just like websites in web browsers.

Container

An EPUB publication is delivered as a single file. This file is an unencrypted zipped archive containing a set of interrelated resources.

An OCF Abstract Container defines a file system model for the contents of the container. The file system model uses a single common root directory for all contents in the container. All (non-remote) resources for publications are in the directory tree headed by the container's root directory, though EPUB mandates no specific file system structure for this. The file system model includes a mandatory directory named META-INF that is a direct child of the container's root directory. META-INF stores container.xml.

The first file in the archive must be the mimetype file. It must be unencrypted and uncompressed so that non-ZIP utilities can read it, and it must be an ASCII file that contains the string application/epub+zip. This file gives applications a more reliable way to identify the media type of the file than the .epub extension alone.
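
The mimetype rule above can be sketched with Python's standard zipfile module. This is a minimal illustration, not a full EPUB writer: the function names build_minimal_container and has_valid_mimetype_entry are hypothetical, and the check covers only the ZIP magic number and the mimetype entry, not the rest of the specification.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def build_minimal_container(extra_files: dict[str, bytes]) -> bytes:
    """Return ZIP bytes whose first entry is an uncompressed mimetype file.

    Hypothetical helper: it does not produce a complete, valid EPUB.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry is written first and stored without compression.
        archive.writestr(
            zipfile.ZipInfo("mimetype"),
            EPUB_MIMETYPE,
            compress_type=zipfile.ZIP_STORED,
        )
        # Every other resource may be compressed.
        for name, payload in extra_files.items():
            archive.writestr(name, payload, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def has_valid_mimetype_entry(data: bytes) -> bool:
    """Check the ZIP magic number and the mimetype entry only."""
    if not data.startswith(b"PK\x03\x04"):
        return False
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        entries = archive.infolist()
        if not entries:
            return False
        first = entries[0]
        return (
            first.filename == "mimetype"
            and first.header_offset == 0  # physically first in the archive
            and first.compress_type == zipfile.ZIP_STORED
            and archive.read(first) == EPUB_MIMETYPE
        )
```

For example, has_valid_mimetype_entry(build_minimal_container({"META-INF/container.xml": b"<container/>"})) returns True, while an archive that stores mimetype after another entry fails the check.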

An example file structure:

    --ZIP Container--
    mimetype
    META-INF/
      container.xml
    OEBPS/
      content.opf
      chapter1.xhtml
      ch1-pic.png
      css/
        style.css
        myfont.otf
      toc.ncx

There must be a META-INF directory containing container.xml. This file points to the file defining the contents of the book, the OPF file, though additional alternative rootfile elements are allowed. Apart from mimetype and META-INF/container.xml, the other files (OPF, NCX, XHTML, CSS, and image files) are traditionally put in a directory named OEBPS.
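
As a rough illustration of how container.xml points to the OPF file, the sketch below embeds a sample container.xml and extracts its rootfile paths with the standard library. The sample document and the function name find_rootfiles are assumptions made for illustration; the OEBPS/content.opf path follows the traditional layout described above.

```python
import xml.etree.ElementTree as ET

# Namespace of the OCF container document.
CONTAINER_NS = {"ocf": "urn:oasis:names:tc:opendocument:xmlns:container"}

# Illustrative container.xml with a single rootfile entry.
SAMPLE_CONTAINER_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def find_rootfiles(container_xml: bytes) -> list[str]:
    """Return the full-path of every rootfile element, in document order."""
    root = ET.fromstring(container_xml)
    return [
        rootfile.attrib["full-path"]
        for rootfile in root.findall("ocf:rootfiles/ocf:rootfile", CONTAINER_NS)
    ]


print(find_rootfiles(SAMPLE_CONTAINER_XML))
```

The paths are relative to the container's root directory, so a reader would open the first returned path inside the ZIP archive to load the package document.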

Publication

The EPUB container must contain:

  • At least one content document.
  • One navigation document.
  • One package document listing all publication resources. This file should use the file extension .opf. It contains metadata, a manifest, fallback chains, bindings, and a spine. The spine is an ordered sequence of ID references defining the default reading order.

The EPUB container may contain:

  • Style sheets.
  • PLS documents.
  • Media overlay documents.
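
The relationship between the manifest and the spine in the package document can be sketched as follows. The sample package document is illustrative and deliberately incomplete (it is not guaranteed to pass validation), and the function name reading_order is hypothetical. The sketch resolves each spine ID reference against the manifest to obtain the default reading order.

```python
import xml.etree.ElementTree as ET

# Namespace of the package document.
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Illustrative, simplified package document.
SAMPLE_OPF = b"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Book</dc:title>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="css/style.css" media-type="text/css"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>
"""


def reading_order(opf_bytes: bytes) -> list[str]:
    """Resolve the spine's ID references to manifest hrefs, in spine order."""
    root = ET.fromstring(opf_bytes)
    # Map each manifest item ID to the resource it names.
    manifest = {
        item.attrib["id"]: item.attrib["href"]
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }
    # The spine lists item IDs in the default reading order.
    return [
        manifest[itemref.attrib["idref"]]
        for itemref in root.findall("opf:spine/opf:itemref", OPF_NS)
    ]


print(reading_order(SAMPLE_OPF))
```

Resources such as the style sheet appear in the manifest but not in the spine, so they are part of the publication without being part of the reading order.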

Contents

Content documents include HTML5 content, navigation documents, SVG documents, scripted content documents, and fixed-layout documents. Contents also include CSS and PLS documents. Navigation documents supersede the NCX grammar used in EPUB 2.

Media overlays

Books with synchronized audio narration are created in EPUB 3 by using media overlay documents to describe the timing for the pre-recorded audio narration and how it relates to the EPUB Content Document markup. The file format for Media Overlays is defined as a subset of SMIL.

Software

Many editors exist, including calibre, Sigil, LaTeX, and Genebook. An open-source tool called epubcheck can be used to validate and detect errors in the structural markup (OCF, OPF, OPS), image files, and XHTML files. The tool can be run from the command line or used as a library in applications and web apps; it is also available through EPUB Validator. Readers exist for all major hardware platforms except Amazon Kindle; they include Google Play Books (Android and iOS) and Apple iBooks (macOS and iOS).

References

EPUB, Wikipedia