Samiksha Jaiswal (Editor)

XSLT

Updated on
Edit
Like
Comment
Share on FacebookTweet on TwitterShare on LinkedInShare on Reddit
Paradigm
  
First appeared
  
1998

Preview release
  
3.0

Developer
  
W3C

Stable release
  
2.0

Website
  
W3C Transformation

XSLT (Extensible Stylesheet Language Transformations) is a language for transforming XML documents into other XML documents, or other formats such as HTML for web pages, plain text or XSL Formatting Objects, which may subsequently be converted to other formats, such as PDF, PostScript and PNG. XSLT 1.0 is widely supported in modern web browsers.

Contents

The original document is not changed; rather, a new document is created based on the content of an existing one. Typically, input documents are XML files, but anything from which the processor can build an XQuery and XPath Data Model can be used, such as relational database tables or geographical information systems.

XSLT is a Turing-complete language, meaning it can specify any computation that can be performed by a computer.

History

XSLT is influenced by functional languages, and by text-based pattern matching languages like SNOBOL and AWK. Its most direct predecessor is DSSSL, which did for SGML what XSLT does for XML.

  • XSLT 1.0: XSLT was part of the World Wide Web Consortium (W3C)'s Extensible Stylesheet Language (XSL) development effort of 1998–1999, a project that also produced XSL-FO and XPath. Some members of the standards committee that developed XSLT, including James Clark, the editor, had previously worked on DSSSL. XSLT 1.0 was published as a W3C recommendation in November 1999.
  • XSLT 2.0: after an abortive attempt to create a version 1.1 in 2001, the XSL working group joined forces with the XQuery working group to create XPath 2.0, with a richer data model and type system based on XML Schema. Building on this is XSLT 2.0, developed under the editorship of Michael Kay, which reached recommendation status in January 2007. As of 2010, however, XSLT 1.0 is still widely used, since 2.0 is not supported natively in web browsers or for environments like LAMP.
  • XSLT 3.0: is a W3C Candidate Recommendation as at February 2017. The main new features are:
  • Streaming transformations: in previous versions the entire input document had to be read into memory before it could be processed, and output could not be written until processing had finished (although Saxon does have a streaming extension). The working draft allows XML streaming which will be useful for processing documents too large to fit in memory, or when transformations are chained in XML Pipelines.
  • Improvements to the modularity of large stylesheets.
  • Improved handling of dynamic errors with, for example, an xsl:try instruction.
  • Functions can now be arguments to other (higher-order) functions.
  • Design and processing model

    The XSLT processor takes one or more XML source documents, plus one or more XSLT stylesheets, and processes them to produce an output document. In contrast to widely implemented imperative programming languages like C, XSLT is declarative. The basic processing paradigm is pattern matching. Rather than listing an imperative sequence of actions to perform in a stateful environment, template rules only define how to handle a node matching a particular XPath-like pattern, if the processor should happen to encounter one, and the contents of the templates effectively comprise functional expressions that directly represent their evaluated form: the result tree, which is the basis of the processor's output.

    The processor follows a fixed algorithm. First, assuming a stylesheet has already been read and prepared, the processor builds a source tree from the input XML document. It then processes the source tree's root node, finds the best-matching template for that node in the stylesheet, and evaluates the template's contents. Instructions in each template generally direct the processor to either create nodes in the result tree, or to process more nodes in the source tree in the same way as the root node. Output derives from the result tree.

    Processor implementations

  • Altova RaptorXML Server: cross-platform engine that supports XSLT 1.0 and 2.0, most of XPath 3.0, and some features from the XSLT 3.0 working draft; also XQuery. Allows command line operations and interfaces to COM, Java, and .NET and also includes a built-in HTTP server.
  • Exselt: A streaming XSLT 3.0 processor which runs on the .NET framework written in F#. Fully supports the XSLT 3.0 Draft, XPath 3.0 Recommendation and XDM 3.0 Recommendation standards.
  • libxslt is a free library released under the MIT License that can be reused in commercial applications. It is based on libxml and implemented in C for speed and portability. It supports XSLT 1.0 and EXSLT extensions.
  • It can be used at the command line via xsltproc which is included in macOS and many Linux distributions, and can be used on Windows via Cygwin.
  • The WebKit and Blink layout engines, used for example in the Safari and Chrome web browsers respectively, uses the libxslt library to do XSL transformations.
  • Bindings exist for Python, Perl, Ruby, PHP, Common Lisp, Tcl, and C++.
  • MSXML and .NET. MSXML includes an XSLT 1.0 processor. From MSXML 4.0 it includes the command line utility msxsl.exe.
  • Saxon: an XSLT (2.0 and partial 3.0) and XQuery 3.0 processor with open-source and proprietary versions for stand-alone operation and for Java, JavaScript and .NET.
  • QuiXSLT: an XSLT 3.0 processor doing streaming implemented in Java by Innovimax and INRIA.
  • Xalan: an open source XSLT 1.0 processor from the Apache Software Foundation available stand-alone and for Java and C++.
  • Web browsers: Safari, Chrome, Firefox, Opera and Internet Explorer all support XSLT 1.0. None support XSLT 2.0 natively, although the third party products like Saxon-CE and Frameless can provide this functionality. Browsers can perform on-the-fly transformations of XML files and display the transformation output in the browser window. This is done either by embedding the XSL in the XML document or by referencing a file containing XSL instructions from the XML document. The latter may not work with Chrome because of its security model.
  • XMLStarlet is "a set of command line utilities (tools) which can be used to transform, query, validate, and edit XML documents". It can "apply XSLT stylesheets to XML documents" and does not require Java. It uses libxslt and supports XSLT 1.0.
  • Xuriella and Plexippus-xpath are XSLT 1.0 processors written in Common Lisp.
  • Performance

    Most early XSLT processors were interpreters. More recently, code generation is increasingly common, using portable intermediate languages (such as Java bytecode or .NET Common Intermediate Language) as the target. However, even the interpretive products generally offer separate analysis and execution phases, allowing an optimized expression tree to be created in memory and reused to perform multiple transformations. This gives substantial performance benefits in online publishing applications, where the same transformation is applied many times per second to different source documents. This separation is reflected in the design of XSLT processing APIs (such as JAXP).

    Early XSLT processors had very few optimizations. Stylesheet documents were read into Document Object Models and the processor would act on them directly. XPath engines were also not optimized. Increasingly, however, XSLT processors use optimization techniques found in functional programming languages and database query languages, such as static rewriting of an expression tree (e.g., to move calculations out of loops), and lazy pipelined evaluation to reduce the memory footprint of intermediate results (and allow "early exit" when the processor can evaluate an expression such as following-sibling::*[1] without a complete evaluation of all subexpressions). Many processors also use tree representations that are significantly more efficient (in both space and time) than general-purpose DOM implementations.

    In June 2014, Debbie Lockett and Michael Kay introduced an open-source benchmarking framework for XSLT processors called XT-Speedo.

    XSLT and XPath

    XSLT uses XPath to identify subsets of the source document tree and perform calculations. XPath also provides a range of functions, which XSLT itself further augments.

    XSLT 1.0 uses XPath 1.0. XSLT 2.0 uses XPath 2.0. And XSLT 3.0 uses XPath 3.0. In the case of 1.0 and 2.0, the specifications were published on the same date. With 3.0, however, they were no longer synchronized; XPath 3.0 became a Recommendation in April 2014, while XSLT 3.0 was still work in progress.

    XSLT and XQuery compared

    XSLT functionalities overlap with those of XQuery, which was initially conceived as a query language for large collections of XML documents.

    The XSLT 2.0 and XQuery 1.0 standards were developed by separate working groups within W3C, working together to ensure a common approach where appropriate. They share the same data model, type system, and function library, and both include XPath 2.0 as a sublanguage.

    The two languages, however, are rooted in different traditions and serve the needs of different communities. XSLT was primarily conceived as a stylesheet language whose primary goal was to render XML for the human reader on screen, on the web (as web template language), or on paper. XQuery was primarily conceived as a database query language in the tradition of SQL.

    Because the two languages originate in different communities, XSLT is stronger in its handling of narrative documents with more flexible structure, while XQuery is stronger in its data handling, for example when performing relational joins.

    XSLT media types

    The <output> element can optionally take the attribute media-type, which allows one to set the media type (or MIME type) for the resulting output, for example: <xsl:output output="xml" media-type="application/xml"/>. The XSLT 1.0 recommendation recommends the more general attribute types text/xml and application/xml since for a long time there was no registered media type for XSLT. During this time text/xsl became the de facto standard. In XSLT 1.0 it was not specified how the media-type values should be used.

    With the release of the XSLT 2.0, the W3C recommended the registration of the MIME media type application/xslt+xml and it was later registered with the Internet Assigned Numbers Authority.

    Pre-1.0 working drafts of XSLT used text/xsl in their embedding examples, and this type was implemented and continues to be promoted by Microsoft in Internet Explorer and MSXML. It is also widely recognized in the xml-stylesheet processing instruction by other browsers. In practice, therefore, users wanting to control transformation in the browser using this processing instruction are obliged to use this unregistered media type.

    XSLT examples

    For grouping problems, see XSLT/Muenchian grouping. Below is a sample of incoming XML document

    Example 1 (transforming XML to XML)

    This XSLT stylesheet provides templates to transform the XML document:

    Its evaluation results in a new XML document, having another structure:

    Example 2 (transforming XML to XHTML)

    Processing the following example XSLT file

    with the XML input file shown above results in the following XHTML (whitespace has been adjusted here for clarity):

    This XHTML generates the output below when rendered in a web browser.

    In order for a web browser to be able automatically to apply an XSL transformation to an XML document on display, an XML stylesheet processing instruction can be inserted into XML. So, for example, if the stylesheet in Example 2 above were available as "example2.xsl", the following instruction could be added to the original incoming XML:

    In this example, text/xsl is technically incorrect according to the W3C specifications (which say the type should be text/xml), but it is the only media type that is widely supported across browsers as of 2009.

    References

    XSLT Wikipedia