XML Tutorial - XML Master Professional Application Developer Edition
Volume 4 : XML Document Transformation via XSLT Stylesheet
Tatsuya Kimura
Section 3 presents a wide range of questions for the exam taker. While I can only cover a few examples of questions here, please use the explanatory comments as a guide for what areas you need to study to be successful.
Section 3 Major Points for Study
To solve the XSLT questions presented in Section 3, you need to be able to derive the transformation result for a given XML document and XSLT Stylesheet. Some questions test your knowledge of XSLT elements that have particular functions; others test your general understanding of XSLT, including how templates are used. It is important to balance your study systematically across these areas.
XSLT Transformation through Java
Here is a brief explanation of how to perform XSLT transformation from Java. J2SE 5.0 includes JAXP (Java API for XML Processing), a standard API for processing XML. Using it, you can easily write a program that performs XSLT transformation. Let’s look at List1, an example of a program that performs XSLT transformation using JAXP:
List1: A program (XSLTExec.java) to perform XSLT transformation using Java (JAXP)
import java.io.FileOutputStream;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XSLTExec {
    public static void main(String[] args) {
        // Three arguments are required: input XML, stylesheet, output file.
        if (args.length < 3) {
            System.out.println(
                "usage: XSLTExec XML-file XSLT-stylesheet Output-file");
            return;
        }
        String xml = args[0];
        String xslt = args[1];
        String output = args[2];
        try {
            // (2) Create the factory first.
            TransformerFactory tf = TransformerFactory.newInstance();
            // (1) Create a Transformer from the stylesheet.
            Transformer tr = tf.newTransformer(new StreamSource(xslt));
            // Transform the XML document and write the result to the output file.
            tr.transform(new StreamSource(xml),
                new StreamResult(new FileOutputStream(output)));
            System.out.println("Output to " + output);
        } catch (Exception e) {
            System.out.println(e);
        }
    }
}
This program uses the XSLT Stylesheet identified in the second argument to transform the XML document identified in the first argument, outputting the results to a file identified in the third argument.
As you can see from List1, to perform XSLT transformation with JAXP you create an instance of the class javax.xml.transform.Transformer (List1-(1)). To create this instance, you must first create an instance of the factory class javax.xml.transform.TransformerFactory (List1-(2)).
When performing XSLT transformation from Java in this way, you will generally use JAXP. If you understand programs like this one, in addition to knowing what transformation result a given XSLT Stylesheet produces, you should be able to apply what you learn immediately in your practical work. To understand the coding details of XSLT Stylesheets, I recommend that you look over existing XSLT Stylesheets, create a variety of Stylesheets on your own, and try out transformations.
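For trying out small Stylesheets without creating files, the same JAXP calls can be pointed at strings instead. The following is only a minimal sketch: the class name InMemoryXslt, the method name transform, and the sample greeting document are assumptions made for illustration and are not part of List1.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper (not part of List1): transforms XML held in a string.
public class InMemoryXslt {

    /** Applies the stylesheet text to the XML text and returns the result as a string. */
    public static String transform(String xml, String xslt) throws TransformerException {
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer tr = tf.newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        tr.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Sample input and stylesheet, invented for this sketch.
        String xml = "<greeting>Hello</greeting>";
        String xslt =
            "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'>[<xsl:value-of select='greeting'/>]</xsl:template>"
          + "</xsl:stylesheet>";
        System.out.println(transform(xml, xslt));
    }
}
```

Because the result is returned as a string, it is easy to compare what the processor produces with the result you derived by hand.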
Let’s take a look at some XSLT practice questions/answers, as well as some comments related to notable points:
Example of an XSLT Question Appearing on the Exam - (1)
Select the answer that correctly describes the output results when transforming the following [XML Document] using [XSLT Stylesheet]. Assume that the XSLT processor used can output the transformation results in a document.
[XML Document]
<Series>
<Title section="1">DOM/SAX</Title>
<Title section="2">DOM/SAX Programming</Title>
</Series>
[XSLT Stylesheet]
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text" />
<xsl:strip-space elements="Series" />
<xsl:template match="/">
<xsl:apply-templates select="Series//* | Series//text() | Series//@*" />
</xsl:template>
<xsl:template match="* | text() | @*">
<xsl:value-of select="name(.)" />[<xsl:value-of select="." />]
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
Option
A.
Title[]
Title[]
[DOM/SAX]
[DOM/SAX Programming]
section[1]
section[2]
B.
Title[DOM/SAX]
Title[DOM/SAX Programming]
[DOM/SAX]
[DOM/SAX Programming]
section[1]
section[2]
C.
Title[]
[DOM/SAX]
section[1]
Title[]
[DOM/SAX Programming]
section[2]
D.
Title[DOM/SAX]
section[1]
[DOM/SAX]
Title[DOM/SAX Programming]
section[2]
[DOM/SAX Programming]
Answer
D
Commentary
The XSLT Stylesheet presented in this question includes two templates. The XSLT specification provides that processing always begins at the root node, so the template rule for the root node (the template rule that starts with <xsl:template match="/">) is applied first. In this template, the following nodes are selected from the XML document (the pipe [ | ] in the select attribute value of the xsl:apply-templates element is the operator that forms the union of node-sets).
- All element nodes that are descendants of the Series element
- All text nodes below the Series element
- All attribute nodes below the Series element
Each node selected in the template for the root node is then processed by the other template (the template rule that starts with <xsl:template match="* | text() | @*">). For each node, this template outputs a text string in the following format: "node name"["node value"]
To correctly answer this question, you must be able to recognize information such as that presented above from the XSLT Stylesheet and XML document, as well as have an understanding of the following matters.
- What text string is obtained for the node name?
- What text string is obtained for the node value?
- In what order are multiple nodes processed?
The following is an explanation of the points raised above:
First, with respect to the node name: as you probably know, XSLT uses an extended version of the XPath data model. XPath defines seven types of nodes (including element nodes, text nodes, and so on), but the root node, text nodes, and comment nodes have no concept of a "name." The same is true in XSLT. A name can be output as a text string for a node that has one, but an empty string is obtained for a node that does not.
The text string when obtaining node values differs according to node type:
- Element Node: Text string formed by combining all of the text contained in the element, including text inside descendant elements
- Attribute Node: Attribute value normalized according to XML 1.0
- Text Node: Text string value of the text node
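The names and values listed above can be checked from Java with the XPath API that is also part of JAXP. This is a rough sketch rather than anything taken from the exam material: the class name NodeNameAndValue and the method eval are assumptions, and the XML document from the question is written on one line.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates name() and string() for each kind of node.
public class NodeNameAndValue {

    // The XML document from the question, written on one line.
    static final String XML =
        "<Series>"
      + "<Title section=\"1\">DOM/SAX</Title>"
      + "<Title section=\"2\">DOM/SAX Programming</Title>"
      + "</Series>";

    /** Evaluates an XPath 1.0 expression against XML and returns the result as a string. */
    public static String eval(String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
    }

    public static void main(String[] args) throws XPathExpressionException {
        String[] nodes = {
            "/Series/Title[1]",           // element node
            "/Series/Title[1]/@section",  // attribute node
            "/Series/Title[1]/text()"     // text node
        };
        for (String node : nodes) {
            String name = eval("name(" + node + ")");
            String value = eval("string(" + node + ")");
            System.out.println(node + " -> name=\"" + name + "\", value=\"" + value + "\"");
        }
    }
}
```

The element and attribute nodes report their names, while the text node reports an empty name. That is why the lines for text nodes in the options begin directly with "[".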
The order in which multiple nodes are processed conforms to that provided for XPath (the order in which the nodes are written in the select attribute is irrelevant). XPath basically provides that nodes are processed in the order in which they occur in the XML document, and the attribute nodes of an element come after the element itself and before its child nodes. Accordingly, for an XML document such as the one in this question, processing is performed in the order "element node -> attribute node -> text node," and then "element node -> attribute node -> text node" again. *1
*1 However, if multiple attributes have been coded for one element, the processing order of those attribute nodes depends on the implementation of the XSLT processor.
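The processing order described above can be confirmed by running the Stylesheet from the question through JAXP. The following is a sketch under a few assumptions: the class name DocumentOrderCheck and the method outputLines are invented for illustration, the XML document and Stylesheet are condensed onto single lines, and the line break in xsl:text is written as the character reference &#10;.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical check: runs the Stylesheet from question (1) and lists the output lines.
public class DocumentOrderCheck {

    static final String XML =
        "<Series>"
      + "<Title section='1'>DOM/SAX</Title>"
      + "<Title section='2'>DOM/SAX Programming</Title>"
      + "</Series>";

    // Condensed version of the Stylesheet in the question.
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:output method='text'/>"
      + "<xsl:strip-space elements='Series'/>"
      + "<xsl:template match='/'>"
      + "<xsl:apply-templates select='Series//* | Series//text() | Series//@*'/>"
      + "</xsl:template>"
      + "<xsl:template match='* | text() | @*'>"
      + "<xsl:value-of select='name(.)'/>[<xsl:value-of select='.'/>]"
      + "<xsl:text>&#10;</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Runs the transformation and returns one array entry per output line. */
    public static String[] outputLines() throws TransformerException {
        Transformer tr = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        tr.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
        // Accept either line-ending style, since the serializer may use the platform's.
        return out.toString().split("\r?\n");
    }

    public static void main(String[] args) throws TransformerException {
        for (String line : outputLines()) {
            System.out.println(line);
        }
    }
}
```

Each Title element here has only one attribute, so the implementation-dependent ordering mentioned in note *1 does not come into play.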
When dealing with XSLT questions, you have to have a solid understanding of not only XSLT, but also of the XPath data model.
While I didn’t address this here, you will also see questions on the exam related to whitespace control methods via xsl:strip-space element and xsl:preserve-space element, as well as XML document transformation methods using XML namespaces. The following are areas that frequently show up on the exam, as well as being very important in practical work applications. I advise you to take the time to go over these areas in detail.
- Conflict resolution/mode/include/import of template rules
- Processing using variables via xsl:variable element or xsl:param element
- Processing using keys (node identification information) via xsl:key element and key function
- Processing multiple XML documents using document function
Example of an XSLT Question Appearing on the Exam - (2)
Select the answer that correctly describes the output results when transforming the following [XML Document] using [XSLT Stylesheet]. Assume that the XSLT processor can output the transformation results as a document. Ignore line returns and indents.
[XML Document]
<Series>
<Title category="XML">XML Tutorial - Professional Application Developer - </Title>
<SubTitle section="1">DOM/SAX </SubTitle>
<SubTitle section="2">DOM/SAX Programming</SubTitle>
</Series>
[XSLT Stylesheet]
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="*|text()|@*">
<xsl:copy>
<xsl:apply-templates />
</xsl:copy>
</xsl:template>
<xsl:template match="Series">
<xsl:copy>
<xsl:apply-templates select="Title/@category" />
<xsl:apply-templates />
</xsl:copy>
</xsl:template>
<xsl:template match="@category">
<xsl:element name="category">
<xsl:value-of select="." />
</xsl:element>
</xsl:template>
</xsl:stylesheet>
Option
- Nothing is output.
<Series>
<Title category="XML">XML Tutorial - Professional Application Developer - </Title>
<SubTitle section="1">DOM/SAX</SubTitle>
<SubTitle section="2">DOM/SAX Programming</SubTitle>
</Series>
<Series>
<category>XML</category>
<Title category="XML">XML Tutorial - Professional Application Developer - </Title>
<SubTitle section="1">DOM/SAX</SubTitle>
<SubTitle section="2">DOM/SAX Programming</SubTitle>
</Series>
<Series>
<category>XML</category>
<Title>XML Tutorial - Professional Application Developer - </Title>
<SubTitle>DOM/SAX</SubTitle>
<SubTitle>DOM/SAX Programming</SubTitle>
</Series>
<Series>
<category>XML</category>
<Title category="XML" />
<SubTitle section="1" />
<SubTitle section="2" />
</Series>
<Series>
<category>XML</category>
<Title />
<SubTitle />
<SubTitle />
</Series>
Answer
C
Commentary
There are two important points related to this question.
The first point is the behavior of the XSLT processor when the select attribute of the xsl:apply-templates element is omitted. In this case, the XSLT processor processes all child nodes of the current node. However, since an attribute node is not a child node of its element, attributes are not targets for processing.
The second point is the role of the xsl:copy element. This element makes a copy of the node being processed at that point. Be careful: the child nodes and attribute nodes of the element being processed are not copied.
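Both points can be seen in isolation with a one-element document. The sketch below is an illustration under stated assumptions, not part of the exam question: the class name ShallowCopyDemo, the method copyTitle, and the sample Title document are invented, and the xsl:output setting is added only to suppress the XML declaration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: shows what xsl:copy does and does not copy.
public class ShallowCopyDemo {

    // Sample document invented for this sketch.
    static final String XML = "<Title category=\"XML\">XML Tutorial</Title>";

    /** Uses the given instructions as the body of a template rule for Title. */
    public static String copyTitle(String templateBody) throws TransformerException {
        String xslt =
            "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
          + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
          + "<xsl:template match='Title'>" + templateBody + "</xsl:template>"
          + "</xsl:stylesheet>";
        Transformer tr = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        tr.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // xsl:copy alone: only the element itself is copied, without attribute or text.
        System.out.println(copyTitle("<xsl:copy/>"));
        // xsl:apply-templates without select: the child text node is processed
        // (by the built-in rule for text), but the category attribute is not.
        System.out.println(copyTitle("<xsl:copy><xsl:apply-templates/></xsl:copy>"));
    }
}
```

In neither case does the category attribute appear in the result, because nothing in the template selects it.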
Given the preceding, thinking through the processing that is coded for the XSLT Stylesheet in this question leads us to the following:
- Under the template for the root node (the template rule that starts with <xsl:template match="/">), the Series element is selected as a target for processing.
- Under the template for the Series element node (the template rule that starts with <xsl:template match="Series">), the Series element node is first copied via the xsl:copy element. Next, the category attribute of the Title element is selected as a target for processing. Last, the child nodes of the Series element (the Title and SubTitle elements) are selected for processing.
- Under the template for the category attribute node (the template rule that starts with <xsl:template match="@category">), the category element is created, and the value "XML" of the category attribute is used as its content.
- Under the template for the Title and SubTitle element nodes (the template rule that starts with <xsl:template match="*|text()|@*">), each node is copied, after which its child node (a text node) is processed (copied, in this case). Note that the attributes are not processed.
The result is therefore a Series element that contains the new category element, followed by the Title and SubTitle elements, each copied with its text but without its attributes.
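The walk-through above can be checked by running the Stylesheet from the question through JAXP. This is a sketch with several assumptions: the class name CopyQuestionCheck and the method run are invented, the XML document is condensed onto one line (the question says to ignore line returns and indents), and the XML declaration is suppressed from Java so that only the elements are printed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical check: runs the Stylesheet from question (2).
public class CopyQuestionCheck {

    static final String XML =
        "<Series>"
      + "<Title category='XML'>XML Tutorial - Professional Application Developer - </Title>"
      + "<SubTitle section='1'>DOM/SAX</SubTitle>"
      + "<SubTitle section='2'>DOM/SAX Programming</SubTitle>"
      + "</Series>";

    // The four template rules of the question, condensed onto single lines.
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:template match='/'><xsl:apply-templates/></xsl:template>"
      + "<xsl:template match='*|text()|@*'>"
      + "<xsl:copy><xsl:apply-templates/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='Series'>"
      + "<xsl:copy>"
      + "<xsl:apply-templates select='Title/@category'/>"
      + "<xsl:apply-templates/>"
      + "</xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='@category'>"
      + "<xsl:element name='category'><xsl:value-of select='.'/></xsl:element>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Runs the transformation and returns the serialized result. */
    public static String run() throws TransformerException {
        Transformer tr = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        // Assumption for readability: leave out the XML declaration.
        tr.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        tr.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(run());
    }
}
```

The printed result should contain the category element and the Title and SubTitle elements with their text, but no category or section attributes.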