Betsy Rolland
LIS 600 Independent Study

Transforming XML to Microsoft XML
Terry Brooks
Winter 2006

Transform a specially designed XML file to WordprocessingML using XSLT

The instructions below walk you through creating an XSLT stylesheet that transforms a specially designed XML file into WordprocessingML. The resulting file can be opened directly in Microsoft Word.

The XML schema JournalArticle.xsd is designed to model a scholarly journal article. I added elements to the schema that hold formatting requests from the author of the XML file. For instance, the author can specify that the Abstract be displayed in a particular font at a particular font size. This gives the author of the XML document some control over the display, which may or may not be what the organization wants. If it is not, the transformation can simply ignore this information.

Another way to accomplish this goal is to leave all formatting information out of the XML document but allow the author to create her own XSL stylesheet through which she can add that information later. That leaves the XML document "pure," maintaining greater separation between content and display.

One of the challenges in this approach is transferring style information from the plain or XSL-transformed XML document to the WordprocessingML document. In WordprocessingML, all style information needs to be at the top of the document, so styles need to be the first information written during the XSLT transformation. One way to handle this is to add "style" elements to the original XML file, which is what I've done in JournalArticle.xml. In the XML schema, I have defined a style element that lets the user enter all the information necessary to define a WordprocessingML-readable style. For more information on WordprocessingML styles, please see this tutorial. A second way is to keep pre-defined styles in your XSLT stylesheet. In the stylesheet JournalArticleTransformation.xslt I have used a combination of these techniques: several pre-defined styles are included in the XSLT stylesheet, and all styles defined in the XML document are added during the XSLT transformation. The organization would need a process in place to ensure that style names defined in XML documents do not collide with the pre-defined ones.
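As a rough illustration of combining the two techniques, the fragment below writes one pre-defined style and then one style per style element found in the source XML. It is a sketch under stated assumptions, not the contents of JournalArticleTransformation.xslt: the input element names (JournalArticle, Styles, Style, Name, Font, Size), the template name write-styles, and the style ID ArticleTitle are all hypothetical, since the schema is not reproduced here. It also assumes Size is given in points; WordprocessingML's w:sz is expressed in half-points, hence the multiplication.

```xml
<!-- Hypothetical named template; it would be called inside <w:wordDocument>, before <w:body>. -->
<xsl:template name="write-styles"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:styles>
    <!-- Pre-defined style that lives in the stylesheet itself -->
    <w:style w:type="paragraph" w:styleId="ArticleTitle">
      <w:name w:val="ArticleTitle"/>
      <w:rPr>
        <w:b/>
        <w:sz w:val="32"/> <!-- 16 pt, expressed in half-points -->
      </w:rPr>
    </w:style>
    <!-- One style per (assumed) Style element in the source XML -->
    <xsl:for-each select="/JournalArticle/Styles/Style">
      <w:style w:type="paragraph" w:styleId="{Name}">
        <w:name w:val="{Name}"/>
        <w:rPr>
          <w:rFonts w:ascii="{Font}" w:h-ansi="{Font}"/>
          <w:sz w:val="{Size * 2}"/> <!-- assumes Size is in points -->
        </w:rPr>
      </w:style>
    </xsl:for-each>
  </w:styles>
</xsl:template>
```

If an author-defined Name happened to match a pre-defined style ID, the output would contain two styles with the same ID, which is the collision mentioned above.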

Documents:

JournalArticle.xsd
JournalArticle.xml
JournalArticleTransformation.xslt (Transforms JournalArticle.xml into JournalArticle_Output.xml)
JournalArticle_Output.xml (Can be opened directly in Word)

Steps:

  1. Start with an XML file, like JournalArticle.xml, above, based on an XML schema like JournalArticle.xsd that allows the inclusion of formatting information.

  2. Create a new XSLT document that will output XML.

  3. Your XSLT file will need the following headers:
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"> <!-- WordprocessingML namespace -->
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/> <!-- Specify output format -->

  4. Open your XSL template tag (<xsl:template match="/">). Inside the template tag, add the following lines:
    <!-- Write the processing instruction that associates the new XML file with MS Word -->
    <xsl:processing-instruction name="mso-application">progid="Word.Document"</xsl:processing-instruction>
    <!-- Root element: declares the WordprocessingML namespace and preserves whitespace in the document -->
    <w:wordDocument
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xml:space="preserve">

  5. Define any styles you want to use in your document. For more information, please see the tutorial on using styles in WordprocessingML.

  6. Use XSLT to process your XML file, placing the desired information into paragraph structures as described in the first tutorial.

  7. Close all tags and use your XSLT stylesheet to process your XML file into a WordprocessingML file that can be opened directly in Microsoft Word.
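
Put together, steps 3 through 7 might produce a stylesheet shaped like the minimal sketch below. It is not JournalArticleTransformation.xslt itself: the input paths /JournalArticle/Title and /JournalArticle/Abstract and the style ID ArticleTitle are assumptions made for illustration, and the fo namespace is left out because nothing in this sketch uses it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <!-- Step 4: associate the output with Word, then open the root element -->
    <xsl:processing-instruction name="mso-application">progid="Word.Document"</xsl:processing-instruction>
    <w:wordDocument xml:space="preserve">
      <!-- Step 5: styles are written before the body -->
      <w:styles>
        <w:style w:type="paragraph" w:styleId="ArticleTitle">
          <w:name w:val="ArticleTitle"/>
          <w:rPr><w:b/><w:sz w:val="32"/></w:rPr>
        </w:style>
      </w:styles>
      <!-- Step 6: content goes into paragraphs (w:p), runs (w:r), and text (w:t) -->
      <w:body>
        <!-- Title paragraph, using the style defined above (input path is assumed) -->
        <w:p>
          <w:pPr><w:pStyle w:val="ArticleTitle"/></w:pPr>
          <w:r><w:t><xsl:value-of select="/JournalArticle/Title"/></w:t></w:r>
        </w:p>
        <!-- One unstyled paragraph per (assumed) Abstract element -->
        <xsl:for-each select="/JournalArticle/Abstract">
          <w:p>
            <w:r><w:t><xsl:value-of select="."/></w:t></w:r>
          </w:p>
        </xsl:for-each>
      </w:body>
    <!-- Step 7: close everything that was opened -->
    </w:wordDocument>
  </xsl:template>
</xsl:stylesheet>
```

Any XSLT 1.0 processor should be able to run a stylesheet of this shape. With xsltproc, for instance, the command would be along the lines of: xsltproc -o JournalArticle_Output.xml JournalArticleTransformation.xslt JournalArticle.xml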