17 September 2007
While DDX is nothing new to Adobe, it is new to ColdFusion with the release of Adobe ColdFusion 8. DDX (Document Description XML) is an XML-based language that was originally created for Adobe LiveCycle Assembler and is used to describe documents. DDX elements represent PDF pages, comments, bookmarks, and styled text.
ColdFusion 8 ships with a scaled-down edition of the LiveCycle Assembler that can process a subset of the total DDX definition. In this article, we will look at some of the possibilities with DDX and how its addition to ColdFusion has given developers new tools and capabilities for creating and manipulating PDF documents.
Before we get too far into what DDX is and how to use it, we need to take a look at the new CFPDF tag that was introduced in ColdFusion 8. As its name suggests, this tag is used for manipulating PDF documents. Note that CFPDF only manipulates existing PDF documents. If you want to create a PDF document, that is a task for the CFDocument tag, which is covered in the ColdFusion documentation.
As the ColdFusion documentation describes, the CFPDF tag allows you to:
- Merge several PDF documents into one PDF document.
- Delete pages from a PDF document.
- Merge pages from one or more PDF documents and generate a new PDF document.
- Linearize PDF documents for faster web display.
- Remove interactivity from forms created in Acrobat® to generate flat PDF documents.
- Encrypt and add password protection to PDF documents.
- Generate thumbnail images from PDF documents or pages.
- Add or remove watermarks from PDF documents or pages.
- Retrieve information associated with a PDF document, such as the software used to generate the file or the author, and set information for a PDF document, such as the title, author, and keywords.
Now, a lot of this functionality is possible without the use of DDX, but DDX really gives the CFPDF tag its power. For example, not only can you merge pages from one or more PDF documents, but with DDX you can also build a table of contents for that document, set up the headers and footers on all pages, configure backgrounds or watermarks, and so forth. The list of possibilities is not endless, but it is far more than ColdFusion developers have ever had without a lot of extra add-ons and trickery.
So, without further ado, let's see what capabilities Adobe has given us through the use of DDX.
<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ns.adobe.com/DDX/1.0/ coldfusion_ddx.xsd">
</DDX>
If you have worked with XML documents before, this should be pretty straightforward. If not, it is still very simple. The XML declaration simply states that this is an XML document that follows the version 1.0 specification. The DDX element that follows is where the DDX part comes into play. The opening tag sets up the namespace and schema, which define what DDX is and what a properly formed DDX document looks like. Together, the opening and closing tags wrap the contents of the document, much as the <html> tags do in an HTML document.
The magic of DDX comes with what you put inside of the DDX tags. To start with a simple example, the following DDX document takes two PDF documents (doc1 and doc2) and merges them together to form a third document (output1).
<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ns.adobe.com/DDX/1.0/ coldfusion_ddx.xsd">
  <PDF result="output1">
    <PDF source="doc1" />
    <PDF source="doc2" />
  </PDF>
</DDX>
All that I added to the base DDX document was the PDF element that specifies the result document. It wraps two additional PDF elements, each of which specifies one of the source documents to be merged. Note that DDX is processed from top to bottom, so listing doc1 before doc2 means that in the resulting merged document, doc1 appears first, followed by doc2.
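Because order matters, rearranging or trimming the source elements changes the result. As a rough sketch, the following variation places doc2 ahead of doc1 and keeps only the first three pages of doc1. The pages attribute and its range syntax come from the DDX reference rather than from the example above, so treat the exact value as an assumption to verify against the subset that ColdFusion supports.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ns.adobe.com/DDX/1.0/ coldfusion_ddx.xsd">
  <PDF result="output1">
    <!-- doc2 is listed first, so its pages open the merged document -->
    <PDF source="doc2" />
    <!-- assumption: pages="1-3" limits doc1 to its first three pages -->
    <PDF source="doc1" pages="1-3" />
  </PDF>
</DDX>
```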
Now that you have this DDX document, what do you do with it? The following ColdFusion code snippet shows how one might use this DDX document.
<!--- check to see if the ddx file exists and is properly formed --->
<cfif IsDDX("sample.ddx")>
  <!--- set up the mapping for the source PDF files --->
  <cfset sInput = structNew() />
  <cfset sInput.doc1 = "intro.pdf" />
  <cfset sInput.doc2 = "content.pdf" />
  <!--- set up the mapping for the result PDF files --->
  <cfset sOutput = structNew() />
  <cfset sOutput.output1 = "document.pdf" />
  <!--- use the cfpdf tag to process the DDX document --->
  <cfpdf action="processDDX"
         ddxFile="sample.ddx"
         inputFiles="#sInput#"
         outputFiles="#sOutput#"
         name="result" />
  <!--- display the success / failure result of the operation --->
  <cfoutput>#result.output1#</cfoutput>
<cfelse>
  <p>The sample.ddx file is not a valid DDX xml document.</p>
</cfif>
The first thing this code snippet does is use the new IsDDX function to check that the DDX file exists and that it is a properly formed DDX file. If either check fails, the snippet skips the processing, as you cannot proceed without a properly formed DDX file.
Assuming you have a properly formed DDX file, the next thing to do is set up the input and output mappings. As you may have noticed in the DDX sample created above, the result and source attributes merely specify doc1 or output1, not real PDF files or paths to PDF files. The CFPDF tag takes as parameters a structure of input files and a structure of output files, which are then matched to the names used in the DDX file. The names of the input and output structures do not matter, but the structure keys must match the names in the DDX file.
After all of that comes the CFPDF tag, which does all of the processing work. To process a DDX file, set the action attribute of the CFPDF tag to processDDX (clever, huh?). The snippet then passes in the DDX through the ddxFile attribute, which accepts either the path to the DDX file or a string containing the DDX XML, along with the input and output structures created earlier, and sets the name of the result variable. This result is not the resulting PDF document, but a structure containing the results of the CFPDF operation itself. Remember, the output structure and the DDX together specify what the resulting PDF file is named and where it is stored.
And there you have it: PDF operations using DDX. This was not the most exciting example, since you can merge PDF documents without DDX, so let's take a look at some of the other capabilities that DDX provides that are not possible with the CFPDF tag alone.
The version of Adobe LiveCycle Assembler that is embedded in ColdFusion 8 is somewhat limited compared to the full server that you could purchase from Adobe, but the DDX elements that are supported still give ColdFusion developers a pretty wide array of functionality, especially compared to what they've had up until now. The following list describes the different DDX elements that are available through ColdFusion 8.
The full details of each of these elements can be found in the Adobe LiveCycle documentation, in the DDX Language Elements section.
About: Returns an "About" XML document with information about the LiveCycle assembler service including build, copyright, processor, and version. More information on the structure and contents of the returned About XML document can be found in the LiveCycle documentation.
Author: Allows you to specify the author of the PDF document, which is then stored in the metadata. Multiple authors may be specified using a comma or semicolon delimited list.
Background: Inserts styled text or PDF content behind the existing page content. The background is applied to all pages and replaces any pre-existing backgrounds by default. The background element contains a series of attributes, which allow for things like fitting the background to the page, setting the opacity of the background, setting the scale or rotation, and more.
Center: Specifies that the content should be centered within the header or footer element and thus is required to be nested within a Header or Footer element.
DatePattern: Specifies the format of a date and time string. This format can only be applied to built-in date/time keys including _DateTime, and has a series of sub-elements including second, minute, hour, and so forth.
DDX: The root element of the DDX document, in which all other DDX elements are nested. There are a series of other attributes including metadata, styles, and password encryption and access that can be specified as part of the root DDX element.
DocumentInformation: Returns an XML document that contains information about the source PDF document including metadata and other information about the pages in the document including the number of pages, page sizes, labels, and so forth.
DocumentText: Returns an XML document that contains information about the text that is contained in the specified source PDF document. This allows you to easily grab all of the text within a document.
Footer: Specifies the footer to be placed on one or more pages within the PDF document. The footer element has attributes like padding, alternation, and styles, and contains sub-elements like center, left, and right to indicate how the content of the footer is to be placed.
Header: Specifies the header to be placed on one or more pages within the PDF document. Like the footer, the header element has attributes like padding, alternation, and styles, and contains sub-elements like center, left, and right to indicate how the content of the header is to be placed.
InitialViewProfile: Specifies how the document is to be viewed when it is opened in Adobe Acrobat Reader. Through this element, you could specify, for example, that the document should open on page 2 with the magnification set to fit the page to the viewing window.
Keyword: Specifies a single keyword that is then used for metadata.
Keywords: Specifies a series of keywords that may be appended to or overwrite the current keywords in the metadata of the resulting document.
Left: Specifies that the content should be left justified within the header or footer element and thus is required to be nested within a Header or Footer element.
MasterPassword: Specifies the master password for the result document, which, if provided, will be required to change permissions on the document. If this element is omitted, the document will not be password-protected at the master level.
Metadata: Allows for setting or getting the metadata for a PDF document including information like the title, author, and date created.
NoBookmarks: Specifies that all bookmarks on the PDF document should be removed.
OpenPassword: Specifies the password that will be required to open the document. This password only provides the user with the ability to open the document and the user can still be restricted from other actions.
PageLabel: Specifies the format and content of the page labels including page numbers, format, and other information.
Password: Specifies a password that the LiveCycle assembler can use to open an encrypted document for manipulation.
PasswordAccessProfile: Specifies a named profile containing a password to allow the LiveCycle assembler to access a document. This profile allows password access to be defined once and used later on in the DDX document.
PasswordEncryptionProfile: Specifies the profile containing security settings for the resulting document, including compatibility level, encryption level, passwords, and more. This profile allows encryption settings to be defined once and used later on in the DDX document.
PDFGroup: Allows a collection of source PDF documents to be grouped, so that page properties, content, and document components can then be applied to the group as a whole.
Permissions: Specifies the permissions to be set for a document including screenReading as well as containing the nested
Right: Specifies that the content should be right-justified within the header or footer element and thus is required to be nested within a Header or Footer element.
StyledText: Describes rich text content that is to be added to the page and can include sub-elements like tableOfContentsEntryPattern. Rich text content within the StyledText element conforms to a subset of the XHTML 1.0 specification and can also accept a limited set of style properties from the CSS specification, version 2.0.
StyleProfile: Specifies a named profile that contains sub-elements like a table of contents, header and footer, watermark or background, which can then be applied to a document. This profile allows styles to be defined once and used later on in the DDX document.
Subject: Provides the value for the subject metadata in the resulting document.
TableOfContents: The parent (wrapper) element for a table of contents that will be added to the result document. This element can contain sub-elements such as footer, as well as
TableOfContentsEntryPattern: Defines the style that is applied to all table of contents entries and can optionally be limited to a specific bookmark level to apply different styles to different levels of the table of contents.
TableOfContentsPagePattern: Defines the style that is to be applied to all pages of the table of contents including page size, rotation, margins, header and footer, watermark, and so forth.
Title: Provides the value for the document title metadata in the resulting document.
Watermark: Inserts styled text or PDF content over the existing page content. The watermark is applied to all pages and replaces any pre-existing watermarks by default. The watermark element contains a series of attributes, which allow for things like fitting the watermark to the page, setting the opacity of the watermark, setting the scale or rotation, and more.
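Several of these elements produce XML rather than a PDF. As an illustration, the sketch below asks the assembler for the text of a single source document. The names extractedText and doc1 are placeholders chosen for this example; they would be mapped to real file paths through the outputFiles and inputFiles structures, with the output presumably pointing at an XML file rather than a PDF.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ns.adobe.com/DDX/1.0/ coldfusion_ddx.xsd">
  <!-- result is a placeholder key for the XML output, not a file name -->
  <DocumentText result="extractedText">
    <!-- the source PDF whose text should be returned -->
    <PDF source="doc1" />
  </DocumentText>
</DDX>
```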
If you have made it this far, you can see the number of options for customization of PDF documents through DDX is quite impressive.
Let's look at one more example that shows a larger collection of DDX elements being put to use.
<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ns.adobe.com/DDX/1.0/ coldfusion_ddx.xsd">
  <StyleProfile name="bookFooter">
    <Footer>
      <Right>
        <StyledText><p color="gray">Page <_PageNumber/> of <_LastPageNumber/></p></StyledText>
      </Right>
    </Footer>
  </StyleProfile>
  <PDF result="completeBook">
    <PDF source="titlePage" />
    <TableOfContents maxBookmarkLevel="infinite" createLiveLinks="true" includeInTOC="false">
      <Footer styleReference="bookFooter" />
    </TableOfContents>
    <PDF source="chapter1">
      <Header>
        <Right>
          <StyledText><p color="gray">Chapter 1</p></StyledText>
        </Right>
      </Header>
      <Footer styleReference="bookFooter" />
    </PDF>
    <PDF source="chapter2">
      <Header>
        <Right>
          <StyledText><p color="gray">Chapter 2</p></StyledText>
        </Right>
      </Header>
      <Footer styleReference="bookFooter" />
    </PDF>
    <PDF source="chapter3">
      <Header>
        <Right>
          <StyledText><p color="gray">Chapter 3</p></StyledText>
        </Right>
      </Header>
      <Footer styleReference="bookFooter" />
    </PDF>
    <PDF source="chapter4">
      <Header>
        <Right>
          <StyledText><p color="gray">Chapter 4</p></StyledText>
        </Right>
      </Header>
      <Footer styleReference="bookFooter" />
    </PDF>
    <PDF source="glossary">
      <Header>
        <Right>
          <StyledText><p color="gray">Glossary</p></StyledText>
        </Right>
      </Header>
      <Footer styleReference="bookFooter" />
    </PDF>
  </PDF>
  <PDF result="preview">
    <PDF source="titlePage">
      <Watermark opacity="25%">
        <StyledText><p color="gray">Preview</p></StyledText>
      </Watermark>
    </PDF>
    <TableOfContents maxBookmarkLevel="infinite" createLiveLinks="true" includeInTOC="false">
      <Footer styleReference="bookFooter" />
    </TableOfContents>
    <PDF source="chapter1">
      <Header>
        <Right>
          <StyledText><p color="gray">Chapter 1</p></StyledText>
        </Right>
      </Header>
      <Footer styleReference="bookFooter" />
    </PDF>
  </PDF>
</DDX>
As you can see in this code snippet, I have specified a lot more information about the document. Let's start at the top and run through what is going on here. First, we have the standard DDX element, which is nothing new.
The second element is a StyleProfile definition that I have named bookFooter. This style definition represents style changes that you can then apply later on when building the result document. In this style definition, I am setting up a footer and, within that footer, specifying some text to be right-aligned. The text is wrapped within the StyledText element and includes a couple of other special elements, _PageNumber and _LastPageNumber, which will be automatically computed as the result document is built.
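The same define-once, reference-later pattern should carry over to the other styleable elements. The following is a sketch only: the profile name previewMark is made up for this example, and it assumes that Watermark accepts a styleReference attribute in the same way Footer does.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ns.adobe.com/DDX/1.0/ coldfusion_ddx.xsd">
  <!-- hypothetical profile: a faint gray "Preview" watermark, defined once -->
  <StyleProfile name="previewMark">
    <Watermark opacity="25%">
      <StyledText><p color="gray">Preview</p></StyledText>
    </Watermark>
  </StyleProfile>
  <PDF result="preview">
    <!-- assumption: Watermark can point at a profile via styleReference -->
    <PDF source="titlePage">
      <Watermark styleReference="previewMark" />
    </PDF>
    <PDF source="chapter1">
      <Watermark styleReference="previewMark" />
    </PDF>
  </PDF>
</DDX>
```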
The next element in the DDX description is the result PDF element that you saw in the initial example; the element simply specifies the result PDF document variable. Within the result PDF element, there are a few new elements. After the initial titlePage source document, I insert a table of contents. For that table of contents, I have specified that all bookmark levels should be shown, that each table of contents entry should be a hyperlink into the document, and that the table of contents itself should not be listed in the table of contents. Since I inserted the TableOfContents element after the initial titlePage source document, the titlePage will not be listed in the table of contents either. Within the TableOfContents element, I have nested a Footer element, which references the bookFooter style defined earlier. This places the footer definition on the table of contents pages.
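For comparison, a table of contents with different settings might look like the sketch below. It uses the same three attributes, assuming that maxBookmarkLevel also accepts a number (here limiting the entries to the top two bookmark levels) and that includeInTOC="true" adds an entry for the table of contents itself. The result and source names are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ns.adobe.com/DDX/1.0/ coldfusion_ddx.xsd">
  <PDF result="shortToc">
    <PDF source="titlePage" />
    <!-- assumption: "2" restricts entries to the first two bookmark levels -->
    <TableOfContents maxBookmarkLevel="2"
                     createLiveLinks="true"
                     includeInTOC="true" />
    <!-- sources listed after the table of contents supply its entries -->
    <PDF source="chapter1" />
    <PDF source="chapter2" />
  </PDF>
</DDX>
```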
After the TableOfContents element, there are five PDF source elements, which define each of the "chapters" in the book and the order in which they will be merged into the final document. For each PDF source element, I have added a nested Header element, which, like the Footer before it, defines the contents of the header. In this case the header is a right-aligned line of text that displays the chapter number or, for the glossary, the section name. Also in each PDF source element is a nested Footer element that uses the same bookFooter style as the table of contents. By defining this style once, you have been able to use it in six different places, preserving the common footer throughout the document.
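A header or footer is not limited to a single position. As a sketch built from the Left, Center, and Right elements listed earlier, the following header places different text in each position for one chapter. The text values and the result name are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DDX xmlns="http://ns.adobe.com/DDX/1.0/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ns.adobe.com/DDX/1.0/ coldfusion_ddx.xsd">
  <PDF result="chapterWithHeader">
    <PDF source="chapter1">
      <Header>
        <!-- left-justified: illustrative book title -->
        <Left>
          <StyledText><p color="gray">My Book</p></StyledText>
        </Left>
        <!-- centered: illustrative status label -->
        <Center>
          <StyledText><p color="gray">Draft</p></StyledText>
        </Center>
        <!-- right-justified: the chapter label used in the article -->
        <Right>
          <StyledText><p color="gray">Chapter 1</p></StyledText>
        </Right>
      </Header>
    </PDF>
  </PDF>
</DDX>
```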
Following the closing PDF tag for the completeBook result PDF is a second PDF result tag. In this DDX document definition, you are actually creating two result PDF documents. The first is the complete book with all of the chapters; the second is a preview of the book that includes only the title page (with a "Preview" watermark), the table of contents, and the first chapter.
Where to Go from Here
Hopefully this preview of the new DDX technology available with ColdFusion 8 has whetted your appetite a bit and made PDFs look a bit more interesting and useful within your ColdFusion applications. If you want to find out more about DDX, you can dig into some of the elements I introduced here. Check out the DDX Reference in the Adobe LiveCycle documentation.