XML Overview

XML Syntax

If you followed the previous example, you should have an idea of what an XML document looks like. The syntax is pretty straightforward: its rules are clear and very simple. An XML document is made up of an XML declaration and a root element, or tag, containing several nested elements. I will briefly present the most important syntax rules before going into the details:

  • All XML documents must have a root element.
  • All XML elements must have a closing tag.
  • Tags are case-sensitive.
  • All XML elements must be properly nested.
  • Attributes must be included in the opening tag and must be enclosed between quotes.
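
The rules above can be tried out with any XML parser. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (neither is part of this article's Dreamweaver workflow); the helper name is_well_formed and the sample strings are made up for illustration:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One sample per rule; only the first one follows all of them.
samples = {
    "valid": '<department><employee id="31"><name>John Doe</name></employee></department>',
    "two roots": "<department></department><department></department>",
    "missing closing tag": "<department><employee></department>",
    "case mismatch": "<JOB>Software Analyst</job>",
    "crossed nesting": "<b>bold <i>both</b> italic</i>",
    "unquoted attribute": "<employee id=31></employee>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

Only the first sample should be accepted; each of the others breaks exactly one rule from the list.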

XML documents should start with the XML declaration. The declaration is technically optional in XML 1.0, but when it is present it must be the very first thing in the file. If you use Dreamweaver to create your XML documents, the XML declaration is added automatically. The XML declaration is used by the applications calling the XML document to correctly read and interpret the information. By default, Dreamweaver creates XML documents that conform to the 1.0 specification and use the iso-8859-1 (Latin-1/West European) character set. The XML declaration is not an element, so it has no closing tag and is not regarded as part of the document's content.

Next, the document should contain a single root element. In the previous example, the root element is <department>. Suppose, however, the company has more than one department. Could a second <department> element be added to the document, like this?

<?xml version="1.0" encoding="iso-8859-1"?>
<department> </department>
<department> </department>

No. You will need to define a new root element in this case: <company>. The new root element can now have any number of child elements, that is, departments:

<company>
  <department>
    <employee>
      <name>John Doe</name>
      <job>Software Analyst</job>
      <salary>2000</salary>
    </employee>
    <employee>
      <name>Jane Fletcher</name>
      <job>Designer</job>
      <salary>2500</salary>
    </employee>
  </department>
  <department>
    <employee>
    </employee>
  </department>
</company>

All other child elements must be enclosed within the root tag.
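
To see why a single root matters, here is a rough sketch of how a program might walk the company document. It assumes Python's standard xml.etree.ElementTree module, which is my choice for illustration and not something the article depends on; the XML is a compact copy of the example above.

```python
import xml.etree.ElementTree as ET

company_xml = """
<company>
  <department>
    <employee><name>John Doe</name><job>Software Analyst</job><salary>2000</salary></employee>
    <employee><name>Jane Fletcher</name><job>Designer</job><salary>2500</salary></employee>
  </department>
  <department>
    <employee></employee>
  </department>
</company>
"""

# The parser hands back the single root element; everything else hangs off it.
root = ET.fromstring(company_xml)
departments = root.findall("department")

# Collect the text of every <name> that sits under company/department/employee.
names = [name.text for name in root.findall("./department/employee/name")]

print(root.tag, len(departments), names)
```

Because every department is a child of the one root, a single starting point is enough to reach all the data.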

While in HTML single-tagged elements such as <hr> or <br> are allowed, in XML all elements must have a closing tag (an element with no content can also be closed inside the opening tag itself, as in <br />). If you omit a closing tag, your browser will throw an error similar to this one:

The following tags were not closed: department. Error processing resource 'http://www.domain.org/company.xml'.

One of the new features in Dreamweaver 8 is the default code completion, which also works for XML files. If I type the following:

<company>
   <department>
      <employees>

Dreamweaver 8 will automatically close the tags correctly when I type </. In the example above, the first time I type </, Dreamweaver will insert </employees>. The next time I type </, Dreamweaver will insert </department>. The next time I type </, Dreamweaver will insert </company>. Dreamweaver 8 understands where you are in the page and closes the tag appropriately. Code completion can really help you produce well-formed XML documents, especially if you're not a coding guru.

Moreover, tag names are case-sensitive, which means <Department> is a totally different element than <department> or <DEPARTMENT>. Obviously, the opening and the closing tags of the same element must be written in the same case. The following is an example of an illegal tag pair in XML:

<JOB>Software Analyst </job>
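
As a small illustration of case sensitivity, the sketch below (assuming Python's standard xml.etree.ElementTree module; the three-department document is invented for the example) shows that a parser treats differently cased names as three unrelated elements:

```python
import xml.etree.ElementTree as ET

# Three sibling elements whose names differ only in letter case.
doc = "<company><Department/><department/><DEPARTMENT/></company>"
root = ET.fromstring(doc)

tags = [child.tag for child in root]
lowercase_only = root.findall("department")  # matches the exact name only

print(tags)
print(len(lowercase_only))
```

A search for department finds only the lowercase element, which is why a consistent naming style is worth choosing early.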

As I mentioned previously, XML elements are related through child-parent relationships. In the above example, <employee> is a child of <department>, which, in its turn, is a child of the unique root element, <company>. In order to preserve these relationships, elements must be properly nested. While HTML allows tags to cross, as in the following example, in XML all elements must be properly nested within each other.

<b>This text is <i> emphasized </b> and italic</i>.

Browsers accept this in HTML and display it like this:

Figure 6. Cross-nesting tags like this is valid in HTML, but not in XML

In XML, content, or actual information, is carried by elements and/or their attributes. An element can contain simple text, other elements or both. For instance, the following element:

<employee>
  <name>John Doe</name>
  <job>Software Analyst</job>
  <salary>2000</salary>
</employee>

can be rewritten as:

<employee>
  John Doe
  <job>Software Analyst</job>
  <salary>2000</salary>
</employee>

This means the element employee contains mixed content: simple text and other elements.
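
Mixed content looks like this from a program's point of view. This is only a sketch, assuming Python's standard xml.etree.ElementTree module, which stores the text that precedes the first child element in the parent's text property:

```python
import xml.etree.ElementTree as ET

employee = ET.fromstring("""<employee>
  John Doe
  <job>Software Analyst</job>
  <salary>2000</salary>
</employee>""")

# Text before the first child belongs to the parent element itself.
leading_text = employee.text.strip()
child_tags = [child.tag for child in employee]

print(leading_text)
print(child_tags)
```

The name is now loose text inside employee, while job and salary remain ordinary child elements.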

Empty elements are also allowed. The next element could be read as "we have a job offering, but we're still looking for the right person":

<employee></employee>

The John Doe element shown earlier could also be rewritten using attributes:

<employee job="Software Analyst">
  John Doe
  <salary>2000</salary>
</employee>

Attributes in XML are the properties of an element. They describe its characteristics. You can use single quotes (' ') or double quotes (" ") to mark attribute values. As you can see from the previous series of examples, the same data can be stored as either child elements or attributes. So which method is better? Ideally, you should use attributes only to give extra information about data, that is, when you need metadata. For instance:

<employee id="31">
  <name>John Doe</name>
  <job>Software Analyst</job>
  <salary>2000</salary>
</employee>

The employee ID is not relevant in this case for the actual data. This ID, however, can be used by XML processing software to identify the corresponding employee faster. Such information is called metadata, that is, data about data.
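
Here is a sketch of how processing software might use such an ID. It assumes Python's standard xml.etree.ElementTree module and its limited XPath support; the second employee and the ID 32 are hypothetical values added for the example. Note that the second ID uses single quotes, which are equally valid.

```python
import xml.etree.ElementTree as ET

department = ET.fromstring("""
<department>
  <employee id="31"><name>John Doe</name></employee>
  <employee id='32'><name>Jane Fletcher</name></employee>
</department>
""")

# Look the employee up by the id attribute instead of scanning the names.
match = department.find("employee[@id='32']")

print(match.get("id"), match.findtext("name"))
```

The attribute acts as a handle for the lookup, while the actual data stays in the child elements.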

Using attributes instead of elements also has some disadvantages. The overall structure of the XML document becomes less clear and less expandable. Also, attributes cannot have multiple values and are more difficult to work with. Imagine if the information about an employee were stored like this:

<employee name="John Doe" job="Software Analyst" salary="2000"></employee>

This would defeat the whole purpose of an XML document: making information clearly structured and easy to exchange.

Naming Conventions – What Am I Allowed to Write?

At this point, you may ask yourself this legitimate question: "OK, if I get to define my own tags, can I really use anything for an element?" The answer is yes and no. You can use anything for an element name, since there are no reserved words in XML, BUT you must observe these simple naming rules:

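  • Names can contain any alphanumeric character, but must start with a letter or an underscore, not with a number or any other punctuation character.
  • Names cannot contain spaces.
  • Names must not start with the letters xml (in any combination of uppercase and lowercase), as such names are reserved and could easily be confused with an XML document definition.
  • Do not use ":" characters within element names, because the colon is reserved for namespaces.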

While it is OK to use "." and "-" characters within element names, I don't recommend it. The application processing the XML file might interpret these signs as operators. You can replace them with "_" characters if you need to use a longer name, as in the following example:

<employee>
  <first_name>John</first_name>
  <last_name>Doe</last_name>
  <job>Software Analyst</job>
  <salary>2000</salary>
</employee>
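
The naming rules can be approximated with a short check. This is a deliberately simplified sketch in Python: it only accepts ASCII letters, digits, "_", "." and "-", whereas real XML names also allow many non-English letters. The function name is_acceptable_name is hypothetical.

```python
import re

# Simplified assumption: ASCII only. A name starts with a letter or "_",
# followed by letters, digits, "_", "." or "-". No spaces, no colons.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_acceptable_name(name: str) -> bool:
    """Check a candidate element name against the simplified rules above."""
    if name.lower().startswith("xml"):
        return False  # names beginning with "xml" are reserved
    return _NAME_PATTERN.fullmatch(name) is not None


checks = {
    name: is_acceptable_name(name)
    for name in ["first_name", "2nd_job", "first name", "xmlData", "emp:name", "-job"]
}
for name, ok in checks.items():
    print(f"{name!r}: {'ok' if ok else 'not allowed'}")
```

Only first_name passes; each of the other candidates breaks one of the rules listed earlier.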

What About Content? Can Anything Be Used As The Content Of An Element?

Yes, basically anything, with the exception of the < and & characters, which must be written as the entities &lt; and &amp;. Non-English characters are also supported, but make sure you set the correct character set and that the client application processing the XML document supports non-English content. Also, white space inside the content will be preserved, unlike in HTML. This means you can write consecutive spaces and they will not be stripped or removed.
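
A quick way to check both points is to parse a small document and inspect the text that comes back. The sketch below assumes Python's standard xml.etree.ElementTree module; the note text and the name José Müller are invented for the example.

```python
import xml.etree.ElementTree as ET

# Consecutive spaces inside element content survive parsing unchanged.
note = ET.fromstring("<note>two  spaces,   then three</note>")
print(repr(note.text))

# The bytes are encoded as iso-8859-1, matching the character set named
# in the XML declaration, so the parser decodes the accented letters correctly.
raw = '<?xml version="1.0" encoding="iso-8859-1"?><name>José  Müller</name>'.encode("iso-8859-1")
name = ET.fromstring(raw)
print(ascii(name.text))
```

If the declared character set did not match the actual encoding of the file, the accented letters would either come out garbled or make the parser fail, depending on the two encodings involved.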

Like most markup and programming languages, XML allows comments. The syntax is the same as in HTML:

<!-- This employee deserves a raise. -->
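
Comments are meant for people reading the file, not for the application. As a small sketch, assuming Python's standard xml.etree.ElementTree module with its default parser settings, a comment does not show up among the parsed elements:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<employee>"
    "<!-- This employee deserves a raise. -->"
    "<name>John Doe</name>"
    "</employee>"
)

# With the default parser, the comment is skipped and only <name> remains.
child_tags = [child.tag for child in root]
print(child_tags)
```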

Earlier in this article, I mentioned RSS as an application of XML. The next example shows a simplified version of the Macromedia Developer Center feed, in order to illustrate the structure of XML documents and the syntax rules that were applied.

<?xml version="1.0" encoding="utf-8"?>
<rss version="1.0">
  <channel>
    <title>Macromedia Developer Center RSS Feed June 27, 2005</title>
    <link>http://www.macromedia.com/devnet/</link>
    <description>Macromedia Developer Center is your center for the tutorials, articles, and sample applications you need to master Macromedia products.</description>
    <item>
      <title>Creating a Dynamic Playlist for Progressive Flash Video</title>
      <link>http://www.macromedia.com/devnet/flash/articles/prog_download.html</link>
      <description>Learn how to create an XML-driven playlist for viewing progressive FLV files.</description>
    </item>
    <item>
      <title>Chatting Through IM Gateways in ColdFusion MX 7</title>
      <link>http://www.macromedia.com/devnet/coldfusion/articles/imgateway.html</link>
      <description>Learn the fundamentals of gateway apps as you build a sample chat application that monitors your ColdFusion server status</description>
    </item>
  </channel>
</rss>
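
Because the feed is well-formed XML, a few lines of code are enough to read it. The following is a sketch only, assuming Python's standard xml.etree.ElementTree module and a shortened copy of the feed above (descriptions omitted to keep it brief):

```python
import xml.etree.ElementTree as ET

feed_xml = b"""<?xml version="1.0" encoding="utf-8"?>
<rss version="1.0">
  <channel>
    <title>Macromedia Developer Center RSS Feed June 27, 2005</title>
    <link>http://www.macromedia.com/devnet/</link>
    <item>
      <title>Creating a Dynamic Playlist for Progressive Flash Video</title>
      <link>http://www.macromedia.com/devnet/flash/articles/prog_download.html</link>
    </item>
    <item>
      <title>Chatting Through IM Gateways in ColdFusion MX 7</title>
      <link>http://www.macromedia.com/devnet/coldfusion/articles/imgateway.html</link>
    </item>
  </channel>
</rss>
"""

rss = ET.fromstring(feed_xml)
channel = rss.find("channel")
channel_title = channel.findtext("title")

# Each <item> is one article: pull out its title and link.
items = [(item.findtext("title"), item.findtext("link")) for item in channel.findall("item")]

print(channel_title)
for title, link in items:
    print(f"- {title}: {link}")
```

The same loop works whether the channel holds two items or two hundred, which is what makes the format convenient for list-oriented data.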

Notice how RSS carries list-oriented information (items, or articles). Of course, RSS was designed to exchange large quantities of similar items, but for simplicity, I only included two items in the above example. The items make up a "channel" of information, which is characterized by a title, a link, and a description. Each item represents an article published on the Developer Center and has as child elements the title of the article, the link to it, and a short description. Additionally, elements can be added for the author, the publishing date, and the subject of each article, as you will see in my upcoming articles.

In this topic, I have covered what is necessary for XML to be well formed, meaning the rules that make it syntactically correct. When you become more familiar with XML, you should also understand XML document type definitions (DTD) and XML schemas, which are the rules you make for the XML-based languages you create and use. A DTD specifies the legal elements that can be used in an XML document. To learn more about validating your XML documents against a DTD, you can read this tutorial from W3Schools. An XML schema serves the same purpose as a DTD but is itself written in XML.

Where to Go from Here

This article was meant to introduce you to XML. It is by no means a comprehensive description of the technology or a complete guide on developing XML-based applications. Instead, I encourage you to complement your knowledge of XML and explore the different ways XML can be used for web development by visiting these additional resources:

Other Tutorials

  • The XML tutorial on W3Schools takes you from the basics of XML to advanced concepts, such as namespaces and using XML with databases.
  • The RSS Tutorial for Content Publishers and Webmasters explains what RSS is and shows you how to build your own feed. The tutorial also features a wealth of resources and tools for creating, validating and aggregating RSS feeds.
  • Mike Chambers shows you how to consume Macromedia's RSS feeds in this short article on the Developer Center.

If you have already started working with XML, this XML validator from W3Schools will let you know if your XML documents are well-formed.

My next article will introduce you to XSL, the language for formatting and presenting XML data. You can read it here. Also, take a look at my articles on using the new XML and XSL features in Dreamweaver 8 to learn how to consume an RSS feed in your site using Dreamweaver 8 and how to configure your server for server-side XSL transformations.