Chapter 11. Parsing and Generating XML
With
XML, you can effortlessly
exchange data between programs written in different languages,
running on different operating systems, located on computers anywhere
in the world. At least, that's what enthusiastic
computer programmers and salespeople who work for companies that sell
XML tools will tell you. They're sort of telling the
truth. XML does make it easier to trade structured information
between two programs. But you still have to do some work to herd your
data into the right structure. This chapter shows you how to do that
work with PHP.
XML is a markup language that looks a lot like HTML. An XML document
is plain text and contains tags delimited by <
and >. There are two big differences between
XML and HTML:
In one sense, XML gives you a lot more freedom than HTML. HTML has a
certain set of tags: the <a></a> tags
surround a link, the <ul></ul> tags
denote an unordered list, the
<li></li> tags indicate a list
element, and so on. An XML document, however, can use any tags you
want. Put <rating></rating> tags
around a movie rating,
<height></height> tags around
someone's height, or
<favoritecolor></favoritecolor> tags
around someone's favorite color—XML
doesn't care. Of course, whomever (or whatever
program) you're sharing the XML document with also
needs to agree to use and understand the same set of tags.
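To make the tag-choice freedom concrete, here is a minimal sketch that builds and then reads back a tiny document with made-up tags. It assumes PHP's SimpleXML extension is available. The <person> root element and the sample values are hypothetical and chosen only for illustration.

```php
<?php
// Build a document whose tag names are entirely our own invention.
// The <person> root and the values below are hypothetical sample data.
$person = new SimpleXMLElement('<person></person>');
$person->addChild('height', '170 cm');
$person->addChild('favoritecolor', 'green');

// Serialize the element tree to an XML string.
$xml = $person->asXML();
print $xml;

// A program that agrees on the same tag names can read the values back.
$parsed = simplexml_load_string($xml);
$color = (string) $parsed->favoritecolor;
print $color . "\n"; // prints "green"
```

The program on the receiving end works only because it looks for the same favoritecolor tag that the sending program wrote.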
While you get more freedom in the tag-choice department, XML clamps
down much harder than HTML when it comes to document structure. HTML
lets you play fast and loose with some opening and closing tags. The
HTML list in Example 11-1 renders just fine in a web
browser.
Example 11-1. HTML list that's not well-formed XML
<ul>
<li>Braised Sea Cucumber
<li>Baked Giblets with Salt
<li>Abalone with Marrow and Duck Feet
</ul>
As an XML document, though, Example 11-1 has a
problem. There are no closing </li> tags to
match up with the three opening <li> tags.
Every opened tag in an XML document must be closed, or the document is
not well-formed and an XML parser rejects it. The XML-friendly
way to write Example 11-1 is shown in Example 11-2.
Example 11-2. HTML list that is well-formed XML
<ul>
<li>Braised Sea Cucumber</li>
<li>Baked Giblets with Salt</li>
<li>Abalone with Marrow and Duck Feet</li>
</ul>
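You can see the difference between the two lists by handing each one to an XML parser. The following is a rough sketch, assuming the SimpleXML and libxml extensions are enabled. The is_well_formed() helper is a hypothetical name introduced here, not a built-in PHP function.

```php
<?php
// Hypothetical helper: returns true if $markup parses as well-formed XML.
function is_well_formed($markup) {
    // Collect parser errors internally instead of printing PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $result = simplexml_load_string($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    // simplexml_load_string() returns false when parsing fails.
    return $result !== false;
}

// The list with no closing </li> tags.
$unclosed = '<ul>
<li>Braised Sea Cucumber
<li>Baked Giblets with Salt
<li>Abalone with Marrow and Duck Feet
</ul>';

// The list with every <li> closed.
$closed = '<ul>
<li>Braised Sea Cucumber</li>
<li>Baked Giblets with Salt</li>
<li>Abalone with Marrow and Duck Feet</li>
</ul>';

var_dump(is_well_formed($unclosed)); // bool(false)
var_dump(is_well_formed($closed));   // bool(true)
```

A web browser forgives the missing closing tags, but the XML parser refuses the first list outright.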
There are lots of existing standard XML tag sets for describing
different kinds of information. XHTML, an XML-compatible version of HTML,
is described at http://www.w3.org/TR/xhtml11/.
Lots of web sites distribute lists of article headlines or other
syndicated data using an XML format called RSS (described at
http://blogs.law.harvard.edu/tech/rss). Many of
the examples in this chapter also involve RSS. You can get a
PHP-themed RSS feed from the Planet PHP web site, which collects many
PHP-related blogs. The Planet PHP RSS feed is available at
http://www.planet-php.net/rss/.
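As a first taste of what working with RSS looks like, the sketch below pulls the headlines out of a small feed. It assumes the SimpleXML extension and the common RSS 2.0 layout of item elements inside a channel element. The feed is a made-up sample embedded in the script, not the actual Planet PHP feed, and its titles and example.com links are invented.

```php
<?php
// A made-up RSS 2.0 document. All titles and links here are invented.
$rss = <<<XML
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample Feed</title>
    <link>http://www.example.com/</link>
    <description>Made-up headlines for illustration</description>
    <item>
      <title>First headline</title>
      <link>http://www.example.com/first</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://www.example.com/second</link>
    </item>
  </channel>
</rss>
XML;

// Parse the string into an object tree.
$feed = simplexml_load_string($rss);

// Each <item> inside <channel> is one syndicated headline.
$titles = array();
foreach ($feed->channel->item as $item) {
    $titles[] = (string) $item->title;
}

print implode("\n", $titles) . "\n";
```

Run as is, this prints the two sample headlines, one per line.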
To learn more about XML, check out Learning XML by Erik T. Ray
(O'Reilly). To learn more about XML in PHP, read Chapter 11 of
Programming PHP by Rasmus Lerdorf and Kevin Tatroe (O'Reilly),
Chapter 12 of PHP Cookbook by David Sklar and Adam Trachtenberg
(O'Reilly), or Chapter 5 of Upgrading to PHP 5 by Adam Trachtenberg
(O'Reilly).