Chapter 11. Parsing and Generating XML

With XML, you can effortlessly exchange data between programs written in different languages, running on different operating systems, located on computers anywhere in the world. At least, that's what enthusiastic computer programmers and salespeople who work for companies that sell XML tools will tell you. They're sort of telling the truth. XML does make it easier to trade structured information between two programs. But you still have to do some work to herd your data into the right structure. This chapter shows you how to do that work with PHP.

XML is a markup language that looks a lot like HTML. An XML document is plain text and contains tags delimited by < and >. There are two big differences between XML and HTML:

  • XML doesn't define a specific set of tags you must use.

  • XML is extremely picky about document structure.

In one sense, XML gives you a lot more freedom than HTML. HTML has a certain set of tags: the <a></a> tags surround a link, the <ul></ul> tags denote an unordered list, the <li></li> tags indicate a list element, and so on. An XML document, however, can use any tags you want. Put <rating></rating> tags around a movie rating, <height></height> tags around someone's height, or <favoritecolor></favoritecolor> tags around someone's favorite color—XML doesn't care. Of course, whomever (or whatever program) you're sharing the XML document with also needs to agree to use and understand the same set of tags.
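As a rough sketch of what that agreement means in practice, the following PHP reads a small document with SimpleXML. The <height> and <favoritecolor> tags come from the paragraph above; the <person> wrapper element and the values inside the tags are made up for illustration. The receiving code has to know the tag names ahead of time, and a tag it expects but the sender never wrote is simply absent.

```php
<?php
// Hypothetical document: the <person> wrapper and the values are assumptions.
$xml = '<person><height>180</height><favoritecolor>green</favoritecolor></person>';

// simplexml_load_string() turns the XML text into an object tree.
$person = simplexml_load_string($xml);

// Child elements are reached by the tag names both sides agreed on.
$color  = (string) $person->favoritecolor;
$height = (int) $person->height;
print "Favorite color: $color\n";
print "Height: $height\n";

// A tag the sender never included is not there to read.
$has_rating = isset($person->rating);
print $has_rating ? "Has a rating\n" : "No rating\n";
```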

While you get more freedom in the tag-choice department, XML clamps down much harder than HTML when it comes to document structure. HTML lets you play fast and loose with some opening and closing tags. The HTML list in Example 11-1 renders just fine in a web browser.

Example 11-1. HTML list that's not well-formed XML
<ul>
  <li>Braised Sea Cucumber
  <li>Baked Giblets with Salt
  <li>Abalone with Marrow and Duck Feet
</ul>

As an XML document, though, Example 11-1 has a problem. There are no closing </li> tags to match up with the three opening <li> tags. Every opened tag in an XML document must be closed, and a document that follows this and XML's other syntax rules is called well-formed. The XML-friendly way to write Example 11-1 is shown in Example 11-2.

Example 11-2. HTML list that is well-formed XML
<ul>
  <li>Braised Sea Cucumber</li>
  <li>Baked Giblets with Salt</li>
  <li>Abalone with Marrow and Duck Feet</li>
</ul>
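The difference between Example 11-1 and Example 11-2 can be checked mechanically. The following is a minimal sketch, assuming PHP's DOM and libxml extensions are available. The function name is_well_formed_xml() is made up for this illustration, and the two test strings are shortened versions of the lists above.

```php
<?php
// Hypothetical helper (not a built-in PHP function): reports whether
// a string can be parsed as XML.
function is_well_formed_xml($text)
{
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML($text); // false when the parser gives up

    // Discard the collected errors and restore the earlier setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

// Shortened version of Example 11-1: the <li> tags are never closed.
$unclosed = "<ul>\n  <li>Braised Sea Cucumber\n  <li>Baked Giblets with Salt\n</ul>";

// Shortened version of Example 11-2: every <li> has a matching </li>.
$closed = "<ul>\n  <li>Braised Sea Cucumber</li>\n  <li>Baked Giblets with Salt</li>\n</ul>";

var_dump(is_well_formed_xml($unclosed)); // bool(false)
var_dump(is_well_formed_xml($closed));   // bool(true)
```

The parser rejects the first string because the closing </ul> tag arrives while an <li> element is still open.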

A number of standard XML tag sets already exist for describing different kinds of information. XHTML, an XML-compatible version of HTML, is described at http://www.w3.org/TR/xhtml11/. Lots of web sites distribute lists of article headlines or other syndicated data using an XML format called RSS (described at http://blogs.law.harvard.edu/tech/rss). Many of the examples in this chapter use RSS. You can get a PHP-themed RSS feed from the Planet PHP web site, which collects many PHP-related blogs. The Planet PHP RSS feed is available at http://www.planet-php.net/rss/.
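To give a feel for what an RSS document looks like to a PHP program, here is a small sketch that pulls the headlines out of a feed with SimpleXML. The feed text is invented for this illustration (it is not real Planet PHP data), and it follows the usual RSS 2.0 layout of a <channel> element that holds a series of <item> elements.

```php
<?php
// A made-up RSS 2.0 document; the titles and example.com links are assumptions.
$rss = <<<XML
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://www.example.com/</link>
    <description>Sample headlines</description>
    <item>
      <title>First headline</title>
      <link>http://www.example.com/first</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://www.example.com/second</link>
    </item>
  </channel>
</rss>
XML;

$feed = simplexml_load_string($rss);

// Each <item> inside <channel> is one headline.
$titles = array();
foreach ($feed->channel->item as $item) {
    $titles[] = (string) $item->title;
}

print implode("\n", $titles) . "\n";
```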

To learn more about XML, check out Learning XML by Erik T. Ray (O'Reilly). To learn more about XML in PHP, read Chapter 11 of Programming PHP by Rasmus Lerdorf and Kevin Tatroe (O'Reilly), Chapter 12 of PHP Cookbook by David Sklar and Adam Trachtenberg (O'Reilly), or Chapter 5 of Upgrading to PHP 5 by Adam Trachtenberg (O'Reilly).
