Multi-Tier

Multi-tier development can be defined as a development process where applications are built from components that belong to different layers. Each layer provides services to the other layers, so each layer can abstract a particular aspect of the application. This abstraction leads to very maintainable applications, since changes to one layer can be made without modifying the others. PHP has many features that support a multi-tier approach to programming.

The common layers that can be identified with a web application are:

Content layer
Logic layer
Presentation layer

Each layer encapsulates a specific part of the application:

The Content Layer

The content layer consists of components that provide access routes to the application's data. All the program components built on this layer must go through it to access the application's data.

The most important entity in this layer is the data model. The data model defines how you store your data and how you should manipulate it. It is best to select a data modeling strategy for your content before you actually start adding to the content. The most common data modeling alternatives are:

Plain files model
Relational database model
XML model
Hybrid models

Whichever model you choose, it should be hidden from the high-level program code behind common objects. These objects are called Data Access Components or Data Access Objects (DAOs).
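
As a minimal sketch of this idea, assuming PHP 7.4 or later, a data access object could expose the poll data through a single interface so that the storage behind it can be swapped. The names PollDao and ArrayPollDao are hypothetical and exist only for illustration; the in-memory array stands in for a real file, database, or XML store.

```php
<?php
// Hypothetical interface: the only route the upper layers use to reach poll data.
interface PollDao
{
    /** Returns the question text for a poll, or null if the poll is unknown. */
    public function getQuestion(string $pollName): ?string;

    /** Returns an array mapping option text to vote count. */
    public function getOptions(string $pollName): array;

    /** Adds one vote to an option; returns false if the option does not exist. */
    public function vote(string $pollName, string $option): bool;
}

// Illustrative in-memory implementation; a file, SQL, or XML version
// would implement the same interface.
class ArrayPollDao implements PollDao
{
    private array $polls;

    public function __construct(array $polls)
    {
        $this->polls = $polls;
    }

    public function getQuestion(string $pollName): ?string
    {
        return $this->polls[$pollName]['question'] ?? null;
    }

    public function getOptions(string $pollName): array
    {
        return $this->polls[$pollName]['options'] ?? [];
    }

    public function vote(string $pollName, string $option): bool
    {
        if (!isset($this->polls[$pollName]['options'][$option])) {
            return false;
        }
        $this->polls[$pollName]['options'][$option]++;
        return true;
    }
}

$dao = new ArrayPollDao([
    'colors' => [
        'question' => 'Which is your favorite color?',
        'options'  => ['blue' => 6, 'green' => 7],
    ],
]);
$dao->vote('colors', 'blue');
print_r($dao->getOptions('colors'));
```

Code in the logic layer would depend only on the interface, so replacing the array with another data model should not require changes above the content layer.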

Plain Files Model

The idea behind this model is quite simple: you can use binary or text files to store data. The model consists of textual descriptions of the data structure and names to be used for the data files. There are some applications where this type of a data model would be highly useful. For example, massive search engines using indices or hashing with plain files run much faster and have easier maintenance than gigantic data stores.

This model can use a very wide range of file structures, from plain sequential files to b-trees, b* trees, b+ trees, hash tables, binomial heaps, union-find structures, and many others.

Let's consider building a web-based poll application. Here we have different polls with a number of options each, we have votes and might even have comments about polls. We could design a data model based on plain files like this:

    Polls file:
    Binary file, fixed length registers.
    Pollname: 40 bytes.
    Question: 250 bytes.

    Options file:
    Binary file, fixed length registers.
    Pollname: 40 bytes.
    Option: 80 bytes.
    Votes: 4 bytes.

    Comments file:
    Binary file, variable length registers.
    Structure: name length (1 byte) + name + comment length (2 bytes) + comment

    Current poll:
    Binary file, fixed length registers.
    Pollname: 40 bytes.

If we want to find which poll is the current poll, we can open the Current poll file, read 40 bytes, and get the pollname. We can then get the question for the poll by sequentially searching the Polls file for that pollname.

The same sequential search can be used to retrieve all the options. Voting only involves adding one to the vote count of the selected option. Comments can be added by appending data to the comments file. Deleting data, however, is a difficult task. First, the record has to be deleted logically by overwriting the pollname of a poll or an option with blanks. Then a packing process has to recreate the file, physically eliminating the logically deleted records. For more on advanced file structures see File Structures: An Object-Oriented Approach with C++ from Addison-Wesley.
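
The lookup and voting steps could be sketched roughly as follows, using PHP's pack() and unpack() to handle the fixed-length records. This is a sketch under stated assumptions: the function names findQuestion and addVoteInFile are hypothetical, text fields are assumed to be padded with spaces, and the 4-byte vote count is assumed to be an unsigned big-endian integer, since the layout above does not specify a byte order.

```php
<?php
const POLL_RECORD = 290;    // 40-byte pollname + 250-byte question
const OPTION_RECORD = 124;  // 40-byte pollname + 80-byte option + 4-byte votes

// Sequentially scans the Polls file for a pollname and returns its question.
function findQuestion(string $pollsFile, string $pollName): ?string
{
    $fh = fopen($pollsFile, 'rb');
    if ($fh === false) {
        return null;
    }
    $question = null;
    while (strlen($record = (string) fread($fh, POLL_RECORD)) === POLL_RECORD) {
        // 'A' fields are space-padded; unpack() strips the trailing padding.
        $fields = unpack('A40name/A250question', $record);
        if ($fields['name'] === $pollName) {
            $question = $fields['question'];
            break;
        }
    }
    fclose($fh);
    return $question;
}

// Finds the option record and rewrites its 4-byte vote count in place.
function addVoteInFile(string $optionsFile, string $pollName, string $option): bool
{
    $fh = fopen($optionsFile, 'r+b');
    if ($fh === false) {
        return false;
    }
    flock($fh, LOCK_EX);   // keep concurrent voters from overwriting each other
    $found = false;
    while (strlen($record = (string) fread($fh, OPTION_RECORD)) === OPTION_RECORD) {
        $fields = unpack('A40poll/A80option/Nvotes', $record);
        if ($fields['poll'] === $pollName && $fields['option'] === $option) {
            fseek($fh, -4, SEEK_CUR);   // back up over the votes field
            fwrite($fh, pack('N', $fields['votes'] + 1));
            $found = true;
            break;
        }
    }
    flock($fh, LOCK_UN);
    fclose($fh);
    return $found;
}

// Demonstration with temporary files standing in for the real data files.
$pollsFile = tempnam(sys_get_temp_dir(), 'polls');
$optionsFile = tempnam(sys_get_temp_dir(), 'opts');
file_put_contents($pollsFile, pack('A40A250', 'colors', 'Which is your favorite color?'));
file_put_contents(
    $optionsFile,
    pack('A40A80N', 'colors', 'blue', 6) . pack('A40A80N', 'colors', 'green', 7)
);
addVoteInFile($optionsFile, 'colors', 'green');
echo findQuestion($pollsFile, 'colors'), "\n";
unlink($pollsFile);
unlink($optionsFile);
```

Because every record has the same length, a vote changes four bytes in place and the file never grows. Deletion and packing are left out of this sketch.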

A data model based on plain files is really useful when dealing with large amounts of data, or when very specific searches must be run over a big volume of information and the queries cannot be adapted to common SQL or XML queries. For example, search engines such as Google (http://www.google.com/) use a data model based on plain files to maintain the repository of information that is collected from the Web. The extra time needed to design, code, and maintain the programs that manage this information is repaid by performance tuned to that specific application.

Using a plain file based data model in PHP is easy since PHP has plenty of file manipulation functions such as flock(), fwrite(), fgets(), fputs(), fopen(), fclose(), fseek(), ftell(), unlink(), file_exists(), filesize(), and so on. These have been explained in Chapter 9.

Relational Database Model

This is the most common approach for web applications today. SQL statements are used to insert, delete, and update the data. The model is defined by an entity relationship diagram (ERD), where you indicate the entities used and the relationship between them. Then you can convert the ERD into a table structure and use this structure to establish table relationships. Let's see how we can model our poll application using a relational database:

    Polls table:
    pollid integer(4)
    pollname varchar(40)
    question varchar(200)

    Options table:
    pollid integer(4)
    option varchar(80)
    votes integer(4)

    Comments table:
    pollid integer(4)
    comment text

This structure is similar to the plain file model except that tables replace files. We don't have to worry about the internal structure of the tables, since that is what a Database Management System (DBMS) is for. Nor do we have to write our own code to locate, update, or delete each record, since SQL statements do that for us.
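
To illustrate, the table structure above could be exercised roughly as follows. This is a minimal sketch that assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in for whichever DBMS the application really uses. The function names createPollSchema, addVoteSql, and deletePoll are hypothetical, and the option column is quoted because that word is reserved in some DBMSs.

```php
<?php
// Creates the three tables from the model above (SQLite types assumed).
function createPollSchema(PDO $db): void
{
    $db->exec('CREATE TABLE polls (pollid INTEGER PRIMARY KEY, pollname VARCHAR(40), question VARCHAR(200))');
    $db->exec('CREATE TABLE options (pollid INTEGER, "option" VARCHAR(80), votes INTEGER)');
    $db->exec('CREATE TABLE comments (pollid INTEGER, comment TEXT)');
}

// Voting is a single UPDATE; the DBMS locates and rewrites the row for us.
function addVoteSql(PDO $db, int $pollId, string $option): bool
{
    $stmt = $db->prepare('UPDATE options SET votes = votes + 1 WHERE pollid = ? AND "option" = ?');
    $stmt->execute([$pollId, $option]);
    return $stmt->rowCount() === 1;
}

// Deleting a poll needs no marking or packing pass, only DELETE statements.
function deletePoll(PDO $db, int $pollId): void
{
    foreach (['comments', 'options', 'polls'] as $table) {
        $db->prepare("DELETE FROM $table WHERE pollid = ?")->execute([$pollId]);
    }
}

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
createPollSchema($db);
$db->exec("INSERT INTO polls VALUES (1, 'colors', 'Which is your favorite color?')");
$db->exec("INSERT INTO options VALUES (1, 'blue', 6), (1, 'green', 7)");
addVoteSql($db, 1, 'green');
foreach ($db->query('SELECT "option", votes FROM options WHERE pollid = 1') as $row) {
    echo $row['option'], ': ', $row['votes'], "\n";
}
```

Compared with the plain files version, there is no record layout or seek arithmetic in the application code; the prepared statements carry the whole operation.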

DBMS gives these common advantages:

Higher integrity of data (not guaranteed with files)
Higher consistency of data using multiple access
Higher security
Common query language
Different views using same structures for multiple uses
Independence of file structures
No redundancy of information
Relational mapping with OO
Less hard drive space from loss-less joins

And these disadvantages:

DBMSs are slower than files
DBMSs require additional software
Commercial DBMSs may be expensive

PHP is great for database programming. It supports most of the DBMSs available today, such as Oracle, MySQL, PostgreSQL, Sybase, and DB2 (see Chapters 17-19 for more details). A well-established strategy is to build or use a database abstraction class that handles all the regular database operations. With such a class you can change the database without greatly changing code that was written for a particular platform.

Building a true data layer means that a change of storage requires no code changes in the application logic or the presentation layer, which increases code maintainability. This kind of abstraction is also the best way to stay fully scalable in terms of data storage.

Important

It is best to consider PEAR (PHP Extension and Application Repository) or PHPLib to find a database abstraction class for your DBMS. PEAR is an effort from PHP developers to build a common repository for reusable pieces of code, similar to Perl's CPAN. You can obtain more information about what PEAR is and how to write or use PEAR code from http://pear.php.net/. However, we will look at writing our own custom abstraction class in Chapter 17. It is not advisable to call the native PHP database functions directly from application code.

XML Model

XML (Extensible Markup Language), a recognized standard from the W3C (World Wide Web Consortium), is an excellent data modeling language. Data stored as XML is modeled by a set of DTDs or schemas that define the structure of XML documents for a particular industry or task.

Today there are hundreds of applications and systems built using XML, both for interchanging and storing data. Since XML is a great standard for converting data from one format to another, it is fast becoming a cornerstone for application interaction. Even if applications do not have to interact with other systems, the use of XML can standardize internal structures, simplifying the development process.

Let's consider this single XML file for the polls application:

    <pollsapp>

      <poll>
        <question>Which is your favorite color?</question>

        <option>
          <name>blue</name>
          <votes>6</votes>
        </option>

        <option>
          <name>green</name>
          <votes>7</votes>
        </option>

        <comment>I really like blue</comment>
      </poll>
    </pollsapp>

As you can see here, the poll, option, and votes elements can all be stored in a single XML file.

Let's assume we did not consider that a user might want to add comments to polls. To include the comments in the relational model, we would have to create a table for the comments. In XML we just add comment elements to the XML file.
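
A rough sketch of that step, assuming the SimpleXML extension (enabled by default in PHP) is available, might look like this. The function names addComment and totalVotes are hypothetical, and the document is kept in a string here rather than in a file.

```php
<?php
$xml = <<<XML
<pollsapp>
  <poll>
    <question>Which is your favorite color?</question>
    <option><name>blue</name><votes>6</votes></option>
    <option><name>green</name><votes>7</votes></option>
  </poll>
</pollsapp>
XML;

// Appends a comment element to the first poll and returns the new document.
function addComment(string $xml, string $comment): string
{
    $doc = new SimpleXMLElement($xml);
    // addChild() treats its value as XML text that may contain entity
    // references, so special characters are escaped first.
    $doc->poll[0]->addChild('comment', htmlspecialchars($comment, ENT_XML1));
    return $doc->asXML();
}

// Sums the votes of every option in the first poll.
function totalVotes(string $xml): int
{
    $total = 0;
    foreach ((new SimpleXMLElement($xml))->poll[0]->option as $option) {
        $total += (int) $option->votes;
    }
    return $total;
}

$updated = addComment($xml, 'I really like blue');
echo $updated;
echo 'Total votes: ', totalVotes($updated), "\n";
```

No structure had to be declared in advance for the new element. Code that reads only the option elements, such as totalVotes(), keeps working unchanged.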

The really big advantages of XML over a relational model are the following:

SQL has a proprietary model for metadata while XML uses an open standard.
SQL is more or less a standard, but the way in which the data is internally stored is proprietary to each DBMS.
You cannot take a table and understand content without the DBMS interpretation and reporting facilities.
XML is designed for data interchange while the relational model has to bolt on transform tools.
You cannot send a proprietary SQL table in open code form. With XML, you can send simple XML files. Any transformation problems are also addressed by the open standard of XSLT.

Finally, to implement an XML-based data model in PHP you must first define how, where, and why to store the data stream. You could use plain files somewhere in the file system or you could use an XML repository solution, like Ozone or dbXML (http://www.dbxml.org/).

Chapter 21 has more in-depth coverage of PHP's XML APIs. There we cover reading, writing, and transforming.

Hybrid Model

The hybrid data model combines two or more different data modeling strategies. For example, you can have a relational model and a set of XML DTDs and files co-existing in the same application.

Hybrid models add complexity to the content layer since there will be more than one interface to store and retrieve data. Although hybrid models demand high levels of design skill at the planning stage, they are the most flexible, scalable, and useful in today's e-business world.

The Logic Layer

The logic layer is where you find all the intelligence of an application. In this layer you manipulate data pulled from the content layer and prepare it for presentation. Data manipulation such as calculations, transformations, statistics, security checks, and audit trails all belongs to the logic layer. User tracking systems, logging systems, caching systems, and many others are found at the heart of this layer.

The most important consideration about the logic layer in PHP is to design it in a modular way. You can design separate classes/functions for the different business rules or functions that the application demands.

In our polls example, we could create a Polls class that encapsulates all the methods needed for manipulating polls, such as getting the current poll, adding a poll, voting, getting the options of a poll, and so on. Then, if we decide to add forums to our site or application, we can design and create a new class or module that has no dependence on the classes that already exist.
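
One possible shape for such a class is sketched below, assuming PHP 7.4 or later. The method names and the two business rules (a poll needs at least two options, and the newest poll becomes the current one) are assumptions made for illustration. The private array stands in for calls into the content layer.

```php
<?php
// Hypothetical logic-layer class for the polls example.
class Polls
{
    private array $polls = [];
    private ?string $current = null;

    public function addPoll(string $name, string $question, array $options): void
    {
        if (count($options) < 2) {
            throw new InvalidArgumentException('A poll needs at least two options.');
        }
        $this->polls[$name] = [
            'question' => $question,
            'options'  => array_fill_keys($options, 0),
        ];
        $this->current = $name;   // assumed rule: the newest poll is current
    }

    public function getCurrentPoll(): ?string
    {
        return $this->current;
    }

    public function getOptions(string $name): array
    {
        return array_keys($this->polls[$name]['options'] ?? []);
    }

    public function vote(string $name, string $option): bool
    {
        if (!isset($this->polls[$name]['options'][$option])) {
            return false;
        }
        $this->polls[$name]['options'][$option]++;
        return true;
    }

    /** Returns option => percentage of all votes, rounded to one decimal. */
    public function getResults(string $name): array
    {
        $votes = $this->polls[$name]['options'] ?? [];
        $total = array_sum($votes);
        $results = [];
        foreach ($votes as $option => $count) {
            $results[$option] = $total > 0 ? round(100 * $count / $total, 1) : 0.0;
        }
        return $results;
    }
}

$polls = new Polls();
$polls->addPoll('colors', 'Which is your favorite color?', ['blue', 'green']);
$polls->vote('colors', 'blue');
print_r($polls->getResults($polls->getCurrentPoll()));
```

The percentage calculation is the kind of work that belongs here: it is neither storage nor layout, and the presentation layer receives values that are ready to display.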

The Presentation Layer

In the presentation layer you add design and layout elements to the content prepared in the logic layer. This is where you generate HTML using CSS, Flash, images, and whatever else design experts want to use to make the application attractive.

Also, client-side code or presentation layer plug-ins give the presentation layer increasing power to share the computing load of the application proper.

The Explosion of Web Devices

In the beginning, only browsers accessed the web, and they were limited in their functionality. Now we have web-enabled devices such as cell phones, pagers, e-mail clients, PDAs, handhelds, POS terminals, data capturing devices, and more. In the near future even small appliances such as microwave ovens and freezers may want to access the web to get data or publish information about their state.

Different devices require different presentation languages. We can have HTML, XHTML, XML vocabularies, WML, and other presentation languages. Even when two devices accept the same presentation language, they may still require different kinds of content: we cannot compare the screen of a modern PDA with a text display on a fridge. So a web application accessed by a dozen different devices will have to produce a range of presentation languages and formats.
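
As a small illustration, the same prepared poll data could be handed to one renderer per presentation language. This is only a sketch: the function names and the device labels 'browser' and 'wap' are hypothetical, and the WML output is reduced to a single bare card.

```php
<?php
// Renders the poll as an HTML fragment for a desktop browser.
function renderHtml(string $question, array $results): string
{
    $out = '<h1>' . htmlspecialchars($question) . "</h1>\n<ul>\n";
    foreach ($results as $option => $votes) {
        $out .= '  <li>' . htmlspecialchars((string) $option) . ': ' . (int) $votes . "</li>\n";
    }
    return $out . "</ul>\n";
}

// Renders the same data as a minimal WML card for a small display.
function renderWml(string $question, array $results): string
{
    $out = "<wml>\n  <card id=\"poll\" title=\"Poll\">\n    <p>"
         . htmlspecialchars($question, ENT_XML1) . "</p>\n";
    foreach ($results as $option => $votes) {
        $out .= '    <p>' . htmlspecialchars((string) $option, ENT_XML1)
              . ': ' . (int) $votes . "</p>\n";
    }
    return $out . "  </card>\n</wml>\n";
}

// Picks a renderer by device type; the type names are illustrative only.
function render(string $device, string $question, array $results): string
{
    return $device === 'wap'
        ? renderWml($question, $results)
        : renderHtml($question, $results);
}

echo render('browser', 'Which is your favorite color?', ['blue' => 6, 'green' => 7]);
echo render('wap', 'Which is your favorite color?', ['blue' => 6, 'green' => 7]);
```

Both renderers start from the data itself and neither one is derived from the other's markup.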

Don't forget machines: programs also access the web to collect information. We might have to generate a presentation language for these programs, usually an XML vocabulary or something similar. This will enable the creation of web services, where organizations provide services and use those of others to create complex distributed web applications.

While the classic multi-tiered architecture is very useful for separating the different layers, it is certainly oriented towards HTML-based web applications and sites. We can change the presentation language, but this usually takes a lot of effort since we have to recode many functions, some of which are almost impossible to express in the new presentation language.

Sometimes there will be no mapping between an HTML function and the presentation language we want to use. If this happens then there's something wrong: we are using HTML as our base language. We have built an HTML class and are trying to adapt it to other presentation languages. We tend to force every presentation language to be mapped to HTML and that's impossible. The best solution to the problem of dealing with a lot of different presentation languages seems to be the use of XML.