Section 9.3.  Handling Data

9.3. Handling Data

Handling data coming in from HTML pages is by far the most common task in PHP, and many might say it deserves a whole chapter to itself! In this section, we will be looking at how variables get into your scripts, and also at how you can distinguish between where those variables come from.

9.3.1. register_globals

Prior to PHP 4.1, variables submitted from external sources (such as session variables, cookies, and form fields) were automatically converted to variables inside PHP, as long as register_globals was enabled in the php.ini file, which it was by default. These variables were also accessible through the arrays $HTTP_POST_VARS, $HTTP_COOKIE_VARS, $HTTP_SESSION_VARS, etc.

Imagine the following situation: you have a secure site, where members are identified by logon names, such as "Administrator," "Joe," and "Peter." The pages on this site track the username by way of the variable UserID, which is stored in a cookie on the computer when the user authenticates to the site. With register_globals enabled, $UserID is available as a variable to all scripts on your site, which, while helpful, is a security hole.

Here is a URL that demonstrates the problem: http://www.yoursite.com/secure.php?UserID=root. When register_globals is enabled, all variables sent by GET and POST are also converted to variables, and are indistinguishable from variables from other sources. The result of this is that a hacker could, by using the URL above, impersonate someone else, like root!

This was clearly a critical situation, and it was worryingly common. As such, the decision was made to recommend that all users disable register_globals. In PHP 4.2, this was pushed further by having the default value of register_globals changed to off, and this is how it has remained in PHP 5. Register_globals is not likely to be changed back to on for its default value, which means that it is best to learn the proper way of doing things: using the superglobals.
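As a minimal sketch of why the superglobals help, the example below reads UserID from cookie data only, so a UserID placed in the query string is never consulted. The helper name current_user_id( ) and the sample arrays are assumptions made for illustration; in a real script you would pass $_COOKIE.

```php
<?php
// Hypothetical helper: look for UserID in the cookie data and nowhere else.
function current_user_id(array $cookies)
{
    return isset($cookies['UserID']) ? $cookies['UserID'] : null;
}

// Simulated request for secure.php?UserID=root from a visitor
// whose cookie identifies them as "Joe".
$get    = array('UserID' => 'root');
$cookie = array('UserID' => 'Joe');

// Only the cookie array is passed in, so the GET value cannot leak through.
print current_user_id($cookie) . "\n";
```

Because the source of the value is named explicitly, the script can tell a cookie value from a GET value, which is exactly what register_globals made impossible.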

9.3.2. Working Around register_globals

In order to provide a middle ground for users who did not want to use the superglobals but also did not want to enable register_globals, the function import_request_variables( ) was introduced. This copies variables from the superglobal arrays into variables in their own right, and takes two parameters: a special string specifying which types of variables to convert, and the prefix that should be added to them.

The special string can contain "g" for GET variables, "p" for POST, "c" for cookies, or any combination of them. The prefix works in almost the same way as the prefix to extract( ) does, except that it does not add an underscore, which means that scripts relying on older functionality can use import_request_variables( ) to get back to the old manner of working. As with the prefix used in extract( ), the string is prepended to the name of each variable created to ensure there is no naming clash with existing data.

Here are some examples:

    import_request_variables("p", "post");
    import_request_variables("gp", "gp");
    import_request_variables("cg", "cg");

Note that the order of the letters in the first parameter matters: in gp, for example, any POST variables that have the same names as GET variables will overwrite the GET variables. In other words, the GET variables are imported first, then the POST variables. If we had used pg, it would have been POST and then GET, so the ordering is crucial.
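The ordering rule can be sketched with a small stand-in function. To be clear, import_sketch( ) below is hypothetical and is not the built-in function: it returns the would-be variables as an array rather than creating them, and it takes its input arrays as parameters so the example runs on its own.

```php
<?php
// Hypothetical stand-in for illustration only; NOT import_request_variables().
// Returns prefixed names and values instead of creating real variables.
function import_sketch($types, $prefix, array $get, array $post, array $cookie)
{
    $sources = array('g' => $get, 'p' => $post, 'c' => $cookie);
    $result  = array();

    // Walk the letters in the order given; later sources overwrite earlier ones.
    foreach (str_split($types) as $letter) {
        if (!isset($sources[$letter])) {
            continue;
        }
        foreach ($sources[$letter] as $name => $value) {
            $result[$prefix . $name] = $value;
        }
    }
    return $result;
}

// The same field name arrives by both GET and POST.
$get  = array('Name' => 'from GET');
$post = array('Name' => 'from POST');

$gp = import_sketch('gp', 'var', $get, $post, array());
$pg = import_sketch('pg', 'var', $get, $post, array());

print $gp['varName'] . "\n";   // POST was imported last, so it wins
print $pg['varName'] . "\n";   // GET was imported last, so it wins
```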

Once import_request_variables( ) is used, you can use the new variables immediately, like this:

    print $_GET['Name'];
    import_request_variables("g", "var");
    print $varName;

If you don't specify a prefix, or if the prefix is empty, you will get a notice to warn you of the security issue.

It is strongly recommended that you avoid using import_request_variables( ) unless you cannot live without it. Importing external data into the global variable namespace is dangerous; the superglobal arrays are much safer.


9.3.3. Magic Quotes

PHP has a special php.ini setting called magic_quotes_gpc, which means that PHP will automatically place backslashes (\) before all quotes and other backslashes for GET, POST, and COOKIE data (GPC), the equivalent of running the addslashes( ) function. These slashes are required to make user input safe for database entry. Without them, strings are likely to be interpreted incorrectly.

This functionality is usually turned on by default, which means that all GPC data coming into your script is safe for database entry. But it also means that if your data is not destined for a database, you need to disable magic quotes in your php.ini file.
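To see what that escaping looks like, here is a short illustration using addslashes( ) and its counterpart stripslashes( ). The sample string is made up, and both calls are ordinary string functions, so the example behaves the same whatever magic_quotes_gpc is set to.

```php
<?php
// A made-up value containing both kinds of quote.
$raw = "O'Reilly said \"hi\"";

// The kind of escaping magic quotes applies to incoming GPC data.
$escaped = addslashes($raw);
print $escaped . "\n";

// Undo it when the value is headed for the page rather than a database.
$restored = stripslashes($escaped);
print $restored . "\n";
```

If the escaped form were printed straight into a page, visitors would see the stray backslashes, which is why data not destined for a database needs the slashes removed or never added.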

I prefer to turn off magic quotes and handle the slashes myself, as this leads to much more predictable and easily understood behavior. Changing the magic quotes setting from within a script at runtime will have no effect on that script, as the variables are already parsed and ready for use by the time your code is executed. So, to disable magic quotes, set magic_quotes_gpc to off in your php.ini file.


9.3.4. Handling Our Form

You now know enough to be able to program a script to handle the advanced form presented previously. Our variables will be coming in using the GET method. In the real world, you would use POST because it is possible that users will submit large quantities of data in the "Life story" field; however, using GET here lets you see how it all works. Because we're using the GET method, we should be reading our variables from $_GET.

The first two fields sent are Name and Password, which will both contain string data. Remember that the password HTML form element transmits its data as plain text, which means that both Name and Password can be handled the same way. As they are coming in via GET, the values entered by our visitors will be in $_GET['Name'] and $_GET['Password']. Note that the cases have been preserved from the form exactly and that, as per usual, PHP considers $_GET['name'] to be different from $_GET['Name'].

The next input is the select list box Age, which will return a string value: either "Under 16", "16-30", "31-50", or "51-80". From the PHP point of view, this is no different from handling input from a text box other than that we can, to a certain extent, have an idea about what the values will be. That is, under normal circumstances, we will always know what the values will be, as our users have to pick one option from a list we present. However, it takes only a little knowledge to "hack" the page so that users can input what they like. Just remember the golden rule: "Never trust user input."
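One way to apply that rule to the Age field is to compare the submitted value against the list of options the form offers. This is only a sketch: the helper name is_valid_age( ) is an assumption, and in a real script the value would come from $_GET['Age'] after an isset( ) check.

```php
<?php
// The four options offered by the Age list in the form.
$allowed_ages = array('Under 16', '16-30', '31-50', '51-80');

// Hypothetical helper: true only for a string the form actually offers.
function is_valid_age($value, array $allowed)
{
    // The third argument makes in_array() compare strictly (type and value).
    return is_string($value) && in_array($value, $allowed, true);
}

$submitted = '16-30';   // stands in for $_GET['Age']
print is_valid_age($submitted, $allowed_ages) ? "OK\n" : "Rejected\n";
```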

The Story text area element submits data in the same way as a normal text box does, with the difference that it can contain new line characters (\n). The chances are that you want HTML line breaks (the <br /> tag) as well as the \n line breaks, so you should use nl2br( ), like this:

    $_GET['Story'] = nl2br($_GET['Story']);
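As a quick illustration with a made-up two-line string, nl2br( ) inserts the <br /> tag before each new line character and leaves the new line itself in place:

```php
<?php
// Sample text standing in for the submitted Story field.
$story = "Line one\nLine two";

// nl2br() adds <br /> before each newline; the newline is kept.
$html = nl2br($story);
print $html;
```

Keeping the original \n means the generated HTML source stays readable, while the browser breaks the line at the <br /> tag.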

Next we get to our radio buttons, FaveSport. As radio buttons can only submit one value, this one value will be available as a normal variable in $_GET['FaveSport']. This is in contrast to the checkbox form elements that follow: they have the name Languages[ ], which will make PHP convert them into a single array of values, available in $_GET['Languages'].
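The difference between the two can be tried out without a browser by using parse_str( ), which parses a string as though it were the query string of a URL. The query string below is a made-up sample of what the form might send, and $fields stands in for $_GET.

```php
<?php
// Made-up query string: one radio button value and two checked boxes.
$query = 'FaveSport=Tennis&Languages[]=PHP&Languages[]=Perl';

// Parse it the way PHP parses an incoming query string.
parse_str($query, $fields);

print $fields['FaveSport'] . "\n";                 // a plain string
print implode(', ', $fields['Languages']) . "\n";  // an array, one item per box
```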

We can put the whole script together using the above information, plus the other techniques we've covered in previous chapters. This script parses the form properly:

    $_GET['Languages'] = implode(', ', $_GET['Languages']);
    $_GET['Story'] = str_replace("\n", "<br />", $_GET['Story']);

    print "Your name: {$_GET['Name']}<br />";
    print "Your password: {$_GET['Password']}<br />";
    print "Your age: {$_GET['Age']}<br /><br />";
    print "Your life story:<br />{$_GET['Story']}<br /><br />";
    print "Your favorite sport: {$_GET['FaveSport']}<br />";
    print "Languages you chose: {$_GET['Languages']}<br />";

The entire script to handle the HTML form we created is just eight lines long, of which six are just print statements reading from the $_GET array. The first two lines aren't anything special either: line one converts the Languages array created from the checkboxes into one string using implode( ), and line two converts the new line characters in the Story text area into HTML line breaks.

However, the script above contains a bug. What happens if our users don't check any boxes for languages? The answer is that browsers will not send any languages information, which means that $_GET['Languages'] will not be set, which in turn means that the first line in the script will cause an error. The solution is simple: use if (isset($_GET['Languages'])) to check whether there is a value set. If there is, use implode( ) to make it a string, and if not, put a dummy text string in there like, "You didn't select any languages!" The final output of this form is shown in Figure 9-5.
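A minimal sketch of that fix follows. The helper name format_languages( ) is an assumption, as is the extra is_array( ) check, which guards against a hand-edited URL sending Languages as a single value; in the real script you would pass $_GET.

```php
<?php
// Hypothetical helper wrapping the isset() check described above.
function format_languages(array $input)
{
    // Only implode when the checkboxes actually produced an array.
    if (isset($input['Languages']) && is_array($input['Languages'])) {
        return implode(', ', $input['Languages']);
    }
    return "You didn't select any languages!";
}

// With two boxes checked, and with none checked.
print format_languages(array('Languages' => array('PHP', 'Perl'))) . "\n";
print format_languages(array()) . "\n";
```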

Figure 9-5. The finished form handler; note the variables being passed in the URL bar because we used GET