Variables


Programming languages come in two basic flavors when it comes to how variables are declared:

  • Statically typed: Statically typed languages include C++ and Java, where a variable is assigned a type (for example, int or String) and that type is fixed at compile time.

  • Dynamically typed: Dynamically typed languages include PHP, Perl, Python, and VBScript, where types are automatically inferred at runtime. If you use this:

    $variable = 0;
    

    PHP will automatically create it as an integer type.
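As a quick illustration of runtime inference, the following minimal sketch uses PHP's built-in gettype() to report the type the engine has picked after each assignment. The variable name follows the example above; the assigned values are arbitrary.

```php
<?php
// The same variable takes on the type of whatever value it currently holds.
$variable = 0;
$first = gettype($variable);    // "integer"

$variable = 1.5;
$second = gettype($variable);   // "double"

$variable = "zero";
$third = gettype($variable);    // "string"

echo "$first, $second, $third\n";
```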

Furthermore, there are two additional criteria for how types are enforced or converted between:

  • Strongly typed: In a strongly typed language, if an expression receives an argument of the wrong type, an error is generated. Statically typed languages are, as a rule, strongly typed (although many allow one type to be cast, or forced to be interpreted, as another type). Some dynamically typed languages, such as Python and Ruby, have strong typing; in them, exceptions are thrown if variables are used in an incorrect context.

  • Weakly typed: A weakly typed language does not necessarily enforce types. This is usually accompanied by autoconversion of variables to appropriate types. For instance, in this:

    $string = "The value of \$variable is $variable.";
    

    $variable (which was autocast into an integer when it was first set) is now autoconverted into a string type so that it can be used to create $string.
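The autoconversion described above works in both directions. The sketch below repeats the interpolation example and adds two arbitrary illustrations: a numeric string used in arithmetic and integers used in concatenation.

```php
<?php
$variable = 0;                                      // integer
$string = "The value of \$variable is $variable.";  // integer converted to "0"

$sum = "5" + 3;             // numeric string converted to an integer: 8

$five = 5;
$three = 3;
$joined = $five . $three;   // integers converted to strings: "53"

echo $string, "\n";
```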

All these typing strategies have their relative benefits and drawbacks. Static typing allows you to enforce a certain level of data validation at compile time, so those type checks do not have to be repeated while the program runs. For this reason, dynamically typed languages tend to be slower than statically typed languages. Dynamic typing is, of course, more flexible. Most interpreted languages choose to go with dynamic typing because it fits their flexibility.

Strong typing similarly allows you a good amount of built-in data validation, in this case at runtime. Weak typing provides additional flexibility by allowing variables to autoconvert between types as necessary. The interpreted languages are pretty well split on strong typing versus weak typing. Python and Ruby (both of which bill themselves as general-purpose "enterprise" languages) implement strong typing, whereas Perl, PHP, and JavaScript implement weak typing.

PHP is both dynamically typed and weakly typed. One slight exception is the optional type checking for argument types in functions. For example, this:

function foo(User $user) { }

and this:

function bar(Exception $exception) { }

require that the argument be a User or an Exception object (or an instance of one of its descendants or implementers), respectively.
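A minimal sketch of this check in action, assuming PHP 7 or later, where a mismatched argument raises a TypeError (PHP 5 raised a catchable fatal error instead). The empty User class and the describe() function are hypothetical names used only for illustration.

```php
<?php
class User {}

// Hypothetical function; only the User type declaration matters here.
function describe(User $user) {
    return get_class($user);
}

$ok = describe(new User());      // accepted: the argument is a User

try {
    describe("not a user");      // rejected: a string is not a User
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

echo $ok, " ", $rejected ? "rejected" : "accepted", "\n";
```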

To fully understand types in PHP, you need to look under the hood at the data structures used in the engine. In PHP, all variables are zvals, represented by the following C structure:

struct _zval_struct {
  /* Variable information */
  zvalue_value value;     /* value */
  zend_uint refcount;
  zend_uchar type;        /* active type */
  zend_uchar is_ref;
};

and its complementary data container:

typedef union _zvalue_value {
  long lval;              /* long value */
  double dval;            /* double value */
  struct {
    char *val;
    int len;
  } str;                  /* string value */
  HashTable *ht;          /* hashtable value */
  zend_object_value obj;  /* handle to an object */
} zvalue_value;

The zval consists of its own value (which we'll get to in a moment), a refcount, a type, and the flag is_ref.

A zval's refcount is the reference counter for the value associated with that variable. When you instantiate a new variable, like this, it is created with a reference count of 1:

$variable = 'foo';

If you create a copy of $variable, the zval for its value has its reference count incremented. So after you perform the following, the zval for 'foo' has a reference count of 2:

$variable_copy = $variable;

If you then change $variable, it will be associated with a new zval with a reference count of 1, and the original string 'foo' will have its reference count decremented to 1, as follows:

$variable = 'bar';

When a variable falls out of scope (say it's defined in a function and that function returns), or when the variable is destroyed, its zval's reference count is decremented by one. When a zval's refcount reaches 0, it is picked up by the garbage-collection system and its contents are freed.
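The bookkeeping described above can be traced with a toy model written in PHP itself. This is only a sketch of the counting rules as stated in the text, not the engine's implementation: ToyZval, ToyScope, and all of their methods are hypothetical names invented for the illustration.

```php
<?php
// Toy model only: these classes are hypothetical, not engine structures.
class ToyZval {
    public $value;
    public $refcount = 1;            // a new value starts with one owner
    public function __construct($value) { $this->value = $value; }
}

class ToyScope {
    private $symbols = [];           // variable name => ToyZval
    public $freed = [];              // values whose refcount reached 0

    // Models: $name = <new value>;
    public function assign($name, $value) {
        $this->release($name);
        $this->symbols[$name] = new ToyZval($value);
    }

    // Models: $dst = $src;  (both names now share one zval)
    public function copy($dst, $src) {
        $zval = $this->symbols[$src];
        $zval->refcount++;
        $this->release($dst);
        $this->symbols[$dst] = $zval;
    }

    // Models: unset($name), or $name falling out of scope
    public function release($name) {
        if (!isset($this->symbols[$name])) {
            return;
        }
        $zval = $this->symbols[$name];
        unset($this->symbols[$name]);
        if (--$zval->refcount === 0) {
            $this->freed[] = $zval->value;
        }
    }

    public function refcount($name) {
        return $this->symbols[$name]->refcount;
    }
}

$scope = new ToyScope();
$scope->assign('variable', 'foo');                 // 'foo' has refcount 1
$scope->copy('variable_copy', 'variable');         // 'foo' has refcount 2
$afterCopy = $scope->refcount('variable_copy');
$scope->assign('variable', 'bar');                 // 'foo' drops to 1, 'bar' starts at 1
$afterChange = $scope->refcount('variable_copy');
$scope->release('variable_copy');                  // 'foo' reaches 0 and is freed

echo "$afterCopy $afterChange ", implode(',', $scope->freed), "\n";
```

The model follows the same three steps as the text: creation, copy, and reassignment, followed by the final release that drops the count to 0.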

The zval type is especially interesting. The fact that PHP is a weakly typed language does not mean that variables do not have types. The type attribute of the zval specifies what the current type of the zval is; this indicates which part of the zvalue_value union should be looked at for its value.

Finally, is_ref indicates whether this zval actually holds data or is simply a reference to another zval that holds data.
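At the language level, references are created with the & operator. The sketch below shows only the visible difference between a reference and an ordinary copy; it does not inspect is_ref itself, and the variable names are arbitrary.

```php
<?php
$original = 'foo';
$alias = &$original;     // $alias refers to the same data as $original
$alias = 'bar';          // writing through the reference changes $original too

$copy = $original;       // an ordinary copy, not a reference
$copy = 'baz';           // changing the copy leaves $original untouched

echo "$original $alias $copy\n";
```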

The zvalue_value value is where the data for a zval is actually stored. This is a union of all the possible base types for a variable in PHP: long integers, doubles, strings, hashtables (arrays), and object handles. A union in C is a composite data type whose members share the same storage, so it takes only as much space as its largest member and holds one of its possible types at a time. Practically, this means that the data stored for a zval is either a numeric representation, a string representation, an array representation, or an object representation, but never more than one at a time. This is in contrast to a language such as Perl, where all these potential representations can coexist (this is how in Perl you can have a variable that has entirely different representations when accessed as a string than when accessed as a number).

When you switch types in PHP (which is almost never done explicitly and almost always implicitly, when a usage demands that a zval be in a different representation than it currently is), zvalue_value is converted into the required format. This is why you get behavior like this:

$a = "00";
$a += 0;
echo $a;

which prints 0 and not 00 because the extra characters are silently discarded when $a is converted to an integer on the second line.
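The change of representation can be observed with gettype(). The sketch below repeats the example and adds an explicit cast round trip ("007" is an arbitrary value) to show that the original string form is not kept once the value has been converted.

```php
<?php
$a = "00";
$before = gettype($a);   // "string"
$a += 0;                 // arithmetic forces a numeric representation
$after = gettype($a);    // "integer"

// Explicit casts perform the same kind of conversion on demand.
$roundTrip = (string)(int)"007";   // "7": the leading zeros are lost

echo "$before -> $after, $roundTrip\n";
```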

Variable types are also important in comparison. When you compare two variables with the identical operator (===), as in the following example, the active types for the zvals are compared, and if they are different, the comparison fails outright:

$a = 0;
$b = '0';
echo ($a === $b)?"Match":"Doesn't Match";

For that reason, this example prints Doesn't Match.
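A short sketch of the same comparison, with the types brought into agreement by explicit casts to show that only the active type stood in the way:

```php
<?php
$a = 0;
$b = '0';

$identical = ($a === $b);               // false: integer versus string
$asIntegers = ($a === (int)$b);         // true: both are the integer 0
$asStrings = ((string)$a === $b);       // true: both are the string "0"

var_dump($identical, $asIntegers, $asStrings);
```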

With the equality operator (==), the comparison that is performed is based on the active types of the operands. If the operands are strings or nulls, they are compared as strings; if either is a Boolean, they are converted to Boolean values and compared; otherwise, they are converted to numbers and compared. Although this results in the == operator being symmetrical (for example, $a == $b is the same as $b == $a), it is not transitive. The following example of this was kindly provided by Dan Cowgill:

$a = "0";
$b = 0;
$c = "";
echo ($a == $b)?"True":"False";  // True
echo ($b == $c)?"True":"False";  // True
echo ($a == $c)?"True":"False";  // False

Although transitivity may seem like a basic feature of an operator algebra, understanding how == works makes it clear why transitivity does not hold. Here are some examples:

  • "0" == 0 because both variables end up being converted to integers and compared.

  • $b == $c because both $b and $c are converted to integers and compared.

  • However, $a != $c because both $a and $c are strings, and when they are compared as strings, they are decidedly different.

In his commentary on this example, Dan compared this to the == and eq operators in Perl, which are both transitive. They are both transitive, though, because they are both typed comparisons. == in Perl coerces both operands into numbers before performing the comparison, whereas eq coerces both operands into strings. The PHP == is not a typed comparator, though, and it coerces variables only if they are not of the same active type. Thus the lack of transitivity.
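To see why typed comparison restores transitivity, the sketch below imitates the two Perl operators with explicit casts. numeric_eq() and string_eq() are hypothetical helpers, not built-in functions. The sketch deliberately avoids the built-in == on mixed types, because from PHP 8.0 on the comparison 0 == "" evaluates to false, so the outcome of the example above depends on the PHP version.

```php
<?php
// Hypothetical helpers that imitate Perl's typed comparators.
function numeric_eq($x, $y) {
    return (float)$x == (float)$y;      // like Perl's ==: always numeric
}

function string_eq($x, $y) {
    return (string)$x === (string)$y;   // like Perl's eq: always string
}

$a = "0";
$b = 0;
$c = "";

// As numbers, all three values are 0, so every pair matches.
$numeric = [numeric_eq($a, $b), numeric_eq($b, $c), numeric_eq($a, $c)];

// As strings, "0" and "0" match, but neither matches "".
$stringy = [string_eq($a, $b), string_eq($b, $c), string_eq($a, $c)];

var_dump($numeric, $stringy);
```

Because each helper always coerces both operands to one fixed type, each one compares values within a single domain, and neither set of results contradicts transitivity.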

