Asymmetry in PHP’s array_diff()

I have been working on my ueber-basic PHP-driven website engine, and ran into a really odd ‘feature’ of the array_diff() function in PHP 5.1.2: its results seem inconsistent.

Take two arrays, each with the same number of elements (say n). One of the arrays contains some non-empty string values; the other contains nothing but empty strings. array_diff($array1, $array2) returns all the elements in $array1, as expected; however, array_diff($array2, $array1) returns an array containing n empty elements. I expected it to return just a single (empty) element.

This code demonstrates the behaviour:

// array_diff() weirdness
echo 'DEBUG - testing array_diff()' . PHP_EOL;
$array1 = array('a', 'b', 'c'); // 3 entries, none empty
$array2 = array('', 'b', '');   // 3 entries, 1 non-empty
$array3 = array('', '', '');    // 3 entries, all empty strings
// print each input array as a comma-separated list, one per line
echo '$array1: ' . implode(',', $array1) . '.' . PHP_EOL;
echo '$array2: ' . implode(',', $array2) . '.' . PHP_EOL;
echo '$array3: ' . implode(',', $array3) . '.' . PHP_EOL;
$diff12 = array_diff($array1, $array2); // diff array1 against array2
$diff21 = array_diff($array2, $array1); // diff array2 against array1
$diff13 = array_diff($array1, $array3); // etc.
$diff31 = array_diff($array3, $array1);
$diff23 = array_diff($array2, $array3);
$diff32 = array_diff($array3, $array2);
// print each result the same way
echo '$diff12: ' . implode(',', $diff12) . '.' . PHP_EOL;
echo '$diff21: ' . implode(',', $diff21) . '.' . PHP_EOL;
echo '$diff13: ' . implode(',', $diff13) . '.' . PHP_EOL;
echo '$diff31: ' . implode(',', $diff31) . '.' . PHP_EOL;
echo '$diff23: ' . implode(',', $diff23) . '.' . PHP_EOL;
echo '$diff32: ' . implode(',', $diff32) . '.' . PHP_EOL;

This returns the following:

DEBUG - testing array_diff()
$array1: a,b,c.
$array2: ,b,.
$array3: ,,.
$diff12: a,c.
$diff21: ,.
$diff13: a,b,c.
$diff31: ,,.
$diff23: b.
$diff32: .

Notice the result for $diff31, compared to that for $diff21. Why are there three elements in the result for $diff31? I expected array_diff() to return an array of the unique elements which exist in one array and not the other. It seems that there are ‘special cases’ which depend on (what?) array size? nullness? element ordering? The documentation for array_diff() states:

Multiple occurrences in $array1 are all treated the same way.

The crucial question: why are multiple occurrences in $array2 not treated in the same way?
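For what it is worth, the output above fits a reading in which array_diff() never removes duplicates: every element of the first array whose value is absent from the second is kept, along with its original key. If a set-style difference is what is wanted, a minimal sketch might look like the following. Note that set_diff() is a made-up helper name for illustration, not a PHP built-in, and the sketch assumes all values are strings.

```php
<?php
// Sketch only: set_diff() is a hypothetical helper, not part of PHP.
// array_diff() keeps each element of its first argument (with its original
// key) whose value does not appear in the second; array_unique() then
// collapses repeated values into one.
function set_diff(array $a, array $b)
{
    // array_values() reindexes from 0, since array_diff() preserves keys.
    return array_values(array_unique(array_diff($a, $b)));
}

$array1 = array('a', 'b', 'c');
$array3 = array('', '', '');

$raw = array_diff($array3, $array1); // all three empty strings survive
$set = set_diff($array3, $array1);   // a single empty string

echo count($raw) . ' vs ' . count($set) . PHP_EOL; // prints "3 vs 1"
```

With the arrays from the test above, set_diff($array2, $array1) would likewise come back as a one-element array holding a single empty string.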

The context for this is here: html_simple_tidy()

What a shame there is no decent interactive PHP shell, like those for Python and Ruby.

