De-Duping an Array in PHP the Easy Way
Friday, April 9th, 2010
This is a quick-and-dirty way to clean up the values in an array, removing duplicates while preserving (as much as possible) the original order of the values. When a value appears more than once, the copy nearest the end of the array is the one that is kept. This method works well for smaller sorted data sets with fewer than a few hundred elements in the array. Of course, this type of processing depends heavily on your hardware, so your mileage may vary.
$arr = array("a","b","c","c","d","e","e","f","g","h","h","h","i");

// removing the duplicates
$tmp = array();
while(count($arr) > 0){
    $item = array_pop($arr);
    if(!in_array($item, $tmp)){
        array_unshift($tmp, $item);
    }
}
$arr = $tmp;
unset($tmp);

// array now holds unique values
// array("a","b","c","d","e","f","g","h","i");
If the original data is sorted, you can extend this to work with larger arrays by replacing the in_array() call. The in_array() function scans the array it is given from the first element to the last until it finds a match, so the execution time increases with each unique value added to $tmp.
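One way to replace in_array() for sorted input might look like the sketch below. It assumes the array is already sorted, so equal values are always adjacent and a duplicate can only match the value kept most recently. The function name dedupe_sorted() is made up for this illustration, and the strict !== comparison is a choice of this sketch rather than something the original loop does.

```php
<?php
// Hypothetical helper: de-dupes an array that is ALREADY sorted.
// Assumption: equal values are adjacent, so only one comparison is needed.
function dedupe_sorted(array $arr): array
{
    $tmp = array();
    while (count($arr) > 0) {
        $item = array_pop($arr);
        // The value kept most recently sits at the front of $tmp,
        // so compare against $tmp[0] instead of scanning with in_array().
        if (count($tmp) === 0 || $tmp[0] !== $item) {
            array_unshift($tmp, $item);
        }
    }
    return $tmp;
}

$arr = array("a","b","c","c","d","e","e","f","g","h","h","h","i");
$arr = dedupe_sorted($arr);
// $arr is now array("a","b","c","d","e","f","g","h","i")
```

On unsorted input this only removes runs of adjacent duplicates, so it is not a drop-in replacement for the in_array() version.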
You can also use this method to work with complex array elements, such as an array of associative arrays, although the process is slightly different. In this case you will use a helper array (called $keys below) to keep track of the primary key values you are de-duping against.
$arr = array(
    array("id"=>"1", "data"=>"apple", "qty"=>10),
    array("id"=>"1", "data"=>"apple", "qty"=>10),
    array("id"=>"2", "data"=>"orange", "qty"=>8),
    array("id"=>"2", "data"=>"orange", "qty"=>8),
    array("id"=>"3", "data"=>"pear", "qty"=>12),
    array("id"=>"3", "data"=>"pear", "qty"=>12),
);

// Remove duplicate values using 'id' as the unique value
$keys = array();
$tmp = array();
while(count($arr) > 0){
    $item = array_pop($arr);
    if(!in_array($item['id'], $keys)){
        array_unshift($keys, $item['id']);
        array_unshift($tmp, $item);
    }
}
$arr = $tmp;
unset($tmp);
unset($keys);
In case you're wondering why the process starts at the end of the array (using array_pop()) and works toward the beginning: for any row that is a duplicate of the prior value, in_array() will find the match quickly because it is the first element in the array being searched. This of course doesn't provide any real benefit for values that are not sorted, and any value that doesn't match still requires a scan of the entire $tmp or $keys array.
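If the full scan for unmatched values becomes a problem on unsorted data, one possible alternative (a different technique from the loop above, not a refinement of it) is to store the keys already seen as array keys and test them with isset(), which is a hash lookup rather than a scan. The sketch below assumes the key values are strings or integers, since those are the only valid PHP array keys. The function name dedupe_by_key() and its parameters are made up for this illustration.

```php
<?php
// Hypothetical helper: de-dupes rows by the value stored under $key.
// Assumption: $row[$key] is always a string or an integer.
function dedupe_by_key(array $rows, string $key): array
{
    $seen = array(); // key values already encountered, stored as array keys
    $out = array();  // rows kept, in their original order
    foreach ($rows as $row) {
        $id = $row[$key];
        // isset() on an array key is a hash lookup, not a linear scan
        if (!isset($seen[$id])) {
            $seen[$id] = true;
            $out[] = $row;
        }
    }
    return $out;
}

$rows = array(
    array("id"=>"1", "data"=>"apple", "qty"=>10),
    array("id"=>"2", "data"=>"orange", "qty"=>8),
    array("id"=>"1", "data"=>"apple", "qty"=>10),
    array("id"=>"3", "data"=>"pear", "qty"=>12),
    array("id"=>"2", "data"=>"orange", "qty"=>8),
);
$rows = dedupe_by_key($rows, "id");
// $rows now holds one row each for ids "1", "2" and "3"
```

Note one difference in behavior: this version keeps the first row seen for each key, while the array_pop() loop keeps the last one. When the duplicate rows are identical, as in the example data, the result is the same.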
