Hash strings to integers in PHP with the DJB hashing algorithm

I recently found myself needing a PHP implementation of the DJB hashing algorithm, but ran into a problem: in 64-bit PHP5, integers don’t wrap around when they overflow. Instead, they are silently converted to floating-point values large enough to hold the new value, at the cost of precision. For short strings this isn’t really a problem (on a 64-bit build the hash can’t overflow until the eleventh character), but for anything longer you end up with numbers that aren’t comparable with other implementations of the algorithm (nor, for that matter, will they fit into any of MySQL’s integer data types).
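
To make the overflow behaviour concrete, here is a minimal sketch that uses only core PHP. The variable names are purely illustrative, and the last line assumes a 64-bit build.

```php
<?php
// Adding 1 to the largest native int silently yields a float.
$max = PHP_INT_MAX;
$overflowed = $max + 1;

var_dump(is_int($max));          // bool(true)
var_dump(is_float($overflowed)); // bool(true)

// A float carries only 53 bits of mantissa, so on a 64-bit build
// the low-order bits are lost: +1 and +2 round to the same float.
var_dump(($max + 1) === ($max + 2));
```

Once the low-order bits are gone, no amount of post-processing can recover the value a wrapping integer would have held, which is why the arithmetic has to be done at higher precision.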

So, here’s a short function that uses PHP’s GNU Multiple Precision (GMP) module to perform the arithmetic at the necessary level of precision, then convert the result back to a standard PHP int:

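// PHP_INT_MIN is built in from PHP 7.0 onwards; define it only on older versions.
if (!defined('PHP_INT_MIN')) {
	define('PHP_INT_MIN', ~PHP_INT_MAX);
}

function hash_djb2($str) {
	// Number of distinct values a native int can hold (2^64 on 64-bit PHP).
	$range = gmp_add(gmp_sub(PHP_INT_MAX, PHP_INT_MIN), 1);
	$hash = 5381;
	$length = strlen($str);

	for ($i = 0; $i < $length; $i++) {
		// hash * 33 + c, computed exactly by GMP.
		$hash = gmp_add(gmp_mul($hash, 33), ord($str[$i]));

		// Emulate signed integer overflow: wrap values above PHP_INT_MAX...
		while (gmp_cmp($hash, PHP_INT_MAX) > 0) {
			$hash = gmp_sub($hash, $range);
		}
		// ...and values below PHP_INT_MIN, which occur once the hash has gone negative.
		while (gmp_cmp($hash, PHP_INT_MIN) < 0) {
			$hash = gmp_add($hash, $range);
		}
	}
	return gmp_intval($hash);
}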
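
The function above wraps at the width of PHP’s native signed int. If the implementation you need to match keeps the hash in an unsigned 32-bit integer instead (an assumption you would have to check against that implementation), a 64-bit build may not need GMP at all. The sketch below is a hypothetical variant, not a replacement for the function above: the name `hash_djb2_uint32` is made up for illustration, and it masks the hash to 32 bits after every step.

```php
<?php
// Hypothetical helper: djb2 reduced modulo 2^32.
// Assumes a 64-bit PHP build, where the intermediate value
// (at most 0xFFFFFFFF * 33 + 255) stays far below PHP_INT_MAX.
function hash_djb2_uint32($str) {
	$hash = 5381;
	$length = strlen($str);

	for ($i = 0; $i < $length; $i++) {
		// Keep only the low 32 bits after every step.
		$hash = ($hash * 33 + ord($str[$i])) & 0xFFFFFFFF;
	}
	return $hash;
}

var_dump(hash_djb2_uint32(''));  // int(5381)
var_dump(hash_djb2_uint32('a')); // int(177670)
```

The result always fits in an unsigned 32-bit column, but it will not match the output of the GMP version for strings long enough to overflow 32 bits.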

I’ve only tested this on 64-bit PHP5 on Linux. The size of a PHP integer depends on the platform and build (check PHP_INT_SIZE), so the point at which the hash wraps may differ elsewhere, and your mileage may vary. Enjoy!
