xxh3 hash ignores seed if string? #10305
Comments
There's very little documentation for many of the hash algorithms, but XXH3 accepts a numeric It makes sense to me that PHP should complain if either is provided incorrectly, yet right now it silently ignores a non-integer
It seems to me we should use And while not directly related to this ticket, we also should review the
If no one is working on fixing this yet, I'd like to give this a shot later today :) (both this issue and the toString exception)
I worked on this, and used Additional test here for testing strings:
var_dump(hash('xxh3', 'foo', options: ['seed' => '1'])); // 54557a2c8b633298
var_dump(hash('xxh3', 'foo', options: ['seed' => 1])); // 54557a2c8b633298
class StringableThrowingClass {
    public function __toString(): string {
        throw new Exception('exception in __toString');
        return '';
    }
}
$testString = str_repeat('a', 136);
class StringableClass {
    public function __toString(): string {
        global $testString;
        return $testString;
    }
}
try {
    var_dump(hash('xxh3', 'foo', options: ['secret' => new StringableThrowingClass]));
} catch (Exception $e) {
    echo $e->getMessage() . "\n";
}
var_dump(hash('xxh3', 'foo', options: ['secret' => new StringableClass]));
var_dump(hash('xxh3', 'foo', options: ['secret' => $testString]));
Current changes/patch:
diff --git a/ext/hash/hash_xxhash.c b/ext/hash/hash_xxhash.c
index 7ecedd8128..36540e91b9 100644
--- a/ext/hash/hash_xxhash.c
+++ b/ext/hash/hash_xxhash.c
@@ -168,13 +168,16 @@ zend_always_inline static void _PHP_XXH3_Init(PHP_XXH3_64_CTX *ctx, HashTable *a
 		return;
 	}

-	if (_seed && IS_LONG == Z_TYPE_P(_seed)) {
+	if (_seed) {
+		zend_long seed_as_long = zval_get_long(_seed);
 		/* This might be a bit too restrictive, but thinking that a seed might be set
 			once and for all, it should be done a clean way. */
-		func_init_seed(&ctx->s, (XXH64_hash_t)Z_LVAL_P(_seed));
+		func_init_seed(&ctx->s, (XXH64_hash_t)seed_as_long);
 		return;
 	} else if (_secret) {
-		convert_to_string(_secret);
+		if (!try_convert_to_string(_secret)) {
+			return;
+		}
 		size_t len = Z_STRLEN_P(_secret);
 		if (len < PHP_XXH3_SECRET_SIZE_MIN) {
 			zend_throw_error(NULL, "%s: Secret length must be >= %u bytes, %zu bytes passed", algo_name, XXH3_SECRET_SIZE_MIN, len);
While involving Thanks
I'm going to PR the string exception fix later today, because that can stand on its own as a commit and shouldn't impact BC. As for the string seed: I think it might be a bad idea after all to use zval_get_long, because of the implicit behaviour with strings like "1abc" and non-numeric strings (the concern weltling raised in the previous comment). Perhaps it is best to be strict about this and only allow LONGs (or floats that are actually longs), and throw if the input is of any other type.
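To make the concern concrete, here is a minimal sketch in plain C. It is not php-src code and does not use the Zend API: `strtoll` stands in for a lenient string-to-integer conversion, and the helper names `lenient_seed`, `strict_seed` and `integral_double` are hypothetical, chosen only for illustration. It contrasts a prefix-style conversion, which quietly accepts "1abc", with a strict check that rejects anything that is not a whole integer.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Lenient conversion: parses a leading integer prefix and silently ignores
 * whatever follows. "1abc" yields 1 and "abc" yields 0, so two different
 * inputs can collapse onto the same seed without any warning. */
static long long lenient_seed(const char *s) {
    return strtoll(s, NULL, 10);
}

/* Strict conversion: succeeds only if the whole string is a base-10 integer
 * that fits in long long. Empty strings, leading whitespace, trailing
 * garbage and out-of-range values are all rejected. */
static bool strict_seed(const char *s, long long *out) {
    if (s[0] == '\0' || isspace((unsigned char)s[0])) {
        return false;
    }
    char *end = NULL;
    errno = 0;
    long long v = strtoll(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE) {
        return false;
    }
    *out = v;
    return true;
}

/* "Floats that are actually longs": accept a double only if it is within
 * range and has no fractional part. NaN fails the range check. */
static bool integral_double(double d, long long *out) {
    if (!(d >= -9223372036854775808.0 && d < 9223372036854775808.0)) {
        return false;
    }
    long long v = (long long)d;
    if ((double)v != d) {
        return false;
    }
    *out = v;
    return true;
}
```

With the strict variants, the caller decides what a rejected seed means (throw, deprecation notice, and so on) instead of the input being silently reinterpreted.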
This may or may not need a deprecation and could be added to the mass PHP 8.3 deprecation RFC, but at least having an implementation to look at the impact is a good idea.
The initialization routine for XXH3 was not prepared for exceptions from seed. Fix this by using try_convert_to_string.
For discussion, please see: GH-10305
Closes GH-10352
Signed-off-by: George Peter Banyard <[email protected]>
I agree with this; adding the current behaviour to the deprecation RFC would be a good idea.
That way, we can see the impact on current and future versions of PHP.
I hope it's OK to ping here. I get the same result with the script from the initial post on PHP 8.3.3.
Of course it's okay to ping. The consensus was that we can't make it throw without having it deprecated first. So I added it to the (in-draft) 8.4 deprecation RFC as a to-do: https://wiki.php.net/rfc/deprecations_php_8_4#non-numeric_seed_strings_in_xxh3
Note that the same issue exists with xxh128 (the 128-bit variant of the 64-bit xxh3), and maybe with other hash algos of that family?
You're correct, I changed the stub in the RFC.
Description
The following code:
Resulted in this output:
I'd expect the second hash to be different from the first one, just like the third hash. It seems a string passed as 'seed' is ignored. If this is intended, it might be good to document it, or maybe throw an exception on non-int input?
PHP Version
Tested with PHP 8.1.13 & 8.2.1