Skip to content

IntlDateFormatter returns incorrect week number for an invalid locale (and doesn't throw an exception either) #12282

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
campbell-m opened this issue Sep 23, 2023 · 12 comments

Comments

@campbell-m
Copy link

Description

The following code:

<?php
$date = DateTimeImmutable::createFromFormat('Y-m-d', '2023-09-23', new DateTimeZone('Europe/London'));

foreach (['en', 'xx'] as $locale)
{
    $formatter = new IntlDateFormatter(
        $locale,
        IntlDateFormatter::FULL,
        IntlDateFormatter::FULL,
        null,
        null,
        'w'
    );
    
    echo "Locale: $locale, Week number: " . $formatter->format($date) . "<br>\n";
}

Resulted in this output:

Locale: en, Week number: 38
Locale: xx, Week number: 39

But I expected this output instead:

Locale: en, Week number: 38
Locale: xx, Week number: 38

or else for an exception to be thrown when trying to use 'xx'.

This problem happens on all versions of PHP and all operating systems I have tested.

PHP Version

PHP 8.3.0RC2

Operating System

Windows NT 10.0 build 22621 (Windows 11) AMD64

@mpdude
Copy link

mpdude commented Oct 30, 2023

IMHO this is a breaking change that should not go into a bugfix release.

There might be multiple reasons why you might be using an invalid locale string, e. g. the locale is being taken from a request parameter, is taken from the LANG environment variable and is set to C or the locale requested is simply not available (generated) on a particular system/installation.

While I agree that all those should be checked/caught by the surrounding/preceding code, turning this into an exception in a bugfix release is not an option IMHO.

Also, I think something that throws an exception should not be added in a RC3 (like 8.3.0RC3), but that's another story.

@campbell-m
Copy link
Author

If it doesn't throw an exception then I think that it should at least return the correct week number, even if the locale is invalid. The documentation is not very clear, but as far as I can see the 'w' pattern returns the ISO week number. If that's the case then surely the ISO week number is independent of the locale?

@mpdude
Copy link

mpdude commented Oct 31, 2023

I am afraid it is:

w3c/sdw#886

@mpdude
Copy link

mpdude commented Oct 31, 2023

Maybe you should not expect a particular week number for an invalid locale, given that the week number depends on the locale?

Garbage (locale) in, garbage (week no) out, but no exception in a bugfix release.

If we want stricter locale validation, that needs to go elsewhere or in more places than just this.

@campbell-m
Copy link
Author

I agree that the locale could be validated elsewhere, but my point was that if 'w' is the pattern for the ISO week number then in the example above it should give 38 rather than 39 as the ISO week number does not depend on locale (well, at least the value though the character representation may).

@mpdude
Copy link

mpdude commented Oct 31, 2023

The week number does depend on the locale, see this:

<?php
$date = DateTimeImmutable::createFromFormat('Y-m-d', '2016-01-04', new DateTimeZone('Europe/London'));

foreach (['en_US', 'de_DE'] as $locale)
{
    $formatter = new IntlDateFormatter(
        $locale,
        IntlDateFormatter::FULL,
        IntlDateFormatter::FULL,
        null,
        null,
        'w'
    );

    echo "Locale: $locale, Week number: " . $formatter->format($date) . "<br>\n";
}
Locale: en_US, Week number: 2<br>
Locale: de_DE, Week number: 1<br>

@campbell-m
Copy link
Author

Ok, my misunderstanding then. As I said, the documentation isn't very clear. I had assumed that 'w' is the pattern for the ISO week number. If it isn't, do you know what is?

@damianwadley
Copy link
Member

Ok, my misunderstanding then. As I said, the documentation isn't very clear. I had assumed that 'w' is the pattern for the ISO week number. If it isn't, do you know what is?

The "week of year" varies. ISO 8601 has one specification, but that doesn't mean every locale uses it.

https://www.unicode.org/reports/tr35/tr35-dates.html#week-of-year

Values calculated for the Week of Year field range from 1 to 53 for the Gregorian calendar (they may have different ranges for other calendars). Week 1 for a year is the first week that contains at least the specified minimum number of days from that year. Weeks between week 1 of one year and week 1 of the following year are numbered sequentially from 2 to 52 or 53 (if needed). For example, January 1, 1998 was a Thursday. If the first day of the week is MONDAY and the minimum days in a week is 4 (these are the values reflecting ISO 8601 and many national standards), then week 1 of 1998 starts on December 29, 1997, and ends on January 4, 1998. However, if the first day of the week is SUNDAY, then week 1 of 1998 starts on January 4, 1998, and ends on January 10, 1998. The first three days of 1998 are then part of week 53 of 1997.

@campbell-m
Copy link
Author

Many thanks for the clarifications. I understand the 'w' pattern a lot better now.

@filozofer
Copy link

filozofer commented Nov 9, 2023

We have update our php version to the lastest yesterday (8.2.12) and the bug still provoke 500 in our production environment :
Uncaught PHP Exception IntlException: "datefmt_create: invalid locale: U_ILLEGAL_ARGUMENT_ERROR" at /var/www/web/vendor/symfony/form/Extension/Core/DataTransformer/DateTimeToLocalizedStringTransformer.php line 179

Concern line in Symfony core :
$intlDateFormatter = new \IntlDateFormatter(\Locale::getDefault(), $dateFormat, $timeFormat, $timezone, $calendar, $pattern ?? '');

With inputs :

input value
calendar 1
dateType 2
locale cz
pattern yyyy-MM-dd
timeType -1
timezone Object DateTimeZone

So it seems than "cz" locale is considerate as an invalid locale ><
CZ is part of the Iana list of language : https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry

Type: region
Subtag: CZ
Description: Czechia
Description: Czech Republic
Added: 2005-10-16

So... bug or misconfiguration of our app ?

@mpdude
Copy link

mpdude commented Nov 9, 2023

The fix (revert of the change) will land in the next 8.1 and 8.2 bugfix release. PHP 8.2.12 was released before the change was made.

Caution: PHP 8.3 will bring back the change.

@filozofer
Copy link

We pass our users from "cz" locale to "cs" and it works again. Thanx for your response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants