What’s new in 2.3.0 (June 4, 2025)#
These are the changes in pandas 2.3.0. See Release notes for a full changelog including other versions of pandas.
Enhancements#
Other enhancements#
The semantics for the copy keyword in __array__ methods (i.e. called when using np.array() or np.asarray() on pandas objects) have been updated to work correctly with NumPy >= 2 (GH 57739)
Series.str.decode() result now has StringDtype when future.infer_string is True (GH 60709)
Series.to_hdf() and DataFrame.to_hdf() now round-trip with StringDtype (GH 60663)
Improved repr of NumpyExtensionArray to account for NEP51 (GH 61085)
The Series.str.decode() method has gained the argument dtype to control the dtype of the result (GH 60940)
The cumsum(), cummin(), and cummax() reductions are now implemented for StringDtype columns (GH 60633)
The sum() reduction is now implemented for StringDtype columns (GH 59853)
Notable bug fixes#
These are bug fixes that might have notable behavior changes.
notable_bug_fix1#
In previous versions, comparing Series of different string dtypes (e.g. pd.StringDtype("pyarrow", na_value=pd.NA) against pd.StringDtype("python", na_value=np.nan)) would produce an inconsistent result dtype or incorrectly raise. pandas now uses a fixed hierarchy of string dtypes to determine the result dtype when different string dtypes are compared. Some examples:
When pd.StringDtype("pyarrow", na_value=pd.NA) is compared against any other string dtype, the result will always be boolean[pyarrow].
When pd.StringDtype("python", na_value=pd.NA) is compared against pd.StringDtype("pyarrow", na_value=np.nan), the result will be boolean, the NumPy-backed nullable extension array.
When pd.StringDtype("python", na_value=pd.NA) is compared against pd.StringDtype("python", na_value=np.nan), the result will be boolean, the NumPy-backed nullable extension array.
API changes#
When enabling the future.infer_string option, Index set operations (like union or intersection) will now ignore the dtype of an empty RangeIndex or empty Index with object dtype when determining the dtype of the resulting Index (GH 60797)
Deprecations#
Deprecated allowing non-bool values for na in str.contains(), str.startswith(), and str.endswith() for dtypes that do not already disallow these (GH 59615)
Deprecated the "pyarrow_numpy" storage option for StringDtype (GH 60152)
The deprecation of setting the argument include_groups to True in DataFrameGroupBy.apply() has been promoted from a DeprecationWarning to FutureWarning; only False will be allowed (GH 7155)
Bug fixes#
Numeric#
Bug in Series.mode() and DataFrame.mode() with dropna=False where not all dtypes would sort in the presence of NA values (GH 60702)
Bug in Series.round() where a TypeError would always raise with object dtype (GH 61206)
Strings#
Bug in DataFrameGroupBy.min(), DataFrameGroupBy.max(), Resampler.min(), Resampler.max() where all NA values of string dtype would return float instead of string dtype (GH 60810)
Bug in DataFrame.sum() with axis=1, DataFrameGroupBy.sum() or SeriesGroupBy.sum() with skipna=True, and Resampler.sum() with all NA values of StringDtype resulted in 0 instead of the empty string "" (GH 60229)
Bug in Series.__pos__() and DataFrame.__pos__() where an Exception was not raised for StringDtype with storage="pyarrow" (GH 60710)
Bug in Series.rank() for StringDtype with storage="pyarrow" that incorrectly returned integer results with method="average" and raised an error if it would truncate results (GH 59768)
Bug in Series.replace() with StringDtype when replacing with a non-string value was not upcasting to object dtype (GH 60282)
Bug in Series.str.center() with StringDtype with storage="pyarrow" not matching the python behavior in corner cases with an odd number of fill characters (GH 54792)
Bug in Series.str.replace() when n < 0 for StringDtype with storage="pyarrow" (GH 59628)
Bug in Series.str.slice() with negative step with ArrowDtype and StringDtype with storage="pyarrow" giving incorrect results (GH 59710)
Indexing#
Bug in Index.get_indexer() round-tripping through string dtype when infer_string is enabled (GH 55834)
I/O#
Bug in DataFrame.to_excel() which stored decimals as strings instead of numbers (GH 49598)
Other#
Fixed usage of inspect when the optional dependencies pyarrow or jinja2 are not installed (GH 60196)
Contributors#
A total of 24 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
ChiLin Chiu +
Irv Lustig
Isuru Fernando +
Jake Thomas Trevallion +
Joris Van den Bossche
Kevin Amparado +
LOCHAN PAUDEL +
Lumberbot (aka Jack)
Marc Mueller +
Marco Edward Gorelli
Matthew Roeschke
Pandas Development Team
Patrick Hoefler
Richard Shadrach
SALCAN +
Sebastian Berg
Simon Hawkins
Thomas Li
Will Ayd
William Andrea
William Ayd
dependabot[bot]
jbrockmendel
tasfia8 +