What's new in 1.4.0 (??)

These are the changes in pandas 1.4.0. See :ref:`release` for a full changelog including other versions of pandas.

Enhancements

More flexible numeric dtypes for indexes

Until now, it has only been possible to create numeric indexes with int64/float64/uint64 dtypes. It is now possible to create an index of any numpy int/uint/float dtype using the new :class:`NumericIndex` index type (:issue:`41153`):

.. ipython:: python

    pd.NumericIndex([1, 2, 3], dtype="int8")
    pd.NumericIndex([1, 2, 3], dtype="uint32")
    pd.NumericIndex([1, 2, 3], dtype="float32")

In order to maintain backwards compatibility, calls to the base :class:`Index` will currently return :class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index`, where relevant. For example, the code below returns an Int64Index with dtype int64:

In [1]: pd.Index([1, 2, 3], dtype="int8")
Int64Index([1, 2, 3], dtype='int64')

but will in a future version return a :class:`NumericIndex` with dtype int8.

More generally, all operations that until now have returned :class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index` will, for now, continue to do so. This means that, in order to use NumericIndex in the current version, you will have to call NumericIndex explicitly. For example, the series below will have an Int64Index:

In [2]: ser = pd.Series([1, 2, 3], index=[1, 2, 3])
In [3]: ser.index
Int64Index([1, 2, 3], dtype='int64')

Instead, if you want to use a NumericIndex, you should do:

.. ipython:: python

    idx = pd.NumericIndex([1, 2, 3], dtype="int8")
    ser = pd.Series([1, 2, 3], index=idx)
    ser.index

In a future version of pandas, :class:`NumericIndex` will become the default numeric index type. Int64Index, UInt64Index and Float64Index are therefore deprecated and will be removed in the future; see :ref:`here <whatsnew_140.deprecations.int64_uint64_float64index>` for more.

See :ref:`here <advanced.numericindex>` for more about :class:`NumericIndex`.

Styler

:class:`.Styler` has been further developed in 1.4.0. The following enhancements have been made:

Styling and formatting of indexes has been added, with :meth:`.Styler.apply_index`, :meth:`.Styler.applymap_index` and :meth:`.Styler.format_index`. These mirror the signature of the methods already used to style and format data values, and work with both HTML and LaTeX format (:issue:`41893`, :issue:`43101`).

:meth:`.Styler.bar` introduces additional arguments to control alignment, display and colors (:issue:`26070`, :issue:`36419`, :issue:`43662`), and it also validates the input arguments width and height (:issue:`42511`).

:meth:`.Styler.to_latex` introduces keyword argument environment, which also allows a specific "longtable" entry through a separate jinja2 template (:issue:`41866`).

:meth:`.Styler.to_html` introduces keyword arguments sparse_index, sparse_columns, bold_headers, caption, max_rows and max_columns (:issue:`41946`, :issue:`43149`, :issue:`42972`).

Keyword arguments level and names added to :meth:`.Styler.hide_index` and :meth:`.Styler.hide_columns` for additional control of visibility of MultiIndexes and index names (:issue:`25475`, :issue:`43404`, :issue:`43346`)

Global options have been extended to configure default Styler properties, including formatting, encoding, MathJax and LaTeX options (:issue:`41395`)

Naive sparsification is now possible for LaTeX without the multirow package (:issue:`43369`)

:meth:`Styler.to_html` omits CSSStyle rules for hidden table elements (:issue:`43619`)

Formerly Styler relied on display.html.use_mathjax, which has now been replaced by styler.html.mathjax.

There are also bug fixes and deprecations listed below.

The caption argument is now validated (:issue:`43368`)

Multithreaded CSV reading with a new CSV Engine based on pyarrow

:func:`pandas.read_csv` now accepts engine="pyarrow" (requires at least pyarrow 0.17.0) as an argument, allowing for faster CSV parsing on multicore machines with pyarrow installed. See the :doc:`I/O docs </user_guide/io>` for more info. (:issue:`23697`, :issue:`43706`)
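
As a minimal sketch of how the new engine might be used, the snippet below tries engine="pyarrow" and falls back to the default parser when pyarrow cannot be imported. The helper name ``read_csv_prefer_pyarrow`` and the temporary file are illustrative assumptions, not part of pandas.

```python
import os
import tempfile

import pandas as pd


def read_csv_prefer_pyarrow(path):
    """Hypothetical helper: use the pyarrow engine when it is available."""
    try:
        return pd.read_csv(path, engine="pyarrow")
    except ImportError:
        # pyarrow is missing (or too old), so use the default engine instead.
        return pd.read_csv(path)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    with open(path, "w") as fh:
        fh.write("a,b\n1,2\n3,4\n")
    df = read_csv_prefer_pyarrow(path)
```

Passing a file path rather than an open buffer keeps the fallback simple, because the file can be reopened by the second call.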

Rank function for rolling and expanding windows

Added rank function to :class:`Rolling` and :class:`Expanding`. The new function supports the method, ascending, and pct flags of :meth:`DataFrame.rank`. The method argument supports min, max, and average ranking methods. Example:

.. ipython:: python

    s = pd.Series([1, 4, 2, 3, 5, 3])
    s.rolling(3).rank()

    s.rolling(3).rank(method="max")
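
The same function is available on expanding windows, and the ascending flag reverses the ranking order. The following sketch reuses the series from the example above; the variable names are only illustrative.

```python
import pandas as pd

s = pd.Series([1, 4, 2, 3, 5, 3])

# Rank of each value among all values seen so far (ties get the average rank).
expanding_rank = s.expanding().rank()

# Rank within a window of three, with the largest value ranked first.
descending_rank = s.rolling(3).rank(ascending=False)
```

In both cases each result is the rank of the window's last element within that window.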

Groupby positional indexing

It is now possible to specify positional ranges relative to the ends of each group.

Negative arguments for :meth:`.GroupBy.head` and :meth:`.GroupBy.tail` now work correctly and result in ranges relative to the end and start of each group, respectively. Previously, negative arguments returned empty frames.

.. ipython:: python

    df = pd.DataFrame([["g", "g0"], ["g", "g1"], ["g", "g2"], ["g", "g3"],
                       ["h", "h0"], ["h", "h1"]], columns=["A", "B"])
    df.groupby("A").head(-1)
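
For comparison, here is a small sketch of the two negative forms side by side on the same data: head(-1) drops the last row of each group, while tail(-1) drops the first.

```python
import pandas as pd

df = pd.DataFrame(
    [["g", "g0"], ["g", "g1"], ["g", "g2"], ["g", "g3"], ["h", "h0"], ["h", "h1"]],
    columns=["A", "B"],
)

# All rows of each group except the last one.
all_but_last = df.groupby("A").head(-1)

# All rows of each group except the first one.
all_but_first = df.groupby("A").tail(-1)
```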

:meth:`.GroupBy.nth` now accepts a slice or list of integers and slices.

.. ipython:: python

    df.groupby("A").nth(slice(1, -1))
    df.groupby("A").nth([slice(None, 1), slice(-1, None)])

DataFrame.from_dict and DataFrame.to_dict have new `'tight'` option

A new 'tight' dictionary format that preserves :class:`MultiIndex` entries and names is now available with the :meth:`DataFrame.from_dict` and :meth:`DataFrame.to_dict` methods and can be used with the standard json library to produce a tight representation of :class:`DataFrame` objects (:issue:`4889`).

.. ipython:: python

    df = pd.DataFrame.from_records(
        [[1, 3], [2, 4]],
        index=pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")],
                                        names=["n1", "n2"]),
        columns=pd.MultiIndex.from_tuples([("x", 1), ("y", 2)],
                                          names=["z1", "z2"]),
    )
    df
    df.to_dict(orient='tight')
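
The round trip through the standard json library could look like the sketch below. It uses a variant of the frame above with string-only labels, an assumption made so that every label is trivially JSON-serializable.

```python
import json

import pandas as pd

df = pd.DataFrame.from_records(
    [[1, 3], [2, 4]],
    index=pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")], names=["n1", "n2"]),
    columns=pd.MultiIndex.from_tuples([("x", "p"), ("y", "q")], names=["z1", "z2"]),
)

# Serialize the tight dictionary to a JSON string and read it back.
payload = json.dumps(df.to_dict(orient="tight"))
restored = pd.DataFrame.from_dict(json.loads(payload), orient="tight")
```

JSON has no tuple type, so the labels travel as lists and are turned back into :class:`MultiIndex` entries by :meth:`DataFrame.from_dict`.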

Other enhancements

:class:`DataFrameGroupBy` operations with as_index=False now correctly retain ExtensionDtype dtypes for columns being grouped on (:issue:`41373`)
Add support for assigning values to by argument in :meth:`DataFrame.plot.hist` and :meth:`DataFrame.plot.box` (:issue:`15079`)
:meth:`Series.sample`, :meth:`DataFrame.sample`, and :meth:`.GroupBy.sample` now accept a np.random.Generator as input to random_state. A generator will be more performant, especially with replace=False (:issue:`38100`)
:meth:`Series.ewm`, :meth:`DataFrame.ewm`, now support a method argument with a 'table' option that performs the windowing operation over an entire :class:`DataFrame`. See :ref:`Window Overview <window.overview>` for performance and functional benefits (:issue:`42273`)
:meth:`.GroupBy.cummin` and :meth:`.GroupBy.cummax` now support the argument skipna (:issue:`34047`)
:meth:`read_table` now supports the argument storage_options (:issue:`39167`)
:meth:`DataFrame.to_stata` and :meth:`StataWriter` now accept the keyword only argument value_labels to save labels for non-categorical columns
Methods that rely on hashmap-based algorithms, such as :meth:`DataFrameGroupBy.value_counts`, :meth:`DataFrameGroupBy.count` and :func:`factorize`, previously ignored the imaginary component of complex numbers; it is now taken into account (:issue:`17927`)
Add :meth:`Series.str.removeprefix` and :meth:`Series.str.removesuffix` introduced in Python 3.9 to remove pre-/suffixes from string-type :class:`Series` (:issue:`36944`)
Attempting to write into a file in a missing parent directory with :meth:`DataFrame.to_csv`, :meth:`DataFrame.to_html`, :meth:`DataFrame.to_excel`, :meth:`DataFrame.to_feather`, :meth:`DataFrame.to_parquet`, :meth:`DataFrame.to_stata`, :meth:`DataFrame.to_json`, :meth:`DataFrame.to_pickle`, and :meth:`DataFrame.to_xml` now produces an error message that explicitly mentions the missing parent directory. The same is true for the :class:`Series` counterparts (:issue:`24306`)
:meth:`IntegerArray.all`, :meth:`IntegerArray.any`, :meth:`FloatingArray.any`, and :meth:`FloatingArray.all` use Kleene logic (:issue:`41967`)
Added support for nullable boolean and integer types in :meth:`DataFrame.to_stata`, :class:`~pandas.io.stata.StataWriter`, :class:`~pandas.io.stata.StataWriter117`, and :class:`~pandas.io.stata.StataWriterUTF8` (:issue:`40855`)
:meth:`DataFrame.__pos__`, :meth:`DataFrame.__neg__` now retain ExtensionDtype dtypes (:issue:`43883`)
The error raised when an optional dependency can't be imported now includes the original exception, for easier investigation (:issue:`43882`)
Added :meth:`.ExponentialMovingWindow.sum` (:issue:`13297`)
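
A few of the items above can be sketched together: a NumPy Generator passed as random_state, the new string methods, and Kleene logic on a nullable integer array. The data and variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# random_state now accepts a np.random.Generator.
rng = np.random.default_rng(seed=0)
picked = pd.Series(range(10)).sample(n=3, random_state=rng)

# removeprefix only strips the prefix where it is actually present.
labels = pd.Series(["col_a", "col_b", "other"])
stripped = labels.str.removeprefix("col_")

# Kleene logic: with skipna=False, a missing value only matters
# when it could change the outcome.
arr = pd.array([1, pd.NA], dtype="Int64")
any_result = arr.any(skipna=False)  # a truthy value exists, so the NA is irrelevant
all_result = arr.all(skipna=False)  # the NA could be falsy, so the result is unknown
```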

Notable bug fixes

These are bug fixes that might have notable behavior changes.

Inconsistent date string parsing

The dayfirst option of :func:`to_datetime` isn't strict, and this can lead to surprising behaviour:

.. ipython:: python
    :okwarning:

    pd.to_datetime(["31-12-2021"], dayfirst=False)

Now, a warning will be raised if a date string cannot be parsed in accordance with the given dayfirst value when the value is a delimited date string (e.g. 31-12-2012).
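
One way to avoid the ambiguity altogether is to state the layout of the strings explicitly, either with a dayfirst value that matches the data or with a format string. A minimal sketch:

```python
import pandas as pd

# dayfirst matches the data, so no reinterpretation is needed.
with_dayfirst = pd.to_datetime(["31-12-2021"], dayfirst=True)

# An explicit format leaves nothing to inference.
with_format = pd.to_datetime(["31-12-2021"], format="%d-%m-%Y")
```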

Ignoring dtypes in concat with empty or all-NA columns

When using :func:`concat` to concatenate two or more :class:`DataFrame` objects, if one of the DataFrames was empty or had all-NA values, its dtype was *sometimes* ignored when finding the concatenated dtype. These are now consistently *not* ignored (:issue:`43507`).

.. ipython:: python

    df1 = pd.DataFrame({"bar": [pd.Timestamp("2013-01-01")]}, index=range(1))
    df2 = pd.DataFrame({"bar": np.nan}, index=range(1, 2))
    res = df1.append(df2)

Previously, the float-dtype in df2 would be ignored so the result dtype would be datetime64[ns]. As a result, the np.nan would be cast to NaT.

Previous behavior:

In [4]: res
Out[4]:
         bar
0 2013-01-01
1        NaT

Now the float-dtype is respected. Since the common dtype for these DataFrames is object, the np.nan is retained.

New behavior:

.. ipython:: python

    res

notable_bug_fix3

Backwards incompatible API changes

Increased minimum versions for dependencies

Some minimum supported versions of dependencies were updated. If installed, we now require:

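+-----------------+-----------------+----------+---------+
| Package         | Minimum Version | Required | Changed |
+=================+=================+==========+=========+
| numpy           | 1.18.5          |    X     |    X    |
+-----------------+-----------------+----------+---------+
| pytz            | 2020.1          |    X     |    X    |
+-----------------+-----------------+----------+---------+
| python-dateutil | 2.8.1           |    X     |    X    |
+-----------------+-----------------+----------+---------+
| bottleneck      | 1.3.1           |          |    X    |
+-----------------+-----------------+----------+---------+
| numexpr         | 2.7.1           |          |    X    |
+-----------------+-----------------+----------+---------+
| pytest (dev)    | 6.0             |          |         |
+-----------------+-----------------+----------+---------+
| mypy (dev)      | 0.910           |          |    X    |
+-----------------+-----------------+----------+---------+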

For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.

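+-----------------+-----------------+---------+
| Package         | Minimum Version | Changed |
+=================+=================+=========+
| beautifulsoup4  | 4.8.2           |    X    |
+-----------------+-----------------+---------+
| fastparquet     | 0.4.0           |         |
+-----------------+-----------------+---------+
| fsspec          | 0.7.4           |         |
+-----------------+-----------------+---------+
| gcsfs           | 0.6.0           |         |
+-----------------+-----------------+---------+
| lxml            | 4.5.0           |    X    |
+-----------------+-----------------+---------+
| matplotlib      | 3.3.2           |    X    |
+-----------------+-----------------+---------+
| numba           | 0.50.1          |    X    |
+-----------------+-----------------+---------+
| openpyxl        | 3.0.2           |    X    |
+-----------------+-----------------+---------+
| pyarrow         | 0.17.0          |         |
+-----------------+-----------------+---------+
| pymysql         | 0.10.1          |    X    |
+-----------------+-----------------+---------+
| pytables        | 3.6.1           |    X    |
+-----------------+-----------------+---------+
| s3fs            | 0.4.0           |         |
+-----------------+-----------------+---------+
| scipy           | 1.4.1           |    X    |
+-----------------+-----------------+---------+
| sqlalchemy      | 1.3.11          |    X    |
+-----------------+-----------------+---------+
| tabulate        | 0.8.7           |         |
+-----------------+-----------------+---------+
| xarray          | 0.15.1          |    X    |
+-----------------+-----------------+---------+
| xlrd            | 2.0.1           |    X    |
+-----------------+-----------------+---------+
| xlsxwriter      | 1.2.2           |    X    |
+-----------------+-----------------+---------+
| xlwt            | 1.3.0           |         |
+-----------------+-----------------+---------+
| pandas-gbq      | 0.14.0          |    X    |
+-----------------+-----------------+---------+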

See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.

Other API changes

:meth:`Index.get_indexer_for` no longer accepts keyword arguments (other than 'target'); in the past these would be silently ignored if the index was not unique (:issue:`42310`)

Deprecations

Deprecated Int64Index, UInt64Index & Float64Index

:class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index` have been deprecated in favor of the new :class:`NumericIndex` and will be removed in pandas 2.0 (:issue:`43028`).

Currently, in order to maintain backward compatibility, calls to :class:`Index` will continue to return :class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index` when given numeric data, but in the future, a :class:`NumericIndex` will be returned.

Current behavior:

In [1]: pd.Index([1, 2, 3], dtype="int32")
Out[1]: Int64Index([1, 2, 3], dtype='int64')
In [2]: pd.Index([1, 2, 3], dtype="uint64")
Out[2]: UInt64Index([1, 2, 3], dtype='uint64')

Future behavior:

In [3]: pd.Index([1, 2, 3], dtype="int32")
Out[3]: NumericIndex([1, 2, 3], dtype='int32')
In [4]: pd.Index([1, 2, 3], dtype="uint64")
Out[4]: NumericIndex([1, 2, 3], dtype='uint64')

Other Deprecations

Deprecated :meth:`Index.is_type_compatible` (:issue:`42113`)
Deprecated method argument in :meth:`Index.get_loc`, use index.get_indexer([label], method=...) instead (:issue:`42269`)
Deprecated treating integer keys in :meth:`Series.__setitem__` as positional when the index is a :class:`Float64Index` not containing the key, a :class:`IntervalIndex` with no entries containing the key, or a :class:`MultiIndex` with leading :class:`Float64Index` level not containing the key (:issue:`33469`)
Deprecated treating numpy.datetime64 objects as UTC times when passed to the :class:`Timestamp` constructor along with a timezone. In a future version, these will be treated as wall-times. To retain the old behavior, use Timestamp(dt64).tz_localize("UTC").tz_convert(tz) (:issue:`24559`)
Deprecated ignoring missing labels when indexing with a sequence of labels on a level of a MultiIndex (:issue:`42351`)
Creating an empty Series without a dtype will now raise a more visible FutureWarning instead of a DeprecationWarning (:issue:`30017`)
Deprecated the 'kind' argument in :meth:`Index.get_slice_bound`, :meth:`Index.slice_indexer`, :meth:`Index.slice_locs`; in a future version passing 'kind' will raise (:issue:`42857`)
Deprecated dropping of nuisance columns in :class:`Rolling`, :class:`Expanding`, and :class:`EWM` aggregations (:issue:`42738`)
Deprecated :meth:`Index.reindex` with a non-unique index (:issue:`42568`)
Deprecated :meth:`.Styler.render` in favour of :meth:`.Styler.to_html` (:issue:`42140`)
Deprecated passing in a string column label into times in :meth:`DataFrame.ewm` (:issue:`43265`)
Deprecated the 'include_start' and 'include_end' arguments in :meth:`DataFrame.between_time`; in a future version passing 'include_start' or 'include_end' will raise (:issue:`40245`)
Deprecated the squeeze argument to :meth:`read_csv`, :meth:`read_table`, and :meth:`read_excel`. Users should squeeze the DataFrame afterwards with .squeeze("columns") instead. (:issue:`43242`)
Deprecated the index argument to :class:`SparseArray` construction (:issue:`23089`)
Deprecated the closed argument in :meth:`date_range` and :meth:`bdate_range` in favor of the inclusive argument; in a future version passing closed will raise (:issue:`40245`)
Deprecated :meth:`.Rolling.validate`, :meth:`.Expanding.validate`, and :meth:`.ExponentialMovingWindow.validate` (:issue:`43665`)
Deprecated silent dropping of columns that raised a TypeError in :meth:`Series.transform` and :meth:`DataFrame.transform` when used with a dictionary (:issue:`43740`)
Deprecated silent dropping of columns that raised a TypeError, DataError, and some cases of ValueError in :meth:`Series.aggregate`, :meth:`DataFrame.aggregate`, :meth:`Series.groupby.aggregate`, and :meth:`DataFrame.groupby.aggregate` when used with a list (:issue:`43740`)
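
Some of the replacements named in the list above can be sketched as follows; the sample data is only illustrative, and each comment shows the deprecated spelling that the line replaces.

```python
import io

import pandas as pd

idx = pd.Index([10, 20, 30])
# Instead of idx.get_loc(22, method="nearest"):
pos = idx.get_indexer([22], method="nearest")[0]

# Instead of pd.read_csv(..., squeeze=True):
ser = pd.read_csv(io.StringIO("a\n1\n2\n")).squeeze("columns")

# Instead of pd.date_range(..., closed="left"):
days = pd.date_range("2021-01-01", "2021-01-04", inclusive="left")
```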

Performance improvements

Performance improvement in :meth:`.GroupBy.sample`, especially when weights argument provided (:issue:`34483`)
Performance improvement when converting non-string arrays to string arrays (:issue:`34483`)
Performance improvement in :meth:`.GroupBy.transform` for user-defined functions (:issue:`41598`)
Performance improvement in constructing :class:`DataFrame` objects (:issue:`42631`)
Performance improvement in :meth:`GroupBy.shift` when fill_value argument is provided (:issue:`26615`)
Performance improvement in :meth:`DataFrame.corr` for method="pearson" on data without missing values (:issue:`40956`)
Performance improvement in some :meth:`GroupBy.apply` operations (:issue:`42992`)
Performance improvement in :func:`read_stata` (:issue:`43059`)
Performance improvement in :meth:`to_datetime` with uint dtypes (:issue:`42606`)
Performance improvement in :meth:`Series.sparse.to_coo` (:issue:`42880`)
Performance improvement in indexing with a :class:`MultiIndex` indexer on another :class:`MultiIndex` (:issue:`43370`)
Performance improvement in :meth:`GroupBy.quantile` (:issue:`43469`)
:meth:`SparseArray.min` and :meth:`SparseArray.max` no longer require converting to a dense array (:issue:`43526`)
Indexing into a :class:`SparseArray` with a slice with step=1 no longer requires converting to a dense array (:issue:`43777`)
Performance improvement in :meth:`SparseArray.take` with allow_fill=False (:issue:`43654`)
Performance improvement in :meth:`.Rolling.mean` and :meth:`.Expanding.mean` with engine="numba" (:issue:`43612`)
Improved performance of :meth:`pandas.read_csv` with memory_map=True when file encoding is UTF-8 (:issue:`43787`)

Bug fixes

Categorical

Bug in setting dtype-incompatible values into a :class:`Categorical` (or Series or DataFrame backed by Categorical) raising ValueError instead of TypeError (:issue:`41919`)
Bug in :meth:`Categorical.searchsorted` when passing a dtype-incompatible value raising KeyError instead of TypeError (:issue:`41919`)
Bug in :meth:`Series.where` with CategoricalDtype when passing a dtype-incompatible value raising ValueError instead of TypeError (:issue:`41919`)
Bug in :meth:`Categorical.fillna` when passing a dtype-incompatible value raising ValueError instead of TypeError (:issue:`41919`)
Bug in :meth:`Categorical.fillna` with a tuple-like category raising ValueError instead of TypeError when filling with a non-category tuple (:issue:`41919`)

Datetimelike

Bug in :class:`DataFrame` constructor unnecessarily copying non-datetimelike 2D object arrays (:issue:`39272`)
Bug in :func:`to_datetime` with format and pandas.NA was raising ValueError (:issue:`42957`)
:func:`to_datetime` would silently swap MM/DD/YYYY and DD/MM/YYYY formats if the given dayfirst option could not be respected - now, a warning is raised in the case of delimited date strings (e.g. 31-12-2012) (:issue:`12585`)
Bug in :meth:`date_range` and :meth:`bdate_range` not returning the right bound when start equals end and the range is closed on one side (:issue:`43394`)
Bug in inplace addition and subtraction of :class:`DatetimeIndex` or :class:`TimedeltaIndex` with :class:`DatetimeArray` or :class:`TimedeltaArray` (:issue:`43904`)
Bug in calling np.isnan, np.isfinite, or np.isinf on a timezone-aware :class:`DatetimeIndex` incorrectly raising TypeError (:issue:`43917`)

Timedelta

Timezones

Bug in :func:`to_datetime` with infer_datetime_format=True failing to parse zero UTC offset (Z) correctly (:issue:`41047`)
Bug in :meth:`Series.dt.tz_convert` resetting index in a :class:`Series` with :class:`CategoricalIndex` (:issue:`43080`)

Numeric

Bug in :meth:`DataFrame.rank` raising ValueError with object columns and method="first" (:issue:`41931`)
Bug in :meth:`DataFrame.rank` treating missing values and extreme values as equal (for example np.nan and np.inf), causing incorrect results when na_option="bottom" or na_option="top" is used (:issue:`41931`)
Bug in numexpr engine still being used when the option compute.use_numexpr is set to False (:issue:`32556`)
Bug in :class:`DataFrame` arithmetic ops with a subclass whose :meth:`_constructor` attribute is a callable other than the subclass itself (:issue:`43201`)
Bug in arithmetic operations involving :class:`RangeIndex` where the result would have the incorrect name (:issue:`43962`)

Conversion

Bug in :class:`UInt64Index` constructor when passing a list containing both positive integers small enough to cast to int64 and integers too large to hold in int64 (:issue:`42201`)
Bug in :class:`Series` constructor returning 0 for missing values with dtype int64 and False for dtype bool (:issue:`43017`, :issue:`43018`)

Strings

Interval

Indexing

Bug in :meth:`Series.rename` when index in Series is MultiIndex and level in rename is provided. (:issue:`43659`)
Bug in :meth:`DataFrame.truncate` and :meth:`Series.truncate` when the object's Index has a length greater than one but only one unique value (:issue:`42365`)
Bug in :meth:`Series.loc` and :meth:`DataFrame.loc` with a :class:`MultiIndex` when indexing with a tuple in which one of the levels is also a tuple (:issue:`27591`)
Bug in :meth:`Series.loc` with a :class:`MultiIndex` whose first level contains only np.nan values (:issue:`42055`)
Bug in indexing on a :class:`Series` or :class:`DataFrame` with a :class:`DatetimeIndex` when passing a string, the return type depended on whether the index was monotonic (:issue:`24892`)
Bug in indexing on a :class:`MultiIndex` failing to drop scalar levels when the indexer is a tuple containing a datetime-like string (:issue:`42476`)
Bug in :meth:`DataFrame.sort_values` and :meth:`Series.sort_values` when passing an ascending value, failing to raise or incorrectly raising ValueError (:issue:`41634`)
Bug in updating values of :class:`pandas.Series` using boolean index, created by using :meth:`pandas.DataFrame.pop` (:issue:`42530`)
Bug in :meth:`Index.get_indexer_non_unique` when index contains multiple np.nan (:issue:`35392`)
Bug in :meth:`DataFrame.query` not handling the degree sign in a backticked column name, such as `Temp(°C)`, used in an expression to query a dataframe (:issue:`42826`)
Bug in :meth:`DataFrame.drop` where the error message did not show missing labels with commas when raising KeyError (:issue:`42881`)
Bug in :meth:`DataFrame.query` where method calls in query strings led to errors when the numexpr package was installed. (:issue:`22435`)
Bug in :meth:`DataFrame.nlargest` and :meth:`Series.nlargest` where sorted result did not count indexes containing np.nan (:issue:`28984`)
Bug in indexing on a non-unique object-dtype :class:`Index` with an NA scalar (e.g. np.nan) (:issue:`43711`)
Bug in :meth:`Series.__setitem__` with object dtype when setting an array with matching size and dtype='datetime64[ns]' or dtype='timedelta64[ns]' incorrectly converting the datetime/timedeltas to integers (:issue:`43868`)

Missing

Bug in :meth:`DataFrame.fillna` with limit and no method ignores axis='columns' or axis = 1 (:issue:`40989`)
Bug in :meth:`DataFrame.fillna` not replacing missing values when using a dict-like value and duplicate column names (:issue:`43476`)

MultiIndex

Bug in :meth:`MultiIndex.get_loc` where the first level is a :class:`DatetimeIndex` and a string key is passed (:issue:`42465`)
Bug in :meth:`MultiIndex.reindex` when passing a level that corresponds to an ExtensionDtype level (:issue:`42043`)
Bug in :meth:`MultiIndex.get_loc` raising TypeError instead of KeyError on nested tuple (:issue:`42440`)
Bug in :meth:`MultiIndex.putmask` where the other value was also a :class:`MultiIndex` (:issue:`43212`)

I/O

Bug in :func:`read_excel` attempting to read chart sheets from .xlsx files (:issue:`41448`)
Bug in :func:`json_normalize` where errors=ignore could fail to ignore missing values of meta when record_path has a length greater than one (:issue:`41876`)
Bug in :func:`read_csv` with multi-header input and arguments referencing column names as tuples (:issue:`42446`)
Bug in :func:`read_fwf`, where difference in lengths of colspecs and names was not raising ValueError (:issue:`40830`)
Bug in :func:`Series.to_json` and :func:`DataFrame.to_json` where some attributes were skipped when serialising plain Python objects to JSON (:issue:`42768`, :issue:`33043`)
Bug in constructing a :class:`DataFrame` from a SQLAlchemy Row object where column headers were dropped (:issue:`40682`)
Bug in unpickling a :class:`Index` with object dtype incorrectly inferring numeric dtypes (:issue:`43188`)
Bug in :func:`read_csv` where reading multi-header input with unequal lengths incorrectly raising uncontrolled IndexError (:issue:`43102`)
Bug in :func:`read_csv`, changed exception class when expecting a file path name or file-like object from OSError to TypeError (:issue:`43366`)
Bug in :func:`read_json` not handling non-numpy dtypes correctly (especially category) (:issue:`21892`, :issue:`33205`)
Bug in :func:`json_normalize` where multi-character sep parameter is incorrectly prefixed to every key (:issue:`43831`)
Bug in :func:`read_csv` with float_precision="round_trip" which did not skip initial/trailing whitespace (:issue:`43713`)

Period

Plotting

Groupby/resample/rolling

Fixed bug in :meth:`SeriesGroupBy.apply` where passing an unrecognized string argument failed to raise TypeError when the underlying Series is empty (:issue:`42021`)
Bug in :meth:`Series.rolling.apply`, :meth:`DataFrame.rolling.apply`, :meth:`Series.expanding.apply` and :meth:`DataFrame.expanding.apply` with engine="numba" where *args were being cached with the user passed function (:issue:`42287`)
Bug in :meth:`GroupBy.max` and :meth:`GroupBy.min` with nullable integer dtypes losing precision (:issue:`41743`)
Bug in :meth:`DataFrame.groupby.rolling.var` would calculate the rolling variance only on the first group (:issue:`42442`)
Bug in :meth:`GroupBy.shift` that would return the grouping columns if fill_value was not None (:issue:`41556`)
Bug in :meth:`SeriesGroupBy.nlargest` and :meth:`SeriesGroupBy.nsmallest` would have an inconsistent index when the input Series was sorted and n was greater than or equal to all group sizes (:issue:`15272`, :issue:`16345`, :issue:`29129`)
Bug in :meth:`pandas.DataFrame.ewm`, where non-float64 dtypes were silently failing (:issue:`42452`)
Bug in :meth:`pandas.DataFrame.rolling` operation along rows (axis=1) incorrectly omits columns containing float16 and float32 (:issue:`41779`)
Bug in :meth:`Resampler.aggregate` did not allow the use of Named Aggregation (:issue:`32803`)
Bug in :meth:`Series.rolling` when the :class:`Series` dtype was Int64 (:issue:`43016`)
Bug in :meth:`DataFrame.rolling.corr` when the :class:`DataFrame` columns was a :class:`MultiIndex` (:issue:`21157`)
Bug in :meth:`DataFrame.groupby.rolling` when specifying on and calling __getitem__ would subsequently return incorrect results (:issue:`43355`)
Bug in :meth:`GroupBy.apply` with time-based :class:`Grouper` objects incorrectly raising ValueError in corner cases where the grouping vector contains a NaT (:issue:`43500`, :issue:`43515`)
Bug in :meth:`GroupBy.mean` failing with complex dtype (:issue:`43701`)
Fixed bug in :meth:`Series.rolling` and :meth:`DataFrame.rolling` not calculating window bounds correctly for the first row when center=True and index is decreasing (:issue:`43927`)
Fixed bug in :meth:`Series.rolling` and :meth:`DataFrame.rolling` for centered datetimelike windows with uneven nanosecond (:issue:`43997`)
Bug in :meth:`GroupBy.nth` failing on axis=1 (:issue:`43926`)
Fixed bug in :meth:`Series.rolling` and :meth:`DataFrame.rolling` not respecting right bound on centered datetime-like windows, if the index contains duplicates (:issue:`3944`)

Reshaping

Improved error message when creating a :class:`DataFrame` column from a multi-dimensional :class:`numpy.ndarray` (:issue:`42463`)
:func:`concat` creating :class:`MultiIndex` with duplicate level entries when concatenating a :class:`DataFrame` with duplicates in :class:`Index` and multiple keys (:issue:`42651`)
Bug in :meth:`pandas.cut` on :class:`Series` with duplicate indices (:issue:`42185`) and non-exact :meth:`pandas.CategoricalIndex` (:issue:`42425`)
Bug in :meth:`DataFrame.append` failing to retain dtypes when appended columns do not match (:issue:`43392`)
Bug in :func:`concat` of bool and boolean dtypes resulting in object dtype instead of boolean dtype (:issue:`42800`)
Bug in :func:`crosstab` when inputs are categorical Series, there are categories that are not present in one or both of the Series, and margins=True. Previously the margin value for missing categories was NaN. It is now correctly reported as 0 (:issue:`43505`)
Bug in :func:`concat` would fail when the objs argument all had the same index and the keys argument contained duplicates (:issue:`43595`)
Bug in :func:`concat` which ignored the sort parameter (:issue:`43375`)

Sparse

Bug in :meth:`DataFrame.sparse.to_coo` raising AttributeError when column names are not unique (:issue:`29564`)
Bug in :meth:`SparseArray.max` and :meth:`SparseArray.min` raising ValueError for arrays with 0 non-null elements (:issue:`43527`)
Bug in :meth:`DataFrame.sparse.to_coo` silently converting non-zero fill values to zero (:issue:`24817`)
Bug in :class:`SparseArray` comparison methods with an array-like operand of mismatched length raising AssertionError or unclear ValueError depending on the input (:issue:`43863`)

ExtensionArray

Bug in :func:`array` failing to preserve :class:`PandasArray` (:issue:`43887`)
NumPy ufuncs np.abs, np.positive, np.negative now correctly preserve dtype when called on ExtensionArrays that implement __abs__, __pos__, __neg__, respectively. In particular this is fixed for :class:`TimedeltaArray` (:issue:`43899`)

Styler

Minor bug in :class:`.Styler` where the uuid at initialization maintained a floating underscore (:issue:`43037`)
Bug in :meth:`.Styler.to_html` where the Styler object was updated if the to_html method was called with some args (:issue:`43034`)
Bug in :meth:`.Styler.copy` where uuid was not previously copied (:issue:`40675`)
Bug in :meth:`Styler.apply` where functions which returned Series objects were not correctly handled in terms of aligning their index labels (:issue:`13657`, :issue:`42014`)
Bug when rendering an empty DataFrame with a named index (:issue:`43305`).
Bug when rendering a single level MultiIndex (:issue:`43383`).
Bug when combining non-sparse rendering and :meth:`.Styler.hide_columns` or :meth:`.Styler.hide_index` (:issue:`43464`)

Other

Bug in :meth:`CustomBusinessMonthBegin.__add__` (:meth:`CustomBusinessMonthEnd.__add__`) not applying the extra offset parameter when beginning (end) of the target month is already a business day (:issue:`41356`)
