lithomas1
diff --git a/doc/source/development/policies.rst b/doc/source/development/policies.rst
Lines changed: 2 additions & 2 deletions
diff --git a/doc/source/reference/frame.rst b/doc/source/reference/frame.rst
Lines changed: 1 addition & 0 deletions
diff --git a/doc/source/reference/general_functions.rst b/doc/source/reference/general_functions.rst
Lines changed: 7 additions & 0 deletions
diff --git a/doc/source/user_guide/gotchas.rst b/doc/source/user_guide/gotchas.rst
Lines changed: 1 addition & 1 deletion
diff --git a/doc/source/whatsnew/v1.5.0.rst b/doc/source/whatsnew/v1.5.0.rst
Lines changed: 39 additions & 21 deletions
diff --git a/pandas/_libs/hashtable.pyi b/pandas/_libs/hashtable.pyi
Lines changed: 4 additions & 1 deletion
diff --git a/pandas/_libs/hashtable_func_helper.pxi.in b/pandas/_libs/hashtable_func_helper.pxi.in
Lines changed: 33 additions & 19 deletions
diff --git a/pandas/_libs/lib.pyx b/pandas/_libs/lib.pyx
Lines changed: 14 additions & 14 deletions
@@ -51,7 +51,7 @@ pandas may change the behavior of experimental features at any time.
 Python support
 ~~~~~~~~~~~~~~
 
-pandas will only drop support for specific Python versions (e.g. 3.6.x, 3.7.x) in
-pandas **major** or **minor** releases.
+pandas mirrors the `NumPy guidelines for Python support <https://numpy.org/neps/nep-0029-deprecation_policy.html#implementation>`__.
+
 
 .. _SemVer: https://semver.org
@@ -391,3 +391,4 @@ Serialization / IO / conversion
    DataFrame.to_clipboard
    DataFrame.to_markdown
    DataFrame.style
+   DataFrame.__dataframe__
@@ -78,3 +78,10 @@ Hashing
 
    util.hash_array
    util.hash_pandas_object
+
+Importing from other DataFrame libraries
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. autosummary::
+   :toctree: api/
+
+   api.exchange.from_dataframe
@@ -367,7 +367,7 @@ integer arrays to floating when NAs must be introduced.
 Differences with NumPy
 ----------------------
 For :class:`Series` and :class:`DataFrame` objects, :meth:`~DataFrame.var` normalizes by
-``N-1`` to produce unbiased estimates of the sample variance, while NumPy's
+``N-1`` to produce `unbiased estimates of the population variance <https://en.wikipedia.org/wiki/Bias_of_an_estimator>`__, while NumPy's
 :meth:`numpy.var` normalizes by N, which measures the variance of the sample. Note that
 :meth:`~DataFrame.cov` normalizes by ``N-1`` in both pandas and NumPy.
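The two normalizations discussed in this hunk can be compared with a small standard-library sketch. The helper name `variance_with_ddof` is hypothetical and exists only for illustration; its `ddof` parameter is meant to mirror the keyword of the same name in pandas (default 1) and NumPy (default 0).

```python
import statistics


def variance_with_ddof(values, ddof):
    """Sum of squared deviations from the mean, divided by (N - ddof).

    Hypothetical helper for illustration only; not part of pandas or NumPy.
    """
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - ddof)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# ddof=1 divides by N-1 (the pandas default for ``var``).
sample_style = variance_with_ddof(data, ddof=1)

# ddof=0 divides by N (the NumPy default for ``numpy.var``).
population_style = variance_with_ddof(data, ddof=0)

# The standard library exposes the same two conventions.
assert abs(sample_style - statistics.variance(data)) < 1e-12
assert abs(population_style - statistics.pvariance(data)) < 1e-12

print(sample_style, population_style)
```

For any data with non-zero spread, the ``N-1`` result is strictly larger than the ``N`` result, which is why the two libraries disagree on the same input unless ``ddof`` is set explicitly.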
 
 
@@ -14,15 +14,40 @@ including other versions of pandas.
 Enhancements
 ~~~~~~~~~~~~
 
+.. _whatsnew_150.enhancements.dataframe_exchange:
+
+DataFrame exchange protocol implementation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+pandas now implements the DataFrame exchange API spec.
+See the full details on the API at https://data-apis.org/dataframe-protocol/latest/index.html
+
+The protocol consists of two parts:
+
+  - New method :meth:`DataFrame.__dataframe__` which produces the exchange object.
+    It effectively "exports" the pandas DataFrame as an exchange object so
+    any other library which has the protocol implemented can "import" that DataFrame
+    without knowing anything about the producer except that it makes an exchange object.
+  - New function :func:`pandas.api.exchange.from_dataframe` which can take
+    an arbitrary exchange object from any conformant library and construct a
+    pandas DataFrame out of it.
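A minimal round-trip sketch of the two parts described above, assuming a pandas build that ships the protocol. The import path `pandas.api.exchange` is the one named in this diff. The fallback to `pandas.api.interchange` is an assumption about the module name used in released pandas versions and is not taken from this PR.

```python
import pandas as pd

try:
    # Module path as documented in this diff.
    from pandas.api.exchange import from_dataframe
except ImportError:
    # Assumption: released pandas versions expose the same function here.
    from pandas.api.interchange import from_dataframe

df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})

# Producer side: "export" the DataFrame as an exchange object.
exchange_obj = df.__dataframe__()

# The exchange object can be inspected without knowing it came from pandas.
column_names = list(exchange_obj.column_names())
row_count = exchange_obj.num_rows()

# Consumer side: rebuild a pandas DataFrame from the exchange object.
roundtrip = from_dataframe(exchange_obj)

print(column_names, row_count)
print(roundtrip)
```

In practice the producer and consumer would be different libraries; pandas plays both roles here only to keep the sketch self-contained.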
+
 .. _whatsnew_150.enhancements.styler:
 
 Styler
 ^^^^^^
 
-  - New method :meth:`.Styler.to_string` for alternative customisable output methods (:issue:`44502`)
-  - Added the ability to render ``border`` and ``border-{side}`` CSS properties in Excel (:issue:`42276`)
-  - Added a new method :meth:`.Styler.concat` which allows adding customised footer rows to visualise additional calculations on the data, e.g. totals and counts etc. (:issue:`43875`, :issue:`46186`)
-  - :meth:`.Styler.highlight_null` now accepts ``color`` consistently with other builtin methods and deprecates ``null_color`` although this remains backwards compatible (:issue:`45907`)
+The most notable development is the new method :meth:`.Styler.concat` which
+allows adding customised footer rows to visualise additional calculations on the data,
+e.g. totals and counts (:issue:`43875`, :issue:`46186`)
+
+Additionally there is an alternative output method :meth:`.Styler.to_string`,
+which allows using the Styler's formatting methods to create, for example, CSVs (:issue:`44502`).
+
+Minor feature improvements are:
+
+  - Adding the ability to render ``border`` and ``border-{side}`` CSS properties in Excel (:issue:`42276`)
+  - Making keyword arguments consistent: :meth:`.Styler.highlight_null` now accepts ``color`` and deprecates ``null_color`` although this remains backwards compatible (:issue:`45907`)
 
 .. _whatsnew_150.enhancements.resample_group_keys:
 
@@ -79,6 +104,7 @@ as seen in the following example.
 
 Other enhancements
 ^^^^^^^^^^^^^^^^^^
+- :meth:`Series.map` now raises when ``arg`` is dict but ``na_action`` is not either ``None`` or ``'ignore'`` (:issue:`46588`)
 - :meth:`MultiIndex.to_frame` now supports the argument ``allow_duplicates`` and raises on duplicate labels if it is missing or False (:issue:`45245`)
 - :class:`StringArray` now accepts array-likes containing nan-likes (``None``, ``np.nan``) for the ``values`` parameter in its constructor in addition to strings and :attr:`pandas.NA`. (:issue:`40839`)
 - Improved the rendering of ``categories`` in :class:`CategoricalIndex` (:issue:`45218`)
@@ -94,7 +120,9 @@ Other enhancements
 - :meth:`DataFrame.reset_index` now accepts a ``names`` argument which renames the index names (:issue:`6878`)
 - :meth:`pd.concat` now raises when ``levels`` is given but ``keys`` is None (:issue:`46653`)
 - :meth:`pd.concat` now raises when ``levels`` contains duplicate values (:issue:`46653`)
-- Added ``numeric_only`` argument to :meth:`DataFrame.corr`, :meth:`DataFrame.corrwith`, and :meth:`DataFrame.cov` (:issue:`46560`)
+- Added ``numeric_only`` argument to :meth:`DataFrame.corr`, :meth:`DataFrame.corrwith`, :meth:`DataFrame.cov`, :meth:`DataFrame.idxmin`, :meth:`DataFrame.idxmax`, :meth:`.GroupBy.idxmin`, :meth:`.GroupBy.idxmax`, :meth:`.GroupBy.var`, :meth:`.GroupBy.std`, :meth:`.GroupBy.sem`, and :meth:`.GroupBy.quantile` (:issue:`46560`)
+- A :class:`errors.PerformanceWarning` is now thrown when using ``string[pyarrow]`` dtype with methods that don't dispatch to ``pyarrow.compute`` methods (:issue:`42613`, :issue:`46725`)
+- Added ``validate`` argument to :meth:`DataFrame.join` (:issue:`46622`)
 - A :class:`errors.PerformanceWarning` is now thrown when using ``string[pyarrow]`` dtype with methods that don't dispatch to ``pyarrow.compute`` methods (:issue:`42613`)
 - Added ``numeric_only`` argument to :meth:`Resampler.sum`, :meth:`Resampler.prod`, :meth:`Resampler.min`, :meth:`Resampler.max`, :meth:`Resampler.first`, and :meth:`Resampler.last` (:issue:`46442`)
 
@@ -106,13 +134,6 @@ Notable bug fixes
 
 These are bug fixes that might have notable behavior changes.
 
-.. _whatsnew_150.notable_bug_fixes.notable_bug_fix1:
-
-Styler
-^^^^^^
-
-- Fixed bug in :class:`CSSToExcelConverter` leading to ``TypeError`` when border color provided without border style for ``xlsxwriter`` engine (:issue:`42276`)
-
 .. _whatsnew_150.notable_bug_fixes.groupby_transform_dropna:
 
 Using ``dropna=True`` with ``groupby`` transforms
@@ -173,13 +194,6 @@ did not have the same index as the input.
     df.groupby('a', dropna=True).transform('ffill')
     df.groupby('a', dropna=True).transform(lambda x: x)
 
-.. _whatsnew_150.notable_bug_fixes.visualization:
-
-Styler
-^^^^^^
-
-- Fix showing "None" as ylabel in :meth:`Series.plot` when not setting ylabel (:issue:`46129`)
-
 .. _whatsnew_150.notable_bug_fixes.to_json_incorrectly_localizing_naive_timestamps:
 
 Serializing tz-naive Timestamps with to_json() with ``iso_dates=True``
@@ -587,7 +601,7 @@ Missing
 - Bug in :meth:`Series.fillna` and :meth:`DataFrame.fillna` with ``downcast`` keyword not being respected in some cases where there are no NA values present (:issue:`45423`)
 - Bug in :meth:`Series.fillna` and :meth:`DataFrame.fillna` with :class:`IntervalDtype` and incompatible value raising instead of casting to a common (usually object) dtype (:issue:`45796`)
 - Bug in :meth:`DataFrame.interpolate` with object-dtype column not returning a copy with ``inplace=False`` (:issue:`45791`)
--
+- Bug in :meth:`DataFrame.dropna` allowing both of the incompatible arguments ``how`` and ``thresh`` to be set (:issue:`46575`)
 
 MultiIndex
 ^^^^^^^^^^
@@ -619,6 +633,8 @@ Period
 ^^^^^^
 - Bug in subtraction of :class:`Period` from :class:`PeriodArray` returning wrong results (:issue:`45999`)
 - Bug in :meth:`Period.strftime` and :meth:`PeriodIndex.strftime`, directives ``%l`` and ``%u`` were giving wrong results (:issue:`46252`)
+- Bug in inferring an incorrect ``freq`` when passing a string to :class:`Period` with microseconds that are a multiple of 1000 (:issue:`46811`)
+- Bug in constructing a :class:`Period` from a :class:`Timestamp` or ``np.datetime64`` object with non-zero nanoseconds and ``freq="ns"`` incorrectly truncating the nanoseconds (:issue:`46811`)
 -
 
 Plotting
@@ -629,6 +645,7 @@ Plotting
 - Bug in :meth:`DataFrame.boxplot` that prevented specifying ``vert=False`` (:issue:`36918`)
 - Bug in :meth:`DataFrame.plot.scatter` that prevented specifying ``norm`` (:issue:`45809`)
 - The function :meth:`DataFrame.plot.scatter` now accepts ``color`` as an alias for ``c`` and ``size`` as an alias for ``s`` for consistency to other plotting functions (:issue:`44670`)
+- Fix showing "None" as ylabel in :meth:`Series.plot` when not setting ylabel (:issue:`46129`)
 
 Groupby/resample/rolling
 ^^^^^^^^^^^^^^^^^^^^^^^^
@@ -645,6 +662,7 @@ Groupby/resample/rolling
 - Bug in :meth:`GroupBy.max` with empty groups and ``uint64`` dtype incorrectly raising ``RuntimeError`` (:issue:`46408`)
 - Bug in :meth:`.GroupBy.apply` would fail when ``func`` was a string and args or kwargs were supplied (:issue:`46479`)
 - Bug in :meth:`SeriesGroupBy.apply` would incorrectly name its result when there was a unique group (:issue:`46369`)
+- Bug in :meth:`Rolling.sum` and :meth:`Rolling.mean` would give incorrect result with window of same values (:issue:`42064`, :issue:`46431`)
 - Bug in :meth:`Rolling.var` and :meth:`Rolling.std` would give non-zero result with window of same values (:issue:`42064`)
 - Bug in :meth:`.Rolling.var` would segfault calculating weighted variance when window size was larger than data size (:issue:`46760`)
 - Bug in :meth:`Grouper.__repr__` where ``dropna`` was not included. Now it is (:issue:`46754`)
@@ -672,7 +690,7 @@ ExtensionArray
 Styler
 ^^^^^^
 - Bug when attempting to apply styling functions to an empty DataFrame subset (:issue:`45313`)
--
+- Bug in :class:`CSSToExcelConverter` leading to ``TypeError`` when border color provided without border style for ``xlsxwriter`` engine (:issue:`42276`)
 
 Metadata
 ^^^^^^^^
 
@@ -197,10 +197,13 @@ def duplicated(
     values: np.ndarray,
     keep: Literal["last", "first", False] = ...,
 ) -> npt.NDArray[np.bool_]: ...
-def mode(values: np.ndarray, dropna: bool) -> np.ndarray: ...
+def mode(
+    values: np.ndarray, dropna: bool, mask: npt.NDArray[np.bool_] | None = None
+) -> np.ndarray: ...
 def value_count(
     values: np.ndarray,
     dropna: bool,
+    mask: npt.NDArray[np.bool_] | None = None,
 ) -> tuple[np.ndarray, npt.NDArray[np.int64],]: ...  # np.ndarray[same-as-values]
 
 # arr and values should have same dtype
 
@@ -31,9 +31,9 @@ dtypes = [('Complex128', 'complex128', 'complex128',
 @cython.wraparound(False)
 @cython.boundscheck(False)
 {{if dtype == 'object'}}
-cdef value_count_{{dtype}}(ndarray[{{dtype}}] values, bint dropna):
+cdef value_count_{{dtype}}(ndarray[{{dtype}}] values, bint dropna, const uint8_t[:] mask=None):
 {{else}}
-cdef value_count_{{dtype}}(const {{dtype}}_t[:] values, bint dropna):
+cdef value_count_{{dtype}}(const {{dtype}}_t[:] values, bint dropna, const uint8_t[:] mask=None):
 {{endif}}
     cdef:
         Py_ssize_t i = 0
@@ -46,6 +46,11 @@ cdef value_count_{{dtype}}(const {{dtype}}_t[:] values, bint dropna):
         {{c_type}} val
 
         int ret = 0
+        bint uses_mask = mask is not None
+        bint isna_entry = False
+
+    if uses_mask and not dropna:
+        raise NotImplementedError("uses_mask not implemented with dropna=False")
 
     # we track the order in which keys are first seen (GH39009),
     # khash-map isn't insertion-ordered, thus:
@@ -56,6 +61,9 @@ cdef value_count_{{dtype}}(const {{dtype}}_t[:] values, bint dropna):
     table = kh_init_{{ttype}}()
 
     {{if dtype == 'object'}}
+    if uses_mask:
+        raise NotImplementedError("uses_mask not implemented with object dtype")
+
     kh_resize_{{ttype}}(table, n // 10)
 
     for i in range(n):
@@ -74,7 +82,13 @@ cdef value_count_{{dtype}}(const {{dtype}}_t[:] values, bint dropna):
     for i in range(n):
         val = {{to_c_type}}(values[i])
 
-        if not is_nan_{{c_type}}(val) or not dropna:
+        if dropna:
+            if uses_mask:
+                isna_entry = mask[i]
+            else:
+                isna_entry = is_nan_{{c_type}}(val)
+
+        if not dropna or not isna_entry:
             k = kh_get_{{ttype}}(table, val)
             if k != table.n_buckets:
                 table.vals[k] += 1
@@ -251,37 +265,37 @@ ctypedef fused htfunc_t:
     complex64_t
 
 
-cpdef value_count(ndarray[htfunc_t] values, bint dropna):
+cpdef value_count(ndarray[htfunc_t] values, bint dropna, const uint8_t[:] mask=None):
     if htfunc_t is object:
-        return value_count_object(values, dropna)
+        return value_count_object(values, dropna, mask=mask)
 
     elif htfunc_t is int8_t:
-        return value_count_int8(values, dropna)
+        return value_count_int8(values, dropna, mask=mask)
     elif htfunc_t is int16_t:
-        return value_count_int16(values, dropna)
+        return value_count_int16(values, dropna, mask=mask)
     elif htfunc_t is int32_t:
-        return value_count_int32(values, dropna)
+        return value_count_int32(values, dropna, mask=mask)
     elif htfunc_t is int64_t:
-        return value_count_int64(values, dropna)
+        return value_count_int64(values, dropna, mask=mask)
 
     elif htfunc_t is uint8_t:
-        return value_count_uint8(values, dropna)
+        return value_count_uint8(values, dropna, mask=mask)
     elif htfunc_t is uint16_t:
-        return value_count_uint16(values, dropna)
+        return value_count_uint16(values, dropna, mask=mask)
     elif htfunc_t is uint32_t:
-        return value_count_uint32(values, dropna)
+        return value_count_uint32(values, dropna, mask=mask)
     elif htfunc_t is uint64_t:
-        return value_count_uint64(values, dropna)
+        return value_count_uint64(values, dropna, mask=mask)
 
     elif htfunc_t is float64_t:
-        return value_count_float64(values, dropna)
+        return value_count_float64(values, dropna, mask=mask)
     elif htfunc_t is float32_t:
-        return value_count_float32(values, dropna)
+        return value_count_float32(values, dropna, mask=mask)
 
     elif htfunc_t is complex128_t:
-        return value_count_complex128(values, dropna)
+        return value_count_complex128(values, dropna, mask=mask)
     elif htfunc_t is complex64_t:
-        return value_count_complex64(values, dropna)
+        return value_count_complex64(values, dropna, mask=mask)
 
     else:
         raise TypeError(values.dtype)
@@ -361,7 +375,7 @@ cpdef ismember(ndarray[htfunc_t] arr, ndarray[htfunc_t] values):
 
 @cython.wraparound(False)
 @cython.boundscheck(False)
-def mode(ndarray[htfunc_t] values, bint dropna):
+def mode(ndarray[htfunc_t] values, bint dropna, const uint8_t[:] mask=None):
     # TODO(cython3): use const htfunct_t[:]
 
     cdef:
@@ -372,7 +386,7 @@ def mode(ndarray[htfunc_t] values, bint dropna):
         int64_t count, max_count = -1
         Py_ssize_t nkeys, k, j = 0
 
-    keys, counts = value_count(values, dropna)
+    keys, counts = value_count(values, dropna, mask=mask)
     nkeys = len(keys)
 
     modes = np.empty(nkeys, dtype=values.dtype)
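The control flow that the new `mask` argument adds to `value_count` and `mode` can be sketched in plain Python. The names `value_count_sketch` and `mode_sketch` are hypothetical. A `dict` stands in for the khash table, and NaN detection is simplified to a float check, so this illustrates the branching only and is not the Cython implementation.

```python
import math


def value_count_sketch(values, dropna, mask=None):
    """Count values in first-seen order, optionally skipping masked entries."""
    uses_mask = mask is not None
    if uses_mask and not dropna:
        # Mirrors the guard added in the diff.
        raise NotImplementedError("uses_mask not implemented with dropna=False")

    counts = {}  # dicts keep insertion order, like the first-seen tracking
    for i, val in enumerate(values):
        isna_entry = False
        if dropna:
            if uses_mask:
                # With a mask, the mask alone decides what counts as missing.
                isna_entry = bool(mask[i])
            else:
                # Simplified stand-in for the is_nan_* helpers.
                isna_entry = isinstance(val, float) and math.isnan(val)

        if not dropna or not isna_entry:
            counts[val] = counts.get(val, 0) + 1

    return list(counts.keys()), list(counts.values())


def mode_sketch(values, dropna, mask=None):
    """Return every key whose count equals the maximum count."""
    keys, counts = value_count_sketch(values, dropna, mask=mask)
    if not counts:
        return []
    max_count = max(counts)
    return [key for key, count in zip(keys, counts) if count == max_count]


values = [1, 2, 2, 3, 3]
mask = [False, False, False, True, True]  # the two 3s are treated as missing

print(value_count_sketch(values, dropna=True, mask=mask))
print(mode_sketch(values, dropna=True, mask=mask))
print(mode_sketch(values, dropna=True))
```

The point of the mask branch is that masked arrays store a placeholder value at missing positions, so the value itself cannot be used to decide whether an entry is missing.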
 
@@ -873,7 +873,7 @@ def get_level_sorter(
     """
     cdef:
         Py_ssize_t i, l, r
-        ndarray[intp_t, ndim=1] out = np.empty(len(codes), dtype=np.intp)
+        ndarray[intp_t, ndim=1] out = cnp.PyArray_EMPTY(1, codes.shape, cnp.NPY_INTP, 0)
 
     for i in range(len(starts) - 1):
         l, r = starts[i], starts[i + 1]
@@ -2255,11 +2255,11 @@ def maybe_convert_numeric(
         int status, maybe_int
         Py_ssize_t i, n = values.size
         Seen seen = Seen(coerce_numeric)
-        ndarray[float64_t, ndim=1] floats = np.empty(n, dtype='f8')
-        ndarray[complex128_t, ndim=1] complexes = np.empty(n, dtype='c16')
-        ndarray[int64_t, ndim=1] ints = np.empty(n, dtype='i8')
-        ndarray[uint64_t, ndim=1] uints = np.empty(n, dtype='u8')
-        ndarray[uint8_t, ndim=1] bools = np.empty(n, dtype='u1')
+        ndarray[float64_t, ndim=1] floats = cnp.PyArray_EMPTY(1, values.shape, cnp.NPY_FLOAT64, 0)
+        ndarray[complex128_t, ndim=1] complexes = cnp.PyArray_EMPTY(1, values.shape, cnp.NPY_COMPLEX128, 0)
+        ndarray[int64_t, ndim=1] ints = cnp.PyArray_EMPTY(1, values.shape, cnp.NPY_INT64, 0)
+        ndarray[uint64_t, ndim=1] uints = cnp.PyArray_EMPTY(1, values.shape, cnp.NPY_UINT64, 0)
+        ndarray[uint8_t, ndim=1] bools = cnp.PyArray_EMPTY(1, values.shape, cnp.NPY_UINT8, 0)
         ndarray[uint8_t, ndim=1] mask = np.zeros(n, dtype="u1")
         float64_t fval
         bint allow_null_in_int = convert_to_masked_nullable
@@ -2479,11 +2479,11 @@ def maybe_convert_objects(ndarray[object] objects,
 
     n = len(objects)
 
-    floats = np.empty(n, dtype='f8')
-    complexes = np.empty(n, dtype='c16')
-    ints = np.empty(n, dtype='i8')
-    uints = np.empty(n, dtype='u8')
-    bools = np.empty(n, dtype=np.uint8)
+    floats = cnp.PyArray_EMPTY(1, objects.shape, cnp.NPY_FLOAT64, 0)
+    complexes = cnp.PyArray_EMPTY(1, objects.shape, cnp.NPY_COMPLEX128, 0)
+    ints = cnp.PyArray_EMPTY(1, objects.shape, cnp.NPY_INT64, 0)
+    uints = cnp.PyArray_EMPTY(1, objects.shape, cnp.NPY_UINT64, 0)
+    bools = cnp.PyArray_EMPTY(1, objects.shape, cnp.NPY_UINT8, 0)
     mask = np.full(n, False)
 
     if convert_datetime:
@@ -2785,7 +2785,7 @@ cdef _infer_all_nats(dtype, ndarray datetimes, ndarray timedeltas):
     else:
         # ExtensionDtype
         cls = dtype.construct_array_type()
-        i8vals = np.empty(len(datetimes), dtype="i8")
+        i8vals = cnp.PyArray_EMPTY(1, datetimes.shape, cnp.NPY_INT64, 0)
         i8vals.fill(NPY_NAT)
         result = cls(i8vals, dtype=dtype)
     return result
@@ -2888,7 +2888,7 @@ def map_infer(
         object val
 
     n = len(arr)
-    result = np.empty(n, dtype=object)
+    result = cnp.PyArray_EMPTY(1, arr.shape, cnp.NPY_OBJECT, 0)
     for i in range(n):
         if ignore_na and checknull(arr[i]):
             result[i] = arr[i]
@@ -3083,7 +3083,7 @@ cpdef ndarray eq_NA_compat(ndarray[object] arr, object key):
     key is assumed to have `not isna(key)`
     """
     cdef:
-        ndarray[uint8_t, cast=True] result = np.empty(len(arr), dtype=bool)
+        ndarray[uint8_t, cast=True] result = cnp.PyArray_EMPTY(arr.ndim, arr.shape, cnp.NPY_BOOL, 0)
         Py_ssize_t i
         object item