Commit 2924cc5

Merge remote-tracking branch 'upstream/master' into Interval-array
2 parents 02c7720 + ec3d786 commit 2924cc5

99 files changed

+1615
-1180
lines changed

README.md

+2
@@ -233,3 +233,5 @@ You can also triage issues which may include reproducing bug reports, or asking
233233
Or maybe through using pandas you have an idea of your own or are looking for something in the documentation and thinking ‘this can be improved’...you can do something about it!
234234

235235
Feel free to ask questions on the [mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata) or on [Gitter](https://gitter.im/pydata/pandas).
236+
237+
As contributors and maintainers to this project, you are expected to abide by pandas' code of conduct. More information can be found at: [Contributor Code of Conduct](https://github.com/pandas-dev/pandas/blob/master/.github/CODE_OF_CONDUCT.md)

ci/deps/azure-37-locale.yaml

+2-2
@@ -26,8 +26,8 @@ dependencies:
2626
- xlsxwriter
2727
- xlwt
2828
# universal
29-
- pytest>=4.0.2
30-
- pytest-xdist
29+
- pytest>=5.0.1
30+
- pytest-xdist>=1.29.0
3131
- pytest-mock
3232
- pytest-azurepipelines
3333
- pip

ci/deps/azure-37-numpydev.yaml

+2-1
@@ -6,7 +6,8 @@ dependencies:
66
- pytz
77
- Cython>=0.28.2
88
# universal
9-
- pytest>=4.0.2
9+
# pytest < 5 until defaults has pytest-xdist>=1.29.0
10+
- pytest>=4.0.2,<5.0
1011
- pytest-xdist
1112
- pytest-mock
1213
- hypothesis>=3.58.0

ci/deps/azure-macos-35.yaml

+2-2
@@ -25,8 +25,8 @@ dependencies:
2525
- pip:
2626
- pyreadstat
2727
# universal
28-
- pytest==4.5.0
29-
- pytest-xdist
28+
- pytest>=5.0.1
29+
- pytest-xdist>=1.29.0
3030
- pytest-mock
3131
- hypothesis>=3.58.0
3232
# https://github.com/pandas-dev/pandas/issues/27421

ci/deps/azure-windows-36.yaml

+2-2
@@ -23,8 +23,8 @@ dependencies:
2323
- xlwt
2424
# universal
2525
- cython>=0.28.2
26-
- pytest>=4.0.2
27-
- pytest-xdist
26+
- pytest>=5.0.1
27+
- pytest-xdist>=1.29.0
2828
- pytest-mock
2929
- pytest-azurepipelines
3030
- hypothesis>=3.58.0

ci/deps/azure-windows-37.yaml

+2-2
@@ -26,8 +26,8 @@ dependencies:
2626
- xlwt
2727
# universal
2828
- cython>=0.28.2
29-
- pytest>=4.0.2
30-
- pytest-xdist
29+
- pytest>=5.0.0
30+
- pytest-xdist>=1.29.0
3131
- pytest-mock
3232
- pytest-azurepipelines
3333
- hypothesis>=3.58.0

ci/deps/travis-36-cov.yaml

+2-2
@@ -39,8 +39,8 @@ dependencies:
3939
- xlsxwriter
4040
- xlwt
4141
# universal
42-
- pytest
43-
- pytest-xdist
42+
- pytest>=5.0.1
43+
- pytest-xdist>=1.29.0
4444
- pytest-cov
4545
- pytest-mock
4646
- hypothesis>=3.58.0

ci/deps/travis-37.yaml

+2-2
@@ -13,8 +13,8 @@ dependencies:
1313
- pyarrow
1414
- pytz
1515
# universal
16-
- pytest>=4.0.2
17-
- pytest-xdist
16+
- pytest>=5.0.0
17+
- pytest-xdist>=1.29.0
1818
- pytest-mock
1919
- hypothesis>=3.58.0
2020
- s3fs

doc/source/user_guide/io.rst

+8-6
@@ -28,6 +28,7 @@ The pandas I/O API is a set of top level ``reader`` functions accessed like
2828
:delim: ;
2929

3030
text;`CSV <https://en.wikipedia.org/wiki/Comma-separated_values>`__;:ref:`read_csv<io.read_csv_table>`;:ref:`to_csv<io.store_in_csv>`
31+
text;Fixed-Width Text File;:ref:`read_fwf<io.fwf_reader>`
3132
text;`JSON <https://www.json.org/>`__;:ref:`read_json<io.json_reader>`;:ref:`to_json<io.json_writer>`
3233
text;`HTML <https://en.wikipedia.org/wiki/HTML>`__;:ref:`read_html<io.read_html>`;:ref:`to_html<io.html>`
3334
text; Local clipboard;:ref:`read_clipboard<io.clipboard>`;:ref:`to_clipboard<io.clipboard>`
@@ -1372,6 +1373,7 @@ should pass the ``escapechar`` option:
13721373
print(data)
13731374
pd.read_csv(StringIO(data), escapechar='\\')
13741375
1376+
.. _io.fwf_reader:
13751377
.. _io.fwf:
13761378

13771379
Files with fixed width columns
@@ -3572,7 +3574,7 @@ Closing a Store and using a context manager:
35723574
Read/write API
35733575
''''''''''''''
35743576

3575-
``HDFStore`` supports an top-level API using ``read_hdf`` for reading and ``to_hdf`` for writing,
3577+
``HDFStore`` supports a top-level API using ``read_hdf`` for reading and ``to_hdf`` for writing,
35763578
similar to how ``read_csv`` and ``to_csv`` work.
35773579

35783580
.. ipython:: python
@@ -3687,7 +3689,7 @@ Hierarchical keys
36873689
Keys to a store can be specified as a string. These can be in a
36883690
hierarchical path-name like format (e.g. ``foo/bar/bah``), which will
36893691
generate a hierarchy of sub-stores (or ``Groups`` in PyTables
3690-
parlance). Keys can be specified with out the leading '/' and are **always**
3692+
parlance). Keys can be specified without the leading '/' and are **always**
36913693
absolute (e.g. 'foo' refers to '/foo'). Removal operations can remove
36923694
everything in the sub-store and **below**, so be *careful*.
36933695

@@ -3825,7 +3827,7 @@ data.
38253827

38263828
A query is specified using the ``Term`` class under the hood, as a boolean expression.
38273829

3828-
* ``index`` and ``columns`` are supported indexers of a ``DataFrames``.
3830+
* ``index`` and ``columns`` are supported indexers of ``DataFrames``.
38293831
* if ``data_columns`` are specified, these can be used as additional indexers.
38303832

38313833
Valid comparison operators are:
@@ -3917,7 +3919,7 @@ Use boolean expressions, with in-line function evaluation.
39173919
39183920
store.select('dfq', "index>pd.Timestamp('20130104') & columns=['A', 'B']")
39193921
3920-
Use and inline column reference
3922+
Use inline column reference.
39213923

39223924
.. ipython:: python
39233925
@@ -4593,8 +4595,8 @@ Performance
45934595
write chunksize (default is 50000). This will significantly lower
45944596
your memory usage on writing.
45954597
* You can pass ``expectedrows=<int>`` to the first ``append``,
4596-
to set the TOTAL number of expected rows that ``PyTables`` will
4597-
expected. This will optimize read/write performance.
4598+
to set the TOTAL number of rows that ``PyTables`` will expect.
4599+
This will optimize read/write performance.
45984600
* Duplicate rows can be written to tables, but are filtered out in
45994601
selection (with the last items being selected; thus a table is
46004602
unique on major, minor pairs)

doc/source/whatsnew/v0.25.1.rst

+27-15
@@ -25,14 +25,13 @@ Bug fixes
2525
Categorical
2626
^^^^^^^^^^^
2727

28-
-
29-
-
28+
- Bug in :meth:`Categorical.fillna` would replace all values, not just those that are ``NaN`` (:issue:`26215`)
3029
-
3130

3231
Datetimelike
3332
^^^^^^^^^^^^
3433
- Bug in :func:`to_datetime` where passing a timezone-naive :class:`DatetimeArray` or :class:`DatetimeIndex` and ``utc=True`` would incorrectly return a timezone-naive result (:issue:`27733`)
35-
-
34+
- Bug in :meth:`Period.to_timestamp` where a :class:`Period` outside the :class:`Timestamp` implementation bounds (roughly 1677-09-21 to 2262-04-11) would return an incorrect :class:`Timestamp` instead of raising ``OutOfBoundsDatetime`` (:issue:`19643`)
3635
-
3736
-
3837

@@ -54,8 +53,8 @@ Numeric
5453
^^^^^^^
5554
- Bug in :meth:`Series.interpolate` when using a timezone aware :class:`DatetimeIndex` (:issue:`27548`)
5655
- Bug when printing negative floating point complex numbers would raise an ``IndexError`` (:issue:`27484`)
57-
-
58-
-
56+
- Bug where :class:`DataFrame` arithmetic operators such as :meth:`DataFrame.mul` with a :class:`Series` with axis=1 would raise an ``AttributeError`` on :class:`DataFrame` larger than the minimum threshold to invoke numexpr (:issue:`27636`)
57+
- Bug in :class:`DataFrame` arithmetic where missing values in results were incorrectly masked with ``NaN`` instead of ``Inf`` (:issue:`27464`)
5958

6059
Conversion
6160
^^^^^^^^^^
@@ -83,14 +82,15 @@ Indexing
8382
^^^^^^^^
8483

8584
- Bug in partial-string indexing returning a NumPy array rather than a ``Series`` when indexing with a scalar like ``.loc['2015']`` (:issue:`27516`)
86-
- Break reference cycle involving :class:`Index` to allow garbage collection of :class:`Index` objects without running the GC. (:issue:`27585`)
87-
-
85+
- Break reference cycle involving :class:`Index` and other index classes to allow garbage collection of index objects without running the GC. (:issue:`27585`, :issue:`27840`)
86+
- Fix regression in assigning values to a single column of a DataFrame with a ``MultiIndex`` columns (:issue:`27841`).
87+
- Fix regression in ``.ix`` fallback with an ``IntervalIndex`` (:issue:`27865`).
8888
-
8989

9090
Missing
9191
^^^^^^^
9292

93-
-
93+
- Bug in :func:`pandas.isnull` or :func:`pandas.isna` when the input is a type e.g. `type(pandas.Series())` (:issue:`27482`)
9494
-
9595
-
9696

@@ -103,37 +103,41 @@ MultiIndex
103103

104104
I/O
105105
^^^
106-
107-
-
108-
-
106+
- Avoid calling ``S3File.s3`` when reading parquet, as this was removed in s3fs version 0.3.0 (:issue:`27756`)
107+
- Better error message when a negative header is passed in :func:`pandas.read_csv` (:issue:`27779`)
108+
- Follow the ``min_rows`` display option (introduced in v0.25.0) correctly in the html repr in the notebook (:issue:`27991`).
109109
-
110110

111111
Plotting
112112
^^^^^^^^
113113

114114
- Added a pandas_plotting_backends entrypoint group for registering plot backends. See :ref:`extending.plotting-backends` for more (:issue:`26747`).
115+
- Fixed the re-instatement of Matplotlib datetime converters after calling
116+
`pandas.plotting.deregister_matplotlib_converters()` (:issue:`27481`).
117+
-
115118
- Fix compatibility issue with matplotlib when passing a pandas ``Index`` to a plot call (:issue:`27775`).
116119
-
117120

118121
Groupby/resample/rolling
119122
^^^^^^^^^^^^^^^^^^^^^^^^
120123

121124
- Bug in :meth:`pandas.core.groupby.DataFrameGroupBy.transform` where applying a timezone conversion lambda function would drop timezone information (:issue:`27496`)
125+
- Bug in :meth:`pandas.core.groupby.GroupBy.nth` where ``observed=False`` was being ignored for Categorical groupers (:issue:`26385`)
122126
- Bug in windowing over read-only arrays (:issue:`27766`)
123-
-
127+
- Fixed segfault in `pandas.core.groupby.DataFrameGroupBy.quantile` when an invalid quantile was passed (:issue:`27470`)
124128
-
125129

126130
Reshaping
127131
^^^^^^^^^
128132

129133
- A ``KeyError`` is now raised if ``.unstack()`` is called on a :class:`Series` or :class:`DataFrame` with a flat :class:`Index` passing a name which is not the correct one (:issue:`18303`)
130-
- Bug in :meth:`DataFrame.crosstab` when ``margins`` set to ``True`` and ``normalize`` is not ``False``, an error is raised. (:issue:`27500`)
134+
- Bug in :meth:`DataFrame.crosstab` when ``margins`` set to ``True`` and ``normalize`` is not ``False``, an error is raised. (:issue:`27500`)
131135
- :meth:`DataFrame.join` now suppresses the ``FutureWarning`` when the sort parameter is specified (:issue:`21952`)
132-
-
136+
- Bug in :meth:`DataFrame.join` raising with readonly arrays (:issue:`27943`)
133137

134138
Sparse
135139
^^^^^^
136-
140+
- Bug in reductions for :class:`Series` with Sparse dtypes (:issue:`27080`)
137141
-
138142
-
139143
-
@@ -160,6 +164,14 @@ Other
160164
-
161165
-
162166

167+
I/O and LZMA
168+
~~~~~~~~~~~~
169+
170+
Some users may unknowingly have an incomplete Python installation, which lacks the `lzma` module from the standard library. In this case, `import pandas` failed due to an `ImportError` (:issue: `27575`).
171+
Pandas will now warn, rather than raising an `ImportError` if the `lzma` module is not present. Any subsequent attempt to use `lzma` methods will raise a `RuntimeError`.
172+
A possible fix for the lack of the `lzma` module is to ensure you have the necessary libraries and then re-install Python.
173+
For example, on MacOS installing Python with `pyenv` may lead to an incomplete Python installation due to unmet system dependencies at compilation time (like `xz`). Compilation will succeed, but Python might fail at run time. The issue can be solved by installing the necessary dependencies and then re-installing Python.
174+
163175
.. _whatsnew_0.251.contributors:
164176

165177
Contributors

doc/source/whatsnew/v0.7.3.rst

-6
@@ -25,8 +25,6 @@ New features
2525
from pandas.tools.plotting import scatter_matrix
2626
scatter_matrix(df, alpha=0.2) # noqa F821
2727
28-
.. image:: ../savefig/scatter_matrix_kde.png
29-
:width: 5in
3028
3129
- Add ``stacked`` argument to Series and DataFrame's ``plot`` method for
3230
:ref:`stacked bar plots <visualization.barplot>`.
@@ -35,15 +33,11 @@ New features
3533
3634
df.plot(kind='bar', stacked=True) # noqa F821
3735
38-
.. image:: ../savefig/bar_plot_stacked_ex.png
39-
:width: 4in
4036
4137
.. code-block:: python
4238
4339
df.plot(kind='barh', stacked=True) # noqa F821
4440
45-
.. image:: ../savefig/barh_plot_stacked_ex.png
46-
:width: 4in
4741
4842
- Add log x and y :ref:`scaling options <visualization.basic>` to
4943
``DataFrame.plot`` and ``Series.plot``

doc/source/whatsnew/v1.0.0.rst

+12-8
@@ -21,27 +21,27 @@ including other versions of pandas.
2121
Enhancements
2222
~~~~~~~~~~~~
2323

24-
.. _whatsnew_1000.enhancements.other:
25-
2624
-
2725
-
2826

27+
.. _whatsnew_1000.enhancements.other:
28+
2929
Other enhancements
3030
^^^^^^^^^^^^^^^^^^
3131

32-
.. _whatsnew_1000.api_breaking:
33-
3432
-
3533
-
3634

35+
.. _whatsnew_1000.api_breaking:
36+
3737
Backwards incompatible API changes
3838
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3939

40-
.. _whatsnew_1000.api.other:
41-
4240
- :class:`pandas.core.groupby.GroupBy.transform` now raises on invalid operation names (:issue:`27489`).
4341
-
4442

43+
.. _whatsnew_1000.api.other:
44+
4545
Other API changes
4646
^^^^^^^^^^^^^^^^^
4747

@@ -87,6 +87,7 @@ Bug fixes
8787
Categorical
8888
^^^^^^^^^^^
8989

90+
- Added test to assert the :func:`fillna` raises the correct ValueError message when the value isn't a value from categories (:issue:`13628`)
9091
-
9192
-
9293

@@ -157,14 +158,17 @@ MultiIndex
157158
I/O
158159
^^^
159160

160-
-
161+
- :meth:`read_csv` now accepts binary mode file buffers when using the Python csv engine (:issue:`23779`)
161162
-
162163

163164
Plotting
164165
^^^^^^^^
165166

167+
- Bug in :meth:`Series.plot` not able to plot boolean values (:issue:`23719`)
166168
-
167-
-
169+
- Bug in :meth:`DataFrame.plot` producing incorrect legend markers when plotting multiple series on the same axis (:issue:`18222`)
170+
- Bug in :meth:`DataFrame.plot` when ``kind='box'`` and data contains datetime or timedelta data. These types are now automatically dropped (:issue:`22799`)
171+
- Bug in :meth:`DataFrame.plot.line` and :meth:`DataFrame.plot.area` produce wrong xlim in x-axis (:issue:`27686`, :issue:`25160`, :issue:`24784`)
168172

169173
Groupby/resample/rolling
170174
^^^^^^^^^^^^^^^^^^^^^^^^

pandas/_libs/groupby.pyx

+5
@@ -719,6 +719,11 @@ def group_quantile(ndarray[float64_t] out,
         ndarray[int64_t] counts, non_na_counts, sort_arr
 
     assert values.shape[0] == N
+
+    if not (0 <= q <= 1):
+        raise ValueError("'q' must be between 0 and 1. Got"
+                         " '{}' instead".format(q))
+
     inter_methods = {
         'linear': INTERPOLATION_LINEAR,
         'lower': INTERPOLATION_LOWER,
pandas/_libs/hashtable.pyx

+1-1
@@ -108,7 +108,7 @@ cdef class Int64Factorizer:
     def get_count(self):
         return self.count
 
-    def factorize(self, int64_t[:] values, sort=False,
+    def factorize(self, const int64_t[:] values, sort=False,
                   na_sentinel=-1, na_value=None):
         """
         Factorize values with nans replaced by na_sentinel

pandas/_libs/parsers.pyx

+5-3
@@ -2,7 +2,6 @@
 # See LICENSE for the license
 import bz2
 import gzip
-import lzma
 import os
 import sys
 import time
@@ -59,9 +58,12 @@ from pandas.core.arrays import Categorical
 from pandas.core.dtypes.concat import union_categoricals
 import pandas.io.common as icom
 
+from pandas.compat import _import_lzma, _get_lzma_file
 from pandas.errors import (ParserError, DtypeWarning,
                            EmptyDataError, ParserWarning)
 
+lzma = _import_lzma()
+
 # Import CParserError as alias of ParserError for backwards compatibility.
 # Ultimately, we want to remove this import. See gh-12665 and gh-14479.
 CParserError = ParserError
@@ -645,9 +647,9 @@ cdef class TextReader:
                                      'zip file %s', str(zip_names))
             elif self.compression == 'xz':
                 if isinstance(source, str):
-                    source = lzma.LZMAFile(source, 'rb')
+                    source = _get_lzma_file(lzma)(source, 'rb')
                 else:
-                    source = lzma.LZMAFile(filename=source)
+                    source = _get_lzma_file(lzma)(filename=source)
             else:
                 raise ValueError('Unrecognized compression type: %s' %
                                  self.compression)
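
The bodies of `_import_lzma` and `_get_lzma_file` are not part of the hunks shown here. The following is a minimal sketch of the pattern described in the v0.25.1 release note (warn at import time, raise `RuntimeError` only when `lzma` is actually used). The names `import_lzma_or_warn` and `get_lzma_file` and the message texts are assumptions made for illustration; the real helpers in `pandas.compat` may differ.

```python
import warnings


def import_lzma_or_warn():
    """Return the lzma module, or None (after warning) if it is missing.

    Hypothetical stand-in for the helper imported in the diff.
    """
    try:
        import lzma
        return lzma
    except ImportError:
        # Warn instead of failing, so importing the package still succeeds.
        warnings.warn(
            "Could not import the lzma module; your Python build may be "
            "incomplete. Attempting to use lzma compression will raise."
        )
        return None


def get_lzma_file(lzma_module):
    """Return the LZMAFile class, raising only when lzma is really needed."""
    if lzma_module is None:
        raise RuntimeError(
            "lzma module not available; re-installing Python with the "
            "required system libraries may be necessary."
        )
    return lzma_module.LZMAFile


# Safe at import time even on a Python build without lzma.
lzma = import_lzma_or_warn()
```

With this shape, a call such as `get_lzma_file(lzma)(path, 'rb')` behaves like `lzma.LZMAFile(path, 'rb')` on a complete installation and defers the failure to the point of use on an incomplete one.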
