Commit 4479e37

Merge branch 'master' into cleanup/matplotlib-style
2 parents 765836f + 54fa3da commit 4479e37

45 files changed: +754 -281 lines changed

.pre-commit-config.yaml

+2
@@ -71,3 +71,5 @@ repos:
     hooks:
     - id: end-of-file-fixer
       exclude: ^LICENSES/|\.(html|csv|txt|svg|py)$
+    - id: trailing-whitespace
+      exclude: \.(html|svg)$

asv_bench/benchmarks/groupby.py

+20
@@ -358,6 +358,26 @@ def time_category_size(self):
         self.draws.groupby(self.cats).size()


+class FillNA:
+    def setup(self):
+        N = 100
+        self.df = DataFrame(
+            {"group": [1] * N + [2] * N, "value": [np.nan, 1.0] * N}
+        ).set_index("group")
+
+    def time_df_ffill(self):
+        self.df.groupby("group").fillna(method="ffill")
+
+    def time_df_bfill(self):
+        self.df.groupby("group").fillna(method="bfill")
+
+    def time_srs_ffill(self):
+        self.df.groupby("group")["value"].fillna(method="ffill")
+
+    def time_srs_bfill(self):
+        self.df.groupby("group")["value"].fillna(method="bfill")
+
+
 class GroupByMethods:

     param_names = ["dtype", "method", "application"]
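
For orientation, the new `FillNA` benchmark times group-wise forward and backward filling. The sketch below runs the same kind of operation on a tiny frame. It is an illustration only, not part of the commit. It assumes a reasonably recent pandas and uses `GroupBy.ffill()` / `GroupBy.bfill()` instead of `fillna(method=...)`, since the `method` keyword has been deprecated in newer releases; `N = 3` is an arbitrary choice to keep the output short.

```python
import numpy as np
import pandas as pd

N = 3  # the benchmark uses N = 100; a small value keeps the result readable
df = pd.DataFrame(
    {"group": [1] * N + [2] * N, "value": [np.nan, 1.0] * N}
).set_index("group")

# Fill gaps within each group only; values are never carried across groups.
ffilled = df.groupby("group").ffill()
bfilled = df.groupby("group").bfill()

print(ffilled["value"].tolist())
print(bfilled["value"].tolist())
```

The leading `NaN` of group 1 survives the forward fill and its trailing `NaN` survives the backward fill, because there is no neighbouring value inside the same group to copy from.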

doc/source/development/contributing.rst

+3-3
@@ -206,7 +206,7 @@ You will need `Build Tools for Visual Studio 2017
 scrolling down to "All downloads" -> "Tools for Visual Studio 2019".
 In the installer, select the "C++ build tools" workload.

-**Mac OS**
+**macOS**

 Information about compiler installation can be found here:
 https://devguide.python.org/setup/#macos
@@ -299,7 +299,7 @@ Creating a Python environment (pip)
 If you aren't using conda for your development environment, follow these instructions.
 You'll need to have at least Python 3.6.1 installed on your system.

-**Unix**/**Mac OS with virtualenv**
+**Unix**/**macOS with virtualenv**

 .. code-block:: bash

@@ -318,7 +318,7 @@ You'll need to have at least Python 3.6.1 installed on your system.
    python setup.py build_ext --inplace -j 4
    python -m pip install -e . --no-build-isolation --no-use-pep517

-**Unix**/**Mac OS with pyenv**
+**Unix**/**macOS with pyenv**

 Consult the docs for setting up pyenv `here <https://github.com/pyenv/pyenv>`__.

doc/source/user_guide/io.rst

+1-1
@@ -23,7 +23,7 @@ The pandas I/O API is a set of top level ``reader`` functions accessed like
     text;`JSON <https://www.json.org/>`__;:ref:`read_json<io.json_reader>`;:ref:`to_json<io.json_writer>`
     text;`HTML <https://en.wikipedia.org/wiki/HTML>`__;:ref:`read_html<io.read_html>`;:ref:`to_html<io.html>`
     text; Local clipboard;:ref:`read_clipboard<io.clipboard>`;:ref:`to_clipboard<io.clipboard>`
-    ;`MS Excel <https://en.wikipedia.org/wiki/Microsoft_Excel>`__;:ref:`read_excel<io.excel_reader>`;:ref:`to_excel<io.excel_writer>`
+    binary;`MS Excel <https://en.wikipedia.org/wiki/Microsoft_Excel>`__;:ref:`read_excel<io.excel_reader>`;:ref:`to_excel<io.excel_writer>`
     binary;`OpenDocument <http://www.opendocumentformat.org>`__;:ref:`read_excel<io.ods>`;
     binary;`HDF5 Format <https://support.hdfgroup.org/HDF5/whatishdf5.html>`__;:ref:`read_hdf<io.hdf5>`;:ref:`to_hdf<io.hdf5>`
     binary;`Feather Format <https://github.com/wesm/feather>`__;:ref:`read_feather<io.feather>`;:ref:`to_feather<io.feather>`

doc/source/user_guide/text.rst

+4-4
@@ -302,10 +302,10 @@ positional argument (a regex object) and return a string.
        return m.group(0)[::-1]


-   pd.Series(
-       ["foo 123", "bar baz", np.nan],
-       dtype="string"
-   ).str.replace(pat, repl, regex=True)
+   pd.Series(["foo 123", "bar baz", np.nan], dtype="string").str.replace(
+       pat, repl, regex=True
+   )
+

    # Using regex groups
    pat = r"(?P<one>\w+) (?P<two>\w+) (?P<three>\w+)"
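
The reformatted call above passes a callable as the replacement. A self-contained sketch of that pattern follows. The hunk does not show how `pat` is defined, so the pattern `r"[a-z]+"` used here is an assumption; only `repl` and the `str.replace` call are taken from the diff.

```python
import numpy as np
import pandas as pd

pat = r"[a-z]+"  # assumed pattern: runs of lowercase letters


def repl(m):
    # m is a regex match object; return the matched text reversed
    return m.group(0)[::-1]


result = pd.Series(["foo 123", "bar baz", np.nan], dtype="string").str.replace(
    pat, repl, regex=True
)
print(result)
```

Each lowercase word is reversed in place, digits are left alone, and the missing value stays missing.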

doc/source/user_guide/visualization.rst

+28-28
@@ -64,7 +64,7 @@ On DataFrame, :meth:`~DataFrame.plot` is a convenience to plot all of the column

    plt.figure();
    @savefig frame_plot_basic.png
-   df.plot()
+   df.plot();

 You can plot one column versus another using the ``x`` and ``y`` keywords in
 :meth:`~DataFrame.plot`:
@@ -119,7 +119,7 @@ For example, a bar plot can be created the following way:
    plt.figure();

    @savefig bar_plot_ex.png
-   df.iloc[5].plot(kind="bar")
+   df.iloc[5].plot(kind="bar");

 You can also create these other plots using the methods ``DataFrame.plot.<kind>`` instead of providing the ``kind`` keyword argument. This makes it easier to discover plot methods and the specific arguments they use:

@@ -180,7 +180,7 @@ bar plot:
    df2 = pd.DataFrame(np.random.rand(10, 4), columns=["a", "b", "c", "d"])

    @savefig bar_plot_multi_ex.png
-   df2.plot.bar()
+   df2.plot.bar();

 To produce a stacked bar plot, pass ``stacked=True``:

@@ -193,7 +193,7 @@ To produce a stacked bar plot, pass ``stacked=True``:
 .. ipython:: python

    @savefig bar_plot_stacked_ex.png
-   df2.plot.bar(stacked=True)
+   df2.plot.bar(stacked=True);

 To get horizontal bar plots, use the ``barh`` method:

@@ -206,7 +206,7 @@ To get horizontal bar plots, use the ``barh`` method:
 .. ipython:: python

    @savefig barh_plot_stacked_ex.png
-   df2.plot.barh(stacked=True)
+   df2.plot.barh(stacked=True);

 .. _visualization.hist:

@@ -414,7 +414,7 @@ groupings. For instance,
    df = pd.DataFrame(np.random.rand(10, 2), columns=["Col1", "Col2"])
    df["X"] = pd.Series(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

-   plt.figure()
+   plt.figure();

    @savefig box_plot_ex2.png
    bp = df.boxplot(by="X")
@@ -518,7 +518,7 @@ When input data contains ``NaN``, it will be automatically filled by 0. If you w
    df = pd.DataFrame(np.random.rand(10, 4), columns=["a", "b", "c", "d"])

    @savefig area_plot_stacked.png
-   df.plot.area()
+   df.plot.area();

 To produce an unstacked plot, pass ``stacked=False``. Alpha value is set to 0.5 unless otherwise specified:

@@ -531,7 +531,7 @@ To produce an unstacked plot, pass ``stacked=False``. Alpha value is set to 0.5
 .. ipython:: python

    @savefig area_plot_unstacked.png
-   df.plot.area(stacked=False)
+   df.plot.area(stacked=False);

 .. _visualization.scatter:

@@ -554,7 +554,7 @@ These can be specified by the ``x`` and ``y`` keywords.
    df = pd.DataFrame(np.random.rand(50, 4), columns=["a", "b", "c", "d"])

    @savefig scatter_plot.png
-   df.plot.scatter(x="a", y="b")
+   df.plot.scatter(x="a", y="b");

 To plot multiple column groups in a single axes, repeat ``plot`` method specifying target ``ax``.
 It is recommended to specify ``color`` and ``label`` keywords to distinguish each groups.
@@ -563,7 +563,7 @@ It is recommended to specify ``color`` and ``label`` keywords to distinguish eac

    ax = df.plot.scatter(x="a", y="b", color="DarkBlue", label="Group 1")
    @savefig scatter_plot_repeated.png
-   df.plot.scatter(x="c", y="d", color="DarkGreen", label="Group 2", ax=ax)
+   df.plot.scatter(x="c", y="d", color="DarkGreen", label="Group 2", ax=ax);

 .. ipython:: python
    :suppress:
@@ -576,7 +576,7 @@ each point:
 .. ipython:: python

    @savefig scatter_plot_colored.png
-   df.plot.scatter(x="a", y="b", c="c", s=50)
+   df.plot.scatter(x="a", y="b", c="c", s=50);


 .. ipython:: python
@@ -591,7 +591,7 @@ bubble chart using a column of the ``DataFrame`` as the bubble size.
 .. ipython:: python

    @savefig scatter_plot_bubble.png
-   df.plot.scatter(x="a", y="b", s=df["c"] * 200)
+   df.plot.scatter(x="a", y="b", s=df["c"] * 200);

 .. ipython:: python
    :suppress:
@@ -837,7 +837,7 @@ You can create a scatter plot matrix using the
    df = pd.DataFrame(np.random.randn(1000, 4), columns=["a", "b", "c", "d"])

    @savefig scatter_matrix_kde.png
-   scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal="kde")
+   scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal="kde");

 .. ipython:: python
    :suppress:
@@ -1086,7 +1086,7 @@ layout and formatting of the returned plot:

    plt.figure();
    @savefig series_plot_basic2.png
-   ts.plot(style="k--", label="Series")
+   ts.plot(style="k--", label="Series");

 .. ipython:: python
    :suppress:
@@ -1144,7 +1144,7 @@ it empty for ylabel.
    df.plot();

    @savefig plot_xlabel_ylabel.png
-   df.plot(xlabel="new x", ylabel="new y")
+   df.plot(xlabel="new x", ylabel="new y");

 .. ipython:: python
    :suppress:
@@ -1320,7 +1320,7 @@ with the ``subplots`` keyword:
 .. ipython:: python

    @savefig frame_plot_subplots.png
-   df.plot(subplots=True, figsize=(6, 6))
+   df.plot(subplots=True, figsize=(6, 6));

 .. ipython:: python
    :suppress:
@@ -1343,7 +1343,7 @@ or columns needed, given the other.
 .. ipython:: python

    @savefig frame_plot_subplots_layout.png
-   df.plot(subplots=True, layout=(2, 3), figsize=(6, 6), sharex=False)
+   df.plot(subplots=True, layout=(2, 3), figsize=(6, 6), sharex=False);

 .. ipython:: python
    :suppress:
@@ -1354,7 +1354,7 @@ The above example is identical to using:

 .. ipython:: python

-   df.plot(subplots=True, layout=(2, -1), figsize=(6, 6), sharex=False)
+   df.plot(subplots=True, layout=(2, -1), figsize=(6, 6), sharex=False);

 .. ipython:: python
    :suppress:
@@ -1379,9 +1379,9 @@ otherwise you will see a warning.
    target1 = [axes[0][0], axes[1][1], axes[2][2], axes[3][3]]
    target2 = [axes[3][0], axes[2][1], axes[1][2], axes[0][3]]

-   df.plot(subplots=True, ax=target1, legend=False, sharex=False, sharey=False)
+   df.plot(subplots=True, ax=target1, legend=False, sharex=False, sharey=False);
    @savefig frame_plot_subplots_multi_ax.png
-   (-df).plot(subplots=True, ax=target2, legend=False, sharex=False, sharey=False)
+   (-df).plot(subplots=True, ax=target2, legend=False, sharex=False, sharey=False);

 .. ipython:: python
    :suppress:
@@ -1409,15 +1409,15 @@ Another option is passing an ``ax`` argument to :meth:`Series.plot` to plot on a

    fig, axes = plt.subplots(nrows=2, ncols=2)
    plt.subplots_adjust(wspace=0.2, hspace=0.5)
-   df["A"].plot(ax=axes[0, 0])
-   axes[0, 0].set_title("A")
-   df["B"].plot(ax=axes[0, 1])
-   axes[0, 1].set_title("B")
-   df["C"].plot(ax=axes[1, 0])
-   axes[1, 0].set_title("C")
-   df["D"].plot(ax=axes[1, 1])
+   df["A"].plot(ax=axes[0, 0]);
+   axes[0, 0].set_title("A");
+   df["B"].plot(ax=axes[0, 1]);
+   axes[0, 1].set_title("B");
+   df["C"].plot(ax=axes[1, 0]);
+   axes[1, 0].set_title("C");
+   df["D"].plot(ax=axes[1, 1]);
    @savefig series_plot_multi.png
-   axes[1, 1].set_title("D")
+   axes[1, 1].set_title("D");

 .. ipython:: python
    :suppress:

doc/source/whatsnew/v1.1.4.rst

+2
@@ -28,6 +28,8 @@ Fixed regressions
 Bug fixes
 ~~~~~~~~~
 - Bug causing ``groupby(...).sum()`` and similar to not preserve metadata (:issue:`29442`)
+- Bug in :meth:`Series.isin` and :meth:`DataFrame.isin` raising a ``ValueError`` when the target was read-only (:issue:`37174`)
+- Bug in :meth:`GroupBy.fillna` that introduced a performance regression after 1.0.5 (:issue:`36757`)

 .. ---------------------------------------------------------------------------

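
As a rough illustration of the `isin` entry above, the sketch below calls `Series.isin` with read-only NumPy buffers on both sides. This is an assumed reproduction scenario, not the reporter's original case from the issue; with the fix in place the call is expected to return a boolean mask instead of raising.

```python
import numpy as np
import pandas as pd

data = np.array([1, 4, 3])
data.setflags(write=False)  # read-only buffer backing the Series

values = np.array([1, 2, 3])
values.setflags(write=False)  # read-only lookup values

ser = pd.Series(data)
mask = ser.isin(values)
print(mask.tolist())
```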

doc/source/whatsnew/v1.2.0.rst

+33-2
@@ -96,6 +96,32 @@ For example:
    buffer = io.BytesIO()
    data.to_csv(buffer, mode="w+b", encoding="utf-8", compression="gzip")

+Support for short caption and table position in ``to_latex``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:meth:`DataFrame.to_latex` now allows one to specify
+a floating table position (:issue:`35281`)
+and a short caption (:issue:`36267`).
+
+New keyword ``position`` is implemented to set the position.
+
+.. ipython:: python
+
+   data = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
+   table = data.to_latex(position='ht')
+   print(table)
+
+Usage of keyword ``caption`` is extended.
+Besides taking a single string as an argument,
+one can optionally provide a tuple of ``(full_caption, short_caption)``
+to add a short caption macro.
+
+.. ipython:: python
+
+   data = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
+   table = data.to_latex(caption=('the full long caption', 'short caption'))
+   print(table)
+
 .. _whatsnew_120.read_csv_table_precision_default:

 Change in default floating precision for ``read_csv`` and ``read_table``
@@ -194,6 +220,7 @@ Other enhancements
 - Added :meth:`Rolling.sem()` and :meth:`Expanding.sem()` to compute the standard error of mean (:issue:`26476`).
 - :meth:`Rolling.var()` and :meth:`Rolling.std()` use Kahan summation and Welfords Method to avoid numerical issues (:issue:`37051`)
 - :meth:`DataFrame.plot` now recognizes ``xlabel`` and ``ylabel`` arguments for plots of type ``scatter`` and ``hexbin`` (:issue:`37001`)
+- :class:`DataFrame` now supports ``divmod`` operation (:issue:`37165`)

 .. _whatsnew_120.api_breaking.python:
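
A short sketch of the `divmod` enhancement listed above, assuming pandas 1.2 or later; the frame contents are made up for illustration. The built-in `divmod` returns a pair of frames, the floor quotient and the remainder.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# divmod(df, n) is expected to match (df // n, df % n) element-wise
quotient, remainder = divmod(df, 2)

print(quotient)
print(remainder)
```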

@@ -348,8 +375,7 @@ Datetimelike
 Timedelta
 ^^^^^^^^^
 - Bug in :class:`TimedeltaIndex`, :class:`Series`, and :class:`DataFrame` floor-division with ``timedelta64`` dtypes and ``NaT`` in the denominator (:issue:`35529`)
--
--
+- Bug in parsing of ISO 8601 durations in :class:`Timedelta`, :meth:`pd.to_datetime` (:issue:`37159`, fixes :issue:`29773` and :issue:`36204`)

 Timezones
 ^^^^^^^^^
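
For readers unfamiliar with the ISO 8601 duration strings mentioned in the new Timedelta entry, here is a minimal round-trip sketch. The specific inputs that triggered the bug are not shown in this diff, so the example sticks to the fully spelled-out form that `Timedelta.isoformat()` produces.

```python
import pandas as pd

td = pd.Timedelta(days=1, hours=2, minutes=3, seconds=4)

# isoformat() renders the duration as an ISO 8601 string ...
iso = td.isoformat()
print(iso)

# ... and the Timedelta constructor parses such a string back
parsed = pd.Timedelta(iso)
print(parsed == td)
```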
@@ -368,9 +394,11 @@ Numeric
 - Bug in :meth:`DataFrame.__rmatmul__` error handling reporting transposed shapes (:issue:`21581`)
 - Bug in :class:`Series` flex arithmetic methods where the result when operating with a ``list``, ``tuple`` or ``np.ndarray`` would have an incorrect name (:issue:`36760`)
 - Bug in :class:`IntegerArray` multiplication with ``timedelta`` and ``np.timedelta64`` objects (:issue:`36870`)
+- Bug in :class:`MultiIndex` comparison with tuple incorrectly treating tuple as array-like (:issue:`21517`)
 - Bug in :meth:`DataFrame.diff` with ``datetime64`` dtypes including ``NaT`` values failing to fill ``NaT`` results correctly (:issue:`32441`)
 - Bug in :class:`DataFrame` arithmetic ops incorrectly accepting keyword arguments (:issue:`36843`)
 - Bug in :class:`IntervalArray` comparisons with :class:`Series` not returning :class:`Series` (:issue:`36908`)
+- Bug in :class:`DataFrame` allowing arithmetic operations with list of array-likes with undefined results. Behavior changed to raising ``ValueError`` (:issue:`36702`)

 Conversion
 ^^^^^^^^^^
@@ -400,6 +428,8 @@ Indexing
 - Bug in :meth:`DataFrame.sort_index` where parameter ascending passed as a list on a single level index gives wrong result. (:issue:`32334`)
 - Bug in :meth:`DataFrame.reset_index` was incorrectly raising a ``ValueError`` for input with a :class:`MultiIndex` with missing values in a level with ``Categorical`` dtype (:issue:`24206`)
 - Bug in indexing with boolean masks on datetime-like values sometimes returning a view instead of a copy (:issue:`36210`)
+- Bug in :meth:`DataFrame.__getitem__` and :meth:`DataFrame.loc.__getitem__` with :class:`IntervalIndex` columns and a numeric indexer (:issue:`26490`)
+- Bug in :meth:`Series.loc.__getitem__` with a non-unique :class:`MultiIndex` and an empty-list indexer (:issue:`13691`)

 Missing
 ^^^^^^^
@@ -462,6 +492,7 @@ Groupby/resample/rolling
 - Bug in :meth:`RollingGroupby.count` where a ``ValueError`` was raised when specifying the ``closed`` parameter (:issue:`35869`)
 - Bug in :meth:`DataFrame.groupby.rolling` returning wrong values with partial centered window (:issue:`36040`).
 - Bug in :meth:`DataFrameGroupBy.rolling` returned wrong values with timeaware window containing ``NaN``. Raises ``ValueError`` because windows are not monotonic now (:issue:`34617`)
+- Bug in :meth:`Rolling.__iter__` where a ``ValueError`` was not raised when ``min_periods`` was larger than ``window`` (:issue:`37156`)

 Reshaping
 ^^^^^^^^^
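
The `Rolling.__iter__` entry above can be illustrated with a small sketch. It assumes a pandas version that includes the fix. Whether the error surfaces when the rolling object is built or only when it is iterated may depend on the version, so both steps sit inside the `try` block.

```python
import pandas as pd

ser = pd.Series(range(5))

try:
    # min_periods larger than window is invalid and should be rejected
    for window in ser.rolling(window=2, min_periods=3):
        pass
    raised = False
except ValueError as err:
    raised = True
    print(f"rejected: {err}")

# A valid configuration iterates over one window per row.
windows = [w.tolist() for w in ser.rolling(window=2)]
print(windows)
```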
