Commit 9ced7d4

Merge branch 'master' of https://github.com/pandas-dev/pandas into boilerplate-4
2 parents 1682b92 + 5f74d8a commit 9ced7d4

101 files changed, +1615 -1357 lines changed

asv_bench/benchmarks/arithmetic.py

+1 -1
@@ -466,7 +466,7 @@ def setup(self, offset):
         self.rng = rng
 
     def time_apply_index(self, offset):
-        offset.apply_index(self.rng)
+        self.rng + offset
 
 
 class BinaryOpsMultiIndex:
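
For context, the benchmark change above replaces the `apply_index` method call with plain addition of an offset to the index. A minimal sketch of that operation follows; the date range and the `MonthEnd` offset are illustrative assumptions, not values taken from the benchmark setup.

```python
import pandas as pd

# Illustrative inputs (assumptions): three daily timestamps and a month-end offset.
rng = pd.date_range("2020-01-01", periods=3, freq="D")
offset = pd.offsets.MonthEnd()

# Adding the offset to the index applies it to every element at once.
shifted = rng + offset
print(shifted)
```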

asv_bench/benchmarks/io/json.py

+6
@@ -53,12 +53,18 @@ def time_read_json_lines(self, index):
     def time_read_json_lines_concat(self, index):
         concat(read_json(self.fname, orient="records", lines=True, chunksize=25000))
 
+    def time_read_json_lines_nrows(self, index):
+        read_json(self.fname, orient="records", lines=True, nrows=25000)
+
     def peakmem_read_json_lines(self, index):
         read_json(self.fname, orient="records", lines=True)
 
     def peakmem_read_json_lines_concat(self, index):
         concat(read_json(self.fname, orient="records", lines=True, chunksize=25000))
 
+    def peakmem_read_json_lines_nrows(self, index):
+        read_json(self.fname, orient="records", lines=True, nrows=15000)
+
 
 class ToJSON(BaseIO):
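
The new benchmarks above exercise the `nrows` argument of `read_json` on line-delimited input. As a rough sketch of what that call does, the snippet below reads only the first two records of a small in-memory payload; the payload and the variable names are hypothetical and stand in for the benchmark's `self.fname` file.

```python
import io

import pandas as pd

# Hypothetical line-delimited JSON payload: one record per line.
data = '{"a": 1, "b": 2}\n{"a": 3, "b": 4}\n{"a": 5, "b": 6}\n'

# nrows is only accepted together with lines=True; reading stops after N records.
first_two = pd.read_json(io.StringIO(data), orient="records", lines=True, nrows=2)
print(first_two)
```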

ci/deps/travis-36-locale.yaml

+1 -1
@@ -27,7 +27,7 @@ dependencies:
   - numexpr
   - numpy
   - openpyxl
-  - pandas-gbq=0.8.0
+  - pandas-gbq=0.12.0
   - psycopg2=2.6.2
   - pymysql=0.7.11
   - pytables

doc/source/development/contributing.rst

+27 -3
@@ -270,7 +270,7 @@ Creating a Python environment (pip)
 If you aren't using conda for your development environment, follow these instructions.
 You'll need to have at least Python 3.6.1 installed on your system.
 
-**Unix**/**Mac OS**
+**Unix**/**Mac OS with virtualenv**
 
 .. code-block:: bash
 
@@ -286,7 +286,31 @@ You'll need to have at least Python 3.6.1 installed on your system.
    python -m pip install -r requirements-dev.txt
 
    # Build and install pandas
-   python setup.py build_ext --inplace -j 0
+   python setup.py build_ext --inplace -j 4
+   python -m pip install -e . --no-build-isolation --no-use-pep517
+
+**Unix**/**Mac OS with pyenv**
+
+Consult the docs for setting up pyenv `here <https://github.com/pyenv/pyenv>`__.
+
+.. code-block:: bash
+
+   # Create a virtual environment
+   # Use an ENV_DIR of your choice. We'll use ~/Users/<yourname>/.pyenv/versions/pandas-dev
+
+   pyenv virtualenv <version> <name-to-give-it>
+
+   # For instance:
+   pyenv virtualenv 3.7.6 pandas-dev
+
+   # Activate the virtualenv
+   pyenv activate pandas-dev
+
+   # Now install the build dependencies in the cloned pandas repo
+   python -m pip install -r requirements-dev.txt
+
+   # Build and install pandas
+   python setup.py build_ext --inplace -j 4
    python -m pip install -e . --no-build-isolation --no-use-pep517
 
 **Windows**
@@ -312,7 +336,7 @@ should already exist.
    python -m pip install -r requirements-dev.txt
 
    # Build and install pandas
-   python setup.py build_ext --inplace -j 0
+   python setup.py build_ext --inplace -j 4
    python -m pip install -e . --no-build-isolation --no-use-pep517
 
 Creating a branch

doc/source/getting_started/comparison/comparison_with_sas.rst

+2 -2
@@ -115,7 +115,7 @@ Reading external data
 
 Like SAS, pandas provides utilities for reading in data from
 many formats. The ``tips`` dataset, found within the pandas
-tests (`csv <https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv>`_)
+tests (`csv <https://raw.github.com/pandas-dev/pandas/master/pandas/tests/io/data/csv/tips.csv>`_)
 will be used in many of the following examples.
 
 SAS provides ``PROC IMPORT`` to read csv data into a data set.
@@ -131,7 +131,7 @@ The pandas method is :func:`read_csv`, which works similarly.
 .. ipython:: python
 
    url = ('https://raw.github.com/pandas-dev/'
-          'pandas/master/pandas/tests/data/tips.csv')
+          'pandas/master/pandas/tests/io/data/csv/tips.csv')
    tips = pd.read_csv(url)
    tips.head()

doc/source/getting_started/comparison/comparison_with_sql.rst

+1 -1
@@ -25,7 +25,7 @@ structure.
 .. ipython:: python
 
    url = ('https://raw.github.com/pandas-dev'
-          '/pandas/master/pandas/tests/data/tips.csv')
+          '/pandas/master/pandas/tests/io/data/csv/tips.csv')
    tips = pd.read_csv(url)
    tips.head()

doc/source/getting_started/comparison/comparison_with_stata.rst

+2 -2
@@ -112,7 +112,7 @@ Reading external data
 
 Like Stata, pandas provides utilities for reading in data from
 many formats. The ``tips`` data set, found within the pandas
-tests (`csv <https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv>`_)
+tests (`csv <https://raw.github.com/pandas-dev/pandas/master/pandas/tests/io/data/csv/tips.csv>`_)
 will be used in many of the following examples.
 
 Stata provides ``import delimited`` to read csv data into a data set in memory.
@@ -128,7 +128,7 @@ the data set if presented with a url.
 .. ipython:: python
 
    url = ('https://raw.github.com/pandas-dev'
-          '/pandas/master/pandas/tests/data/tips.csv')
+          '/pandas/master/pandas/tests/io/data/csv/tips.csv')
    tips = pd.read_csv(url)
    tips.head()

doc/source/getting_started/install.rst

+1 -1
@@ -274,7 +274,7 @@ lxml 3.8.0 HTML parser for read_html (see :ref
 matplotlib                2.2.2             Visualization
 numba                     0.46.0            Alternative execution engine for rolling operations
 openpyxl                  2.5.7             Reading / writing for xlsx files
-pandas-gbq                0.8.0             Google Big Query access
+pandas-gbq                0.12.0            Google Big Query access
 psycopg2                                    PostgreSQL engine for sqlalchemy
 pyarrow                   0.12.0            Parquet, ORC (requires 0.13.0), and feather reading / writing
 pymysql                   0.7.11            MySQL engine for sqlalchemy

doc/source/reference/frame.rst

+9 -3
@@ -47,8 +47,6 @@ Conversion
    DataFrame.convert_dtypes
    DataFrame.infer_objects
    DataFrame.copy
-   DataFrame.isna
-   DataFrame.notna
    DataFrame.bool
 
 Indexing, iteration
@@ -211,10 +209,18 @@ Missing data handling
 .. autosummary::
    :toctree: api/
 
+   DataFrame.backfill
+   DataFrame.bfill
    DataFrame.dropna
+   DataFrame.ffill
    DataFrame.fillna
-   DataFrame.replace
    DataFrame.interpolate
+   DataFrame.isna
+   DataFrame.isnull
+   DataFrame.notna
+   DataFrame.notnull
+   DataFrame.pad
+   DataFrame.replace
 
 Reshaping, sorting, transposing
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

doc/source/reference/groupby.rst

+5
@@ -50,6 +50,7 @@ Computations / descriptive stats
    GroupBy.all
    GroupBy.any
    GroupBy.bfill
+   GroupBy.backfill
    GroupBy.count
    GroupBy.cumcount
    GroupBy.cummax
@@ -67,6 +68,7 @@ Computations / descriptive stats
    GroupBy.ngroup
    GroupBy.nth
    GroupBy.ohlc
+   GroupBy.pad
    GroupBy.prod
    GroupBy.rank
    GroupBy.pct_change
@@ -88,10 +90,12 @@ application to columns of a specific data type.
 
    DataFrameGroupBy.all
    DataFrameGroupBy.any
+   DataFrameGroupBy.backfill
    DataFrameGroupBy.bfill
    DataFrameGroupBy.corr
    DataFrameGroupBy.count
    DataFrameGroupBy.cov
+   DataFrameGroupBy.cumcount
    DataFrameGroupBy.cummax
    DataFrameGroupBy.cummin
    DataFrameGroupBy.cumprod
@@ -106,6 +110,7 @@ application to columns of a specific data type.
    DataFrameGroupBy.idxmin
    DataFrameGroupBy.mad
    DataFrameGroupBy.nunique
+   DataFrameGroupBy.pad
    DataFrameGroupBy.pct_change
    DataFrameGroupBy.plot
    DataFrameGroupBy.quantile

doc/source/reference/series.rst

+9 -2
@@ -214,11 +214,18 @@ Missing data handling
 .. autosummary::
    :toctree: api/
 
-   Series.isna
-   Series.notna
+   Series.backfill
+   Series.bfill
    Series.dropna
+   Series.ffill
    Series.fillna
    Series.interpolate
+   Series.isna
+   Series.isnull
+   Series.notna
+   Series.notnull
+   Series.pad
+   Series.replace
 
 Reshaping, sorting
 ------------------

doc/source/user_guide/gotchas.rst

+1 -1
@@ -321,7 +321,7 @@ Byte-ordering issues
 --------------------
 Occasionally you may have to deal with data that were created on a machine with
 a different byte order than the one on which you are running Python. A common
-symptom of this issue is an error like:::
+symptom of this issue is an error like::
 
     Traceback
         ...

doc/source/user_guide/indexing.rst

+20 -10
@@ -1866,29 +1866,39 @@ A chained assignment can also crop up in setting in a mixed dtype frame.
 
 These setting rules apply to all of ``.loc/.iloc``.
 
-This is the correct access method:
+The following is the recommended access method using ``.loc`` for multiple items (using ``mask``) and a single item using a fixed index:
 
 .. ipython:: python
 
-   dfc = pd.DataFrame({'A': ['aaa', 'bbb', 'ccc'], 'B': [1, 2, 3]})
-   dfc.loc[0, 'A'] = 11
-   dfc
+   dfc = pd.DataFrame({'a': ['one', 'one', 'two',
+                             'three', 'two', 'one', 'six'],
+                       'c': np.arange(7)})
+   dfd = dfc.copy()
+   # Setting multiple items using a mask
+   mask = dfd['a'].str.startswith('o')
+   dfd.loc[mask, 'c'] = 42
+   dfd
+
+   # Setting a single item
+   dfd = dfc.copy()
+   dfd.loc[2, 'a'] = 11
+   dfd
 
-This *can* work at times, but it is not guaranteed to, and therefore should be avoided:
+The following *can* work at times, but it is not guaranteed to, and therefore should be avoided:
 
 .. ipython:: python
    :okwarning:
 
-   dfc = dfc.copy()
-   dfc['A'][0] = 111
-   dfc
+   dfd = dfc.copy()
+   dfd['a'][2] = 111
+   dfd
 
-This will **not** work at all, and so should be avoided:
+Last, the subsequent example will **not** work at all, and so should be avoided:
 
 ::
 
     >>> pd.set_option('mode.chained_assignment','raise')
-    >>> dfc.loc[0]['A'] = 1111
+    >>> dfd.loc[0]['a'] = 1111
     Traceback (most recent call last)
          ...
     SettingWithCopyException:
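
The revised documentation above recommends a single `.loc` call for both masked and single-item assignment. The standalone sketch below mirrors the frame used in the diff and checks that the copy is modified while the original is left alone. The single-item value `'eleven'` is an assumption chosen here to keep the column's string type; the diff itself assigns the integer `11`.

```python
import numpy as np
import pandas as pd

dfc = pd.DataFrame({"a": ["one", "one", "two", "three", "two", "one", "six"],
                    "c": np.arange(7)})
dfd = dfc.copy()

# Setting multiple items: one .loc call with a boolean mask and a column label.
mask = dfd["a"].str.startswith("o")
dfd.loc[mask, "c"] = 42

# Setting a single item: one .loc call with a row label and a column label.
dfd.loc[2, "a"] = "eleven"  # assumed value, see the note above

print(dfd)
```

Because each assignment goes through one indexing operation, there is no intermediate object that might be a copy, which is the failure mode of the chained forms shown in the diff.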

doc/source/user_guide/visualization.rst

+2 -2
@@ -865,7 +865,7 @@ for more information. By coloring these curves differently for each class
 it is possible to visualize data clustering. Curves belonging to samples
 of the same class will usually be closer together and form larger structures.
 
-**Note**: The "Iris" dataset is available `here <https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/iris.csv>`__.
+**Note**: The "Iris" dataset is available `here <https://raw.github.com/pandas-dev/pandas/master/pandas/tests/io/data/csv/iris.csv>`__.
 
 .. ipython:: python
 
@@ -1025,7 +1025,7 @@ be colored differently.
 See the R package `Radviz <https://cran.r-project.org/package=Radviz/>`__
 for more information.
 
-**Note**: The "Iris" dataset is available `here <https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/iris.csv>`__.
+**Note**: The "Iris" dataset is available `here <https://raw.github.com/pandas-dev/pandas/master/pandas/tests/io/data/csv/iris.csv>`__.
 
 .. ipython:: python
