Commit 0e581ad

merge upstream/master
1 parent b12f658 commit 0e581ad

313 files changed: +8137 -7393 lines changed

.github/CONTRIBUTING.md

+5-5
@@ -8,16 +8,16 @@ Our main contributing guide can be found [in this repo](https://github.com/panda

 If you are looking to contribute to the *pandas* codebase, the best place to start is the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues). This is also a great place for filing bug reports and making suggestions for ways in which we can improve the code and documentation.

-If you have additional questions, feel free to ask them on the [mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata) or on [Gitter](https://gitter.im/pydata/pandas). Further information can also be found in the "[Where to start?](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#where-to-start)" section.
+If you have additional questions, feel free to ask them on the [mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata) or on [Gitter](https://gitter.im/pydata/pandas). Further information can also be found in the "[Where to start?](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#where-to-start)" section.

 ## Filing Issues

-If you notice a bug in the code or documentation, or have suggestions for how we can improve either, feel free to create an issue on the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues) using [GitHub's "issue" form](https://github.com/pandas-dev/pandas/issues/new). The form contains some questions that will help us best address your issue. For more information regarding how to file issues against *pandas*, please refer to the "[Bug reports and enhancement requests](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#bug-reports-and-enhancement-requests)" section.
+If you notice a bug in the code or documentation, or have suggestions for how we can improve either, feel free to create an issue on the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues) using [GitHub's "issue" form](https://github.com/pandas-dev/pandas/issues/new). The form contains some questions that will help us best address your issue. For more information regarding how to file issues against *pandas*, please refer to the "[Bug reports and enhancement requests](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#bug-reports-and-enhancement-requests)" section.

 ## Contributing to the Codebase

-The code is hosted on [GitHub](https://www.github.com/pandas-dev/pandas), so you will need to use [Git](http://git-scm.com/) to clone the project and make changes to the codebase. Once you have obtained a copy of the code, you should create a development environment that is separate from your existing Python environment so that you can make and test changes without compromising your own work environment. For more information, please refer to the "[Working with the code](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#working-with-the-code)" section.
+The code is hosted on [GitHub](https://www.github.com/pandas-dev/pandas), so you will need to use [Git](http://git-scm.com/) to clone the project and make changes to the codebase. Once you have obtained a copy of the code, you should create a development environment that is separate from your existing Python environment so that you can make and test changes without compromising your own work environment. For more information, please refer to the "[Working with the code](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#working-with-the-code)" section.

-Before submitting your changes for review, make sure to check that your changes do not break any tests. You can find more information about our test suites in the "[Test-driven development/code writing](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#test-driven-development-code-writing)" section. We also have guidelines regarding coding style that will be enforced during testing, which can be found in the "[Code standards](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#code-standards)" section.
+Before submitting your changes for review, make sure to check that your changes do not break any tests. You can find more information about our test suites in the "[Test-driven development/code writing](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#test-driven-development-code-writing)" section. We also have guidelines regarding coding style that will be enforced during testing, which can be found in the "[Code standards](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#code-standards)" section.

-Once your changes are ready to be submitted, make sure to push your changes to GitHub before creating a pull request. Details about how to do that can be found in the "[Contributing your changes to pandas](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#contributing-your-changes-to-pandas)" section. We will review your changes, and you will most likely be asked to make additional changes before it is finally ready to merge. However, once it's ready, we will merge it, and you will have successfully contributed to the codebase!
+Once your changes are ready to be submitted, make sure to push your changes to GitHub before creating a pull request. Details about how to do that can be found in the "[Contributing your changes to pandas](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#contributing-your-changes-to-pandas)" section. We will review your changes, and you will most likely be asked to make additional changes before it is finally ready to merge. However, once it's ready, we will merge it, and you will have successfully contributed to the codebase!

.gitignore

+2-2
@@ -101,14 +101,14 @@ asv_bench/pandas/
 # Documentation generated files #
 #################################
 doc/source/generated
-doc/source/api/generated
+doc/source/user_guide/styled.xlsx
+doc/source/reference/api
 doc/source/_static
 doc/source/vbench
 doc/source/vbench.rst
 doc/source/index.rst
 doc/build/html/index.html
 # Windows specific leftover:
 doc/tmp.sv
-doc/source/styled.xlsx
 env/
 doc/source/savefig/

Makefile

-1
@@ -23,4 +23,3 @@ doc:
 	cd doc; \
 	python make.py clean; \
 	python make.py html
-	python make.py spellcheck

asv_bench/benchmarks/__init__.py

+1
@@ -0,0 +1 @@
+"""Pandas benchmarks."""

asv_bench/benchmarks/algorithms.py

+1-2
@@ -5,7 +5,6 @@
 import pandas as pd
 from pandas.util import testing as tm

-
 for imp in ['pandas.util', 'pandas.tools.hashing']:
     try:
         hashing = import_module(imp)
@@ -142,4 +141,4 @@ def time_quantile(self, quantile, interpolation, dtype):
         self.idx.quantile(quantile, interpolation=interpolation)


-from .pandas_vb_common import setup # noqa: F401
+from .pandas_vb_common import setup # noqa: F401 isort:skip

asv_bench/benchmarks/categoricals.py

+13-6
@@ -223,12 +223,19 @@ class CategoricalSlicing(object):

     def setup(self, index):
         N = 10**6
-        values = list('a' * N + 'b' * N + 'c' * N)
-        indices = {
-            'monotonic_incr': pd.Categorical(values),
-            'monotonic_decr': pd.Categorical(reversed(values)),
-            'non_monotonic': pd.Categorical(list('abc' * N))}
-        self.data = indices[index]
+        categories = ['a', 'b', 'c']
+        values = [0] * N + [1] * N + [2] * N
+        if index == 'monotonic_incr':
+            self.data = pd.Categorical.from_codes(values,
+                                                  categories=categories)
+        elif index == 'monotonic_decr':
+            self.data = pd.Categorical.from_codes(list(reversed(values)),
+                                                  categories=categories)
+        elif index == 'non_monotonic':
+            self.data = pd.Categorical.from_codes([0, 1, 2] * N,
+                                                  categories=categories)
+        else:
+            raise ValueError('Invalid index param: {}'.format(index))

         self.scalar = 10000
         self.list = list(range(10000))
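
Note on the hunk above: the new setup passes integer codes plus a category list instead of building three million one-character strings. As a rough illustration of what those codes stand for, here is a plain-Python sketch. The helper `decode` is hypothetical and is not part of pandas, and `N` is shrunk so the output is readable.

```python
# Plain-Python sketch of the codes/categories representation.
# `decode` is a hypothetical helper for illustration, not a pandas function.
categories = ['a', 'b', 'c']
N = 2  # the benchmark in the diff uses 10**6


def decode(codes, categories):
    """Map each integer code to its category label."""
    return [categories[code] for code in codes]


monotonic_incr = [0] * N + [1] * N + [2] * N
monotonic_decr = list(reversed(monotonic_incr))
non_monotonic = [0, 1, 2] * N

print(decode(monotonic_incr, categories))  # ['a', 'a', 'b', 'b', 'c', 'c']
print(decode(monotonic_decr, categories))  # ['c', 'c', 'b', 'b', 'a', 'a']
print(decode(non_monotonic, categories))   # ['a', 'b', 'c', 'a', 'b', 'c']
```

The decoded values match what the removed lines produced with `list('a' * N + 'b' * N + 'c' * N)` and `list('abc' * N)`, so the benchmark data should be equivalent under this reading.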

asv_bench/benchmarks/ctors.py

+1-1
@@ -72,7 +72,7 @@ class SeriesDtypesConstructors(object):

     def setup(self):
         N = 10**4
-        self.arr = np.random.randn(N, N)
+        self.arr = np.random.randn(N)
         self.arr_str = np.array(['foo', 'bar', 'baz'], dtype=object)
         self.s = Series([Timestamp('20110101'), Timestamp('20120101'),
                          Timestamp('20130101')] * N * 10)
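
Note on the hunk above: the diff does not state why the array shape changed, but the memory footprint differs by a factor of `N`. A back-of-the-envelope estimate, assuming 8-byte float64 elements (the dtype `np.random.randn` returns):

```python
# Rough memory estimate for the two array shapes in the hunk.
# Assumes float64, i.e. 8 bytes per element.
N = 10**4
BYTES_PER_FLOAT64 = 8

old_bytes = N * N * BYTES_PER_FLOAT64  # shape (N, N)
new_bytes = N * BYTES_PER_FLOAT64      # shape (N,)

print(old_bytes)               # 800000000
print(new_bytes)               # 80000
print(old_bytes // new_bytes)  # 10000
```

That is roughly 800 MB for the old setup against 80 kB for the new one.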

asv_bench/benchmarks/index_object.py

+2-1
@@ -138,7 +138,8 @@ def setup(self, dtype):
         self.sorted = self.idx.sort_values()
         half = N // 2
         self.non_unique = self.idx[:half].append(self.idx[:half])
-        self.non_unique_sorted = self.sorted[:half].append(self.sorted[:half])
+        self.non_unique_sorted = (self.sorted[:half].append(self.sorted[:half])
+                                  .sort_values())
         self.key = self.sorted[N // 4]

     def time_boolean_array(self, dtype):
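
Note on the hunk above: appending a sorted half to itself does not yield a sorted result, which is presumably why `.sort_values()` was added. A plain-list analogy (the benchmark itself uses pandas `Index` objects; `is_sorted` is a helper written for this sketch):

```python
# Plain-list analogy for the non_unique_sorted fix.
def is_sorted(values):
    """Return True when every element is <= its successor."""
    return all(a <= b for a, b in zip(values, values[1:]))


sorted_values = [1, 2, 3, 4, 5, 6]
half = len(sorted_values) // 2

appended = sorted_values[:half] + sorted_values[:half]  # [1, 2, 3, 1, 2, 3]
resorted = sorted(appended)                              # [1, 1, 2, 2, 3, 3]

print(is_sorted(appended))  # False
print(is_sorted(resorted))  # True
```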

asv_bench/benchmarks/strings.py

+2-2
@@ -102,10 +102,10 @@ def setup(self, repeats):
         N = 10**5
         self.s = Series(tm.makeStringIndex(N))
         repeat = {'int': 1, 'array': np.random.randint(1, 3, N)}
-        self.repeat = repeat[repeats]
+        self.values = repeat[repeats]

     def time_repeat(self, repeats):
-        self.s.str.repeat(self.repeat)
+        self.s.str.repeat(self.values)


 class Cat(object):
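
Note on the hunk above: the diff gives no reason for renaming `self.repeat` to `self.values`. One plausible motive is that asv documents `repeat` as a timing-benchmark attribute, so storing benchmark data under that name can be read as a setting. The toy harness below is not asv; `ToyRunner` and both benchmark classes are hypothetical and only illustrate the kind of name collision the rename would avoid.

```python
# Toy harness, NOT asv itself: it only mimics a runner that looks up a
# `repeat` setting on the benchmark object.
class ToyRunner:
    def repeats_for(self, bench):
        # Fall back to 1 when the benchmark defines no `repeat` setting.
        return getattr(bench, 'repeat', 1)


class Colliding:
    def setup(self):
        self.repeat = [1, 2, 1]  # benchmark data stored under the setting's name


class Renamed:
    def setup(self):
        self.values = [1, 2, 1]  # same data under a neutral name


runner = ToyRunner()
colliding, renamed = Colliding(), Renamed()
colliding.setup()
renamed.setup()

print(runner.repeats_for(colliding))  # [1, 2, 1]  (data mistaken for a setting)
print(runner.repeats_for(renamed))    # 1
```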

azure-pipelines.yml

+1-1
@@ -104,7 +104,7 @@ jobs:
 if git diff upstream/master --name-only | grep -q "^asv_bench/"; then
     cd asv_bench
     asv machine --yes
-    ASV_OUTPUT="$(asv dev)"
+    ASV_OUTPUT="$(asv run --quick --show-stderr --python=same --launch-method=spawn)"
     if [[ $(echo "$ASV_OUTPUT" | grep "failed") ]]; then
         echo "##vso[task.logissue type=error]Benchmarks run with errors"
         echo "$ASV_OUTPUT"

ci/code_checks.sh

+7-6
@@ -93,7 +93,7 @@ if [[ -z "$CHECK" || "$CHECK" == "lint" ]]; then
     # this particular codebase (e.g. src/headers, src/klib, src/msgpack). However,
     # we can lint all header files since they aren't "generated" like C files are.
     MSG='Linting .c and .h' ; echo $MSG
-    cpplint --quiet --extensions=c,h --headers=h --recursive --filter=-readability/casting,-runtime/int,-build/include_subdir pandas/_libs/src/*.h pandas/_libs/src/parser pandas/_libs/ujson pandas/_libs/tslibs/src/datetime
+    cpplint --quiet --extensions=c,h --headers=h --recursive --filter=-readability/casting,-runtime/int,-build/include_subdir pandas/_libs/src/*.h pandas/_libs/src/parser pandas/_libs/ujson pandas/_libs/tslibs/src/datetime pandas/io/msgpack pandas/_libs/*.cpp pandas/util
     RET=$(($RET + $?)) ; echo $MSG "DONE"

     echo "isort --version-number"
@@ -174,9 +174,10 @@ if [[ -z "$CHECK" || "$CHECK" == "patterns" ]]; then
     MSG='Check that no file in the repo contains tailing whitespaces' ; echo $MSG
     set -o pipefail
     if [[ "$AZURE" == "true" ]]; then
-        ! grep -n --exclude="*.svg" -RI "\s$" * | awk -F ":" '{print "##vso[task.logissue type=error;sourcepath=" $1 ";linenumber=" $2 ";] Tailing whitespaces found: " $3}'
+        # we exclude all c/cpp files as the c/cpp files of pandas code base are tested when Linting .c and .h files
+        ! grep -n '--exclude=*.'{svg,c,cpp,html} -RI "\s$" * | awk -F ":" '{print "##vso[task.logissue type=error;sourcepath=" $1 ";linenumber=" $2 ";] Tailing whitespaces found: " $3}'
     else
-        ! grep -n --exclude="*.svg" -RI "\s$" * | awk -F ":" '{print $1 ":" $2 ":Tailing whitespaces found: " $3}'
+        ! grep -n '--exclude=*.'{svg,c,cpp,html} -RI "\s$" * | awk -F ":" '{print $1 ":" $2 ":Tailing whitespaces found: " $3}'
     fi
     RET=$(($RET + $?)) ; echo $MSG "DONE"
 fi
@@ -206,7 +207,7 @@ if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then

     MSG='Doctests frame.py' ; echo $MSG
     pytest -q --doctest-modules pandas/core/frame.py \
-        -k"-axes -combine -itertuples -join -pivot_table -query -reindex -reindex_axis -round"
+        -k" -itertuples -join -reindex -reindex_axis -round"
     RET=$(($RET + $?)) ; echo $MSG "DONE"

     MSG='Doctests series.py' ; echo $MSG
@@ -240,8 +241,8 @@ fi
 ### DOCSTRINGS ###
 if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then

-    MSG='Validate docstrings (GL06, GL07, GL09, SS04, PR03, PR05, EX04)' ; echo $MSG
-    $BASE_DIR/scripts/validate_docstrings.py --format=azure --errors=GL06,GL07,GL09,SS04,PR03,PR05,EX04
+    MSG='Validate docstrings (GL06, GL07, GL09, SS04, PR03, PR05, PR10, EX04, RT04, SS05, SA05)' ; echo $MSG
+    $BASE_DIR/scripts/validate_docstrings.py --format=azure --errors=GL06,GL07,GL09,SS04,PR03,PR05,EX04,RT04,SS05,SA05
     RET=$(($RET + $?)) ; echo $MSG "DONE"

 fi
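
Note on the trailing-whitespace check in this file: the new `'--exclude=*.'{svg,c,cpp,html}` form relies on bash brace expansion, which turns the single word into one `--exclude` option per extension. The Python sketch below spells out the expanded arguments and approximates the resulting file filter with `fnmatch`. It is an illustration only: grep's own matching rules are not reproduced exactly, and `is_excluded` is a helper written for this sketch.

```python
from fnmatch import fnmatch

# What bash produces from '--exclude=*.'{svg,c,cpp,html}
prefix = '--exclude=*.'
extensions = ['svg', 'c', 'cpp', 'html']
grep_args = [prefix + ext for ext in extensions]
print(grep_args)

# Approximate the --exclude filter by glob-matching the file name.
patterns = ['*.' + ext for ext in extensions]


def is_excluded(filename):
    """Return True when the file name matches any excluded pattern."""
    return any(fnmatch(filename, pattern) for pattern in patterns)


print(is_excluded('parser.c'))  # True
print(is_excluded('frame.py'))  # False
```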

doc/cheatsheet/Pandas_Cheat_Sheet.pdf

6.7 KB
Binary file not shown.
-261 Bytes
Binary file not shown.
210 KB
Binary file not shown.
5.73 KB
Binary file not shown.

doc/make.py

+80-6
@@ -15,15 +15,18 @@
 import sys
 import os
 import shutil
+import csv
 import subprocess
 import argparse
 import webbrowser
+import docutils
+import docutils.parsers.rst


 DOC_PATH = os.path.dirname(os.path.abspath(__file__))
 SOURCE_PATH = os.path.join(DOC_PATH, 'source')
 BUILD_PATH = os.path.join(DOC_PATH, 'build')
-BUILD_DIRS = ['doctrees', 'html', 'latex', 'plots', '_static', '_templates']
+REDIRECTS_FILE = os.path.join(DOC_PATH, 'redirects.csv')


 class DocBuilder:
@@ -50,7 +53,7 @@ def __init__(self, num_jobs=0, include_api=True, single_doc=None,
         if single_doc and single_doc.endswith('.rst'):
             self.single_doc_html = os.path.splitext(single_doc)[0] + '.html'
         elif single_doc:
-            self.single_doc_html = 'api/generated/pandas.{}.html'.format(
+            self.single_doc_html = 'reference/api/pandas.{}.html'.format(
                 single_doc)

     def _process_single_doc(self, single_doc):
@@ -60,7 +63,7 @@ def _process_single_doc(self, single_doc):

         For example, categorial.rst or pandas.DataFrame.head. For the latter,
         return the corresponding file path
-        (e.g. generated/pandas.DataFrame.head.rst).
+        (e.g. reference/api/pandas.DataFrame.head.rst).
         """
         base_name, extension = os.path.splitext(single_doc)
         if extension in ('.rst', '.ipynb'):
@@ -118,8 +121,6 @@ def _sphinx_build(self, kind):
             raise ValueError('kind must be html or latex, '
                              'not {}'.format(kind))

-        self.clean()
-
         cmd = ['sphinx-build', '-b', kind]
         if self.num_jobs:
             cmd += ['-j', str(self.num_jobs)]
@@ -139,6 +140,77 @@ def _open_browser(self, single_doc_html):
                            single_doc_html)
         webbrowser.open(url, new=2)

+    def _get_page_title(self, page):
+        """
+        Open the rst file `page` and extract its title.
+        """
+        fname = os.path.join(SOURCE_PATH, '{}.rst'.format(page))
+        option_parser = docutils.frontend.OptionParser(
+            components=(docutils.parsers.rst.Parser,))
+        doc = docutils.utils.new_document(
+            '<doc>',
+            option_parser.get_default_values())
+        with open(fname) as f:
+            data = f.read()
+
+        parser = docutils.parsers.rst.Parser()
+        # do not generate any warning when parsing the rst
+        with open(os.devnull, 'a') as f:
+            doc.reporter.stream = f
+            parser.parse(data, doc)
+
+        section = next(node for node in doc.children
+                       if isinstance(node, docutils.nodes.section))
+        title = next(node for node in section.children
+                     if isinstance(node, docutils.nodes.title))
+
+        return title.astext()
+
+    def _add_redirects(self):
+        """
+        Create in the build directory an html file with a redirect,
+        for every row in REDIRECTS_FILE.
+        """
+        html = '''
+        <html>
+            <head>
+                <meta http-equiv="refresh" content="0;URL={url}"/>
+            </head>
+            <body>
+                <p>
+                    The page has been moved to <a href="{url}">{title}</a>
+                </p>
+            </body>
+        <html>
+        '''
+        with open(REDIRECTS_FILE) as mapping_fd:
+            reader = csv.reader(mapping_fd)
+            for row in reader:
+                if not row or row[0].strip().startswith('#'):
+                    continue
+
+                path = os.path.join(BUILD_PATH,
+                                    'html',
+                                    *row[0].split('/')) + '.html'
+
+                try:
+                    title = self._get_page_title(row[1])
+                except Exception:
+                    # the file can be an ipynb and not an rst, or docutils
+                    # may not be able to read the rst because it has some
+                    # sphinx specific stuff
+                    title = 'this page'
+
+                if os.path.exists(path):
+                    raise RuntimeError((
+                        'Redirection would overwrite an existing file: '
+                        '{}').format(path))
+
+                with open(path, 'w') as moved_page_fd:
+                    moved_page_fd.write(
+                        html.format(url='{}.html'.format(row[1]),
+                                    title=title))
+
     def html(self):
         """
         Build HTML documentation.
@@ -150,6 +222,8 @@ def html(self):

         if self.single_doc_html is not None:
             self._open_browser(self.single_doc_html)
+        else:
+            self._add_redirects()
         return ret_code

     def latex(self, force=False):
@@ -184,7 +258,7 @@ def clean():
         Clean documentation generated files.
         """
         shutil.rmtree(BUILD_PATH, ignore_errors=True)
-        shutil.rmtree(os.path.join(SOURCE_PATH, 'api', 'generated'),
+        shutil.rmtree(os.path.join(SOURCE_PATH, 'reference', 'api'),
                       ignore_errors=True)

     def zip_html(self):
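
Note on `_add_redirects` in this file: the method reads `redirects.csv` and writes one HTML stub per row that forwards the old page to the new one. The standalone sketch below mirrors that flow under stated assumptions. The function `write_redirects`, the sample CSV rows, and the temporary build directory are all hypothetical. The docutils title lookup is replaced by the fallback text `'this page'`, and parent directories are created so the sketch runs on its own, which the code in the diff does not do.

```python
import csv
import io
import os
import tempfile

# Simplified template; the title lookup via docutils is omitted here.
HTML = (
    '<html><head>'
    '<meta http-equiv="refresh" content="0;URL={url}"/>'
    '</head><body>'
    '<p>The page has been moved to <a href="{url}">{title}</a></p>'
    '</body></html>'
)


def write_redirects(csv_file, html_dir):
    """Write one redirect stub per (old, new) row; return the paths written."""
    written = []
    for row in csv.reader(csv_file):
        # Skip blank lines and comment rows, as the code in the diff does.
        if not row or row[0].strip().startswith('#'):
            continue
        path = os.path.join(html_dir, *row[0].split('/')) + '.html'
        if os.path.exists(path):
            raise RuntimeError(
                'Redirection would overwrite an existing file: {}'.format(path))
        # Assumption: create parent directories so the sketch runs standalone.
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, 'w') as moved_page_fd:
            moved_page_fd.write(
                HTML.format(url='{}.html'.format(row[1]), title='this page'))
        written.append(path)
    return written


# Hypothetical redirect rows, for illustration only.
sample = io.StringIO('# old,new\nold_page,user_guide/new_page\n')
with tempfile.TemporaryDirectory() as tmp:
    paths = write_redirects(sample, tmp)
    with open(paths[0]) as f:
        content = f.read()
    names = [os.path.relpath(p, tmp) for p in paths]

print(names)
```

Raising when the target already exists, as the diff does, keeps a redirect row from silently overwriting a page that Sphinx just built.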
