Commit 1c4da0e

Merge remote-tracking branch 'upstream/master' into pd.array
2 parents 3186ded + ab55d05 commit 1c4da0e

322 files changed

+15630
-23643
lines changed


.circleci/config.yml

-38
This file was deleted.

.github/CONTRIBUTING.md

+13-14
Original file line number | Diff line number | Diff line change
@@ -1,24 +1,23 @@
1-
Contributing to pandas
2-
======================
1+
# Contributing to pandas
32

43
Whether you are a novice or experienced software developer, all contributions and suggestions are welcome!
54

6-
Our main contribution docs can be found [here](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst), but if you do not want to read it in its entirety, we will summarize the main ways in which you can contribute and point to relevant places in the docs for further information.
5+
Our main contributing guide can be found [in this repo](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst) or [on the website](https://pandas-docs.github.io/pandas-docs-travis/contributing.html). If you do not want to read it in its entirety, we will summarize the main ways in which you can contribute and point to relevant sections of that document for further information.
6+
7+
## Getting Started
78

8-
Getting Started
9-
---------------
109
If you are looking to contribute to the *pandas* codebase, the best place to start is the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues). This is also a great place for filing bug reports and making suggestions for ways in which we can improve the code and documentation.
1110

12-
If you have additional questions, feel free to ask them on the [mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata) or on [Gitter](https://gitter.im/pydata/pandas). Further information can also be found in our [Getting Started](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#where-to-start) section of our main contribution doc.
11+
If you have additional questions, feel free to ask them on the [mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata) or on [Gitter](https://gitter.im/pydata/pandas). Further information can also be found in the "[Where to start?](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#where-to-start)" section.
12+
13+
## Filing Issues
14+
15+
If you notice a bug in the code or documentation, or have suggestions for how we can improve either, feel free to create an issue on the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues) using [GitHub's "issue" form](https://github.com/pandas-dev/pandas/issues/new). The form contains some questions that will help us best address your issue. For more information regarding how to file issues against *pandas*, please refer to the "[Bug reports and enhancement requests](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#bug-reports-and-enhancement-requests)" section.
1316

14-
Filing Issues
15-
-------------
16-
If you notice a bug in the code or in docs or have suggestions for how we can improve either, feel free to create an issue on the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues) using [GitHub's "issue" form](https://github.com/pandas-dev/pandas/issues/new). The form contains some questions that will help us best address your issue. For more information regarding how to file issues against *pandas*, please refer to the [Bug reports and enhancement requests](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#bug-reports-and-enhancement-requests) section of our main contribution doc.
17+
## Contributing to the Codebase
1718

18-
Contributing to the Codebase
19-
----------------------------
20-
The code is hosted on [GitHub](https://www.github.com/pandas-dev/pandas), so you will need to use [Git](http://git-scm.com/) to clone the project and make changes to the codebase. Once you have obtained a copy of the code, you should create a development environment that is separate from your existing Python environment so that you can make and test changes without compromising your own work environment. For more information, please refer to our [Working with the code](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#working-with-the-code) section of our main contribution docs.
19+
The code is hosted on [GitHub](https://www.github.com/pandas-dev/pandas), so you will need to use [Git](http://git-scm.com/) to clone the project and make changes to the codebase. Once you have obtained a copy of the code, you should create a development environment that is separate from your existing Python environment so that you can make and test changes without compromising your own work environment. For more information, please refer to the "[Working with the code](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#working-with-the-code)" section.
2120

22-
Before submitting your changes for review, make sure to check that your changes do not break any tests. You can find more information about our test suites can be found [here](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#test-driven-development-code-writing). We also have guidelines regarding coding style that will be enforced during testing. Details about coding style can be found [here](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#code-standards).
21+
Before submitting your changes for review, make sure to check that your changes do not break any tests. You can find more information about our test suites in the "[Test-driven development/code writing](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#test-driven-development-code-writing)" section. We also have guidelines regarding coding style that will be enforced during testing, which can be found in the "[Code standards](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#code-standards)" section.
2322

24-
Once your changes are ready to be submitted, make sure to push your changes to GitHub before creating a pull request. Details about how to do that can be found in the [Contributing your changes to pandas](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#contributing-your-changes-to-pandas) section of our main contribution docs. We will review your changes, and you will most likely be asked to make additional changes before it is finally ready to merge. However, once it's ready, we will merge it, and you will have successfully contributed to the codebase!
23+
Once your changes are ready to be submitted, make sure to push your changes to GitHub before creating a pull request. Details about how to do that can be found in the "[Contributing your changes to pandas](https://github.com/pandas-dev/pandas/blob/master/doc/source/contributing.rst#contributing-your-changes-to-pandas)" section. We will review your changes, and you will most likely be asked to make additional changes before it is finally ready to merge. However, once it's ready, we will merge it, and you will have successfully contributed to the codebase!

.travis.yml

+11-14
@@ -36,30 +36,21 @@ matrix:
3636
env:
3737
- JOB="3.7" ENV_FILE="ci/deps/travis-37.yaml" PATTERN="not slow and not network"
3838

39-
- dist: trusty
40-
env:
41-
- JOB="2.7, locale, slow, old NumPy" ENV_FILE="ci/deps/travis-27-locale.yaml" LOCALE_OVERRIDE="zh_CN.UTF-8" PATTERN="slow"
42-
addons:
43-
apt:
44-
packages:
45-
- language-pack-zh-hans
4639
- dist: trusty
4740
env:
4841
- JOB="2.7" ENV_FILE="ci/deps/travis-27.yaml" PATTERN="not slow"
4942
addons:
5043
apt:
5144
packages:
5245
- python-gtk2
46+
5347
- dist: trusty
5448
env:
55-
- JOB="3.6, coverage" ENV_FILE="ci/deps/travis-36.yaml" PATTERN="not slow and not network" PANDAS_TESTING_MODE="deprecate" COVERAGE=true
49+
- JOB="3.6, locale" ENV_FILE="ci/deps/travis-36-locale.yaml" PATTERN="not slow and not network" LOCALE_OVERRIDE="zh_CN.UTF-8"
50+
5651
- dist: trusty
5752
env:
58-
- JOB="3.7, NumPy dev" ENV_FILE="ci/deps/travis-37-numpydev.yaml" PATTERN="not slow and not network" TEST_ARGS="-W error" PANDAS_TESTING_MODE="deprecate"
59-
addons:
60-
apt:
61-
packages:
62-
- xsel
53+
- JOB="3.6, coverage" ENV_FILE="ci/deps/travis-36.yaml" PATTERN="not slow and not network" PANDAS_TESTING_MODE="deprecate" COVERAGE=true
6354

6455
# In allow_failures
6556
- dist: trusty
@@ -90,6 +81,12 @@ before_install:
9081
- uname -a
9182
- git --version
9283
- git tag
84+
# Because travis runs on Google Cloud and has a /etc/boto.cfg,
85+
# it breaks moto import, see:
86+
# https://github.com/spulec/moto/issues/1771
87+
# https://github.com/boto/boto/issues/3741
88+
# This overrides travis and tells it to look nowhere.
89+
- export BOTO_CONFIG=/dev/null
9390

9491
install:
9592
- echo "install start"
@@ -106,7 +103,7 @@ before_script:
106103
script:
107104
- echo "script start"
108105
- source activate pandas-dev
109-
- ci/run_build_docs.sh
106+
- ci/build_docs.sh
110107
- ci/run_tests.sh
111108

112109
after_script:

README.md

+2-10
@@ -45,14 +45,6 @@
4545
</a>
4646
</td>
4747
</tr>
48-
<tr>
49-
<td></td>
50-
<td>
51-
<a href="https://circleci.com/gh/pandas-dev/pandas">
52-
<img src="https://circleci.com/gh/circleci/mongofinil/tree/master.svg?style=shield&circle-token=223d8cafa7b02902c3e150242520af8944e34671" alt="circleci build status" />
53-
</a>
54-
</td>
55-
</tr>
5648
<tr>
5749
<td></td>
5850
<td>
@@ -231,9 +223,9 @@ Most development discussion is taking place on github in this repo. Further, the
231223

232224
All contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome.
233225

234-
A detailed overview on how to contribute can be found in the **[contributing guide.](https://pandas.pydata.org/pandas-docs/stable/contributing.html)**
226+
A detailed overview on how to contribute can be found in the **[contributing guide](https://pandas-docs.github.io/pandas-docs-travis/contributing.html)**. There is also an [overview](.github/CONTRIBUTING.md) on GitHub.
235227

236-
If you are simply looking to start working with the pandas codebase, navigate to the [GitHub issues tab](https://github.com/pandas-dev/pandas/issues) and start looking through interesting issues. There are a number of issues listed under [Docs](https://github.com/pandas-dev/pandas/issues?labels=Docs&sort=updated&state=open) and [good first issue](https://github.com/pandas-dev/pandas/issues?labels=good+first+issue&sort=updated&state=open) where you could start out.
228+
If you are simply looking to start working with the pandas codebase, navigate to the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues) and start looking through interesting issues. There are a number of issues listed under [Docs](https://github.com/pandas-dev/pandas/issues?labels=Docs&sort=updated&state=open) and [good first issue](https://github.com/pandas-dev/pandas/issues?labels=good+first+issue&sort=updated&state=open) where you could start out.
237229

238230
You can also triage issues which may include reproducing bug reports, or asking for vital information such as version numbers or reproduction instructions. If you would like to start triaging issues, one easy way to get started is to [subscribe to pandas on CodeTriage](https://www.codetriage.com/pandas-dev/pandas).
239231

asv_bench/benchmarks/frame_ctor.py

+13
@@ -91,4 +91,17 @@ def time_frame_from_ndarray(self):
9191
self.df = DataFrame(self.data)
9292

9393

94+
class FromLists(object):
95+
96+
goal_time = 0.2
97+
98+
def setup(self):
99+
N = 1000
100+
M = 100
101+
self.data = [[j for j in range(M)] for i in range(N)]
102+
103+
def time_frame_from_lists(self):
104+
self.df = DataFrame(self.data)
105+
106+
94107
from .pandas_vb_common import setup # noqa: F401
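
For readers who want to try the new `FromLists` case outside of asv, here is a minimal sketch. It reuses the sizes from `setup` above; the use of `timeit` and the repeat count are assumptions for illustration, not part of the benchmark suite.

```python
import timeit

from pandas import DataFrame

# Same dimensions as FromLists.setup in the diff above.
N = 1000
M = 100
data = [[j for j in range(M)] for i in range(N)]

# The operation the benchmark times: building a frame from a list of lists.
df = DataFrame(data)
print(df.shape)

# Rough timing with the standard library (asv does its own repeat handling).
elapsed = timeit.timeit(lambda: DataFrame(data), number=5)
print("5 constructions took %.4f s" % elapsed)
```

The absolute timing depends on the machine and pandas version, so only the relative change between commits is meaningful.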

asv_bench/benchmarks/frame_methods.py

+62-1
@@ -103,6 +103,7 @@ def setup(self):
103103
self.df2 = DataFrame(np.random.randn(N * 50, 10))
104104
self.df3 = DataFrame(np.random.randn(N, 5 * N),
105105
columns=['C' + str(c) for c in range(N * 5)])
106+
self.df4 = DataFrame(np.random.randn(N * 1000, 10))
106107

107108
def time_iteritems(self):
108109
# (monitor no-copying behaviour)
@@ -119,10 +120,70 @@ def time_iteritems_indexing(self):
119120
for col in self.df3:
120121
self.df3[col]
121122

123+
def time_itertuples_start(self):
124+
self.df4.itertuples()
125+
126+
def time_itertuples_read_first(self):
127+
next(self.df4.itertuples())
128+
122129
def time_itertuples(self):
123-
for row in self.df2.itertuples():
130+
for row in self.df4.itertuples():
131+
pass
132+
133+
def time_itertuples_to_list(self):
134+
list(self.df4.itertuples())
135+
136+
def mem_itertuples_start(self):
137+
return self.df4.itertuples()
138+
139+
def peakmem_itertuples_start(self):
140+
self.df4.itertuples()
141+
142+
def mem_itertuples_read_first(self):
143+
return next(self.df4.itertuples())
144+
145+
def peakmem_itertuples(self):
146+
for row in self.df4.itertuples():
147+
pass
148+
149+
def mem_itertuples_to_list(self):
150+
return list(self.df4.itertuples())
151+
152+
def peakmem_itertuples_to_list(self):
153+
list(self.df4.itertuples())
154+
155+
def time_itertuples_raw_start(self):
156+
self.df4.itertuples(index=False, name=None)
157+
158+
def time_itertuples_raw_read_first(self):
159+
next(self.df4.itertuples(index=False, name=None))
160+
161+
def time_itertuples_raw_tuples(self):
162+
for row in self.df4.itertuples(index=False, name=None):
124163
pass
125164

165+
def time_itertuples_raw_tuples_to_list(self):
166+
list(self.df4.itertuples(index=False, name=None))
167+
168+
def mem_itertuples_raw_start(self):
169+
return self.df4.itertuples(index=False, name=None)
170+
171+
def peakmem_itertuples_raw_start(self):
172+
self.df4.itertuples(index=False, name=None)
173+
174+
def peakmem_itertuples_raw_read_first(self):
175+
next(self.df4.itertuples(index=False, name=None))
176+
177+
def peakmem_itertuples_raw(self):
178+
for row in self.df4.itertuples(index=False, name=None):
179+
pass
180+
181+
def mem_itertuples_raw_to_list(self):
182+
return list(self.df4.itertuples(index=False, name=None))
183+
184+
def peakmem_itertuples_raw_to_list(self):
185+
list(self.df4.itertuples(index=False, name=None))
186+
126187
def time_iterrows(self):
127188
for row in self.df.iterrows():
128189
pass
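
The new `itertuples` benchmarks separate creating the iterator, reading the first row, iterating fully, and materializing a list, and they do so for both the default and the "raw" (`index=False, name=None`) forms. The sketch below shows what each form yields; the small frame and its column names are assumptions chosen for readability.

```python
import numpy as np
from pandas import DataFrame

# Small stand-in for self.df4; the column names are illustrative only.
df = DataFrame(np.random.randn(5, 3), columns=["a", "b", "c"])

# Default form: each row is a namedtuple with the index as its first field.
first = next(df.itertuples())
print(first._fields)

# Raw form: plain tuples without the index, as in the *_raw_* benchmarks.
raw_first = next(df.itertuples(index=False, name=None))
print(len(raw_first))

# Materializing every row, as in the *_to_list benchmarks.
rows = list(df.itertuples(index=False, name=None))
print(len(rows))
```

Timing the raw form separately is useful because it skips namedtuple construction, which is one plausible source of per-row overhead.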

asv_bench/benchmarks/join_merge.py

+1-1
@@ -50,7 +50,7 @@ def setup(self, axis):
5050
self.empty_right = [df, DataFrame()]
5151

5252
def time_concat_series(self, axis):
53-
concat(self.series, axis=axis)
53+
concat(self.series, axis=axis, sort=False)
5454

5555
def time_concat_small_frames(self, axis):
5656
concat(self.small_frames, axis=axis)
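
The change above passes `sort=False` explicitly to `concat`. As a rough illustration of what that argument controls, the sketch below concatenates two frames whose columns are not aligned; the frames and their column names are assumptions made for the example, not data from the benchmark.

```python
from pandas import DataFrame, concat

# Two frames with different, unsorted columns (illustrative data).
df1 = DataFrame({"b": [1], "a": [2]})
df2 = DataFrame({"a": [3], "c": [4]})

# sort=False keeps the columns in order of first appearance.
unsorted_cols = list(concat([df1, df2], sort=False).columns)
print(unsorted_cols)

# sort=True orders the non-concatenation axis lexically.
sorted_cols = list(concat([df1, df2], sort=True).columns)
print(sorted_cols)
```

Passing the argument explicitly keeps the benchmark's behaviour independent of whatever default a given pandas version uses.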

asv_bench/benchmarks/panel_ctor.py

+6-6
@@ -1,7 +1,7 @@
11
import warnings
22
from datetime import datetime, timedelta
33

4-
from pandas import DataFrame, Panel, DatetimeIndex, date_range
4+
from pandas import DataFrame, Panel, date_range
55

66

77
class DifferentIndexes(object):
@@ -23,9 +23,9 @@ def time_from_dict(self):
2323
class SameIndexes(object):
2424

2525
def setup(self):
26-
idx = DatetimeIndex(start=datetime(1990, 1, 1),
27-
end=datetime(2012, 1, 1),
28-
freq='D')
26+
idx = date_range(start=datetime(1990, 1, 1),
27+
end=datetime(2012, 1, 1),
28+
freq='D')
2929
df = DataFrame({'a': 0, 'b': 1, 'c': 2}, index=idx)
3030
self.data_frames = dict(enumerate([df] * 100))
3131

@@ -40,10 +40,10 @@ def setup(self):
4040
start = datetime(1990, 1, 1)
4141
end = datetime(2012, 1, 1)
4242
df1 = DataFrame({'a': 0, 'b': 1, 'c': 2},
43-
index=DatetimeIndex(start=start, end=end, freq='D'))
43+
index=date_range(start=start, end=end, freq='D'))
4444
end += timedelta(days=1)
4545
df2 = DataFrame({'a': 0, 'b': 1, 'c': 2},
46-
index=DatetimeIndex(start=start, end=end, freq='D'))
46+
index=date_range(start=start, end=end, freq='D'))
4747
dfs = [df1] * 50 + [df2] * 50
4848
self.data_frames = dict(enumerate(dfs))
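
Several benchmarks in this commit switch from calling the `DatetimeIndex` constructor with `start`/`end`/`freq` to calling `date_range`. A minimal sketch of the replacement call, using the same endpoints as `SameIndexes.setup` above, shows that the result is still a `DatetimeIndex`:

```python
from datetime import datetime

from pandas import DatetimeIndex, date_range

start = datetime(1990, 1, 1)
end = datetime(2012, 1, 1)

# date_range builds the daily index that the benchmarks previously
# requested through DatetimeIndex(start=..., end=..., freq='D').
idx = date_range(start=start, end=end, freq="D")

print(type(idx).__name__)
print(idx[0], idx[-1])
```

Because both endpoints are included, the index has one more entry than the number of days between `start` and `end`.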
4949

asv_bench/benchmarks/period.py

+3-3
@@ -1,5 +1,5 @@
1-
from pandas import (DataFrame, Series, Period, PeriodIndex, date_range,
2-
period_range)
1+
from pandas import (
2+
DataFrame, Period, PeriodIndex, Series, date_range, period_range)
33

44

55
class PeriodProperties(object):
@@ -94,7 +94,7 @@ def time_value_counts(self, typ):
9494
class Indexing(object):
9595

9696
def setup(self):
97-
self.index = PeriodIndex(start='1985', periods=1000, freq='D')
97+
self.index = period_range(start='1985', periods=1000, freq='D')
9898
self.series = Series(range(1000), index=self.index)
9999
self.period = self.index[500]
100100

asv_bench/benchmarks/reindex.py

+2-2
@@ -1,14 +1,14 @@
11
import numpy as np
22
import pandas.util.testing as tm
3-
from pandas import (DataFrame, Series, DatetimeIndex, MultiIndex, Index,
3+
from pandas import (DataFrame, Series, MultiIndex, Index,
44
date_range)
55
from .pandas_vb_common import lib
66

77

88
class Reindex(object):
99

1010
def setup(self):
11-
rng = DatetimeIndex(start='1/1/1970', periods=10000, freq='1min')
11+
rng = date_range(start='1/1/1970', periods=10000, freq='1min')
1212
self.df = DataFrame(np.random.rand(10000, 10), index=rng,
1313
columns=range(10))
1414
self.df['foo'] = 'bar'

asv_bench/benchmarks/timedelta.py

+5-4
@@ -1,8 +1,9 @@
11
import datetime
22

33
import numpy as np
4-
from pandas import Series, timedelta_range, to_timedelta, Timestamp, \
5-
Timedelta, TimedeltaIndex, DataFrame
4+
5+
from pandas import (
6+
DataFrame, Series, Timedelta, Timestamp, timedelta_range, to_timedelta)
67

78

89
class TimedeltaConstructor(object):
@@ -122,8 +123,8 @@ def time_timedelta_nanoseconds(self, series):
122123
class TimedeltaIndexing(object):
123124

124125
def setup(self):
125-
self.index = TimedeltaIndex(start='1985', periods=1000, freq='D')
126-
self.index2 = TimedeltaIndex(start='1986', periods=1000, freq='D')
126+
self.index = timedelta_range(start='1985', periods=1000, freq='D')
127+
self.index2 = timedelta_range(start='1986', periods=1000, freq='D')
127128
self.series = Series(range(1000), index=self.index)
128129
self.timedelta = self.index[500]
129130

asv_bench/benchmarks/timestamp.py

+4-3
@@ -1,8 +1,9 @@
11
import datetime
22

3-
from pandas import Timestamp
4-
import pytz
53
import dateutil
4+
import pytz
5+
6+
from pandas import Timestamp
67

78

89
class TimestampConstruction(object):
@@ -46,7 +47,7 @@ def time_dayofweek(self, tz, freq):
4647
self.ts.dayofweek
4748

4849
def time_weekday_name(self, tz, freq):
49-
self.ts.weekday_name
50+
self.ts.day_name
5051

5152
def time_dayofyear(self, tz, freq):
5253
self.ts.dayofyear
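
One detail worth noting about the `weekday_name` to `day_name` change: in pandas, `Timestamp.day_name` is a method rather than a property, so the name is only computed when it is called. The sketch below illustrates the difference; the date is an arbitrary assumption.

```python
from pandas import Timestamp

ts = Timestamp("2017-08-25")

# Calling the method computes the day name.
name = ts.day_name()
print(name)

# Bare attribute access only retrieves the bound method object.
attr = ts.day_name
print(callable(attr))
```

If the intent of `time_weekday_name` is to time the name computation itself, the call form `self.ts.day_name()` would be the one to measure.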

azure-pipelines.yml

+1
@@ -43,6 +43,7 @@ jobs:
4343
ci/incremental/install_miniconda.sh
4444
ci/incremental/setup_conda_environment.sh
4545
displayName: 'Set up environment'
46+
condition: true
4647
4748
# Do not require pandas
4849
- script: |
