BUG: Patch Checked Add Method #14324

Closed
wants to merge 12 commits
6 changes: 3 additions & 3 deletions .github/CONTRIBUTING.md
Original file line number Diff line number Diff line change
@@ -278,7 +278,7 @@ Please try to maintain backward compatibility. *pandas* has lots of users with l

Adding tests is one of the most common requests after code is pushed to *pandas*. Therefore, it is worth getting in the habit of writing tests ahead of time so this is never an issue.

Like many packages, *pandas* uses the [Nose testing system](http://nose.readthedocs.org/en/latest/index.html) and the convenient extensions in [numpy.testing](http://docs.scipy.org/doc/numpy/reference/routines.testing.html).
Like many packages, *pandas* uses the [Nose testing system](https://nose.readthedocs.io/en/latest/index.html) and the convenient extensions in [numpy.testing](http://docs.scipy.org/doc/numpy/reference/routines.testing.html).

#### Writing tests

@@ -323,7 +323,7 @@ Performance matters and it is worth considering whether your code has introduced
>
> The asv benchmark suite was translated from the previous framework, vbench, so many stylistic issues are likely a result of automated transformation of the code.

To use asv you will need either `conda` or `virtualenv`. For more details please check the [asv installation webpage](http://asv.readthedocs.org/en/latest/installing.html).
To use asv you will need either `conda` or `virtualenv`. For more details please check the [asv installation webpage](https://asv.readthedocs.io/en/latest/installing.html).

To install asv:

@@ -360,7 +360,7 @@ This command is equivalent to:

This will launch every test only once, display stderr from the benchmarks, and use your local `python` that comes from your `$PATH`.

Information on how to write a benchmark can be found in the [asv documentation](http://asv.readthedocs.org/en/latest/writing_benchmarks.html).
Information on how to write a benchmark can be found in the [asv documentation](https://asv.readthedocs.io/en/latest/writing_benchmarks.html).

#### Running the vbench performance test suite (phasing out)

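The asv conventions this file describes (a `setup` method, `time_`-prefixed benchmark methods, and a `goal_time` attribute) can be sketched as follows; the class and array names here are illustrative, not part of the pandas suite:

```python
import numpy as np


class ExampleCumsum(object):
    # asv runs setup() before timing and times every method whose name
    # starts with "time_"; goal_time is the target duration in seconds
    # for one benchmark measurement.
    goal_time = 0.2

    def setup(self):
        self.arr = np.arange(1000000)

    def time_cumsum(self):
        self.arr.cumsum()
```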
26 changes: 26 additions & 0 deletions asv_bench/benchmarks/algorithms.py
@@ -15,6 +15,14 @@ def setup(self):
self.int = pd.Int64Index(np.arange(N).repeat(5))
self.float = pd.Float64Index(np.random.randn(N).repeat(5))

# Convenience naming.
self.checked_add = pd.core.nanops._checked_add_with_arr

self.arr = np.arange(1000000)
self.arrpos = np.arange(1000000)
self.arrneg = np.arange(-1000000, 0)
self.arrmixed = np.array([1, -1]).repeat(500000)

def time_int_factorize(self):
self.int.factorize()

@@ -29,3 +37,21 @@ def time_int_duplicated(self):

def time_float_duplicated(self):
self.float.duplicated()

def time_add_overflow_pos_scalar(self):
self.checked_add(self.arr, 1)

def time_add_overflow_neg_scalar(self):
self.checked_add(self.arr, -1)

def time_add_overflow_zero_scalar(self):
self.checked_add(self.arr, 0)

def time_add_overflow_pos_arr(self):
self.checked_add(self.arr, self.arrpos)

def time_add_overflow_neg_arr(self):
self.checked_add(self.arr, self.arrneg)

def time_add_overflow_mixed_arr(self):
self.checked_add(self.arr, self.arrmixed)
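The helper exercised by these benchmarks, `pd.core.nanops._checked_add_with_arr`, detects int64 overflow before performing the addition. A minimal sketch of the idea (a hypothetical re-implementation, not the actual pandas code) looks like this:

```python
import numpy as np


def checked_add(arr, b):
    # Sketch of an overflow-checked int64 addition. Overflow is only
    # possible when both operands share a sign: pos + pos can exceed
    # the int64 maximum, and neg + neg can drop below the int64 minimum.
    arr = np.asarray(arr, dtype=np.int64)
    other = np.broadcast_to(np.asarray(b, dtype=np.int64), arr.shape)
    info = np.iinfo(np.int64)
    both_pos = (arr > 0) & (other > 0)
    both_neg = (arr < 0) & (other < 0)
    if np.any(arr[both_pos] > info.max - other[both_pos]):
        raise OverflowError("int64 overflow in addition")
    if np.any(arr[both_neg] < info.min - other[both_neg]):
        raise OverflowError("int64 underflow in addition")
    return arr + other
```

The benchmark cases above (positive, negative, zero scalar; positive, negative, mixed-sign arrays) probe exactly the branches of this sign check.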
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/attrs_caching.py
@@ -20,4 +20,4 @@ def setup(self):
self.cur_index = self.df.index

def time_setattr_dataframe_index(self):
self.df.index = self.cur_index
self.df.index = self.cur_index
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/ctors.py
@@ -49,4 +49,4 @@ def setup(self):
self.s = Series(([Timestamp('20110101'), Timestamp('20120101'), Timestamp('20130101')] * 1000))

def time_index_from_series_ctor(self):
Index(self.s)
Index(self.s)
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/frame_ctor.py
@@ -1703,4 +1703,4 @@ def setup(self):
self.dict_list = [dict(zip(self.columns, row)) for row in self.frame.values]

def time_series_ctor_from_dict(self):
Series(self.some_dict)
Series(self.some_dict)
26 changes: 26 additions & 0 deletions asv_bench/benchmarks/groupby.py
@@ -548,6 +548,32 @@ def time_groupby_sum(self):
self.df.groupby(['a'])['b'].sum()


class groupby_period(object):
# GH 14338
goal_time = 0.2

def make_grouper(self, N):
return pd.period_range('1900-01-01', freq='D', periods=N)

def setup(self):
N = 10000
self.grouper = self.make_grouper(N)
self.df = pd.DataFrame(np.random.randn(N, 2))

def time_groupby_sum(self):
self.df.groupby(self.grouper).sum()


class groupby_datetime(groupby_period):
def make_grouper(self, N):
return pd.date_range('1900-01-01', freq='D', periods=N)


class groupby_datetimetz(groupby_period):
def make_grouper(self, N):
return pd.date_range('1900-01-01', freq='D', periods=N,
tz='US/Central')

#----------------------------------------------------------------------
# Series.value_counts

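The new `groupby_period` benchmark groups a DataFrame directly by a same-length PeriodIndex. A small illustrative example (the data values here are made up):

```python
import pandas as pd

# Grouping by a same-length PeriodIndex: rows that fall in the same
# period are aggregated together.
grouper = pd.PeriodIndex(['2000-01', '2000-01', '2000-02', '2000-02'],
                         freq='M')
df = pd.DataFrame({'x': [1, 2, 3, 4]})
result = df.groupby(grouper)['x'].sum()
# one row per distinct period: 2000-01 -> 3, 2000-02 -> 7
```

The `groupby_datetime` and `groupby_datetimetz` subclasses time the same operation with naive and tz-aware DatetimeIndex groupers.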
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/hdfstore_bench.py
@@ -348,4 +348,4 @@ def remove(self, f):
try:
os.remove(self.f)
except:
pass
pass
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/index_object.py
@@ -344,4 +344,4 @@ def setup(self):
self.mi = MultiIndex.from_product([self.level1, self.level2])

def time_multiindex_with_datetime_level_sliced(self):
self.mi[:10].values
self.mi[:10].values
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/io_sql.py
@@ -212,4 +212,4 @@ def setup(self):
self.df = DataFrame({'float1': randn(10000), 'float2': randn(10000), 'string1': (['foo'] * 10000), 'bool1': ([True] * 10000), 'int1': np.random.randint(0, 100000, size=10000), }, index=self.index)

def time_sql_write_sqlalchemy(self):
self.df.to_sql('test1', self.engine, if_exists='replace')
self.df.to_sql('test1', self.engine, if_exists='replace')
25 changes: 25 additions & 0 deletions asv_bench/benchmarks/packers.py
@@ -547,6 +547,31 @@ def remove(self, f):
pass


class packers_write_json_lines(object):
goal_time = 0.2

def setup(self):
self.f = '__test__.msg'
self.N = 100000
self.C = 5
self.index = date_range('20000101', periods=self.N, freq='H')
self.df = DataFrame(dict([('float{0}'.format(i), randn(self.N)) for i in range(self.C)]), index=self.index)
self.remove(self.f)
self.df.index = np.arange(self.N)

def time_packers_write_json_lines(self):
self.df.to_json(self.f, orient="records", lines=True)

def teardown(self):
self.remove(self.f)

def remove(self, f):
try:
os.remove(self.f)
except:
pass


class packers_write_json_T(object):
goal_time = 0.2

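The new `packers_write_json_lines` benchmark times `to_json` with `lines=True`, which emits line-delimited JSON (one record per line). A small example of the output shape:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': ['x', 'y']})
out = df.to_json(orient='records', lines=True)
# line-delimited JSON, one record per line:
# {"a":1,"b":"x"}
# {"a":2,"b":"y"}
```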
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/panel_ctor.py
@@ -61,4 +61,4 @@ def setup(self):
self.data_frames[x] = self.df

def time_panel_from_dict_two_different_indexes(self):
Panel.from_dict(self.data_frames)
Panel.from_dict(self.data_frames)
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/panel_methods.py
@@ -53,4 +53,4 @@ def setup(self):
self.panel = Panel(np.random.randn(100, len(self.index), 1000))

def time_panel_shift_minor(self):
self.panel.shift(1, axis='minor')
self.panel.shift(1, axis='minor')
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/replace.py
@@ -45,4 +45,4 @@ def setup(self):
self.ts = Series(np.random.randn(self.N), index=self.rng)

def time_replace_replacena(self):
self.ts.replace(np.nan, 0.0, inplace=True)
self.ts.replace(np.nan, 0.0, inplace=True)
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/reshape.py
@@ -73,4 +73,4 @@ def setup(self):
break

def time_unstack_sparse_keyspace(self):
self.idf.unstack()
self.idf.unstack()
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/stat_ops.py
@@ -258,4 +258,4 @@ def time_rolling_skew(self):
rolling_skew(self.arr, self.win)

def time_rolling_kurt(self):
rolling_kurt(self.arr, self.win)
rolling_kurt(self.arr, self.win)
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/strings.py
@@ -390,4 +390,4 @@ def time_strings_upper(self):
self.many.str.upper()

def make_series(self, letters, strlen, size):
return Series([str(x) for x in np.fromiter(IT.cycle(letters), count=(size * strlen), dtype='|S1').view('|S{}'.format(strlen))])
return Series([str(x) for x in np.fromiter(IT.cycle(letters), count=(size * strlen), dtype='|S1').view('|S{}'.format(strlen))])
13 changes: 12 additions & 1 deletion asv_bench/benchmarks/timedelta.py
@@ -1,5 +1,5 @@
from .pandas_vb_common import *
from pandas import to_timedelta
from pandas import to_timedelta, Timestamp


class timedelta_convert_int(object):
@@ -47,3 +47,14 @@ def time_timedelta_convert_coerce(self):

def time_timedelta_convert_ignore(self):
to_timedelta(self.arr, errors='ignore')


class timedelta_add_overflow(object):
goal_time = 0.2

def setup(self):
self.td = to_timedelta(np.arange(1000000))
self.ts = Timestamp('2000')

def time_add_td_ts(self):
self.td + self.ts
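The benchmark above times adding a Timestamp to a TimedeltaIndex, one of the code paths the checked-add patch protects. For reference, the operation itself (small sizes here, for illustration):

```python
import numpy as np
import pandas as pd

td = pd.to_timedelta(np.arange(3), unit='s')
ts = pd.Timestamp('2000-01-01')
# TimedeltaIndex + Timestamp broadcasts to a DatetimeIndex; the
# checked-add helper guards the underlying int64 nanosecond math
# against silent wraparound.
result = td + ts
# result[2] == Timestamp('2000-01-01 00:00:02')
```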
4 changes: 2 additions & 2 deletions ci/prep_cython_cache.sh
@@ -3,8 +3,8 @@
ls "$HOME/.cache/"

PYX_CACHE_DIR="$HOME/.cache/pyxfiles"
pyx_file_list=`find ${TRAVIS_BUILD_DIR} -name "*.pyx"`
pyx_cache_file_list=`find ${PYX_CACHE_DIR} -name "*.pyx"`
pyx_file_list=`find ${TRAVIS_BUILD_DIR} -name "*.pyx" -o -name "*.pxd"`
pyx_cache_file_list=`find ${PYX_CACHE_DIR} -name "*.pyx" -o -name "*.pxd"`

CACHE_File="$HOME/.cache/cython_files.tar"

2 changes: 1 addition & 1 deletion ci/submit_cython_cache.sh
@@ -2,7 +2,7 @@

CACHE_File="$HOME/.cache/cython_files.tar"
PYX_CACHE_DIR="$HOME/.cache/pyxfiles"
pyx_file_list=`find ${TRAVIS_BUILD_DIR} -name "*.pyx"`
pyx_file_list=`find ${TRAVIS_BUILD_DIR} -name "*.pyx" -o -name "*.pxd"`

rm -rf $CACHE_File
rm -rf $PYX_CACHE_DIR
4 changes: 2 additions & 2 deletions doc/README.rst
@@ -155,9 +155,9 @@ Where to start?
---------------

There are a number of issues listed under `Docs
<https://github.com/pydata/pandas/issues?labels=Docs&sort=updated&state=open>`_
<https://github.com/pandas-dev/pandas/issues?labels=Docs&sort=updated&state=open>`_
and `Good as first PR
<https://github.com/pydata/pandas/issues?labels=Good+as+first+PR&sort=updated&state=open>`_
<https://github.com/pandas-dev/pandas/issues?labels=Good+as+first+PR&sort=updated&state=open>`_
where you could start out.

Or maybe you have an idea of your own, by using pandas, looking for something
2 changes: 1 addition & 1 deletion doc/_templates/autosummary/accessor_attribute.rst
@@ -3,4 +3,4 @@

.. currentmodule:: {{ module.split('.')[0] }}

.. autoaccessorattribute:: {{ [module.split('.')[1], objname]|join('.') }}
.. autoaccessorattribute:: {{ [module.split('.')[1], objname]|join('.') }}
2 changes: 1 addition & 1 deletion doc/_templates/autosummary/accessor_method.rst
@@ -3,4 +3,4 @@

.. currentmodule:: {{ module.split('.')[0] }}

.. autoaccessormethod:: {{ [module.split('.')[1], objname]|join('.') }}
.. autoaccessormethod:: {{ [module.split('.')[1], objname]|join('.') }}
14 changes: 7 additions & 7 deletions doc/source/basics.rst
@@ -1794,18 +1794,18 @@ The following functions are available for one dimensional object arrays or scala

- :meth:`~pandas.to_datetime` (conversion to datetime objects)

.. ipython:: python
.. ipython:: python

import datetime
m = ['2016-07-09', datetime.datetime(2016, 3, 2)]
pd.to_datetime(m)
import datetime
m = ['2016-07-09', datetime.datetime(2016, 3, 2)]
pd.to_datetime(m)

- :meth:`~pandas.to_timedelta` (conversion to timedelta objects)

.. ipython:: python
.. ipython:: python

m = ['5us', pd.Timedelta('1day')]
pd.to_timedelta(m)
m = ['5us', pd.Timedelta('1day')]
pd.to_timedelta(m)

To force a conversion, we can pass in an ``errors`` argument, which specifies how pandas should deal with elements
that cannot be converted to desired dtype or object. By default, ``errors='raise'``, meaning that any errors encountered
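The ``errors`` argument discussed in the text above controls how elements that cannot be converted are handled; for example:

```python
import pandas as pd

# errors='raise' (the default) fails on the bad element;
# errors='coerce' converts it to NaT instead.
result = pd.to_datetime(['2016-07-09', 'not a date'], errors='coerce')
assert pd.isnull(result[1])
```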
2 changes: 1 addition & 1 deletion doc/source/categorical.rst
@@ -973,7 +973,7 @@ are not numeric data (even in the case that ``.categories`` is numeric).
print("TypeError: " + str(e))

.. note::
If such a function works, please file a bug at https://github.com/pydata/pandas!
If such a function works, please file a bug at https://github.com/pandas-dev/pandas!

dtype in apply
~~~~~~~~~~~~~~
4 changes: 2 additions & 2 deletions doc/source/comparison_with_sas.rst
@@ -116,7 +116,7 @@ Reading External Data

Like SAS, pandas provides utilities for reading in data from
many formats. The ``tips`` dataset, found within the pandas
tests (`csv <https://raw.github.com/pydata/pandas/master/pandas/tests/data/tips.csv>`_)
tests (`csv <https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv>`_)
will be used in many of the following examples.

SAS provides ``PROC IMPORT`` to read csv data into a data set.
@@ -131,7 +131,7 @@ The pandas method is :func:`read_csv`, which works similarly.

.. ipython:: python

url = 'https://raw.github.com/pydata/pandas/master/pandas/tests/data/tips.csv'
url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)
tips.head()

2 changes: 1 addition & 1 deletion doc/source/comparison_with_sql.rst
@@ -23,7 +23,7 @@ structure.

.. ipython:: python

url = 'https://raw.github.com/pydata/pandas/master/pandas/tests/data/tips.csv'
url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)
tips.head()

10 changes: 5 additions & 5 deletions doc/source/conf.py
@@ -295,15 +295,15 @@
'python': ('http://docs.python.org/3', None),
'numpy': ('http://docs.scipy.org/doc/numpy', None),
'scipy': ('http://docs.scipy.org/doc/scipy/reference', None),
'py': ('http://pylib.readthedocs.org/en/latest/', None)
'py': ('https://pylib.readthedocs.io/en/latest/', None)
}
import glob
autosummary_generate = glob.glob("*.rst")

# extlinks alias
extlinks = {'issue': ('https://github.com/pydata/pandas/issues/%s',
extlinks = {'issue': ('https://github.com/pandas-dev/pandas/issues/%s',
'GH'),
'wiki': ('https://github.com/pydata/pandas/wiki/%s',
'wiki': ('https://github.com/pandas-dev/pandas/wiki/%s',
'wiki ')}

ipython_exec_lines = [
@@ -468,10 +468,10 @@ def linkcode_resolve(domain, info):
fn = os.path.relpath(fn, start=os.path.dirname(pandas.__file__))

if '+' in pandas.__version__:
return "http://github.com/pydata/pandas/blob/master/pandas/%s%s" % (
return "http://github.com/pandas-dev/pandas/blob/master/pandas/%s%s" % (
fn, linespec)
else:
return "http://github.com/pydata/pandas/blob/v%s/pandas/%s%s" % (
return "http://github.com/pandas-dev/pandas/blob/v%s/pandas/%s%s" % (
pandas.__version__, fn, linespec)


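The version check in ``linkcode_resolve`` above distinguishes dev builds (a ``+`` in ``pandas.__version__``) from releases. The branching logic, extracted into a standalone sketch (the function name is illustrative):

```python
def github_url(version, fn, linespec=''):
    # Hypothetical standalone version of the URL choice in
    # linkcode_resolve: dev builds ('+' in the version string) link to
    # master, releases link to the matching version tag.
    base = "http://github.com/pandas-dev/pandas/blob"
    if '+' in version:
        return "%s/master/pandas/%s%s" % (base, fn, linespec)
    return "%s/v%s/pandas/%s%s" % (base, version, fn, linespec)
```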