PERF: HDFStore __unicode__ method #16514

Closed
wants to merge 55 commits into from
6faa5a6
PERF: HDFStore has faster __unicode__, new info() method with old beh…
Kiv May 26, 2017
c8af4cf
ENH: added margins_name parameter for crosstab (#16489)
cmohl2013 May 26, 2017
840de2f
TST: ujson tests are not being run (#16499) (#16500)
abarber4gh May 26, 2017
4ed801b
DOC: Remove preference for pytest paradigm in assert_raises_regex (#1…
gfyoung May 27, 2017
c570eaf
TST: Specify HTML file encoding on PY3 (#16526)
neirbowj May 29, 2017
44d2a12
BUG: Fixed tput output on windows (#16496)
TomAugspurger May 30, 2017
1a9cb5b
BUG: Incorrect handling of rolling.cov with offset window (#16244)
keitakurita May 30, 2017
0a9f548
TST: Avoid global state in matplotlib tests (#16539)
TomAugspurger May 31, 2017
f7149a2
DOC: Update to docstring of DataFrame(dtype) (#14764) (#16487)
VincentLa May 31, 2017
36d6171
DOC: correct docstring examples (#3439) (#16432)
ProsperousHeart May 31, 2017
ab9bc9a
Fix unbound local with bad engine (#16511)
jtratner May 31, 2017
9a9c315
return empty MultiIndex for symmetrical difference on equal MultiInde…
Tafkas May 31, 2017
79cc4a9
BUG: select_as_multiple doesn't respect start/stop kwargs GH16209 (#1…
JosephWagner May 31, 2017
0db4de5
BUG: Bug in .resample() and .groupby() when aggregating on integers (…
jreback May 31, 2017
c193235
COMPAT: cython str-to-int can raise a ValueError on non-CPython (#16563)
mattip May 31, 2017
b9febe0
CLN: raise correct error for Panel sort_values (#16532)
pepicello May 31, 2017
98ed54d
BUG: Fixed pd.unique on array of tuples (#16543)
TomAugspurger Jun 1, 2017
6d761b4
BUG: Allow non-callable attributes in aggregate function. Fixes GH164…
pvomelveny Jun 1, 2017
ed542ee
Strictly monotonic (#16555)
TomAugspurger Jun 1, 2017
f92ec38
COMPAT: Consider Python 2.x tarfiles file-like (#16533)
gfyoung Jun 1, 2017
3f70fda
BUG: Fixed to_html ignoring index_names parameter
CRP Jun 1, 2017
785887a
BUG: fixed wrong order of ordered labels in pd.cut()
economy Jun 1, 2017
746c3cb
fix linting
jreback Jun 1, 2017
885522a
TST: writing invalid table names to sqlite (#16464)
Jun 1, 2017
79beeb6
TST: Skip test_database_uri_string if pg8000 importable (#16528)
neirbowj Jun 1, 2017
a7c95f2
DOC: Remove incorrect elements of PeriodIndex docstring (#16553)
tui-rob Jun 1, 2017
50479ae
TST: Make HDF5 fspath write test robust (#16575)
TomAugspurger Jun 1, 2017
b8ca9fc
ENH: add .ngroup() method to groupby objects (#14026) (#14026)
dsm054 Jun 1, 2017
e331c78
make null lowercase a missing value (#16534)
OlegShteynbuk Jun 1, 2017
e24e57c
MAINT: Drop has_index_names input from read_excel (#16522)
gfyoung Jun 1, 2017
ec535e9
BUG: reimplement MultiIndex.remove_unused_levels (#16565)
rhendric Jun 2, 2017
9e71f08
Adding 'n/a' to list of strings denoting missing values (#16079)
chrisgorgo Jun 2, 2017
32512b9
API: Make is_strictly_monotonic_* private (#16576)
TomAugspurger Jun 2, 2017
36670fc
DOC: change doc build to python 3.6 (#16545)
jorisvandenbossche Jun 2, 2017
5d7a020
DOC: whatsnew 0.20.2 edits (#16587)
jreback Jun 2, 2017
882ea0f
DOC: Fix typo in timeseries.rst (#16590)
funnycrab Jun 4, 2017
9d0be9d
PERF: vectorize _interp_limit (#16592)
TomAugspurger Jun 4, 2017
b3769f1
DOC: Fix typo in merge doc for validate kwarg (#16595)
benjello Jun 4, 2017
a0174eb
BUG: convert numpy strings in index names in HDF #13492 (#16444)
makmanalp Jun 4, 2017
9771514
ERRR: Raise error in usecols when column doesn't exist but length mat…
bpraggastis Jun 4, 2017
cf5f2d8
DOC: Whatsnew fixups (#16596)
TomAugspurger Jun 4, 2017
1415b95
DOC: Update release.rst
TomAugspurger Jun 4, 2017
93aabe7
BUG: pickle compat with UTC tz's (#16611)
jreback Jun 6, 2017
3ebd719
Fix some lgtm alerts (#16613)
jhelie Jun 7, 2017
fd171eb
BLD: fix numpy on 3.6 build as 1.13 was released but no deps are buil…
jreback Jun 8, 2017
4b0ef03
BUG: Fix Series.get failure on missing NaN (#8569) (#16619)
dsm054 Jun 8, 2017
1b159af
TST: NaN in MultiIndex should not become a string (#7031) (#16625)
dsm054 Jun 8, 2017
8eb0c7f
TST: verify we can add and subtract from indices (#8142) (#16629)
dsm054 Jun 8, 2017
6fa83d3
BUG: conversion of Series to Categorical (#16557)
preddy5 Jun 9, 2017
aba51b6
BLD: fix numpy on 2.7 build as 1.13 was released but no deps are buil…
jreback Jun 9, 2017
fdb54df
CLN: make license file machine readable (#16649)
tswast Jun 9, 2017
41b3968
fix pytest-xidst version as 1.17 appears buggy (#16652)
jreback Jun 10, 2017
9d4c88d
COMPAT: numpy 1.13 test compat (#16654)
jreback Jun 10, 2017
8f6e50a
Revert "fix pytest-xidst version as 1.17 appears buggy (#16652)" (#16…
jreback Jun 10, 2017
1de16b6
Add ASV benchmark.
Kiv Jun 11, 2017
8 changes: 6 additions & 2 deletions .travis.yml
@@ -74,7 +74,11 @@ matrix:
# In allow_failures
- os: linux
env:
- JOB="3.5_DOC" DOC=true
- JOB="3.6_DOC" DOC=true
addons:
apt:
packages:
- xsel
allow_failures:
- os: linux
env:
@@ -87,7 +91,7 @@ matrix:
- JOB="3.6_NUMPY_DEV" TEST_ARGS="--skip-slow --skip-network" PANDAS_TESTING_MODE="deprecate"
- os: linux
env:
- JOB="3.5_DOC" DOC=true
- JOB="3.6_DOC" DOC=true

before_install:
- echo "before_install"
57 changes: 57 additions & 0 deletions AUTHORS.md
@@ -0,0 +1,57 @@
About the Copyright Holders
===========================

* Copyright (c) 2008-2011 AQR Capital Management, LLC

AQR Capital Management began pandas development in 2008. Development was
led by Wes McKinney. AQR released the source under this license in 2009.
* Copyright (c) 2011-2012, Lambda Foundry, Inc.

Wes is now an employee of Lambda Foundry, and remains the pandas project
lead.
* Copyright (c) 2011-2012, PyData Development Team

The PyData Development Team is the collection of developers of the PyData
project. This includes all of the PyData sub-projects, including pandas. The
core team that coordinates development on GitHub can be found here:
http://github.com/pydata.

Full credits for pandas contributors can be found in the documentation.

Our Copyright Policy
====================

PyData uses a shared copyright model. Each contributor maintains copyright
over their contributions to PyData. However, it is important to note that
these contributions are typically only changes to the repositories. Thus,
the PyData source code, in its entirety, is not the copyright of any single
person or institution. Instead, it is the collective copyright of the
entire PyData Development Team. If individual contributors want to maintain
a record of what changes/contributions they have specific copyright on,
they should indicate their copyright in the commit message of the change
when they commit the change to one of the PyData repositories.

With this in mind, the following banner should be used in any source code
file to indicate the copyright and license terms:

```
#-----------------------------------------------------------------------------
# Copyright (c) 2012, PyData Development Team
# All rights reserved.
#
# Distributed under the terms of the BSD Simplified License.
#
# The full license is in the LICENSE file, distributed with this software.
#-----------------------------------------------------------------------------
```

Other licenses can be found in the LICENSES directory.

License
=======

pandas is distributed under a 3-clause ("Simplified" or "New") BSD
license. Parts of NumPy, SciPy, numpydoc, bottleneck, which all have
BSD-compatible licenses, are included. Their licenses follow the pandas
license.

106 changes: 24 additions & 82 deletions LICENSE
@@ -1,87 +1,29 @@
=======
License
=======
BSD 3-Clause License

pandas is distributed under a 3-clause ("Simplified" or "New") BSD
license. Parts of NumPy, SciPy, numpydoc, bottleneck, which all have
BSD-compatible licenses, are included. Their licenses follow the pandas
license.

pandas license
==============

Copyright (c) 2011-2012, Lambda Foundry, Inc. and PyData Development Team
All rights reserved.

Copyright (c) 2008-2011 AQR Capital Management, LLC
Copyright (c) 2008-2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.

* Neither the name of the copyright holder nor the names of any
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

About the Copyright Holders
===========================

AQR Capital Management began pandas development in 2008. Development was
led by Wes McKinney. AQR released the source under this license in 2009.
Wes is now an employee of Lambda Foundry, and remains the pandas project
lead.

The PyData Development Team is the collection of developers of the PyData
project. This includes all of the PyData sub-projects, including pandas. The
core team that coordinates development on GitHub can be found here:
http://github.com/pydata.

Full credits for pandas contributors can be found in the documentation.

Our Copyright Policy
====================

PyData uses a shared copyright model. Each contributor maintains copyright
over their contributions to PyData. However, it is important to note that
these contributions are typically only changes to the repositories. Thus,
the PyData source code, in its entirety, is not the copyright of any single
person or institution. Instead, it is the collective copyright of the
entire PyData Development Team. If individual contributors want to maintain
a record of what changes/contributions they have specific copyright on,
they should indicate their copyright in the commit message of the change
when they commit the change to one of the PyData repositories.

With this in mind, the following banner should be used in any source code
file to indicate the copyright and license terms:

#-----------------------------------------------------------------------------
# Copyright (c) 2012, PyData Development Team
# All rights reserved.
#
# Distributed under the terms of the BSD Simplified License.
#
# The full license is in the LICENSE file, distributed with this software.
#-----------------------------------------------------------------------------

Other licenses can be found in the LICENSES directory.
8 changes: 8 additions & 0 deletions asv_bench/benchmarks/hdfstore_bench.py
@@ -90,6 +90,14 @@ def time_query_store_table(self):
stop = self.df2.index[15000]
self.store.select('table', where="index > start and index < stop")

def time_store_tostring(self):
repr(self.store)
str(self.store)

def time_store_info(self):
self.store.info()



class HDF5Panel(object):
goal_time = 0.2
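
The two benchmarks added above time `repr`/`str` separately from `info()`, which matches the commit title: the string form becomes cheap and the detailed listing moves to an explicit method. The following is a toy sketch of that split, not the pandas implementation. `ToyStore`, `_summarize`, and `summaries_computed` are hypothetical names invented for illustration.

```python
class ToyStore:
    """Hypothetical stand-in for a store whose per-key summary is costly."""

    def __init__(self, path, tables):
        self.path = path
        self._tables = dict(tables)
        self.summaries_computed = 0  # counts how often the costly path runs

    def _summarize(self, key):
        # Pretend this has to read metadata from disk.
        self.summaries_computed += 1
        return "{} rows".format(len(self._tables[key]))

    def __repr__(self):
        # Cheap: only the type name and the file path, no per-key work.
        return "<{}: {}>".format(type(self).__name__, self.path)

    def info(self):
        # Detailed listing, paid for only when explicitly requested.
        lines = [repr(self)]
        for key in sorted(self._tables):
            lines.append("{}: {}".format(key, self._summarize(key)))
        return "\n".join(lines)


store = ToyStore("data.h5", {"/df": [1, 2, 3]})
print(repr(store))   # does not touch the tables
print(store.info())  # walks every key
```

Keeping the expensive walk out of `__repr__` matters because interactive shells and debuggers call it implicitly.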
9 changes: 9 additions & 0 deletions asv_bench/benchmarks/indexing.py
@@ -204,6 +204,12 @@ def setup(self):
[np.arange(100), list('A'), list('A')],
names=['one', 'two', 'three'])

rng = np.random.RandomState(4)
size = 1 << 16
self.mi_unused_levels = pd.MultiIndex.from_arrays([
rng.randint(0, 1 << 13, size),
rng.randint(0, 1 << 10, size)])[rng.rand(size) < 0.1]

def time_series_xs_mi_ix(self):
self.s.ix[999]

@@ -248,6 +254,9 @@ def time_multiindex_small_get_loc_warm(self):
def time_is_monotonic(self):
self.miint.is_monotonic

def time_remove_unused_levels(self):
self.mi_unused_levels.remove_unused_levels()


class IntervalIndexing(object):
goal_time = 0.2
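
The benchmark above builds a `MultiIndex` and keeps roughly a tenth of its rows, so most level values are no longer referenced. A small sketch of what `remove_unused_levels` does in that situation; the four-row index here is an assumption chosen to keep the example readable, not the benchmark data:

```python
import pandas as pd

# Four rows, two levels.
mi = pd.MultiIndex.from_arrays([[1, 2, 3, 4], ["a", "b", "c", "d"]])

# Selecting a subset keeps the original levels, including unused values.
subset = mi[[0, 2]]
print(list(subset.levels[0]))

# remove_unused_levels rebuilds the levels from the values still present.
trimmed = subset.remove_unused_levels()
print(list(trimmed.levels[0]))
```

The trimmed index compares equal to the subset; only the stored levels shrink.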
9 changes: 9 additions & 0 deletions ci/build_docs.sh
@@ -59,6 +59,15 @@ if [ "$DOC" ]; then
git remote -v

git push origin gh-pages -f

echo "Running doctests"
cd "$TRAVIS_BUILD_DIR"
pytest --doctest-modules \
pandas/core/reshape/concat.py \
pandas/core/reshape/pivot.py \
pandas/core/reshape/reshape.py \
pandas/core/reshape/tile.py

fi

exit 0
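
The step added above runs `pytest --doctest-modules` over four reshape modules, which executes the `>>>` examples found in their docstrings. A minimal sketch of the same idea using only the standard-library `doctest` module; `clip` and `run_doctests` are hypothetical names for illustration and are not part of pandas or its CI scripts:

```python
import doctest


def clip(value, low, high):
    """Clamp ``value`` to the closed interval [low, high].

    >>> clip(5, 0, 3)
    3
    >>> clip(-1, 0, 3)
    0
    """
    return max(low, min(value, high))


def run_doctests(func):
    """Run the docstring examples of ``func``; return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Supply the function itself as the globals the examples run in.
    for test in finder.find(func, name=func.__name__,
                            globs={func.__name__: func}):
        runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted


if __name__ == "__main__":
    print(run_doctests(clip))
```

A docstring example whose output drifts from the code shows up as a failure, which is the point of running doctests in the documentation build.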
2 changes: 1 addition & 1 deletion ci/requirements-2.7.build
@@ -2,5 +2,5 @@ python=2.7*
python-dateutil=2.4.1
pytz=2013b
nomkl
numpy
numpy=1.12*
cython=0.23
2 changes: 1 addition & 1 deletion ci/requirements-3.6.build
@@ -2,5 +2,5 @@ python=3.6*
python-dateutil
pytz
nomkl
numpy
numpy=1.12*
cython
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
python=3.5*
python=3.6*
python-dateutil
pytz
numpy
numpy=1.12*
cython
4 changes: 2 additions & 2 deletions ci/requirements-3.5_DOC.run → ci/requirements-3.6_DOC.run
@@ -12,7 +12,7 @@ lxml
beautifulsoup4
html5lib
pytables
openpyxl=1.8.5
openpyxl
xlrd
xlwt
xlsxwriter
@@ -21,4 +21,4 @@ numexpr
bottleneck
statsmodels
xarray
pyqt=4.11.4
pyqt
File renamed without changes.
10 changes: 10 additions & 0 deletions doc/source/advanced.rst
@@ -948,6 +948,16 @@ On the other hand, if the index is not monotonic, then both slice bounds must be
In [11]: df.loc[2:3, :]
KeyError: 'Cannot get right slice bound for non-unique label: 3'

:meth:`Index.is_monotonic_increasing` and :meth:`Index.is_monotonic_decreasing` only check that
an index is weakly monotonic. To check for strict monotonicity, you can combine one of those with
:meth:`Index.is_unique`.

.. ipython:: python

weakly_monotonic = pd.Index(['a', 'b', 'c', 'c'])
weakly_monotonic
weakly_monotonic.is_monotonic_increasing
weakly_monotonic.is_monotonic_increasing & weakly_monotonic.is_unique
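
The same check can be wrapped in a small helper. This is only a sketch: ``is_strictly_increasing`` is a hypothetical name for illustration, not a pandas function.

```python
import pandas as pd


def is_strictly_increasing(index):
    """Return True when each label is strictly greater than the previous one."""
    # Weak monotonicity allows ties; requiring uniqueness rules them out.
    return bool(index.is_monotonic_increasing and index.is_unique)


weak = pd.Index(["a", "b", "c", "c"])    # repeated "c": weakly monotonic only
strict = pd.Index(["a", "b", "c", "d"])  # no repeats: strictly monotonic

print(is_strictly_increasing(weak))
print(is_strictly_increasing(strict))
```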

Endpoints are inclusive
~~~~~~~~~~~~~~~~~~~~~~~
2 changes: 2 additions & 0 deletions doc/source/api.rst
@@ -99,6 +99,7 @@ HDFStore: PyTables (HDF5)
HDFStore.append
HDFStore.get
HDFStore.select
HDFStore.info

Feather
~~~~~~~
@@ -1705,6 +1706,7 @@ Computations / Descriptive Stats
GroupBy.mean
GroupBy.median
GroupBy.min
GroupBy.ngroup
GroupBy.nth
GroupBy.ohlc
GroupBy.prod
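
For readers unfamiliar with the newly listed `GroupBy.ngroup`, a brief sketch of what it returns. The frame below is made-up illustration data, and the example relies on the default `sort=True` of `groupby`, so groups are numbered in sorted key order:

```python
import pandas as pd

df = pd.DataFrame({"A": list("aabab")})

# Each row is labelled with the number of the group it belongs to.
group_numbers = df.groupby("A").ngroup()
print(group_numbers.tolist())
```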