PERF: HDFStore __unicode__ method #16666

Merged
merged 2 commits into from
Jun 11, 2017

@Kiv Kiv commented Jun 11, 2017

supersedes #16514

HDFStore.__unicode__ now returns only the file path info. A new info() method provides the previous behavior of __unicode__.

import pandas as pd
store = pd.HDFStore('test.h5', 'w')
for i in range(5000):
    store.put('table_{}'.format(i), pd.DataFrame([i]))

# Before
%time str(store)
CPU times: user 26.1 s, sys: 156 ms, total: 26.2 s
Wall time: 26.2 s

# After
%time str(store)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 40.1 µs
  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master --name-only -- '*.py' | flake8 --diff
  • whatsnew entry

@Kiv Kiv changed the title from "PERF: HDFStore __unicode__ method #16514" to "PERF: HDFStore __unicode__ method" Jun 11, 2017
@Kiv Kiv mentioned this pull request Jun 11, 2017

@jreback jreback left a comment

actually run the asv and show the output

@@ -90,6 +90,15 @@ def time_query_store_table(self):
        stop = self.df2.index[15000]
        self.store.select('table', where="index > start and index < stop")

    def time_store_repr(self):
        repr(self.store)

this may not show much as there is only a couple of nodes

create a new store that has an example like your issue (but use only like 50 nodes)
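
A benchmark along the lines requested here might look like the sketch below. It is only an illustration: the class name StoreReprManyNodes, the n_nodes attribute, and the temporary-file handling are assumptions and are not taken from this pull request. It relies on the asv convention that methods prefixed with time_ are timed and that setup and teardown run around them.

```python
import os
import tempfile


class StoreReprManyNodes:
    """Hypothetical asv-style benchmark: repr/str of a store with many nodes."""

    n_nodes = 50  # node count suggested in the review comment

    def setup(self):
        # Imported lazily so that collecting this class does not itself need
        # pandas or PyTables; both are required to actually run the benchmark.
        import pandas as pd

        self.tmpdir = tempfile.mkdtemp()
        self.path = os.path.join(self.tmpdir, "many_nodes.h5")
        self.store = pd.HDFStore(self.path, mode="w")
        # One small frame per node, mirroring the example in the PR description.
        for i in range(self.n_nodes):
            self.store.put("table_{}".format(i), pd.DataFrame([i]))

    def teardown(self):
        self.store.close()
        os.remove(self.path)
        os.rmdir(self.tmpdir)

    def time_store_repr(self):
        repr(self.store)

    def time_store_str(self):
        str(self.store)
```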

@Kiv Kiv Jun 11, 2017

6 is plenty of nodes to show the issue:

    before     after       ratio
  [75c8698e] [a5016b44]
-   24.82ms     7.09μs      0.00  hdfstore_bench.HDF5.time_store_repr
-   24.45ms     6.76μs      0.00  hdfstore_bench.HDF5.time_store_str
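
The ratio column reads 0.00 only because it is rounded to two decimals. As a rough check, the time_store_repr row can be recomputed by hand; the variable names below are illustrative and the figures are copied from the table above.

```python
# Timings for time_store_repr, converted to seconds.
before_s = 24.82e-3  # 24.82 ms
after_s = 7.09e-6    # 7.09 microseconds

ratio = after_s / before_s    # what asv reports, before rounding
speedup = before_s / after_s  # how many times faster the new code is

print("ratio   = {:.6f}".format(ratio))
print("speedup = {:.0f}x".format(speedup))
```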

@jreback jreback added the "IO HDF5" (read_hdf, HDFStore) and "Performance" (memory or execution speed performance) labels Jun 11, 2017
@jreback jreback added this to the 0.21.0 milestone Jun 11, 2017
@@ -1161,6 +1136,37 @@ def copy(self, file, mode='w', propindexes=True, keys=None, complib=None,

        return new_store

    def info(self):
        """return detailed information on the store
        .. versionadded:: 0.21.0

add a blank line before versionadded
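
The layout being asked for could look like the sketch below. The class DocExample and the placeholder body are hypothetical; only the summary line, the blank line, and the versionadded directive matter here. Sphinx directives in a docstring generally need a blank line before them to be parsed as a separate block.

```python
class DocExample:
    """Hypothetical holder class, used only to show the docstring layout."""

    def info(self):
        """
        Return detailed information on the store.

        .. versionadded:: 0.21.0
        """
        raise NotImplementedError  # placeholder body, not the real method
```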

Kiv and others added 2 commits June 11, 2017 19:17
…avior.

__unicode__ now only returns file path info, not (expensive) details on all existing keys.
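
The pattern described in this commit message, a cheap string representation plus a separate method for the expensive details, can be sketched with a toy class. ToyStore, its scans counter, and the output format are all hypothetical and are not the pandas implementation.

```python
class ToyStore:
    """Hypothetical stand-in for a store; not the pandas implementation."""

    def __init__(self, path):
        self.path = path
        self._nodes = {}
        self.scans = 0  # counts how often every node has been visited

    def put(self, key, value):
        self._nodes[key] = value

    def __repr__(self):
        # Cheap: touches no nodes, so the cost does not grow with the store.
        return "{}\nFile path: {}".format(type(self).__name__, self.path)

    def info(self):
        # Expensive: visits every node to describe it.
        self.scans += 1
        lines = [repr(self)]
        for key in sorted(self._nodes):
            lines.append("/{}  {}".format(key, type(self._nodes[key]).__name__))
        return "\n".join(lines)


store = ToyStore("test.h5")
for i in range(3):
    store.put("table_{}".format(i), [i])

summary = str(store)    # falls back to __repr__, no node scan
details = store.info()  # one full scan over the nodes
```

Keeping the per-node walk out of the string representation means that interactive display of the object stays fast regardless of how many keys the store holds.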

codecov bot commented Jun 11, 2017

Codecov Report

Merging #16666 into master will decrease coverage by <.01%.
The diff coverage is 94.73%.

@@            Coverage Diff             @@
##           master   #16666      +/-   ##
==========================================
- Coverage   90.93%   90.93%   -0.01%     
==========================================
  Files         161      161              
  Lines       49269    49268       -1     
==========================================
- Hits        44802    44801       -1     
  Misses       4467     4467

Flag        Coverage Δ
#multiple   88.69% <5.26%> (ø) ⬆️
#single     40.22% <94.73%> (ø) ⬆️

Impacted Files          Coverage Δ
pandas/io/pytables.py   93.04% <94.73%> (-0.08%) ⬇️
pandas/core/generic.py  92.36% <0%> (+0.09%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5a6f50d...5d2812d.


codecov bot commented Jun 11, 2017

Codecov Report

Merging #16666 into master will decrease coverage by <.01%.
The diff coverage is 85.71%.

@@            Coverage Diff             @@
##           master   #16666      +/-   ##
==========================================
- Coverage   90.93%   90.92%   -0.01%     
==========================================
  Files         161      161              
  Lines       49269    49271       +2     
==========================================
  Hits        44802    44802              
- Misses       4467     4469       +2

Flag        Coverage Δ
#multiple   88.69% <4.76%> (-0.01%) ⬇️
#single     40.22% <85.71%> (-0.01%) ⬇️

Impacted Files          Coverage Δ
pandas/io/pytables.py   93.04% <85.71%> (-0.08%) ⬇️
pandas/io/excel.py      80.55% <0%> (ø) ⬆️

