DOC: Cleaned references to pandas <v0.12 in docs #17375

Contributor

@topper-123 topper-123 commented Aug 29, 2017

There are a lot of references in the docs to when exactly some change occurred. For newer changes this is great, but there comes a time when such references only distract the reader rather than help, because the versions referenced are so old that they have become noise.

I've cleaned up references up to and including v0.11.

IMO I could have gone higher (v0.15?), but I can do that in a later round.

Some issues I would be glad for input on:

  • In gotchas.rst, there is a sentence "As of pandas 0.11, pandas is not 100% thread safe." I haven't altered this, but I presume this is still correct in the newest version of pandas? Then IMO it should be changed to reference a newer version or simply to "pandas is currently not 100% thread safe."
  • In io.rst there is a sentence "0.10.1 of HDFStore can read tables created in a prior version of pandas, ...". I'm not even sure whether the "0.10.1" refers to the version of pandas or of an HDF library, so I left it alone. The paragraph also discusses backwards compatibility, which makes it somewhat relevant to keep around, even if it's an old change.
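
A search for such references could be scripted. The following is only a sketch, not the approach used in this PR: the regular expression, the `find_old_references` helper, and the `(0, 12)` cutoff tuple are all hypothetical, and the sample sentences are taken from the diff hunks in this conversation.

```python
import re

# Hypothetical pattern: matches phrases such as "Starting in v0.8",
# "New since 0.10.0", or "As of pandas 0.11" and captures major/minor.
VERSION_REF = re.compile(
    r"(?:starting (?:in|with)|new (?:since|in)|as of)\s+"
    r"(?:pandas\s+)?`*v?(\d+)\.(\d+)(?:\.\d+)?",
    re.IGNORECASE,
)


def find_old_references(text, cutoff=(0, 12)):
    """Return (line_number, line) pairs that mention a version below cutoff."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in VERSION_REF.finditer(line):
            version = (int(match.group(1)), int(match.group(2)))
            if version < cutoff:
                hits.append((lineno, line.strip()))
                break  # one hit per line is enough
    return hits


sample = """Starting in v0.8, pandas introduced binary comparison methods.
New since 0.10.0, wide DataFrames will now be printed across multiple rows.
Starting in ``0.11.0``, you can pass ``iterator=True``.
New in 0.20.0, this feature is recent enough to keep."""

for lineno, line in find_old_references(sample):
    print(lineno, line)
```

A pattern like this will miss differently worded references, so any hits would still need to be reviewed by hand.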

@gfyoung gfyoung added the Docs label Aug 29, 2017
@gfyoung
Member

gfyoung commented Aug 29, 2017

@topper-123 : I might even consider going to 0.17, in fact, because that's more than three major releases behind the current one (0.21), which makes it super old.

I would leave the 0.11.0 sentence as is in the gotchas, and for io.rst, given its proximity to HDFStore and pandas, semantically and grammatically, it is referring to HDFStore.

@codecov

codecov bot commented Aug 30, 2017

Codecov Report

Merging #17375 into master will decrease coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #17375      +/-   ##
==========================================
- Coverage   91.01%   90.99%   -0.03%     
==========================================
  Files         163      163              
  Lines       49567    49567              
==========================================
- Hits        45113    45103      -10     
- Misses       4454     4464      +10
Flag        Coverage Δ
#multiple   88.77% <ø> (-0.01%) ⬇️
#single     40.25% <ø> (-0.07%) ⬇️

Impacted Files                      Coverage Δ
pandas/io/gbq.py                    25% <0%> (-58.34%) ⬇️
pandas/core/frame.py                97.72% <0%> (-0.1%) ⬇️
pandas/core/indexes/datetimes.py    95.23% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0d676a3...d566383. Read the comment docs.

@codecov

codecov bot commented Aug 30, 2017

Codecov Report

Merging #17375 into master will increase coverage by 0.12%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #17375      +/-   ##
==========================================
+ Coverage   91.01%   91.14%   +0.12%     
==========================================
  Files         163      163              
  Lines       49567    49581      +14     
==========================================
+ Hits        45113    45190      +77     
+ Misses       4454     4391      -63
Flag        Coverage Δ
#multiple   88.92% <ø> (+0.15%) ⬆️
#single     40.25% <ø> (-0.06%) ⬇️

Impacted Files                    Coverage Δ
pandas/io/gbq.py                  25% <0%> (-58.34%) ⬇️
pandas/core/frame.py              97.72% <0%> (-0.1%) ⬇️
pandas/io/pytables.py             92.79% <0%> (-0.09%) ⬇️
pandas/core/indexes/multi.py      96.9% <0%> (-0.03%) ⬇️
pandas/io/formats/excel.py        96.65% <0%> (ø) ⬆️
pandas/core/groupby.py            92.21% <0%> (ø) ⬆️
pandas/core/indexes/api.py        98.78% <0%> (ø) ⬆️
pandas/io/formats/printing.py     89.38% <0%> (ø) ⬆️
pandas/io/formats/css.py          100% <0%> (ø) ⬆️
pandas/io/parsers.py              95.46% <0%> (ø) ⬆️
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0d676a3...7d18e06. Read the comment docs.

@shoyer
Member

shoyer commented Aug 30, 2017

I like this. I think the cutoff you've chosen here is a good one for now. v0.12 marks a slow-down in the development pace of pandas (just look at the pace of release tags).

0.17.0 may be >3 versions old, but it's also less than two years old. In general I would go by time instead of number of major versions.

> In gotchas.rst, there is a sentence "As of pandas 0.11, pandas is not 100% thread safe." I haven't altered this, but I presume this is still correct in the newest version of pandas? Then IMO it should be changed to reference a newer version or simply to "pandas is currently not 100% thread safe."

I don't understand what this was originally intended to convey. There are at least two types of thread-safety:

  1. Thread-safe libraries don't crash when run from multiple threads.
  2. Thread-safe data structures have locks to ensure atomic operations on mutable data structures.

Like most Python code, pandas falls in the first category, not the second. This is tested routinely by dask. But not even built-in data structures are thread-safe in the second sense in Python.
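
The second sense can be illustrated with plain Python. This is a minimal sketch using only the standard library; the names `counter`, `lock`, and `add_many` are illustrative and have nothing to do with pandas internals. It shows that the caller has to supply the lock, because the object being mutated does not carry one.

```python
import threading

counter = 0
lock = threading.Lock()


def add_many(n):
    """Increment the shared counter n times."""
    global counter
    for _ in range(n):
        # "counter += 1" is a read-modify-write sequence, not one atomic
        # step, so the lock is what keeps concurrent updates from being lost.
        with lock:
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000, because every increment ran under the lock
```

The same reasoning would apply to code that mutates a shared DataFrame from several threads: any locking has to come from the calling code.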

> In io.rst there is a sentence "0.10.1 of HDFStore can read tables created in a prior version of pandas, ...". I'm not even sure whether the "0.10.1" refers to the version of pandas or of an HDF library, so I left it alone. The paragraph also discusses backwards compatibility, which makes it somewhat relevant to keep around, even if it's an old change.

I'm pretty sure 0.10.1 references the pandas version. HDF5 is on version 1.8.17.

I don't think it's important to mention the version here at this point, since 0.10.1 is quite old.

@jreback
Contributor

jreback commented Aug 30, 2017

> > In io.rst there is a sentence "0.10.1 of HDFStore can read tables created in a prior version of pandas, ...". I'm not even sure whether the "0.10.1" refers to the version of pandas or of an HDF library, so I left it alone. The paragraph also discusses backwards compatibility, which makes it somewhat relevant to keep around, even if it's an old change.

> I'm pretty sure 0.10.1 references the pandas version. HDF5 is on version 1.8.17.

This has nothing to do with the HDF5 standard. This is a reference to the pandas version itself. In any event, it can be removed.

@jreback jreback added this to the 0.21.0 milestone Aug 30, 2017
@@ -251,8 +251,8 @@ replace NaN with some other value using ``fillna`` if you wish).
Flexible Comparisons
~~~~~~~~~~~~~~~~~~~~

-Starting in v0.8, pandas introduced binary comparison methods eq, ne, lt, gt,
-le, and ge to Series and DataFrame whose behavior is analogous to the binary
+Note that Series and DataFrame have the binary comparison methods eq, ne, lt, gt,
Contributor

remove the 'Note that'

Contributor

use double-backticks on all of the eq etc (better readability)
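
For readers unfamiliar with the methods this hunk documents, here is a small sketch of how `eq`, `gt`, and `ge` behave. The frames `df` and `other` and the Series `row` are made-up data, and the example assumes pandas is installed.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
other = pd.DataFrame({"a": [1, 0, 3], "b": [9, 5, 0]})

equal = df.eq(other)    # element-wise ==
greater = df.gt(other)  # element-wise >

# Comparing against a Series aligns its index with the columns by default.
row = pd.Series({"a": 2, "b": 5})
at_least = df.ge(row)

print(equal)
print(greater)
print(at_least)
```

Each call returns a boolean DataFrame of the same shape as `df`.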

@@ -698,7 +698,7 @@ DataFrame in tabular form, though it won't always fit the console width:

print(baseball.iloc[-20:, :12].to_string())

-New since 0.10.0, wide DataFrames will now be printed across multiple rows by
+Note that wide DataFrames will be printed across multiple rows by
Contributor

remove Note that

@@ -856,8 +856,7 @@ DataFrame objects with mixed-type columns, all of the data will get upcasted to
From DataFrame using ``to_panel`` method
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-This method was introduced in v0.7 to replace ``LongPanel.to_long``, and converts
-a DataFrame with a two-level index to a Panel.
+``to_panel`` converts a DataFrame with a two-level index to a Panel.
Contributor

can you add a reference to the section where Panel is deprecated?

Contributor Author

There is a deprecation warning a bit above, so adding it here as well is too much IMO. I did change a note that calls on people to contribute to Panel, though, as that isn't relevant anymore.

@@ -140,7 +140,7 @@ columns:

In [5]: grouped = df.groupby(get_letter_type, axis=1)

-Starting with 0.8, pandas Index objects now support duplicate values. If a
+Note that pandas Index objects support duplicate values. If a
Contributor

remove the Note that
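
As an illustration of the duplicate index values this hunk mentions: the hunk is cut off after "If a", so pairing it with a group-by on the index is an assumption based on the surrounding `groupby` context. The Series `s` is made-up data, and the example assumes pandas is installed.

```python
import pandas as pd

# An index with a repeated label is allowed.
s = pd.Series([1, 2, 3], index=["a", "a", "b"])
print(s.index.is_unique)  # False

# Grouping on the index puts both "a" rows into one group,
# so the aggregated result has unique labels again.
totals = s.groupby(level=0).sum()
print(totals)
```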

@@ -3878,7 +3878,7 @@ create a new table!)
Iterator
++++++++

-Starting in ``0.11.0``, you can pass, ``iterator=True`` or ``chunksize=number_in_a_chunk``
+Note that you can pass ``iterator=True`` or ``chunksize=number_in_a_chunk``
Contributor

remove Note that

@topper-123
Contributor Author

Adjusted according to comments.

Wrt. backwards compatibility of HDFStore and pandas <0.10.1, I propose removing the whole paragraph. If you have data as old as pandas 0.10.1, you should be responsible for looking through the release notes to find the relevant change.


import os
legacy_file_path = os.path.abspath('source/_static/legacy_0.10.h5')

Contributor

actually you can also remove the legacy stuff from the tests as well (and add a small note in the whatsnew)

Contributor Author

Ok.

Is it correct we're talking about the tests/io/test_pytables.py::testHDFStore.test_legacy* functions? (4 tests)

Member

Can we leave test changes for a separate PR please?

Contributor Author

I've made a start on it (#17398), but I have some issues.

One issue is that this doc fragment uses a binary hdf file that needs to be deleted. Can this request be accepted now, so the other will pass? I will do it today and/or tomorrow.

Member

you can remove this part of the doc in the other PR

@topper-123
Contributor Author

The removal of the note about backwards compatibility has been moved to #17404, as those two things are connected through the file legacy_0.10.h5.

IMO this should be ready now to commit.

I will look into cleaning references up to v0.14 or v0.15 as the next step.

-considered to be "NA" in computations. This is no longer the case by
-default; use the ``mode.use_inf_as_na`` option to recover it.
+If you want to consider ``inf`` and ``-inf``
+to be "NA" in computations, you can use the ``mode.use_inf_as_na`` option to archieve it.
Contributor

achieve

Contributor

@jreback jreback left a comment


tiny typo. ping on green.

@topper-123
Contributor Author

Ping, corrected.

@jorisvandenbossche jorisvandenbossche merged commit 1981b67 into pandas-dev:master Sep 2, 2017
@jorisvandenbossche
Member

@topper-123 Thanks a lot, this was a good idea!

jbrockmendel pushed a commit to jbrockmendel/pandas that referenced this pull request Sep 10, 2017
@topper-123 topper-123 deleted the remove_references_to_old_versions branch September 11, 2017 21:10
jowens pushed a commit to jowens/pandas that referenced this pull request Sep 20, 2017
alanbato pushed a commit to alanbato/pandas that referenced this pull request Nov 10, 2017