DOC: Remove versionadded/versionchanged up to 1.5.0 #51071

Open
1 task done
phofl opened this issue Jan 30, 2023 · 11 comments
Labels
Docs, Needs Discussion (requires discussion from core team before further action)

Comments

@phofl
Member

phofl commented Jan 30, 2023

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

We have many versionadded/versionchanged notes in our API docs. This clutters our docstrings quite a bit (see https://pandas.pydata.org/docs/dev/reference/api/pandas.read_csv.html for example) and is not really relevant for most users. I'd suggest removing every note for versions up to 1.4.5 (i.e. 1.4.5 or lower) to clean this up a bit.

We have the version switcher if people want to go back. Also, 2.0 is a major release so I think it's a good opportunity to clean this up.

Documentation problem

See above

Suggested fix for documentation

Remove the notes
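
As a rough illustration of what such a cleanup would have to locate, here is a minimal sketch that scans text for `versionadded`/`versionchanged` directives at or below a cutoff version. The helper names (`parse_version`, `find_old_notes`) and the sample docstring are hypothetical and not taken from the pandas code base; the sketch only reports matches and does not remove anything, since a directive may carry an indented body that also needs handling.

```python
import re

# Matches Sphinx directives such as ".. versionadded:: 1.3.0".
DIRECTIVE = re.compile(
    r"^\s*\.\.\s+(versionadded|versionchanged)::\s*(\d+(?:\.\d+)*)"
)


def parse_version(text):
    """Turn '1.4.5' into (1, 4, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def find_old_notes(source, cutoff):
    """Return (line_number, directive, version) for notes at or below cutoff."""
    limit = parse_version(cutoff)
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = DIRECTIVE.match(line)
        if match and parse_version(match.group(2)) <= limit:
            hits.append((number, match.group(1), match.group(2)))
    return hits


# Made-up docstring fragment, used only to exercise the sketch.
sample = """\
param_a : str
    Hypothetical parameter.

    .. versionchanged:: 1.2.0
        Hypothetical note body.

param_b : str
    .. versionadded:: 1.3.0

param_c : str
    .. versionadded:: 2.0.0
"""

old_notes = find_old_notes(sample, "1.4.5")
for number, directive, version in old_notes:
    print(f"line {number}: {directive} {version}")
```

Changing the `cutoff` argument is the only thing needed to try the other thresholds discussed in this thread.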

phofl added the Docs and Needs Triage labels Jan 30, 2023
@jorisvandenbossche
Member

It's good to clean up those notes regularly, but I would personally keep them a bit longer (so keep notes for more older versions than just 1.5). It seems we currently have such notes starting from 1.0.0, so we could also start by removing only the ones from 1.0 and 1.1?

@phofl
Member Author

phofl commented Jan 30, 2023

I'd at least remove up to 1.3.5. 1.4.x and 1.5.x will cover over a year by the time 2.0 comes out; this should be sufficient. I mean, it's still there if you switch to the 1.5.x docs.

@rhshadrach
Member

> I'd at least remove up to 1.3.5. 1.4.x and 1.5.x will cover over a year by the time 2.0 comes out; this should be sufficient.

I don't agree here, even the latest LTS runtime environment from Databricks is using pandas 1.3.5. In my experience, many users install packages and don't change their versions unless something breaks. This is bad practice in my opinion, but I think it leaves us with users making use of older versions regardless.

@phofl
Member Author

phofl commented Jan 30, 2023

I agree, but you could make this argument for every version we ever released. Do you think someone would upgrade from 1.0 to 2.0 straight? I would expect users to upgrade maybe 2 minor versions at a time but no more.

In general it's a bad idea to upgrade to 2.0 from anything but 1.5; otherwise there's no chance of seeing all the deprecation warnings beforehand.

lithomas1 added the Needs Discussion label and removed the Needs Triage label Jan 31, 2023
@rhshadrach
Member

> I would expect users to upgrade maybe 2 minor versions at a time but no more.

I still see frequent use of 1.3.x. I can't be certain, but I would guess that a good portion of users who look at the documentation do so without switching to the appropriate version; the default page that loads is the most recent version. Where I foresee a bad user experience is users on 1.3.x looking at the documentation of the most recent version and seeing that some argument exists or behaves a certain way when that is not the case in their version. Having the versionchanged et al. notes would give them a way to realize what's going on.

In other words, I don't think the number of versions that get upgraded is the thing that's important; just what versions are in common use currently.

@lithomas1
Member

lithomas1 commented Feb 3, 2023

I can pull download counts for the various pandas versions from the GBQ PyPI dataset to see which versions are still in use.
(Had a small website working before, but my heroku dyno got shut down :( )

Will try to get a chart up.

@lithomas1
Member

lithomas1 commented Feb 4, 2023

Currently the top 10 versions I have are (for downloads today, but the monthly counts are similar):

1.5.3: 18.371%
1.3.5: 15.966%
1.1.5: 11.428%
1.2.5: 5.278%
0.24.2: 4.853%
1.0.5: 3.793%
1.3.4: 3.179%
0.25.3: 2.793%
1.2.1: 2.339%
1.0.1: 2.172%
Other: 29.828%

Source is: https://lithomas1.github.io/platypi/packageinfo.html
(It's pretty slow since it's running Pyodide, so you'll have to be a little patient.)
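
Since the discussion is about minor series (1.3.x, 1.4.x, ...) rather than individual patch releases, it can help to sum the shares per series. The following is a minimal sketch using only the ten figures listed above; the "Other" bucket is left out because its breakdown is not given, so the totals are lower bounds. The names `shares` and `share_by_minor` are chosen here for illustration.

```python
from collections import defaultdict

# Top-10 download shares (percent) as listed above; "Other" is excluded
# because its per-version breakdown is unknown.
shares = {
    "1.5.3": 18.371,
    "1.3.5": 15.966,
    "1.1.5": 11.428,
    "1.2.5": 5.278,
    "0.24.2": 4.853,
    "1.0.5": 3.793,
    "1.3.4": 3.179,
    "0.25.3": 2.793,
    "1.2.1": 2.339,
    "1.0.1": 2.172,
}


def share_by_minor(version_shares):
    """Sum the shares of all patch releases belonging to one minor series."""
    totals = defaultdict(float)
    for version, pct in version_shares.items():
        major, minor, *_ = version.split(".")
        totals[f"{major}.{minor}"] += pct
    return dict(totals)


by_minor = share_by_minor(shares)
for series, pct in sorted(by_minor.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{series}.x: {pct:.3f}%")
```

Note that no 1.4.x release appears in the top ten, so its share cannot be derived from this list.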

@lithomas1
Member

Going by this, I would be +1 on removing up to 1.1.0 and -1 on everything else.

IMO, we should try to set a threshold for things like this (docs versionadded/versionchanged, platform support too). I can write up another PDEP if folks are interested.

@rhshadrach
Member

Agreed. For a PDEP, I would suggest using a constant # of previous versions rather than basing it on usage stats.
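
One possible reading of a "constant number of previous versions" rule, sketched under the assumption that versions are counted as minor series and that notes for the newest `keep` series stay in the docs. The function name `cutoff_for` and the `keep` parameter are hypothetical; the thread does not settle on a definition.

```python
def cutoff_for(minor_series, keep):
    """Return the newest minor series whose notes would be removed, or None.

    minor_series: iterable of (major, minor) tuples for released series.
    keep: how many of the newest series keep their notes (assumed rule).
    """
    ordered = sorted(minor_series)
    removed = ordered[:-keep] if keep > 0 else ordered
    return removed[-1] if removed else None


# pandas 1.x minor series released before 2.0.
released = [(1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5)]

# Keeping notes for the two newest series (1.4 and 1.5) removes up to 1.3.
print(cutoff_for(released, keep=2))
```

A fixed count like this does not depend on download statistics, which is the property suggested in the comment above.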

@jorisvandenbossche
Member

FWIW, given this is only about docs and not actual runtime, I don't think that's worth a PDEP. Just discussing this and writing it up as a guideline in our contributing docs can be sufficient, I think.

(it's something else for how long runtime dependencies are supported, like NEP29)

@lithomas1
Member

Yeah, I would like to formalize things like support for new architectures, version bumps of deps, and dropping architectures as well.

Right now there's ambiguity over things like whether a pyarrow version is too new to set as the minimum, how to determine the threshold for adding platforms (e.g. Alpine is under consideration right now), and whether we can remove things like 32-bit support.

For now, how about we assume people are currently using the last four pandas versions (based on usage details, which may or may not be accurate due to noise from things like CI pulling pandas)?

This would mean we drop versionchanged in docs up to 1.3.5.

4 participants