Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
PDEP-1: Purpose and guidelines for pandas enhancement proposals #47444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PDEP-1: Purpose and guidelines for pandas enhancement proposals #47444
Changes from 8 commits
8bde84f
0b43492
3d9a75b
6d9d34b
a0e6cda
a0d7276
a8295b8
1e408dd
291de8d
2ce2164
ebf1687
05d43a5
d20de1e
9b37d11
55b3887
8c34db0
4f3343b
7c1a725
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to match the PDEP created date? but probably August.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah switch to August :->
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe "Created" instead of "Date"? Since this will typically the date when the process is started, and not when it was eg accepted, "created" might denote that more correctly (and both NEPs and PEPs seem to use that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
? (not fully sure, but "proposal to" sounds like it should be followed by a verb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
? (might be a bit more generic wording, as "developer" sounds more code-centric?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like it would be convenient to have an actual blank template PDEP file to be filled out.
It might also be nice to have a script that generates the next PDEP number for you(I anticipate it might be hard to find the next PDEP number if a lot of PDEPs are submitted in the future).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd personally leave that for later, once we need it. Those sound like good ideas to me, but I wouldn't implement them initially, I would wait to see how things work first. If we merge like PDEP per month, I think checking the last PDEP number before merging is easier than a system to autogenerate them. And for a template, I'd wait to have few actual PDEPs before deciding if it helps, or if PDEPs are too different from one to another.
Does it make sense to you to start simple and iterate later as we have more experience?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to also have a public comment period on PDEP's like tensorflow has with their RFC's?
We should probably also clarify when voting happens(define what proportion of core team members need to consider a PDEP ready before voting like @mroeschke stated) and how long core team members have to vote.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to discuss and hear other opinions. But to me personally, I wouldn't have a voting period or deadline.
I see PDEPs more like ongoing discussions, with feedback and updates, than just a voting process. Also, I assume tensorflow core devs are mainly google employees, so imposing a deadline makes more sense, as they are supposed to be working on the project X hours. But for a mostly volunteer developed project like pandas, I'd leave it more open.
But again, I'd start by seeing how things work, and if we see PDEPs keep open for too long, and seems like having a timeframe should help, surely worth giving it a try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, just having a PDEP is akin to a new process, so having a process for the PDEP is maybe like introducing another meta process at the same time. So in response to this comment and others related to the process, I agree with @datapythonista that to begin with we just discuss whether we want to use PDEPs, what we want out of a PDEP, and roughly what it should contain and iterate on the gaps over time. The only caveat here, is that once someone has spent effort preparing a PDEP and it is approved that we don't have "blockers" implementing, so that does affect the approval process to some degree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i agree - we need a voting process here - we could emulate that of NEP which is pretty simple
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what Marc proposed above is to defer a discussion about the decision process (basically our governance model) for a subsequent discussion.
(which sounds good to me to keep the discussion manageable)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am not sure this is true - sure a lot will easily be accepted but some might be controversial and ultimately not accepted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah. maybe just qualify this by changing
By default, we expect a PDEP will be accepted
to something likeBy default, we expect a PDEP (with some preliminary discussion) will be accepted
before this text we have...
so the PEP is part of the detailed documentation and an issue should perhaps be the initial part of the discussion. (just as I would normally expect say a bug fix PR to have an associated issue)
and we also have
I think that maybe issues should always be opened before submitting a PEP, either as a specific issue or an existing issue concluding that a PEP is required. Maybe we also need to ensure that
@pandas-dev/pandas-core
is also notified in these cases.If we have some form of gate for opening a PEP, then most should be accepted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also just leave out the "By default, we expect a PDEP will be accepted" ? To me it doesn't seem to add much, rather than a potentially wrong/confusing message (in general PDEPs are for topics that are not trivial and thus will not always be accepted)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could alsof default to start a PDEP in a "draft" status (or "under discussion", I see there is a section for those)?
To me that seems to better signal the actual status (when going through the diff of this one, I wondered why it was saying " accepted" since we are still discussing the proposal). When we decide to merge, it's then only a small change to update the status before merging. And for larger PDEPs it might also be useful to merge it already in draft status (to have a rendered version to link to, like they do with PEPs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me.
Regarding merging PDEPs as draft, feels like it could make discussions more difficult. Like where to have the discussion once the PR is merged. Via new PRs? Or in a second PR, not being able to comment in parts not in the diff. But I guess in some cases it can be useful, maybe a PR with different parts, and merging as a Draft once the first part has been discussed, and before working on other parts.
I'll update the document if there are no objections to using Under discussion status (or Draft if people prefer this name) until PR is Accepted. And to consider merging PRs before they are accepted when it's useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO I think the current guidance stated here is clear as long as a PDEP is an open PR, it should be considered "under discussion" and a draft PDEP can be a draft PR for example. I agree with @datapythonista that merging intermediate states of a PDEP would fragment the discussion and there shouldn't be more than one page/PR/location where a PDEP can be discussed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will probably have to see how it goes in practice. For many PDEPs a single PR with proposal, discussion, acceptance all in one will quite likely be sufficient.
But I can imagine for larger/longer-running discussions (I am thinking eg about the copy/view proposal), that it could be useful to already have a hosted version of PDEP that is still under discussion.
As some references: numpy seems to already merge NEP PRs if they are still under discussion (eg numpy/numpy#18456), but they also have substantial discussion on the mailing list. Similarly in Python, PEPs are merged quickly, but also there the discussion is mostly happening on mailing list or discourse. That's of course different with the current way in pandas where most discussion typically happens on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe reword "will" to something else? This language makes it feel as if all PDEP's will be accepted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The word "accepted" seems a bit strange here, since this is about PDEPs that are not accepted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to contrast rejected PDEPs against accepted PDEPs; i.e. "even rejected PDEPs are useful". I think the wording is okay, but maybe "Rejected PDEPs are just as useful as accepted PDEPs..." would make it more clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I missed the first "as" in the sentence, so I read it as "they are useful as accepted PDEP", instead of "they are as useful as accepted PDEPs", hence the confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably clarify when a PDEP can be revised v.s. when a new PDEP has to be submitted superseding the old PDEP.