Commit 80001a0

noatamir, Dr-Irv, MarcoGorelli, rhshadrach, and jorisvandenbossche authored
PDEP-1 first revision (scope) (#51417)
Co-authored-by: Irv Lustig <[email protected]>
Co-authored-by: Marco Edward Gorelli <[email protected]>
Co-authored-by: Richard Shadrach <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Co-authored-by: Matthew Roeschke <[email protected]>
Co-authored-by: Terji Petersen <[email protected]>
Co-authored-by: Marc Garcia <[email protected]>
Co-authored-by: Simon Hawkins <[email protected]>
1 parent a80cba4 commit 80001a0

1 file changed

+98
-22
lines changed

web/pandas/pdeps/0001-purpose-and-guidelines.md

+98-22
Original file line number | Diff line number | Diff line change
@@ -2,25 +2,33 @@
22

33
- Created: 3 August 2022
44
- Status: Accepted
5-
- Discussion: [#47444](https://github.com/pandas-dev/pandas/pull/47444)
6-
- Author: [Marc Garcia](https://github.com/datapythonista)
7-
- Revision: 1
5+
- Discussion: [#47444](https://github.com/pandas-dev/pandas/pull/47444),
6+
[#51417](https://github.com/pandas-dev/pandas/pull/51417)
7+
- Author: [Marc Garcia](https://github.com/datapythonista),
8+
[Noa Tamir](https://github.com/noatamir)
9+
- Revision: 2
810

911
## PDEP definition, purpose and scope
1012

1113
A PDEP (pandas enhancement proposal) is a proposal for a **major** change in
1214
pandas, in a similar way as a Python [PEP](https://peps.python.org/pep-0001/)
1315
or a NumPy [NEP](https://numpy.org/neps/nep-0000.html).
1416

15-
Bug fixes and conceptually minor changes (e.g. adding a parameter to a function)
16-
are out of the scope of PDEPs. A PDEP should be used for changes that are not
17-
immediate and not obvious, and are expected to require a significant amount of
18-
discussion and require detailed documentation before being implemented.
19-
20-
PDEP are appropriate for user facing changes, internal changes and organizational
21-
discussions. Examples of topics worth a PDEP could include moving a module from
22-
pandas to a separate repository, a refactoring of the pandas block manager or
23-
a proposal of a new code of conduct.
17+
Bug fixes and conceptually minor changes (e.g. adding a parameter to a function) are out of the
18+
scope of PDEPs. A PDEP should be used for changes that are not immediate and not obvious, when
19+
everybody in the pandas community needs to be aware of the possibility of an upcoming change.
20+
Such changes require detailed documentation before being implemented and frequently lead to a
21+
significant discussion within the community.
22+
23+
PDEPs are appropriate for user facing changes, internal changes and significant discussions.
24+
Examples of topics worth a PDEP could include substantial API changes, breaking behavior changes,
25+
moving a module from pandas to a separate repository, or a refactoring of the pandas block manager.
26+
It is not always trivial to know which issue has enough scope to require the full PDEP process.
27+
Some simple API changes have sufficient consensus among the core team, and minimal impact on the
28+
community. On the other hand, if an issue becomes controversial, i.e. it generated a significant
29+
discussion, one could suggest opening a PDEP to formalize and document the discussion, making it
30+
easier for the wider community to participate. For context, see
31+
[the list of issues that could have been a PDEP](#List-of-issues).
2432

2533
## PDEP guidelines
2634

@@ -40,11 +48,11 @@ consider when writing a PDEP are:
4048

4149
### PDEP authors
4250

43-
Anyone can propose a PDEP, but in most cases developers of pandas itself and related
44-
projects are expected to author PDEPs. If you are unsure if you should be opening
45-
an issue or creating a PDEP, it's probably safe to start by
46-
[opening an issue](https://github.com/pandas-dev/pandas/issues/new/choose), which can
47-
be eventually moved to a PDEP.
51+
Anyone can propose a PDEP, but a core member should be engaged to advise on a proposal made by
52+
non-core contributors. To submit a PDEP as a community member, please propose the PDEP concept on
53+
[an issue](https://github.com/pandas-dev/pandas/issues/new/choose), and find a pandas team
54+
member to collaborate with. They can advise you on the PDEP process and should be listed as an
55+
advisor on the PDEP when it is submitted to the PDEP repository.
4856

4957
### Workflow
5058

@@ -63,8 +71,8 @@ Proposing a PDEP is done by creating a PR adding a new file to `web/pdeps/`.
6371
The file is a markdown file, you can use `web/pdeps/0001.md` as a reference
6472
for the expected format.
6573

66-
The initial status of a PDEP will be `Status: Under discussion`. This will be changed
67-
to `Status: Accepted` when the PDEP is ready and have the approval of the core team.
74+
The initial status of a PDEP will be `Status: Under discussion`. This will be changed to
75+
`Status: Accepted` when the PDEP is ready and has the approval of the core team.
6876

6977
#### Accepted PDEP
7078

@@ -98,7 +106,7 @@ PDEPs, since there are discussions that are worth having, and decisions about
98106
changes to pandas being made. They will be merged with `Status: Rejected`, so
99107
there is visibility on what was discussed and what was the outcome of the
100108
discussion. A PDEP can be rejected for different reasons, for example good ideas
101-
that aren't backward-compatible, and the breaking changes aren't considered worth
109+
that are not backward-compatible, and the breaking changes are not considered worth
102110
implementing.
103111

104112
#### Invalid PDEP
@@ -111,7 +119,7 @@ good as an accepted PDEP, but where the final decision was to not implement the
111119

112120
## Evolution of PDEPs
113121

114-
Most PDEPs aren't expected to change after accepted. Once there is agreement in the changes,
122+
Most PDEPs are not expected to change after they are accepted. Once there is agreement on the changes,
115123
and they are implemented, the PDEP will be only useful to understand why the development happened,
116124
and the details of the discussion.
117125

@@ -123,6 +131,74 @@ be edited, its `Revision: X` label will be increased by one, and a note will be
123131
to the `PDEP-N history` section. This will let readers understand that the PDEP has
124132
changed and avoid confusion.
125133

134+
## <a id="List-of-issues"></a> List of issues that could have been PDEPs for context
135+
### Clear examples for potential PDEPs:
136+
137+
- Adding a new parameter to many existing methods, or deprecating one in many places. For example:
138+
- The `numeric_only` deprecation ([GH-28900][28900]) affected many methods and could have been a PDEP.
139+
- Adding a new data type has impact on a variety of places that need to handle the data type.
140+
Such wide-ranging impact would require a PDEP. For example:
141+
- `Categorical` ([GH-7217][7217], [GH-8074][8074]), `StringDtype` ([GH-8640][8640]), `ArrowDtype`
142+
- A significant (breaking) change in existing behavior. For example:
143+
- Copy/view changes ([GH-36195][36195])
144+
- Support of new Python features with a wide impact on the project. For example:
145+
- Supporting typing within pandas vs. creation of `pandas-stubs` ([GH-43197][43197],
146+
[GH-45253][45253])
147+
- New required dependency.
148+
- Removing module from the project or splitting it off to a separate repository:
149+
- Moving rarely used I/O connectors to a separate repository ([GH-28409][28409])
150+
- Significant changes to contributors' processes are not going to have an impact on users, but
151+
they do benefit from structured discussion among the contributors. For example:
152+
- Changing the build system to meson ([GH-49115][49115])
153+
154+
### Borderline examples:
155+
Small changes to core functionality, such as `DataFrame` and `Series`, should always be
156+
considered as a PDEP candidate as it will likely have a big impact on users. But the same types
157+
of changes in other functionalities would not be good PDEP candidates. That said, any discussion,
158+
no matter how small the change, which becomes controversial is a PDEP candidate. Consider if more
159+
attention and/or a formal decision-making process would help. Following are some examples we
160+
hope can help clarify our meaning here:
161+
162+
- API breaking changes, or discussion thereof, could be a PDEP. For example:
163+
- `value_counts` result rename ([GH-49497][49497]). The scope does not justify a PDEP at first, but later a
164+
discussion about whether it should be executed as a breaking change or with deprecation
165+
emerges, which could benefit from the PDEP process.
166+
- Adding new methods or parameters to an existing method typically will not require a PDEP for
167+
non-core features. For example:
168+
- Both `dropna(percentage)` ([GH-35299][35299]), and `Timestamp.normalize()` ([GH-8794][8794])
169+
would not have required a PDEP.
170+
- On the other hand, `DataFrame.assign()` might. While it is a single method without backwards
171+
compatibility concerns, it is also a core feature and the discussion should be highly visible.
172+
- Deprecating or removing a single method would not require a PDEP in most cases.
173+
- That said, `DataFrame.append` ([GH-35407][35407]) is an example of deprecations on core
174+
features that would be a good candidate for a PDEP.
175+
- Changing the default value of parameters in a core pandas method is another edge case. For
176+
example:
177+
- Such changes for `dropna` in `DataFrame.groupby` and `Series.groupby` could be a PDEP.
178+
- New top level modules and/or exposing internal classes. For example:
179+
- Add `pandas.api.typing` ([GH-48577][48577]) is relatively small and would not necessarily
180+
require a PDEP.
181+
182+
126183
### PDEP-1 History
127184

128-
- 3 August 2022: Initial version
185+
- 3 August 2022: Initial version ([GH-47938][47938])
186+
- 15 February 2023: Version 2 ([GH-51417][51417]) clarifies the scope of PDEPs and adds examples
187+
188+
[7217]: https://github.com/pandas-dev/pandas/pull/7217
189+
[8074]: https://github.com/pandas-dev/pandas/issues/8074
190+
[8640]: https://github.com/pandas-dev/pandas/issues/8640
191+
[36195]: https://github.com/pandas-dev/pandas/issues/36195
192+
[43197]: https://github.com/pandas-dev/pandas/issues/43197
193+
[45253]: https://github.com/pandas-dev/pandas/issues/45253
194+
[49497]: https://github.com/pandas-dev/pandas/issues/49497
195+
[35299]: https://github.com/pandas-dev/pandas/issues/35299
196+
[8794]: https://github.com/pandas-dev/pandas/issues/8794
197+
[6249]: https://github.com/pandas-dev/pandas/issues/6249
198+
[48577]: https://github.com/pandas-dev/pandas/issues/48577
199+
[49115]: https://github.com/pandas-dev/pandas/pull/49115
200+
[28409]: https://github.com/pandas-dev/pandas/issues/28409
201+
[47938]: https://github.com/pandas-dev/pandas/pull/47938
202+
[51417]: https://github.com/pandas-dev/pandas/pull/51417
203+
[28900]: https://github.com/pandas-dev/pandas/issues/28900
204+
[35407]: https://github.com/pandas-dev/pandas/issues/35407
