CLN: replace usage internally of .iteritems with .items #26114

topper-123 · 2019-04-16T23:09:29Z

xref CLN: remove PY2 #25725
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

Now that Python2 is being removed, .iteritems should be deprecated, and .items should be used instead.

EDIT: This PR ended up not deprecating .iteritems, but only change its usage internally with using .items

topper-123 · 2019-04-17T02:03:41Z

The error is unrelated to this PR:

CondaHTTPError: HTTP 504 GATEWAY_TIMEOUT for url <https://conda.anaconda.org/c3i_test/noarch/repodata.json>
Elapsed: 00:59.894676
CF-RAY: 4c8a8f7d1f385558-ORD
A remote server error occurred when trying to retrieve this URL.
A 500-type error (e.g. 500, 501, 502, 503, etc.) indicates the server failed to
fulfill a valid request.  The problem may be spurious, and will resolve itself if you
try your request again.  If the problem persists, consider notifying the maintainer
of the remote server.

codecov · 2019-04-17T03:36:02Z

Codecov Report

Merging #26114 into master will decrease coverage by 0.01%.
The diff coverage is 90%.

@@            Coverage Diff             @@
##           master   #26114      +/-   ##
==========================================
- Coverage   91.99%   91.97%   -0.02%     
==========================================
  Files         175      175              
  Lines       52384    52392       +8     
==========================================
+ Hits        48189    48190       +1     
- Misses       4195     4202       +7

Flag	Coverage Δ
#multiple	`90.53% <90%> (-0.01%)`	⬇️
#single	`40.73% <30%> (-0.14%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/panel.py	`35.43% <0%> (ø)`	⬆️
pandas/core/reshape/pivot.py	`96.54% <100%> (ø)`	⬆️
pandas/io/json/table_schema.py	`98.3% <100%> (ø)`	⬆️
pandas/core/indexes/multi.py	`95.61% <100%> (ø)`	⬆️
pandas/plotting/_core.py	`83.76% <100%> (ø)`	⬆️
pandas/core/util/hashing.py	`98.4% <100%> (ø)`	⬆️
pandas/core/reshape/reshape.py	`99.56% <100%> (ø)`	⬆️
pandas/core/strings.py	`98.86% <100%> (ø)`	⬆️
pandas/core/sparse/frame.py	`95.49% <100%> (ø)`	⬆️
pandas/io/formats/style.py	`96.68% <100%> (ø)`	⬆️
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 68b1da7...397bf66. Read the comment docs.

codecov · 2019-04-17T03:36:05Z

Codecov Report

Merging #26114 into master will increase coverage by 0.07%.
The diff coverage is 90%.

@@            Coverage Diff             @@
##           master   #26114      +/-   ##
==========================================
+ Coverage   91.79%   91.86%   +0.07%     
==========================================
  Files         180      174       -6     
  Lines       50934    50700     -234     
==========================================
- Hits        46756    46577     -179     
+ Misses       4178     4123      -55

Flag	Coverage Δ
#multiple	`90.4% <90%> (-0.08%)`	⬇️
#single	`41.78% <30%> (-0.32%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/panel.py	`17.61% <0%> (+1.74%)`	⬆️
pandas/core/reshape/pivot.py	`96.52% <100%> (ø)`	⬆️
pandas/io/json/table_schema.py	`98.3% <100%> (ø)`	⬆️
pandas/core/indexes/multi.py	`95.67% <100%> (+0.81%)`	⬆️
pandas/plotting/_core.py	`83.77% <100%> (-5.61%)`	⬇️
pandas/core/util/hashing.py	`98.41% <100%> (ø)`	⬆️
pandas/core/reshape/reshape.py	`99.56% <100%> (ø)`	⬆️
pandas/core/strings.py	`98.92% <100%> (-0.02%)`	⬇️
pandas/core/sparse/frame.py	`95.63% <100%> (+2.06%)`	⬆️
pandas/io/formats/style.py	`96.68% <100%> (-0.47%)`	⬇️
... and 115 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c407b73...4fec810. Read the comment docs.

jreback · 2019-04-17T12:47:28Z

yeah I guess we need to deprecate this. let me have a look.

jorisvandenbossche · 2019-04-17T12:57:17Z

Is it needed to deprecate this?

I understand it that from the stdlib python 2/3 point of view, it makes sense: items + iteritems were originally modelled after dict's items and iteritems methods, and now with python 3 only, we only need to keep items.

However, leaving that aspect aside, from a consistency within pandas, I think having the methods iteritems / itertuples / iterrows is also nice. And I personally prefer to keep this consistency compared to the one with dict.

topper-123 · 2019-04-17T14:42:12Z

@jorisvandenbossche

iterrows and itertuples are code smells really and if encountered in a code base, the instinct should be to try to refactor into something vectorized.

iteritems/items OTOH is idiomatic and useful for a lot of problem types. So I don't have a problem with distancing these methods from each other by just deprecating iteritems. The grouping of these methods is deceptional and if you've learned that iteritems is a good method you may be lead to believe that iterrows and itertuples can be used in a similar manner, but really you shouldn't in 99% of the cases.

The dict argument is also important IMO, as having DataFrames/Series be dict-like is a nice pedagogical concept, and keepingiteritems will require more explanation for the user (it's a python2 concept used to avoid returning lists, but that we chose to keep in order to keep similarity with iterrows and itertuples (but don't use those, because...), but it's really the same as items, so just use items ...).

So, I think removing iteritems will make the basic pandas concepts a bit easier to learn/use and will encourage healthier habits for non-expert users.

jreback · 2019-04-20T18:44:32Z

I actually like this deprecation; it clearly delineates that that we are iterating over columns, rather than the row-iterators; plus it does make sense post python 2 for more compatibility in Series with dict.

@jorisvandenbossche ?

WillAyd

I'm on board with this one

jorisvandenbossche · 2019-05-03T05:58:36Z

My main problem with deprecating this is that we have consistently documented to use iteritems() instead of items() (in the user guide, in the docstrings, in see also's, ... even the docstring of items() uses iteritems() in the example. items() is basically not mentioned a single time). So ideally, we should have thought about this much earlier to update that ...
Because now we would want to deprecate that method that is documented in favor of an alias probably almost no one uses.

topper-123 · 2019-05-03T15:28:35Z

A solution to your concern could be to keep the deprecations in 0.25 in v1.0 (+ maybe 2 minor versions)? That would not be unreasonable anyway for changes so close to 1.0.

Another alternative could be to make items the "advertised" method in 0.25, and just have a warning text in the iteritems doc string about a future deprecation, without actually deprecating it? This would be kind of like we did with tolist when to_list was added.

Alternatively IMO advertising DataFrames/Seried having a dict-like interface should probably be de-emphasised. E.g. the phrase here

Consistent with the dict-like interface, iteritems() iterates through key-value pairs

isn't really true for Pandas on Python3 (while it was true for Pandas on Python2).

WillAyd · 2019-05-10T03:53:16Z

@jorisvandenbossche AFAICT there's only a handful of references to iteritems in the user guide (specifically four in the basics file) so I would think this is a pretty easy deprecation? Bundling it with dropping Python2 also seems to make a lot of sense from an API perspective.

If you feel strongly against this then something we can discuss at the sprint, but just wanted to offer that counterpoint of this I think not being too hard of a deprecation

simonjayhawkins · 2019-05-31T15:09:12Z

@topper-123 can you merge master.

Conflicting files
doc/source/whatsnew/v0.25.0.rst
pandas/core/generic.py
pandas/core/indexes/multi.py

jreback

small comment, otherwise lgtm.

jreback · 2019-06-01T14:53:24Z

doc/source/getting_started/basics.rst

@@ -1524,10 +1524,10 @@ To iterate over the rows of a DataFrame, you can use the following methods:

    df

-iteritems
+items
 ~~~~~~~~~


this needs to be same length as the header

jreback · 2019-06-01T14:54:37Z

@jorisvandenbossche are you still -1 on this?

WillAyd · 2019-07-01T16:46:52Z

@topper-123 can you merge master?

@jorisvandenbossche still objecting to this?

jreback · 2019-07-03T12:16:17Z

@TomAugspurger what do you think about this? I am +0 on this, @jorisvandenbossche ?

TomAugspurger · 2019-07-03T12:42:34Z

Maybe -0. I don’t have a strong preference either :)

…

On Jul 3, 2019, at 07:16, Jeff Reback ***@***.***> wrote: @TomAugspurger what do you think about this? I am +0 on this, @jorisvandenbossche ? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

jorisvandenbossche · 2019-07-03T13:39:26Z

Sorry for the slow reply here. I don't feel stronly about it, so you can count me as a -0 as well (or -0.2 :-))

Basically I feel still the way as I said above (#26114 (comment)), that it feels "strange" to deprecate the method that we have always documented in favor of something that was basically not mentioned in the docs.

(BTW, apart from actually deprecation, we could also already start with changing the docs)

topper-123 · 2019-07-10T12:39:06Z

I've taken out the proposed deprecation.

I would like to get the rest of the PR in though, so internally we use items rather than iteritems. If no one objects, I'll merge the PR in a few days (without deprecating iteritems).

TomAugspurger · 2019-07-10T14:51:44Z

Doc-only for now sounds good.

…

On Wed, Jul 10, 2019 at 6:39 AM Terji Petersen ***@***.***> wrote: I've taken out the proposed deprecation. I would like to get the rest of the PR in though, so internally we use items rather than iteritems. If no one objects, I'll merge the PR in a few days (without deprecating iteritems). — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#26114?email_source=notifications&email_token=AAKAOIXFJTMFXRMXSHAJ2N3P6XJ7HA5CNFSM4HGPQJ52YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODZTKKCQ#issuecomment-510043402>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAKAOIW6U52OV3HEQQ2XMQTP6XJ7HANCNFSM4HGPQJ5Q> .

jreback

do we need a note in whatsnew? (just saying this is now not recommended)?
can we add a code_check to prohitbit this?

jreback · 2019-07-10T16:31:06Z

doc/source/reference/series.rst

   Series.items
+   Series.iteritems


did you mean to add this back?

Yes, as we’re not deprecating iteritems anymore?

EDIT: iteritems, not itertools....

jreback · 2019-07-10T16:31:51Z

pandas/core/frame.py

@@ -826,6 +826,11 @@ def iteritems(self):
            for i, k in enumerate(self.columns):
                yield k, self._ixs(i, axis=1)

+    @Appender(items.__doc__)
+    def iteritems(self):
+        for key, value in self.items():


you don't want to return self.items() here?

Yeah, I've changed that.

The issue was that items is a generator, while iteritems is a function, that returns a generator. So flake8 requires the items doc string to use "Yields" and the iteritems doc string to use "Returns".

This is a bit silly in either case, and it feels off to do stuff just to satisfy a linter.

jreback · 2019-07-10T18:42:18Z

cool thanks, ping on green.

topper-123 · 2019-07-10T19:13:07Z

* do we need a note in whatsnew? (just saying this is now not recommended)?

* can we add a code_check to prohibit this?

I can look into these two in a follow-up.

jreback · 2019-07-10T19:16:59Z

thanks @topper-123

jorisvandenbossche

Can you clarify in the iteritems docstring that it is an alias of items, and that items is preferred or something like that ?

jorisvandenbossche · 2019-07-10T20:46:40Z

pandas/core/frame.py

        if self.columns.is_unique and hasattr(self, "_item_cache"):
            for k in self.columns:
                yield k, self._get_item_cache(k)
        else:
            for i, k in enumerate(self.columns):
                yield k, self._ixs(i, axis=1)

+    @Appender(_shared_docs["items"] % "Returns\n        -------")


Both return a generator, so why the different docstring here?

It's the "return " part of the method that makes flake8 think this is not a generator, so it won't allow a "Yields" section in the iteritems doc string.

I think flake8 is tecnically right, but it makes for uglier code in this case.

On second thought, I think maybe if the Appender is followed by a # noqa comment, we can make flake8 ignore that section. I'll try it tomorrow.

jorisvandenbossche · 2019-07-10T20:48:53Z

pandas/core/generic.py

-        This is index for Series, columns for DataFrame and so on.
+        This is index for Series and columns for DataFrame.
+
+        Returns


Fr the other docstring you used Yields ?

jorisvandenbossche · 2019-07-10T20:50:45Z

pandas/core/generic.py

        """
        for h in self._info_axis:
            yield h, self[h]

+    @Appender(items.__doc__)
+    def iteritems(self):
+        return self.items()


Actually, I think both Series and DataFrame have their own implementation, so this one is nowhere used. So can rather be removed?

I'll look into it,

topper-123 force-pushed the deprecate_iteritems branch from d68d4f8 to 397bf66 Compare April 17, 2019 01:13

topper-123 marked this pull request as ready for review April 17, 2019 01:15

jreback added API Design Deprecate Functionality to remove in pandas labels Apr 17, 2019

jreback added this to the 0.25.0 milestone Apr 20, 2019

WillAyd approved these changes May 3, 2019

View reviewed changes

topper-123 force-pushed the deprecate_iteritems branch 2 times, most recently from 5d4b238 to 86c220a Compare June 1, 2019 11:45

jreback requested changes Jun 1, 2019

View reviewed changes

topper-123 force-pushed the deprecate_iteritems branch 2 times, most recently from 9657a34 to 136c6c0 Compare June 4, 2019 01:05

jreback removed this from the 0.25.0 milestone Jul 2, 2019

topper-123 force-pushed the deprecate_iteritems branch from 136c6c0 to ec3e1a2 Compare July 2, 2019 23:24

topper-123 force-pushed the deprecate_iteritems branch from 4fec810 to f83f906 Compare July 3, 2019 17:00

topper-123 added 2 commits July 10, 2019 07:58

DEPR: deprecate (DataFrame|Series).iteritems

8915c5c

Change header underline etc.

5790132

topper-123 force-pushed the deprecate_iteritems branch 3 times, most recently from 0de0e84 to 556dcc0 Compare July 10, 2019 09:57

remove deprecation of iteritems

5307152

topper-123 force-pushed the deprecate_iteritems branch from 556dcc0 to 5307152 Compare July 10, 2019 12:18

pandas-dev deleted a comment from pep8speaks Jul 10, 2019

topper-123 added this to the 0.25.0 milestone Jul 10, 2019

topper-123 changed the title ~~DEPR: deprecate DataFrame.iteritems and Series.iteritems~~ CLN: replace usage internally of .iteritems with .items Jul 10, 2019

topper-123 added Clean and removed API Design Deprecate Functionality to remove in pandas labels Jul 10, 2019

jreback requested changes Jul 10, 2019

View reviewed changes

Change iteritems doc string

71c9aca

jreback approved these changes Jul 10, 2019

View reviewed changes

jreback merged commit 472af55 into pandas-dev:master Jul 10, 2019

topper-123 deleted the deprecate_iteritems branch July 10, 2019 19:32

jorisvandenbossche reviewed Jul 10, 2019

View reviewed changes

jorisvandenbossche mentioned this pull request Aug 12, 2019

DOC: Use of "Yields" vs "Returns" for documentation of DataFrame.items() and DataFrame.iteritems() #27860

Closed

Uh oh!

CLN: replace usage internally of .iteritems with .items #26114

CLN: replace usage internally of .iteritems with .items #26114

Conversation

topper-123 commented Apr 16, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

topper-123 commented Apr 17, 2019

Uh oh!

codecov bot commented Apr 17, 2019

Codecov Report

Uh oh!

codecov bot commented Apr 17, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

jreback commented Apr 17, 2019

Uh oh!

jorisvandenbossche commented Apr 17, 2019

Uh oh!

topper-123 commented Apr 17, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jreback commented Apr 20, 2019

Uh oh!

WillAyd left a comment

Choose a reason for hiding this comment

Uh oh!

jorisvandenbossche commented May 3, 2019

Uh oh!

topper-123 commented May 3, 2019

Uh oh!

WillAyd commented May 10, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

simonjayhawkins commented May 31, 2019

Uh oh!

jreback left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jreback commented Jun 1, 2019

Uh oh!

WillAyd commented Jul 1, 2019

Uh oh!

jreback commented Jul 3, 2019

Uh oh!

TomAugspurger commented Jul 3, 2019 via email

Uh oh!

jorisvandenbossche commented Jul 3, 2019

Uh oh!

topper-123 commented Jul 10, 2019

Uh oh!

TomAugspurger commented Jul 10, 2019 via email

Uh oh!

jreback left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

topper-123 Jul 10, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jreback commented Jul 10, 2019

Uh oh!

topper-123 commented Jul 10, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jreback commented Jul 10, 2019

Uh oh!

jorisvandenbossche left a comment

Choose a reason for hiding this comment

topper-123 commented Apr 16, 2019 •

edited

Loading

codecov bot commented Apr 17, 2019 •

edited

Loading

topper-123 commented Apr 17, 2019 •

edited

Loading

WillAyd commented May 10, 2019 •

edited

Loading

topper-123 Jul 10, 2019 •

edited

Loading

topper-123 commented Jul 10, 2019 •

edited

Loading