added f strings and typing to frame.py #30021

mck619 · 2019-12-03T23:15:20Z

Replaced old str.format() formatting with f-strings where possible.

Added type hints to duplicated, drop_duplicates, and reset_index.

alimcmaster1 · 2019-12-04T01:36:50Z

Thanks for the PR. Can you run flake8 and fix what it reports? That should clear the GitHub Actions code checks CI failure: https://github.com/pandas-dev/pandas/pull/30021/checks?check_run_id=332186140#step:6:41

alimcmaster1

Thanks for the PR - generally looks good - few minor comments

alimcmaster1 · 2019-12-04T01:38:52Z

pandas/core/frame.py

-                        "({cols:d} != {counts:d})".format(
-                            cols=len(cols), counts=len(counts)
-                        )
+                        "Columns must equal counts " f"({len(cols)} != {len(counts)})"


I would put all of this line inside the f string for readability
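
A minimal sketch of this suggestion, assuming `cols` and `counts` are plain sequences (the sample values below are made up for illustration). The two adjacent literals from the diff and a single f-string produce the same message, but the single f-string reads as one unit:

```python
# Hypothetical sample data; in frame.py these come from the surrounding method.
cols = ["a", "b", "c"]
counts = [1, 2]

# Form in the diff: two adjacent literals that Python concatenates at compile time.
split_msg = "Columns must equal counts " f"({len(cols)} != {len(counts)})"

# Form suggested in the review: the whole message inside one f-string.
single_msg = f"Columns must equal counts ({len(cols)} != {len(counts)})"

print(single_msg)  # Columns must equal counts (3 != 2)
```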

alimcmaster1 · 2019-12-04T01:39:48Z

pandas/core/frame.py

-                    raise TypeError(
-                        err_msg + " Received column of type {}".format(type(col))
-                    )
+                    raise TypeError(err_msg + f" Received column of type {type(col)}")


Could just include the err_msg inside f string
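
As a rough illustration of this comment, the sketch below interpolates `err_msg` instead of concatenating it with `+`. The text of `err_msg` and the value of `col` are placeholders, not the real values from frame.py:

```python
# Placeholder values, assumed for illustration only.
err_msg = "The parameter must be a valid column key."
col = 3.5

# Form in the diff: string concatenation followed by an f-string.
concatenated = err_msg + f" Received column of type {type(col)}"

# Form suggested in the review: err_msg interpolated into one f-string.
single = f"{err_msg} Received column of type {type(col)}"

print(single)
```

Note that this only reads well when `err_msg` is a variable defined elsewhere; it is not a reason to rewrite the literal that defines `err_msg` itself.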

WillAyd

Looks good - some minor edits

pep8speaks · 2019-12-04T04:46:25Z

Hello @mck619! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-12-06 00:05:07 UTC

simonjayhawkins · 2019-12-04T11:25:34Z

pandas/core/frame.py

+                        f"{err_msg} Received column of type {type(col)}"
+                        "array, or a list containing only valid column keys and "
+                        f"one-dimensional arrays. Received column of type {type(col)}"


why's this changed?

My mistake, I misinterpreted a previous comment

simonjayhawkins · 2019-12-04T11:30:17Z

pandas/core/frame.py

@@ -4423,7 +4413,7 @@ def _maybe_casted_values(index, labels=None):
                            raise ValueError(
                                "col_fill=None is incompatible "
                                "with incomplete column name "
-                                "{}".format(name)
+                                f"{name}"


this looks odd and redundant as an f-string. perhaps combine with string above
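
One way to read this suggestion, sketched with a made-up `name` value: fold the interpolation into the preceding literal so that no line consists only of `f"{name}"`.

```python
# Hypothetical incomplete column name, used only for illustration.
name = ("a", "")

# Form in the diff: the last literal is an f-string holding nothing but {name}.
before = (
    "col_fill=None is incompatible "
    "with incomplete column name "
    f"{name}"
)

# Combined with the string above it, as the review suggests.
after = (
    "col_fill=None is incompatible "
    f"with incomplete column name {name}"
)

print(after)
```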

simonjayhawkins · 2019-12-04T11:35:52Z

pandas/core/frame.py

-                    "Generating numeric_only data with filter_type {f}"
-                    "not supported.".format(f=filter_type)
+                    f"Generating numeric_only data with filter_type {filter_type}"
+                    "not supported."


this appears to have a missing space in the message. do we have a test for this message?
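
The missing space comes from implicit concatenation of adjacent literals: nothing is inserted between them. A small sketch, with an assumed `filter_type` value, shows the problem and one possible fix:

```python
# Assumed value for illustration; the real one is passed into the method.
filter_type = "bool"

# As written in the diff: the first literal has no trailing space.
broken = (
    f"Generating numeric_only data with filter_type {filter_type}"
    "not supported."
)

# One possible fix: add the trailing space before the line break.
fixed = (
    f"Generating numeric_only data with filter_type {filter_type} "
    "not supported."
)

print(broken)  # Generating numeric_only data with filter_type boolnot supported.
print(fixed)   # Generating numeric_only data with filter_type bool not supported.
```

A test that matches the full message text, rather than only the exception type, would catch this kind of regression.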

simonjayhawkins · 2019-12-04T11:38:15Z

pandas/core/frame.py

@@ -8170,4 +8168,4 @@ def _from_nested_dict(data):


 def _put_str(s, space):
-    return "{s}".format(s=s)[:space].ljust(space)
+    return f"{s}"[:space].ljust(space)


i'm not a fan of f-strings used like this. could just use str().

simonjayhawkins · 2019-12-04T11:40:42Z

pandas/core/frame.py

-    def drop_duplicates(self, subset=None, keep="first", inplace=False):
+    def drop_duplicates(
+        self,
+        subset: Optional[Union[Hashable, Sequence[Hashable]]] = None,


is None a valid value in a sequence of labels?

It is, check this out:

df = pd.DataFrame({None: [1, 2, 2], 'col1': [1, 2, 3]})
df.drop_duplicates(subset=[None])

outputs:

   NaN  col1
0    1     1
1    2     2

Suggested change

-        subset: Optional[Union[Hashable, Sequence[Hashable]]] = None,
+        subset: Optional[Union[Hashable, Sequence[Optional[Hashable]]]] = None,

should probably be Sequence[Optional[Hashable]] in that case.
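
To make the two roles of `None` concrete, here is a standard-library sketch. `normalize_subset` and the `Label` alias are hypothetical names, not pandas API, and the list check is a simplification of what frame.py does. A bare `None` means "all columns", while `None` inside a list is an ordinary column label:

```python
from typing import Hashable, Optional, Sequence, Tuple, Union

# Hypothetical alias: a column label that may also be None.
Label = Optional[Hashable]


def normalize_subset(
    subset: Optional[Union[Hashable, Sequence[Label]]],
    columns: Sequence[Label],
) -> Tuple[Label, ...]:
    """Hypothetical helper: turn ``subset`` into a tuple of column labels."""
    if subset is None:
        # A bare None selects every column.
        return tuple(columns)
    if isinstance(subset, list):
        # A list is taken as a collection of labels, which may include None.
        return tuple(subset)
    # Anything else is treated as a single label in this sketch.
    return (subset,)


columns = [None, "col1"]
print(normalize_subset(None, columns))    # (None, 'col1')
print(normalize_subset([None], columns))  # (None,)
```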

WillAyd · 2019-12-04T18:50:37Z

@mck619 looks like some tricky mypy failures. Flag me down at PyData and I'll help you out in person

simonjayhawkins · 2019-12-04T19:35:15Z

could try this as minimum to make mypy pass...

diff --git a/pandas/core/frame.py b/pandas/core/frame.py
index 6afd64f64..491c86999 100644
--- a/pandas/core/frame.py
+++ b/pandas/core/frame.py
@@ -15,6 +15,7 @@ import itertools
 import sys
 from textwrap import dedent
 from typing import (
+    Any,
     FrozenSet,
     Hashable,
     Iterable,
@@ -25,6 +26,7 @@ from typing import (
     Tuple,
     Type,
     Union,
+    cast,
 )
 import warnings
 
@@ -4192,7 +4194,7 @@ class DataFrame(NDFrame):
         inplace: bool = False,
         col_level: Hashable = 0,
         col_fill: Optional[Hashable] = "",
-    ) -> "DataFrame":
+    ) -> Optional["DataFrame"]:
         """
         Reset the index, or a level of it.
 
@@ -4220,8 +4222,8 @@ class DataFrame(NDFrame):
 
         Returns
         -------
-        DataFrame
-            DataFrame with the new index.
+        DataFrame or None
+            DataFrame with the new index or None if ``inplace=True``.
 
         See Also
         --------
@@ -4386,6 +4388,7 @@ class DataFrame(NDFrame):
                 new_index = self.index.droplevel(level)
 
         if not drop:
+            to_insert: Iterable[Tuple[Any, Optional[Any]]]
             if isinstance(self.index, ABCMultiIndex):
                 names = [
                     (n if n is not None else f"level_{i}")
@@ -4425,6 +4428,8 @@ class DataFrame(NDFrame):
         if not inplace:
             return new_obj
 
+        return None
+
     # ----------------------------------------------------------------------
     # Reindex-based selection methods
 
@@ -4590,7 +4595,7 @@ class DataFrame(NDFrame):
         subset: Optional[Union[Hashable, Sequence[Hashable]]] = None,
         keep: Union[str, bool] = "first",
         inplace: bool = False,
-    ) -> "DataFrame":
+    ) -> Optional["DataFrame"]:
         """
         Return DataFrame with duplicate rows removed.
 
@@ -4612,7 +4617,7 @@ class DataFrame(NDFrame):
 
         Returns
         -------
-        DataFrame
+        DataFrame or None
         """
         if self.empty:
             return self.copy()
@@ -4624,6 +4629,7 @@ class DataFrame(NDFrame):
             (inds,) = (-duplicated)._ndarray_values.nonzero()
             new_data = self._data.take(inds)
             self._update_inplace(new_data)
+            return None
         else:
             return self[-duplicated]
 
@@ -4675,6 +4681,9 @@ class DataFrame(NDFrame):
         ):
             subset = (subset,)
 
+        # needed for mypy since can't narrow types using np.iterable
+        subset = cast(Iterable, subset)
+
         # Verify all columns in subset exist in the queried dataframe
         # Otherwise, raise a KeyError, same as if you try to __getitem__ with a
         # key that doesn't exist.
@@ -6024,6 +6033,8 @@ class DataFrame(NDFrame):
             raise ValueError("columns must be unique")
 
         df = self.reset_index(drop=True)
+        # TODO: use overload to refine return type of reset_index
+        assert df is not None  # needed for mypy
         result = df[column].explode()
         result = df.drop([column], axis=1).join(result)
         result.index = self.index.take(result.index)
diff --git a/pandas/core/reshape/merge.py b/pandas/core/reshape/merge.py
index d671fff56..726a59ca8 100644
--- a/pandas/core/reshape/merge.py
+++ b/pandas/core/reshape/merge.py
@@ -126,7 +126,10 @@ def _groupby_and_merge(
                 on = [on]
 
             if right.duplicated(by + on).any():
-                right = right.drop_duplicates(by + on, keep="last")
+                _right = right.drop_duplicates(by + on, keep="last")
+                # TODO: use overload to refine return type of drop_duplicates
+                assert _right is not None  # needed for mypy
+                right = _right
         rby = right.groupby(by, sort=False)
     except KeyError:
         rby = None

May want to use overloads with drop_duplicates and reset_index to reduce additional asserts and code changes needed elsewhere.

May also want to be more precise with declaration of to_insert

Ideally, also find a way of avoiding the cast.
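
The overload idea could look roughly like the sketch below. `Frame` is a hypothetical stand-in for `DataFrame`, and `typing.Literal` assumes Python 3.8 or newer (older versions would need `typing_extensions`). With these overloads a type checker can infer `Frame` for the default call and `None` for `inplace=True`, so callers would not need the extra asserts:

```python
from typing import List, Literal, Optional, overload


class Frame:
    """Hypothetical stand-in for DataFrame, only to illustrate the overloads."""

    def __init__(self, rows: List[int]) -> None:
        self.rows = rows

    @overload
    def drop_duplicates(self, inplace: Literal[False] = ...) -> "Frame": ...

    @overload
    def drop_duplicates(self, inplace: Literal[True]) -> None: ...

    def drop_duplicates(self, inplace: bool = False) -> Optional["Frame"]:
        # dict.fromkeys keeps the first occurrence and preserves order.
        unique = list(dict.fromkeys(self.rows))
        if inplace:
            self.rows = unique
            return None
        return Frame(unique)


frame = Frame([1, 2, 2, 3])
deduped = frame.drop_duplicates()  # inferred as Frame, so no assert is needed
print(deduped.rows)  # [1, 2, 3]
```

The single runtime implementation keeps the `Optional["Frame"]` annotation; only the overload signatures are visible to callers during type checking.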

WillAyd · 2019-12-06T20:50:50Z

Thanks @mck619 for the PR ! (and @simonjayhawkins for detailed comments)

mck619 · 2019-12-06T21:12:41Z

Thank you @WillAyd and @simonjayhawkins for all your help with my first PR!! Can't wait to contribute in a more helpful and meaningful manner in the future.

Michael Kakehashi added 2 commits December 3, 2019 15:10

added f strings and typing to frame.py

f5b303e

minor fix

c3fd308

alimcmaster1 added the Clean, Code Style, and Typing labels Dec 4, 2019

alimcmaster1 requested changes Dec 4, 2019

Michael Kakehashi added 4 commits December 3, 2019 19:21

Merge branch 'master' of https://github.com/pandas-dev/pandas into fr…

7602461

…ame_typing_fstring

cleaned up f strings, and flack 8 errors per PR comments

3a4c244

fixed return annotation of functions that return a DataFrame

ef87c64

fixed annotation of functions that return a Series

fde23a9

WillAyd requested changes Dec 4, 2019

WillAyd added this to the 1.0 milestone Dec 4, 2019

mck619 and others added 2 commits December 3, 2019 20:46

Update pandas/core/frame.py

cf33998

Co-Authored-By: William Ayd <[email protected]>

Update pandas/core/frame.py

d223c88

Co-Authored-By: William Ayd <[email protected]>

mck619 and others added 5 commits December 3, 2019 20:46

Update pandas/core/frame.py

188410c

Co-Authored-By: William Ayd <[email protected]>

Update pandas/core/frame.py

bfdf696

Co-Authored-By: William Ayd <[email protected]>

Update pandas/core/frame.py

0ecb000

Co-Authored-By: William Ayd <[email protected]>

typing syntax fix

7b52345

more typing syntax fixes

5e7d915

simonjayhawkins reviewed Dec 4, 2019

mck619 and others added 7 commits December 4, 2019 09:23

Update pandas/core/frame.py

70ef860

Co-Authored-By: Simon Hawkins <[email protected]>

fixed fstring with err_msg

997a2e3

Merge branch 'frame_typing_fstring' of https://github.com/mck619/pandas…

2e05e01

… into frame_typing_fstring

Update pandas/core/frame.py

a00c34d

Co-Authored-By: Simon Hawkins <[email protected]>

fstring clean up

099feb6

Merge branch 'frame_typing_fstring' of https://github.com/mck619/pandas…

85909ea

… into frame_typing_fstring

black formatting

18fed32

Michael Kakehashi added 2 commits December 5, 2019 15:42

mypy fixes per Simon's comments

17444ec

doc string fix

1a9c6f0

WillAyd approved these changes Dec 6, 2019

WillAyd merged commit 282a0e4 into pandas-dev:master Dec 6, 2019

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019

added f strings and typing to frame.py (pandas-dev#30021)

3eb4466

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019

added f strings and typing to frame.py (pandas-dev#30021)

7109a45
