added np.timedelta64 for series arithmatic methods #432

Merged
merged 21 commits into from
Nov 23, 2022
ramvikrams
Contributor

Sorry sir, I had to close that draft PR because for some reason I was not able to generate a PR from it.

Collaborator

Dr-Irv left a comment

You also have to modify the operators in TimedeltaSeries

def timedelta64_and_arithmatic_operator() -> None:
    s1 = pd.Series(data=pd.date_range("1/1/2020", "2/1/2020"))
    s2 = pd.Series(data=pd.date_range("1/1/2021", "2/1/2021"))
    check(assert_type((s1 - np.timedelta64(1, "M")), np.timedelta64), np.timedelta64)
Collaborator

Just create a variable td = np.timedelta64(1, "M") and reuse it

check(assert_type((s1 - np.timedelta64(1, "M")), np.timedelta64), np.timedelta64)
check(assert_type((s1 + np.timedelta64(1, "M")), np.timedelta64), np.timedelta64)
check(assert_type(((s1 * np.timedelta64(1, "M")), np.timedelta64), np.timedelta64))
check(assert_type(((s1 / np.timedelta64(1, "M")), np.timedelta64), np.timedelta64))
Collaborator

The results here are wrong, because s1 is a TimestampSeries, so adding or subtracting a np.timedelta64 will produce the same type. You have to try each one out locally and see what the right type is.
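
One quick way to build intuition before trying the pandas operations locally is the scalar analogue in the standard library. The sketch below uses only `datetime` and `timedelta`; that the pandas series types follow the same rules is what the review comments state, and it should still be confirmed by running the real operations.

```python
from datetime import datetime, timedelta

ts = datetime(2020, 1, 1)
td = timedelta(days=1)

# Shifting a timestamp by a duration yields another timestamp.
shifted_forward = ts + td
shifted_back = ts - td

# Subtracting two timestamps yields a duration.
gap = shifted_forward - shifted_back

# Multiplying a timestamp by a duration is not defined.
try:
    ts * td
    mul_allowed = True
except TypeError:
    mul_allowed = False

print(type(shifted_forward).__name__, type(gap).__name__, mul_allowed)
```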

Contributor Author

Sir, it's working fine with addition and subtraction; the only problem is with multiplication and division. Any suggestions for that?

Collaborator

Sir, it's working fine with addition and subtraction; the only problem is with multiplication and division. Any suggestions for that?

Doesn't seem that it is working. Look at the log files from the CI runs (or run locally) and you can then see the error messages and fix the code.

Contributor Author

Yes sir

@ramvikrams
Contributor Author

You also have to modify the operators in TimedeltaSeries

yes sir will do

@ramvikrams
Contributor Author

Done sir

def __radd__(self, pther: Timestamp | TimestampSeries) -> TimestampSeries: ... # type: ignore[override]
def __mul__(self, other: num) -> TimedeltaSeries: ... # type: ignore[override]
def __mul__(self, other: num | np.timedelta64) -> TimedeltaSeries: ... # type: ignore[override]
Collaborator

You should not add np.timedelta64 to TimedeltaSeries.__mul__(), as you can't multiply a series of Timedelta by np.timedelta64.

s3 = s2 - s1
td = np.timedelta64(1, "M")
check(assert_type((s1 - td), TimestampSeries), np.timedelta64)
check(assert_type((s1 + td), pd.Series[Any]), np.timedelta64)
Collaborator

This is incorrect. s1 is TimestampSeries, so you need to add a stub for __add__() in TimestampSeries with arguments Timedelta | np.timedelta64. Might also need a @overload

check(assert_type((s1 - td), TimestampSeries), np.timedelta64)
check(assert_type((s1 + td), pd.Series[Any]), np.timedelta64)
check(assert_type((s1 * td), TimedeltaSeries), np.timedelta64)
check(assert_type((s1 / td), pd.Series[float]), np.timedelta64)
Collaborator

For the above 4 checks, I am surprised they got past pytest, because the checks should be of the form check(assert_type((s1 - td), TimestampSeries), pd.Series, pd.Timestamp) (for example).

check(assert_type((s1 / td), pd.Series[float]), np.timedelta64)
check(
assert_type((s3 - td), TimedeltaSeries),
pd.Series,
Collaborator

add pd.Timedelta to the last argument of check here

check(
assert_type((s3 * td), TimedeltaSeries),
pd.Series,
)
Collaborator

for above 2 checks, add pd.Timedelta to the last argument of check

check(
assert_type((s3 / td), pd.Series[float]),
pd.Series,
)
Collaborator

Add float to the last argument of check. (Might need to be np.float)

Collaborator

Dr-Irv left a comment

Please do poetry run poe test before pushing new changes to make sure things work locally

@@ -1743,6 +1743,8 @@ class TimestampSeries(Series[Timestamp]):
    # ignore needed because of mypy
    @property
    def dt(self) -> TimestampProperties: ... # type: ignore[override]
    @overload
    def __add__(self, other: Timedelta | np.timedelta64) -> TimedeltaSeries: ...
Collaborator

The result here of adding a timedelta to a TimestampSeries is a TimestampSeries.

Contributor Author

sir here there is no TimestampSeries

Collaborator

Look at line 1742

)
check(assert_type((s1 - td), TimestampSeries), pd.Series, pd.Timestamp)
check(assert_type((s1 + td), TimestampSeries), pd.Series, pd.Timestamp)
check(assert_type((s1 * td), TimedeltaSeries), pd.Series, pd.Timestamp)
Collaborator

Multiplying a TimestampSeries by a Timedelta is not allowed, so you have to fix that.

In TimestampSeries, put in a method for __mul__() that takes the two timedelta arguments, and returns Never from typing_extensions

Change the test to assert the type, but do not do the check part.
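
The "returns Never" pattern can be sketched with toy classes. Everything below is hypothetical: `Delta` and `StampSeries` are stand-ins invented for illustration, not pandas-stubs classes, and `typing.NoReturn` is used in place of `Never` from `typing_extensions` only to keep the sketch free of third-party imports (type checkers treat the two as equivalent).

```python
from __future__ import annotations

from typing import NoReturn


class Delta:
    """Hypothetical stand-in for a timedelta-like value."""


class StampSeries:
    """Hypothetical stand-in for a series of timestamps."""

    def __add__(self, other: Delta) -> StampSeries:
        # Shifting timestamps by a duration gives timestamps back.
        return StampSeries()

    def __mul__(self, other: Delta) -> NoReturn:
        # The annotation tells the type checker this call never returns
        # normally; a stub file would have "..." as the body instead.
        raise TypeError("cannot multiply timestamps by a timedelta")
```

Because the call cannot produce a value at runtime, a test for such an operation can only assert the static type; there is nothing for a runtime check to inspect.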

check(assert_type((s1 - td), TimestampSeries), pd.Series, pd.Timestamp)
check(assert_type((s1 + td), TimestampSeries), pd.Series, pd.Timestamp)
check(assert_type((s1 * td), TimedeltaSeries), pd.Series, pd.Timestamp)
check(assert_type((s1 / td), TimestampSeries), pd.Series, pd.Timestamp)
Collaborator

Dividing a TimestampSeries by a Timedelta is not allowed, so you have to fix that.

In TimestampSeries, put in a method for __truediv__() that takes the two timedelta arguments, and returns Never from typing_extensions

Change the test to assert the type, but do not do the check part.

check(assert_type((s1 / td), TimestampSeries), pd.Series, pd.Timestamp)
check(assert_type((s3 - td), TimedeltaSeries), pd.Series, pd.Timedelta)
check(assert_type((s3 + td), TimedeltaSeries), pd.Series, pd.Timedelta)
check(assert_type((s3 * td), TimedeltaSeries), pd.Series, pd.Timedelta)
Collaborator

multiplying a TimedeltaSeries by a timedelta fails, so have to fix that

In TimedeltaSeries, put in a method for __mul__() that takes the two timedelta arguments, and returns Never from typing_extensions

Change the test to assert the type, but do not do the check part.

Contributor Author

ramvikrams Nov 21, 2022

Sir, should a __truediv__() be added in TimedeltaSeries? It is showing errors.

Contributor Author

and sir __mul__() it is still showing errors after adding it.

Collaborator

Sir, should a __truediv__() be added in TimedeltaSeries? It is showing errors.

Yes, you should add that. It is valid to divide TimedeltaSeries by Timedelta or np.timedelta64 to get a Series[float], and to divide by a float to get TimedeltaSeries.
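
The same division rules hold for scalar durations in the standard library, which makes them easy to see in isolation. This is only a scalar analogy using `datetime.timedelta`; it does not exercise the pandas series types discussed here.

```python
from datetime import timedelta

span = timedelta(hours=3)

# duration / duration -> plain float (a ratio)
ratio = span / timedelta(hours=1)

# duration / number -> duration
half = span / 2.0

# duration * duration is not defined
try:
    span * timedelta(hours=1)
    mul_allowed = True
except TypeError:
    mul_allowed = False
```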

Collaborator

and sir __mul__() it is still showing errors after adding it.

Your job to fix them!

Contributor Author

Sir, I am just not able to fix this error: Name "__mul__" already defined on line 1761 [no-redef]. I could not find anything about it on the internet either.

Collaborator

You probably have __mul__() defined in there twice, with different arguments. You need to put a @overload before each def __mul__() statement
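
A minimal sketch of what that looks like, using invented toy classes (`Delta` and `DeltaSeries` are hypothetical, not the pandas-stubs definitions). Each typed variant carries `@overload`, and the variants sit directly next to each other. `typing.NoReturn` stands in for `Never` to avoid a third-party import.

```python
from __future__ import annotations

from typing import NoReturn, Union, overload


class Delta:
    """Hypothetical timedelta-like scalar."""


class DeltaSeries:
    """Hypothetical series of Delta values."""

    # Every variant is decorated, with nothing else between the variants.
    @overload
    def __mul__(self, other: Delta) -> NoReturn: ...
    @overload
    def __mul__(self, other: float) -> DeltaSeries: ...

    # A .py file needs one undecorated implementation after the overloads;
    # a .pyi stub file leaves it out.
    def __mul__(self, other: Union[Delta, float]) -> DeltaSeries:
        if isinstance(other, Delta):
            raise TypeError("cannot multiply by a Delta")
        return DeltaSeries()
```

Leaving one of the two definitions undecorated, or separating them with another method, is the kind of layout that makes a checker see a plain redefinition of the name.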

Contributor Author

Sir, mypy still shows the same error.

Collaborator

Without seeing your code, I can't see what you are doing wrong. You can push and it will fail and I will tell you what you did wrong.

Contributor Author

yes sir

def __mul__(self, other: num) -> TimedeltaSeries: ... # type: ignore[override]
def __sub__( # type: ignore[override]
self, other: Timedelta | TimedeltaSeries | TimedeltaIndex | np.timedelta64
) -> TimedeltaSeries: ...
def __truediv__(self, other: TimedeltaSeries | np.timedelta64) -> Series[float]: ... # type: ignore[override]
@overload
def __mul__(self, other: TimestampSeries | np.timedelta64) -> Never: ... # type: ignore[override]
Collaborator

you need to put all the definitions of the same methods that have overloads together. Move lines 1766-1767 to after line 1761

Contributor Author

thanks sir

Contributor Author

ramvikrams Nov 21, 2022

error: Signature of "__mul__" incompatible with supertype "Series" [override]. This is the last mypy error; it doesn't go away if I add # type: ignore[override] from line number 1760.

Contributor Author

you need to put all the definitions of the same methods that have overloads together. Move lines 1766-1767 to after line 1761

done sir

Collaborator

error: Signature of "__mul__" incompatible with supertype "Series" [override]. This is the last mypy error; it doesn't go away if I add # type: ignore[override] from line number 1760.

It's OK to put in the # type: ignore[override] in this case.

Contributor Author

error: Signature of "__mul__" incompatible with supertype "Series" [override]. This is the last mypy error; it doesn't go away if I add # type: ignore[override] from line number 1760.

It's OK to put in the # type: ignore[override] in this case.

Then it shows this error: error: Unused "type: ignore" comment

Collaborator

check the line numbers that the error is reported on

Contributor Author

ramvikrams Nov 22, 2022

It is on line 1760. The initial error is error: Signature of "__mul__" incompatible with supertype "Series" [override], and after I add # type: ignore[override] the initial error still remains.

Collaborator

Push your current code and I can take a look at the CI logs

@overload
def __mul__(self, other: TimestampSeries | np.timedelta64 | Timedelta | TimedeltaSeries) -> Never: ...
@overload
def __mul__(self, other: num) -> TimedeltaSeries: ...
Collaborator

You need to add # type: ignore[override] on both lines 1761 and 1763, or maybe just one of them, or possibly on lines 1760 and/or 1762. You'll have to experiment.
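
For a non-overloaded method the placement is easy to show with two toy classes (`Base` and `Narrow` are hypothetical names, not from the stubs). The subclass narrows the accepted argument type, which mypy reports as `[override]`, and the comment sits on the line mypy names in the error.

```python
from __future__ import annotations


class Base:
    def __mul__(self, other: object) -> Base:
        return self


class Narrow(Base):
    # Accepting fewer argument types than the parent is what triggers
    # [override]; the ignore comment goes on the reported line.
    def __mul__(self, other: float) -> Narrow:  # type: ignore[override]
        return self


product = Narrow() * 2.0
```

With overloaded methods the reported line may be a decorator line rather than a `def` line, which is why the line number in the error message is the thing to follow. A comment placed on a line that has no error is itself reported as unused when mypy is configured to warn about unused ignores.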

assert_type((s3 - td), TimedeltaSeries) # type: ignore
assert_type((s3 + td), TimedeltaSeries) # type: ignore
assert_type((s3 * td), NoReturn) # type: ignore
assert_type((s3 / td), pd.Series[float]) # type: ignore
Collaborator

Do NOT put # type: ignore in the testing functions.

Contributor Author

ok sir

@ramvikrams
Contributor Author

Done sir, I think it's complete

@@ -1739,6 +1746,9 @@ class TimestampSeries(Series[Timestamp]):
    # ignore needed because of mypy
    @property
    def dt(self) -> TimestampProperties: ... # type: ignore[override]
    def __add__(self, other: TimedeltaSeries | np.timedelta64 | TimestampSeries | Timestamp) -> TimestampSeries: ... # type: ignore[override]
Collaborator

Remove TimestampSeries | Timestamp here as you can't add two timestamps

assert_type((s1 / td), NoReturn) # pyright: ignore
assert_type((s3 - td), TimedeltaSeries)
assert_type((s3 + td), TimedeltaSeries)
assert_type((s3 * td), NoReturn) # pyright: ignore
Collaborator

Can you do the following:

  1. for the 4 cases where you are expecting TimestampSeries or TimedeltaSeries, use check(assert_type( operation, expected_type), expected_type, expected_subtype) where operation is the operation you are testing, expected_type is the type expected, and expected_subtype is pd.Timestamp or pd.Timedelta as appropriate
  2. For the 3 cases where you have NoReturn, can you change that to Never, which you will need to import from typing_extensions
  3. The # pyright: ignore is correct - pyright is doing the right thing here, but mypy is not
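
The two-sided pattern in point 1 can be sketched as follows. `check_sketch` is a hypothetical, simplified helper written for this illustration and operating on a plain list; it is not the project's `check` function, whose real implementation is not shown in this thread.

```python
from __future__ import annotations

from typing import Any, List, Optional

try:
    from typing import assert_type  # available from Python 3.11
except ImportError:
    def assert_type(val: Any, typ: Any) -> Any:  # runtime no-op fallback
        return val


def check_sketch(actual: Any, klass: type, subtype: Optional[type] = None) -> Any:
    """Hypothetical runtime counterpart to the static assert_type check."""
    if not isinstance(actual, klass):
        raise RuntimeError(f"expected {klass!r}, got {type(actual)!r}")
    if subtype is not None and not isinstance(actual[0], subtype):
        raise RuntimeError(f"expected elements of {subtype!r}")
    return actual


values = [1.5, 2.5]
# Static side: the type checker compares the inferred type with List[float].
# Runtime side: check_sketch confirms the container and element classes.
result = check_sketch(assert_type(values, List[float]), list, float)
```

The point of combining both is that `assert_type` only constrains what the stubs claim, while the runtime check confirms that the claim matches what the library actually returns.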

Contributor Author

Sir, regarding points 2 and 3: when I changed it to Never, pyright shows an error and mypy works OK. If we add the # type: ignore, both work fine, so should I add it?
Because the pyright error says "assert_type" mismatch: expected "Never" but received "Unknown".

Collaborator

Yes, you will still need the # pyright: ignore where you had it before

Contributor Author

done sir

assert_type((s3 - td), TimedeltaSeries)
assert_type((s3 + td), TimedeltaSeries)
assert_type((s3 * td), NoReturn) # pyright: ignore
assert_type((s3 / td), pd.Series[float])
Collaborator

use check on this one too

self, other: TimedeltaSeries | np.timedelta64 | TimestampSeries
) -> TimestampSeries: ...
@overload
def __add__(self, other: Timestamp) -> TimestampSeries: ...
Collaborator

You can't add a Timestamp to a TimestampSeries. Make the return type Never here.

Contributor Author

ramvikrams Nov 23, 2022

We have an issue on line number 266 of tests/time_funcs.py if we change this; it's a pyright issue.

Collaborator

On that line, change # TODO both: ignore[operator] to # pyright: ignore

@@ -1035,3 +1038,18 @@ def test_timedelta_range() -> None:
def test_dateoffset_freqstr() -> None:
    offset = DateOffset(minutes=10)
    check(assert_type(offset.freqstr, str), str)


def timedelta64_and_arithmatic_operator() -> None:
Collaborator

the name of any test function must begin with test_. If you make that change, you will see that the tests will fail. They were never getting tested.

You will have to change the tests as well. See example.
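
Why the unprefixed function was silently skipped can be illustrated with a toy model of prefix-based discovery. `collect_sketch` and the namespace dictionary are hypothetical and only imitate the naming rule; they are not how pytest is implemented.

```python
from typing import Callable, Dict, List


def body() -> None:
    """Placeholder test body."""


# Two hypothetical module-level names bound to the same test body.
module_namespace: Dict[str, Callable[[], None]] = {
    "timedelta64_and_arithmetic_operator": body,
    "test_timedelta64_and_arithmetic_operator": body,
}


def collect_sketch(namespace: Dict[str, Callable[[], None]]) -> List[str]:
    """Hypothetical stand-in for prefix-based test discovery."""
    return [name for name in namespace if name.startswith("test_")]


collected = collect_sketch(module_namespace)
```

Only the prefixed name is selected, so assertions inside the other function never run and cannot fail.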

Contributor Author

Sir even after adding the test_ the poe mypy test doesn't fail on my system

Collaborator

Do poe test to run all tests or poe pytest to just run pytest which is where the failures will occur

Contributor Author

yes sir there were errors, I'll fix it then

s2 = pd.Series(data=pd.date_range("1/1/2021", "2/1/2021"))
s3 = s2 - s1
td = np.timedelta64(1, "M")
check(assert_type((s1 - td), TimestampSeries), TimestampSeries, pd.Timestamp)
Collaborator

check(assert_type((s1 - td), "TimestampSeries"), pd.Series, pd.Timestamp)

All the tests should look like this. The type TimestampSeries isn't known at runtime, so you have to surround it with quotes.

you will have to do something similar for TimedeltaSeries and pd.Series[float].

Look elsewhere in the code to see examples
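
The reason for the quotes can be reproduced with a toy type that exists only for the type checker. `StubOnlySeries` is a hypothetical name defined under `TYPE_CHECKING`, mimicking a class that lives in stub files but not in the runtime library.

```python
from typing import TYPE_CHECKING, Any, List, cast

try:
    from typing import assert_type  # available from Python 3.11
except ImportError:
    def assert_type(val: Any, typ: Any) -> Any:  # runtime no-op fallback
        return val

if TYPE_CHECKING:
    # Seen by the type checker only; this class is never created at runtime.
    class StubOnlySeries(List[int]):
        """Hypothetical stub-only type."""

value = cast("StubOnlySeries", [1, 2, 3])

# Quoted: the name is never looked up at runtime, so this line runs.
checked = assert_type(value, "StubOnlySeries")

# Unquoted: the name lookup happens at runtime and fails.
try:
    assert_type(value, StubOnlySeries)
    unquoted_ok = True
except NameError:
    unquoted_ok = False
```

The type checker resolves the quoted name exactly as it would the unquoted one, so nothing is lost statically by quoting.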

Contributor Author

Should we also change the remaining 3 tests, because they cannot pass pytest?

Collaborator

Good point. Move those 3 tests to the end and do

if TYPE_CHECKING_INVALID_USAGE:
     # test 1 here
     # test 2 here
     # test 3 here

you import TYPE_CHECKING_INVALID_USAGE from tests
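
A rough sketch of how such a guard behaves, using standard-library scalars instead of pandas series. The flag is defined locally here as plain `False`, which is an assumption: its actual definition in `tests` is not shown in this thread.

```python
from datetime import datetime, timedelta

# Assumption: a module-level flag that is False at runtime, so the guarded
# block is written for the type checker and never executed by pytest.
TYPE_CHECKING_INVALID_USAGE = False

ts = datetime(2020, 1, 1)
td = timedelta(days=1)

executed = []

if TYPE_CHECKING_INVALID_USAGE:
    # Invalid operation: this would raise TypeError if it ever ran.
    executed.append(ts * td)

# Valid operations stay outside the guard and run normally.
valid = ts + td
```

This keeps the invalid operations in the test file for the static tools to see, while the runtime test pass only exercises operations that can actually succeed.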

Collaborator

Dr-Irv left a comment

thanks @ramvikrams

By the way, in the future, when you make commits, please make the commit message reflect the changes you make, not just use the word "update"

Dr-Irv merged commit b7163c2 into pandas-dev:main on Nov 23, 2022
@ramvikrams
Contributor Author

ramvikrams commented Nov 23, 2022

thanks @ramvikrams

By the way, in the future, when you make commits, please make the commit message reflect the changes you make, not just use the word "update"

yes sir sure I will keep that in mind

Thank You very much sir

Development

Successfully merging this pull request may close these issues.

Series arithmetic methods accepting a Timedelta should also accept a np.timedelta64
2 participants