-
-
Notifications
You must be signed in to change notification settings - Fork 141
ENH: Add to_tuples
method to IntervalIndex
#538
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Add to_tuples
method to IntervalIndex
#538
Conversation
to_tuples
method to IntervalIndex
to_tuples
method to IntervalIndex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. One small change in the test.
tests/test_interval_index.py
Outdated
@@ -31,6 +31,11 @@ def test_from_tuples() -> None: | |||
) | |||
|
|||
|
|||
def test_to_tuples() -> None: | |||
ind = pd.IntervalIndex.from_tuples([(0, 1), (1, 2)]).to_tuples() | |||
check(assert_type(ind, pd.Index), pd.Index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change this test to
check(assert_type(ind, pd.Index), pd.Index, tuple)
That will then check we get an Index
of tuples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I wrote such a test at first, but finally removed the last tuple
argument because I think it is not necessary to check the type of elements in the Index
in this test at this point.
If Indexes are generic now, then this test should be like this:
check(assert_type(ind, "pd.Index[tuple]"), pd.Index, tuple)
in which case it is important to check that the inferred return type is actually an instance of Index of tuples.
However, since Indexes are not generic at the moment, it is not the responsibility of this stub to ensure that a return value is actually an Index "of tuples".
For now, this type stub layer should only guarantee the behavior of "returning an Index (of something)".
So, I think the current test is fine for now, and checking for the type of elements in an Index will only make sense after generic support for Indexes is added in the future (at that time this test should be modified to the code above).
Are you suggesting that this test should check for the type of elements in the Index even though it is currently a test for runtime behavior and not a test for this type stub?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the current test is fine for now, and checking for the type of elements in an Index will only make sense after generic support for Indexes is added in the future (at that time this test should be modified to the code above).
Are you suggesting that this test should check for the type of elements in the Index even though it is currently a test for runtime behavior and not a test for this type stub?
The assert_type()
call checks the validity of the stub. The check()
call validates that the stub and runtime behavior are aligned. By adding the tuple
argument to check
, it is just making sure that alignment is correct.
We've done that (admittedly inconsistently) in other places, and it actually has revealed some errors in pandas itself, or in the stubs. So the goal is to try to get more consistency in how we do our testing.
I agree that things will be better once we make Index
generic in the stubs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that the check()
call checks that the type information in the stub and the runtime type information match, but that alignment check is accomplished even without the dtype=tuple
argument (because Index
is currently not generic and the type of elements in an Index
on the stub side is not known).
However, I agree with you that adding the argument would improve consistency between pandas and stubs.
It would allow the stub layer to perform additional checks on pandas' behavior, for example, to detect bugs in pandas (or stubs).
Thanks for the explanation of your intention, I am convinced now that it is better to have this argument.
I have reverted to passing the last tuple
argument to the check()
call.
Can you create an issue in the pandas repo about this doc issue? You're correct that the docs for |
OK thanks, will open an issue. |
433439c
to
b6bad8d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Dr-Irv Thanks for your suggestion and additional comment!
I have understood your intention now and added the tuple
argument back in.
Would appreciate it if you could re-review it when you have time.
Also, a test with mypy nightly is failing, but I don't think that is due to this change.
tests/test_interval_index.py
Outdated
@@ -31,6 +31,11 @@ def test_from_tuples() -> None: | |||
) | |||
|
|||
|
|||
def test_to_tuples() -> None: | |||
ind = pd.IntervalIndex.from_tuples([(0, 1), (1, 2)]).to_tuples() | |||
check(assert_type(ind, pd.Index), pd.Index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that the check()
call checks that the type information in the stub and the runtime type information match, but that alignment check is accomplished even without the dtype=tuple
argument (because Index
is currently not generic and the type of elements in an Index
on the stub side is not known).
However, I agree with you that adding the argument would improve consistency between pandas and stubs.
It would allow the stub layer to perform additional checks on pandas' behavior, for example, to detect bugs in pandas (or stubs).
Thanks for the explanation of your intention, I am convinced now that it is better to have this argument.
I have reverted to passing the last tuple
argument to the check()
call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @skatsuta
That was fixed in #541 |
Add to_tuples method to IntervalIndex
assert_type()
to assert the type of any return valueThank you for always maintaining this stub!
It contributes greatly to the robustness of Python programs using pandas.
Problem
IntervalIndex
hasto_tuples
method, but its stub is currently missing.Solution
Add a stub for
IntervalIndex#to_tuples
and a test for it.Note that the documentation above states that the return type of that method is an ndarray,
but it is actually
Index
(the test to be added checks for this).This appears to be a mistake in the pandas doc.
(Probably the doc for
IntervalArray#to_tuples
is also being used forIntervalIndex#to_tuples
, and sinceIntervalArray#to_tuples
does indeed return an ndarray, it is incorrectly showing up in the doc forIntervalIndex#to_tuples
as well.)