Skip to content

TST(string dtype): Fix xfails in test_block_internals.py #60765

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

WillAyd
Copy link
Member

@WillAyd WillAyd commented Jan 22, 2025

No description provided.

@WillAyd WillAyd added the Strings String extension data type and string data label Jan 22, 2025
def test_construction_with_mixed(self, float_string_frame, using_infer_string):
# test construction edge cases with mixed types
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume something got lost in refactor, but the code removed here is not actually being tested. Both the result and expected variables are getting shadowed before they are compared

def test_construction_with_conversions(self):
# convert from a numpy array of non-ns timedelta64; as of 2.0 this does
# *not* convert
arr = np.array([1, 2, 3], dtype="timedelta64[s]")
df = DataFrame(index=range(3))
df["A"] = arr
df = DataFrame({"A": arr})
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failures in this module go back to the discussion in #60338

I don't think it is important for these tests to use that construction pattern

@WillAyd WillAyd added this to the 2.3 milestone Jan 22, 2025
@@ -401,14 +383,16 @@ def test_update_inplace_sets_valid_block_values():
assert isinstance(df._mgr.blocks[0].values, Categorical)


@pytest.mark.xfail(using_string_dtype(), reason="TODO(infer_string)")
def test_nonconsolidated_item_cache_take():
# https://github.com/pandas-dev/pandas/issues/35521

# create non-consolidated dataframe with object dtype columns
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still non-consolidated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah nice catch - let me fix that

# check dtypes
result = df.dtypes
expected = Series({"datetime64[us]": 3})

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would adding a tm.assertsimething pass here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - adding tm.assert_series_equal(result, expected) doesn't work on main irrespective of the string data type work.

The data provided is 3 dimensional so I'm not sure why the expected Series is just one element, unless it meant to do list multiplication instead of creating a dictionary

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to make that change if we want the list multiplication. I just can't figure out why its here in the first place though, so figured a good removal candidate

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data provided is 3 dimensional

The data is 2 dimensional, so it creates the dataframe just fine?

In [12]:         data = [
    ...:             [datetime(2001, 1, 5), np.nan, datetime(2001, 1, 2)],
    ...:             [datetime(2000, 1, 2), datetime(2000, 1, 3), datetime(2000, 1, 1)],
    ...:         ]
    ...:         df = DataFrame(data)

In [13]: df
Out[13]: 
           0          1          2
0 2001-01-05        NaT 2001-01-02
1 2000-01-02 2000-01-03 2000-01-01

In [14]: df.dtypes
Out[14]: 
0    datetime64[us]
1    datetime64[us]
2    datetime64[us]
dtype: object

But indeed the series should have three elements.

And agreed that it looks like out of place to being tested here, I am fine with removing this

@jorisvandenbossche jorisvandenbossche merged commit d38706a into pandas-dev:main Jan 24, 2025
54 checks passed
Copy link

lumberbot-app bot commented Jan 24, 2025

Owee, I'm MrMeeseeks, Look at me.

There seem to be a conflict, please backport manually. Here are approximate instructions:

  1. Checkout backport branch and update it.
git checkout 2.3.x
git pull
  1. Cherry pick the first parent branch of the this PR on top of the older branch:
git cherry-pick -x -m1 d38706af66249ef74e42671a480261c68bedfbce
  1. You will likely have some merge/cherry-pick conflict here, fix them and commit:
git commit -am 'Backport PR #60765: TST(string dtype): Fix xfails in test_block_internals.py'
  1. Push to a named branch:
git push YOURFORK 2.3.x:auto-backport-of-pr-60765-on-2.3.x
  1. Create a PR against branch 2.3.x, I would have named this PR:

"Backport PR #60765 on branch 2.3.x (TST(string dtype): Fix xfails in test_block_internals.py)"

And apply the correct labels and milestones.

Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon!

Remember to remove the Still Needs Manual Backport label once the PR gets merged.

If these instructions are inaccurate, feel free to suggest an improvement.

@jorisvandenbossche
Copy link
Member

Manual backport -> #60781

@WillAyd WillAyd deleted the fix-internals-string-tests branch January 24, 2025 20:37
jorisvandenbossche added a commit that referenced this pull request Jan 24, 2025
…s.py (#60765) (#60781)

TST(string dtype): Fix xfails in test_block_internals.py (#60765)

(cherry picked from commit d38706a)

Co-authored-by: William Ayd <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Strings String extension data type and string data
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants