-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
REGR: NA-values in ctors with string dtype #21366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 3 commits
d07b238
d2585e3
bcc993c
0d7c853
a94d399
3d81d5d
4de6f7a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -137,6 +137,14 @@ def test_constructor_no_data_index_order(self): | |
result = pd.Series(index=['b', 'a', 'c']) | ||
assert result.index.tolist() == ['b', 'a', 'c'] | ||
|
||
def test_constructor_dtype_str_na_values(self): | ||
# https://github.com/pandas-dev/pandas/issues/21083 | ||
ser = Series(['x', None], dtype=str) | ||
result = ser.isna() | ||
expected = Series([False, True]) | ||
tm.assert_series_equal(result, expected) | ||
assert ser.iloc[1] is None | ||
|
||
def test_constructor_series(self): | ||
index1 = ['d', 'b', 'a', 'c'] | ||
index2 = sorted(index1) | ||
|
@@ -164,22 +172,24 @@ def test_constructor_list_like(self): | |
|
||
@pytest.mark.parametrize('input_vals', [ | ||
([1, 2]), | ||
([1.0, 2.0, np.nan]), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I had to remove this case, as it isn't correct. We aren't consistent between In [4]: pd.Series([1, 2, None], dtype='str').tolist()
Out[4]: ['1', '2', None]
In [5]: pd.Series([1, 2, None], dtype='str').astype('str').tolist()
Out[5]: ['1', '2', 'None'] |
||
(['1', '2']), | ||
(list(pd.date_range('1/1/2011', periods=2, freq='H'))), | ||
(list(pd.date_range('1/1/2011', periods=2, freq='H', | ||
tz='US/Eastern'))), | ||
([pd.Interval(left=0, right=5)]), | ||
]) | ||
def test_constructor_list_str(self, input_vals): | ||
def test_constructor_list_str(self, input_vals, string_dtype): | ||
# GH 16605 | ||
# Ensure that data elements from a list are converted to strings | ||
# when dtype is str, 'str', or 'U' | ||
result = Series(input_vals, dtype=string_dtype) | ||
expected = Series(input_vals).astype(string_dtype) | ||
assert_series_equal(result, expected) | ||
|
||
for dtype in ['str', str, 'U']: | ||
result = Series(input_vals, dtype=dtype) | ||
expected = Series(input_vals).astype(dtype) | ||
assert_series_equal(result, expected) | ||
def test_constructor_list_str_na(self, string_dtype): | ||
result = Series([1.0, 2.0, np.nan], dtype=string_dtype) | ||
expected = Series(['1.0', '2.0', None], dtype=object) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't the NaN be preserved? (so expected There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Mmm apparently assert_series_equal considers those equal In [11]: pd.Series([1.0, 2.0, np.nan], dtype=str)[2]
Out[11]: nan I'll be explicit in the test. |
||
assert_series_equal(result, expected) | ||
|
||
def test_constructor_generator(self): | ||
gen = (i for i in range(10)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not put any more code here! these routines are already way overloaded.
also I believe you can simply do this by:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dtype
is explicitly passed by the user here. We using that approach wouldn't cast[1.0, 2.0, None]
to['1.0', '2.0', None]
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then pls make a routine in cast, this is not the right place for this. we have so many casting routines pls modify / use existing ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is actually what happened in 0.22:
(but the new behaviour to convert the numbers to strings is certainly an improvement)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh... OK hold off on merging then. Ideally this would just be a bugfix for that regression :/
Won't have time to look till ~3 hours from now.