Skip to content

[WIP] Quick fix to provide complex data type support for hashmap based algorithms #27599

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from
Closed

Conversation

ghost
Copy link

@ghost ghost commented Jul 25, 2019

# simplefilter("ignore", np.ComplexWarning)
# values = ensure_float64(values)
# return values, "float64", "float64"
return ensure_object(np.asarray(values)), "object", "object"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we cant just return non-object values? regardless, definitely dont leave the commented-out stuff above, just delete

…x numbers should not be sortable according to test_complex_sorting. The user should specify sort=False when grouping a dataframe by a complex data type.
@codecov
Copy link

codecov bot commented Jul 26, 2019

Codecov Report

Merging #27599 into master will decrease coverage by 50.6%.
The diff coverage is 60%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #27599       +/-   ##
===========================================
- Coverage      93%    42.4%   -50.61%     
===========================================
  Files         182      182               
  Lines       50311    50312        +1     
===========================================
- Hits        46793    21333    -25460     
- Misses       3518    28979    +25461
Flag Coverage Δ
#multiple ?
#single 42.4% <60%> (-0.09%) ⬇️
Impacted Files Coverage Δ
pandas/core/algorithms.py 47.76% <0%> (-47.02%) ⬇️
pandas/core/sorting.py 27.84% <75%> (-70.87%) ⬇️
pandas/io/formats/latex.py 0% <0%> (-100%) ⬇️
pandas/io/gcs.py 0% <0%> (-100%) ⬇️
pandas/io/sas/sas_constants.py 0% <0%> (-100%) ⬇️
pandas/core/groupby/categorical.py 0% <0%> (-100%) ⬇️
pandas/tseries/plotting.py 0% <0%> (-100%) ⬇️
pandas/tseries/converter.py 0% <0%> (-100%) ⬇️
pandas/io/s3.py 0% <0%> (-100%) ⬇️
pandas/io/formats/html.py 0% <0%> (-99.38%) ⬇️
... and 138 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ebcfee4...7a16823. Read the comment docs.

result = algos.value_counts(array)
tm.assert_series_equal(result, expected)

@pytest.mark.parametrize("array,expected", [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For any that are just one set of parameters here you can just define in the function body. No need to parametrize

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just an initial set of test cases. I plan on adding more test cases for each of the algos.

])
def test_factorize(self, array, expected):
result = pd.factorize(array)
assert len(result) == 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think easier to read at a quick glance if you just compare the two elements rather than the enumerating loop

@WillAyd WillAyd added the Complex Complex Numbers label Jul 26, 2019
@ghost ghost changed the title Complex value counts [WIP] Quick fix to provide complex data type support for hashmap based algorithms Jul 31, 2019
@jbrockmendel
Copy link
Member

@heckeop can you rebase? the CI failures might have been unrelated problems that have since been fixed

@WillAyd
Copy link
Member

WillAyd commented Sep 13, 2019

@heckeop is this still active?

@ghost
Copy link
Author

ghost commented Sep 13, 2019 via email

@jreback
Copy link
Contributor

jreback commented Oct 6, 2019

@heckeop can you merge master and update

…rning instead of pytest.warns to capture warnings
@ghost
Copy link
Author

ghost commented Oct 8, 2019

has numpy undergone a version change? The test below was passing before. Now, the returned object has dtype complex128

https://github.com/pandas-dev/pandas/pull/27599/files#diff-3957a56d910fcc0790307c20a9a67a58R29-R40

simplefilter("ignore", np.ComplexWarning)
values = ensure_float64(values)
return values, "float64", "float64"
raise TypeError("Complex data types not supported...Coercing to object")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to do the object-coersion here than fallback to the except branch. clearer

@jbrockmendel
Copy link
Member

has numpy undergone a version change? The test below was passing before. Now, the returned object has dtype complex128

We've made some recent changes to the cython code that handled complex correctly in some cases, not sure if this would be affected. can you merge master and re-push

@alimcmaster1
Copy link
Member

Thanks for the PR - @heckeop Could you merge master here?

@jbrockmendel
Copy link
Member

@heckeop can you rebase and see if you can get the CI passing

@jreback
Copy link
Contributor

jreback commented Jan 1, 2020

@heckeop can you merge master

@alimcmaster1
Copy link
Member

I can maybe take a look if we don’t hear back from @heckeop

@WillAyd
Copy link
Member

WillAyd commented Jan 20, 2020

Closing as I think this is stale but @heckeop ping if you'd like to pick back up

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Complex Complex Numbers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Functions that rely on hash tables are incorrect for complex numbers
5 participants