Inconsistent behavior of .map() #20495
Comments
Strange, but it looks like this works fine on master:
In [3]: pd.Series(list('abc') + [np.nan]).map({np.nan: 'bar'})
Out[3]:
0    NaN
1    NaN
2    NaN
3    bar
dtype: object
INSTALLED VERSIONS
commit: c36e9f7
pandas: 0.23.0.dev0+685.gc36e9f705 |
Yeah, this lgtm. Can one of you check whether we have a test covering this case? If not, please add one (advise either way). |
I looked through our tests but couldn't find anything to really cover this. From what I can tell there are a few similar tests in @readyready15728 any interest in trying your hand at a PR to add this scenario? I think the module above would be a good location unless @jreback disagrees |
I barely know anything about pull requests. Could you be a little more specific about how I should proceed? |
There is a pretty detailed guide for contributing to pandas - give that a look and let me know if you have any questions. https://pandas.pydata.org/pandas-docs/stable/contributing.html |
OK so do you intend that I submit a pull request to the code that implements .map() itself or the test code? And what is the intended behavior of .map()? I'm not sure which is right. |
You should be adding a test to The expected value would be your second example from your original post |
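For illustration, a test along the lines suggested here could look like the sketch below. The function name is hypothetical, and the target test module is not named in this thread, so this is not the test that was eventually submitted. It reuses the input and the mapping from the first comment and asserts the output shown there. `dtype=object` is set explicitly as an assumption, to keep the sketch independent of how a given pandas version infers string dtypes. Inside the pandas test suite the comparison would more likely use the project's own assertion helpers than plain `assert` statements.

```python
import numpy as np
import pandas as pd


def test_map_dict_with_only_nan_key():
    # Hypothetical test name; the real test location is not given in the thread.
    # Three letters followed by a missing value, as in the first comment.
    s = pd.Series(list("abc") + [np.nan], dtype=object)

    # The mapping contains np.nan as its only key.
    result = s.map({np.nan: "bar"})

    # Letters have no entry in the mapping, so they become missing values.
    assert result.iloc[:3].isna().all()
    # The NaN entry should be replaced by the mapped value.
    assert result.iloc[3] == "bar"
```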
Alright, I'm virtually certain that my addition will work but I want to test it first. To do so, I need to get the tests to run.
But I have Cython installed. :? (Am running Ubuntu 16.04) EDIT: I did some Googling and someone suggested using pip to get the latest version and now the Cython extensions are compiling. |
There is something else about the behavior of
pd.Series([42] + [np.nan]).map({42: 'the answer', np.nan: 'not NaN'})
Has the result:
In other words, the intended effect doesn't happen when there's just an integer alongside
I am almost ready to do a pull request but I'd like a little guidance about creating a whatsnew entry, if it is needed at all. |
Sorry for the delay in response. Since you aren't changing any of the core code (just adding tests) I don't think you need a whatsnew entry, since there's nothing that you can link to here that would change behavior. That said, there's already a |
Frankly I wouldn't know how to do that. Additionally |
Did you configure git remote -v show
upstream https://github.com/pandas-dev/pandas.git (fetch)
upstream https://github.com/pandas-dev/pandas.git (push) |
No, it has only my forked repository. [dumb comment by me] I'm accustomed to working only on my own repos. EDIT: OK never mind I see what is going on here. |
Make sure you check out the contributing guide as it gives instructions on how to properly set your upstream repo. You will need this for comparisons and pulling updates to master https://pandas.pydata.org/pandas-docs/stable/contributing.html |
I did |
Did you |
Alright, it succeeded. While my test works I still might not meet all your expectations. A pull request has been submitted anyway. We can go back and forth until it's acceptable. |
Code Sample, a copy-pastable example if possible
Result:
Compare:
Result:
Problem description
If only np.nan is present in the mapping, then the associated mapping is not carried out. If another value is present, then it is carried out. Forgive me if this is intended behavior for some reason.
Expected Output
I'm not sure what is appropriate here.
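The code samples from the original post did not survive on this page. As a rough sketch of the contrast the problem description draws, the snippet below maps the same Series twice: once with np.nan as the only key, and once with np.nan next to another key. The Series contents, the extra key `'a'`, and the mapped strings are assumptions chosen for illustration, not the reporter's exact code. On pandas 0.22.0 the report says the two cases treat the NaN entry differently; per the first comment, master already maps the NaN key in the first case.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): three letters followed by a missing value.
s = pd.Series(list("abc") + [np.nan], dtype=object)

# Case 1: np.nan is the only key in the mapping.
only_nan = s.map({np.nan: "not NaN"})

# Case 2: np.nan appears alongside another key.
nan_and_letter = s.map({"a": "a letter", np.nan: "not NaN"})

# Print both results side by side to compare how the last (NaN) entry is handled.
print(only_nan.tolist())
print(nan_and_letter.tolist())
```

If the last element differs between the two printed lists, the installed version still shows the inconsistency described in this report.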
Output of pd.show_versions()
pandas: 0.22.0
pytest: None
pip: 8.1.1
setuptools: 20.7.0
Cython: None
numpy: 1.11.0
scipy: 0.17.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: 0.5.0
dateutil: 2.4.2
pytz: 2014.10
blosc: None
bottleneck: None
tables: 3.2.2
numexpr: 2.4.3
feather: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.4.1
html5lib: 0.999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.6.0