-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: repair 'style' kwd handling in DataFrame.plot (#21003) #33821
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Looks like there's one issue left but I have no idea how to resolve it. Does anyone know what's going on there? Anaconda prompt output
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @joooeey !
I'll just make a few comments ahead of core members' reviews
This is odd. CI reports two test failures in
These tests run fine on my machine. |
CI pipeline is currently broken, the errors were probably unrelated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @joooeey - nice work!
So, as far as I can tell, the problem was that before, with style='d'
, this line
nocolor_style = style is None or re.match("[a-z]+", style) is None
wasn't setting nocolor_style
to True, which is necessary to reach
if isinstance(colors, dict):
kwds["color"] = colors[label]
else:
kwds["color"] = colors[col_num % len(colors)]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, and I think it's very close. @joooeey
Since the PR is to repair the style
keyword handling which essentially sets the marker for plot. Therefore, it is necessary to also test marker
is still set correctly. As mentioned in test, you could use ax.lines.get_marker() == 'd'
to achieve it.
And personally I think you could also remove some of your comments in the tests because most of those codes in the tests, they are very straightforward thus there is no need to explain IMHO.
pandas/tests/plotting/test_frame.py
Outdated
# GH21003 - code sample from 2018-05-10 is equivalent to this test | ||
df = DataFrame(np.random.random((7, 4))) | ||
# combining a color string and a marker letter should be allowed | ||
df.plot(color="green", style="d") # d for diamond |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we also check color
and style
here?
something like:
ax = df.plot(color="green", style="d")
assert ax.lines[0].get_color() == 'green'
assert ax.lines[0].get_marker() == 'd'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see my reply below.
pandas/tests/plotting/test_frame.py
Outdated
# in a plot where each individual line was assigned a list of colors: | ||
# ax.lines[i].get_color() == ['yellow', 'red', 'green', 'blue'] | ||
# which resulted in a ValueError when plt.draw() was called. | ||
assert [line.get_color() for line in ax.lines] == color_list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, also use ax.lines.get_marker()
to verify that marker
is set correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! This will introduce lots of duplicate code between the tests test_color_and_style_arguments
, test_color_and_marker
and test_color_list_and_marker
. Namely df.plot
and the list comprehensions. I'll see how I can refactor this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it more readable to test these four combinations of color
and style
arguments (test_color_and_style_arguments
has two) in four separate test methods or all in one? I'd go for the former but I've noticed lots of huge test methods in frame.py
so I'm not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this will work in one test?
assert [line.get_color() for line in ax.lines] == color_list
assert all(line.get_marker() == 'd' for line in ax.lines)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, i would prefer to combine both of your tests, something like this
@pytest.mark.parametrize("color, output_color_list",
[('green', ['green'] * 4),
(['yellow', 'red', 'green', 'blue'], ['yellow', 'red', 'green', 'blue'])
])
def test_color_and_marker(self, color, output_color_list):
df = DataFrame(np.random.random((7, 4)))
ax = df.plot(color=color, style="d")
assert [line.get_color() for line in ax.lines] == output_color_list
assert all(line.get_marker() == 'd' for line in ax.lines)
And regarding test_color_and_style_arguments
, it's kinda out of the scope of your PR, and you can just leave it aside.
@joooeey does this sound good to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I said, import pandas
results in this error:
import pandas
Traceback (most recent call last):
File "<ipython-input-1-38d4b0363d82>", line 1, in <module>
import pandas
File "C:\Users\Joey\pandas-joooeey\pandas\__init__.py", line 52, in <module>
from pandas.core.api import (
File "C:\Users\Joey\pandas-joooeey\pandas\core\api.py", line 15, in <module>
from pandas.core.arrays import Categorical
File "C:\Users\Joey\pandas-joooeey\pandas\core\arrays\__init__.py", line 8, in <module>
from pandas.core.arrays.datetimes import DatetimeArray
File "C:\Users\Joey\pandas-joooeey\pandas\core\arrays\datetimes.py", line 46, in <module>
from pandas.core.arrays import datetimelike as dtl
File "C:\Users\Joey\pandas-joooeey\pandas\core\arrays\datetimelike.py", line 11, in <module>
from pandas._libs.tslibs.timestamps import (
ImportError: cannot import name 'integer_op_not_supported' from 'pandas._libs.tslibs.timestamps' (C:\Users\Joey\pandas-joooeey\pandas\_libs\tslibs\timestamps.cp37-win_amd64.pyd)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so it's jupyter notebook, have you tried to shut down the notebook and restart? and then import?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is the IPython Console in Spyder. Restarting Spyder didn't help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pytest pandas
fails too with Windows fatal exception: access violation
:
Anaconda prompt stack trace
[...]
pandas\tests\arrays\test_numpy.py ................................................... [ 15%]
pandas\tests\arrays\test_period.py ..................................................Windows fatal exception: access violation
Thread 0x00001f80 (most recent call first):
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\multiprocessing\pool.py", line 576 in _handle_results
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 870 in run
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 932 in _bootstrap_inner
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 890 in _bootstrap
Thread 0x00004e4c (most recent call first):
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\multiprocessing\pool.py", line 528 in _handle_tasks
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 870 in run
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 932 in _bootstrap_inner
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 890 in _bootstrap
Thread 0x00001fe8 (most recent call first):
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\multiprocessing\connection.py", line 810 in _exhaustive_wait
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\multiprocessing\connection.py", line 878 in wait
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\multiprocessing\pool.py", line 499 in _wait_for_updates
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\multiprocessing\pool.py", line 519 in _handle_workers
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 870 in run
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 932 in _bootstrap_inner
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 890 in _bootstrap
Thread 0x000028a8 (most recent call first):
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\multiprocessing\pool.py", line 114 in worker
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 870 in run
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 932 in _bootstrap_inner
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 890 in _bootstrap
Thread 0x000047c0 (most recent call first):
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\multiprocessing\pool.py", line 114 in worker
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 870 in run
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 932 in _bootstrap_inner
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 890 in _bootstrap
Thread 0x00004cb8 (most recent call first):
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\multiprocessing\pool.py", line 114 in worker
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 870 in run
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 932 in _bootstrap_inner
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 890 in _bootstrap
Thread 0x0000559c (most recent call first):
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\multiprocessing\pool.py", line 114 in worker
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 870 in run
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 932 in _bootstrap_inner
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\threading.py", line 890 in _bootstrap
Current thread 0x000014a0 (most recent call first):
File "C:\Users\Joey\pandas-joooeey\pandas\tests\arrays\test_period.py", line 413 in test_arrow_table_roundtrip
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\python.py", line 184 in pytest_pyfunc_call
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\callers.py", line 187 in _multicall
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\manager.py", line 84 in <lambda>
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\manager.py", line 93 in _hookexec
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\hooks.py", line 286 in __call__
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\python.py", line 1479 in runtest
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\runner.py", line 135 in pytest_runtest_call
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\callers.py", line 187 in _multicall
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\manager.py", line 84 in <lambda>
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\manager.py", line 93 in _hookexec
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\hooks.py", line 286 in __call__
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\runner.py", line 217 in <lambda>
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\runner.py", line 244 in from_call
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\runner.py", line 216 in call_runtest_hook
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\runner.py", line 186 in call_and_report
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\runner.py", line 100 in runtestprotocol
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\runner.py", line 85 in pytest_runtest_protocol
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\callers.py", line 187 in _multicall
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\manager.py", line 84 in <lambda>
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\manager.py", line 93 in _hookexec
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\hooks.py", line 286 in __call__
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\main.py", line 272 in pytest_runtestloop
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\callers.py", line 187 in _multicall
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\manager.py", line 84 in <lambda>
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\manager.py", line 93 in _hookexec
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\hooks.py", line 286 in __call__
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\main.py", line 247 in _main
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\main.py", line 191 in wrap_session
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\main.py", line 240 in pytest_cmdline_main
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\callers.py", line 187 in _multicall
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\manager.py", line 84 in <lambda>
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\manager.py", line 93 in _hookexec
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\pluggy\hooks.py", line 286 in __call__
File "C:\ProgramData\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\config\__init__.py", line 124 in main
File "C:\ProgramData\Anaconda3\envs\pandas-dev\Scripts\pytest-script.py", line 10 in <module>
(pandas-dev) c:\Users\Joey\pandas-joooeey>
I've seen this error before, mentioned on April 27 in this thread but I don't remember how I resolved it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emm, have no idea, i really think
python setup.py build_ext --inplace -j 4
python -m pip install -e . --no-build-isolation --no-use-pep517
would solve the issue if you already merge the master
@WillAyd could you help with the environment issue?
Yea looks like the extensions need to be rebuilt. @charlesdong1991 instructions above should work; maybe not even need the second line if you launch Python from the pandas repo
… On May 19, 2020, at 1:17 PM, Kaiqi Dong ***@***.***> wrote:
@charlesdong1991 commented on this pull request.
In pandas/tests/plotting/test_frame.py <#33821 (comment)>:
> + def test_color_and_marker(self):
+ # GH21003 - code sample from 2018-05-10 is equivalent to this test
+ df = DataFrame(np.random.random((7, 4)))
+ # combining a color string and a marker letter should be allowed
+ df.plot(color="green", style="d") # d for diamond
+
+ def test_color_list_and_marker(self):
+ # GH21003 - code sample from 2020-04-23 is equivalent to this test
+ df = DataFrame(np.random.random((7, 4)))
+ color_list = ["yellow", "red", "green", "blue"]
+ ax = df.plot(color=color_list, style="d")
+ # Before this patch was introduced, the previous line of code resulted
+ # in a plot where each individual line was assigned a list of colors:
+ # ax.lines[i].get_color() == ['yellow', 'red', 'green', 'blue']
+ # which resulted in a ValueError when plt.draw() was called.
+ assert [line.get_color() for line in ax.lines] == color_list
emm, have no idea, i really think
python setup.py build_ext --inplace -j 4
python -m pip install -e . --no-build-isolation --no-use-pep517
would solve the issue
@WillAyd <https://github.com/WillAyd> could you help with the environment issue?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <#33821 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAEU4UNW7DTC32C2KAWVF2LRSLSNZANCNFSM4MR622DA>.
|
I still can't even import pandas. see #34280 . |
For now I don't intend to bother with Docker or WSL as the instructions you linked to are too vague for me to follow. I'm really not keen on more experiments. Or how would I set up the Docker container? Would that make my dev environment more stable? |
You probably don't need to run all the tests, just the plotting ones pytest pandas/tests/plotting |
Running
If everything is set up right, there should never be test failures on master, no? |
Unfortunately sometimes it can happen. Pulling the latest commits from upstream master can sometimes fix it if it's been solved upstream |
hey @joooeey I think if you rebase and resolve the conflict, and solve the linting error shown in CI, then it's fine, because seems the change of your PR doesn't seem to cause other tests to fail(the error you showed seems unrelated to your change, but related to your development environment probably). |
What's the command sequence for rebasing? Do I start with |
Assuming your local branch is up-to-date with git checkout plot-style-error
git fetch upstream master
git merge upstream/master # you'll need to fix some conflicts here
git commit |
I did that and it helped quite a bit. In the Travis CI there's still one error which I can't comprehend. |
Thanks @joooeey ! The Travis CI error is unrelated |
@joooeey someone will have a look we have hundreds of PRs so things do take time |
pandas/tests/plotting/test_frame.py
Outdated
green_line = ax.lines[2] | ||
assert green_line.get_color() == "green" | ||
assert green_line.get_marker() == "d" | ||
assert green_line.get_linestyle() == "--" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since you might need another rebase, just one minor comment, as mentioned in #33821 (comment)
I still prefer to not test only green line but all lines using parametrization, i had an example there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since you might need another rebase, just one minor comment, as mentioned in #33821 (comment)
Why do I need another rebase? I just merged upstream/master 6 days ago which unfortunately left one CI issue. I don't think another rebase will solve that?
I still prefer to not test only green line but all lines using parametrization, i had an example there
Well the function is parametrized I'm testing the inputs color="green"
and color=["yellow", "red", "green", "blue"]
. The only thing is, to simplify the code, I'm only checking the third line, the one that is green for both inputs in the parametrize. I'm confident that this is enough. Checking all four lines would be overkill. After all, checking all four lines (as opposed to the third) would only guard against absurd bugs such as if we suddenly had code that ignored the color
kwarg and made every line green.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just merged upstream/master 6 days ago which unfortunately left one CI issue. I don't think another rebase will solve that?
Might do if upstream/master has changed
Regarding the test - unfortunately, absurd bugs do sometimes happen. FWIW, I think the suggestion to check all lines' colours is sensible
@joooeey can you move release note to 1.2 |
Hi @joooeey - is this still active? If you address the latest comments I think this is close to being ready to be merged |
thanks @joooeey very nice. thanks @MarcoGorelli and @charlesdong1991 for reviewing. |
Thanks guys for pulling this to completion! I couldn't spare any time to work on this this summer unfortunately. |
black pandas
=>
command not found
git diff upstream/master -u -- "*.py" | flake8 --diff
=>
command not found