Parcoords skipping all colors from the colorscale except min and max value #1442

Closed
Rabeez opened this issue Feb 28, 2019 · 7 comments
Labels
bug something broken

@Rabeez
Rabeez commented Feb 28, 2019

I'm using plotly 3.6.1 in JupyterLab.

I'm trying to recreate the iris visualization from the plotly docs using the FigureWidget syntax.

I have this code so far:

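import numpy as np
import plotly.graph_objs as go

# iris: DataFrame with the four measurement columns plus 'species' and 'species_id'
fig = go.FigureWidget()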
parcords = fig.add_parcoords(dimensions=[{'label':n.title(),
                                          'values':iris[n],
                                          'range':[0,8]} for n in iris.columns[:-2]])

fig.data[0].dimensions[0].constraintrange = [4,8]
parcords.line.color = iris['species_id']
parcords.line.colorscale = [[0,'#D7C16B'],[0.5,'#23D8C3'],[1,'#F3F10F']]
parcords.line.colorbar.tickvals = np.unique(iris['species_id']).tolist()
parcords.line.colorbar.ticktext = np.unique(iris['species']).tolist()
fig.layout.title = 'A Wild Parallel Coordinates Plot'
fig

which produces
[image]

The lines are identical to the ones in the example notebooks (doc) but the class corresponding to the 0.5 color scale value is shown with the incorrect color.

Changing the middle tick value (0.5) in the colorscale up or down does not reveal the third color either. Using the discrete colorscale (code below) also gives the same plot with a different (but correct) colorbar, with the middle class's lines incorrectly colored.

parcords.line.colorscale = [(0.0, '#D7C16B'),
                            (0.3333333333333333, '#D7C16B'),
                            (0.3333333333333333, '#23D8C3'),
                            (0.6666666666666666, '#23D8C3'),
                            (0.6666666666666666, '#F3F10F'),
                            (1.0, '#F3F10F')]

I have used a similar method to color data points based on a categorical variable with a scatter plot, and it works perfectly fine.

fig = go.FigureWidget()
strace = fig.add_scatter(x=iris['sepal_length'], y=iris['sepal_width'],
                         mode='markers');

strace.marker.color = iris['species_id']
strace.marker.colorscale = [[0,'#D7C16B'],[0.5,'#23D8C3'],[1,'#F3F10F']]
strace.marker.colorbar.tickvals = np.unique(iris['species_id']).tolist()
strace.marker.colorbar.ticktext = np.unique(iris['species']).tolist()
fig

This produces the expected output with 3 colors:
[image]

Update: I tried to get a numeric variable mapped to the colors, so I created a dummy column called linear, which is just np.arange(0,len(iris)), and mapped it to the same colorscale:

fig = go.FigureWidget()
parcords = fig.add_parcoords(dimensions=[{'label':n.title(),
                                          'values':iris[n],
                                          'range':[0,8]} for n in iris.columns[:-3]])

fig.data[0].dimensions[0].constraintrange = [4,8]
parcords.line.color = iris['linear']
parcords.line.colorscale = [[0,'#D7C16B'],[0.5,'#23D8C3'],[1,'#F3F10F']]
parcords.line.colorbar.title = ''
fig.layout.title = 'A Wild Parallel Coordinates Plot'
fig

This makes
[image]

This confirms what I suspected: the highest value gets the color corresponding to 1 in the colorscale property, and everything else gets the color corresponding to the lowest colorscale value (0 in this case).

I tried using a four-step colorscale

parcords.line.colorscale = [[0,'#D7C16B'],[0.33,'#23D8C3'],[0.67,'#F3A10F'],[1,'#F3F10F']]

and got the exact same plot as above, just with a different (and correct) colorbar. So every color except those at the top and bottom values of the colorscale is being skipped.
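
For reference, the mapping one would expect from a color value to a colorscale color can be sketched in plain Python. This is only an illustration of linear interpolation between colorscale stops, not plotly.js's actual implementation; `hex_to_rgb` and `sample_colorscale` are hypothetical helper names.

```python
def hex_to_rgb(hex_color):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def sample_colorscale(colorscale, value, vmin, vmax):
    """Linearly interpolate the color for `value` within [vmin, vmax]."""
    t = (value - vmin) / (vmax - vmin)  # normalize to [0, 1]
    for (p0, c0), (p1, c1) in zip(colorscale, colorscale[1:]):
        if p0 <= t <= p1:
            if p1 == p0:  # zero-width segment (discrete colorscale step)
                return hex_to_rgb(c1)
            f = (t - p0) / (p1 - p0)
            a, b = hex_to_rgb(c0), hex_to_rgb(c1)
            return tuple(round(x + f * (y - x)) for x, y in zip(a, b))
    raise ValueError("value lies outside the colorscale range")


scale = [[0, '#D7C16B'], [0.5, '#23D8C3'], [1, '#F3F10F']]

# species_id takes the values 0, 1, 2, so the middle class normalizes to 0.5
middle = sample_colorscale(scale, 1, 0, 2)
print(middle)
```

Under this sketch the middle class lands exactly on the 0.5 stop ('#23D8C3'), which is the color missing from the parcoords plots above.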

@Rabeez changed the title from "Parcoords skipping a color from the colorscale for categorical variable" to "Parcoords skipping all colors from the colorscale except min and max value" on Feb 28, 2019

@jonmmease
Contributor

Hi @Rabeez, thanks for the report. Could you include a fully reproducible example with data loading and imports included? When I start with the example from the docs and work towards your example, things seem to be working.

import plotly.graph_objs as go
import numpy as np
import pandas as pd 

df = pd.read_csv("https://raw.githubusercontent.com/bcdunbar/datasets/master/iris.csv")

species_df = df.groupby('species_id', as_index=False).species.first()

data = [
    go.Parcoords(
        line = dict(color = df['species_id'],
                    colorscale=[[0,'#D7C16B'],[0.5,'#23D8C3'],[1,'#F3F10F']],
                    colorbar={'tickvals': species_df['species_id'].tolist(),
                              'ticktext': species_df['species'].tolist()
                             },
                    showscale=True),
        dimensions = list([
            dict(range = [0,8],
                constraintrange = [4,8],
                label = 'Sepal Length', values = df['sepal_length']),
            dict(range = [0,8],
                label = 'Sepal Width', values = df['sepal_width']),
            dict(range = [0,8],
                label = 'Petal Length', values = df['petal_length']),
            dict(range = [0,8],
                label = 'Petal Width', values = df['petal_width'])
        ])
    )
]

layout = go.Layout()

fig = go.FigureWidget(data = data, layout = layout)
fig

[image: newplot 8]

Also please include the version of plotly.py that you're using. Thanks!

@Rabeez
Author

Rabeez commented Mar 2, 2019

Here is a full isolated example, which I also just double checked (it skips the middle class).

Also I'm on plotly.py 3.6.1, I mentioned it in the original post but it probably got buried in the text. I've moved it to the top now.

@jonmmease
Contributor

Hi @Rabeez, thanks for taking the time to report this and share the full example. I was able to work out the issue by starting with your example.

It seems the problem is in Plotly.js's handling of integer TypedArrays as the line.color property of a parcoords trace. Here is the corresponding Plotly.js issue plotly/plotly.js#3595.

In the meantime, the workaround is to cast the integer pandas series to a floating point series:

import numpy as np
import pandas as pd
import seaborn as sns
import plotly
import plotly.graph_objs as go

iris = sns.load_dataset('iris')
iris['species_id'] = pd.Categorical(iris['species']).codes
print(iris.shape)
iris.head()

fig = go.FigureWidget()
parcords = fig.add_parcoords(dimensions=[{'label':n.title(),
                                          'values':iris[n],
                                          'range':[0,8]} for n in iris.columns[:-3]])

fig.data[0].dimensions[0].constraintrange = [4,8]
parcords.line.color = iris['species_id'].astype('float')  # <---[HERE]
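# Continuous colorscale (replaced by the discrete one assigned just below)
parcords.line.colorscale = [[0,'#D7C16B'],[0.5,'#23D8C3'],[1,'#F3F10F']]
# Discrete colorscale: this second assignment overrides the first
parcords.line.colorscale = [(0.0, '#D7C16B'),
                            (0.3333333333333333, '#D7C16B'),
                            (0.3333333333333333, '#23D8C3'),
                            (0.6666666666666666, '#23D8C3'),
                            (0.6666666666666666, '#F3F10F'),
                            (1.0, '#F3F10F')]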

parcords.line.colorbar.title = ''
parcords.line.colorbar.tickvals = np.unique(iris['species_id']).tolist()
parcords.line.colorbar.ticktext = np.unique(iris['species']).tolist()
fig.layout.title = 'A Wild Parallel Coordinates Plot'
fig

[image: newplot 5]

As a side note, I wouldn't recommend computing the tickvals/ticktext like this:

parcords.line.colorbar.tickvals = np.unique(iris['species_id']).tolist()
parcords.line.colorbar.ticktext = np.unique(iris['species']).tolist()

It works in this exact case because the iris type strings happen to be sorted alphabetically. But in general you could end up with mismatched tick labels. See the species_df approach I used above for an alternative.
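
To illustrate the mismatch, here is a small sketch using a made-up frame (not the iris data) in which the ids were deliberately assigned in non-alphabetical order. It assumes numpy and pandas are installed; the toy `df` and the `naive_*` names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: ids do NOT follow the alphabetical order of the labels
df = pd.DataFrame({
    "species": ["virginica", "setosa", "versicolor", "setosa"],
    "species_id": [0, 1, 2, 1],
})

# np.unique sorts each column independently, so id 0 ends up next to 'setosa'
naive_vals = np.unique(df["species_id"]).tolist()
naive_text = np.unique(df["species"]).tolist()

# Taking one label per id keeps every (id, label) pair together
species_df = df.groupby("species_id", as_index=False)["species"].first()
tickvals = species_df["species_id"].tolist()
ticktext = species_df["species"].tolist()

print(list(zip(naive_vals, naive_text)))
print(list(zip(tickvals, ticktext)))
```

The first printed pairing labels id 0 as 'setosa' even though the frame assigns 0 to 'virginica'; the groupby pairing keeps the labels attached to their own ids.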

@jonmmease added the bug (something broken) and plotly.js labels on Mar 2, 2019
@Rabeez
Author

Rabeez commented Mar 2, 2019

Yup, using floats fixed colors for both the categorical and numeric variable 👍
And yes, I'm aware of the tickval thing :D It was a quick and dirty hack I put in, but your groupby approach will work much better.

@Rabeez
Author

Rabeez commented Mar 2, 2019

I am curious though: why does plotly.js have a different mechanism for setting up colors for a parcoords trace, given that the scatter plot works fine even with integer arrays?

@archmoj
Contributor

archmoj commented Mar 11, 2019

@jonmmease closed on plotly.js side: plotly/plotly.js#3595

@jonmmease added this to the v3.7.0 milestone on Mar 25, 2019
@jonmmease
Contributor

Fix released in plotly.js 1.45.1 which was included in plotly.py 3.7.0.
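
For code that has to run on both older and newer installs, a small guard can decide whether the float cast is still needed. This is a sketch that assumes the version string looks like '3.6.1'; `needs_float_cast` is a hypothetical helper, not part of plotly.

```python
import re


def needs_float_cast(plotly_version):
    """Return True for plotly.py versions older than 3.7.0."""
    # Keep the first three numeric components, e.g. '3.6.1' -> (3, 6, 1)
    nums = tuple(int(n) for n in re.findall(r"\d+", plotly_version)[:3])
    # Pad short versions such as '3.7' to (3, 7, 0) before comparing
    nums = nums + (0,) * (3 - len(nums))
    return nums < (3, 7, 0)


print(needs_float_cast("3.6.1"))
print(needs_float_cast("3.7.0"))
```

In a notebook this could be called as `needs_float_cast(plotly.__version__)`, applying `.astype('float')` to the color series only when it returns True.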
