Parcoords skipping all colors from the colorscale except min and max value #1442

Rabeez · 2019-02-28T10:32:47Z

I'm using plotly 3.6.1 in JupyterLab.

I'm trying to recreate the iris visualization from the plotly docs using the FigureWidget syntax.

I have this code so far:

fig = go.FigureWidget()
parcords = fig.add_parcoords(dimensions=[{'label':n.title(),
                                          'values':iris[n],
                                          'range':[0,8]} for n in iris.columns[:-2]])

fig.data[0].dimensions[0].constraintrange = [4,8]
parcords.line.color = iris['species_id']
parcords.line.colorscale = [[0,'#D7C16B'],[0.5,'#23D8C3'],[1,'#F3F10F']]
parcords.line.colorbar.tickvals = np.unique(iris['species_id']).tolist()
parcords.line.colorbar.ticktext = np.unique(iris['species']).tolist()
fig.layout.title = 'A Wild Parallel Coordinates Plot'
fig

which produces

The lines are identical to the ones in the example notebooks (doc) but the class corresponding to the 0.5 color scale value is shown with the incorrect color.

Changing the middle tick value (0.5) in the colorscale up or down does not reveal the third color either. Using the discrete colorscale (code below) also gives the same plot with a different (but correct) colorbar, with the middle class's lines incorrectly colored.

parcords.line.colorscale = [(0.0, '#D7C16B'),
                            (0.3333333333333333, '#D7C16B'),
                            (0.3333333333333333, '#23D8C3'),
                            (0.6666666666666666, '#23D8C3'),
                            (0.6666666666666666, '#F3F10F'),
                            (1.0, '#F3F10F')]
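
For reference, a stepped colorscale like the one above can be generated rather than typed by hand. This is only a minimal sketch: the helper name `discrete_colorscale` is hypothetical (not part of plotly), and it assumes each color should occupy an equal-width band of the [0, 1] range.

```python
def discrete_colorscale(colors):
    """Build a stepped colorscale where each color fills an equal band of [0, 1]."""
    n = len(colors)
    scale = []
    for i, color in enumerate(colors):
        scale.append((i / n, color))        # start of this color's band
        scale.append(((i + 1) / n, color))  # end of this color's band
    return scale


scale = discrete_colorscale(['#D7C16B', '#23D8C3', '#F3F10F'])
for stop in scale:
    print(stop)
```

Each interior boundary appears twice, once closing one band and once opening the next, which is what produces the hard color steps instead of a gradient.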

I have used a similar method to color data points based on a categorical variable with a scatter plot and it works perfectly fine.

fig = go.FigureWidget()
strace = fig.add_scatter(x=iris['sepal_length'], y=iris['sepal_width'],
                         mode='markers');

strace.marker.color = iris['species_id']
strace.marker.colorscale = [[0,'#D7C16B'],[0.5,'#23D8C3'],[1,'#F3F10F']]
strace.marker.colorbar.tickvals = np.unique(iris['species_id']).tolist()
strace.marker.colorbar.ticktext = np.unique(iris['species']).tolist()
fig

This produces the expected output with 3 colors.

Update: I tried to get a numeric variable mapped to the colors, so I created a dummy column called linear, which is just np.arange(0,len(iris)), and mapped it to the same colorscale:

fig = go.FigureWidget()
parcords = fig.add_parcoords(dimensions=[{'label':n.title(),
                                          'values':iris[n],
                                          'range':[0,8]} for n in iris.columns[:-3]])

fig.data[0].dimensions[0].constraintrange = [4,8]
parcords.line.color = iris['linear']
parcords.line.colorscale = [[0,'#D7C16B'],[0.5,'#23D8C3'],[1,'#F3F10F']]
parcords.line.colorbar.title = ''
fig.layout.title = 'A Wild Parallel Coordinates Plot'
fig

This makes

which is not what I expected: it looks like only the highest value is given the color corresponding to 1 in the colorscale property, and everything else gets the color corresponding to the lowest colorscale value (0 in this case).

I tried using a four-step colorscale

parcords.line.colorscale = [[0,'#D7C16B'],[0.33,'#23D8C3'],[0.67,'#F3A10F'],[1,'#F3F10F']]

and got the exact same plot as above, just with a different (and correct) colorbar. So everything but the top and bottom values in the colorscale is being skipped over.
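
To make the expected behavior concrete, here is a rough sketch of the mapping one would expect from a continuous colorscale. It is an illustration only, not Plotly.js's actual implementation: it assumes the color values are normalized linearly between their minimum and maximum and that colors are interpolated linearly in RGB between neighboring stops. The function names are hypothetical.

```python
def hex_to_rgb(h):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    h = h.lstrip('#')
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of ints to '#RRGGBB'."""
    return '#{:02X}{:02X}{:02X}'.format(*rgb)


def expected_color(value, cmin, cmax, colorscale):
    """Color a value should get under linear normalization and RGB interpolation."""
    t = (value - cmin) / (cmax - cmin)   # normalize to [0, 1]
    t = min(max(t, 0.0), 1.0)            # clamp out-of-range values
    for (p0, c0), (p1, c1) in zip(colorscale, colorscale[1:]):
        if p0 <= t <= p1:
            frac = 0.0 if p1 == p0 else (t - p0) / (p1 - p0)
            a, b = hex_to_rgb(c0), hex_to_rgb(c1)
            return rgb_to_hex(tuple(round(x + frac * (y - x)) for x, y in zip(a, b)))
    return colorscale[-1][1]             # fallback if the stops do not cover t


scale = [[0, '#D7C16B'], [0.5, '#23D8C3'], [1, '#F3F10F']]

# species_id takes the values 0, 1, 2, so the middle class should land on the 0.5 stop
print([expected_color(v, 0, 2, scale) for v in (0, 1, 2)])
```

Under these assumptions the middle class (value 1) normalizes to 0.5 and should receive '#23D8C3', which is the color missing from the parcoords plot.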


jonmmease · 2019-03-01T11:12:13Z

Hi @Rabeez, thanks for the report. Could you include a fully reproducible example with data loading and imports included? When I start with the example from the docs and work towards your example things seem to be working.

import plotly.graph_objs as go
import numpy as np
import pandas as pd 

df = pd.read_csv("https://raw.githubusercontent.com/bcdunbar/datasets/master/iris.csv")

species_df = df.groupby('species_id', as_index=False).species.first()

data = [
    go.Parcoords(
        line = dict(color = df['species_id'],
                    colorscale=[[0,'#D7C16B'],[0.5,'#23D8C3'],[1,'#F3F10F']],
                    colorbar={'tickvals': species_df['species_id'].tolist(),
                              'ticktext': species_df['species'].tolist()
                             },
                    showscale=True),
        dimensions = list([
            dict(range = [0,8],
                constraintrange = [4,8],
                label = 'Sepal Length', values = df['sepal_length']),
            dict(range = [0,8],
                label = 'Sepal Width', values = df['sepal_width']),
            dict(range = [0,8],
                label = 'Petal Length', values = df['petal_length']),
            dict(range = [0,8],
                label = 'Petal Width', values = df['petal_width'])
        ])
    )
]

layout = go.Layout()

fig = go.FigureWidget(data = data, layout = layout)
fig

Also please include the version of plotly.py that you're using. Thanks!

Rabeez · 2019-03-02T11:11:48Z

Here is a full isolated example, which I also just double checked (it skips the middle class).

Also I'm on plotly.py 3.6.1, I mentioned it in the original post but it probably got buried in the text. I've moved it to the top now.

jonmmease · 2019-03-02T11:41:52Z

Hi @Rabeez, thanks for taking the time to report this and share the full example. I was able to work out the issue by starting with your example.

It seems the problem is in Plotly.js's handling of integer TypedArrays as the line.color property of a parcoords trace. Here is the corresponding Plotly.js issue: plotly/plotly.js#3595.

In the meantime, the workaround is to cast the integer pandas series to a floating point series:

import numpy as np
import pandas as pd
import seaborn as sns
import plotly
import plotly.graph_objs as go

iris = sns.load_dataset('iris')
iris['species_id'] = pd.Categorical(iris['species']).codes
print(iris.shape)
iris.head()

fig = go.FigureWidget()
parcords = fig.add_parcoords(dimensions=[{'label':n.title(),
                                          'values':iris[n],
                                          'range':[0,8]} for n in iris.columns[:-3]])

fig.data[0].dimensions[0].constraintrange = [4,8]
parcords.line.color = iris['species_id'].astype('float')  # <---[HERE]
parcords.line.colorscale = [[0,'#D7C16B'],[0.5,'#23D8C3'],[1,'#F3F10F']]
parcords.line.colorscale = [(0.0, '#D7C16B'),
                            (0.3333333333333333, '#D7C16B'),
                            (0.3333333333333333, '#23D8C3'),
                            (0.6666666666666666, '#23D8C3'),
                            (0.6666666666666666, '#F3F10F'),
                            (1.0, '#F3F10F')]

parcords.line.colorbar.title = ''
parcords.line.colorbar.tickvals = np.unique(iris['species_id']).tolist()
parcords.line.colorbar.ticktext = np.unique(iris['species']).tolist()
fig.layout.title = 'A Wild Parallel Coordinates Plot'
fig

As a side note, I wouldn't recommend computing the tickvals/ticktext like this:

parcords.line.colorbar.tickvals = np.unique(iris['species_id']).tolist()
parcords.line.colorbar.ticktext = np.unique(iris['species']).tolist()

It works in this exact case because the iris type strings happen to be sorted alphabetically. But in general you could end up with mismatched tick labels. See the species_df approach I used above for an alternative.
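
A small illustration of how the mismatch can arise, using plain Python lists so it runs without any dependencies. The data here is made up for the example: the ids are assigned in an order that does not follow the alphabetical order of the names. `sorted(set(...))` stands in for `np.unique(...).tolist()`, which also returns sorted unique values.

```python
# Hypothetical rows of (species, species_id); ids do NOT follow alphabetical order
rows = [('virginica', 0), ('setosa', 1), ('versicolor', 2),
        ('setosa', 1), ('virginica', 0)]

species = [name for name, _ in rows]
species_id = [i for _, i in rows]

# Independent sorting: ids and names are each sorted on their own
tickvals_naive = sorted(set(species_id))
ticktext_naive = sorted(set(species))

# Paired approach: remember which name belongs to which id, then sort by id
id_to_name = {}
for name, i in rows:
    id_to_name.setdefault(i, name)
tickvals = sorted(id_to_name)
ticktext = [id_to_name[i] for i in tickvals]

print(list(zip(tickvals_naive, ticktext_naive)))  # id 0 is wrongly labeled 'setosa'
print(list(zip(tickvals, ticktext)))              # id 0 is correctly labeled 'virginica'
```

The paired version plays the same role as the `groupby('species_id')` approach shown earlier: the label is looked up from the id, never sorted separately.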

Rabeez · 2019-03-02T11:49:16Z

Yup, using floats fixed colors for both the categorical and numeric variable 👍
And yes I'm aware of the tickval thing :D it was a quick and dirty hack I put in, but your groupby approach will work much better.

Rabeez · 2019-03-02T11:50:07Z

I am curious though. Why does plotly.js have a different mechanism for setting up colors for a parcoords trace? Since the scatter plot works fine even with integer arrays.

archmoj · 2019-03-11T14:15:41Z

@jonmmease closed on plotly.js side: plotly/plotly.js#3595

jonmmease · 2019-03-25T22:57:21Z

Fix released in plotly.js 1.45.1 which was included in plotly.py 3.7.0.
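
For code that has to run on both older and newer installs, one option is to apply the float cast only when the installed plotly.py predates the fix. The following is a sketch under that assumption; `needs_float_cast` is a hypothetical helper, and the version string would typically come from `plotly.__version__`.

```python
import re


def needs_float_cast(plotly_version):
    """Return True if this plotly.py version predates 3.7.0, where the fix landed."""
    # Take the leading numeric components, e.g. '3.6.1' -> (3, 6, 1)
    parts = [int(p) for p in re.findall(r'\d+', plotly_version)[:3]]
    parts += [0] * (3 - len(parts))  # pad short versions such as '3.7'
    return tuple(parts) < (3, 7, 0)


for version in ('3.6.1', '3.7.0', '4.0.0'):
    print(version, needs_float_cast(version))
```

With such a check, the workaround would become something like `color = ids.astype('float') if needs_float_cast(plotly.__version__) else ids`, where `ids` is the integer series.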

Rabeez changed the title ~~Parcoords skipping a color from the colorscale for categorical variable~~ Parcoords skipping all colors from the colorscale except min and max value Feb 28, 2019

jonmmease added the bug (something broken) and plotly.js labels Mar 2, 2019

jonmmease added this to the v3.7.0 milestone Mar 25, 2019

jonmmease closed this as completed Mar 25, 2019
