Commit c5ad978

Merge pull request #106 from plotly/parallel-category
Parallel category V4
2 parents b265e5d + e2f870c commit c5ad978

2 files changed: +286 -354 lines changed

python/parallel-categories-diagram.md

Lines changed: 286 additions & 0 deletions
@@ -0,0 +1,286 @@
---
jupyter:
  jupytext:
    notebook_metadata_filter: all
    text_representation:
      extension: .md
      format_name: markdown
      format_version: '1.1'
      jupytext_version: 1.2.1
  kernelspec:
    display_name: Python 3
    language: python
    name: python3
  language_info:
    codemirror_mode:
      name: ipython
      version: 3
    file_extension: .py
    mimetype: text/x-python
    name: python
    nbconvert_exporter: python
    pygments_lexer: ipython3
    version: 3.6.8
  plotly:
    description: How to make parallel categories diagrams in Python with Plotly.
    display_as: statistical
    has_thumbnail: true
    ipynb: ~notebook_demo/258
    language: python
    layout: user-guide
    name: Parallel Categories Diagram
    order: 10.3
    page_type: u-guide
    permalink: python/parallel-categories-diagram/
    thumbnail: thumbnail/parcats.jpg
    title: Python Parallel Categories | Plotly
---

#### Parallel Categories Diagram
The parallel categories diagram is a visualization of multi-dimensional categorical data sets. Each variable in the data set is represented by a column of rectangles, where each rectangle corresponds to a discrete value taken on by that variable. The relative heights of the rectangles reflect the relative frequency of occurrence of the corresponding value.

Combinations of category rectangles across dimensions are connected by ribbons, where the height of the ribbon corresponds to the relative frequency of occurrence of the combination of categories in the data set.
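
The two kinds of relative frequency described above can be sketched with the standard library alone. The five `(hair, eye)` records below are hypothetical and chosen only for illustration; the sketch shows what the rectangle and ribbon heights represent, not how Plotly computes them internally.

```python
from collections import Counter

# Hypothetical sample: one (hair, eye) pair per observation.
records = [
    ("Black", "Brown"),
    ("Black", "Brown"),
    ("Brown", "Brown"),
    ("Brown", "Blue"),
    ("Red", "Blue"),
]
total = len(records)

# Rectangle heights: relative frequency of each value within one dimension.
hair_counts = Counter(hair for hair, _ in records)
hair_freq = {value: n / total for value, n in hair_counts.items()}

# Ribbon heights: relative frequency of each combination across dimensions.
ribbon_freq = {combo: n / total for combo, n in Counter(records).items()}

print(hair_freq)
print(ribbon_freq)
```

Within a single dimension the rectangle frequencies sum to 1, and so do the ribbon frequencies across all combinations.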


#### Basic Parallel Categories Diagram with plotly.express

This example visualizes the restaurant bills of a sample of 244 people. Hovering over a category rectangle (sex, smoker, etc.) displays a tooltip with the number of people with that single trait. Hovering over a ribbon in the diagram displays a tooltip with the number of people with a particular combination of the five traits connected by the ribbon.


```python
import plotly.express as px

tips = px.data.tips()
fig = px.parallel_categories(tips)

fig.show()
```

#### Style Diagram
In this example, `dimensions` is a list of strings naming the data frame columns to display, and `labels` is a dictionary that maps each column name (string key) to the label displayed for it (string value). See the [Plotly Express reference page](https://www.plotly.express/plotly_express/#plotly_express.parallel_categories) for more information.

```python
import plotly.express as px

tips = px.data.tips()
fig = px.parallel_categories(tips, dimensions=['sex', 'smoker', 'day'],
                             color="size", color_continuous_scale=px.colors.sequential.Inferno,
                             labels={'sex': 'Payer sex', 'smoker': 'Smokers at the table', 'day': 'Day of week'})
fig.show()
```

#### Basic Parallel Categories Diagram
This example illustrates the hair color, eye color, and sex of a sample of 8 people. The dimension labels can be dragged horizontally to reorder the dimensions, and the category rectangles can be dragged vertically to reorder the categories within a dimension.

```python
import plotly.graph_objects as go

fig = go.Figure(go.Parcats(
    dimensions=[
        {'label': 'Hair',
         'values': ['Black', 'Black', 'Black', 'Brown', 'Brown', 'Brown', 'Red', 'Brown']},
        {'label': 'Eye',
         'values': ['Brown', 'Brown', 'Brown', 'Brown', 'Brown', 'Blue', 'Blue', 'Blue']},
        {'label': 'Sex',
         'values': ['Female', 'Female', 'Female', 'Male', 'Female', 'Male', 'Male', 'Male']}]
))

fig.show()
```

#### Basic Parallel Categories Diagram with Counts
If the frequency of occurrence for each combination of attributes is known in advance, it can be specified using the `counts` property.
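
When only raw observations are available, the unique combinations and their counts can be derived first. The following is a minimal sketch using the standard library, applied to the eight-person sample from the basic example; the variable names are illustrative and not part of the Plotly API.

```python
from collections import Counter

# Raw observations: one entry per person in each list.
hair = ['Black', 'Black', 'Black', 'Brown', 'Brown', 'Brown', 'Red', 'Brown']
eye = ['Brown', 'Brown', 'Brown', 'Brown', 'Brown', 'Blue', 'Blue', 'Blue']
sex = ['Female', 'Female', 'Female', 'Male', 'Female', 'Male', 'Male', 'Male']

# Count how often each (hair, eye, sex) combination occurs.
combos = Counter(zip(hair, eye, sex))
unique = list(combos)  # unique combinations, in order of first appearance
counts = [combos[combo] for combo in unique]

# One dimension per attribute, with one value per unique combination.
dimensions = [
    {'label': label, 'values': [combo[i] for combo in unique]}
    for i, label in enumerate(['Hair', 'Eye', 'Sex'])
]

print(counts)
print(dimensions)
```

The resulting `dimensions` and `counts` have the shape expected by `go.Parcats(dimensions=dimensions, counts=counts)`: one value per unique combination in every dimension, and one count per combination.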

```python
import plotly.graph_objects as go

fig = go.Figure(go.Parcats(
    dimensions=[
        {'label': 'Hair',
         'values': ['Black', 'Brown', 'Brown', 'Brown', 'Red']},
        {'label': 'Eye',
         'values': ['Brown', 'Brown', 'Brown', 'Blue', 'Blue']},
        {'label': 'Sex',
         'values': ['Female', 'Male', 'Female', 'Male', 'Male']}],
    counts=[6, 10, 40, 23, 7]
))

fig.show()
```

#### Multi-Color Parallel Categories Diagram
The color of the ribbons can be specified with the `line.color` property. Similar to other trace types, this property may be set to an array of numbers, which are then mapped to colors according to the colorscale specified in the `line.colorscale` property.

Here is an example of visualizing the survival rate of passengers in the Titanic dataset, where the ribbons are colored based on survival outcome.

By setting the `hoveron` property to `'color'` and the `hoverinfo` property to `'count+probability'`, the tooltips now display count and probability information for each color (survival outcome) per category.

By setting the `arrangement` property to `'freeform'`, it is now possible to drag categories horizontally to reorder dimensions as well as vertically to reorder categories within the dimension.

```python
import plotly.graph_objects as go
import pandas as pd

titanic_df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/titanic.csv")

# Create dimensions
class_dim = go.parcats.Dimension(
    values=titanic_df.Pclass,
    categoryorder='category ascending', label="Class"
)

gender_dim = go.parcats.Dimension(values=titanic_df.Sex, label="Gender")

survival_dim = go.parcats.Dimension(
    values=titanic_df.Survived, label="Outcome", categoryarray=[0, 1],
    ticktext=['perished', 'survived']
)

# Create parcats trace
color = titanic_df.Survived
colorscale = [[0, 'lightsteelblue'], [1, 'mediumseagreen']]

fig = go.Figure(data=[go.Parcats(dimensions=[class_dim, gender_dim, survival_dim],
                                 line={'color': color, 'colorscale': colorscale},
                                 hoveron='color', hoverinfo='count+probability',
                                 labelfont={'size': 18, 'family': 'Times'},
                                 tickfont={'size': 16, 'family': 'Times'},
                                 arrangement='freeform')])

fig.show()
```
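
To make the count and probability information more concrete, the sketch below computes, for one category and one color, the count together with a joint and two conditional probabilities. The records are hypothetical (not the real Titanic data), and the assumption that these are the probabilities of interest is ours; the exact quantities shown in the tooltip are not specified in this page.

```python
from collections import Counter

# Hypothetical records: (passenger class, survived flag) pairs.
records = [(1, 1), (1, 1), (1, 0), (3, 0), (3, 0), (3, 1), (3, 0), (2, 1)]
total = len(records)

pair_counts = Counter(records)
category_counts = Counter(category for category, _ in records)
color_counts = Counter(color for _, color in records)


def summarize(category, color):
    """Count and probabilities for one (category, color) pair."""
    n = pair_counts[(category, color)]
    return {
        'count': n,
        'p_joint': n / total,                                  # P(color and category)
        'p_color_given_category': n / category_counts[category],
        'p_category_given_color': n / color_counts[color],
    }


summary = summarize(category=3, color=0)
print(summary)
```

Here three of the eight hypothetical passengers are third-class passengers who perished, so the joint probability is 3/8, while the conditional probability of perishing given third class is 3/4.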

#### Parallel Categories Linked Brushing
This example demonstrates how the `on_selection` and `on_click` callbacks can be used to implement linked brushing between 3 categorical dimensions displayed with a `parcats` trace and 2 continuous dimensions displayed with a `scatter` trace.

This example also sets the `line.shape` property to `hspline` to cause the ribbons to curve between categories.

**Note:** In order for the callback functions to be executed, the figure must be a `FigureWidget`, and the widget itself must be displayed (here, as the last expression of the cell).

```python
import plotly.graph_objects as go
from ipywidgets import widgets
import pandas as pd
import numpy as np

cars_df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/imports-85.csv')

# Build parcats dimensions
categorical_dimensions = ['body-style', 'drive-wheels', 'fuel-type']

dimensions = [dict(values=cars_df[label], label=label) for label in categorical_dimensions]

# Build colorscale
color = np.zeros(len(cars_df), dtype='uint8')
colorscale = [[0, 'gray'], [1, 'firebrick']]

# Build figure as FigureWidget
fig = go.FigureWidget(
    data=[go.Scatter(x=cars_df.horsepower, y=cars_df['highway-mpg'],
                     marker={'color': 'gray'}, mode='markers', selected={'marker': {'color': 'firebrick'}},
                     unselected={'marker': {'opacity': 0.3}}), go.Parcats(
        domain={'y': [0, 0.4]}, dimensions=dimensions,
        line={'colorscale': colorscale, 'cmin': 0,
              'cmax': 1, 'color': color, 'shape': 'hspline'})
    ])

fig.update_layout(
    height=800, xaxis={'title': 'Horsepower'},
    yaxis={'title': 'MPG', 'domain': [0.6, 1]},
    dragmode='lasso', hovermode='closest')

# Update color callback
def update_color(trace, points, state):
    # Update scatter selection
    fig.data[0].selectedpoints = points.point_inds

    # Update parcats colors
    new_color = np.zeros(len(cars_df), dtype='uint8')
    new_color[points.point_inds] = 1
    fig.data[1].line.color = new_color

# Register callback on scatter selection...
fig.data[0].on_selection(update_color)
# and parcats click
fig.data[1].on_click(update_color)

fig
```
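
The core of the `update_color` callback is turning a list of selected point indices into a 0/1 color array. A dependency-free sketch of that step follows; the helper name `brush` is hypothetical and only mirrors the NumPy logic in the callback above, so it can be checked without a running widget.

```python
def brush(n_points, selected_indices):
    """Return a list of n_points color codes: 1 for selected points, 0 otherwise."""
    colors = [0] * n_points
    for i in selected_indices:
        colors[i] = 1
    return colors


# Six points, of which the points at index 1 and 4 are selected.
colors = brush(6, [1, 4])
print(colors)  # [0, 1, 0, 0, 1, 0]
```

With the two-stop colorscale used in the example, code 0 maps to gray and code 1 to firebrick.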

![Single-color brushing](https://user-images.githubusercontent.com/15064365/48087050-4a3eae80-e1cc-11e8-94c9-c0ffcedd3aa9.gif)


#### Parallel Categories with Multi-Color Linked Brushing
This example extends the previous example to support brushing with multiple colors. The toggle buttons displayed above the figure may be used to select the active color, and this color will be applied when points are selected in the `scatter` trace and when categories or ribbons are clicked in the `parcats` trace.

```python
import plotly.graph_objects as go
import ipywidgets as widgets
import pandas as pd
import numpy as np

cars_df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/imports-85.csv')

# Build parcats dimensions
categorical_dimensions = ['body-style', 'drive-wheels', 'fuel-type']

dimensions = [dict(values=cars_df[label], label=label) for label in categorical_dimensions]

# Build colorscale
color = np.zeros(len(cars_df), dtype='uint8')
colorscale = [[0, 'gray'], [0.33, 'gray'],
              [0.33, 'firebrick'], [0.66, 'firebrick'],
              [0.66, 'blue'], [1.0, 'blue']]
cmin = -0.5
cmax = 2.5

# Build figure as FigureWidget
fig = go.FigureWidget(
    data=[go.Scatter(x=cars_df.horsepower, y=cars_df['highway-mpg'],
                     marker={'color': color, 'cmin': cmin, 'cmax': cmax,
                             'colorscale': colorscale, 'showscale': True,
                             'colorbar': {'tickvals': [0, 1, 2], 'ticktext': ['None', 'Red', 'Blue']}},
                     mode='markers'),

          go.Parcats(domain={'y': [0, 0.4]}, dimensions=dimensions,
                     line={'colorscale': colorscale, 'cmin': cmin,
                           'cmax': cmax, 'color': color, 'shape': 'hspline'})]
)

fig.update_layout(height=800, xaxis={'title': 'Horsepower'},
                  yaxis={'title': 'MPG', 'domain': [0.6, 1]},
                  dragmode='lasso', hovermode='closest')

# Build color selection widget
color_toggle = widgets.ToggleButtons(
    options=['None', 'Red', 'Blue'],
    index=1, description='Brush Color:', disabled=False)

# Update color callback
def update_color(trace, points, state):
    # Compute new color array
    new_color = np.array(fig.data[0].marker.color)
    new_color[points.point_inds] = color_toggle.index

    with fig.batch_update():
        # Update scatter color
        fig.data[0].marker.color = new_color

        # Update parcats colors
        fig.data[1].line.color = new_color

# Register callback on scatter selection...
fig.data[0].on_selection(update_color)
# and parcats click
fig.data[1].on_click(update_color)

# Display figure
widgets.VBox([color_toggle, fig])
```
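
The colorscale in the example above uses pairs of stops with the same color to create three flat bands, and `cmin = -0.5` and `cmax = 2.5` place the integer codes 0, 1, and 2 at the center of each band. The sketch below checks this arithmetic; `normalize` and `band_color` are hypothetical helpers that model a linear mapping from `[cmin, cmax]` to `[0, 1]`, not Plotly functions.

```python
colorscale = [[0, 'gray'], [0.33, 'gray'],
              [0.33, 'firebrick'], [0.66, 'firebrick'],
              [0.66, 'blue'], [1.0, 'blue']]
cmin, cmax = -0.5, 2.5


def normalize(value):
    """Map a color value linearly from [cmin, cmax] to [0, 1]."""
    return (value - cmin) / (cmax - cmin)


def band_color(value):
    """Return the color of the flat band that contains the value."""
    t = normalize(value)
    # Stops come in (start, end) pairs that share one color.
    for (start, color), (end, _) in zip(colorscale[0::2], colorscale[1::2]):
        if start <= t <= end:
            return color
    raise ValueError(f"{value} is outside [cmin, cmax]")


for code, name in enumerate(['None', 'Red', 'Blue']):
    print(name, round(normalize(code), 3), band_color(code))
```

The codes 0, 1, and 2 normalize to roughly 0.167, 0.5, and 0.833, which fall inside the gray, firebrick, and blue bands respectively.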

![Multi-color brushing](https://user-images.githubusercontent.com/15064365/48087055-4e6acc00-e1cc-11e8-8f0c-c8316eb4def2.gif)


#### Reference
See [reference page](https://plot.ly/python/reference/#parcats) for more information and chart attribute options!
