Commit 9fa8326

add swarm plot to the scatter documentation
This is inspired by plotly#5087
1 parent 80d473e commit 9fa8326

1 file changed, +89 -0 lines changed

Diff for: doc/python/line-and-scatter.md

@@ -284,6 +284,95 @@ fig.update_traces(textposition="bottom right")
fig.show()
```

### Swarm (or Beeswarm) Plots

Swarm plots show the distribution of values in a column by drawing one dot per entry and offsetting each dot's y-value so that the dots do not overlap and are arranged symmetrically around the y=0 line. They complement histograms, box plots, and violin plots. This example could be generalized to a swarm plot for multiple categories by offsetting the y-coordinates of each category's dots.

```python
import collections

import pandas as pd
import plotly.express as px


def swarm(
    X_series,
    point_size=16,
    fig_width=800,
    gap_multiplier=1.2,
):
    # Sorting aligns the dots in attractive arcs rather than having columns
    # that vary unpredictably in the x-dimension.
    X_series = X_series.copy().sort_values()

    # The marker size is measured in px, so we treat each x-coordinate as a
    # fraction of the way from the minimum X value to the maximum X value.
    min_x = min(X_series)
    max_x = max(X_series)
    # Guard against division by zero when every value is identical.
    x_range = (max_x - min_x) or 1

    list_of_rows = []
    # Count the points in each "bin" (a vertical strip of the graph) so that
    # we can assign a y-coordinate that avoids overlapping.
    bin_counter = collections.Counter()

    for x_val in X_series:
        # Assign this x value to a bin number; each bin is a vertical strip
        # wide enough for one marker. (Named bin_index to avoid shadowing
        # the built-in bin().)
        bin_index = (fig_width * (x_val - min_x) / x_range) // point_size

        # Update the count of dots in that strip.
        bin_counter.update([bin_index])

        # If this is an odd-numbered entry in its bin, make its y-coordinate
        # negative. The first entry has y = 0, so entries 3, 5, and 7 get
        # negative y-coordinates.
        if bin_counter[bin_index] % 2 == 1:
            negative_1_if_count_is_odd = -1
        else:
            negative_1_if_count_is_odd = 1

        # The collision-free y-coordinate gives the items in a vertical bin
        # the offsets 0, 1, -1, 2, -2, 3, -3, ... so that they spread evenly
        # above and below the axis (bins with an even number of entries are
        # corrected below). Scaling by point_size * gap_multiplier converts
        # the offset to px.
        collision_free_y_coordinate = (
            (bin_counter[bin_index] // 2)
            * negative_1_if_count_is_odd
            * point_size
            * gap_multiplier
        )
        list_of_rows.append(
            {"x": x_val, "y": collision_free_y_coordinate, "bin": bin_index}
        )

    # If a bin holds an even number of points, move its y-coordinates down by
    # half a step to put an equal number of entries above and below the axis.
    for row in list_of_rows:
        if bin_counter[row["bin"]] % 2 == 0:
            row["y"] -= point_size * gap_multiplier / 2

    df = pd.DataFrame(list_of_rows)

    fig = px.scatter(
        df,
        x="x",
        y="y",
        hover_data=["x"],
    )
    # Suppress the y-coordinate in the hover text because it is only a layout
    # offset and would be irrelevant or misleading.
    fig.update_traces(
        marker_size=point_size,
        hovertemplate="<b>value</b>: %{x}",
    )
    # Set the width and height explicitly: we aim to avoid marker collisions,
    # and the marker size is specified in the same units (px) as the width
    # and height.
    fig.update_layout(
        width=fig_width,
        height=(point_size * max(bin_counter.values()) + 200),
    )
    fig.update_yaxes(
        showticklabels=False,  # Turn off y-axis labels
        ticks="",  # Remove the ticks
        title="",
    )
    fig.show()


df_iris = px.data.iris()  # iris is a pandas DataFrame
swarm(df_iris["sepal_length"])
```
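
The generalization to multiple categories mentioned above could be sketched roughly as follows. This is not part of the commit: `swarm_offsets`, `lane_height`, and `lane_center` are hypothetical names chosen for illustration, and the choice to give each category a lane one step taller than the fullest strip is an assumption. The sketch only computes coordinates using the standard library, reusing the same bin-and-alternate rule as `swarm`, and leaves the plotting out.

```python
import collections


def swarm_offsets(values, categories, point_size=16, fig_width=800, gap_multiplier=1.2):
    """Compute collision-avoiding (x, y) positions, one horizontal lane per category.

    Hypothetical helper: `values` and `categories` are equal-length sequences.
    """
    min_x, max_x = min(values), max(values)
    x_range = (max_x - min_x) or 1  # avoid dividing by zero for constant data
    step = point_size * gap_multiplier  # vertical distance between neighbors

    # Count points per (category, vertical strip) and alternate below/above.
    counter = collections.Counter()
    rows = []
    for x_val, cat in sorted(zip(values, categories), key=lambda pair: pair[0]):
        key = (cat, (fig_width * (x_val - min_x) / x_range) // point_size)
        counter[key] += 1
        count = counter[key]
        sign = -1 if count % 2 == 1 else 1
        rows.append(
            {"x": x_val, "category": cat, "key": key, "y": (count // 2) * sign * step}
        )

    # Assumption: every lane is one step taller than the fullest strip.
    lane_height = step * (max(counter.values()) + 1)
    lane_center = {
        cat: i * lane_height for i, cat in enumerate(sorted(set(categories)))
    }

    for row in rows:
        if counter[row["key"]] % 2 == 0:
            row["y"] -= step / 2  # re-center strips holding an even number of dots
        row["y"] += lane_center[row["category"]]
        del row["key"]  # internal bookkeeping only
    return rows
```

The returned rows can be wrapped in `pd.DataFrame(rows)` and passed to `px.scatter` with `x="x"`, `y="y"`, and `color="category"`; the figure height would then need to grow with the number of categories.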

## Scatter and line plots with go.Scatter

If Plotly Express does not provide a good starting point, it is possible to use [the more generic `go.Scatter` class from `plotly.graph_objects`](/python/graph-objects/). Whereas `plotly.express` has two functions, `scatter` and `line`, `go.Scatter` can be used for plotting either points (markers) or lines, depending on the value of `mode`. The different options of `go.Scatter` are documented in its [reference page](https://plotly.com/python/reference/scatter/).
