Skip to content

Bullet Chart FF #872

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
Nov 23, 2017
Merged

Bullet Chart FF #872

merged 22 commits into from
Nov 23, 2017

Conversation

Kully
Copy link
Contributor

@Kully Kully commented Nov 7, 2017

NEEDED: Tests

update: tests added

@Kully
Copy link
Contributor Author

Kully commented Nov 7, 2017

Screenshots w/ code examples:

import plotly
import plotly.plotly as py
import plotly.figure_factory as ff
import plotly.colors as colors

import pandas as pd

fig = ff.create_bullet(
    df, as_rows=True
)
py.iplot(fig)

screen shot 2017-11-07 at 3 24 16 pm

With Colorhunt theme:

screen shot 2017-11-07 at 3 28 38 pm

import plotly
import plotly.plotly as py
import plotly.figure_factory as ff
import plotly.colors as colors

import pandas as pd

fig = ff.create_bullet(
    df, as_rows=False, marker_size=30, marker_symbol='400',measure_colors=['#476268', '#40A798'],
    range_colors=['#F5E1DA', '#F1F1F1']
)
py.iplot(fig)

screen shot 2017-11-07 at 3 30 16 pm

@Kully
Copy link
Contributor Author

Kully commented Nov 8, 2017

@chriddyp ready for a review

@chriddyp
Copy link
Member

chriddyp commented Nov 8, 2017

It looks like we're using shapes to create these bars. Have you looked into using bar chart traces with width, offset, and base instead? Here's the PR that added it plotly/plotly.js#1075 and here is an example that I was making a couple of weeks ago for a separate project: https://codepen.io/chriddyp/pen/mBQVEy

@chriddyp
Copy link
Member

chriddyp commented Nov 8, 2017

The advantage of using bar charts with width and offset are:

  • Easier to load and style inside chart studio (https://plot.ly/create)
  • Easier for user to update their figure programatically
  • Hover labels

@Kully
Copy link
Contributor Author

Kully commented Nov 10, 2017

@chriddyp ready for a review now. The code has been reworked for bar traces 📊

plotly/colors.py Outdated

if colortype == 'rgb':
# covert back to rgb
inter_med_rgb = label_rgb(inter_med_tuple)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by rgb do you mean like a string like rgb(30, 20, 10)? if so, could we say convert back to an rgb string, e.g. rgb(30, 20, 10)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's what I mean. More clearer your way.

"""
Splits a low and high color into a list of n_colors colors in it

Accepts two color tuples and returns a list of n_colors colors
which form the intermediate colors between lowcolor and highcolor
from linearly interpolating through RGB space
from linearly interpolating through RGB space. If colortype is 'rgb'
the function will return a list of colors in the same form.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a typo? i thought tuple returned a list (well, a tuple) of colors?

Copy link
Contributor Author

@Kully Kully Nov 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see how there's confusion.

If colortype='tuple', lowcolor and highcolor must be what I have called tuple(-like) colors i.e. tuples of the form (a,b,c) where a, b, c are between 0 and 1, where (0,0,0) == rgb(0,0,0), etc.

If the input islowcolor=(0,0,0), highcolor=(1,1,1), n_colors=3 then the output is:

[(0, 0, 0), (0.5, 0.5, 0.5), (1, 1, 1)]

If colortype='rgb', then the list above will contain the rgb equivalent colors in a list

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this function will error if you input tuple-colors and colortype is set to rgb. Another thing to fix.

plotly/colors.py Outdated
@@ -520,6 +542,10 @@ def n_colors(lowcolor, highcolor, n_colors):
lowcolor[2] + (index * incr_2))
color_tuples.append(new_tuple)

if colortype == 'rgb':
# convert back to rgb
color_tuples = color_parser(color_tuples, label_rgb)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this say color_rgb rather than color_tuples?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should, but I kept it like that so that color_tuples is a common variable in both paths the function can take. Shall I change to list_of_colors for an apt name?

DataFrame. All keys must be one of 'title', 'subtitle', 'ranges',
'measures', and 'markers'.
:param (bool) as_rows: if True, the bars are placed horizontally as rows.
If False, the bars are placed vertically in the chart.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would orientation='v' | 'h' be more consistent with the rest of plotly here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes again. 👍


:param (pd.DataFrame | list) df: either a JSON list of dicts or a pandas
DataFrame. All keys must be one of 'title', 'subtitle', 'ranges',
'measures', and 'markers'.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unlike the rest of our figure factories, but I might be mistaken. It seems odd to require users to reformat their dataframes. Would it be better if it was something like:

create_bullet(df, title='column a', measures='column b', markers='column c')

?

In other words, let users pass in whatever dataframe they want but then have them specify the column mapping

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, if there are they are probably ones I wrote. This makes way more sense, great suggestion.

try:
r_is_nan = math.isnan(r)
if r_is_nan or r is None:
df[needed_key][idx] = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly here. if you need to change data, make a copy of their data and then mutate the copy

fig['layout'].update(
title=title,
height=height,
width=width,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this was layout_options then you could just pass in fig['layout'].update(layout_options)

)

def test_full_bullet(self):
df = pd.read_json('https://cdn.rawgit.com/plotly/datasets/master/BulletData.json')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we include this data in the repo itself? That way these tests don't break if that dataset is removed or if rawgit goes down.

fig = plotly.tools.make_subplots(
num_of_rows, num_of_cols, print_grid=False,
horizontal_spacing=subplot_spacing,
vertical_spacing=subplot_spacing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it make sense to provide horizontal_spacing and vertical_spacing as input arguments to ff.make_bullet instead? that way it's consistent with make_subplots

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it does!

:param (str) title: title of the bullet chart.
:param (float) height: height of the chart.
:param (float) width width of the chart.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we include a couple of simple examples in here?

@Kully
Copy link
Contributor Author

Kully commented Nov 22, 2017

@chriddyp bumped ver num and CHANGELOG update

ready for a final 💃


for k in scatter_options:
if k != 'marker':
markers[k] = scatter_options[k]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be written just like:

go.Scatter(
    x=x,
    y=y,
    name='markers',
    hoverinfo='...',
    xaxis='...',
    yaxis='...',
    **scatter_options
)

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch

of each subplot chart.
:param (bool) orientation: if 'h', the bars are placed horizontally as
rows. If 'v' the bars are placed vertically in the chart.
:param (int) marker_size: sets the size of the markers in the chart.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is outdated

rows. If 'v' the bars are placed vertically in the chart.
:param (int) marker_size: sets the size of the markers in the chart.
:param (str | int) marker_symbol: the symbol of the markers in the chart.
Default='diamond-tall'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this too

the rectangles for the range are drawn. These rectangles are meant to
be qualitative indicators against which the marker and measure bars
are compared.
Default=['rgb(198, 198, 198)', 'rgb(248, 248, 248)']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see rgb(248, 248, 248) anywhere else in the code.

Would it make sense to supply this value directly in the function instead of None? i.e.

def create_bullet(data, markers=None, measures=None, ranges=None,
                   subtitles=None, titles=None, orientation='h',
                   range_colors=('rgb(198, 198, 198)', 'rgb(248, 248, 248)',),
                   measure_colors=('rgb(31, 119, 280)', 'rgb(176, 196, 221)', ),
                   horizontal_spacing=None, vertical_spacing=None,
                   scatter_options={}, **layout_options):

(note that I'm using a tuple instead of a list because mutable default arguments aren't safe: http://docs.python-guide.org/en/latest/writing/gotchas/)

# validate df
if not pd:
raise exceptions.ImportError(
"'pandas' must be imported for this figure_factory."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"imported" - it's really installed, not imported, right? The user doesn't need to do a import pandas before they call this function do they?
If so, could we correct this typo in any other place that it might appear

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's correct, it just needs to be installed on their system

if isinstance(data, list):
if not all(isinstance(item, dict) for item in data):
raise exceptions.PlotlyError(
'If your data is a list, all entries must be dictionaries.'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's confusing to lead with a "If your data is a list" - the user might wonder if there data is indeed a list or not. What about something like "Every entry of the data argument (a list) must be a dictionary."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this:

'Every entry of the data argument (a list, tuple, etc) must be a dictionary.'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 very nice

[d[measures] for d in data] if measures else [[]] * len(data),
[d[ranges] for d in data] if ranges else [[]] * len(data),
],
index=col_names
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than looping through every element and reconstructing a list, I believe that you can just pass in your list of dictionaries directly into pandas:

In [2]: pd.DataFrame([{'x': 1, 'y': 3}, {'x': 3, 'y': 10}])
Out[2]:
   x   y
0  1   3
1  3  10

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, but I want to insert empty columns where a column name is not inserted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ex. if the user doesn't assign the titles variable to anything, I want a column named titles with empty strings in each cell of that column

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I leave this the same way it is? I think it will be more straightforward than doing pd.DataFrame(data) then adding in the empty columns afterwards.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

]

fig = ff.create_bullet(
data, titles='e', subtitles='d', markers='a', measures='b',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we use more descriptive names than e, d, a, b?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes can do. the point was to have titles that are different than the names of the columns that the second dataframe I construct in the figure factory.

@chriddyp
Copy link
Member

chriddyp commented Nov 22, 2017

💃 once my comments are addressed

@Kully
Copy link
Contributor Author

Kully commented Nov 23, 2017

What do you think also about allowing tuples as input arguments for the data?

There's no reason why not to allow this.

I was thinking of altering the part of code that checks if isinstance(data, list) to something like:

hasattr(data, index) for both lists and tuples

@chriddyp
Copy link
Member

What do you think also about allowing tuples as input arguments for the data?

To an extent, python is a duck-typed language so we don't always need to explicitly check the type as long as the object that is supplied implements the right methods. So lists, tuples, numpy arrays, (and others) might look and feel the same and should all be accepted.

Here is a solution that checks if the object is a Sequence which is the base class for things like lists and tuples: https://stackoverflow.com/a/37842328/4142536

@Kully Kully merged commit 2ffbe1a into master Nov 23, 2017
@Kully Kully deleted the bullet_charts branch November 23, 2017 23:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants