Skip to content

Add Aggregation user input _index #2031

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 2, 2017
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 13 additions & 1 deletion src/plots/plots.js
Original file line number Diff line number Diff line change
Expand Up @@ -1039,9 +1039,21 @@ plots.supplyTransformDefaults = function(traceIn, traceOut, layout) {
_module = transformsRegistry[type],
transformOut;

/*
* Supply defaults may run twice. First pass runs all supply defaults steps
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Questions:

  • I assume this refers to the way supplyTraceDefaults runs on traces and then again on expanded traces?
  • Which _module is added? I think that sometimes refers to the trace module and sometimes to the transform module (and every now and then to the base plot module, though usually it's called baseModule or something, I think)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Answer to second: the transform module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that is what it is referring to

* and adds the _module to any output transforms.
* If transforms exist another pass is run so that any generated traces also
* go through supply defaults. This has the effect of rerunning
* supplyTransforms. If the transform does not have a `transform` function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*supplyTransformDefaults?

* it could not have generated any new traces and the second stage is
* unnecessary. We detect this case with the following variables.
*/
var isSecondStage = transformIn._module && transformIn._module === _module,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a third stage possible if multiple transforms expand things multiple times? (e.g. groupby + groupby) I think this makes sense to me, but just making sure transform module identity equates to isSecondStage. (Or is it isNotFirstStage?) My understanding is just a little loose here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, more than 2 stages... need to think about that. Suggestions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right I think we are ok. I should call this variable isNotFirstStage though. Thanks

Copy link
Contributor

@rreusser rreusser Sep 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things get fundamentally a bit shaky after one level. Like groupby + groupby. It does mostly what you'd expect, but it turns out styling is the big challenge. Think SQL. You massage the query until you have a nice query that spits out data. Then you insert it into a plot and you're happy. Plotly's transforms incrementally apply styles as a sequence of transforms is applied so that you'd have to track styles as well as the pathway a data point took through the query to really sort things out after the fact. Though I agree that I think we're okay.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, specifics: groupby adds a _group property to traces so that you can figure out which group split generated a trace. But it should really be tracking that as a transform index and a group value. This comes into play with transform vs. calcTransform since you start having to worry about when a trace was split/grouped/styled/etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies if that's still a bit loose on the specifics. /cc @alexcjohnson since we've talked this through before in a fair bit of detail. I think our conclusion at the time was that the basic use cases are handled just fine and account for 97+% of realistic use-cases.

doSecondStage = _module && typeof _module.transform === 'function';

if(!_module) Lib.warn('Unrecognized transform type ' + type + '.');

if(_module && _module.supplyDefaults) {
if(_module && _module.supplyDefaults && (!isSecondStage || doSecondStage)) {
transformOut = _module.supplyDefaults(transformIn, traceOut, layout, traceIn);
transformOut.type = type;
transformOut._module = _module;
Expand Down
35 changes: 17 additions & 18 deletions src/transforms/aggregate.js
Original file line number Diff line number Diff line change
Expand Up @@ -155,31 +155,29 @@ exports.supplyDefaults = function(transformIn, traceOut) {
arrayAttrs[groups] = 0;
}

var aggregationsIn = transformIn.aggregations;
var aggregationsIn = transformIn.aggregations || [];
var aggregationsOut = transformOut.aggregations = new Array(aggregationsIn.length);
var aggregationOut;

function coercei(attr, dflt) {
return Lib.coerce(aggregationsIn[i], aggregationOut, aggAttrs, attr, dflt);
}

if(aggregationsIn) {
for(i = 0; i < aggregationsIn.length; i++) {
aggregationOut = {};
var target = coercei('target');
var func = coercei('func');
var enabledi = coercei('enabled');

// add this aggregation to the output only if it's the first instance
// of a valid target attribute - or an unused target attribute with "count"
if(enabledi && target && (arrayAttrs[target] || (func === 'count' && arrayAttrs[target] === undefined))) {
if(func === 'stddev') coercei('funcmode');

arrayAttrs[target] = 0;
aggregationsOut[i] = aggregationOut;
}
else aggregationsOut[i] = {enabled: false};
for(i = 0; i < aggregationsIn.length; i++) {
aggregationOut = {_index: i};
var target = coercei('target');
var func = coercei('func');
var enabledi = coercei('enabled');

// add this aggregation to the output only if it's the first instance
// of a valid target attribute - or an unused target attribute with "count"
if(enabledi && target && (arrayAttrs[target] || (func === 'count' && arrayAttrs[target] === undefined))) {
if(func === 'stddev') coercei('funcmode');

arrayAttrs[target] = 0;
aggregationsOut[i] = aggregationOut;
}
else aggregationsOut[i] = {enabled: false, _index: i};
}

// any array attributes we haven't yet covered, fill them with the default aggregation
Expand All @@ -188,7 +186,8 @@ exports.supplyDefaults = function(transformIn, traceOut) {
aggregationsOut.push({
target: arrayAttrArray[i],
func: aggAttrs.func.dflt,
enabled: true
enabled: true,
_index: -1
});
}
}
Expand Down
56 changes: 56 additions & 0 deletions test/jasmine/tests/transform_aggregate_test.js
Original file line number Diff line number Diff line change
Expand Up @@ -190,6 +190,26 @@ describe('aggregate', function() {
expect(traceOut.marker.size).toEqual([10, 20]);
});

// Regression test - throws before fix:
// https://github.com/plotly/plotly.js/issues/2024
it('can handle case where aggregation array is missing', function() {
Plotly.newPlot(gd, [{
x: [1, 2, 3, 4, 5],
y: [2, 4, 6, 8, 10],
marker: {size: [10, 10, 20, 20, 10]},
transforms: [{
type: 'aggregate',
groups: 'marker.size'
}]
}]);

var traceOut = gd._fullData[0];

expect(traceOut.x).toEqual([1, 3]);
expect(traceOut.y).toEqual([2, 6]);
expect(traceOut.marker.size).toEqual([10, 20]);
});

it('handles median, mode, rms, & stddev for numeric data', function() {
// again, nothing is going to barf with non-numeric data, but sometimes it
// won't make much sense.
Expand Down Expand Up @@ -225,4 +245,40 @@ describe('aggregate', function() {
expect(traceOut.marker.line.width).toBeCloseToArray([0.5, 0], 5);
expect(traceOut.marker.color).toBeCloseToArray([Math.sqrt(1 / 3), 0], 5);
});

it('links fullData aggregations to userData via _index', function() {
Plotly.newPlot(gd, [{
x: [1, 2, 3, 4, 5],
y: [2, 4, 6, 8, 10],
marker: {
size: [10, 10, 20, 20, 10],
color: ['red', 'green', 'blue', 'yellow', 'white']
},
transforms: [{
type: 'aggregate',
groups: 'marker.size',
aggregations: [
{target: 'x', func: 'sum'},
{target: 'x', func: 'avg'},
{target: 'y', func: 'avg'},
]
}]
}]);

var traceOut = gd._fullData[0];
var fullAggregation = traceOut.transforms[0];
var fullAggregations = fullAggregation.aggregations;
var enabledAggregations = fullAggregations.filter(function(agg) {
return agg.enabled;
});

expect(enabledAggregations[0].target).toEqual('x');
expect(enabledAggregations[0]._index).toEqual(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way to test this that doesn't rely upon an underscored property? Like could supplyTransformDefaults delete the first aggregation if it's overridden by another? Would that be adequate to sort things out in the workspace or does the workspace need to know indices?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason for indicies is for Workspace. supplyTransformDefaults doesn't care about them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need a way in Workspace to take a fullData transform Aggregration and write back to the correct userData index. I need some way to correlate userData indicies with the fullData equivalents.

Copy link
Member Author

@bpostlethwaite bpostlethwaite Sep 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it shouldn't be underscored? Something like .inputIndex. the only problem with that is there are precedents for things being added to fullData for consumption by users that have an underscore. _input for example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the workspace accesses indices directly? (That's okay, just making sure. plotly.js's 'private' properties are fairly publicly accessed, so it's nothing new. Just means they need to be modified with great care.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, and the workspace is the only thing that needs them. Plotly.js doesn't care as it rebuilds new arrays each time.


expect(enabledAggregations[1].target).toEqual('y');
expect(enabledAggregations[1]._index).toEqual(2);

expect(enabledAggregations[2].target).toEqual('marker.color');
expect(enabledAggregations[2]._index).toEqual(-1);
});
});