Skip to content

More robust Sankey #2629

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 11, 2018
Merged

More robust Sankey #2629

merged 3 commits into from
May 11, 2018

Conversation

alexcjohnson
Copy link
Collaborator

Make Sankey more accepting of messy data (fixing #2480)

  • Numeric strings are OK
  • Zero (and negative) link values get ignored completely. Note: this is not what I proposed in Sankey nodes overlap by default #2480 (comment) - thinking a bit more about it, links with no value should really be not links at all, and nodes with no size shouldn't be nodes at all. If you really want a node & link to show up with zero value, create one with an explicitly tiny value.
  • Unused nodes and malformed links are ignored as well. Since there are now various places this can happen and in other contexts we simply ignore partially-invalid data points, I also removed the logged error on missing nodes - which happened during supplyDefaults, so was called frequently, and had a fairly slow implementation so could be a drag on large Sankey traces.

cc @etpinard

@@ -437,6 +437,8 @@ function recurseIntoValObject(valObject, parts, i) {
return valObject;
}

// note: this is different from Lib.isIndex, this one doesn't accept numeric
// strings, only actual numbers.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially made isIndex into a Lib function thinking I could reuse it here... but in this context it's in principle important that an index is actually a number, to distinguish an array index from an object property that happens to be a number. That said I don't think it's possible to use a number as an object property in nestedProperty...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh we could use this for validating selectedpoints items as done here:

plotly.js/src/lib/index.js

Lines 497 to 499 in a132b85

function isPtIndexValid(v) {
return isNumeric(v) && v >= 0 && v % 1 === 0;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... by this I mean Lib.isIndex

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call -> cdcd620

We probably either have or should have this in other places too... mesh3d is the other obvious place, but it has bigger issues -> #2630

links[i].source = nodeIndices[links[i].source];
links[i].target = nodeIndices[links[i].target];
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the most part it didn't cause problems to have extra unused nodes in the list, but sometimes the layout algorithm would reserve extra space for these nodes (leaving extra margins or making the whole diagram S-shaped, for example) so it's better to remove them. The sankey_messy mock includes a whole bunch of extra nodes both before (to test this link reassignment) and after the real ones to test this.

@@ -14,13 +14,16 @@ var wrap = require('../../lib/gup').wrap;

function circularityPresent(nodeList, sources, targets) {

var nodes = nodeList.map(function() {return [];});
var nodes = nodeList.map(function() { return []; });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to reuse:

exports.init2dArray = function(rowLength, colLength) {
var array = new Array(rowLength);
for(var i = 0; i < rowLength; i++) array[i] = new Array(colLength);
return array;
};

describe('No warnings for missing nodes', function() {
// we used to warn when some nodes were not used in the links
// not doing that anymore, it's not really consistent with
// the rest of our data processing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nicely done 👌

@etpinard
Copy link
Contributor

Great PR 💃

@etpinard
Copy link
Contributor

... my comments are non-blocking.

@alexcjohnson alexcjohnson merged commit 9d61443 into master May 11, 2018
@alexcjohnson alexcjohnson deleted the sankey-overlap-fix branch May 11, 2018 21:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug something broken
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants