-
Notifications
You must be signed in to change notification settings - Fork 3.1k
bug#10511 Add total orderings for Float and Double #6410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
review @Ichoran |
@SethTisue could you run a community build on this, to see how much code assumes that there is a non-ambiguous implicit |
0a05589
to
b69d410
Compare
queued: https://scala-ci.typesafe.com/job/scala-2.13.x-integrate-community-build/993/ (404 til Jenkins catches up) |
Problems:
scala-js is the one I'm most concerned about |
is two orderings (of the same type) in implicit scope better than no orderings in implicit scope? |
That can be addressed by adding the explicit |
@dwijnand having them both in scope means that users know what they are called and thus where their documentation is (hopefully helping them choose); if none were in scope, they wouldn't know where to start, and would just be confused why there's no |
@sjrd Because the sorts use The behaviour in 2.12 is that of |
@NthPortal interesting.. you're right. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I'm not sure if there's a way to get a custom message in there in case of the collision (that would be ideal), but I think it's a good compromise between discoverability and safety.
All the details look fine to me.
wouldn't |
@hrhino - Oh, yes! It would. I hadn't noticed that one. @NthPortal - Can you provide a more solution-oriented message using the annotation? (E.g. suggest passing the ordering directly, or explicitly importing the one that they mean in the context where they mean it?) |
I did not even know that annotation existed! I will absolutely work on coming up with a custom message for that |
Is the following message too long? "The behaviour of Double specified by IEEE is not consistent with a total ordering when dealing with NaN, so there are two orderings defined for Double: DoubleTotalOrdering, which is consistent with a total ordering, and DoubleIeeeOrdering, which is consistent as much as possible with IEEE spec and floating point operations defined in scala.math" |
@NthPortal - Yeah, that does seem too long, and it is missing the critical section that tells the receiver of the message what to do. The scaladoc on the instances should have the wordier explanation of the difference between the orderings. If you could come up with something more brief that would be great. The most important thing is to mention that the user needs to select (by import, assignment, or explicit passing) one of the two orderings. This may be the first intentional collision between implicits that the user sees, so we should walk them through solving it. |
"The behaviour of IEEE Doubles is not always consistent with a total ordering, so there are two orderings defined for Double. You can import the one which best suits your needs, or pass it explicitly as an argument." @Ichoran thoughts? I can add the names of the two instances to the message, but the compiler should list them anyway as part of the error message. |
Thinking of possibly holding off on merging this until #6468 is merged, as it will simplify the changes needed for this PR |
I'd say terser is better (image a few pages of this ambiguity in a build log somewhere), favouring info about how to resolve and deferring further context to another source. so perhaps something like:
maybe? </bikeshed> |
I like the heads-up about the problem, but I agree that terseness is a virtue. Maybe
(And the corresponding one for Then the documentation would give examples of each, and discuss the problem at more length. |
@dwijnand I have literally requested bikeshedding at this point xD |
I was mistaken - using |
The message used in the latest commit is: "There are multiple ways to order Doubles (Ordering.DoubleTotalOrdering, Ordering.DoubleIeeeOrdering). Specify one by using a local import, assigning an implicit val, or passing it explicitly. See the documentation for details." If you can think of any reasonable way to shorten that further, I'm happy to do so, but I can't think of a good way to do so without losing important information. |
lgtm! |
It is a bit of a misuse of |
@martijnhoekstra I would also like some warning annotations as alternatives to |
I'm not at all against, but don't think it needs to hold up this PR. |
Add total orderings for Float and Double, so that there are two implicit orderings for each in scope: one consistent with a total ordering, and one consistent with IEEE spec.
squashed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎈
just to show anyone interested what we actually ended up with:
|
I like that What do you think about the usability of this lengthy error message as the default interaction with sorting, Seth? It still feels nonideal to me, but having undetected wrong assumptions also seems quite nonideal. |
IMO "See the documentation for details" could use some precision - which documentation? The documentation of |
idk... there is no ideal solution to this. if it had been just up to me, I would probably have just provided the additional orderings, updated the documentation to refer to them, not deprecated anything, and call it good, and let those with higher standards use a linter to enforce them. but perhaps I am insufficiently devoted to lawfulness and correctness. |
I would love to cut down on the length of the error message. If anyone has ideas of how to do so without losing significant information, I would be 100% behind that.
@dwijnand sounds good to me. I'll wait a little to see if anyone has thoughts on shortening the message in general before PRing the change though. |
I think it's not necessary to say "Specify one by using a local import, assigning an implicit val, or passing it explicitly." That's general-purpose scala knowledge. Especially for a warning which may occur a lot of times. How about, If it's not obvious enough from the context that they're implicits, how about this: Anyway, a newbie has been told to see their docs. Perhaps a (shortened?) link could be included. Anyway the docs would have all the information a newbie needs. And for everyone else it should be self-explanatory. |
(the docs could remind the reader of the various ways of using implicit Orderings) |
Fixes scala/bug#11844 Ref scala/bug#10511 Ref scala#6410 Ref scala#76 This change the deprecation of `DeprecatedDoubleOrdering` to a migration warning instead to avoid `List(1.0, -1.0).sorted` giving deprecation warning. This also provides some documentation on the ordering instances in Scaladoc.
Fixes scala/bug#11844 Ref scala/bug#10511 Ref scala#6410 Ref scala#76 This change the deprecation of `DeprecatedDoubleOrdering` to a migration warning instead to avoid `List(1.0, -1.0).sorted` giving deprecation warning. This also provides some documentation on the ordering instances in Scaladoc.
…d if built with Scala 2.13 ### What changes were proposed in this pull request? This PR fixes an issue that the histogram and timeline aren't rendered in the `Streaming Query Statistics` page if we built Spark with Scala 2.13.   The reason is [`maxRecordRate` can be `NaN`](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala#L371) for Scala 2.13. The `NaN` is the result of [`query.recentProgress.map(_.inputRowsPerSecond).max`](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala#L372) when the first element of `query.recentProgress.map(_.inputRowsPerSecond)` is `NaN`. Actually, the comparison logic for `Double` type was changed in Scala 2.13. scala/bug#12107 scala/scala#6410 So this issue happens as of Scala 2.13. The root cause of the `NaN` is [here](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala#L164). This `NaN` seems to be an initial value of `inputTimeSec` so I think `Double.PositiveInfinity` is suitable rather than `NaN` and this change can resolve this issue. ### Why are the changes needed? To make sure we can use the histogram/timeline with Scala 2.13. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? First, I built with the following commands. ``` $ /dev/change-scala-version.sh 2.13 $ build/sbt -Phive -Phive-thriftserver -Pscala-2.13 package ``` Then, ran the following query (this is brought from #30427 ). ``` import org.apache.spark.sql.streaming.Trigger val query = spark .readStream .format("rate") .option("rowsPerSecond", 1000) .option("rampUpTime", "10s") .load() .selectExpr("*", "CAST(CAST(timestamp AS BIGINT) - CAST((RAND() * 100000) AS BIGINT) AS TIMESTAMP) AS tsMod") .selectExpr("tsMod", "mod(value, 100) as mod", "value") .withWatermark("tsMod", "10 seconds") .groupBy(window($"tsMod", "1 minute", "10 seconds"), $"mod") .agg(max("value").as("max_value"), min("value").as("min_value"), avg("value").as("avg_value")) .writeStream .format("console") .trigger(Trigger.ProcessingTime("5 seconds")) .outputMode("append") .start() ``` Finally, I confirmed that the timeline and histogram are rendered.  ``` Closes #30546 from sarutak/ss-nan. Authored-by: Kousuke Saruta <[email protected]> Signed-off-by: Jungtaek Lim (HeartSaVioR) <[email protected]>
Add total orderings for Float and Double, so that there
are two implicit orderings for each in scope: one
consistent with a total ordering, and one consistent with
IEEE spec.
Fixes scala/bug#10511