Commit f12b73d

mattansb and teunbrand authored
Expand on docs/example for cases with non-equal-width bins in stat_bin() (#6151)
* Docs/example for non-equal-width bins
* Update R/geom-histogram.R
  Co-authored-by: Teun van den Brand <[email protected]>
* Update R/geom-histogram.R
  Co-authored-by: Teun van den Brand <[email protected]>
* move (count / width) note to details

---------

Co-authored-by: Teun van den Brand <[email protected]>
1 parent 5e62f0c commit f12b73d

2 files changed: 36 additions, 0 deletions

R/geom-histogram.R

Lines changed: 18 additions & 0 deletions
@@ -17,6 +17,12 @@
 #' one change at a time. You may need to look at a few options to uncover
 #' the full story behind your data.
 #'
+#' By default, the _height_ of the bars represent the counts within each bin.
+#' However, there are situations where this behavior might produce misleading
+#' plots (e.g., when non-equal-width bins are used), in which case it might be
+#' preferable to have the _area_ of the bars represent the counts (by setting
+#' `aes(y = after_stat(count / width))`). See example below.
+#'
 #' In addition to `geom_histogram()`, you can create a histogram plot by using
 #' `scale_x_binned()` with [geom_bar()]. This method by default plots tick marks
 #' in between each bar.
@@ -63,6 +69,18 @@
 #' ggplot(diamonds, aes(price, after_stat(density), colour = cut)) +
 #'   geom_freqpoly(binwidth = 500)
 #'
+#'
+#' # When using the non-equal-width bins, we should set the area of the bars to
+#' # represent the counts (not the height).
+#' # Here we're using 10 equi-probable bins:
+#' price_bins <- quantile(diamonds$price, probs = seq(0, 1, length = 11))
+#'
+#' ggplot(diamonds, aes(price)) +
+#'   geom_histogram(breaks = price_bins, color = "black") # misleading (height = count)
+#'
+#' ggplot(diamonds, aes(price, after_stat(count / width))) +
+#'   geom_histogram(breaks = price_bins, color = "black") # area = count
+#'
 #' if (require("ggplot2movies")) {
 #'   # Often we don't want the height of the bar to represent the
 #'   # count of observations, but the sum of some other variable.

man/geom_histogram.Rd

Lines changed: 18 additions & 0 deletions
Some generated files are not rendered by default.
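
The `count / width` note added to the details can be checked numerically without drawing a plot. The following is a minimal sketch in base R, not part of the commit: it assumes simulated right-skewed data (the vector `x` is hypothetical and only stands in for `diamonds$price`) and uses `hist(plot = FALSE)` rather than ggplot2's `stat_bin()` to do the binning. It shows that dividing each count by its bin width gives bar heights whose areas recover the counts.

```r
# Hypothetical right-skewed data standing in for diamonds$price (assumption).
set.seed(1)
x <- rexp(1000, rate = 1 / 4000)

# 10 equi-probable bins, mirroring the commit's example.
breaks <- quantile(x, probs = seq(0, 1, length.out = 11), names = FALSE)

# Bin without plotting; hist() includes the lowest value by default.
h <- hist(x, breaks = breaks, plot = FALSE)

counts  <- h$counts          # what height = count would draw
widths  <- diff(h$breaks)    # unequal bin widths
heights <- counts / widths   # what after_stat(count / width) would draw

# Area of each rescaled bar (height * width) equals the bin's count.
areas <- heights * widths

print(data.frame(width = round(widths), count = counts, height = signif(heights, 3)))
```

With equi-probable bins the counts are nearly identical across bins, so a height-equals-count plot looks flat even though the data are strongly skewed; the rescaled heights fall as the bins widen, which is the shape the area-based plot in the commit's example is meant to reveal.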
