5.6 Statistical Summary Functions
Functions for calculating descriptive statistics and summaries.
Summary statistics for box plots
boxplot.stats() returns, in its $stats component, the extreme of the lower whisker, the lower 'hinge' (approximately Q1, the 1st quartile), the median, the upper 'hinge' (approximately Q3, the 3rd quartile) and the extreme of the upper whisker.
The lower whisker is the smallest data point within 1.5 × IQR of Q1, so it is never below the larger of min and Q1 - 1.5*IQR, where IQR = Q3 - Q1 is the interquartile range.
The upper whisker is the largest data point within 1.5 × IQR of Q3, so it is never above the smaller of max and Q3 + 1.5*IQR.
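The whisker rule can be checked by hand. The sketch below uses a small made-up vector (the values are purely illustrative) and assumes the default coef = 1.5; it computes the hinges with fivenum(), derives the two fences, and compares the result with boxplot.stats().

```r
# Illustrative data: nine evenly spaced values plus one extreme point
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)

# Tukey's hinges (elements 2 and 4 of the five-number summary)
hinges <- fivenum(x)[c(2, 4)]
iqr <- diff(hinges)

# Fences: points outside these limits are treated as outliers
lower_fence <- hinges[1] - 1.5 * iqr
upper_fence <- hinges[2] + 1.5 * iqr

# Whiskers: the most extreme data points that are still inside the fences
lower_whisker <- min(x[x >= lower_fence])
upper_whisker <- max(x[x <= upper_fence])

bs <- boxplot.stats(x)
bs$stats  # whisker, hinge, median, hinge, whisker
bs$out    # points beyond the fences
```

Here the upper whisker stops at the largest value inside the upper fence rather than at max(x), and the extreme point is reported in $out instead.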
box.stat <- function(col) {
  # Return a named vector of 6 statistics:
  # 1) the extreme of the lower whisker,
  # 2) the lower 'hinge', 3) the median, 4) the upper 'hinge',
  # 5) the extreme of the upper whisker,
  # and 6) the mean.
  x <- unlist(col)  # flatten a one-column data frame or list to a plain vector
  setNames(c(boxplot.stats(x)$stats, mean(x, na.rm = TRUE)),
           c("low.whisker", "1st.Q", "median", "3rd.Q", "upper.whisker", "mean"))
}
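As a usage sketch, a helper like this can be applied to several numeric columns at once with sapply(). The example below uses the built-in mtcars data set purely for illustration, and repeats a version of the helper so that the snippet runs on its own.

```r
# Helper repeated here so the example is self-contained
box.stat <- function(col) {
  x <- unlist(col)
  setNames(c(boxplot.stats(x)$stats, mean(x, na.rm = TRUE)),
           c("low.whisker", "1st.Q", "median", "3rd.Q", "upper.whisker", "mean"))
}

# One column of statistics per variable; the row names come from setNames()
res <- sapply(mtcars[c("mpg", "hp")], box.stat)
res
```

Because every call returns a named vector of the same length, sapply() simplifies the result to a 6 x 2 matrix with the statistic names as row names.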
Get boxplot and its stats
# boxplot of price grouped by make
library(dplyr)   # %>%
library(tibble)  # as_tibble(), add_column()
boxplot(price~make, sample_price2)$stats %>%
  as_tibble() %>%
  setNames(c("bmw","toyota")) %>%
  add_column("stat"=c("low.whisker", "1st.Q", "median", "3rd.Q", "upper.whisker"), .before=1)
setNames() returns the object with its names replaced, so the columns can be renamed inside the pipeline without a separate names(x) <- ... assignment.
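The pipeline above hard-codes the group names. A minimal variant, sketched below, takes them from the $names component that boxplot() returns alongside $stats. The data frame prices and its values are hypothetical stand-ins for sample_price2, and plot = FALSE is used so that only the statistics are computed.

```r
library(dplyr)   # %>%
library(tibble)  # as_tibble(), add_column()

# Hypothetical data standing in for sample_price2
prices <- data.frame(
  make  = rep(c("bmw", "toyota"), each = 5),
  price = c(30, 35, 40, 45, 50, 20, 22, 24, 26, 28)
)

# Compute the per-group statistics without drawing the plot
bp <- boxplot(price ~ make, data = prices, plot = FALSE)

stat_tbl <- as.data.frame(bp$stats) %>%
  setNames(bp$names) %>%   # one column per group, named after the group
  as_tibble() %>%
  add_column(stat = c("low.whisker", "1st.Q", "median", "3rd.Q", "upper.whisker"),
             .before = 1)
stat_tbl
```

Taking the names from bp$names keeps the column labels in the same order as the columns of bp$stats, whatever the groups are.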