2.10 Commenting the answers

  1. Overlapping points problem
# Original plot – points overlap
ggplot(mpg, aes(cty, hwy)) +
  geom_point() +
  labs(title = "City vs Highway MPG (overlapping points)")

Problem:

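cty and hwy are recorded as whole numbers, so many cars share exactly the same (cty, hwy) pair. Those points are drawn on top of each other, which hides how many observations sit at each location.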

Solutions:

# Option 1: jitter points slightly
ggplot(mpg, aes(cty, hwy)) +
  geom_jitter(width = 0.2, height = 0.2) +
  labs(title = "City vs Highway MPG (jittered)")

# Option 2: size points by count
ggplot(mpg, aes(cty, hwy)) +
  geom_count() +
  labs(title = "City vs Highway MPG (point size = count)")

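💡 Tip: geom_jitter() adds a small amount of random noise so stacked points separate, while geom_count() keeps the exact positions and maps the number of observations at each location to point size.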
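
To see how much overplotting there is, the duplicated coordinates can be counted directly. This is a minimal sketch in base R, assuming ggplot2 is installed so that the mpg dataset is available; the names pair_counts, n_obs, and n_locations are illustrative only.

```r
library(ggplot2)  # provides the mpg dataset

# Count how many observations share each (cty, hwy) combination.
pair_counts <- as.data.frame(table(cty = mpg$cty, hwy = mpg$hwy))

# table() also lists combinations that never occur; drop those.
pair_counts <- pair_counts[pair_counts$Freq > 0, ]

n_obs <- nrow(mpg)                 # number of cars in the dataset
n_locations <- nrow(pair_counts)   # number of distinct plotted positions

cat("Observations:", n_obs, "\n")
cat("Distinct (cty, hwy) positions:", n_locations, "\n")

# The most crowded positions, i.e. where geom_count() draws its largest points.
head(pair_counts[order(-pair_counts$Freq), ])
```

The gap between the two numbers is what a plain geom_point() hides: every extra observation at an occupied position is invisible in the original plot.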

  2. Boxplot with alphabetically ordered classes
ggplot(mpg, aes(class, hwy)) +
  geom_boxplot() +
  labs(title = "Highway MPG by Vehicle Class (alphabetical order)")

Problem:

class is a character variable, so ggplot2 orders it alphabetically, which says nothing about fuel economy.

Example: “2seater” appears first and “suv” last, regardless of MPG values.
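
The default ordering can be checked outside the plot. A short sketch, again assuming the mpg dataset from ggplot2; default_levels is an illustrative name.

```r
library(ggplot2)  # provides the mpg dataset

# class is stored as a character vector, not a factor.
is.character(mpg$class)

# Converting to a factor sorts the levels alphabetically, which matches
# the left-to-right order of the boxes in the plot above.
default_levels <- levels(factor(mpg$class))
print(default_levels)
```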

  3. Reordering factor by data
# Reorder class by median highway MPG
ggplot(mpg, aes(reorder(class, hwy, FUN = median), hwy)) +
  geom_boxplot() +
  labs(
    x = "Vehicle Class (ordered by median hwy MPG)",
    title = "Highway MPG by Vehicle Class (reordered)"
  )

Explanation:

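reorder(class, hwy, FUN = median) → returns class as a factor whose levels are sorted by each class's median highway MPG, in ascending order. Without FUN, reorder() uses the mean.

Makes the plot more informative: the boxes run from lowest to highest median MPG, so classes are easy to compare.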
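
The effect of reorder() can be inspected without drawing anything. The sketch below assumes the mpg dataset from ggplot2; class_by_median and medians_in_level_order are illustrative names.

```r
library(ggplot2)  # provides the mpg dataset

# Build the same reordered factor that the plot uses on the x-axis.
class_by_median <- reorder(mpg$class, mpg$hwy, FUN = median)

# Levels are now sorted by median hwy instead of alphabetically.
levels(class_by_median)

# Median hwy per class, listed in the new level order.
medians_in_level_order <- tapply(mpg$hwy, class_by_median, median)
print(medians_in_level_order)
```

To put the most efficient classes first, one option is reorder(class, -hwy, FUN = median), since negating hwy reverses the ordering of the medians.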

💡 Tip: Always consider data-driven factor ordering for categorical variables in plots.