# Color Palettes in RGB Space

## Introduction

I've recently been interested in how to communicate information using color. I don't know much about the field of Color Theory, but it's an interesting topic to me. The selection of color palettes, in particular, is something I've been faced with lately. I downloaded 18 different sequential color palettes from Cynthia Brewer's ColorBrewer2 website to use as suggested color palettes. I was struck by how differently some of these palettes move through the spectrum and wanted to poke around at quantifying some of that movement. This is the result of that analysis.

Using the 18 different sequential color palettes, I generated heatmaps to try to gauge the aesthetic appeal of each palette. Using Amazon's Mechanical Turk, I was able to ask workers to rate a palette on a scale from 1 to 5. I had each palette rated 20 times to generate a sufficient amount of data to start determining statistical significance. The 360 ratings were completed by 28 workers in a couple of hours.

This initial study didn't consider anything about how clear the heatmaps are, only how aesthetically pleasing they are. The same matrix, rendered in each of the 18 available palettes, is displayed below.

```r
source("../Correlation.R")
source("loadPalettes.R")

par(mfrow = c(6, 3), mar = c(2, 1, 1, 1))
for (i in 1:18) {
  palette <- allSequential[[i]]
  image(mat, col = rgb(palette/255), axes = FALSE, main = i)
}
```

## RGB-Space

One question I had was about the motion of a "visually attractive" color palette through the 3-dimensional space of all RGB values. My assumption was that most palettes can be represented as a straight line through this space.


### R^{2} Values

One way to quantify this is to calculate the first principal component of each palette, which represents the line in RGB space that best fits the palette. The R^{2} value can then be used to quantify how well the data aligns to this component. An R^{2} value of 1 indicates that the palette aligns perfectly to a straight line through RGB space.


```r
#' Compute the proportion of variance accounted for by the given number of
#' components
#'
#' @author Jeffrey D. Allen \email{jeff.allen@@trestletech.com}
propVar <- function(pca, components = 1) {
  # proportion of variance explained by the first component
  pv <- pca$sdev^2/sum(pca$sdev^2)[1]
  return(sum(pv[1:components]))
}

#' Calculate the R-squared values of all 18 color palettes
#'
#' @author Jeffrey D. Allen \email{jeff.allen@@trestletech.com}
calcR2Sequential <- function() {
  library(rgl)
  R2 <- list()
  for (i in which(sequential[, 2] == 9)) {
    palette <- sequential[i:(i + 8), 7:9]
    pca <- plotCols(palette$R, palette$G, palette$B, pca = 1)
    cat(i, ": ", propVar(pca, 1), "\n")
    R2[[length(R2) + 1]] <- propVar(pca, 1)
  }
  return(R2)
}
```
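
The `plotCols` helper called above isn't shown in this post, so here is a minimal, self-contained sketch of the same idea using base R's `prcomp`. The function name `linearity` and the two toy palettes (`grey` and `bent`) are made up for illustration; they are assumptions, not part of the original analysis or the ColorBrewer set.

```r
# Hypothetical helper: share of total variance captured by the first
# principal component of a palette (rows = colors, columns = R, G, B).
linearity <- function(palette) {
  pca <- prcomp(palette)            # centers the columns, does not rescale them
  pca$sdev[1]^2 / sum(pca$sdev^2)
}

# Toy palette 1 (assumption): nine greys on a straight line, white to black.
ramp <- seq(255, 0, length.out = 9)
grey <- data.frame(R = ramp, G = ramp, B = ramp)

# Toy palette 2 (assumption): white -> yellow -> red, a path with a
# right-angle bend in the middle.
bent <- data.frame(
  R = rep(255, 9),
  G = c(rep(255, 4), seq(255, 0, length.out = 5)),
  B = c(seq(255, 0, length.out = 5), rep(0, 4))
)

linearity(grey)  # a straight line, so essentially 1
linearity(bent)  # clearly below 1 because of the bend
```

A palette that never leaves its best-fit line scores 1, and any bend pushes some of the variance into the second and third components, which lowers the score.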

### Path Length

An alternative way to consider the palette in RGB space is to consider the "length" of the palette through RGB space, calculated by summing the distances between adjacent colors, each represented as a point in RGB space. My thought is that this may better capture the "movement" of a palette around this space. The R^{2} value doesn't encompass any notion of how much space is covered by a palette, only the arrangement of the colors relative to their principal component.

```r
#' Calculate the path length for all sequential palettes.
#'
#' @author Jeffrey D. Allen \email{jeff.allen@@trestletech.com}
calcPathLength <- function() {
  plen <- array(dim = sum(sequential[, 2] == 9, na.rm = TRUE))
  p <- 1
  for (i in which(sequential[, 2] == 9)) {
    palette <- sequential[i:(i + 8), 7:9]
    cat(i, ": ", getPathLength(palette), "\n")
    plen[p] <- getPathLength(palette)
    p <- p + 1
  }
  return(plen)
}

#' Calculate the length of a path through RGB space of a given palette.
#'
#' Sums the distance from all adjacent colors.
#' @author Jeffrey D. Allen \email{jeff.allen@@trestletech.com}
getPathLength <- function(palette) {
  pd <- apply(palette, 2, diff)
  pl <- sqrt(apply(pd^2, 1, sum))
  return(sum(pl))
}
```
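
As a quick sanity check on the path-length idea, the sketch below applies the same sum of adjacent distances to two toy palettes whose lengths can be worked out by hand. `pathLength` is a hypothetical, compact equivalent of `getPathLength` written for this example, and the `grey` and `bent` palettes are illustrative assumptions rather than palettes from the study.

```r
# Hypothetical compact equivalent of getPathLength(): the sum of Euclidean
# distances between consecutive colors (rows) of a palette.
pathLength <- function(palette) {
  steps <- diff(as.matrix(palette))   # one row of (dR, dG, dB) per step
  sum(sqrt(rowSums(steps^2)))
}

# Toy palette 1 (assumption): nine greys from white to black.
ramp <- seq(255, 0, length.out = 9)
grey <- data.frame(R = ramp, G = ramp, B = ramp)

# Toy palette 2 (assumption): white -> yellow -> red.
bent <- data.frame(
  R = rep(255, 9),
  G = c(rep(255, 4), seq(255, 0, length.out = 5)),
  B = c(seq(255, 0, length.out = 5), rep(0, 4))
)

pathLength(grey)  # diagonal of the RGB cube: 255 * sqrt(3), about 441.67
pathLength(bent)  # two cube edges: 255 + 255 = 510
```

The grey ramp runs straight along the cube's diagonal, while the bent palette travels further by following two edges. Incidentally, 441.7 is also the length reported for the last palette in the output below, which has an R^{2} of 1; that would be consistent with a white-to-black ramp, though the post doesn't name the palettes.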

### Comparison

Let's compare the R^{2} values to the path length values.

```r
r2 <- calcR2Sequential()
## 34 : 0.981
## 76 : 0.9097
## 118 : 0.9541
## 160 : 0.9847
## 202 : 0.9552
## 244 : 0.9663
## 286 : 0.9674
## 328 : 0.9273
## 370 : 0.9344
## 412 : 0.9593
## 454 : 0.9049
## 496 : 0.9007
## 538 : 0.9954
## 580 : 0.9752
## 622 : 0.9846
## 664 : 0.9292
## 706 : 0.9311
## 748 : 1
r2 <- unlist(r2)
names(r2) <- 1:18
pl <- calcPathLength()
## 34 : 404.8
## 76 : 455.2
## 118 : 377.5
## 160 : 407.5
## 202 : 418.1
## 244 : 400.5
## 286 : 389.5
## 328 : 405.6
## 370 : 430
## 412 : 420.7
## 454 : 424.7
## 496 : 430
## 538 : 342.9
## 580 : 374.1
## 622 : 400.2
## 664 : 400.3
## 706 : 430.6
## 748 : 441.7
plot(r2 ~ pl, main = "Path Length vs. R-Squared Value", xlab = "Path Length", ylab = "R-Squared Value")
abline(lm(r2 ~ pl), col = 2)
pv <- anova(lm(r2 ~ pl))$"Pr(>F)"[1]
```

The p-value (`0.0318`) indicates a significant negative correlation between the two variables, meaning that, as expected, the closer the palette stays to its principal component, the shorter the path through RGB space is (on average). So either scheme could be used to quantify a palette's non-linearity.

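
For anyone without the project's data files, the comparison can be roughly re-run from the values printed above. This is only a sketch: the vectors below are copied from that output, so they are rounded to the digits shown, and the resulting p-value should land close to, but not necessarily exactly on, the reported `0.0318`. The names `fit`, `slope`, and `rho` are my own.

```r
# R^2 and path length per palette, copied from the printed output
# (rounded to the digits shown there).
r2 <- c(0.9810, 0.9097, 0.9541, 0.9847, 0.9552, 0.9663, 0.9674, 0.9273, 0.9344,
        0.9593, 0.9049, 0.9007, 0.9954, 0.9752, 0.9846, 0.9292, 0.9311, 1.0000)
pl <- c(404.8, 455.2, 377.5, 407.5, 418.1, 400.5, 389.5, 405.6, 430.0,
        420.7, 424.7, 430.0, 342.9, 374.1, 400.2, 400.3, 430.6, 441.7)

fit   <- lm(r2 ~ pl)
slope <- unname(coef(fit)["pl"])    # sign gives the direction of the trend
rho   <- cor(r2, pl)                # Pearson correlation
pv    <- anova(fit)$"Pr(>F)"[1]     # F-test p-value for the slope

c(slope = slope, rho = rho, p.value = pv)
```

With a single predictor, the F-test from `anova` is equivalent to the usual test of the Pearson correlation, so `cor.test(r2, pl)` would be another way to get the same p-value.
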
## Color Ratings

The output of the Mechanical Turk trial is available in a stored file in this project. We'll read it in and filter out the peripheral information.

```r
colorPreference <- read.csv("../turk/output/Batch_790445_batch_results.csv", header = TRUE, stringsAsFactors = FALSE)
colorPreference <- colorPreference[, 28:29]
colorPreference[, 1] <- substr(colorPreference[, 1], 44, 100)
colorPreference[, 1] <- substr(colorPreference[, 1], 0, nchar(colorPreference[, 1]) - 4)
colnames(colorPreference) <- c("palette", "rating")
prefList <- split(colorPreference[, 2], colorPreference[, 1])
prefList <- prefList[order(as.integer(names(prefList)))]
```

We can then visualize the results, as well.

```r
boxplot(prefList, main = "Ratings of Color Palettes", xlab = "Palette Number", ylab = "Rating")
avgs <- sapply(prefList, mean)
```

We can then check to see if there's a significant association between the palette and the rating of the palette, or if we've just got noise.

```r
fit <- anova(lm(colorPreference[, 2] ~ as.factor(colorPreference[, 1])))
pv <- (fit$"Pr(>F)")[1]
```

With a p-value of `1.6911 × 10^{-4}`, you can see that there is a significant difference between the different palettes.
We can list out the palettes in order of visual appeal, as well:

```r
sort(sapply(prefList, mean), decreasing = TRUE)
##   14   13    4    6    3   15    1    7   18    5    8    2   16    9   10
## 4.30 4.10 3.85 3.75 3.65 3.65 3.55 3.45 3.45 3.40 3.40 3.35 3.15 3.10 3.10
##   17   12   11
## 2.95 2.85 2.65
```

```r
par(mfrow = c(6, 3), mar = c(2, 1, 1, 1))
for (i in order(avgs, decreasing = TRUE)) {
  palette <- allSequential[[i]]
  image(mat, col = rgb(palette/255), axes = FALSE, main = i)
}
```

### Warm Palettes

The first thing I noticed was that the cooler palettes were rated more highly than the warmer palettes. To try to quantify this, we can plot out the "redness" (the strength of the red channel in each palette) against the average rating.

```r
redness <- apply(sapply(allSequential, "[[", "R"), 2, mean)
plot(avgs ~ redness, main = "Warmth of Palette vs. Aesthetic Appeal", xlab = "\"Redness\"", ylab = "Average Aesthetic Rating")
abline(lm(avgs ~ redness), col = 2)
pv <- anova(lm(avgs ~ redness))$"Pr(>F)"[1]
```

Indeed, the p-value of this correlation is significant for this data (`3.6602 × 10^{-4}`), indicating that, among these palettes and in this context, cooler palettes are more visually appealing.

### R^{2} Values

We can calculate the R^{2} values for each palette as previously discussed and compare to see if they're associated with the aesthetic appeal of a palette.

```r
r2
##      1      2      3      4      5      6      7      8      9     10
## 0.9810 0.9097 0.9541 0.9847 0.9552 0.9663 0.9674 0.9273 0.9344 0.9593
##     11     12     13     14     15     16     17     18
## 0.9049 0.9007 0.9954 0.9752 0.9846 0.9292 0.9311 1.0000
plot(avgs ~ r2, main = "Linearity of Color Palette vs. Aesthetic Appeal", xlab = "R-squared Value", ylab = "Average Aesthetic Rating")
abline(lm(avgs ~ r2), col = 4)
pv <- anova(lm(avgs ~ r2))$"Pr(>F)"[1]
```

Again, the p-value of this association is significant (`3.2224 × 10^{-4}`). So it seems that adhering a color spectrum to a straight line through RGB-space is visually appealing.
Similarly, for the path length, the p-value is significant (though not as strongly as with the R^{2} values).

```r
anova(lm(avgs ~ pl))$"Pr(>F)"[1]
## [1] 0.001476
```
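
The p-values alone don't show which way each association runs, so here is a small sketch that recomputes both tests, along with the correlation signs, from numbers already printed in this post. The vectors are copied from the earlier output (average ratings reordered by palette number) and are rounded, so treat the results as approximate; the names `p_r2` and `p_pl` are my own.

```r
# Values copied from the printed output earlier in the post, by palette 1..18.
avgs <- c(3.55, 3.35, 3.65, 3.85, 3.40, 3.75, 3.45, 3.40, 3.10,
          3.10, 2.65, 2.85, 4.10, 4.30, 3.65, 3.15, 2.95, 3.45)
r2 <- c(0.9810, 0.9097, 0.9541, 0.9847, 0.9552, 0.9663, 0.9674, 0.9273, 0.9344,
        0.9593, 0.9049, 0.9007, 0.9954, 0.9752, 0.9846, 0.9292, 0.9311, 1.0000)
pl <- c(404.8, 455.2, 377.5, 407.5, 418.1, 400.5, 389.5, 405.6, 430.0,
        420.7, 424.7, 430.0, 342.9, 374.1, 400.2, 400.3, 430.6, 441.7)

cor(avgs, r2)   # positive: straighter palettes were rated higher
cor(avgs, pl)   # negative: longer paths were rated lower

p_r2 <- anova(lm(avgs ~ r2))$"Pr(>F)"[1]
p_pl <- anova(lm(avgs ~ pl))$"Pr(>F)"[1]
c(r2 = p_r2, path_length = p_pl)
```

Both results point the same way: among these 18 palettes, higher ratings go with straighter and shorter paths through RGB space.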

## Summary

This analysis answered a couple of questions for me. First, it showed that, in general, linear paths through RGB space create more aesthetically pleasing color palettes. Second, it demonstrated that, in this narrow study, palettes with cooler color schemes were preferred as more "aesthetically pleasing." Finally, it gave some concrete recommendations regarding which of the available color palettes to use if the goal is purely aesthetic.

### Future Work

Of course, aesthetics are not the only goal behind color palette selection. Generally, the goal of heatmaps such as these is to convey information. If no legend is given, we're hoping to convey the relative "strengths" of some phenomenon to the viewer. If a legend is given, we're additionally hoping to support some quantification of these data. So merely determining which color palettes are best to look at will likely not be the most important consideration in determining which palettes to use. We should do further analysis to determine which palettes convey such information most efficiently, and then likely make some compromise between efficient communication and aesthetics, depending on the application.

### Acknowledgements

- All analysis done in R
- All ratings generated using Amazon Mechanical Turk
- Interactive 3D graphics generated using rgl version 0.92.879
- Report generated using knitr
- Code hosted by GitHub in trestletech/RGB-Space
- Color palettes obtained from Cynthia Brewer's ColorBrewer2.0 service

### 5 Comments


beautiful keep it up, thanks

Thanks for the interesting post. Two comments:

(1) I wouldn’t expect the paths in RGB space to be particularly interesting because RGB does not capture how humans perceive color. The paths in HCL (polar LUV) space are more revealing. (Alternatively, polar LAB could also be interesting.) Most of the sequential color brewer palettes then have rather clear interpretations: luminance changes from light to dark monotonically, chroma (colorfulness) increases first but might also decrease again for darker colors, and the hue either stays in a rather small interval (around a blue say) or goes from yellow to red or something like that. These principles can also be abstracted and used to generate other color palettes which is what the functions in “colorspace” do.

(2) Aesthetics are (as you note) not really the main goal here. Especially in statistical graphs they can be a rather poor guide. Many viewers “like” graphics with a lot of flashy colors because they draw attention to a plot and are “interesting” without the need to convey information. However, such graphics are typically hard to look at for a longer period of time and it is often hard(er) to extract information from them. A simple example would be the heatmaps in Figure 1 of our RGBland paper [*] where many viewers like the upper left panel if they don’t (need to) know anything about the content of the picture. If, however, you really want to convey the _smooth_ density clusters, the panels on the right are much more suitable – albeit more boring at a short glance.

[*] Achim Zeileis, Kurt Hornik, Paul Murrell (2009). Escaping RGBland: Selecting Colors for Statistical Graphics. Computational Statistics & Data Analysis, 53(9), 3259-3270.

http://dx.doi.org/10.1016/j.csda.2008.11.033

(a preprint version is also linked on my web page)

Thanks for the tips. Your paper looks really interesting! I’ll have to spend some time digesting it.

I’m just getting into this field, so your pointers are very instructive. I may have to look into HCL color space and play around with those for my next analysis.

I’ve got a simple “readability” study in the pipeline that is aiming to figure out which palettes best communicate information (again on Turk); hopefully I’ll find some time to finish the analysis and prepare a write-up this summer. Of course, I’d appreciate your thoughts when I finally finish it up!

Excellent post! I wonder if there’s some kind of cultural bias in the aesthetic ratings, owing to the fact that Amazon’s Mechanical Turk… It’d be interesting to see an IP/Map vs. ColorCoolness plot, faceted by rating… But perhaps the amount of data isn’t enough to show such an effect…

Thanks for the kind words; I agree that would be an interesting study! I’m afraid that, on this particular data set, having data from only a couple dozen users may not be sufficient to build a reliable global map, but a similar study on Turk could certainly aim to capture a cultural bias in color palettes. I may have to take a look at that.