# ggplot2

The best library to plot data

## Alluvial diagrams

Requires the `ggalluvial` package, on top of `ggplot2` (*via* Mathieu Perona).
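As a minimal sketch of what a `ggalluvial` call can look like, the example below uses R's built-in `Titanic` dataset. The dataset and the choice of axes are assumptions for illustration only; they do not come from the notes above.

```r
library(ggplot2)
library(ggalluvial)

# Built-in Titanic contingency table, flattened to one row per combination
# of Class, Sex, Age and Survived, with the passenger count in Freq.
titanic <- as.data.frame(Titanic)

alluvial_plot <- ggplot(titanic,
                        aes(axis1 = Class, axis2 = Sex, axis3 = Survived, y = Freq)) +
  geom_alluvium(aes(fill = Survived)) +  # flows between consecutive axes
  geom_stratum() +                       # one box per category on each axis
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Class", "Sex", "Survived"))
alluvial_plot
```

Each `axisN` aesthetic adds one vertical axis, and `y` gives the width of the flows.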

Alluvial diagrams are like this (my own production):

You can also use `networkD3` with the function `sankeyNetwork` (*via* Antoine Belgodere): more here. An interactive example can be found here (*via* Pauline R.).
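A rough sketch of a `sankeyNetwork` call, on made-up data (the node names and values are hypothetical). Note that `networkD3` expects the source and target columns to be zero-based indices into the node table.

```r
library(networkD3)

# Hypothetical nodes and flows, for illustration only.
nodes <- data.frame(name = c("A", "B", "C"))
links <- data.frame(
  source = c(0, 0),   # zero-based row index in `nodes`
  target = c(1, 2),
  value  = c(10, 5)   # width of each flow
)

sankey <- sankeyNetwork(
  Links = links, Nodes = nodes,
  Source = "source", Target = "target",
  Value = "value", NodeID = "name"
)
sankey
```

The result is an HTML widget, so it is interactive in RStudio's viewer or in a rendered HTML document.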

## Bar plots

### Simple bar plot

Suppose you have the following data in `df`:

year | n |
---|---|
2017 | 381 |
2018 | 315 |

You want to plot the value of `n` for each `year` in a simple bar plot.

    plot <- ggplot(df, aes(x = year, y = n)) +
      geom_bar(stat = "identity")
    plot

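Do NOT forget to add `stat = "identity"` in `geom_bar()`. It tells `ggplot2` to use the values of `n` as the bar heights, instead of counting the rows for each `year`, which is what `geom_bar()` does by default (`stat = "count"`). `geom_col()` is a shortcut for the same thing.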
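The snippet below is a self-contained sketch of this case. It rebuilds `df` from the table above and checks with `layer_data()` that the bar heights are the values of `n`. Turning `year` into a factor is an extra step added here so the x axis shows only the two years.

```r
library(ggplot2)

# The example data from the table above.
df <- data.frame(year = c(2017, 2018), n = c(381, 315))

# stat = "identity": bar height is taken directly from `n`.
plot_identity <- ggplot(df, aes(x = factor(year), y = n)) +
  geom_bar(stat = "identity")

# geom_col() is the shortcut for the same plot.
plot_col <- ggplot(df, aes(x = factor(year), y = n)) +
  geom_col()

# layer_data() returns the data ggplot2 actually draws.
bar_heights <- layer_data(plot_identity)$y
col_heights <- layer_data(plot_col)$y
plot_identity
```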

### Percentage bar plot

    plot <- ggplot(df, aes(x = var1, y = (..count..)/sum(..count..))) +
      geom_bar()
    plot
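ggplot2 3.4.0 and later deprecate the `..count..` notation in favour of `after_stat(count)`. Here is a sketch of the same percentage plot in that newer style, on a small made-up `df` (the values of `var1` are hypothetical):

```r
library(ggplot2)

# Hypothetical data: three rows in category "a", one in "b".
df <- data.frame(var1 = c("a", "a", "a", "b"))

# after_stat(count) is the per-bar row count computed by stat_count();
# dividing by its sum turns counts into shares of the total.
plot <- ggplot(df, aes(x = var1, y = after_stat(count / sum(count)))) +
  geom_bar() +
  scale_y_continuous(labels = scales::percent)

# Shares actually drawn, one per bar.
shares <- layer_data(plot)$y
plot
```

`scale_y_continuous(labels = scales::percent)` only changes how the axis is labelled; the underlying values stay between 0 and 1.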

More sophisticated code, not tested yet:

    ggplot(df, aes(x = var1, group = var_group)) +
      geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat = "count") +
      geom_text(aes(label = scales::percent(..prop..), y = ..prop..),
                stat = "count", vjust = -.5) +
      labs(y = "Percent", fill = "var1") +
      facet_grid(~var_group) +
      scale_y_continuous(labels = scales::percent)

Requires the library `scales`.

(Same source)

### Plot several series in one single bar plot

    # stat = "count" (the default) computes ..count..; stat = "identity" would not
    ggplot(df, aes(x = var1, y = ..count.., fill = var2)) +
      geom_bar(stat = "count", position = "dodge")

If the variable used with `fill` is categorical but stored as numbers (years, for instance), make sure it is a *factor* (`as.factor(var2)`). Otherwise it may be treated as a continuous variable and displayed incorrectly.

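`(..count..)` is the number of rows computed by the count stat, so each bar shows how many rows have a given value of `var1`, for each value of `var2`. In ggplot2 3.4.0 and later, the preferred spelling is `after_stat(count)`.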
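A small self-contained sketch of the grouped case, on made-up data (the values of `var1` and `var2` are hypothetical). It converts `var2` to a factor first and relies on the default count stat, so no `y` aesthetic is needed.

```r
library(ggplot2)

# Hypothetical data: a category and a year stored as a number.
df <- data.frame(
  var1 = c("a", "a", "b", "b", "b"),
  var2 = c(2017, 2018, 2017, 2017, 2018)
)

# Convert the numeric year to a factor so `fill` gets discrete colours.
df$var2 <- as.factor(df$var2)

# One bar per (var1, var2) pair, placed side by side.
plot <- ggplot(df, aes(x = var1, fill = var2)) +
  geom_bar(position = "dodge")

# Row counts behind each bar.
counts <- layer_data(plot)$count
plot
```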

## Discussion