economics:r:ggplot2 [2019/04/30 16:22] Olivier Simard-Casanova |
— (current) | ||
---|---|---|---|
- | <panel type="default"> | ||
- | {{page>snippets:in-construction&noheader&nofooter}} | ||
- | <lead> | ||
- | ====== ggplot2 ====== | ||
- | The best library to plot data | ||
- | </lead> | ||
- | |||
- | </panel> | ||
- | |||
- | - [Types of graph that ggplot2 can draw](https://ggplot2.tidyverse.org/reference/index.html#section-layer-geoms) | ||
- | |||
- | ## Alluvial diagrams | ||
- | |||
- | Requires the `ggalluvial` [package](https://cran.r-project.org/web/packages/ggalluvial/vignettes/ggalluvial.html), on top of `ggplot2` (*via* [Mathieu Perona](https://twitter.com/MathieuPerona/status/1120994707746766848)). | ||
- | |||
- | [Alluvial diagrams](https://en.wikipedia.org/wiki/Alluvial_diagram) look like this (my own work): | ||
- | |||
- | {{ :economics:r:alluvial.png?direct |}} | ||
- | |||
- | You can also use `networkD3` with the function `sankeyNetwork` (*via* [Antoine Belgodere](https://twitter.com/ABelgo_optimum/status/1120996218086273024)): more [here](https://christophergandrud.github.io/networkD3/). An interactive example can be found [here](https://www.r-graph-gallery.com/sankey-diagram/) (*via* [Pauline R.](https://twitter.com/NutriPrudens/status/1120999342419140608)). | ||
- | |||
- | ## Bar plots | ||
- | |||
- | - [In French](http://www.sthda.com/french/wiki/ggplot2-barplots-guide-de-demarrage-rapide-logiciel-r-et-visualisation-de-donnees) | ||
- | |||
- | ### Simple bar plot | ||
- | |||
- | Suppose you have the following data in `df`: | ||
- | |||
- | ^ year ^ n ^ | ||
- | | 2017 | 381 | | ||
- | | 2018 | 315 | | ||
- | |||
- | You want to plot the value of `n` for each `year` in a simple bar plot. | ||
- | |||
- | <code:r> | ||
- | plot <- ggplot(df, aes(x = year, y = n)) + | ||
- | geom_bar(stat = "identity") | ||
- | plot | ||
- | </code> | ||
- | |||
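- | Do NOT forget to add `stat = "identity"` in `geom_bar()`. It tells `ggplot2` to use the values of `n` directly as the bar heights, instead of counting the rows for each `year` (the default behaviour of `geom_bar()`). `geom_col()` is a shorthand for `geom_bar(stat = "identity")`. | ||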
- | |||
- | <imgcaption output_simple_bar_plot | Here's the result of the aforementioned code. Of course you can customize its look if you are not happy with the colours, and so on.>{{ economics:r:screenshot_2018-10-29_at_17.00.43.png?direct&400 |}}</imgcaption> | ||
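The simple bar plot can be reproduced end to end with the sketch below. It rebuilds the two-row table by hand and uses `geom_col()` in place of `geom_bar(stat = "identity")`. Storing `year` as a factor and the `bar_heights` variable are assumptions added for illustration; they are not part of the original snippet.

```r
library(ggplot2)

# Same data as the table above; year is stored as a factor (an assumption)
# so that the x axis shows two discrete bars rather than a continuous scale.
df <- data.frame(
  year = factor(c(2017, 2018)),
  n    = c(381, 315)
)

# geom_col() is geom_bar(stat = "identity"): bar height = value of n.
plot <- ggplot(df, aes(x = year, y = n)) +
  geom_col()

# Heights that ggplot2 will draw, one per bar (illustrative check).
bar_heights <- ggplot_build(plot)$data[[1]]$y
bar_heights
```

Printing `plot` draws the chart; `bar_heights` is only there to check that the bars carry the values of `n` rather than row counts.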
- | |||
- | ### Percentage bar plot | ||
- | |||
- | <code:r> | ||
- | plot <- ggplot(df, aes(x = var1, y = (..count..)/sum(..count..))) + | ||
- | geom_bar() | ||
- | plot | ||
- | </code> | ||
- | |||
- | [Source and more details](https://sebastiansauer.github.io/percentage_plot_ggplot2_V2/). | ||
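As a minimal sketch of the percentage bar plot, the example below uses a small hypothetical `df` (the values of `var1` are invented for illustration). It writes the computed variable with `after_stat(count)`, the spelling that recent versions of `ggplot2` use in place of `..count..`; on older versions the `..count..` form shown above should be used instead.

```r
library(ggplot2)

# Hypothetical data: one categorical column, four rows.
df <- data.frame(var1 = c("a", "a", "b", "c"))

# Each bar's height is its row count divided by the total row count.
plot <- ggplot(df, aes(x = var1, y = after_stat(count / sum(count)))) +
  geom_bar() +
  scale_y_continuous(labels = scales::percent)  # show 0.5 as 50%

# Shares computed by the layer, one per bar (illustrative check).
shares <- as.numeric(ggplot_build(plot)$data[[1]]$y)
shares
```

The shares sum to 1 because the denominator is taken over all bars of the layer, so each bar is a proportion of the whole data set.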
- | |||
- | More sophisticated code, not tested yet: | ||
- | |||
- | <code:r> | ||
- | ggplot(df, aes(x= var1, group=var_group)) + | ||
- | geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count") + | ||
- | geom_text(aes( label = scales::percent(..prop..), | ||
- | y= ..prop.. ), stat= "count", vjust = -.5) + | ||
- | labs(y = "Percent", fill="var1") + | ||
- | facet_grid(~var_group) + | ||
- | scale_y_continuous(labels = scales::percent) | ||
- | </code> | ||
- | |||
- | Requires the `scales` package. | ||
- | |||
- | (Same source) | ||
- | |||
- | ### Plot several series in one single bar plot | ||
- | |||
- | [Source (in French)](http://www.sthda.com/french/wiki/ggplot2-barplots-guide-de-demarrage-rapide-logiciel-r-et-visualisation-de-donnees#barplot-avec-plusieurs-groupes) | ||
- | |||
- | <code:r> | ||
- | # stat = "count" computes ..count..; with stat = "identity" no count exists | ||
- | ggplot(df, aes(x=var1, y=(..count..), fill=var2)) + | ||
- | geom_bar(stat="count", position="dodge") | ||
- | </code> | ||
- | |||
- | If the variable mapped to `fill` is categorical but stored as a number (years, for instance), make sure it is a *factor* (`as.factor(var2)`). Otherwise `ggplot2` may treat it as a continuous variable and display it incorrectly. | ||
- | |||
- | `(..count..)` means that `ggplot2` will display the number of rows for each combination of `var1` and `var2`. | ||
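The grouped bar plot and the factor advice can be sketched together as follows. The data frame is hypothetical (its values are invented for illustration), and the example relies on the default counting behaviour of `geom_bar()` rather than spelling out the `y` aesthetic.

```r
library(ggplot2)

# Hypothetical data: var2 holds years stored as numbers.
df <- data.frame(
  var1 = c("a", "a", "b", "b", "b"),
  var2 = c(2017, 2018, 2017, 2017, 2018)
)

# Convert the fill variable to a factor so that each year gets its own bar.
df$var2 <- as.factor(df$var2)

# geom_bar() counts rows per (var1, var2) pair; "dodge" puts bars side by side.
plot <- ggplot(df, aes(x = var1, fill = var2)) +
  geom_bar(position = "dodge")

# Counts computed by the layer, one per bar (illustrative check).
counts <- ggplot_build(plot)$data[[1]]$count
counts
```

With these five rows the layer produces four bars, one per combination of `var1` and `var2`, and their counts add up to the number of rows.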
- | |||