Dataviz for Scientists
Introduction
Communication is essential for any kind of scientist. Your research findings might contain the most important results, but if you don’t manage to communicate them to others, they might be like a tree falling in a forest with no one around. Do they make a sound?
Most research results today depend on data, and communicating data effectively is a skill made of many components. Representing data graphically is a big part of it.
In this workshop you will learn how to communicate data graphically, and how to use literate programming to make your graphics available to others, together with text and analytical code.
In this three-day workshop, we will use R and JavaScript to analyze and visualize data.
With Quarto, a modern literate programming tool dedicated to scientific communication, we will put it all together, so that you can communicate your data in beautifully formatted outputs that are easy to read, but also reproducible and transparent for the analytical mind.
In the extensively hands-on sessions we will focus on real-world data, and if you would like to, you are welcome to bring your own data to the workshop.
All the software, programming languages, and resources used in this workshop are open source and open access. In this way the participants will have full control over the tools that they use and will be able to access them after the class is over, free from unfavorable commercial licenses. All the tools are cutting edge in both industry and academia.
Slides
Below you can find the link to the slides.
Part 1: Welcome to R
Day 1.
Part 2: Intro to Data Visualization
Day 2, morning.
Part 3: Better Graphs
Day 2, afternoon.
Part 4: Scientific Publishing
Day 3, morning.
Part 5: Web Development
Day 3, afternoon.
Resources
Besides the slides, you can consult any of these open access books on the topics of data analysis, statistics, programming and data visualization. The authors of those books made them open access, so they can be consulted online anytime.
Packages
There are some packages that we are going to use in most exercises. To make sure that you have them ready, install them by running this at the R console:
install.packages(
  c('tidyverse',
    'palmerpenguins',
    'here',
    'janitor',
    'paletteer')
)
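If you are not sure which of these packages are already on your machine, a small check like the one below can help before you install anything. This is only a sketch: the names `required`, `missing_packages`, and `to_install` are made up for this example and are not part of the course material. It relies only on base R functions (`installed.packages()` and `setdiff()`).

```r
# Packages used in most exercises (same list as above)
required <- c("tidyverse", "palmerpenguins", "here", "janitor", "paletteer")

# Hypothetical helper: return the packages from `pkgs`
# that are not found in your R library
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(required)

if (length(to_install) > 0) {
  message("Not installed yet: ", paste(to_install, collapse = ", "))
  # Uncomment the next line to install only the missing ones:
  # install.packages(to_install)
} else {
  message("All workshop packages are installed.")
}
```

Running it again after the installation should report that nothing is missing.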
At the beginning of each of your scripts, you can load them by writing:
library(tidyverse)
library(palmerpenguins)
library(here)
library(janitor)
library(paletteer)
Source Code
The source code for these slides is on GitHub at https://github.com/othomantegazza/dataviz-for-scientists-slides
License
This work is licensed by Otho Mantegazza under the CC BY-NC-SA 4.0 license. For more information about the non-original bits and pieces, please check the LICENSE file in this course’s GitHub repo.
Acknowledgements
Big thanks to Giorgia Ditano for the help and support in reviewing the course material.