We are going to encounter some methods for interacting with, visualizing, and analyzing earth sciences data in this activity. At each table, one student will pick one of the datasets and run the code associated with that dataset. What do you see? Based on the course materials, how does this constitute evidence for global climate change?

Please start by logging into sso.rstudio.cloud/pomona. Once you’ve logged in, you should see the option to open the `EA 30.1 - Fall 2022`

workspace. You can also access the `EA 30.1 - Fall 2022`

workspace at this link.

At the workspace, you should then see the “Assignment” `ClimateData`

. Please click on that Assignment. Here is a direct link to the `ClimateData`

project, but it may not work if this is the first time in a while that you’re logging into RStudio.Cloud. If it doesn’t work, no worries - just log in through the sso.rstudio.cloud/pomona link.

First, we will all load these packages - make sure you run this code irrespective of whichever variable you select.

```
library(dplyr)
library(openxlsx)
library(ggplot2)
sheet_url <- "https://github.com/EA30POM/Fall2021/blob/main/data/ClimateChangeSigns.xlsx?raw=true" # location of the Excel spreadsheet
keelingCO2 <- openxlsx::read.xlsx(sheet_url,sheet="Keeling") # everyone needs to download the Keeling CO2 data because we will be combining that with the other variables too.
```

Visualization can help you get a feel for your data, and can be a very useful aid in exploring data. We will use the `ggplot2`

package to create plots in `R`

.

`ggplot2`

creates plots based on different layers. The three necessary components are:

- the dataset with the data to be plotted
- the “aesthetics”, which map how the data will be associated with different plotting elements, such as the location of points in a scatterplot, or other features such as shape, size, color, transparency, etc.
- the geometric elements in the plot (points, boxplot, histogram)

While there are additional layering components that one can include, we will confine our attention to these three layers. For more information, please consult the free, online ggplot2 Book.

Scatterplots are most useful for visualizing two continuous, numeric variables. In defining a plot using `ggplot2`

, we specify the data source (in the `ggplot`

function), and the mapping of the variables (`aes`

, short for aesthetic, within `ggplot`

). Subsequently, we specify the type of `geom`

(geometry) we want in the plot.

Below, we will follow a convention where we assign an object (`p`

) to store the plot. We specify what layers we want to add to the object `p`

by using the `+`

sign.

The data source for this variable is NASA’s Global Climate Change Vital Signs of the Planet Program.

```
p <- ggplot(keelingCO2, aes(x=year,y=meanCO2ppm)) # initiate the plot
p <- p + geom_point() # add scatterplot points
p <- p + geom_line() # add a trend line
p <- p + labs(x="Year",y="CO2 (ppm)") # change axis labels
p <- p + theme_classic() # change plot style
p
CO2mod <- lm(meanCO2ppm~year, data=keelingCO2) # we are running a "linear regression" statistical model. You can think of this like finding the line that fits our data the best.
summary(CO2mod) # what do you see in the Coefficients: table for year?
```

The data source is here.

```
seaIce <- openxlsx::read.xlsx(sheet_url,"SeaIce")
seaIceCO2 <- left_join(seaIce, keelingCO2, by=c("year"="year")) # we are joining up the sea ice extent data with global CO2 level data, matching on each year
p <- ggplot(seaIceCO2, aes(x=meanCO2ppm, y=extent))
p <- p + geom_point()
p <- p + geom_line()
p <- p + labs(x="CO2 (ppm)",y="Minimum sea ice extent (million km2)")
p <- p + theme_classic()
p # what do we see?
seaIceMod <- lm(extent ~ meanCO2ppm, data=seaIceCO2) # we are running a "linear regression" statistical model. You can think of this like finding the line that fits our data the best.
summary(seaIceMod) # what do you see in the Coefficients: table for mean CO2 (ppm)?
```

The data source is here.

```
globTemp <- openxlsx::read.xlsx(sheet_url,"GlobTemp")
globTempCO2 <- left_join(globTemp, keelingCO2, by=c("Year"="year")) # we are joining up global temperature data with global CO2 level data, matching on each year
p <- ggplot(globTempCO2, aes(x=meanCO2ppm, y=TempObs))
p <- p + geom_point()
p <- p + geom_line()
p <- p + labs(x="CO2 (ppm)",y="Global temperature anomaly (*C)")
p <- p + theme_classic()
p # what do we see?
tempMod <- lm(TempObs ~ meanCO2ppm, data=globTempCO2) # we are running a "linear regression" statistical model. You can think of this like finding the line that fits our data the best.
summary(tempMod) # what do you see in the Coefficients: table for mean CO2 (ppm)?
```

The original data source is here but the data I used came from the EPA. The ocean’s heat content is expressed in *zettajoules* or \(10^{22}\) joules. In contrast, total global energy consumption is around 580 million terajoules, or 5.8 zettajoules.

```
oceanHeat <- openxlsx::read.xlsx(sheet_url,"OceanHeat")
oceanHeatCO2 <- left_join(oceanHeat, keelingCO2, by=c("Year"="year")) # we are joining up global ocean heat content data with global CO2 level data, matching on each year
p <- ggplot(oceanHeatCO2, aes(x=meanCO2ppm, y=NOAASea))
p <- p + geom_point()
p <- p + geom_line()
p <- p + labs(x="CO2 (ppm)",y="Ocean heat content (ZJ)")
p <- p + theme_classic()
p # what do we see?
oceanMod <- lm(NOAASea ~ meanCO2ppm, data=oceanHeatCO2) # we are running a "linear regression" statistical model. You can think of this like finding the line that fits our data the best.
summary(oceanMod) # what do you see in the Coefficients: table for mean CO2 (ppm)?
```

Time permitting, consider the following:

- for the
`lm`

coefficient outputs, what do those numbers represent? What do they tell us about the impact on some response variable for changes in \(CO_2\)?- what is the unit of change in \(CO_2\) for the coefficients?
- how much would your variable change if \(CO_2\) went up by 10 ppm?
- can we reasonably extrapolate those estimated valus into the future?

- how would you modify the code above to re-run the analyses with
`Year`

as the explanatory/predictive/x-axis variable?