Evaluating the signs of climate change

We are going to encounter some methods for interacting with, visualizing, and analyzing earth sciences data in this activity. At each table, one student will pick one of the datasets and run the code associated with that dataset. What do you see? Based on the course materials, how does this constitute evidence for global climate change?

Please start by logging into rstudio.pomona.edu.

Note! If you run the command library(package) and see an error in red text stating Error in library(package) : there is no package called ‘package’, don’t worry! That simply means that the package is not installed in your R environment. All you need to do is just run install.packages(package) once. (Of course, replace package with the name of the package in question, such as swirl).

First, we will all load these packages - make sure you run this code irrespective of whichever variable you select.

library(dplyr)
library(openxlsx)
library(ggplot2)

sheet_url <- "https://github.com/EA30POM/Fall2021/blob/main/data/ClimateChangeSigns.xlsx?raw=true" # location of the Excel spreadsheet

keelingCO2 <- openxlsx::read.xlsx(sheet_url,sheet="Keeling") # everyone needs to download the Keeling CO2 data because we will be combining that with the other variables too.

Visualization can help you get a feel for your data, and can be a very useful aid in exploring data. We will use the ggplot2 package to create plots in R.

ggplot2 creates plots based on different layers. The three necessary components are:

the dataset with the data to be plotted
the “aesthetics”, which map how the data will be associated with different plotting elements, such as the location of points in a scatterplot, or other features such as shape, size, color, transparency, etc.
the geometric elements in the plot (points, boxplot, histogram)

While there are additional layering components that one can include, we will confine our attention to these three layers. For more information, please consult the free, online ggplot2 Book.

Scatterplots are most useful for visualizing two continuous, numeric variables. In defining a plot using ggplot2, we specify the data source (in the ggplot function), and the mapping of the variables (aes, short for aesthetic, within ggplot). Subsequently, we specify the type of geom (geometry) we want in the plot.

Below, we will follow a convention where we assign an object (p) to store the plot. We specify what layers we want to add to the object p by using the + sign.

At each table, one student will pick one of the datasets and run the code associated with that dataset. What do you see? Based on the course materials, how does this constitute evidence for global climate change?

Variable 1: Global \(CO_2\) concentration in the atmosphere

The data source for this variable is NASA’s Global Climate Change Vital Signs of the Planet Program.

p <- ggplot(keelingCO2, aes(x=year,y=meanCO2ppm)) # initiate the plot
p <- p + geom_point() # add scatterplot points
p <- p + geom_line() # add a trend line
p <- p + labs(x="Year",y="CO2 (ppm)") # change axis labels
p <- p + theme_classic() # change plot style
p

CO2mod <- lm(meanCO2ppm~year, data=keelingCO2) # we are running a "linear regression" statistical model. You can think of this like finding the line that fits our data the best.
summary(CO2mod) # what do you see in the Coefficients: table for year?

Variable 2: Arctic sea ice minimum spatial extent

The data source is here.

seaIce <- openxlsx::read.xlsx(sheet_url,"SeaIce")
seaIceCO2 <- left_join(seaIce, keelingCO2, by=c("year"="year")) # we are joining up the sea ice extent data with global CO2 level data, matching on each year

p <- ggplot(seaIceCO2, aes(x=meanCO2ppm, y=extent))
p <- p + geom_point()
p <- p + geom_line()
p <- p + labs(x="CO2 (ppm)",y="Minimum sea ice extent (million km2)")
p <- p + theme_classic()
p # what do we see?

seaIceMod <- lm(extent ~ meanCO2ppm, data=seaIceCO2) # we are running a "linear regression" statistical model. You can think of this like finding the line that fits our data the best.
summary(seaIceMod) # what do you see in the Coefficients: table for mean CO2 (ppm)?

Variable 3: Global surface temperature (land and ocean)

The data source is here.

globTemp <- openxlsx::read.xlsx(sheet_url,"GlobTemp")
globTempCO2 <- left_join(globTemp, keelingCO2, by=c("Year"="year")) # we are joining up global temperature data with global CO2 level data, matching on each year

p <- ggplot(globTempCO2, aes(x=meanCO2ppm, y=TempObs))
p <- p + geom_point()
p <- p + geom_line()
p <- p + labs(x="CO2 (ppm)",y="Global temperature anomaly (*C)")
p <- p + theme_classic()
p # what do we see?

tempMod <- lm(TempObs ~ meanCO2ppm, data=globTempCO2) # we are running a "linear regression" statistical model. You can think of this like finding the line that fits our data the best.
summary(tempMod) # what do you see in the Coefficients: table for mean CO2 (ppm)?

Variable 4: Heat content of the oceans

The original data source is here but the data I used came from the EPA. The ocean’s heat content is expressed in zettajoules or \(10^{22}\) joules. In contrast, total global energy consumption is around 580 million terajoules, or 5.8 zettajoules.

oceanHeat <- openxlsx::read.xlsx(sheet_url,"OceanHeat")
oceanHeatCO2 <- left_join(oceanHeat, keelingCO2, by=c("Year"="year")) # we are joining up global ocean heat content data with global CO2 level data, matching on each year

p <- ggplot(oceanHeatCO2, aes(x=meanCO2ppm, y=NOAASea))
p <- p + geom_point()
p <- p + geom_line()
p <- p + labs(x="CO2 (ppm)",y="Ocean heat content (ZJ)")
p <- p + theme_classic()
p # what do we see?

oceanMod <- lm(NOAASea ~ meanCO2ppm, data=oceanHeatCO2) # we are running a "linear regression" statistical model. You can think of this like finding the line that fits our data the best.
summary(oceanMod) # what do you see in the Coefficients: table for mean CO2 (ppm)?

Taking the activity further

The introverse package makes super-useful help pages that can demystify the tidyverse commands that we’ll be using in this class.

Time permitting, consider the following:

for the lm coefficient outputs, what do those numbers represent? What do they tell us about the impact on some response variable for changes in \(CO_2\)?
- what is the unit of change in \(CO_2\) for the coefficients?
- how much would your variable change if \(CO_2\) went up by 10 ppm?
- can we reasonably extrapolate those estimated valus into the future?
how would you modify the code above to re-run the analyses with Year as the explanatory/predictive/x-axis variable?

Climate class activity

2024-08-19