This document contains code for Group 1. It is your task to modify the code in places where I have left ...
as a placeholder for a variable name. Please feel free to consult with myself and Quyen for guidance.
## Libraries for interacting with data
library(readr) # reads in spreadsheet data
library(dplyr) # lets you wrangle data more effectively
library(tidyr) # lets us interact with data with additional functions
## Libraries for plotting
library(ggplot2) # plotting library
library(waffle) # square pie chart (waffle chart) library
## Libraries for spatial data
library(sf) # a nicer way to deal with spatial data
library(spData) # spatial datasets
Next, we’re going to import the class dataset.
CSPdf <- read_tsv("https://raw.githubusercontent.com/EA30POM/Fall2021/main/data/OTEC_CSP.tsv") # use the function read_tsv from readr to pull in a spreadsheet from this URL.
# Then store the output in an object called "CSPdf"
# CSPdf stands for Conservation Science Partners data frame
We can always pull up a viewer of our data by using these commands:
## Version 1:
CSPdf %>% # take CSPdf and feed it into the next command
head( ) %>% # look at the first 6 rows
View( ) # open the spreadsheet viewer
# akin to x --> f(x) --> g(x)
## Alternatively, version 2:
View( head( CSPdf ) ) # akin to g( f( x ) )
What are the patterns for the topic of Human Wildlife Interaction globally?
### Q1: Differences in topic pattern - visualization using the global map
## Q1.1: Generate calculations by country
group1DF <- CSPdf %>%
group_by(...) %>% # group the data by Country (what do you put in ...?)
summarize(HWIcount = length(which(Topic=="HWI")), CountryN = n(), HWIprop = length(which(Topic=="HWI"))/n())
# create three columns: 1) HWIcount which counts the number of HWI articles in each country,
# 2) CountryN, the number of articles in each country,
# and 3) HWIprop, which presents the proportion of articles in each country that are about HWI in our sample
## Q1.1: Inspecting output
View(group1DF) # open up a viewer of the data
## Q1.1: Creating a subset for the countries that have non-0 HWI articles
group1sub <- group1DF %>%
filter(HWIcount > 0) # only retain the rows that have more than 0 HWI articles
The next step is to join up the processed data with global map data.
## Q1.2: Combine the subset data with GIS data for the world's countries
worldDF <- world # store the dataset for the world in worldDF, world data frame
speciesMap <- worldDF %>% # take the worldDF object and feed forward
left_join(group1DF, by=c("name_long"="Country")) # combine the group1DF with worldDF by country,
# getting the data to match on the columns "name_long" (countries) and "Country"
# How would you modify this code to only use group1sub instead?
# When would group1sub be more appropriate to use than group1DF?
We’re ready to plot the data now.
## Q1.3: Generating the map
# Your task: How do you display either HWIcount or HWIprop?
# Hint: you may want to change the ... in fill=...
g1 <- ggplot(data=speciesMap, aes(fill=...)) # use the speciesMap object to create a map
g1 # inspect the current output
g1 <- g1 + geom_sf(aes(fill=...),color="gray47",size=0.05) # add on the boundaries of countries
g1 # inspect the current output
g1 <- g1 + labs(x="",y="",fill="...: ") # change the names of the variables as needed
g1 # inspect the current output
g1 <- g1 + theme_minimal() # change the display style
g1 # inspect the current output
g1 <- g1 + theme(legend.position = "top") # move the legend bar to the top
g1 # display the final plot
Are there differences in the distribution of humans or animals being portrayed as victims in the topic of HWI?
HWI
### Q2: Differences across countries with respect to HWI human/animal victimhood?
## Q2.1: Subset data to those where Topic is HWI
# Your task is to replace ... in filter(...)
# Hint: filter(Topic="something")
group1DF2 <- CSPdf %>%
filter(...) # subset to the rows where the Topic is HWI
# and assign the output to group1DF2
## Q2.1: View the different portrayals in the subset of data
table(group1DF2$Portrayal)
Great! We’ve subset the data. But now, the problem is that we need to able to categorize the articles as portraying species as “victims” or “aggressors”.
## Q2.2: Categorize portrayals into animals as victims or aggressors
group1sum2 <- group1DF2 %>%
mutate(Victimhood = case_when(Portrayal %in% c("Dangerous","Controversial") ~ "Aggressor", # create a new column called Victimhood
Portrayal %in% c("Vulnerable","Victim) ~ "Victim") # modify the code above to categorize other
# portrayals of wildlife as victim or aggressor.
# Hint: use the output of the table(group1DF2$Portrayal) and your own understanding
# to relabel these portrayals of species
View(group1sum2) # view data
## Q2.3: Subset the data for creating a waffle chart (square pie chart)
group1waffle <- group1sum2 %>%
group_by(Country) %>% # categorize by Country
summarise(Aggressor = round(length(which(Victimhood=="Aggressor"))/n() * 100,0),
Victim = 100-Aggressor)
# Create two summary columns: 1) Aggressor, which finds the proportion of
# articles that feature the animal as an aggressor and converts to a percentage,
# and 2) Victim, the percentage of articles where the animal is a victim
## Q2.4: Reformatting the data
group1waffle <- group1waffle %>%
tidyr::pivot_longer(cols=Aggressor:Victim)
View(group1waffle) # inspect the data object you've created
Hooray! Now we’re on to visualization
waff1 <- ggplot(group1waffle, aes(fill=name, values=value)) # initiate plot
waff1 <- waff1 + geom_waffle() # add on waffle
waff1 # display output
waff1 <- waff1 + facet_wrap(~Country) # create sub-plots by Country
waff1 # display output
waff1 <- waff1 + theme_void() # change the plot visuals
waff1 # display output
waff1 <- waff1 + labs(x="",y="",fill="Portrayal: ") + theme(legend.position="top") # change plot appearance
waff1 # display final plot