UQRUG 4

meeting
Published

April 20, 2020

2020-04-20: UQRUG 4 - moving online!

Because of physical isolation, we will be meeting online for now.

Join the zoom meeting at 3 pm on Monday the 20th of April

Attendees

Please add your name and a problem you’d like to work on, or just something you’d like to chat about! Feel free to add links too, so others can have a look and contribute.

  • Steph (Library) - putting together a resource for learning reproducible reports with R Markdown.
  • Greg - working on my PhD thesis using RMarkdown (I have converted the UQ thesis template to RMarkdown). Also working on some data analysis which involves manipulating data contained in multiple files. Lots of dplyr code. Note: R 4.0 coming this week. This is a major release and it involves reinstalling all of your packages. Be careful if you update and make sure you keep backups.
  • Patrick - Honours student in agriculture using R to analyse biomass of lettuce and genetic variations of weed species.
  • Aljay - Working in fruit/vegetable quality changes in supply chains and using R to predict changes in the chain.
  • Fathiyya - Currently I am working in remote sensing field for group of crops classification using R
  • Paula - I’m here to help with questions.
  • Amelia - Hello! I am a basic learner of R. I will start plotting some graphs, and if I have some questions I will let you know.
  • Debbie - Postdoc at SEES using R for catchment restoration projects. We can create breakout rooms in Zoom for people to help each other.

Topics discussed

  • R 4.0 released in a few days. Find out more:
    • https://developer.r-project.org/Blog/public/
    • https://cran.r-project.org/doc/manuals/r-devel/NEWS.html
  • dplyr soon to hit 1.0 release, planned for around the end of May. Details about changes are on the Tidyverse blog, in a series of posts:
    • https://www.tidyverse.org/categories/package/
  • R Markdown UQ thesis template by Greg: https://github.com/gsiemon/UQThesis
    • Manage citations https://ropensci.org/technotes/2020/05/07/rmd-citations/
  • Bookdown for bigger, structured R Markdown documents: https://bookdown.org/
  • Tidying data with Tidyverse functions
  • Iterating over several vectors of values
    • useful examples:
      • https://github.com/jennybc/row-oriented-workflows/blob/master/ex06_runif-via-pmap.md
      • https://github.com/jennybc/row-oriented-workflows/blob/master/iterate-over-rows.md
      • https://serialmentor.com/blog/2016/6/13/reading-and-combining-many-tidy-data-files-in-R
      • https://gitlab.com/stragu/DSH/-/blob/master/R/packaging/packaging.md#read-a-single-file

The code we worked on:

# construct dataframe containing relevant data
site_number <- c(1, 2) # there are 2 numbers for this variable (1 and 2)
year <- c(2001, 2002) # there are 2 years (2001, 2002) 
veg_index <- c("EVI", "EVI2") # there are 2 VegIndices (EVI, EVI2)
# put everything together
df <- data.frame(site_number, year, veg_index)

# find all combinations of values
library(tidyr)
expanded <- expand(df, site_number, year, veg_index)

# create a path from other variables
library(dplyr)
with_paths <- expanded %>% 
  mutate(path = paste0("beginning/", site_number, "_", year, "-", veg_index))

# encapsulate code into a function
do_stats <- function(df, numCores) {
  siteShp <- readRDS(df$path)
  calcPixelwiseStats(siteShp, Year = df$year, removeBadGrads = removeBadGrads, optionsYldIdx = setOptionsYldIdx(minevitrees = 999, ndaysbeforemax4yldidx = 60, ndaysaftermax4yldidx = 60, VegIndex = df$veg_index), numCores = numCores, getFromBricks = FALSE)
}

# apply the custom function on 1-row tibbles by splitting
library(purrr)
siteInfoAllYrs <- with_paths %>% 
  split(1:nrow(with_paths)) %>% # create sequence as long as number of rows
  map(do_stats, numCores = 4)
# we end up with a list of results

# Other option: with pmap, which will work row-wise
# (which is why we have to have the df variables in the right order)
do_stats2 <- function(site_number, year, veg_index, path, numCores) {
  siteShp <- readRDS(path)
  calcPixelwiseStats(siteShp, Year = year, removeBadGrads = removeBadGrads, optionsYldIdx = setOptionsYldIdx(minevitrees = 999, ndaysbeforemax4yldidx = 60, ndaysaftermax4yldidx = 60, VegIndex = veg_index), numCores = numCores, getFromBricks = FALSE)
}

# apply the function row-wise
siteInfoAllYrs <- pmap(with_paths, do_stats2, numCore = 4)

# small example of how pmap works row-wise:
x <- c(1, 10, 100)
y <- c(1, 2, 3)
z <- c(5, 50, 500)
examp <- data.frame(x, y, z)
pmap(examp, sum)