30 Wrangling Repository
The Wrangling Repository contains scripts for data wrangling for all phases of the project. It takes in raw data and outputs analysis/visualization-ready .csv files.
It will be the primary home for this Research/Admin Position (or at least it was for me) so it may help to give you a brief explanation of each of the main scripts. Here’s the link.
30.1 Misc. Non-Manuscript Subset Scripts
So far, this only includes the script to separate out PlantPopNet members’ data to make sharing that with PPN leadership (upon request) simpler
30.2 Phase 1 Scripts
phase 1 abiotic wrangling.R
: Wrangles WORLDCLIM climatic data for phase 1 surveys.phase 1 herbivoreData wrangling.R
: Wrangles any information to do with herbivores from phase 1.phase 1 plant richness wrangling.R
: Extracts interpolated native/invasive species richness information from Ellis et al. 2012 shapefiles.phase 1 primary productivity.R
: Extracts primary productivity data from satellite data for phase 1 surveys.phase 1 shareable index.R
: Creates files of only location information for phase 1 sites to enable other HerbVar collaborators to harvest metadata without sharing “actual” data.phase 1 siteData wrangling.R
: Wrangles information from siteData sheet of template Excel file.phase 1 soil data wrangling.R
: Placeholder describing where soil data may someday be acquired. For now, the relevant R package does not work (though their team is aware of and working on this issue).phase 1 survey-lvl summarizing.R
: Summarizes the tidy plant-level (one row per plant) phase 1 data to survey-level (one row per survey).phase 1 wrangling.R
: Takes all the separate phase 1 raw data files combines and wrangles them to plant-level (one row per plant).
30.3 Phase 2 Scripts
phase 2 densityData wrangling.R
: Wrangles eponymous sheet from template Excel file.phase 2 herbivoreData wrangling.R
: Wrangles eponymous sheet from template Excel file (at both plant-level and survey-level).phase 2 metadata
abiotic.R`: Extracts WORLDCLIM climatic data from phase 2 site locations (requires tidy file from siteData wrangling script).phase 2 newColumns wrangling.R
: Wrangles eponymous sheet from template Excel file.phase 2 notes wrangling.R
: Wrangles eponymous sheet from template Excel file.phase 2 plantData survey-lvl summarizing.R
: Summarizes tidy plantData to survey-level (i.e., one row per survey).phase 2 plantData wrangling.R
: Wrangles eponymous sheet from template Excel file (at ONLY plant-level).phase 2 reproData wrangling.R
: Wrangles eponymous sheet from template Excel file at plant-level only (survey level absent because insufficient raw data at this point).phase 2 siteData wrangling.R
: Wrangles eponymous sheet from template Excel file.
30.4 Script Archive
All “actual” scripts (i.e., those used in day-to-day wrangling) should have a consistent aesthetic and comment structure (as well as being primarily tidyverse-based). When others contribute code, duplicate the file and edit one version to match internal standards. The second version goes here to be preserved in its original form as a back-up
30.5 Singleton Tasks
Any scripts written to accomplish a ‘one-off’ task I thought unlikely to be repeated regularly are placed here. Some of them may include operations that could be useful in other contexts tough!
30.6 Manuscript Subsetting Scripts
Each script is dedicated for a single HerbVar manuscript and does the subsetting and/or column selection necessary to create a tidy data file of only what authors request to test their hypotheses