From 5ee216c2d9b0006939368bebbc426393d623695f Mon Sep 17 00:00:00 2001
From: Tyler Chafin
Date: Wed, 28 Feb 2024 12:13:41 +0000
Subject: [PATCH] Update paper.md

---
 paper/paper.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/paper/paper.md b/paper/paper.md
index ade39b1..badcd09 100644
--- a/paper/paper.md
+++ b/paper/paper.md
@@ -65,11 +65,11 @@ To demonstrate autoStreamTree, we employed existing SNP data for Speckled Dace (

 Stream networks were parsed directly as a minimal sub-graph from RiverATLAS, which contains various local-scale environmental/ hydrological features as annotations (i.e., physiography, climate, land-cover, geology, anthropogenic effects) [@Linke2019]. Genetic distances were computed globally and per-locus among sites as linearized *F*~ST~ [@Weir1984] (=*F*~ST~/1-*F*~ST~). To compare with @Kalinowski2008, we used unweighted least-squares, iterative negative distance correction, and replicated analyses using linearized *F*~ST~ independently recalculated in Adegenet [@Jombart2008].

-We examined variation in per-locus fitted distances as a function of environmental and anthropogenic covariates, carried over as annotations to RiverATLAS. We reduced N=281 hydro-environmental RiverATLAS attributes using forward-selection following the implementation used in adeSpatial [@Dray2018], after first removing variables which were invariant, containing missing values, exhibiting pairwise correlations (\(r\)) over 0.7, or having Variance Inflation Factor (VIF) > 3. Remaining selected variables were used in redundancy analysis (RDA) to visualize variation in fitted distances as a function of environmental factors.
+We examined variation in per-locus fitted distances as a function of environmental and anthropogenic covariates, carried over as annotations to RiverATLAS. We reduced N=281 hydro-environmental RiverATLAS attributes using forward-selection following the implementation used in adeSpatial [@Dray2018], after first removing variables which were invariant, containing missing values, exhibiting pairwise correlations (\(r\)) over 0.7, or having Variance Inflation Factor (VIF) >3. Remaining selected variables were used in redundancy analysis (RDA) to visualize variation in fitted distances as a function of environmental factors.

 ## Results and comparison

-Runtimes are reported for a 2021 Macbook Pro, 16GB memory, 3.2GHz M1 CPU. Time required to calculate/extract a minimal sub-graph containing 118 dissolved edges from RiverATLAS (North America shapefile totaling 986,463 original vertices) was 35min. Computing pairwise hydrologic distances required an additional 3sec. Pairwise population genetic distances were computed in ~24min (linearized *F*~ST~), with Mantel test and distance fitting requiring 11sec and 10sec, respectively. Re-running the entire pipeline per-locus (i.e., `-r RUNLOC`) took 3h 34min. Fitted-*F*~ST~ for autoStreamTree (\autoref{fig:example}) matched that re-calculated using the @Kalinowski2008 method (adjusted *R*^2 = 0.9955; *p* < 2.2e-16). However, due to runtime constraints and manual pre-processing for the latter, per-locus distances were not attempted. The pRDA selected four variables, with 221 SNPs and 9 edges as outliers (\autoref{fig:example}).
+Runtimes are reported for a 2021 Macbook Pro, 16GB memory, 3.2GHz M1 CPU. Time required to calculate/extract a minimal sub-graph containing 118 dissolved edges from RiverATLAS (North America shapefile totaling 986,463 original vertices) was 35min. Computing pairwise hydrologic distances required an additional 3sec. Pairwise population genetic distances were computed in ~24min (linearized *F*~ST~), with Mantel test and distance fitting requiring 11sec and 10sec, respectively. Re-running the entire pipeline per-locus (i.e., `-r RUNLOC`) took 3h 34min. Fitted-*F*~ST~ for autoStreamTree (\autoref{fig:example}) matched that re-calculated using the @Kalinowski2008 method (adjusted *R*^2 = 0.9955; *p* < 2.2e-16). However, due to runtime constraints and manual pre-processing for the latter, per-locus distances were not attempted. The RDA selected 26 environmental variables, with 262 SNPs and 125 edges as outliers (\autoref{fig:example}).

 ![autoStreamTree output. Shown are *F*~ST~ distances fitted onto original stream network (A), variation in per-locus fitted-*F*~ST~ distances via pRDA (controlling for stream length) scaled by loci (B), and by stream segment (C). Outliers highlighted according to most closely correlated explanatory axis and abbreviated as: AET (actual evapotranspiration), PET (predicted evapotranspiration), DOF (degree of fragmentation), and USE (proportion of water for anthropogenic use). \label{fig:example}](./figs/Fig1.png){ width=80% }
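
Note on the patched methods paragraph: the *F*~ST~ linearization and the first three covariate filters it describes (invariant variables, missing values, pairwise |r| over 0.7) could be sketched roughly as below. This is an illustration only, not the autoStreamTree or adeSpatial code. The function names, the toy covariate names, and the choice to keep the first-seen member of each correlated pair are all assumptions, and the VIF and forward-selection steps are not included.

```python
import math


def linearized_fst(fst):
    """Linearize F_ST as F_ST / (1 - F_ST); defined for 0 <= F_ST < 1."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("F_ST must lie in [0, 1)")
    return fst / (1.0 - fst)


def pearson_r(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def prefilter(covariates, r_max=0.7):
    """Return names of covariates that pass the three simple filters.

    covariates maps a (hypothetical) attribute name to one value per edge.
    Assumption: of two correlated variables, the one seen first is kept.
    """
    kept = {}
    for name, values in covariates.items():
        if any(is_missing(v) for v in values):
            continue  # contains missing values
        if len(set(values)) < 2:
            continue  # invariant across edges
        if any(abs(pearson_r(values, other)) > r_max for other in kept.values()):
            continue  # too strongly correlated with a variable already kept
        kept[name] = values
    return list(kept)


# Toy data: five stream edges, made-up attribute names and values.
toy = {
    "elev": [1.0, 2.0, 3.0, 4.0, 5.0],
    "temp": [2.0, 4.0, 6.0, 8.0, 10.5],   # nearly collinear with elev
    "const": [3.0, 3.0, 3.0, 3.0, 3.0],   # invariant
    "gappy": [1.0, None, 2.0, 3.0, 4.0],  # has a missing value
    "precip": [5.0, 1.0, 4.0, 2.0, 3.0],  # weakly correlated with elev
}
selected = prefilter(toy)
print(selected)
```

On the toy data only `elev` and `precip` survive: `temp` is dropped for its correlation with `elev`, `const` for being invariant, and `gappy` for its missing value.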