
Commit

Update paper.md
tkchafin authored Feb 28, 2024
1 parent ba73411 commit 5ee216c
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions paper/paper.md
@@ -65,11 +65,11 @@ To demonstrate autoStreamTree, we employed existing SNP data for Speckled Dace (

Stream networks were parsed directly as a minimal sub-graph from RiverATLAS, which contains various local-scale environmental/hydrological features as annotations (i.e., physiography, climate, land-cover, geology, anthropogenic effects) [@Linke2019]. Genetic distances were computed globally and per-locus among sites as linearized *F*~ST~ [@Weir1984] (= *F*~ST~/(1 - *F*~ST~)). To compare with @Kalinowski2008, we used unweighted least-squares, iterative negative distance correction, and replicated analyses using linearized *F*~ST~ independently recalculated in Adegenet [@Jombart2008].
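
The linearization mentioned above, *F*~ST~/(1 - *F*~ST~), can be sketched in a few lines. This is a minimal illustration only, not autoStreamTree's own code; the function names `linearize_fst` and `linearize_matrix` are hypothetical and chosen for this example.

```python
def linearize_fst(fst: float) -> float:
    """Return the linearized distance F_ST / (1 - F_ST).

    Hypothetical helper for illustration; not part of autoStreamTree.
    The transform is undefined at F_ST = 1, so that case is rejected.
    """
    if not fst < 1.0:
        raise ValueError("F_ST must be < 1 to be linearized")
    return fst / (1.0 - fst)


def linearize_matrix(matrix):
    """Apply the transform element-wise to a pairwise F_ST matrix (list of lists)."""
    return [[linearize_fst(value) for value in row] for row in matrix]


if __name__ == "__main__":
    pairwise_fst = [[0.0, 0.5], [0.5, 0.0]]
    print(linearize_matrix(pairwise_fst))
```

The transform is monotonic, so the ranking of site pairs is unchanged; only the scale is stretched as *F*~ST~ approaches 1.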

We examined variation in per-locus fitted distances as a function of environmental and anthropogenic covariates, carried over as annotations to RiverATLAS. We reduced N=281 hydro-environmental RiverATLAS attributes using forward-selection following the implementation used in adeSpatial [@Dray2018], after first removing variables which were invariant, containing missing values, exhibiting pairwise correlations (\(r\)) over 0.7, or having Variance Inflation Factor (VIF) > 3. Remaining selected variables were used in redundancy analysis (RDA) to visualize variation in fitted distances as a function of environmental factors.
We examined variation in per-locus fitted distances as a function of environmental and anthropogenic covariates, carried over as annotations to RiverATLAS. We reduced N=281 hydro-environmental RiverATLAS attributes using forward-selection following the implementation used in adeSpatial [@Dray2018], after first removing variables which were invariant, containing missing values, exhibiting pairwise correlations (\(r\)) over 0.7, or having Variance Inflation Factor (VIF) >3. Remaining selected variables were used in redundancy analysis (RDA) to visualize variation in fitted distances as a function of environmental factors.
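
The first three pre-filtering steps described above (dropping invariant variables, variables with missing values, and variables with pairwise correlation over 0.7) might look roughly like the sketch below. Several things here are assumptions not stated in the text: the names `pearson_r` and `prefilter` are hypothetical, the correlation filter is greedy (a variable is dropped if it correlates too strongly with any variable already kept), and the absolute value of *r* is compared against the threshold. The VIF > 3 filter and the forward-selection step are not shown.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length, non-constant numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_missing(value):
    """Treat None and NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def prefilter(columns, r_max=0.7):
    """Return names of variables that survive the three filters.

    `columns` maps variable name -> list of values (one per stream segment).
    Assumption: variables are visited in insertion order and kept greedily.
    """
    kept = {}
    for name, values in columns.items():
        if any(is_missing(v) for v in values):
            continue  # contains missing values
        if len(set(values)) <= 1:
            continue  # invariant
        if any(abs(pearson_r(values, other)) > r_max for other in kept.values()):
            continue  # too strongly correlated with a kept variable
        kept[name] = values
    return list(kept)


if __name__ == "__main__":
    example = {
        "a": [1, 2, 3, 4],
        "b": [2, 4, 6, 8.5],         # nearly collinear with "a"
        "c": [5, 5, 5, 5],           # invariant
        "d": [1.0, None, 2.0, 3.0],  # missing value
        "e": [1, -1, -1, 1],         # uncorrelated with "a"
    }
    print(prefilter(example))
```

Because the filter is greedy, which member of a correlated pair survives depends on column order; a real analysis would need an explicit rule for that choice.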

## Results and comparison

Runtimes are reported for a 2021 Macbook Pro, 16GB memory, 3.2GHz M1 CPU. Time required to calculate/extract a minimal sub-graph containing 118 dissolved edges from RiverATLAS (North America shapefile totaling 986,463 original vertices) was 35min. Computing pairwise hydrologic distances required an additional 3sec. Pairwise population genetic distances were computed in ~24min (linearized *F*~ST~), with Mantel test and distance fitting requiring 11sec and 10sec, respectively. Re-running the entire pipeline per-locus (i.e., `-r RUNLOC`) took 3h 34min. Fitted-*F*~ST~ for autoStreamTree (\autoref{fig:example}) matched that re-calculated using the @Kalinowski2008 method (adjusted *R*^2 = 0.9955; *p* < 2.2e-16). However, due to runtime constraints and manual pre-processing for the latter, per-locus distances were not attempted. The pRDA selected four variables, with 221 SNPs and 9 edges as outliers (\autoref{fig:example}).
Runtimes are reported for a 2021 Macbook Pro, 16GB memory, 3.2GHz M1 CPU. Time required to calculate/extract a minimal sub-graph containing 118 dissolved edges from RiverATLAS (North America shapefile totaling 986,463 original vertices) was 35min. Computing pairwise hydrologic distances required an additional 3sec. Pairwise population genetic distances were computed in ~24min (linearized *F*~ST~), with Mantel test and distance fitting requiring 11sec and 10sec, respectively. Re-running the entire pipeline per-locus (i.e., `-r RUNLOC`) took 3h 34min. Fitted-*F*~ST~ for autoStreamTree (\autoref{fig:example}) matched that re-calculated using the @Kalinowski2008 method (adjusted *R*^2 = 0.9955; *p* < 2.2e-16). However, due to runtime constraints and manual pre-processing for the latter, per-locus distances were not attempted. The RDA selected 26 environmental variables, with 262 SNPs and 125 edges as outliers (\autoref{fig:example}).

![autoStreamTree output. Shown are *F*~ST~ distances fitted onto original stream network (A), variation in per-locus fitted-*F*~ST~ distances via pRDA (controlling for stream length) scaled by loci (B), and by stream segment (C). Outliers highlighted according to most closely correlated explanatory axis and abbreviated as: AET (actual evapotranspiration), PET (predicted evapotranspiration), DOF (degree of fragmentation), and USE (proportion of water for anthropogenic use). \label{fig:example}](./figs/Fig1.png){ width=80% }

