Data Exploration
Early data exploration involved overlaying and plotting each chronology by species and ecozone to view the frequency of the data and identify any anomalies (Figure 7). In the Taiga ecozone, anomalous ring-width values can be observed near the years 1930, 1990, and 2000. The ring-width values at the beginning and end of the series may be erroneous and should be validated against external tree-ring databases such as Dendrobox (Zang, 2015). These values will need to be examined further so that false or conflicting data can be removed before further analysis.
Figure 7. Annual growth of white spruce (PCGL) and lodgepole pine (PICO) from 191 chronology sites over time. The chronologies were overlaid and plotted according to ecozone and color-coded according to species. The ecozones included in the graph from top to bottom are: GRPL (Great Plains); HDPL (Hudson Plain); MWCF (Marine West Coast Forests); NAD (North American Deserts); NORF (Northern Forests); NWFM (North Western Forested Mountains); TAIG (Taiga); and TUND (Tundra).
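The screening of anomalous ring-width values described before Figure 7 could be approached in several ways. The sketch below is one possible illustration, not the procedure used in this study: it flags years whose ring width lies far from the series median, using a modified z-score based on the median absolute deviation. The function name, the data layout (a mapping of year to ring width in mm), the threshold of 3.5, and the example values are all assumptions made for illustration.

```python
from statistics import median


def flag_suspect_years(ring_widths, threshold=3.5):
    """Return the years whose ring width lies far from the series median.

    ring_widths: dict mapping year -> ring width in mm (hypothetical layout).
    threshold: cut-off on the modified z-score (3.5 is a common rule of thumb).
    """
    values = list(ring_widths.values())
    centre = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return []  # No spread to compare against, so nothing is flagged.
    flagged = []
    for year, width in sorted(ring_widths.items()):
        modified_z = 0.6745 * (width - centre) / mad
        if abs(modified_z) > threshold:
            flagged.append(year)
    return flagged


# Made-up series for illustration only; these are not values from the study.
example = {1930: 4.9, 1931: 1.1, 1932: 1.0, 1933: 1.2, 1934: 0.9,
           1935: 1.1, 1936: 1.0, 1937: 1.3, 1938: 1.0, 1939: 1.1}
suspect = flag_suspect_years(example)
print(suspect)
```

Flagged years would still need to be checked against an external source before removal, since an unusual value is not necessarily a false one.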
Boxplots of the mean annual ring-widths for each species were created according to ecozone. These boxplots show substantial variability in radial growth in the Taiga, Tundra, and Northern Forests ecozones (Figure 8). Much of the variability could be attributed to the large geographical range that the chronologies cover in each of these ecozones. To further examine the causes behind this variability, multivariate analyses were explored.
Figure 8. Boxplots showing the average annual radial growth (ring-width in mm) for both white spruce (PCGL) and lodgepole pine (PICO) according to ecozone. Each box spans the lower to upper quartile with the median ring-width marked inside, while the whiskers extend to the minimum and maximum values that are not classed as outliers. Values beyond the whiskers are considered outliers. The ecozones included in the graph from top left to bottom right are: GRPL (Great Plains); HDPL (Hudson Plain); MWCF (Marine West Coast Forests); NAD (North American Deserts); NORF (Northern Forests); NWFM (North Western Forested Mountains); TAIG (Taiga); and TUND (Tundra).
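The quantities drawn in a boxplot such as Figure 8 can be made concrete with a small calculation. The sketch below assumes the common Tukey convention, in which whiskers reach the most extreme values within 1.5 times the interquartile range of the box. The convention used for Figure 8 is not stated, so this choice, the function name, and the example ring-widths are assumptions.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Summarise values the way a Tukey-style boxplot draws them."""
    data = sorted(values)
    # Quartiles by linear interpolation, with the data extremes included.
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr    # assumed 1.5 x IQR rule
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_low": inside[0],     # smallest value that is not an outlier
        "whisker_high": inside[-1],   # largest value that is not an outlier
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Made-up mean ring-widths (mm) for one ecozone; illustration only.
widths_mm = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 4.0]
summary = boxplot_summary(widths_mm)
print(summary)
```

Under this convention the value of 4.0 mm is drawn as an individual outlier point, and the upper whisker stops at 1.5 mm.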
A principal component analysis (PCA) shows which climate variables most influence each of the eight ecozones, as well as what limits growth at each of the white spruce and lodgepole pine chronology sites (Figure 9). The Great Plains and the North American Deserts are relatively independent and appear to be driven by mean annual temperature more than precipitation. Meanwhile, the chronology sites in the Tundra and Taiga ecozones tend to cluster together near the DD_0 and TD climate variables, indicating that they are most influenced by cold temperatures and mean annual temperature differences.
Figure 9. A Principal Component Analysis (PCA) that explains 77.2% of the variance among species and ecozones as a function of "normal" annual climate variables (Hamann and Wang, 2013). The species included are white spruce (PCGL) and lodgepole pine (PICO), distinguished by symbol shape. The chronology sites are color-coded according to the ecozone that each chronology occupies.
The Northern Forests ecozone was most influenced by the number of frost-free days, which is intuitive given the geographical location of this region. Since there is only one chronology site in the Marine West Coast Forests ecozone, this ecozone will either require more chronology sites for a robust analysis or be removed from the dataset moving forward. Contrary to the boxplots shown above, the North Western Forested Mountains ecozone exhibited the largest range of variability in the PCA.
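The share of variance attributed to the leading components of a PCA can be illustrated with a short, dependency-free sketch. It standardizes each variable, forms the correlation matrix, and extracts the two largest eigenvalues by power iteration with deflation. This is a teaching sketch under stated assumptions, not the software used for this analysis: the function names are hypothetical, the three columns are made-up stand-ins for climate variables, and an established statistics package would normally be used instead.

```python
import math


def standardize(rows):
    """Centre each column and scale it to unit (sample) variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def correlation_matrix(z_rows):
    """Covariance of standardized columns, i.e. the correlation matrix."""
    n, p = len(z_rows), len(z_rows[0])
    return [[sum(r[i] * r[j] for r in z_rows) / (n - 1) for j in range(p)]
            for i in range(p)]


def mat_vec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def top_eigenpair(matrix, iterations=500):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    vec = [float(i + 1) for i in range(len(matrix))]  # arbitrary start vector
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue for the converged vector.
    value = sum(v * w for v, w in zip(vec, mat_vec(matrix, vec)))
    return value, vec


def explained_variance(rows, n_components=2):
    """Fraction of total variance carried by each leading component."""
    corr = correlation_matrix(standardize(rows))
    p = len(corr)
    total = sum(corr[i][i] for i in range(p))
    ratios = []
    for _ in range(n_components):
        value, vec = top_eigenpair(corr)
        ratios.append(value / total)
        # Deflation: subtract the component just found before the next pass.
        corr = [[corr[i][j] - value * vec[i] * vec[j] for j in range(p)]
                for i in range(p)]
    return ratios


# Made-up site-by-variable table (4 sites, 3 variables); illustration only.
sites = [(1.0, 2.0, 1.0), (2.0, 4.0, -1.0), (3.0, 6.0, -1.0), (4.0, 8.0, 1.0)]
ratios = explained_variance(sites)
print([round(r, 3) for r in ratios], round(sum(ratios), 3))
```

Summing the ratios of the first two components gives the kind of figure quoted in a PCA caption. Power iteration converges slowly when eigenvalues are close together, which is one reason a library routine is preferable for real data.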
The principal component analysis revealed several outliers in the precipitation variables, which were therefore log-transformed prior to further analysis. Additionally, classifying the chronology groups according to ecozone did not appear to be the most accurate approach. Therefore, the preferred approach moving forward in the analysis was the partitioning around medoids (PAM) function, which creates groups based on the similarity of specified input features. Further data exploration also showed that the chronology data for lodgepole pine were not robust enough to yield statistically significant results, so only the white spruce dataset was included in the final interpretation.
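The two steps named above, log transformation and partitioning around medoids, can be sketched as follows. This is a simplified illustration and not the implementation used in this study. It builds the medoids greedily and then accepts any swap that lowers the total distance, whereas full PAM evaluates all swaps and takes the best one at each step. The function names, the choice of Euclidean distance, the two features, and the example values are all assumptions.

```python
import math


def log_transform(values):
    """Natural log of strictly positive values (e.g. precipitation in mm)."""
    return [math.log(v) for v in values]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def total_cost(points, medoids):
    """Sum of each point's distance to its nearest medoid."""
    return sum(min(euclidean(p, points[m]) for m in medoids) for p in points)


def pam(points, k):
    """Simplified PAM: greedy initialisation, then swap-based improvement."""
    # BUILD: add, one at a time, the point that lowers the total cost most.
    medoids = []
    for _ in range(k):
        candidates = [i for i in range(len(points)) if i not in medoids]
        best = min(candidates, key=lambda i: total_cost(points, medoids + [i]))
        medoids.append(best)
    # SWAP: replace a medoid with a non-medoid whenever that lowers the cost.
    current = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(points, trial)
                if cost < current - 1e-12:
                    medoids, current, improved = trial, cost, True
    # Assign every point to the cluster of its nearest medoid.
    labels = [min(range(k), key=lambda c: euclidean(p, points[medoids[c]]))
              for p in points]
    return medoids, labels


# Made-up site data for illustration only: temperature (deg C) and
# annual precipitation (mm), the latter log-transformed before clustering.
temperature = [-5.0, -4.5, -5.5, 3.0, 3.5, 2.5]
precipitation = [300.0, 320.0, 310.0, 1500.0, 1600.0, 1550.0]
points = list(zip(temperature, log_transform(precipitation)))
medoids, labels = pam(points, k=2)
print(medoids, labels)
```

Unlike a centroid, each medoid is an actual site, so every group can be described by a real chronology. In practice the input features would usually be scaled to comparable ranges before distances are computed.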