This application allows researchers to interactively analyze Tree Amplitude outcomes.
Our primary objective is to determine which environmental factors—such as soil moisture or solar intensity—are the strongest predictors of tree radius amplitude.
Principal Component Analysis (PCA) is a technique used to condense the feature space by creating new axes from the original features while preserving as much variation as possible. Each component captures a different pattern of variation in the data, with the first components explaining the most variance. Here, we use PCA to reduce our 8 numerical environmental and growth features to a more manageable number while retaining as much information as we can.
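As a rough illustration of the idea, the sketch below runs PCA with NumPy's SVD and keeps the fewest components whose cumulative explained variance reaches a target. The function name `pca_fit`, the 95% default, and the synthetic data are assumptions made for this example; they are not the app's actual code or data.

```python
import numpy as np

def pca_fit(X, variance_target=0.95):
    """Standardize X, run PCA via SVD, and keep enough components to reach the target."""
    X = np.asarray(X, dtype=float)
    # Standardize so features measured in different units contribute equally.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal axes; squared singular values are proportional
    # to the variance captured along each axis.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained_ratio = S**2 / np.sum(S**2)
    cumulative = np.cumsum(explained_ratio)
    # Smallest number of components whose cumulative share meets the target.
    n_keep = min(int(np.searchsorted(cumulative, variance_target)) + 1, len(S))
    scores = Z @ Vt[:n_keep].T
    return scores, Vt[:n_keep], explained_ratio

# Synthetic stand-in for 8 numeric features (not the project's data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
X_demo = latent @ rng.normal(size=(3, 8)) + 0.3 * rng.normal(size=(200, 8))

scores, components, explained_ratio = pca_fit(X_demo)
print("components kept:", scores.shape[1])
print("cumulative variance:", np.round(np.cumsum(explained_ratio), 3))
```

Standardizing first is a design choice: without it, features with large numeric ranges would dominate the leading components.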
We use this larger correlation matrix to show how each principal component relates to the original features. This, alongside the loadings plot below, allows us to obtain a better understanding of what these components represent.
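One way such a matrix can be computed is sketched here: project the standardized features onto the principal axes, then correlate every original feature with every component score. The helper name `pc_feature_correlations` and the random demo data are hypothetical and only illustrate the calculation.

```python
import numpy as np

def pc_feature_correlations(X, n_components):
    """Return a (features x components) matrix of Pearson correlations."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T
    n_features = Z.shape[1]
    # Correlating the stacked columns gives a (p + k) x (p + k) matrix;
    # the upper-right block holds feature-versus-component correlations.
    full = np.corrcoef(np.hstack([Z, scores]), rowvar=False)
    return full[:n_features, n_features:]

# Random stand-in data with 8 features (not the project's data).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 8))
X_demo[:, 1] += 0.8 * X_demo[:, 0]  # make two features move together

corr = pc_feature_correlations(X_demo, n_components=3)
print(np.round(corr, 2))
```

A large positive or negative entry in a column suggests that the component largely tracks that feature, which is how components can be given tentative names.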
Principal Component Regression (PCR) combines PCA with linear regression to predict tree growth outcomes. Instead of using the original features directly, PCR builds a regression model on our principal components. This can improve interpretability and potentially provide new insights.
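A minimal sketch of PCR follows, assuming ordinary least squares on the scores of a user-chosen set of components. The function `fit_pcr`, its arguments, and the simulated target are illustrative assumptions, not the model fitted in this app.

```python
import numpy as np

def fit_pcr(X, y, pc_indices):
    """Regress y on the selected principal component scores of standardized X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    W = Vt[list(pc_indices)].T              # features x chosen components
    design = np.column_stack([np.ones(len(y)), Z @ W])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    def predict(X_new):
        # New rows must be scaled with the training mean and std.
        Z_new = (np.asarray(X_new, dtype=float) - mean) / std
        return coef[0] + (Z_new @ W) @ coef[1:]

    residual = y - design @ coef
    r2 = 1.0 - np.sum(residual**2) / np.sum((y - y.mean())**2)
    return predict, float(r2)

# Simulated data: 8 features and a target that depends on two of them.
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(150, 8))
y_demo = 2.0 * X_demo[:, 0] - 1.0 * X_demo[:, 3] + rng.normal(scale=0.5, size=150)

predict_all, r2_all = fit_pcr(X_demo, y_demo, range(8))
predict_sub, r2_sub = fit_pcr(X_demo, y_demo, [0, 1, 2, 3])
print(f"R^2 with all 8 PCs: {r2_all:.3f}, with the first 4: {r2_sub:.3f}")
```

Using every component reproduces ordinary linear regression on the standardized features, so any benefit of PCR comes from leaving some components out.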
Choose which PCs to include in the regression:
To explore the effects of different environmental conditions on tree amplitude, we used a K-Nearest Neighbors (KNN) algorithm to classify trees into daily basal area change categories. The KNN model predicts the change category of a new data point from the most common category among its nearest neighbors.
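The voting rule can be sketched in a few lines of plain Python. The function `knn_predict`, the two made-up features, and the toy points below are assumptions for illustration only; the app's model uses more features and its own choice of k.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points closest to query."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Toy points with two already-scaled features (made-up values).
train_points = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
                (0.80, 0.90), (0.90, 0.80), (0.85, 0.95)]
train_labels = ["No/Little Change"] * 3 + ["Extreme Change"] * 3

print(knn_predict(train_points, train_labels, (0.2, 0.2)))
print(knn_predict(train_points, train_labels, (0.9, 0.9)))
```

Because the rule depends on distances, features are normally put on a common scale first so that no single variable dominates the neighbor search.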
The preselected variables are the ones that yielded the best model performance. The graphs/figures show the following:
To understand how different environmental conditions group into clusters, which we call "forests," we used a K-Means algorithm with both numeric and categorical variables.
We settled on three forests after examining the elbow plot below. Three and four forests have very similar silhouette scores, and three offers the best balance between structure and interpretability.
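The two diagnostics mentioned here can be sketched as follows: inertia (the quantity drawn in an elbow plot) and the mean silhouette score, each computed for several cluster counts. This is a simplified NumPy version with a deterministic farthest-point start, which is an assumption of this example rather than the initialization used in the app; the three synthetic blobs are also made up.

```python
import numpy as np

def farthest_point_init(X, k):
    """Deterministic start: first row, then repeatedly the row farthest from chosen centers."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.linalg.norm(X[:, None, :] - np.array(centers)[None, :, :], axis=2)
        centers.append(X[d.min(axis=1).argmax()])
    return np.array(centers)

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm; returns cluster labels and inertia (within-cluster squared distance)."""
    centers = farthest_point_init(X, k)
    for _ in range(n_iter):
        labels = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2).argmin(axis=1)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    labels = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2).argmin(axis=1)
    inertia = float(np.sum((X - centers[labels]) ** 2))
    return labels, inertia

def mean_silhouette(X, labels):
    """Average of (b - a) / max(a, b) over all points; singleton clusters score 0."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:
            scores.append(0.0)
            continue
        a = D[i, same].sum() / (same.sum() - 1)   # mean distance to own cluster, self excluded
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Three synthetic, well-separated blobs (not the project's data).
rng = np.random.default_rng(3)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 9.0]])
X_demo = np.vstack([c + 0.5 * rng.normal(size=(50, 2)) for c in blob_centers])

results = {}
for k in range(2, 6):
    labels, inertia = kmeans(X_demo, k)
    results[k] = (inertia, mean_silhouette(X_demo, labels))
    print(f"k={k}: inertia={results[k][0]:.1f}, silhouette={results[k][1]:.3f}")
```

In an elbow plot, the chosen k is typically where inertia stops dropping sharply; the silhouette score then serves as a second opinion when neighboring values of k look similar.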
This project focuses on modeling the impact that environmental factors have on Tree Stem Amplitude (how much a tree grows or shrinks in a day). Tree Amplitude primarily comes from trees absorbing or losing water. As the climate changes (and the Arctic is particularly susceptible to warming), it is unknown how trees will react. Our model aims to clarify which factors affect tree amplitude and, therefore, how climate change may affect trees.
To address the research question of distinguishing physiological regimes, we used a multinomial logistic regression classifier to predict categories of Basal Area Daily Amplitude, a choice suited to separating "Extreme Change" events from background noise.
The model achieved an accuracy of 47.7%, outperforming the random baseline (33%), though performance metrics indicate a stronger ability to identify stable periods ("No/Little Change" Recall: 0.61) compared to detecting high-amplitude events ("Extreme Change" Recall: 0.33).
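For readers less familiar with these metrics, the sketch below shows how accuracy, per-class recall, and per-class precision follow from a confusion matrix. The counts are invented for illustration and are not the project's confusion matrix; the helper name `classification_metrics` is likewise hypothetical.

```python
def classification_metrics(confusion, class_names):
    """confusion[i][j] = number of samples of actual class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(class_names)))
    metrics = {"accuracy": correct / total}
    for i, name in enumerate(class_names):
        actual = sum(confusion[i])                    # row total
        predicted = sum(row[i] for row in confusion)  # column total
        metrics[name] = {
            "recall": confusion[i][i] / actual if actual else 0.0,
            "precision": confusion[i][i] / predicted if predicted else 0.0,
        }
    return metrics

class_names = ["No/Little Change", "Moderate Change", "Extreme Change"]
# Made-up counts with 100 samples per actual class.
confusion = [
    [50, 30, 20],
    [25, 50, 25],
    [30, 30, 40],
]

metrics = classification_metrics(confusion, class_names)
print(f"accuracy: {metrics['accuracy']:.3f}")
for name in class_names:
    print(name, {key: round(value, 3) for key, value in metrics[name].items()})
```

With three equally sized classes, guessing at random is right about one time in three, which is where a 33% baseline comes from.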
Despite this identification gap, the model supported key biological assumptions, indicating that species identity and energy input are important determinants: Picea mariana and high average solar irradiance emerged as the strongest positive drivers of extreme daily amplitude, while Picea glauca served as a significant negative predictor associated with stability.
To condense the number of features in our analysis and potentially simplify the model, we performed principal component analysis. This technique led to the identification of 6 principal components that explain roughly 95% of the variance in the data, allowing us to shrink the number of features from 8 to 6.
While some components are more difficult to interpret (particularly the later ones), we do gain valuable insight from others. For example, PC1 and PC2 appear to be related to current tree stress, with PC1 being highly correlated with current stem radius and basal area, while PC2 is negatively correlated with these factors. PC4 seems to be a good general indicator of tree growth, since it is positively correlated with change in stem radius and basal area, as well as humidity and soil water content, which may indicate that additional expansion comes from water absorption.
When we take these components and apply them to a Principal Component Regression (PCR) model, they appear to work very well. As our scree plot demonstrated, the first four components are particularly useful in explaining the change in basal area. Overall, PCA has proven to be a useful tool.
Our regression results support the well-established hydraulic mechanism in which high vapor pressure deficit (VPD) drives reversible trunk shrinkage, with recovery under low VPD. Higher VPD increased daily stem amplitude, while greater soil moisture buffered this effect, consistent with Devine & Harrington (2011).
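For context, VPD is commonly derived from air temperature and relative humidity. The sketch below uses the Tetens approximation for saturation vapor pressure; how VPD was obtained for this project's data is not stated here, so the formula choice and function names are assumptions made for illustration.

```python
import math

def saturation_vapor_pressure_kpa(temp_c):
    """Tetens approximation for saturation vapor pressure over water, in kPa."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))

def vapor_pressure_deficit_kpa(temp_c, relative_humidity_pct):
    """VPD = saturation vapor pressure minus actual vapor pressure."""
    e_sat = saturation_vapor_pressure_kpa(temp_c)
    return e_sat * (1.0 - relative_humidity_pct / 100.0)

# Warm, dry air has a higher deficit than cool, humid air.
for temp_c, rh in [(10.0, 80.0), (25.0, 50.0), (30.0, 30.0)]:
    vpd = vapor_pressure_deficit_kpa(temp_c, rh)
    print(f"{temp_c:.0f} C, {rh:.0f}% RH -> VPD {vpd:.2f} kPa")
```

The deficit rises with temperature at a fixed relative humidity, which is one reason VPD is often examined separately from temperature alone.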
Our results show that Arctic tree stem dynamics respond strongly to VPD and soil water, not just temperature, supporting Jensen’s argument that moisture-related stress is a key but overlooked driver of Arctic tree physiology under climate change.
This is particularly compelling because several studies, such as Ziegler et al. (2024), have found a strong relationship between shrinkage (tree water deficit, TWD) and hydraulic stress ($\psi$) that persists across all drought conditions until lethal dehydration. That is, large TWD (shrinkage) amplitudes are strongly linked to high hydraulic stress ($\psi$ approaching lethal levels) because large TWD means living tissues have lost a lot of water (low turgor) to supply transpiration, signaling water stress.
To approach the question of predicting Basal Area Daily Amplitude from a set of environmental factors, we applied a K-Nearest Neighbors (KNN) classifier. Our model achieved an accuracy of 66%, outperforming the random baseline (33%) by a factor of 2. The model had good predictive performance across all 3 categories, with "Moderate Change" performing the worst yet still having a precision of 57.7% and a recall of 52.2%.
This result was obtained using the following features: Air Pressure, Humidity, Temperature, Solar Radiation, Soil Moisture, Stem Radius, and Species. The addition of variables measuring time of year (month), site location, latitude, and longitude (among others) did not improve performance. Thus, a fairly accurate predictive model can be built using only environmental features while controlling in part for species and tree size. While this model cannot show exactly how influential each variable is, it does provide a useful baseline for further analysis and shows evidence that tree daily amplitude is strongly influenced by environmental factors.
However, we can look at some partial effects plots: heatmaps of 2 numeric features with an overlay for the predicted change, holding everything else at its mean. We see some of what is happening in the data, but there is a lot of uncertainty. For humidity and soil water content, the most extreme changes are predicted when conditions are either low humidity and dry soil or high humidity and wet soil. For soil water content and temperature, we also see that soil water content has a large deciding effect at lower temperatures but not at higher ones, where many values are predicted to be extreme. For many of the plots with solar irradiance, there is a middle band of sunlight where the prediction is largely moderate, with extreme and no/little change occurring at high or low solar irradiance.
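The grid behind such a heatmap can be built roughly as sketched here: two features are varied over their observed ranges while every other feature is fixed at its mean, and the model is asked for a prediction at each grid cell. The `toy_predict` rule stands in for the fitted classifier and is purely hypothetical, as are the function name `partial_effect_grid` and the random data.

```python
import numpy as np

def partial_effect_grid(predict, X, i, j, n_points=25):
    """Predict over a grid of features i and j, holding all other features at their means."""
    X = np.asarray(X, dtype=float)
    xi = np.linspace(X[:, i].min(), X[:, i].max(), n_points)
    xj = np.linspace(X[:, j].min(), X[:, j].max(), n_points)
    gi, gj = np.meshgrid(xi, xj)            # rows follow xj, columns follow xi
    grid = np.tile(X.mean(axis=0), (gi.size, 1))
    grid[:, i] = gi.ravel()
    grid[:, j] = gj.ravel()
    return xi, xj, predict(grid).reshape(gi.shape)

def toy_predict(rows):
    # Hypothetical stand-in for a fitted model: class code 0, 1, or 2
    # depending on the product of the first two features.
    return np.digitize(rows[:, 0] * rows[:, 1], [0.25, 0.5])

# Random stand-in data with 4 features scaled to [0, 1].
rng = np.random.default_rng(4)
X_demo = rng.uniform(0.0, 1.0, size=(100, 4))

xi, xj, surface = partial_effect_grid(toy_predict, X_demo, i=0, j=1)
print(surface.shape, np.unique(surface))
```

The returned arrays are laid out so that `surface[r, c]` is the prediction at `xj[r]` and `xi[c]`, which is the orientation most heatmap functions expect.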
To observe how different environmental conditions group trees into distinct "forests," we applied K-Means clustering using both numeric and categorical variables. Based on the elbow plot and nearly identical silhouette scores for three and four clusters, we chose three clusters to keep the model interpretable while still capturing key structure in the data.
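K-Means works on numeric distances, so categorical variables need a numeric encoding before clustering. One common approach, shown here as an assumption rather than the encoding this project necessarily used, is to standardize the numeric columns and one-hot encode the categories. The helper `build_cluster_matrix` and the sample values are invented for illustration.

```python
import numpy as np

def build_cluster_matrix(numeric, categories):
    """Standardize numeric columns and append one indicator column per category level."""
    numeric = np.asarray(numeric, dtype=float)
    scaled = (numeric - numeric.mean(axis=0)) / numeric.std(axis=0)
    levels = sorted(set(categories))
    one_hot = np.array(
        [[1.0 if value == level else 0.0 for level in levels] for value in categories]
    )
    return np.hstack([scaled, one_hot]), levels

# Made-up rows: temperature (C) and soil water content, plus species.
numeric = [[12.0, 0.30], [15.5, 0.22], [9.8, 0.41], [14.1, 0.27]]
species = ["Picea mariana", "Picea glauca", "Picea mariana", "Picea glauca"]

features, levels = build_cluster_matrix(numeric, species)
print(levels)
print(np.round(features, 2))
```

How heavily the indicator columns are weighted relative to the scaled numeric columns affects which variables drive the clusters, so that weighting is worth checking when interpreting the resulting groups.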
Forest 1 represents wetter, hotter, and sunnier sites with extreme changes in mean amplitude, whereas Forest 2 captures drier, colder, and shadier environments where mean amplitude is only moderate. Forest 3 is similar to Forest 1 but has higher humidity and air pressure that result in only moderate mean amplitude. This suggests that small shifts in climate variables can meaningfully change overall tree growth patterns.