: A Metaheuristic Approach of predicting the Dynamic Modulus in Asphalt Concrete A Metaheuristic Approach of predicting the Dynamic Modulus in Asphalt Concrete

The prediction of the asphalt dynamic modulus ( E* ), which measures the material's ability to withstand changes in shape or structure, is important. Previous studies indicated that the well-known Witczak 1-40D model for E* is outperformed by machine learning models. Additionally, the application of machine learning algorithms requires manual fine-tuning of their hyperparameters. In this study, the artificial Hummingbird and Harris Hawks optimization algorithms were employed in the automatic calibration of the Random Forest and Gradient Boost algorithms' hyperparameters for modeling E* using the Witczak 1-40D model and additional parameters. In addition, the model was interpreted using the Shapley value and permutation feature importance. The results indicate that the optimized artificial hummingbird algorithm model performed better, with R ² reaching 0.97. The interpretability of the model suggests that the binder parameters exhibited the highest effect on the variance of E*


INTRODUCTION
Road quality is crucial for the economic development of a country and well-designed roads are vital to the economic progress of any region.The construction of a high-quality road requires careful knowledge of road distress and a proper selection of materials to withstand these stresses.Rutting is one of the asphalt pavement distresses prevalent on roads exposed to high temperatures and traffic loads.Asphalt concrete dynamic modulus (E*), which measures the material's ability to withstand changes in shape or structure [1], is often used to characterize the resistance of a material to rutting.Several models have been developed to predict rutting, the most notable among them being the Witczak model [2][3].The performance evaluation of predicting E* using the Witczak model gave an R 2 of 0.94 with Gaussian process regression [4].
Previous studies have shown that machine learning evaluation performance can be improved when an optimization approach is applied to fine-tune model hyperparameters instead of performing manual trial-and-error calibration [5][6][7].This is evident in the results reported in [8][9][10][11][12] in the prediction of E*.
Despite the reported high accuracy, there are significant gaps in the literature:  The databases used are limited.
 The input parameters do not account for all the parameters that affect the variance of E*.
 Model interpretability is missing.
This study sought to address the aforementioned gaps using the Witczak database with 346 test cases, which totals 7,400 data points.Furthermore, the Witczak model input parameter was used, since it accounts for the binder, aggregate, void, and test parameters, in addition to five other parameters.

II. METHODOLOGY
This study focuses on the automatic calibration of traditional machine learning models, namely Random Forest (RF) and Gradient Boost Regression (GBR), to predict E* using Witczak 1-40D parameters.This study aims to improve the prediction accuracy of E* using the Witczak 1-40D parameters, six additional parameters, and the Witczak 1-40D database.The parameters of the Witczak 1-40D are: r200 is the percentage passing a #200 sieve, r4 is the cumulative percentage retained on a #4 sieve, r38 is the cumulative percentage retained on a 3 / 8 " sieve, r34 is the cumulative percentage retained on ¾" sieve, Va is the air void percentage, Vbeff is the effective binder content percentage, |Gb*| is the binder complex shear modulus in psi, and db is the binder phase angle.The six additional parameters were: Asphalt Content (AC), Temperature (T), frequency of loading (fs), air voids (A), Volumetric Total Solids (VTS), and phase angle of the mixture (phi).This study also aims to provide model feature interpretability using the Shapley (SHAP) value and Permutation Feature Importance (PFI) analysis to identify the key parameters that significantly influence predictive accuracy.The trained models were evaluated using standard error metrics such as the Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), R-squared (R 2 ), Nash-Sutcliffe Efficiency (NSE), and Kling-Gupta efficiency (KGE).The mathematical expressions of these metrics are: where = ∑6 Pred ," − 8 9 , : = ∑6Y <=>,?− 8 9 , 8 is the mean of @ value, 3 is the number of observed values, $%&'" is the predicted value, !" is the observed value, -represents the correlation coefficient, -A is the coefficient of determination, BA denotes the relative deviation, -C represents the mean error, and BC represents the relative mean error.

A. Model Training
The model was trained with the GBR and RF algorithms from the scikit-learn library in Python.Training these algorithms traditionally requires manual calibration of the algorithm parameters.This could be time-consuming and with a low possibility of reaching the optimum solution.For this reason, two metaheuristic optimization algorithms, Haris Hawks Optimization (HHO) and Artificial Hummingbird Algorithm (AHA), were used to automatically calibrate the GBR and RF model training hyperparameters.
HHO is a metaheuristic algorithm inspired by hawks' hunting strategies It demonstrates exceptional performance in many optimization problems, such as mobile robot path planning, and enhanced feature selection performance by integrating chaotic opposition and simulated annealing in an improved version of the algorithm consisting of two main stages, investigation and exploitation, utilizing equations with variables representing hawks' positions, prey location, solution space bounds, and random numbers.Although this algorithm is very effective, it can struggle with population diversity and local optima [13].AHA [14] is a novel bioinspired design that is classified as a new metaheuristic optimizer, based on intelligent foraging strategies and unique flights of hummingbirds.AHA simulates three foraging behaviors, guided, territorial, and migration foraging, in the optimization process.It uses a visit table to model the flight abilities and clever foraging techniques of hummingbirds.AHA has three basic components: food sources, hummingbirds, and visit tables.Food sources measure nectar quality, nectar replenishment rate, and last visit to flowers.Hummingbirds are assigned to particular food sources and communicate with each other.For each food source, visit levels are recorded in visit tables [14].
For GBR, the key hyperparameters optimized are the number of estimators, learning rate, and maximum depth.These hyperparameters control the trade-off between accuracy and complexity in gradient boost models, where more trees, lower learning rate, and higher depth can improve the fit but also increase the risk of overfitting and the computational cost.The upper and lower bounds of the optimization algorithm search ranges for these parameters were: 50-200 estimators, 0.01-1 learning rate, and 1-10 maximum depth.The search was carried out over a population of 50 in 180 iterations.RF optimization was applied to the number of estimators, maximum depth, and maximum features.The lower and upper values of the optimization algorithm search range were: 10-200 estimators, 1-50 maximum depth, and maximum feature range from 1 to the maximum number of input variables.The search was carried out over a population of 100 in 180 iterations.

B. Model Feature Selection
Model feature importance can help to understand how the model makes predictions, identifying and removing irrelevant or redundant features, and comparing and contrasting different models or datasets.There are different ways to calculate model feature importance, such as coefficients from linear models, decision trees, permutation importance, and SHAP values.This study used PFI and SHAP analysis to offer a comprehensive, multi-faceted evaluation of feature importance in the predictive models.
The SHAP value is a concept from cooperative game theory that distributes the total pay-off among the players of a game according to their marginal contribution.In this context, the players are the features, the game is the prediction of the model, and the payoff is some measure of the importance or influence of the feature subset.The SHAP value for a feature is calculated by averaging the difference in the model output when the feature is included or excluded from all possible subsets of features [15].PFO is a technique that measures how much the prediction error of a machine learning model increases when the values of a feature are randomly shuffled.This technique can be used for any fitted model and any tabular data, and it is especially useful for non-linear or opaque models that are hard to explain.PFI can help in understanding which features are more important to the model and how they affect its performance and accuracy [16].

III. RESULTS AND DISCUSSION
This study evaluates the performance potentials of the AHA and HHO-optimized GB and RF models to predict E*.This study is unique as it used two distinct sets of input features for its predictions.On the one hand, it used the Witczak 1-40D model input features, a conventional method known for its reliability in predictions.On the other hand, the model was also trained with a combination of Witczak 1-40D input features and six additional features: AC, T, fs, A, VTS, and phi.This dual approach in feature evaluation aimed to explore the extent to which the integration of these additional features could enhance the model's accuracy in predicting E*, thereby providing a comprehensive understanding of the model's predictive capabilities.Furthermore, the interpretability of the model parameters was evaluated using the SHAP value and PFI.

A. Performance using Witczak 1-40D Input Parameters
The evaluation of Witzcak parameters using GB and RF models, reveals significant insights into their predictive accuracy and efficiency.While the GB model exhibits excellent performance in the training phase (R² ranging from 0.90 to 0.99, NSE from 0.91 to 0.99, KGE from 0.90 to 0.99), it shows a slight decline in the testing phase (R² from 0.89 to 0.93, NSE from 0.90 to 0.93, KGE from 0.89 to 0.96), indicating a potential overfitting to the training data.On the other hand, the RF model demonstrates remarkable consistency and robustness, maintaining scores across all metrics in both training (R², NSE, KGE all around 0.98 to 0.99) and testing phases (R² and NSE from 0.92 to 0.94 and KGE from 0.92 to 0.93), Figure 1shows that RF models are more fitted compared to GB models.This consistent high performance of the RF model, especially in the testing phase, underscores its superior ability to generalize, making it a more reliable choice for applications requiring stable and accurate predictions on new, unseen data.

B. Performance using Witczak 1-40D Input and Additional Parameters
For the model with the Witczak 1-40D model parameters and the additional six parameters, the GBR training results were outstanding, particularly for the AHA and HHO variants, with R², NSE, and KGE values reaching perfection.However, a slight decline was observed in the test phase, although the metric values remain high (R², NSE, KGE ranging from 0.92 to 0.97), indicating strong model generalization but a slight overfit in training.In contrast, the RF model exhibited exceptional consistency and slightly higher robustness, with almost perfect scores in training (R², NSE, KGE all around 0.99) and very high scores in testing (R², NSE, KGE all around 0.94), showcasing its remarkable ability to generalize across unseen data.These results indicate that while GBR provides excellent fit and predictive accuracy as shown in Figure 2(b), especially when its parameters were optimized with AHA and HHO, RF stands out for its consistency and robustness, making it exceptionally reliable for practical applications where generalization and stability across diverse datasets are crucial.Furthermore, the addition of the six parameters shows that the performance of the E* prediction can be improved when compared to the performance reported in [4,17].

C. Parameter Interpretability and Optimize Input Features for the Model
For the model interpretability, the E* measurement of asphalt mixture depends on grouped factors: binder parameters, mixture test parameters, mixture volumetric parameters, and the aggregate distribution properties.

1) Binder Parameters
Figure 3 shows that the binder's complex modulus (G*) significantly influences the prediction of the E* of an asphalt mixture.This impact is more pronounced than the effects of the binder's phase angle, which although important, is less influential than the phase angle of the mixture and the test temperature.for the binder are close to zero, there is an increased importance of these properties in predicting E*.

2) Mixture Test Parameters
The analysis of a group of mixture test parameters, including mixture, phase angle, test temperature T, and frequency Fs, shows that they are significantly influential in the prediction of the E* of an asphalt mixture, following the binder parameter group in importance.Specifically, the mixture phase angle within this group is highlighted for its high permutation importance, as shown in Figure 3(b).According to the SHAP explainer plot shown in Figure 3(a), these parameters generally exhibit a negative distribution, implying a particular kind of influence on E* predictions.Additionally, it was observed that the test frequency, although a part of this group, has a relatively minor effect on the prediction as indicated by its low permutation importance score.

3) Aggregate Distribution Properties
In evaluating the E* of asphalt mixtures, the significance of the aggregate gradation parameters progressively decreases.Among these, the percentage of aggregates retained on a 19.05 mm sieve stands out (r34), ranking as the second most influential factor in this category.On the contrary, the percentage of aggregates passing through a 0.13 mm sieve r200 is observed to have the least influence on E* predictions within the same group.This trend underscores hierarchical importance in aggregate sizes, where larger sizes, such as those retained on the 19.05 mm sieve, play a more substantial role, while finer sizes, like those passing through the 0.13 mm sieve, contribute less to the prediction of the mixture's dynamic modulus.

4) Mixture Volumetric Properties
In the analysis of the E* of asphalt mixtures, the parameters AC, VA, and Vbeff have a significant influence.These parameters substantially affect the E* predictions, a fact underscored by their notable permutation importance scores.Despite this, their Shapley effect distributions tend to hover near zero.This implies that while each parameter individually contributes to E* predictions, their collective impact within the predictive model is more balanced or neutral.

IV. CONCLUSION
This study investigated the novel approach of using the Artificial Hummingbird Algorithm (AHA) and Harris Hawks Optimization (HHO) for the automatic calibration of Gradient Boosting Regression (GBR) and Random Forest (RF) algorithm hyperparameters for the prediction of E*.Additionally, the prediction of E* using the Witczak model parameters was improved by including six additional inputs for an accurate account of its variation.The presented results show that by adding asphalt content, temperature, frequency of loading, air voids, volumetric total solids, and mixture phase angle to the existing Witczak model parameters, the R² performance of the model prediction improved from 0.92 to 0.97 in a model trained with a hybrid AHA-optimized Gradient Boost model.The model interpretability presented showed that the binder properties and mixture phase angle had the highest effect on the variance of E*.The size of the aggregate shows a decreasing effect with a decrease in the aggregate size, indicating the dependency of the mixture stiffness on the aggregate size.
The Witczak 1-40D database used contains 5970 unique test cases and was partitioned into training and testing sets in a 65:35 ratio.The test set was used for the model evaluation after model training was completed.Table I presents the details of the features of the model, assessed with the relevant statistical parameters.

TABLE I
Additionally, in regions where VTS and A values