
Classification of Alzheimer’s Disease Stages using Radiology Imaging and Longitudinal Clinical Data – Part 8

Fine Tuning of an Ensemble of Classification Models Using Random Grid Search

A model's parameters are learned during training, while its hyperparameters are set beforehand to control how the model is fit. Grid search is a technique for finding good hyperparameters for a model. Ensemble learning trains multiple models and combines the diverse classifiers into a single, stronger learner. It improves robustness over a single learner and can cope with both large volumes of data and insufficient data (Yao et al., 2018). The technique has been applied in several papers (Kruthika et al., 2019a; Zhang and Sejdić, 2019). The goal of this implementation is to achieve better performance by running a grid search and using an ensemble of classifiers. This post continues from the previous part.

Run Grid Search to Find Most Acceptable Hyperparameters

Hyperparameter values need to be set before the learning process because they control that process and cannot be estimated from the data. Candidate combinations of values are scored on a validation data set to find the best hyperparameters. Random grid search defines a grid of hyperparameter values, samples random combinations from it, and trains and scores each combination on the validation data rather than on the test data. Keeping the test set out of tuning helps the reported performance generalize. Running random grid search (RandomizedSearchCV) several times with cross-validation helps to find the most acceptable hyperparameters for each model. The selected hyperparameters are then used for the ensemble of classifiers.
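The search described above can be sketched as follows. This is a minimal illustration and not the code used in the post: the data are synthetic, and the search space, `n_iter`, `cv` and scoring metric are all assumptions, since the post does not list the ranges it searched.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic three-class data standing in for the real features (assumption).
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

# Hold out a test set that the search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Hypothetical search space; the real ranges are not given in the post.
param_distributions = {
    "n_estimators": randint(20, 100),
    "max_depth": [3, 5, 10, None],
    "min_samples_leaf": randint(1, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,            # number of random combinations to try
    cv=3,                # 3-fold cross-validation on the training data
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

best_params = search.best_params_     # best sampled combination
best_cv_score = search.best_score_    # its mean cross-validated score
print(best_params, round(best_cv_score, 3))
```

The same pattern would be repeated for each classifier in the ensemble, with a search space suited to that model.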

Generate Feature Importance of the Classifiers

The figure shows the feature importance of different classifiers.


It shows, for a given model, which features are most important in explaining the target. The MRI entorhinal measurement is the most important feature, followed by MMSE, for all the classifiers except AdaBoost. For AdaBoost, RAVLT immediate is the next most important feature. However, these importance scores can be biased when continuous features or high-cardinality categorical features are present.
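As a rough sketch of how such a ranking can be produced, tree-based scikit-learn classifiers expose a `feature_importances_` attribute after fitting. The data and the feature names below are placeholders (assumptions), not the clinical and MRI features from the post, and the models use small default-like settings rather than the tuned hyperparameters.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier,
                              RandomForestClassifier)

# Placeholder feature names and synthetic data (assumptions).
feature_names = [f"feature_{i}" for i in range(6)]
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_redundant=0, n_classes=3, random_state=0)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Extra Trees": ExtraTreesClassifier(n_estimators=50, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=50,
                                                    random_state=0),
}

# Fit each model and rank features from most to least important.
importances = {}
for name, model in models.items():
    model.fit(X, y)
    ranked = sorted(zip(feature_names, model.feature_importances_),
                    key=lambda pair: pair[1], reverse=True)
    importances[name] = ranked
    print(name, [feature for feature, _ in ranked[:3]])
```

Comparing the top-ranked features across models, as the figure does, shows where the classifiers agree and where one of them weighs the features differently.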

Implementation, Evaluation and Result of Ensemble of Classifiers

Ensemble learning generally improves the performance of the models (Nanni et al., 2016). Random Forest, Extra Trees, AdaBoost, Gradient Boosting and XGBoost classifiers with optimized hyperparameters are combined using a voting classifier. The voting classifier combines these models using soft voting, which is:

$$\hat{y} = \arg\max_{i} \sum_{j=1}^{m} w_j \, p_{ij}$$

Soft voting predicts the class label from the predicted probabilities and is recommended for well-calibrated classifiers. Here p_{ij} is the probability that the jth classifier assigns to class i, w_j is the weight that can be assigned to the jth classifier, and m is the number of classifiers. It is implemented with the scikit-learn function VotingClassifier(), and the weights for the models are 2, 3, 3, 1 and 3.

The figure is a normalized confusion matrix for the ensemble of classifiers.


The diagonal elements give the fraction of each class that is predicted correctly, i.e., 0.47 for normal (NL), 0.60 for MCI and 0.99 for dementia. The off-diagonal elements are the samples that are confused with the other classes. Therefore, the model is better at predicting dementia and MCI than normal with the threshold of the ensemble of classifiers fixed at 0.5.
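To make the normalization concrete, the following sketch builds a row-normalized confusion matrix with scikit-learn. The labels are a small hand-made toy example (an assumption), not the predictions from the post, so the numbers differ from the figure.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

class_names = ["NL", "MCI", "Dementia"]

# Toy labels for illustration only (assumption).
y_true = ["NL"] * 4 + ["MCI"] * 5 + ["Dementia"] * 4
y_pred = (["NL", "NL", "MCI", "MCI"]
          + ["MCI", "MCI", "MCI", "NL", "Dementia"]
          + ["Dementia"] * 4)

# normalize="true" divides each row by the number of true samples in that
# class, so every row sums to 1 and the diagonal is the per-class recall.
cm = confusion_matrix(y_true, y_pred, labels=class_names, normalize="true")

for name, recall in zip(class_names, np.diag(cm)):
    print(f"{name}: {recall:.2f}")
```

In this toy case the diagonal is 0.50 for NL, 0.60 for MCI and 1.00 for dementia, read the same way as the values in the figure.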

The model predicts normal with an AUROC score of 0.72 against dementia and MCI, MCI with an AUROC score of 0.60 against normal and dementia, and dementia with an AUROC score of 0.89 against normal and MCI.


The ROC curve measures the performance of the model without fixing the threshold. It plots a point for every possible threshold and helps to select the threshold of the model depending on the use case. The figure shows that the ensemble of classifiers is better at predicting dementia than normal and MCI when the threshold is not fixed. Hence, the model is better than the implementation of XGBoost in the previous section.
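The one-vs-rest AUROC scores and the underlying ROC points can be computed along the following lines. This is a sketch under assumptions: the data are synthetic, a single random forest stands in for the ensemble, and the integer labels 0, 1 and 2 are mapped to NL, MCI and dementia purely for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

class_names = ["NL", "MCI", "Dementia"]  # assumed mapping for labels 0, 1, 2

# Synthetic data and a stand-in model (assumptions).
X, y = make_classification(n_samples=400, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
proba = model.predict_proba(X_test)

# One-vs-rest: treat each class in turn as positive and the rest as negative.
y_bin = label_binarize(y_test, classes=[0, 1, 2])
auc_per_class = {}
for k, name in enumerate(class_names):
    auc_per_class[name] = roc_auc_score(y_bin[:, k], proba[:, k])
    print(f"{name} vs rest: AUROC = {auc_per_class[name]:.2f}")

# ROC points for one class: one (fpr, tpr) pair per candidate threshold.
fpr, tpr, thresholds = roc_curve(y_bin[:, 2], proba[:, 2])

# The macro one-vs-rest score is the unweighted mean of the per-class scores.
macro_auc = roc_auc_score(y_test, proba, multi_class="ovr", average="macro")
```

Scanning `thresholds` together with `fpr` and `tpr` is one way to pick an operating point, for example a threshold that keeps the false positive rate for dementia below a level the use case can tolerate.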

The report continues here.
