A data science team is developing a model to predict customer churn.
The lead data scientist is concerned about overfitting: the initial Decision Tree model achieves 99% accuracy on the training data but only 75% on the test data.
They also want a model that is robust to noise and provides feature importance rankings.
Which of the following algorithms would be the most appropriate next choice to address these specific concerns?
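
For context, the train/test gap described in the scenario can be reproduced with a minimal sketch. It assumes scikit-learn is available and uses a synthetic dataset from `make_classification` as a stand-in for the churn data, which is not given here. The function name `train_test_gap` and all parameter values are hypothetical, chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def train_test_gap(random_state=0):
    """Fit an unpruned decision tree and return (train accuracy, test accuracy)."""
    # Synthetic stand-in for churn data (assumption, not the team's dataset).
    # flip_y=0.2 assigns random labels to roughly 20% of samples to simulate noise.
    X, y = make_classification(
        n_samples=1000,
        n_features=20,
        n_informative=5,
        flip_y=0.2,
        random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state, stratify=y
    )

    # No depth limit: the tree is free to memorize the noisy training labels.
    tree = DecisionTreeClassifier(random_state=random_state)
    tree.fit(X_train, y_train)

    # score() returns mean accuracy on the given data.
    return tree.score(X_train, y_train), tree.score(X_test, y_test)


train_acc, test_acc = train_test_gap()
print(f"train accuracy: {train_acc:.2f}")
print(f"test accuracy:  {test_acc:.2f}")
print(f"gap:            {train_acc - test_acc:.2f}")
```

A large positive gap between the two scores is the overfitting symptom the lead data scientist describes. The exact numbers depend on the data and will not match the 99% and 75% in the scenario.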