Fighting the Black Box: Guide to Interpreting Common Machine Learning Algorithms

Sonnet Xu
3 min read · Jul 29, 2021


With the rapid growth in machine learning interest and usage, more and more concerns are arising about how these models are used. One major concern is ethical: how can we let algorithms make potentially harmful decisions if we don’t even know the reasoning behind them? For this reason, understanding why your machine learning model makes the decisions it does is crucial. This blog is a short survey of the interpretability of different ML algorithms. Hope you enjoy!


Latent Dirichlet Allocation (LDA)

Latent Dirichlet Allocation (LDA) is an unsupervised statistical model that is often used for its small footprint and fast prediction speeds, which make it especially useful for smaller datasets. The model is comparatively simple and doesn’t require constant tuning of hyperparameters. It works on collections of documents by scanning through the data and figuring out which group, or topic, each document belongs to. One document can belong to more than one topic: if it is similar enough to two topics, it can be assigned to both.


k-Nearest Neighbors (KNN)

k-nearest neighbors (KNN) is a supervised machine learning model, which means that it learns from a dataset in which each example is labeled with the class it belongs to. KNN is often used to predict one label from a dataset with many other features. For example, if you are trying to figure out whether a person owns a dog or a cat, you would look at features like age, gender, and hobbies. The model follows the philosophy that similar data points will be near each other, so a new point is assigned the most common class among its k closest neighbors. KNN can handle fairly complicated decision boundaries and often holds its own against other simple models.
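One reason KNN is comparatively easy to interpret is that every prediction can be traced back to the specific neighbors that voted for it. The sketch below is a minimal pure-Python version under assumed details: the tiny pet-owner dataset, the two numeric features, Euclidean distance, and `k=3` are all invented for illustration.

```python
import math
from collections import Counter

# Hypothetical training data: (age, hours outdoors per week) -> pet owned.
train = [
    ((25, 10), "dog"),
    ((31, 12), "dog"),
    ((28, 9), "dog"),
    ((62, 2), "cat"),
    ((55, 1), "cat"),
    ((47, 3), "cat"),
]

def knn_predict(query, data, k=3):
    """Return the majority label and the k neighbors that produced it."""
    # Sort the training points by Euclidean distance to the query.
    by_distance = sorted(data, key=lambda row: math.dist(query, row[0]))
    neighbors = by_distance[:k]
    # Majority vote among the k closest points.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0], neighbors

label, neighbors = knn_predict((30, 11), train)
print("prediction:", label)
for features, neighbor_label in neighbors:
    print("  neighbor", features, "->", neighbor_label)
```

Printing the neighbors is the interpretability step: you can point at the exact training examples behind a decision. In practice the features would usually be scaled first so that one of them does not dominate the distance.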

Naïve Bayes

Naïve Bayes is a supervised machine learning model that predicts the class of unknown data points, and it is often used when you have lots of data points but a small number of variables. It accomplishes this by combining Bayes’ theorem with the “naïve” assumption that the features are conditionally independent given the class. It also runs faster and more efficiently than many more complicated models. This makes it really useful for spam filtering and recommendations. Its biggest disadvantage is that very assumption of independence: in most real scenarios the features are actually dependent.
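The independence assumption is also what makes Naïve Bayes fairly easy to read: the score for a class is just a sum of one term per feature, so you can see which features pushed the decision. Here is a minimal pure-Python sketch for a spam filter. The four training messages, the add-one smoothing, and the helper names are all assumptions made up for illustration.

```python
import math
from collections import Counter

# Hypothetical labeled messages, invented for illustration.
train = [
    ("win cash prize now", "spam"),
    ("cheap prize win win", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]

word_counts = {"spam": Counter(), "ham": Counter()}
label_counts = Counter()
for text, label in train:
    label_counts[label] += 1
    word_counts[label].update(text.split())

vocab = set(word_counts["spam"]) | set(word_counts["ham"])

def word_log_prob(word, label):
    """log P(word | label) with add-one (Laplace) smoothing."""
    total = sum(word_counts[label].values())
    return math.log((word_counts[label][word] + 1) / (total + len(vocab)))

def log_score(words, label):
    """log P(label) plus one independent term per word."""
    prior = math.log(label_counts[label] / sum(label_counts.values()))
    return prior + sum(word_log_prob(w, label) for w in words)

# Words never seen in training are skipped.
words = [w for w in "win prize monday".split() if w in vocab]
scores = {label: log_score(words, label) for label in label_counts}
prediction = max(scores, key=scores.get)

# Per-word evidence: positive values favor spam, negative values favor ham.
evidence = {w: word_log_prob(w, "spam") - word_log_prob(w, "ham") for w in words}
print("prediction:", prediction)
for word, value in evidence.items():
    print(f"  {word}: {value:+.2f}")
```

Here "win" and "prize" pull toward spam while "monday" pulls toward ham, and the prediction is simply whichever side adds up to more.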


Support Vector Machine (SVM)

The Support Vector Machine (SVM) is a supervised machine learning model that uses hyperplanes to separate different classes of data. It is sometimes compared to linear discriminant analysis, which shares the LDA abbreviation but is unrelated to the topic model above. The SVM picks the hyperplane that maximizes the margin, that is, the distance between the boundary and the closest points of each class, which helps in noisy environments. It is very effective in high-dimensional spaces and can produce an accurate model with relatively few resources. SVMs can be used for both regression and classification tasks, but the majority of uses are for classification. The major drawback, however, is that the model is difficult to interpret, especially with non-linear kernels, which can make it a hard sell in commercial settings where decisions must be explained.
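The interpretability picture is a bit better for a linear SVM, where the hyperplane has one weight per feature. The sketch below assumes scikit-learn's `SVC` with `kernel="linear"`; the six data points are invented so that only the first feature separates the classes.

```python
from sklearn.svm import SVC

# Toy data, invented for illustration: the first feature separates the
# classes, while the second takes the same values in both classes.
X = [[0.0, 0.0], [0.0, 1.0], [0.5, 0.5], [2.0, 0.0], [2.0, 1.0], [1.5, 0.5]]
y = [0, 0, 0, 1, 1, 1]

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# coef_ is only available with a linear kernel: one weight per feature.
weights = clf.coef_[0]
print("weights:", weights)
print("bias:", clf.intercept_[0])

# The support vectors are the training points that pin down the boundary.
print("support vectors:", clf.support_vectors_)
print("prediction for [1.8, 0.2]:", clf.predict([[1.8, 0.2]])[0])
```

A large weight means the feature matters a lot for which side of the hyperplane a point lands on. With non-linear kernels such as RBF there is no `coef_` to read, which is where the interpretability problem really shows up.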

Gradient Boosted Forest Models and Random Forest Models

Forest models are some of the most favorable models when it comes to interpretability. It is easy to use information about the fitted model to directly estimate feature importance. In fact, most popular machine learning libraries (scikit-learn, TensorFlow, etc.) have off-the-shelf functions you can use to interpret your models.
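As one example of those off-the-shelf tools, here is a minimal sketch using the `feature_importances_` attribute of scikit-learn's `RandomForestClassifier`. The synthetic data is an assumption made for illustration: one column drives the label and the other is pure noise.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, invented for illustration.
rng = np.random.default_rng(0)
n = 300
signal = rng.normal(size=n)   # this feature determines the label
noise = rng.normal(size=n)    # this feature is unrelated to the label
X = np.column_stack([signal, noise])
y = (signal > 0).astype(int)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, normalized so that they sum to 1.
for name, importance in zip(["signal", "noise"], forest.feature_importances_):
    print(f"{name}: {importance:.3f}")
```

scikit-learn's `GradientBoostingClassifier` exposes the same attribute, so the loop works unchanged for a boosted model. These impurity-based scores are a quick first look rather than a final answer.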