In this Substack, we will apply a One-Class Support Vector Machine (One-Class SVM) to the iris dataset. One-Class SVM is a machine learning algorithm used for unsupervised anomaly detection. It is trained on instances from a single class, typically the majority class or normal instances. The model learns a decision boundary that encloses the normal data, and during inference it classifies a new instance as normal if it falls inside that boundary and as anomalous if it falls outside.
To demonstrate the One-Class SVM, we will use the iris dataset once again. It contains four measurements for each of 150 iris flowers: sepal length, sepal width, petal length, and petal width. As with the local outlier factor model, applying One-Class SVM lets us identify potential anomalies among the iris flowers and gain insight into any unusual observations.
To implement One-Class SVM, we will use scikit-learn's svm (support vector machine) module, which provides the OneClassSVM class, to analyze the iris dataset. Since One-Class SVM is an unsupervised algorithm, we won't use the class labels (flower species) during the analysis.
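As a minimal sketch of that setup, the snippet below fits a One-Class SVM on the four iris measurements and counts how many flowers it flags. The hyperparameter values (an RBF kernel, gamma="scale", and nu=0.05) are illustrative assumptions, not settings taken from this post; nu roughly caps the fraction of training points treated as outliers.

```python
from sklearn.datasets import load_iris
from sklearn.svm import OneClassSVM

# Load only the measurements; the species labels are deliberately ignored.
X = load_iris().data

# Assumed hyperparameters for illustration: tune nu and gamma for your own data.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
model.fit(X)

# predict returns +1 for points inside the learned boundary and -1 for anomalies.
labels = model.predict(X)
n_anomalies = int((labels == -1).sum())
print(f"{n_anomalies} of {len(X)} flowers flagged as anomalous")
```

Raising nu should generally flag more flowers, and lowering it fewer, so it is worth trying a few values before reading much into any single flagged observation.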
To visualize the results of the One-Class SVM, we can create the same kind of scatter plot we used for the local outlier factor, highlighting each data point as either normal or anomalous. With two relevant features (e.g., sepal length and petal length) on the X and Y axes, we can again color-code the data points by their classification, which makes potential outliers in the dataset easy to spot.
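One way such a scatter plot might be produced is sketched below with matplotlib. The model is fitted on all four features and only sepal length and petal length are plotted. The hyperparameters, the blue "x" marker for anomalies, and the output file name one_class_svm_iris.png are assumptions made for this example.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.svm import OneClassSVM

iris = load_iris()
X = iris.data

# Assumed hyperparameters for illustration; fit_predict returns +1 or -1 per row.
labels = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit_predict(X)
normal = labels == 1

# Column 0 is sepal length and column 2 is petal length.
fig, ax = plt.subplots()
ax.scatter(X[normal, 0], X[normal, 2], c="red", label="normal")
ax.scatter(X[~normal, 0], X[~normal, 2], c="blue", marker="x", label="anomalous")
ax.set_xlabel(iris.feature_names[0])
ax.set_ylabel(iris.feature_names[2])
ax.set_title("One-Class SVM on the iris dataset")
ax.legend()

# Hypothetical output file name; call plt.show() instead to view interactively.
fig.savefig("one_class_svm_iris.png")
```

Because the boundary is learned in four dimensions, a point can look unremarkable in this two-feature view and still be flagged on account of its sepal or petal width.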
One-Class SVM also assigns each data point a decision score. Positive scores reflect the confidence that a data point belongs to the normal class, while negative scores mark points that fall outside the learned boundary and are treated as anomalous. In the plot above, you’ll see the same red color-coded data points classified as ‘normal’ as in the local outlier factor exercise. Higher positive scores indicate that the data point lies deeper inside the boundary, which typically means it resembles the majority of the training data.
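To make the scores concrete, here is a short sketch that uses scikit-learn's decision_function to retrieve the signed score for each flower and lists the lowest-scoring rows. As before, the hyperparameters are assumed values chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import OneClassSVM

X = load_iris().data

# Assumed hyperparameters for illustration.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(X)

# Signed score per flower: positive means inside the boundary, negative outside.
scores = model.decision_function(X)
labels = model.predict(X)

# Sort ascending so the most anomalous flowers come first.
order = np.argsort(scores)
for i in order[:5]:
    print(f"row {i}: score={scores[i]:.4f}, label={labels[i]}")
```

Ranking by score rather than relying only on the +1/-1 label makes it easier to separate borderline flowers from clear outliers.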
CodeChat
If you have any ideas that you would like covered on the next chat, please leave me a message on the SubStacker’s Message Board.
If you have any general thoughts or suggestions, please let me know as well!