

This article describes how to use the PCA-Based Anomaly Detection component in Azure Machine Learning designer to create an anomaly detection model based on principal component analysis (PCA).

This component helps you build a model in scenarios where it's easy to get training data from one class, such as valid transactions, but difficult to get sufficient samples of the targeted anomalies.

For example, to detect fraudulent transactions, you often don't have enough examples of fraud to train on. But you might have many examples of good transactions. The PCA-Based Anomaly Detection component solves the problem by analyzing available features to determine what constitutes a "normal" class. The component then applies distance metrics to identify cases that represent anomalies. This approach lets you train a model by using existing imbalanced data.

PCA is an established technique in machine learning. It's frequently used in exploratory data analysis because it reveals the inner structure of the data and explains the variance in the data.

PCA works by analyzing data that contains multiple variables. It looks for correlations among the variables and determines the combination of values that best captures differences in outcomes. These combined feature values are used to create a more compact feature space called the principal components.

For anomaly detection, each new input is analyzed. The anomaly detection algorithm computes its projection on the eigenvectors, together with a normalized reconstruction error. The normalized error is used as the anomaly score. The higher the error, the more anomalous the instance is.

For more information about how PCA works, and about the implementation for anomaly detection, see these papers:

A randomized algorithm for principal component analysis, by Rokhlin, Szlam, and Tygert

Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, by Halko, Martinsson, and Tropp

How to configure PCA-Based Anomaly Detection

Add the PCA-Based Anomaly Detection component to your pipeline in the designer. You can find this component in the Anomaly Detection category.

In the right panel of the component, select the Training mode option.
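
To make the scoring idea in this article concrete, the following is a minimal sketch of PCA-based anomaly scoring: fit principal components on "normal" data, project each new input onto them, and use the reconstruction error as the anomaly score. It is not the designer component's implementation. The function names (`fit_pca`, `reconstruction_error`) are hypothetical, plain power iteration stands in for the randomized algorithms in the papers cited in this article, and dividing by the largest training error is only one assumed way to normalize the error.

```python
import math
import random


def fit_pca(rows, n_components, n_iter=200, seed=0):
    """Return (mean, components); components are unit eigenvectors of the
    sample covariance matrix, found by power iteration (an illustrative
    choice, not the component's actual algorithm)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(n_iter):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            # Keep the new direction orthogonal to components already found.
            for u in components:
                dot = sum(a * b for a, b in zip(w, u))
                w = [a - dot * b for a, b in zip(w, u)]
            norm = math.sqrt(sum(a * a for a in w))
            if norm == 0.0:
                return mean, components  # no variance left to explain
            v = [a / norm for a in w]
        components.append(v)
    return mean, components


def reconstruction_error(row, mean, components):
    """Squared distance between a centered row and its projection onto
    the principal components."""
    x = [a - m for a, m in zip(row, mean)]
    recon = [0.0] * len(x)
    for u in components:
        score = sum(a * b for a, b in zip(x, u))
        recon = [r + score * b for r, b in zip(recon, u)]
    return sum((a - r) ** 2 for a, r in zip(x, recon))


# "Normal" training data: points close to the line y = 2x.
train = [(float(x), 2.0 * x + (0.1 if x % 2 == 0 else -0.1)) for x in range(20)]
mean, components = fit_pca(train, n_components=1)

# Assumed normalization: divide by the largest error seen on training data.
max_train_error = max(reconstruction_error(r, mean, components) for r in train)

new_points = [(7.0, 14.05), (5.0, -10.0)]  # first follows the pattern, second does not
scores = [reconstruction_error(p, mean, components) / max_train_error
          for p in new_points]
```

In this sketch, a point that follows the pattern of the training data gets a score below 1, and a point far from the learned direction gets a much larger score, which matches the rule that a higher error means a more anomalous instance.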
