gnat_hbos
Synopsis
Histogram-based Outlier Score (HBOS) is a fast and efficient machine learning algorithm for detecting outliers in flow records.
Description
HBOS (Histogram-Based Outlier Score) is an efficient machine learning algorithm designed for anomaly detection in network flow records. The algorithm, developed by Markus Goldstein and Andreas Dengel at the German Research Center for Artificial Intelligence (DFKI), leverages statistical histograms to model the distribution patterns of network traffic features.
The algorithm operates by constructing histograms for individual features within flow records during a training phase to establish baseline behavior. Each histogram captures the frequency distribution of feature values, creating a statistical profile of normal network activity. When evaluating new flow records, HBOS calculates outlier scores by measuring how well each record's features align with the established histogram distributions. Records with feature values that fall into low-frequency histogram bins receive higher outlier scores, indicating potential anomalous behavior.
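The training and scoring steps described above can be sketched in a few lines of Python. This is a minimal illustration of the general HBOS idea, not the gnat_hbos implementation: the function names (fit_histograms, hbos_score), the equal-width binning, the bin count, and the floor value for empty or out-of-range bins are all assumptions made for the example.

```python
import math

def fit_histograms(records, n_bins=10):
    """Build one equal-width histogram per feature (training phase).

    records: list of equal-length tuples of numeric feature values.
    Bin heights are normalized so the fullest bin has height 1.0.
    """
    model = []
    for j in range(len(records[0])):
        values = [r[j] for r in records]
        lo, hi = min(values), max(values)
        width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant features
        counts = [0] * n_bins
        for v in values:
            idx = min(math.floor((v - lo) / width), n_bins - 1)
            counts[idx] += 1
        peak = max(counts)
        model.append({
            "lo": lo,
            "hi": hi,
            "width": width,
            "heights": [c / peak for c in counts],
        })
    return model

def hbos_score(record, model, floor=1e-6):
    """Sum log(1 / height) over features; rare bins give higher scores."""
    score = 0.0
    for v, hist in zip(record, model):
        if v < hist["lo"] or v > hist["hi"]:
            height = floor  # value never seen in training range
        else:
            idx = min(math.floor((v - hist["lo"]) / hist["width"]),
                      len(hist["heights"]) - 1)
            height = max(hist["heights"][idx], floor)  # guard empty bins
        score += math.log(1.0 / height)
    return score

# Hypothetical training data: (bytes, packets) per flow record.
training = [(100.0, 10.0), (110.0, 11.0), (105.0, 10.0),
            (95.0, 9.0), (102.0, 10.0), (98.0, 10.0)]
model = fit_histograms(training)
typical = hbos_score((100.0, 10.0), model)
unusual = hbos_score((10000.0, 500.0), model)
```

In this sketch a record whose values fall in well-populated bins scores close to zero, while a record outside the trained range accumulates the floor penalty for every feature. Because each feature is scored independently, the cost of evaluating a record grows linearly with the number of features.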
This histogram-based approach is fast, simple, and effective, and it is particularly suitable for real-time monitoring of large-scale network traffic, where processing speed is critical.
Options
Options are specified using the --options argument and are separated by semicolons.
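As an illustration of the semicolon-separated form, the following Python sketch splits an option string into key/value pairs. It is only a sketch of the convention described above: parse_options is a hypothetical helper, the combined string "model=/path/to/model;output=json" is an assumed example, and the tool's actual parser may behave differently.

```python
def parse_options(option_string):
    """Split 'key=value;key=value' into a dict (hypothetical helper)."""
    options = {}
    for item in option_string.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing semicolon
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        options[key.strip()] = value.strip()
    return options

parsed = parse_options("model=/path/to/model;output=json")
```

When several options are passed on one command line in a POSIX shell, the whole value needs to be quoted, since an unquoted semicolon ends the command.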
--options model=[/path/to/model]
The --options model argument specifies the model file to be used for evaluating flow records.
The model file is a JSON file that contains the histograms for each feature in the flow records.
The model file is generated by the gnat_model command line interface.
--options sample=[json|csv]
The --options sample argument specifies the sample file to be used for evaluating flow records.
The sample file is a JSON file that contains the sampled flow records used to generate the model.
The sample file is generated by the gnat_sample command line interface.
--options output=[json|csv]
The --options output argument specifies the format, json or csv, of the output produced when evaluating flow records.
Examples
$ gnat_model --input /var/spool/input --output /var/spool/output --interval day --options model=/path/to/model