
gnat_hbos

Synopsis

Histogram-based Outlier Score (HBOS) is a fast and efficient machine learning algorithm for detecting outliers in flow records.

Description

HBOS (Histogram-Based Outlier Score) is an efficient machine learning algorithm designed for anomaly detection in network flow records. The algorithm, developed by Markus Goldstein and Andreas Dengel at the German Research Center for Artificial Intelligence (DFKI), leverages statistical histograms to model the distribution patterns of network traffic features.

The algorithm operates by constructing histograms for individual features within flow records during a training phase to establish baseline behavior. Each histogram captures the frequency distribution of feature values, creating a statistical profile of normal network activity. When evaluating new flow records, HBOS calculates outlier scores by measuring how well each record's features align with the established histogram distributions. Records with feature values that fall into low-frequency histogram bins receive higher outlier scores, indicating potential anomalous behavior.
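The training and scoring steps described above can be illustrated with a minimal Python sketch. It is not the gnat_hbos implementation: the function names, the equal-width binning, the default of 10 bins, and the floor applied to empty or out-of-range bins are all assumptions made for illustration. The scoring rule (summing the log of the inverse bin height over features, with each histogram normalized so its tallest bin has height 1.0) follows the general HBOS formulation.

```python
import math


def bin_index(value, lo, hi, width, n_bins):
    """Return the bin a value falls into, or None if it lies outside [lo, hi]."""
    if value < lo or value > hi:
        return None
    # Clamp so that value == hi lands in the last bin.
    return min(int((value - lo) / width), n_bins - 1)


def fit_histograms(records, n_bins=10):
    """Build one equal-width histogram per feature from training records."""
    model = []
    for j in range(len(records[0])):
        values = [r[j] for r in records]
        lo, hi = min(values), max(values)
        width = (hi - lo) / n_bins or 1.0  # constant feature: avoid zero width
        counts = [0] * n_bins
        for v in values:
            counts[bin_index(v, lo, hi, width, n_bins)] += 1
        peak = max(counts)
        # Normalize so the most frequent bin has height 1.0.
        heights = [c / peak for c in counts]
        model.append({"lo": lo, "hi": hi, "width": width, "heights": heights})
    return model


def hbos_score(record, model, floor=1e-3):
    """Sum log(1 / bin height) over features; higher means more unusual."""
    score = 0.0
    for value, hist in zip(record, model):
        n_bins = len(hist["heights"])
        idx = bin_index(value, hist["lo"], hist["hi"], hist["width"], n_bins)
        # Empty bins and values outside the training range get the floor
        # height (an assumption of this sketch) so the logarithm stays finite.
        height = floor if idx is None else max(hist["heights"][idx], floor)
        score += math.log(1.0 / height)
    return score
```

For example, after `model = fit_histograms(training_records)`, a record whose features all fall into well-populated bins scores near zero, while a record with a value far outside the training range scores noticeably higher. Because each feature is scored independently, the cost per record grows linearly with the number of features.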

This histogram-based approach is fast, simple, and effective, and it is particularly well suited to real-time monitoring of large-scale network traffic, where processing speed is critical.

Options

Options are specified using the --options argument and are separated by semicolons.
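As an illustration of the semicolon-separated format, the following Python sketch splits an options string into key/value pairs. The function name parse_options and its error handling are hypothetical; this only shows the shape of the format and is not how the tool itself parses its arguments.

```python
def parse_options(spec):
    """Split a 'key=value;key=value' string into a dict (illustrative only)."""
    options = {}
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing or doubled semicolon
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        options[key.strip()] = value.strip()
    return options
```

Under these assumptions, parse_options("model=/path/to/model;output=json") yields a dictionary with the keys model and output.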

--options model=[/path/to/model]

The --options model argument specifies the model file to be used for evaluating flow records. The model file is a JSON file that contains the histograms for each feature in the flow records. The model file is generated by the gnat_model command line interface.

--options sample=[json|csv]

The --options sample argument specifies the sample file to be used for evaluating flow records. The sample file is a JSON file that contains the sampled flow records used to generate the model. The sample file is generated by the gnat_sample command line interface.

--options output=[json|csv]

The --options output argument specifies the output file to be used for evaluating flow records.

Examples

$ gnat_model --input /var/spool/input --output /var/spool/output --interval day --options model=/path/to/model

See Also