gnat_sample

Synopsis

Sample flow records for the Histogram-based Outlier Score (HBOS) algorithm used by gnat_model.

Description

gnat_sample is primarily used to sample flow records for the Histogram-based Outlier Score (HBOS) algorithm. It randomly selects a subset of flow records from the original dataset, reducing the size of the dataset while maintaining its statistical properties. This is particularly useful for large datasets, where processing every record may be computationally expensive or time-consuming.

Options

Options are specified using the --options argument and are separated by semicolons.
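As an illustration of the semicolon-separated format, the sketch below splits an options string into one key=value pair per line. The split_options helper is hypothetical and is not part of gnat_sample; it only shows how such a string breaks down. The string is quoted because an unquoted semicolon is treated by the shell as a command separator.

```shell
#!/bin/sh
# Hypothetical helper (not part of gnat_sample): split a
# semicolon-separated options string into one key=value pair per line.
split_options() {
    printf '%s\n' "$1" | tr ';' '\n'
}

# Quote the string so the shell does not interpret ';' itself.
split_options "retention=7;percent=50"
# prints:
#   retention=7
#   percent=50
```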

--options retention=[int]

The --options retention argument specifies how many days of data to retain in the sampled dataset, applied as a sliding window.

--options percent=[int]

The --options percent argument specifies the percentage of the input data to sample.
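To make the idea of percentage-based random sampling concrete, here is a minimal sketch that keeps each line of a text stream with a probability of percent/100. It is an assumption-laden illustration only: the sample_lines function and its seed parameter are hypothetical, plain text lines stand in for flow records, and gnat_sample may select records in a different way.

```shell
#!/bin/sh
# Hypothetical sketch (not gnat_sample's implementation): keep each
# input line with probability PERCENT/100.
#   $1 = percent to keep (0-100)
#   $2 = optional random seed (defaults to 1) for repeatable runs
sample_lines() {
    percent=$1
    seed=${2:-1}
    # rand() returns a value in [0, 1), so a line is kept when
    # rand() * 100 falls below the requested percentage.
    awk -v p="$percent" -v seed="$seed" \
        'BEGIN { srand(seed) } rand() * 100 < p'
}

# Keep roughly half of the input lines.
printf 'rec1\nrec2\nrec3\nrec4\n' | sample_lines 50
```

Because each line is chosen independently, the number of lines kept is only approximately the requested percentage, and the approximation improves as the input grows.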

Examples

$ gnat_sample --input /var/spool/input --output /var/spool/output --interval hour --options "retention=7;percent=50"

See Also