gnat_sample
Synopsis
Sample flow records for the Histogram-based Outlier Score (HBOS) algorithm used by gnat_model.
Description
The gnat_sample tool is primarily used to sample flow records for the Histogram-based Outlier Score (HBOS) algorithm.
It randomly selects a subset of flow records from the original dataset, reducing the size of the dataset while approximately preserving its statistical properties.
This is particularly useful for large datasets where processing the entire dataset may be computationally expensive or time-consuming.
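The idea of percent-based random sampling can be sketched in Python. This is a minimal illustration, not the gnat_sample implementation: the function name sample_records, the list-based record representation, and the choice of an independent keep/drop decision per record are all assumptions made for the example.

```python
import random

def sample_records(records, percent, seed=None):
    """Keep each record independently with probability percent/100.

    Hypothetical helper for illustration only; not part of gnat_sample.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    rng = random.Random(seed)  # seeded generator so runs can be reproduced
    threshold = percent / 100.0
    # rng.random() returns a float in [0.0, 1.0), so 0 keeps nothing
    # and 100 keeps everything.
    return [r for r in records if rng.random() < threshold]

records = list(range(10_000))
subset = sample_records(records, percent=50, seed=1)
print(len(subset))  # roughly half of the input
```

Because each record is kept or dropped independently, the output size is only approximately the requested percentage, and the original record order is preserved.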
Options
Options are specified using the --options argument and are separated by semicolons.
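As a rough illustration of the semicolon-separated format, the following Python sketch splits an options string into key/value pairs. The function name parse_options and its error handling are hypothetical; how gnat_sample itself parses the string is not described here.

```python
def parse_options(option_string):
    """Split 'key=value;key=value' into a dict of strings.

    Hypothetical helper for illustration only; not part of gnat_sample.
    """
    options = {}
    for item in option_string.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing semicolon
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        options[key.strip()] = value.strip()
    return options

opts = parse_options("retention=7;percent=50")
print(opts)  # {'retention': '7', 'percent': '50'}
```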
--options retention=[int]
The --options retention argument specifies the number of days to retain in the sampled dataset on a sliding-window basis.
--options percent=[int]
The --options percent argument specifies the percentage of input data to sample.
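A sliding retention window can be pictured with the short Python sketch below. It is an assumption-laden illustration rather than the tool's actual logic: the function name apply_retention, the (timestamp, payload) record shape, and the inclusive cutoff are all hypothetical.

```python
from datetime import datetime, timedelta

def apply_retention(records, retention_days, now):
    """Keep (timestamp, payload) records no older than retention_days.

    Hypothetical helper for illustration only; not part of gnat_sample.
    """
    cutoff = now - timedelta(days=retention_days)
    # Assumption: a record exactly at the cutoff is kept.
    return [(ts, payload) for ts, payload in records if ts >= cutoff]

now = datetime(2024, 1, 10)
records = [
    (datetime(2024, 1, 1), "old flow"),     # 9 days old, outside a 7-day window
    (datetime(2024, 1, 5), "recent flow"),  # 5 days old, kept
    (datetime(2024, 1, 10), "new flow"),    # current, kept
]
kept = apply_retention(records, retention_days=7, now=now)
print([payload for _, payload in kept])  # ['recent flow', 'new flow']
```

As the reference time advances, the cutoff advances with it, so older records fall out of the window on each run.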
Examples
$ gnat_sample --input /var/spool/input --output /var/spool/output --interval hour --options "retention=7;percent=50"