Schema

The tables below list the IPFIX information elements generated by gnat_sensor and the corresponding Apache Parquet schema. Fields are grouped by purpose: basic flow information, protocol and network details, TCP-specific fields, VLAN information, traffic statistics, entropy and timing analysis, packet size analysis, flow classification, hardware addresses, geographical and ASN information, Histogram-Based Outlier Score (HBOS) results, deep packet inspection (nDPI), and flow processing. Together these fields support analysis of network traffic patterns, anomalies, and security risks.

Field Definitions

Basic Flow Information


Field | Type | Description
version | UINTEGER | Schema version number for the flow record
id | UUID | Unique identifier for the flow record
observe | VARCHAR | Observer or sensor identifier that captured the flow
stime | TIMESTAMP | Start time of the flow
etime | TIMESTAMP | End time of the flow
dur | UINTEGER | Duration of the flow in microseconds
rtt | UINTEGER | Round-trip time in microseconds
pcr | INTEGER | Packet capture rate or sampling rate
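
To illustrate how the timing fields relate, the sketch below derives a duration in microseconds from a start and end timestamp. It assumes that dur equals etime minus stime, which this page does not state explicitly, and flow_duration_us is a hypothetical helper name rather than part of gnat_sensor.

```python
from datetime import datetime, timezone


def flow_duration_us(stime: datetime, etime: datetime) -> int:
    """Return etime - stime in whole microseconds (the assumed meaning of dur)."""
    if etime < stime:
        raise ValueError("etime precedes stime")
    delta = etime - stime
    # Integer arithmetic avoids float rounding on long flows.
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


stime = datetime(2024, 1, 1, 12, 0, 0, 0, tzinfo=timezone.utc)
etime = datetime(2024, 1, 1, 12, 0, 1, 500_000, tzinfo=timezone.utc)
print(flow_duration_us(stime, etime))  # 1500000
```

A UINTEGER tops out at 4,294,967,295, so a microsecond count stored in that type can represent at most roughly 71.6 minutes. How longer flows are handled is not covered on this page.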

Protocol and Network Information


Field | Type | Description
proto | VARCHAR | Protocol type (TCP, UDP, ICMP, etc.)
saddr | VARCHAR | Source IP address
daddr | VARCHAR | Destination IP address
sport | USMALLINT | Source port number
dport | USMALLINT | Destination port number

TCP-Specific Fields


Field | Type | Description
iflags | VARCHAR | Initial TCP flags observed in the flow
uflags | VARCHAR | Union of all TCP flags seen during the flow
stcpseq | UINTEGER | Source TCP sequence number
dtcpseq | UINTEGER | Destination TCP sequence number
stcpurg | UINTEGER | Source TCP urgent pointer count
dtcpurg | UINTEGER | Destination TCP urgent pointer count
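
The difference between "initial" and "union" flags can be sketched as follows: the initial value is the flag byte of the first packet, and the union is the bitwise OR of the flag bytes of every packet. The bit values are the standard TCP header flags. How gnat_sensor renders these values into the VARCHAR columns is not documented here, so summarize_flags and flag_names are hypothetical helpers that work on raw flag bytes only.

```python
# Standard TCP header flag bits (RFC 793, with ECE/CWR from RFC 3168).
TCP_FLAGS = {
    "FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08,
    "ACK": 0x10, "URG": 0x20, "ECE": 0x40, "CWR": 0x80,
}


def summarize_flags(per_packet_flags: list[int]) -> tuple[int, int]:
    """Return (initial, union) for one direction of a flow."""
    if not per_packet_flags:
        return 0, 0
    initial = per_packet_flags[0]
    union = 0
    for flags in per_packet_flags:
        union |= flags  # accumulate every flag bit seen so far
    return initial, union


def flag_names(bits: int) -> list[str]:
    """List the names of the flag bits set in `bits`."""
    return [name for name, mask in TCP_FLAGS.items() if bits & mask]


# SYN, ACK, PSH+ACK, FIN+ACK
packets = [0x02, 0x10, 0x18, 0x11]
initial, union = summarize_flags(packets)
print(flag_names(initial))  # ['SYN']
print(flag_names(union))    # ['FIN', 'SYN', 'PSH', 'ACK']
```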

VLAN Information


Field | Type | Description
svlan | USMALLINT | Source VLAN ID
dvlan | USMALLINT | Destination VLAN ID

Traffic Statistics


Field | Type | Description
spkts | UBIGINT | Number of packets from source to destination
dpkts | UBIGINT | Number of packets from destination to source
sbytes | UBIGINT | Number of bytes from source to destination
dbytes | UBIGINT | Number of bytes from destination to source
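
The four counters above are enough to derive simple per-direction metrics. The following is a minimal sketch, assuming a flow record has already been loaded into a plain dict keyed by field name; directional_summary and its output keys are hypothetical names. Whether the byte counters include packet headers is not stated on this page, so the averages should be read accordingly.

```python
def directional_summary(flow: dict) -> dict:
    """Derive average packet size per direction and the source's share of bytes."""
    total_bytes = flow["sbytes"] + flow["dbytes"]
    return {
        # Guard every division: a one-way flow has zero packets in one direction.
        "s_avg_pkt_bytes": flow["sbytes"] / flow["spkts"] if flow["spkts"] else 0.0,
        "d_avg_pkt_bytes": flow["dbytes"] / flow["dpkts"] if flow["dpkts"] else 0.0,
        "s_byte_share": flow["sbytes"] / total_bytes if total_bytes else 0.0,
    }


flow = {"spkts": 10, "dpkts": 20, "sbytes": 1_000, "dbytes": 30_000}
summary = directional_summary(flow)
print(summary["s_avg_pkt_bytes"], summary["d_avg_pkt_bytes"])  # 100.0 1500.0
```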

Entropy and Timing Analysis


Field | Type | Description
sentropy | UTINYINT | Source payload entropy (randomness measure)
dentropy | UTINYINT | Destination payload entropy (randomness measure)
siat | UBIGINT | Source inter-arrival time statistics
diat | UBIGINT | Destination inter-arrival time statistics
sstdev | UBIGINT | Source standard deviation of inter-arrival times
dstdev | UBIGINT | Destination standard deviation of inter-arrival times
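
For readers unfamiliar with payload entropy, the sketch below computes Shannon entropy in bits per byte, which ranges from 0 (a single repeated byte) to 8 (all byte values equally likely). The linear scaling into the 0-255 range of a UTINYINT is an assumption made for illustration; this page does not say how gnat_sensor encodes sentropy and dentropy.

```python
import math
from collections import Counter


def shannon_entropy_bits(payload: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not payload:
        return 0.0
    n = len(payload)
    counts = Counter(payload)
    # Sum of p * log2(1/p) over the byte values that actually occur.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def scale_to_u8(entropy_bits: float) -> int:
    """ASSUMPTION: map 0..8 bits linearly onto 0..255 to fit a UTINYINT."""
    return min(255, max(0, round(entropy_bits * 255 / 8)))


print(shannon_entropy_bits(b"aaaa"))             # 0.0
print(shannon_entropy_bits(bytes(range(256))))   # 8.0
print(scale_to_u8(shannon_entropy_bits(b"abab")))  # 32
```

High values typically indicate compressed or encrypted payloads, while low values suggest repetitive or plain-text content.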

Packet Size Analysis


Field | Type | Description
ssmallpktcnt | UINTEGER | Count of small packets from source
dsmallpktcnt | UINTEGER | Count of small packets from destination
slargepktcnt | UINTEGER | Count of large packets from source
dlargepktcnt | UINTEGER | Count of large packets from destination
snonemptypktcnt | UINTEGER | Count of non-empty packets from source
dnonemptypktcnt | UINTEGER | Count of non-empty packets from destination
sfirstnonemptycnt | USMALLINT | First non-empty packet count from source
dfirstnonemptycnt | USMALLINT | First non-empty packet count from destination
smaxpktsize | USMALLINT | Maximum packet size from source
dmaxpktsize | USMALLINT | Maximum packet size from destination
sstdevpayload | USMALLINT | Standard deviation of payload sizes from source
dstdevpayload | USMALLINT | Standard deviation of payload sizes from destination

Flow Classification


Field | Type | Description
spd | VARCHAR | Speed or rate classification
reason | VARCHAR | Reason for flow termination or classification
orient | VARCHAR | Flow orientation or direction classification
tag | VARCHAR[] | Array of tags or labels associated with the flow

Hardware Addresses


Field | Type | Description
smac | VARCHAR | Source MAC address
dmac | VARCHAR | Destination MAC address

Geographical and ASN Information

Field | Type | Description
scountry | VARCHAR | Source IP country code
dcountry | VARCHAR | Destination IP country code
sasn | UINTEGER | Source Autonomous System Number
dasn | UINTEGER | Destination Autonomous System Number
sasnorg | VARCHAR | Source ASN organization name
dasnorg | VARCHAR | Destination ASN organization name

Histogram-Based Outlier Score (HBOS)


Field | Type | Description
hbos_score | DOUBLE | Histogram-Based Outlier Score for anomaly detection
hbos_severity | UTINYINT | Severity level based on HBOS analysis (0-255)
hbos_map | MAP(VARCHAR, FLOAT) | Detailed HBOS feature scores as key-value pairs
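
As background on how a histogram-based outlier score is generally computed, here is a minimal sketch of the textbook HBOS idea: build one histogram per feature, normalize each so its tallest bin has height 1, and sum log(1 / height) over the features of a record. The per-feature terms play the role of hbos_map and their sum the role of hbos_score. The equal-width bins, natural logarithm, floor for empty bins, and choice of features are all assumptions; this page does not describe the sensor's actual parameters.

```python
import math


def build_histogram(values, bins=10):
    """Equal-width histogram normalized so the tallest bin has height 1.0."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when all values are equal
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    peak = max(counts)
    return lo, width, [c / peak for c in counts]


def hbos(record, histograms, floor=1e-3):
    """Return (total score, per-feature scores) for one record."""
    per_feature = {}
    for name, (lo, width, heights) in histograms.items():
        idx = int((record[name] - lo) / width)
        idx = min(max(idx, 0), len(heights) - 1)  # clamp out-of-range values
        height = max(heights[idx], floor)         # avoid log(1/0) on empty bins
        per_feature[name] = math.log(1.0 / height)
    return sum(per_feature.values()), per_feature


# Hypothetical training data: mostly small flows plus one very large one.
training = {
    "sbytes": [100, 110, 120, 105, 115, 100, 110, 5000],
    "spkts": [1, 2, 2, 1, 2, 1, 2, 40],
}
histograms = {name: build_histogram(vals, bins=5) for name, vals in training.items()}

normal_score, _ = hbos({"sbytes": 105, "spkts": 2}, histograms)
outlier_score, outlier_map = hbos({"sbytes": 5000, "spkts": 40}, histograms)
print(normal_score < outlier_score)  # True
```

Records that fall in sparsely populated bins accumulate larger terms, so a higher total indicates a more unusual flow.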

Deep Packet Inspection (nDPI)


Field | Type | Description
ndpi_appid | VARCHAR | Application ID identified by nDPI
ndpi_category | VARCHAR | Application category from nDPI classification
ndpi_risk_bits | UBIGINT | Bit field representing various risk factors
ndpi_risk_score | UINTEGER | Numerical risk score calculated by nDPI
ndpi_risk_severity | UTINYINT | Risk severity level (0-255)
ndpi_risk_list | VARCHAR[] | Array of specific risk descriptions
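
Because ndpi_risk_bits is a bit field, individual risk factors can be recovered by testing each bit. The sketch below lists the positions of the set bits in a 64-bit value. It assumes that each bit position corresponds to one nDPI risk identifier; the mapping from position to risk name is defined by nDPI and is not reproduced here. set_bit_positions is a hypothetical helper name.

```python
def set_bit_positions(risk_bits: int) -> list[int]:
    """Return the positions (0 = least significant) of all set bits in a UBIGINT."""
    if not 0 <= risk_bits < 2**64:
        raise ValueError("risk_bits must fit in an unsigned 64-bit integer")
    return [i for i in range(64) if (risk_bits >> i) & 1]


print(set_bit_positions(0b100110))  # [1, 2, 5]
print(set_bit_positions(0))         # []
```

In practice the human-readable equivalents are already available in ndpi_risk_list, so decoding the bit field is mainly useful for filtering on a specific risk.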

Flow Processing


Field | Type | Description
trigger | TINYINT | Trigger condition or event that caused flow processing

Data Types Reference


  • UINTEGER: Unsigned 32-bit integer
  • UBIGINT: Unsigned 64-bit integer
  • USMALLINT: Unsigned 16-bit integer
  • UTINYINT: Unsigned 8-bit integer
  • TINYINT: Signed 8-bit integer
  • INTEGER: Signed 32-bit integer
  • DOUBLE: Double-precision floating-point
  • FLOAT: Single-precision floating-point
  • VARCHAR: Variable-length character string
  • VARCHAR[]: Array of variable-length character strings
  • UUID: Universally Unique Identifier
  • TIMESTAMP: Date and time value
  • MAP(VARCHAR, FLOAT): Key-value mapping with string keys and float values
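
The integer widths listed above translate directly into value ranges, which can be handy when validating records before writing them. The following sketch encodes those ranges and checks whether a value fits a given type; INTEGER_RANGES and fits are hypothetical names introduced for illustration.

```python
# Inclusive (min, max) for each integer type in the reference list above.
INTEGER_RANGES = {
    "UTINYINT": (0, 2**8 - 1),
    "USMALLINT": (0, 2**16 - 1),
    "UINTEGER": (0, 2**32 - 1),
    "UBIGINT": (0, 2**64 - 1),
    "TINYINT": (-(2**7), 2**7 - 1),
    "INTEGER": (-(2**31), 2**31 - 1),
}


def fits(type_name: str, value: int) -> bool:
    """Return True if `value` is representable in the named integer type."""
    lo, hi = INTEGER_RANGES[type_name]
    return lo <= value <= hi


print(fits("USMALLINT", 65_535))  # True  (largest valid port number)
print(fits("USMALLINT", 65_536))  # False
print(fits("TINYINT", -128))      # True
```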