dstore-dist

Root Node Config

A minimal dstore-dist configuration (missing the details of the destination and route configuration) could be:

listen:
  - "127.0.0.1:2000"
  - "[::1]:2000"
tls_listen:
  - addr: "127.0.0.1:2001"
    tlsconfig:
      cert_file: /etc/certs/cert.pem
      key_file: /etc/certs/cert.key
log:
  level: Warn
ipsets:
  ipset1:
    file: /etc/ipset1.txt
    poll_interval: 10s
destinations:
  mydestination1:
    <destination configuration>
  mydestination2:
    <destination configuration>
routes:
  myroute1:
    <route configuration>
  myroute2:
    <route configuration>

The following YAML key-values are supported for configuration at the root node:

Parameter	Type	Default	Description
`batch_buffer_size`	`int`	1048576	The size in bytes of the buffer(s) used to batch messages internally. Normally the default of 1048576 (1MB) will suffice, but when using Kafka, making this value around 90% of the value of the kafka `max_msg_size` is a good idea, to avoid fragmentation of the messages when sending to Kafka.
`conn_timeout`	`go:`DurationString		The idle timeout for incoming connections to dstore-dist. Defaults to no timeout, i.e. it only applies if explicitly configured.
`debug`	`boolean`	false	Enable/disable very verbose debug logging
`destinations`	Map of Destination		Each map key is the name of a destination, and the value is a Destination
`history_num_batches`	`int`		Number of batches to keep in memory for the history API. A batch is capped at 1 MB by default, see `batch_buffer_size` for the actual size. If this is unset or set to 0, the history API is disabled. The history API can be accessed at `/api/history`. By default this URL will print additional usage information, including how to switch to Protobuf or JSON output format, and how to filter on certain fields.
`http_address`	`string`		Listen address for HTTP webserver for Prometheus metrics and status page. Value is an address:port string, using the same format as for `listen` addresses.
`http_api_key`	`string`		If set, an `X-API-Key` header matching this key is required to access the optional HTTP history API.
`ipsets`	Map of IP Set		Each map key is the name of an IP set, and the value is an IPSet
`listen`	List of `string`		The addresses that `dstore-dist` will listen on for new protobuf messages. The value is a list of address:port strings, in either v4 or v6 format. IPv6 addresses must be placed in square brackets like this `[::1]`. You can omit the address to listen on all local addresses.
`log`	Log Config
`routes`	Map of Route		Each map key is the name of a route, and the value is a Route
`tls_listen`	List of TLS Listen		The addresses that `dstore-dist` will listen on for new protobuf messages over TLS. The value is a list of TLS Listen items, each with an `addr` in the same address:port format as for `listen` addresses and a `tlsconfig`.
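
As a rough sketch of how some of these root-level keys fit together, the fragment below enables the HTTP server and the history API, and sizes the batch buffer for use with a Kafka destination. All values are illustrative assumptions rather than recommendations: 810000 is simply 90% of the documented `max_msg_size` default of 900000, and the port, timeout, and API key are invented for the example.

```yaml
# Illustrative values only; destinations and routes are omitted.
listen:
  - "127.0.0.1:2000"
batch_buffer_size: 810000        # 0.9 * 900000 (default kafka max_msg_size)
conn_timeout: 60s                # assumed value; when unset there is no idle timeout
history_num_batches: 10          # keep 10 batches in memory for /api/history
http_address: "127.0.0.1:8080"   # assumed port for metrics, status and history
http_api_key: "change-me"        # placeholder; clients send it in the X-API-Key header
```

With these settings, and assuming each batch is capped at `batch_buffer_size`, the history API would hold at most about 8.1 MB of batch data (10 × 810000 bytes).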

Log Config

Log Configuration is as follows:

Parameter	Type	Default	Description
`level`	`string`		Defaults to `Info`. Can also be set to `Warn`, `Error`, `Debug` or `Trace`. For debug level to take effect, the `-debug` flag must also be set.
`stdout`	`boolean`	`false`	By default logging goes to `stderr`; set this to true to send it to `stdout` instead.
`text`	`boolean`	`false`	If true, enable text-based logging. The default is JSON logging.
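
A minimal sketch of a `log` section that switches to human-readable output on `stdout`, using only the keys in the table above. The chosen level is an assumption for the example.

```yaml
log:
  level: Debug   # per the table above, only takes effect if the -debug flag is also set
  stdout: true   # log to stdout instead of stderr
  text: true     # text-based logging instead of the default JSON
```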

TLS Listen

You can configure a TLS Listener as follows:

Parameter	Type	Default	Description
`addr`	`string`		The address:port to listen on
`tlsconfig`	TLS Config		Configuration of TLS parameters

TLS Config

Parameter	Type	Default	Description
`insecure_skip_verify`	`boolean`	`false`	Controls whether a client verifies the server's certificate chain and hostname.
`ca_file`	`string`		Optional CA file to use (PEM).
`ca`	`string`		Optional CA to use specified as a string in PEM format.
`add_system_ca_pool`	`boolean`	`false`	When set, adds the system CA pool if private CAs are enabled.
`cert_file`	`string`		Optional certificate file to use (PEM).
`cert`	`string`		Optional certificate to use specified as a string in PEM format.
`key_file`	`string`		Optional key file to use (PEM).
`key`	`string`		Optional key to use specified as a string in PEM format.
`require_client_cert`	`boolean`	`false`	Controls whether a client certificate is required (MTLS). `ca` must be set if this is `true`.
`watch_certs`	`boolean`	`false`	If true, enables background reloading of certificate files.
`watch_certs_poll_interval`	`go:`DurationString	`5s`	If watch_certs is true, how often to check for changes.
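
As an illustration, a TLS listener that requires client certificates and reloads its certificate files in the background might be configured as sketched below. The file paths and the 30s interval are hypothetical. The table states that `ca` must be set when `require_client_cert` is `true`; this sketch uses `ca_file` instead, on the untested assumption that it satisfies the same requirement.

```yaml
tls_listen:
  - addr: "127.0.0.1:2001"
    tlsconfig:
      cert_file: /etc/certs/cert.pem
      key_file: /etc/certs/cert.key
      ca_file: /etc/certs/ca.pem        # hypothetical path; assumed equivalent to `ca`
      require_client_cert: true         # mutual TLS
      watch_certs: true                 # reload certificate files when they change
      watch_certs_poll_interval: 30s    # assumed value; the default is 5s
```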

IPSet

Parameter	Type	Required	Default	Description
`file`	`string`	`yes`		The path to a file containing newline separated IP prefixes (v4 or v6) which can be used for filtering events with various filters.
`poll_interval`	`go:`DurationString		5s	How often to check the file for changes.

An example IPSet file is shown below:

127.0.0.1/16
128.243.0.0/16
# Comments beginning with # are allowed
fe80::1cc0:3e8c:119f:c2e1/18
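
To show how an IP set is referenced by name, the sketch below defines `ipset1` and uses it in a route filter with `from_in_ipset` (described under Matching Filters). The route and destination names are placeholders, and the destination itself is assumed to be defined elsewhere in the configuration.

```yaml
ipsets:
  ipset1:
    file: /etc/ipset1.txt
    poll_interval: 10s
routes:
  myroute:
    destinations:
      - mydestination            # placeholder; must be a configured destination
    filters:
      - from_in_ipset: ipset1    # only messages whose from IP is in ipset1
```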

Destination

The following parameters can be used to configure a destination:

Parameter	Type	Default	Description
`blackhole`	`boolean`	false	Messages to this destination will be dropped
`burst`	`integer`	0	When rate limiting, allow burst size of `n` events
`rate`	`integer`	0	Rate limit throughput to `n` messages per second (0 or empty means no rate limiting will take place)
`sample`	`integer`	0	Sample messages to one out of `n` messages
`type`	`string`	`pdns`	Type of destination. Available options: "pdns" "kafka" "storage" "websub". See below for more configuration options specific to each destination type.
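
A sketch of the generic destination parameters in use, assuming they are set directly on the destination item. The destination names, addresses, and numbers are invented for the example. One destination is rate limited, the other is sampled.

```yaml
destinations:
  limited:
    type: pdns
    addresses:
      - myhost.example.com:1234
    rate: 1000     # at most 1000 messages per second
    burst: 2000    # allow bursts of up to 2000 events while rate limiting
  sampled:
    type: pdns
    addresses:
      - otherhost.example.com:1234
    sample: 100    # forward one out of every 100 messages
```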

Destination: pdns

Additional parameters are available on dstore-dist destinations with type: pdns (or no type). These should be attributes of the destination item itself. For example:

destinations:
  mydestination:
    type: pdns
    addresses:
      - myhost.example.com:1234

Parameter	Type	Required	Default	Description
`addresses`	List of `string`	`yes`		List of addresses of downstream servers supporting the `pdns` protobuf protocol. Should be either `IP:port` or `host:port`
`connect_timeout`	`go:`DurationString		`"5s"`	How long to wait before timing out connection attempts
`distribute`	`string`		`"all"`	Distribution algorithm when multiple addresses are configured. Available options: `"all"` (the default) `"roundrobin"` `"sharded"` `"ordered"`
`framing`	`string`		`"16bit"`	How to frame protobuf messages. Available options: `"16bit"` `"32bit"` `"repeated"`
`tlsconfig`	TLS Config		`{}`	TLS configuration options
`use_tls`	`boolean`		`false`	If `true`, attempt to connect to the `addresses` using TLS
`write_timeout`	`go:`DurationString		`"5s"`	How long to wait before timing out write attempts

An explanation of the distribution algorithms is as follows:

- all: All messages are sent to all addresses.
- roundrobin: Messages are sent to addresses in a round-robin fashion.
- sharded: Messages are sent to addresses in a sharded fashion. The query name is used as a key to determine which address to send the message to, using a consistent hashing algorithm.
- ordered: Messages are sent to addresses in an ordered fashion. The first address is used if it is available, otherwise the second address is used, and so on.
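
For instance, a pdns destination with two downstream servers that should each consistently receive the messages for the same query names could select the sharded algorithm, as in this sketch. The host names and timeout values are assumptions for the example.

```yaml
destinations:
  mydestination:
    type: pdns
    addresses:
      - host1.example.com:1234
      - host2.example.com:1234
    distribute: sharded    # query name decides which address receives a message
    framing: "32bit"
    connect_timeout: 2s    # assumed value
    write_timeout: 2s      # assumed value
```

Replacing `sharded` with `ordered` would instead turn the second address into a fallback that is only used when the first is unavailable.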

Destination: kafka

Additional parameters are available on destinations with type: kafka. These should be nested attributes inside a kafka: item of the destination item itself. For example:

destinations:
  mydestination:
    type: kafka
    kafka:
      addresses:
        - kafka.endpoint.local:9092
      topic: mytopic

The nested kafka: attribute takes the following parameters:

Parameter	Type	Required	Default	Description
`addresses`	List of `string`	`yes`		List of addresses of kafka endpoints. Should be either `IP:port` or `host:port`
`async`	`boolean`		`false`	If `true`, dstore-dist writes to Kafka never block and all responses from Kafka are ignored
`balancer`	`string`		`"roundrobin"`	Balancer used to distribute Kafka messages amongst partitions. Available options: `"roundrobin"` `"leastbytes"` `"fnv-1a"` `"crc32"` `"murmur2"`
`batch_size`	`integer`		`10000`	Number of messages which will constitute a batch. dstore-dist will wait for new messages until either the batch size is reached, or the `batch_timeout` is exceeded
`batch_timeout`	`go:`DurationString		`1ms`	Timeout before an incomplete batch is written to Kafka
`compression`	`string`		`""`	Compression codec to use. Available options: `"gzip"` `"snappy"` `"lz4"` `"zstd"` No compression is performed when empty
`instance_name`	`string`			If configured, a header will be added to each Kafka message with this value
`json_encode`	`boolean`		`false`	If `true`, JSON encode the data before sending to Kafka
`max_attempts`	`integer`		`2`	Maximum number of attempts made to send a message to Kafka
`max_msg_size`	`integer`		`900000`	Maximum size of a kafka message (in bytes). Cannot be lower than 65536.
`num_workers`	`integer`		`2`	Number of concurrent workers that will process protobuf messages and send them to kafka
`read_timeout`	`go:`DurationString		`10s`	Timeout for reads from Kafka
`required_acks`	`string`		`"one"`	How many acks are required from kafka. Available options: `"one"` `"all"` `"none"`
`sasl`	SASLConfig			Optional SASL configuration
`single_msgs`	`boolean`		`false`	If `true`, each Kafka message will only contain a single protobuf message
`tlsconfig`	TLS Config		`{}`	TLS configuration options
`topic`	`string`	`yes`		Name of Kafka topic to send messages to
`use_tls`	`boolean`		`false`	If `true`, attempt to connect to the `addresses` using TLS
`write_timeout`	`go:`DurationString		`10s`	Timeout for writes to Kafka

SASL Config

The kafka nested sasl: attribute takes the following parameters:

Parameter	Type	Required	Description
`type`	`string`	`yes`	The type of SASL authentication to use, one of `plain`, `scram256` or `scram512`
`username`	`string`	`yes`	The username to use for authentication
`password`	`string`		The password to use for authentication. Will be ignored if `password_file` is provided
`password_file`	`string`		A filename to read the password from. The file must have 0400 permissions. Overrides `password` if that is also specified
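
Putting the Kafka and SASL parameters together, a destination that authenticates with SCRAM over TLS might look like the following sketch. The user name and password file path are hypothetical, and the compression and acknowledgement settings are arbitrary choices for illustration.

```yaml
destinations:
  mydestination:
    type: kafka
    kafka:
      addresses:
        - kafka.endpoint.local:9092
      topic: mytopic
      compression: zstd
      required_acks: all
      use_tls: true
      sasl:
        type: scram512
        username: dstoreuser                             # hypothetical user
        password_file: /etc/dstore-dist/kafka.password   # hypothetical path; file must have 0400 permissions
```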

Destination: storage

Additional parameters are available on destinations with type: storage. These should be nested attributes inside a storage: item of the destination item itself. For example:

destinations:
  mydestination:
    type: storage
    storage:
      type: s3
      encoding: json
      options:
        endpoint_url: https://my.s3.endpoint.local
        bucket: myBucket
        region: myRegion

The nested storage: attribute takes the following parameters:

Parameter	Type	Required	Default	Description
`encoding`	`string`		`"protobuf"`	Encoding used for the files. Available options: `"protobuf"` `"json"` `"bind"`
`flush_interval`	`go:`DurationString		`300s`	Time between consecutive flushes to storage (only if `max_size` is not reached before this interval)
`max_size`	`integer`			Maximum size in bytes of the file (before compression). Defaults to the value of dstore-dist's top-level configuration item `batch_buffer_size`. If `batch_buffer_size` is not set, the value will be `1048576`
`num_workers`	`integer`		`2`	Number of concurrent workers that will process protobuf messages and attempt to store them
`options`	Storage Options	`yes`		Configuration of the storage backend
`parse_rd`	`boolean`		`false`	If `true`, use the value of the RD flag set in the protobuf rather than assuming it is set. Only applies to `"bind"` encoding
`request_timeout`	`go:`DurationString		5s	How long to wait before giving up on sending requests to storage
`type`	`string`		`"s3"`	Type of storage backend. Available options: `"s3"` `"fs"`
`use_compression`	`boolean`		`false`	If `true`, compress files using gzip compression

Storage Options

For storage with type: s3 the following can be configured under options:

Parameter	Type	Required	Default	Description
`access_key`	`string`			S3 access key.
`bucket`	`string`	`yes`		Name of the S3 bucket
`client_timeout`	`go:`DurationString		`15m`	Specifies a time limit for requests made by this HTTP Client. The timeout includes connection time, any redirects, and reading the response body.
`dial_timeout`	`go:`DurationString		`10s`	The maximum amount of time a dial will wait for a connect to complete
`dial_keep_alive`	`go:`DurationString		`10s`	Specifies the interval between keep-alive probes for an active network connection
`endpoint_url`	`string`	`yes`		Endpoint of the S3 service to connect to
`idle_conn_timeout`	`go:`DurationString		`90s`	The maximum amount of time an idle (keep-alive) connection will remain idle before closing itself
`init_timeout`	`go:`DurationString		`20s`	The time we allow for initialisation, like credential checking and bucket creation
`max_idle_conns`	`integer`		`100`	Controls the maximum number of idle (keep-alive) connections
`region`	`string`		`"us-east-1"`	Region used when connecting to the S3 endpoint
`secret_key`	`string`			S3 secret key.
`tls`	TLS Config		`{}`	TLS configuration options
`tls_handshake_timeout`	`go:`DurationString		`10s`	Specifies the maximum amount of time to wait for a TLS handshake

For storage with type: fs the following can be configured under options:

Parameter	Type	Required	Default	Description
`root_path`	`string`	`yes`		The path to a directory in which the files will be stored
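
By analogy with the S3 example earlier in this section, a filesystem-backed storage destination could be sketched as follows. The directory path and flush interval are assumptions for the example.

```yaml
destinations:
  mydestination:
    type: storage
    storage:
      type: fs
      encoding: json
      use_compression: true    # gzip-compress the stored files
      flush_interval: 60s      # assumed value; the default is 300s
      options:
        root_path: /var/lib/dstore-dist/archive   # hypothetical directory
```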

Destination: websub

Additional parameters are available on destinations with type: websub. These should be nested attributes inside a websub: item of the destination item itself. For example:

destinations:
  mydestination:
    type: websub
    websub:
      client_id: dstoreuser
      client_secret: 12345
      token_url: https://myidp.example.com/token
      scopes:
        - foo
        - bar
      publish_url: https://websub.example.com/
      tlsconfig:
        insecure_skip_verify: true
      topic: dnsmessage
      request_timeout: 1s
      headers:
        X-Custom-Header: foobar

The nested websub: attribute takes the following parameters:

Parameter	Type	Required	Default	Description
`client_id`	`string`	`yes`		The Client ID to use for authenticating to the token endpoint
`client_secret`	`string`	`yes`		The Client Secret to use for authenticating to the token endpoint
`headers`	`map`			Map of headers, with the header name followed by the value
`max_size`	`integer`		`1048576`	Max size in bytes of the messages field in the JSON sent to the websub server
`num_workers`	`integer`		`2`	Number of concurrent workers to use - add more if performance is an issue
`publish_url`	`string`	`yes`		The URL of the websub endpoint. If the URL path does not end with `/webSub/v1/publish` then that path will be appended to the URL
`request_timeout`	`go:`DurationString		`10s`	The request timeout when sending to the websub endpoint
`scopes`	`array` of `string`			If specified, which scopes to request from the token endpoint
`tlsconfig`	TLS Config			Optional TLS configuration for the connections to the websub and token endpoints
`token_url`	`string`	`yes`		The URL of the token endpoint
`topic`	`string`	`yes`		The topic to use when sending to the websub endpoint

See dstore-dist manpage for details of the JSON that is sent in the body of the POST request to the WebSub endpoint.

Route

Parameters which can be used to configure a dstore-dist route:

destinations:
  mydestination:
    type: pdns
    addresses:
      - another.dstoredist.endpoint.local:1234
routes:
  myroute:
    destinations:
      - mydestination

Parameter	Type	Required	Default	Description
`append_tags`	List of `string`		`[]`	List of tags which will be appended to each message for this route
`destinations`	List of `string`	`yes`		List of names of destinations, these must have been configured on this dstore-dist instance
`filters`	List of Filter		`{}`	Filters to restrict which messages are sent for this route
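
A route can list several destinations and tag the messages it forwards. The sketch below assumes both destinations are defined elsewhere in the configuration; the tag value is invented for the example.

```yaml
routes:
  myroute:
    destinations:
      - mydestination1
      - mydestination2
    append_tags:
      - from-edge    # hypothetical tag appended to each message on this route
```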

Filters

By default, all events are sent to the destinations of a route. Configuring filters for a route restricts this to events that match the filters.

For example, the following filters configuration ensures that only events with the query name foo.com that were received over IPv4 are sent to the destinations listed under myroute:

destinations:
  mydestination:
    type: pdns
    addresses:
      - another.dstoredist.endpoint.local:1234
routes:
  myroute:
    destinations:
      - mydestination
    filters:
      - qname: foo.com
      - is_ipv4: true

Note that the top-level filters are joined with an implicit and filter, meaning that all filters have to match for a message to reach the specified destination(s).

Matching Filters

Matching filters are used to match events based on specific information in the query/response. The list of possible filters is listed below:

Parameter	Type	Description
`dst_port`	`integer`	Matches the destination (to) port in the message
`dst_port_gte` `dst_port_gt` `dst_port_lte` `dst_port_lt`	`integer`	These perform integer comparisons on the `dst_port` field in the message
`dst_port_not`	`integer`	This is simply the inverse of `dst_port`
`edns_version`	`integer`	Match the edns version in the message
`edns_version_gte` `edns_version_gt` `edns_version_lte` `edns_version_lt`	`integer`	These perform integer comparisons on the edns version in the message
`edns_version_not`	`integer`	This is simply the inverse of edns_version
`from_in_ipset`	`string`	Match all messages where the from IP is in a range contained in the named IP set
`has_deviceid`	`boolean`	Matches if the message has a deviceid field
`has_policy`	`boolean`	Matches if message has an appliedPolicy field in the response
`has_requestorid`	`boolean`	Matches if message has a requestorID field in the response
`has_tags`	`boolean`	Matches if message has any tags set
`has_aa_flag`	`boolean`	Matches if message has AA flag set
`has_tc_flag`	`boolean`	Matches if message has TC flag set
`has_rd_flag`	`boolean`	Matches if message has RD flag set
`has_ra_flag`	`boolean`	Matches if message has RA flag set
`has_ad_flag`	`boolean`	Matches if message has AD flag set
`has_cd_flag`	`boolean`	Matches if message has CD flag set
`has_do_flag`	`boolean`	Matches if message has DO flag set
`has_unique_domain_response`	`boolean`	Matches if message has a unique domain response flag set in any response RR. Only applied to responses.
`is_ipv4`	`boolean`	Matches if the DNS query was received over IPv4
`is_response`	`boolean`	Matches if the message is a response message (as opposed to a query message)
`is_query`	`boolean`	Matches if the message is a query message (as opposed to a response message)
`is_outgoing_query`	`boolean`	Matches if the message is an outgoing query message (i.e. a query sent by a server)
`is_incoming_response`	`boolean`	Matches if the message is an incoming response message (i.e. in response to an outgoing query)
`is_tcp`	`boolean`	Matches if the DNS query was received over TCP
`is_udp`	`boolean`	Matches if the DNS query was received over UDP
`is_newly_observed_domain`	`boolean`	Matches if the DNS query was a newly observed domain
`is_cache_hit`	`boolean`	Matches if the DNS query was answered without performing outgoing queries
`is_packet_cache_hit`	`boolean`	Matches if the DNS query was answered specifically from the packet cache
`latency_msec`	`integer`	Matches the latency field in the message, in milliseconds
`latency_msec_gte` `latency_msec_gt` `latency_msec_lte` `latency_msec_lt`	`integer`	These perform integer comparisons on the `latency_msec` field in the message
`latency_msec_not`	`integer`	This is simply the inverse of `latency_msec`
`meta_key`	`string`	Matches if there is a meta field key with this name
`meta_key_int`	`string`	Matches meta field key and integer value, in the form `key=value`, e.g. `profile=1`
`meta_key_string`	`string`	Matches meta field key and string value, in the form `key=value`, e.g. `profile_name=foo`
`policy_kind`	`string`	Matches if the policyKind field is a match (case-insensitive). Possible values for policy_kind are `none`, `noaction`, `drop`, `nxdomain`, `nodata`, `truncate`.
`policy_type`	`string`	Matches if the policyType field is a match (case-insensitive). Possible values for policy_type are `none`, `unknown`, `qname`, `clientip`, `responseip`, `nsdname`, `nsip`.
`qname`	`string`	The value is a domain name; the filter matches if the query qname is an exact match (case-insensitive) for the specified domain name.
`qname_sub`	`string`	The value is a domain name; the filter matches if the query qname is a subdomain of the specified domain name. Matches are case-insensitive.
`qtype`	`string` or `integer`	The value matches the query resource record type of the request. It can be specified as a string or an integer, as any string will be converted to an integer using the mapping specified in https://www.iana.org/assignments/dns-parameters/dns-parameters.xhtml. If the type is very new, you may need to use the integer version.
`qtype_gte` `qtype_gt` `qtype_lte` `qtype_lt`	`string` or `integer`	These perform integer comparisons on qtype, after converting the type to an integer
`qtype_not`	`string` or `integer`	This is simply the inverse of qtype
`requestor_id`	`string`	Match if the requestor_id field is a match
`reqsubnet_in_ipset`	`string`	Match all messages where the 'origRequestedSubnet' IP is in a range contained in the named IP set
`rcode`	`integer`	Match the response code in the message (only for messages that contain a response)
`rcode_gte` `rcode_gt` `rcode_lte` `rcode_lt`	`integer`	These perform integer comparisons on the response code in the message
`rcode_not`	`integer`	This is simply the inverse of rcode
`src_port`	`integer`	Matches the source (from) port in the message
`src_port_gte` `src_port_gt` `src_port_lte` `src_port_lt`	`integer`	These perform integer comparisons on the `src_port` field in the message
`src_port_not`	`integer`	This is simply the inverse of `src_port`
`tag`	`string`	Match a specific tag
`tag_prefix`	`string`	Match the start of a tag
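
To illustrate the comparison variants, the following sketch routes only slow NXDOMAIN responses for names under example.com. The route name, domain, and latency threshold are assumptions for the example; rcode 3 is the standard DNS response code for NXDOMAIN.

```yaml
routes:
  slow_nxdomain:
    destinations:
      - mydestination
    filters:
      - is_response: true
      - rcode: 3                 # NXDOMAIN
      - latency_msec_gte: 100    # latency of 100 ms or more
      - qname_sub: example.com   # subdomains of example.com
```

Because top-level filters are combined with an implicit and, a message has to satisfy all four conditions to be forwarded.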

Boolean Logic Filters

The and, or and not filters are used to combine or invert matching filters to create more complex filter patterns.

Parameter	Type	Description
`and`	List of Filter	Applies a logical AND to all the specified filters
`not`	Filter	Inverts the specified filter
`or`	List of Filter	Applies a logical OR to all the specified filters

For example:

routes:
  myroute:
    destinations:
      - mydestination
    filters:
      - tag: REQUIRED_TAG
      - or:
        - tag: GAMBLING
        - tag: FASHION
      - not:
          qname: foo.com
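
The boolean filters can also be nested. As a sketch, assuming nesting works to arbitrary depth, the following route matches AAAA queries received over UDP, or anything received over TCP that does not carry a particular tag. The tag name is hypothetical.

```yaml
routes:
  myroute:
    destinations:
      - mydestination
    filters:
      - or:
          - and:
              - is_udp: true
              - qtype: AAAA
          - and:
              - is_tcp: true
              - not:
                  tag: INTERNAL   # hypothetical tag
```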