dstore-dist-top-reporter

Root Node Config

dstore-dist-top-reporter configuration is based on three concepts: streams, which group the characteristics of the incoming events; reports, which are generated at regular intervals from a specified stream; and storage, which specifies where the generated reports are stored. In summary:

  • Streams: Input sources for data distributed via dstore-dist
  • Reports: The reports which should be generated based on a stream
  • Storage: Locations where reports should be stored

The relationship between these is as follows:

  • A report is based on the data from a single stream, but multiple reports can be generated from the same stream
  • Every report is stored in every configured storage, unless the configuration of a storage restricts it to specific reports

A simple configuration file for dstore-dist-top-reporter might look like:

http:
  address: ":8701"

streams:
  - name: all-queries
    title: "All traffic (sampled)"
    address: ":4801"
    # This needs to match the sample value configured in dstore-dist
    upstream_sampling: 1000 

# Reports are generated from streams.
reports:
- name: all-tldplusone-domains
  # This uses the public domain suffix list to remove internal subdomains
  # e.g. www.example.com and mail.example.com will both be truncated to example.com
  field: qname/suffix+1
  # We always want to oversample, otherwise the summary data will be skewed
  n: 5000
  stream: all-queries
  interval: 60s

storage:
  - name: elasticsearch
    # This is currently the only supported backend
    backend: elastic
    skip_empty: true
    url: http://elasticsearch:9200/
    # Ensure the index contains the report name and today's date
    elastic_index_template: "{{.ReportName}}-{{.TimestampDate}}"

# Optional security section for configuring security fields from tags/rules/categories
security:
  malicious:
    rule:
      - rule1
  blocked:
    tag:
      - tag1
  regulatory:
    rule: ["rule2"]

Note that at least one stream, one report and one storage must be configured.

The following YAML key-values are supported for configuration at the root node:

| Parameter | Type | Default | Description |
|---|---|---|---|
| http.address | <ip:port> | | The address to listen on for Prometheus metrics and for the status page. The value is an address:port string, in either v4 or v6 format. IPv6 addresses must be placed in square brackets, e.g. [::1]. You can omit the address to listen on all local addresses. |
| reports | List of Report | | Configuration of the reports to generate |
| streams | List of Stream | | Configuration of the incoming event streams |
| storage | List of Storage | | Configuration of storage for reports |
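
The address forms described for http.address can be sketched as follows. This is an illustrative fragment rather than a recommended setup: port 8701 is reused from the example configuration above, and only one address line would be active at a time.

```yaml
http:
  # Address part omitted: listen on all local addresses
  address: ":8701"
  # Alternative: listen on the IPv4 loopback address only
  # address: "127.0.0.1:8701"
  # Alternative: listen on the IPv6 loopback address only (note the square brackets)
  # address: "[::1]:8701"
```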

Stream

Parameters which can be used to configure a TopN stream:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| address | ip:port | | | The address (optional) and port to listen on for this stream |
| name | string | yes | | Name of the stream |
| title | string | | | Display friendly name of the stream |
| tlsconfig | TLS Config | | | Configure TLS options for this connection (empty means plaintext) |
| upstream_sampling | integer | | 1 | Sampling value used in the dstore-dist destination which populates this stream |
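
As a sketch of how these parameters combine, the fragment below defines two streams on separate ports. The second stream, its name, and its port are hypothetical and only illustrate that upstream_sampling can be left at its default when the corresponding dstore-dist destination does not sample.

```yaml
streams:
  # The dstore-dist destination feeding this stream uses a sample value of 1000,
  # so upstream_sampling is set to match
  - name: all-queries
    title: "All traffic (sampled)"
    address: ":4801"
    upstream_sampling: 1000
  # Hypothetical unsampled stream: upstream_sampling keeps its default of 1
  - name: unsampled-queries
    title: "Unsampled traffic"
    address: "127.0.0.1:4802"
```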

Report

Parameters which can be used to configure a TopN report:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| description | string | | | Optional verbose description |
| entries | integer | | 1000 | Maximum number of entries to include in the report |
| field | string | yes | "qname" | Field to use as the key for the report (see below for possible values) |
| interval | go:DurationString | | "300s" | How often to generate the report. A longer interval keeps data in memory for longer and therefore increases memory usage. |
| n | integer | yes | | Number of entries to include in report |
| name | string | yes | | Name of the report |
| security | SecurityConfig | yes | | Configure additional fields that represent security filtering |
| stream | string | yes | | Name of an input stream defined on this TopN instance |
| title | string | | | Friendly title for the report |

"field" can take the following values:

| field | Description |
|---|---|
| ip/prefix32/prefix64 | The IP address of the client, with the IP address aggregated to the v4/v6 prefix specified. For example ip/32/128 would perform no aggregation of v4 or v6 IPs. |
| qname | The lowercase DNS question name |
| qname/raw | The raw qname, not converted to lowercase |
| qname/suffix | The public suffix of the qname (e.g. .com, .co.uk, etc.) |
| qname/suffix+1 | The public suffix plus one label (e.g. example.com, example.co.uk, etc.) |
| qname/tld | The TLD (e.g. com, uk, etc.) |
| requestorid | The subscriber’s username |
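
As a sketch of how these field values might be used, the fragment below defines three reports on the same stream. The report names and the values of n are hypothetical, the stream name all-queries is borrowed from the example configuration above, and the value ip/24/56 assumes that any valid v4/v6 prefix length can be substituted into ip/prefix32/prefix64.

```yaml
reports:
  # Top client networks: IPv4 clients aggregated to /24, IPv6 clients to /56
  - name: top-client-networks
    field: ip/24/56
    n: 1000
    stream: all-queries
    interval: 300s
  # Top TLDs, e.g. com, uk
  - name: top-tlds
    field: qname/tld
    n: 100
    stream: all-queries
    interval: 300s
  # Top subscribers, keyed by the subscriber's username
  - name: top-subscribers
    field: requestorid
    n: 1000
    stream: all-queries
    interval: 300s
```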

SecurityConfig

The SecurityConfig section is used to map rules, categories and/or tags to a set of six possible security classifications, which are then counted in reports. Note that the classifications are not mutually exclusive: if a single query matches several classifications, all matching classifications are recorded.

Parameters used to configure SecurityConfig are as follows:

| field | Type | Description |
|---|---|---|
| malicious | SecurityFieldConfig | Used to add a field to the report counting queries which are considered malicious (e.g. malware, phishing etc.) |
| c2 | SecurityFieldConfig | Used to add a field to the report counting queries which are considered command and control communication |
| subscriber | SecurityFieldConfig | Used to add a field to the report counting queries which represent blocking of domains by individual subscribers |
| global | SecurityFieldConfig | Used to add a field to the report counting queries which represent blocking at the global level, i.e. by the operator of the service |
| contentfilter | SecurityFieldConfig | Used to add a field to the report counting queries which represent content filtering according to a subscriber's preference |
| regulatory | SecurityFieldConfig | Used to add a field to the report counting queries which are due to regulatory filtering |

Note that not all fields need to be configured. For example, do not configure the regulatory field if there is no regulatory filtering.

SecurityFieldConfig

Parameters used to configure SecurityFieldConfig are as follows:

| field | Type | Required | Description |
|---|---|---|---|
| tags | List of string | no | Checks if the query matches any of the tags specified |
| rules | List of string | no | Checks if the query matches any of the rules specified |
| categories | List of string | no | Checks if the query matches any of the categories specified |

Although none of the fields are required, you must specify at least one to match a query.
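
A sketch of a report with a security section, using the key names from the SecurityConfig and SecurityFieldConfig tables. It assumes the security block sits inside a report, as the Report table indicates. The rule, tag and category names are placeholders; real values depend on the filtering configuration that produces the events.

```yaml
reports:
  - name: all-tldplusone-domains
    field: qname/suffix+1
    n: 5000
    stream: all-queries
    interval: 60s
    security:
      # Count queries matching either rule as malicious
      malicious:
        rules:
          - malware-rule      # placeholder rule name
          - phishing-rule     # placeholder rule name
      # Count queries carrying this tag as command and control
      c2:
        tags:
          - c2-tag            # placeholder tag name
      # Count queries in this category as content filtering
      contentfilter:
        categories:
          - example-category  # placeholder category name
      # regulatory is omitted: this sketch assumes no regulatory filtering
```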

Storage

Parameters which can be used to configure a TopN storage backend:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| backend | string | yes | | Type of storage backend. Available options: "elastic", "http". See below for more configuration options specific to each backend. |
| name | string | yes | | Name of the storage backend |
| reports | []string | | | List of Report names (all reports will be stored if this list is empty) |
| retry_max | integer | | 0 | Maximum number of retries in case of connection errors or HTTP-500 |
| skip_empty | boolean | | false | Skip generating a report if there are no entries |
| tlsconfig | TLS Config | | {} | TLS configuration options for the storage backend |
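
To illustrate the reports and retry_max parameters, the sketch below defines two storage entries. The storage names, the second URL, and the retry count are hypothetical; the report name is taken from the example configuration above.

```yaml
storage:
  # Receives every report, because "reports" is not set
  - name: elastic-all
    backend: elastic
    url: http://elasticsearch:9200/
    skip_empty: true
    # Retry up to three times on connection errors or HTTP-500
    retry_max: 3
  # Receives only the named report
  - name: http-domains-only
    backend: http
    url: "http://example.com/reports"
    reports:
      - all-tldplusone-domains
```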

Backend: Elastic

The following additional parameters are available on storage backends with backend: elastic. They should be attributes of the storage item itself.

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| elastic_id_template | string | | | Template used to render the Elastic IDs. Randomly generated if not configured |
| elastic_index_template | string | | {{.ReportName}} | Template used to render the name of the index to use in Elastic to store the reports |
| elastic_single_doc | boolean | | false | Store a report in a single document in Elastic |
| elastic_use_create_action | boolean | | false | Use the bulk API create action instead of index action. Required for Elastic Data Streams |
| password | string | | | Password to use for authentication with Elastic |
| retry_max | integer | | | Maximum number of retries. By default no retries are attempted. |
| url | string | One of url or urls is required | | Base URL of Elastic instance. Note this can also be templated. |
| urls | List of string | One of url or urls is required | | Base URLs of Elastic instance. Note these can also be templated. The URLs will be tried in turn until one succeeds. |
| username | string | | | Username to use for authentication with Elastic |
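
A sketch of an Elastic storage entry with these parameters set directly on the storage item. The host names, credentials, and retry count are placeholders chosen for illustration.

```yaml
storage:
  - name: elasticsearch-cluster
    backend: elastic
    # Tried in turn until one succeeds
    urls:
      - http://es1.example.com:9200/
      - http://es2.example.com:9200/
    username: reporter        # placeholder credentials
    password: changeme        # placeholder credentials
    # One index per report per day
    elastic_index_template: "{{.ReportName}}-{{.TimestampDate}}"
    # Store each report as a single document
    elastic_single_doc: true
    retry_max: 2
```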

Backend: HTTP

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| headers | map | | | Map of headers, with the header name followed by the value |
| method | string | | POST | Method used, e.g. PUT. |
| password | string | | | Password to use for (Basic) authentication |
| url | string | One of url or urls is required | | Base URL of HTTP instance. Note this can also be templated. |
| urls | List of string | One of url or urls is required | | Base URLs of HTTP instance. Note these can also be templated. The URLs will be tried in turn until one succeeds. |
| username | string | | | Username to use for (Basic) authentication |
| wrapper_template | string | | {{.ReportEvent}} | Template used to render the data sent to the HTTP endpoint. Used to wrap the event in additional JSON, e.g. https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector |

For example:

backend: http
name: splunk
url: "http://example.com/splunk"
method: POST
headers:
  authorization: "Splunk 12345"
wrapper_template: >
  {"host":"example.com","source":"dstore-dist-top-reporter","index":"pdns-dstore-dist-top-reporter","sourcetype":"{{.ReportName}}","event":{{.ReportEvent}}}

Template Variables

The possible template variables are as follows:

  • ReportName - The name of the report
  • ReportEvent - The JSON representation of the report event
  • Timestamp - The UNIX timestamp corresponding to the time the report was generated
  • TimestampDate - The date (year-month-day) that the report was generated
  • TimestampISO - The time the report was generated, in RFC3339 format
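
The fragment below sketches how these variables might appear in templated parameters. It assumes every variable is available in every templated parameter, which this page does not state explicitly; the storage names, the second URL, and the JSON envelope keys are hypothetical.

```yaml
storage:
  - name: elasticsearch
    backend: elastic
    url: http://elasticsearch:9200/
    # One index per report per day: <report name>-<year-month-day>
    elastic_index_template: "{{.ReportName}}-{{.TimestampDate}}"
  - name: collector
    backend: http
    url: "http://example.com/collect"
    # Wrap the report JSON in an envelope carrying the report name and generation time
    wrapper_template: >
      {"report":"{{.ReportName}}","generated":"{{.TimestampISO}}","event":{{.ReportEvent}}}
```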