dstore-dist-top-reporter

Root Node Config

dstore-dist-top-reporter configuration is based on three concepts: streams, which group the characteristics of the incoming events; reports, which are generated at regular intervals from a specified stream; and storage, which specifies where the generated reports are stored. In summary:

Streams: Input sources for data distributed via dstore-dist
Reports: The reports which should be generated based on a stream
Storage: Locations where reports should be stored

The relationship between these is as follows:

A report is based on the data from a single stream; multiple reports can be generated from the same stream
All reports are stored in every configured storage backend, unless the configuration of a storage backend specifies otherwise

A simple configuration file for dstore-dist-top-reporter might look like:

http:
  address: ":8701"

streams:
  - name: all-queries
    title: "All traffic (sampled)"
    address: ":4801"
    # This needs to match the sample value configured in dstore-dist
    upstream_sampling: 1000 

# Reports are generated from streams.
reports:
- name: all-tldplusone-domains
  # This uses the public domain suffix list to remove internal subdomains
  # e.g. www.example.com and mail.example.com will both be truncated to example.com
  field: qname/suffix+1
  # We always want to oversample, otherwise the summary data will be skewed
  n: 5000
  stream: all-queries
  interval: 60s

storage:
  - name: elasticsearch
    # Type of storage backend (see the Storage section for the available options)
    backend: elastic
    skip_empty: true
    url: http://elasticsearch:9200/
    # Ensure the index contains the report name and today's date
    elastic_index_template: "{{.ReportName}}-{{.TimestampDate}}"

# Optional security section for configuring security fields from tags/rules/categories
security:
  malicious:
    rule:
      - rule1
  blocked:
    tag:
      - tag1
  regulatory:
    rule: ["rule2"]

Note that at least one stream, one report and one storage must be configured.

The following YAML key-values are supported for configuration at the root node:

Parameter	Type	Description
`http.address`	`<ip:port>`	The address to listen on for Prometheus metrics and for the status page. The value is an address:port string, in either IPv4 or IPv6 format. IPv6 addresses must be placed in square brackets, like this: `[::1]`. Omit the address to listen on all local addresses.
`reports`	List of Report	Configuration of the reports to generate
`streams`	List of Stream	Configuration of the incoming event streams
`storage`	List of Storage	Configuration of storage for reports
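
As a small illustration of the `http.address` formats described above, the fragment below binds the metrics and status listener to the IPv6 loopback address. The port number is reused from the sample configuration; the commented alternatives are shown only for comparison, and a real file would contain a single `address` key.

```yaml
http:
  # IPv6 literal in square brackets, followed by the port
  address: "[::1]:8701"
  # Alternatives (use only one `address` key):
  #   address: "127.0.0.1:8701"   # IPv4 loopback only
  #   address: ":8701"            # all local addresses
```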

Stream

Parameters which can be used to configure a TopN stream:

Parameter	Type	Required	Default	Description
`address`	`ip:port`			The address (optional) and port to listen on for this stream
`name`	`string`	`yes`		Name of the stream
`title`	`string`			Display-friendly name of the stream
`tlsconfig`	TLS Config			Configure TLS options for this connection (empty means plaintext)
`upstream_sampling`	`integer`		`1`	Sampling value used in the dstore-dist destination which populates this stream
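
The following sketch shows how the stream parameters above might be combined when two streams are received on separate ports. The second stream's name, title and port are hypothetical and chosen only for illustration; the first stream is taken from the sample configuration.

```yaml
streams:
  - name: all-queries
    title: "All traffic (sampled)"
    address: ":4801"
    # Must match the sample value configured in dstore-dist
    upstream_sampling: 1000
  - name: blocked-queries          # hypothetical stream name
    title: "Blocked traffic"       # hypothetical title
    address: "127.0.0.1:4802"      # hypothetical port, loopback only
    # upstream_sampling omitted, so the default of 1 applies
```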

Report

Parameters which can be used to configure a TopN report:

Parameter	Type	Required	Default	Description
`description`	`string`			Optional verbose description
`entries`	`integer`		`1000`	Maximum number of entries to include in the report
`field`	`string`	`yes`	`"qname"`	Field to use as the key for the report (see below for possible values)
`interval`	`go:`DurationString		`"300s"`	How often to generate the report. A longer interval keeps data in memory for longer, which increases memory usage.
`n`	`integer`	`yes`		Number of entries to include in report
`name`	`string`	`yes`		Name of the report
`security`	SecurityConfig	`yes`		Configure additional fields that represent security filtering
`stream`	`string`	`yes`		Name of an input stream defined on this TopN instance
`title`	`string`			Friendly title for the report

"field" can take the following values:

field	Description
`ip/prefix32/prefix64`	The IP address of the client, aggregated to the IPv4/IPv6 prefix lengths specified. For example, `ip/32/128` performs no aggregation of IPv4 or IPv6 addresses.
`qname`	The lowercase DNS question name
`qname/raw`	The raw qname, not converted to lowercase
`qname/suffix`	The public suffix of the qname (e.g. .com, .co.uk, etc.)
`qname/suffix+1`	The public suffix plus one label (e.g. example.com, example.co.uk, etc.)
`qname/tld`	The TLD (e.g. com, uk, etc.)
`requestorid`	The subscriber’s username
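
To make the `field` values more concrete, here is a minimal sketch of two reports generated from the same stream. The report names, the `ip/24/56` prefix lengths and the `n` and `interval` values are illustrative assumptions, not recommendations. Like the sample configuration at the top of this page, the reports do not set a `security` key.

```yaml
reports:
  # Top registrable domains, e.g. www.example.co.uk is counted as example.co.uk
  - name: top-domains              # hypothetical report name
    stream: all-queries
    field: qname/suffix+1
    n: 5000
    interval: 60s
  # Top client networks: IPv4 clients aggregated to /24, IPv6 clients to /56
  - name: top-client-networks      # hypothetical report name
    stream: all-queries
    field: ip/24/56
    n: 1000
    interval: 300s
```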

SecurityConfig

The SecurityConfig section maps rules, categories and/or tags to a set of six possible security classifications, which are then counted in reports. The classifications are not mutually exclusive: if a single query matches more than one classification, all matching classifications are recorded.

Parameters used to configure SecurityConfig are as follows:

field	Type	Description
`malicious`	SecurityFieldConfig	Used to add a field to the report counting queries which are considered malicious (e.g. malware, phishing etc.)
`c2`	SecurityFieldConfig	Used to add a field to the report counting queries which are considered command and control communication
`subscriber`	SecurityFieldConfig	Used to add a field to the report counting queries which represent blocking of domains by individual subscribers
`global`	SecurityFieldConfig	Used to add a field to the report counting queries which represent blocking at the global level, i.e. by the operator of the service
`contentfilter`	SecurityFieldConfig	Used to add a field to the report counting queries which represent content filtering according to a subscriber's preference
`regulatory`	SecurityFieldConfig	Used to add a field to the report counting queries which are due to regulatory filtering

Note that not all fields need to be configured; for example, do not configure the `regulatory` field if there is no regulatory filtering.

SecurityFieldConfig

Parameters used to configure SecurityFieldConfig are as follows:

field	Type	Required	Description
`tags`	List of `string`	`no`	Checks if the query matches any of the tags specified
`rules`	List of `string`	`no`	Checks if the query matches any of the rules specified
`categories`	List of `string`	`no`	Checks if the query matches any of the categories specified

Although none of the fields are required, you must specify at least one to match a query.
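
A sketch of how a SecurityConfig might be filled in is shown below. It uses the key names from the table above (`tags`, `rules`, `categories`); note that the sample configuration at the top of this page uses the singular forms `rule` and `tag`, so it is worth checking which spelling your version accepts. All rule, tag and category names here are hypothetical, and the placement of the `security` key follows the sample configuration.

```yaml
security:
  malicious:
    categories:
      - malware                # hypothetical category name
      - phishing               # hypothetical category name
  global:
    rules:
      - operator-blocklist     # hypothetical rule name
  regulatory:
    tags: ["regulatory"]       # hypothetical tag name
  # c2, subscriber and contentfilter are left unconfigured in this sketch
```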

Storage

Parameters which can be used to configure a TopN storage backend:

Parameter	Type	Required	Default	Description
`backend`	`string`	`yes`		Type of storage backend. Available options: `"elastic"`, `"http"`. See below for the configuration options specific to each backend.
`name`	`string`	`yes`		Name of the storage backend
`reports`	`[]string`			List of Report names (all reports will be stored if this list is empty)
`retry_max`	`integer`		`0`	Maximum number of retries in case of connection errors or HTTP-500
`skip_empty`	`boolean`		`false`	Skip generating a report if there are no entries
`tlsconfig`	TLS Config		`{}`	TLS configuration options for the storage backend
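
The `reports` parameter is what allows a storage backend to receive only some reports. The sketch below, which reuses names and URLs from the examples elsewhere on this page, restricts one backend to a single report while a second backend receives everything. The `retry_max` value is an arbitrary illustration.

```yaml
storage:
  - name: elasticsearch
    backend: elastic
    url: http://elasticsearch:9200/
    # Only the listed reports are stored in this backend
    reports:
      - all-tldplusone-domains
    retry_max: 3
    skip_empty: true
  - name: splunk
    backend: http
    url: "http://example.com/splunk"
    # No `reports` list: all reports are stored here
```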

Backend: Elastic

Additional parameters available on storage items with `backend: elastic`. These are set as attributes of the storage item itself:

Parameter	Type	Required	Default	Description
`elastic_id_template`	`string`			Template used to render the Elastic IDs. Randomly generated if not configured
`elastic_index_template`	`string`		`{{.ReportName}}`	Template used to render the name of the index to use in Elastic to store the reports
`elastic_single_doc`	`boolean`		`false`	Store a report in a single document in Elastic
`elastic_use_create_action`	`boolean`		`false`	Use the bulk API create action instead of index action. Required for Elastic Data Streams
`password`	`string`			Password to use for authentication with Elastic
`retry_max`	`integer`			Maximum number of retries. By default no retries are attempted.
`url`	`string`	One of `url` or `urls` is required		Base URL of Elastic instance. Note this can also be templated.
`urls`	List of `string`	One of `url` or `urls` is required		Base URLs of Elastic instance. Note these can also be templated. The URLs will be tried in turn until one succeeds.
`username`	`string`			Username to use for authentication with Elastic
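
As an illustration of the Elastic parameters above, the following sketch configures a storage item with two candidate URLs, basic credentials and the create action for data streams. The storage name, host names and credentials are placeholders invented for this example.

```yaml
storage:
  - name: elasticsearch-cluster               # hypothetical name
    backend: elastic
    # Tried in turn until one succeeds
    urls:
      - "http://es1.example.com:9200/"        # hypothetical host
      - "http://es2.example.com:9200/"        # hypothetical host
    username: "topn-writer"                   # placeholder credentials
    password: "change-me"
    elastic_index_template: "{{.ReportName}}-{{.TimestampDate}}"
    # Needed when the target is an Elastic Data Stream
    elastic_use_create_action: true
    retry_max: 2
```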

Backend: HTTP

Parameter	Type	Required	Default	Description
`headers`	`map`			Map of headers, with the header name followed by the value
`method`	`string`		POST	HTTP method to use, e.g. PUT.
`password`	`string`			Password to use for (Basic) authentication
`url`	`string`	One of `url` or `urls` is required		Base URL of HTTP instance. Note this can also be templated.
`urls`	List of `string`	One of `url` or `urls` is required		Base URLs of HTTP instance. Note these can also be templated. The URLs will be tried in turn until one succeeds.
`username`	`string`			Username to use for (Basic) authentication
`wrapper_template`	`string`		`{{.ReportEvent}}`	Template used to render the data sent to the HTTP endpoint. It can be used to wrap the event in additional JSON, e.g. for the Splunk HTTP Event Collector: https://docs.splunk.com/Documentation/Splunk/latest/Data/FormateventsforHTTPEventCollector

For example:

backend: http
name: splunk
url: "http://example.com/splunk"
method: POST
headers:
  authorization: "Splunk 12345"
wrapper_template: >
  {"host":"example.com","source":"dstore-dist-top-reporter","index":"pdns-dstore-dist-top-reporter","sourcetype":"{{.ReportName}}","event":{{.ReportEvent}}}

Template Variables

The possible template variables are as follows:

ReportName - The name of the report
ReportEvent - The JSON representation of the report event
Timestamp - The UNIX timestamp corresponding to the time the report was generated
TimestampDate - The date (year-month-day) that the report was generated
TimestampISO - The time the report was generated, in RFC3339 format
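
The sketch below shows where these variables might be used in practice, reusing the storage items from the examples on this page. The rendered index name in the comment is an assumption about the exact date formatting, and the `time` key in the wrapper is an illustrative addition rather than something this page prescribes.

```yaml
storage:
  - name: elasticsearch
    backend: elastic
    url: http://elasticsearch:9200/
    # For a report named "all-tldplusone-domains" generated on 15 January 2024,
    # this would render to something like "all-tldplusone-domains-2024-01-15"
    # (the exact date formatting is an assumption)
    elastic_index_template: "{{.ReportName}}-{{.TimestampDate}}"
  - name: splunk
    backend: http
    url: "http://example.com/splunk"
    # Wraps the report JSON and adds the generation time as a UNIX timestamp
    wrapper_template: '{"time":{{.Timestamp}},"sourcetype":"{{.ReportName}}","event":{{.ReportEvent}}}'
```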