Event Aggregator

Overview

The Event Aggregator service (the systemd unit is named dstore-ev-aggregator) is designed to filter PDNS Protobuf messages, specifically messages about DNS filtering events, as well as new device messages, and do the following:

  • Decide whether to send the message on to any of the configured output channels using filtering and throttling logic. By default, events are not sent; they are only sent to a given output channel if they match a filter and are not throttled.
  • Send the event to the matching output channel (if any). The same event may be sent to multiple output channels if multiple matching filters are defined.

The event aggregator also aggregates messages for sending on to downstream data-warehousing services such as Elasticsearch. This is achieved using the optional aggregations logic, which keeps track of the number of messages received and sends only a proportion of those messages to the downstream DB, recording as part of the event data the number of events that were suppressed.

If either an elasticsearch or logstash URL is configured, then all received events will be sent to that URL. Thus, if you only want to use event aggregator to send events to elasticsearch, then the simplest way to achieve this is to configure an elasticsearch or logstash URL, together with a single aggregation that doesn’t match any events.

Configuration

Configuration is via two files:

  • Global Configuration: The global configuration file is called ev_aggregator.conf and contains configuration settings, including where to find the filter definition file.
  • Filter Configuration: The location of the filter/aggregation configuration file is defined in ev_aggregator.conf using the filter-file setting. This file contains the definitions of the output channels, and of the filters and aggregations for those output channels.

By default the above configuration files are located in /etc/dstore; this can be changed using command-line configuration options as described below.

Global Configuration

The following settings are available:

config-file: Specify a different config file location
daemon: Run as a daemon
listen-address: Address and port to listen on for PBDNSMessage events
alert-listen-address: Address and port to listen on for PBAlertMessage events
logstash-url: URL, e.g. https://192.168.1.254:8080/, of a logstash server to send all received events to (in JSON format). Optional parameter, however it is required for many use-cases, so disable with care. This is now deprecated in favour of sending to elasticsearch directly using the elasticsearch-url parameter.
logstash-userpass: Username/Password for logstash in the form username:password
elasticsearch-url: URL (with no path component) of an elasticsearch server to send all received events to. Supersedes the logstash-url parameter.
elasticsearch-index-prefix: The index prefix to use with elasticsearch. Index names will be created as “<index-prefix>%{YYYY}%{MM}%{dd}”.
elasticsearch-index-template: Whether to upload an index template to elasticsearch that maps the timestamp parameter to a date type. This defaults to false.
elasticsearch-userpass: Username/Password for elasticsearch in the form username:password
worker-threads: The number of worker threads to create (these process individual protobuf messages and are used to search Redis). Defaults to 20.
webhook-threads: The number of threads to create for sending webhooks (used for the webhook and notification_center output channels, as well as for writing to Logstash). Defaults to 10.
webhook-conns: The maximum number of HTTP connections that each webhook thread will use. Defaults to 10.
fail-open: If fail-open is set to false (the default) and Redis is unavailable (and thus throttling cannot be determined), events that match input filters will not be sent to output channels. If set to true, matching events will always be sent to output channels.
malware-tags: A comma-separated list of tags that indicate malware filtering. These are used to indicate that an event is related to malware.
botnet-tags: A comma-separated list of tags that indicate botnet filtering. These are used to indicate that an event is related to botnets.
phishing-tags: A comma-separated list of tags that indicate phishing filtering. These are used to indicate that an event is related to phishing.
blacklist-tags: A comma-separated list of tags that indicate filtering due to blacklists. These are used to indicate that an event is related to blacklisting.
contentfilter-tags: A comma-separated list of tags that indicate filtering due to content filtering. These are used to indicate that an event is related to content filtering.
platform-url: The URL of the PowerDNS Platform API. If specified, the list of category names and titles is downloaded every hour and used to provide “friendly” names to the Notification Center.
platform-auth-token: The token to use in the X-API-Key header for authorization to the Platform API.
filter-file: The location of the file (in YAML format) used to configure output channels and filters. Mandatory parameter.
redis-server: The hostname/IP address of a redis server, which will be used for throttling/filtering queries. Mandatory parameter.
redis-port: The port number of a redis server.
redis-hash-keys: Hash redis keys to save memory and CPU (defaults to true).
redis-password: The password to use for Redis (optional)
redis-retries: The number of retry attempts to connect to Redis after connection failure (defaults to 3)
http-listen-address: The address (and port) to use to provide prometheus metrics via HTTP on the /metrics endpoint. Format is <IP address>:<optional port>. The port defaults to 8083.
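
As an illustration of how the *-tags settings relate tags to filtering types, the following Python sketch parses comma-separated tag lists and reports which filtering types apply to an event's tags. It is not the service's implementation: the tag values and the function names are hypothetical, and the real service may apply a precedence order that is not shown here.

```python
# Hypothetical setting values; real tag names depend on the deployment.
TAG_SETTINGS = {
    "malware": "rule:example-malware, cat:example-malware",
    "phishing": "rule:example-phishing",
    "contentfilter": "rule:example-content",
}


def parse_tag_list(value):
    """Split a comma-separated setting value into a set of tags."""
    return {tag.strip() for tag in value.split(",") if tag.strip()}


def filter_types_for(event_tags, tag_settings):
    """Return every filtering type whose configured tags appear in event_tags."""
    present = set(event_tags)
    return [
        filter_type
        for filter_type, raw in tag_settings.items()
        if parse_tag_list(raw) & present
    ]
```

For example, filter_types_for(["foo", "rule:example-malware"], TAG_SETTINGS) returns ["malware"], and an event with none of the configured tags yields an empty list.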

Filter Configuration

Output Channels and Filters are defined in the file specified by the filter-file setting. The filter file is a YAML-format file, e.g.:

output_channels:
  - output_channel:
      name: Output Channel 1
      type: notification_center
      url: http://127.0.0.1:8080/
      api-key: secret
  - output_channel:
      name: Output Channel 2
      type: webhook
      url: http://127.0.0.1:8081/
      api-key: secret
      basic-auth: user:password
      secret: secret
aggregations:
  - aggregate:
      name: test_aggregation
      description: Always send the first 10 events, then aggregate more aggressively as the number of events increases using a 10x multiplier.
      input_filter:
        qname: aggregate.com
      min_events: 10
      multiplier: 10
      cache_timeout: 600
      max_aggregate: 10000
      output_channel: Output Channel 2
      switch: on
filters:
  - filter:
      name: Filter 1
      description: This is a filter
      input_filter:
        app: pdns
        type: dnsfilter
        user_id: "?"
      input_exceptions:
        filtertype: phishing
      switch: on
      throttle:
        min_events: 0
        max_notifications: 1
        period: 86400
      output_channel: Output Channel 1
  - filter:
      name: Filter 2
      description: This is another filter
      input_filter:
        app: pdns
        type: newdevice
        user_id: "?"
      switch: on
      throttle:
        min_events: 10
        period: 3600
      output_channel: Output Channel 2

Output Channel Configuration

All output channels must have name and type fields. The type must currently be one of the following strings:

  • log
  • webhook
  • notification_center

For output channels of type webhook and notification_center, the following additional fields are mandatory:

  • url: The URL of the webhook endpoint. Note that the following tokens in a URL will be expanded:
    • %{YYYY} - Expands to the current year e.g. 2020
    • %{MM} - Expands to the current month number, e.g. 01 or 12
    • %{dd} - Expands to the current day of the month number, e.g. 01 or 25

And the following fields are optional:

  • api-key: The value to place into an X-API-Key header
  • basic-auth: The username and password to provide for basic authentication (in user:password format)
  • secret: The secret to use when generating an X-Signature header

Output channels must be defined in the file before the filters map.
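
The date tokens accepted in a url can be illustrated with a short Python sketch. This is only a model of the expansion described above: the function name and the example path are hypothetical, and whether the service uses UTC or local time for the "current" date is not stated here, so the date is passed in explicitly.

```python
from datetime import date


def expand_date_tokens(template, today=None):
    """Expand %{YYYY}, %{MM} and %{dd} using the given (or current) date."""
    today = today or date.today()
    return (
        template.replace("%{YYYY}", f"{today.year:04d}")
        .replace("%{MM}", f"{today.month:02d}")
        .replace("%{dd}", f"{today.day:02d}")
    )


# The path below is a made-up example, not a documented endpoint.
print(expand_date_tokens("http://127.0.0.1:8081/events-%{YYYY}.%{MM}.%{dd}", date(2020, 1, 25)))
# prints http://127.0.0.1:8081/events-2020.01.25
```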

Filter Configuration

Filters are used to send matching events to output channels. Every filter must have an input filter, which matches the events to be sent, and an output channel, which decides where the event is sent. Optionally filters also have a throttle, which can be used to restrict when events are sent to the output channel. Filters without a throttle will send every matching event to the output channel. Finally filters have a switch, which simply enables or disables the filter.

Input filters are mandatory and consist of a list of field names and values. There must be at least one field specified; empty input filters are not valid. There are two types of match for input filter fields:

  • Exact Match: For example, key: value will match if the event has a field called “key” with a value of “value”.
  • Current Match: For example key: "?". The “?” syntax specifies that the value of the specified field in the current event is used, whatever that is. For example if the current value is “foo” then “?” will be substituted with “foo”.

To match an input filter, all fields must match, i.e. the terms are combined with a logical AND. Input filters only match string fields; you cannot match on an array field currently.
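
The matching rules above can be sketched in Python as follows. This is an illustrative model rather than the service's code; in particular, it assumes that a "?" field only matches when the field is actually present in the event, which the text does not state explicitly.

```python
def matches_input_filter(event, input_filter):
    """Return True if every field in input_filter matches the event (logical AND)."""
    if not input_filter:
        raise ValueError("an input filter must contain at least one field")
    for field, expected in input_filter.items():
        actual = event.get(field)
        # Only string fields can be matched; arrays and missing fields never match.
        if not isinstance(actual, str):
            return False
        # "?" accepts whatever value the event has; anything else is an exact match.
        if expected != "?" and actual != expected:
            return False
    return True


event = {"app": "pdns", "type": "dnsfilter", "user_id": "joe", "tags": ["foo"]}
print(matches_input_filter(event, {"app": "pdns", "user_id": "?"}))       # prints True
print(matches_input_filter(event, {"app": "pdns", "type": "newdevice"}))  # prints False
print(matches_input_filter(event, {"tags": "?"}))                         # prints False
```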

The current match syntax of “?” may be considered similar to a wildcard, which indeed it is for matching purposes; however, its use is more subtle than that. Only the events that match the input filter are counted for throttling purposes, so specifying user_id: "?" as an input filter counts events separately for each user, which enables per-user throttling. If no throttle is specified then the “?” can be considered to be identical to a wildcard.

An optional input_exceptions map consisting of field names and values can be configured. Any event which has fields which match any of the input exceptions will not be matched (i.e. logical OR) and the filter will be skipped for that event. Only exact matches can be configured, and there is no support for the “?” syntax in input_exceptions.
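
A minimal Python sketch of the exception check, assuming the event and the exceptions are plain dictionaries (the function name is hypothetical):

```python
def is_excepted(event, input_exceptions):
    """Return True if ANY exception field exactly matches the event (logical OR)."""
    return any(
        event.get(field) == value
        for field, value in (input_exceptions or {}).items()
    )


exceptions = {"filtertype": "phishing", "qname": "example.com"}
print(is_excepted({"filtertype": "phishing", "qname": "foo.com"}, exceptions))  # prints True
print(is_excepted({"filtertype": "malware", "qname": "foo.com"}, exceptions))   # prints False
```

A filter would then be applied to an event only when its input filter matches and is_excepted returns False.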

The switch field should be set to “off” to disable a filter. If the switch field is missing, the filter is considered disabled.

The output_channel field must specify an output channel name as defined in the previous section.

Throttles are used to filter matching events. They are optional, meaning that if a throttle is not specified, all matching events will be sent to the specified output channel.

The only mandatory field of a throttle is period:

  • period: The number of seconds over which the throttle applies. Used to scope the query to Redis.

Throttles must specify either one or both of the following:

  • min_events: The minimum number of matching events which must have been received previously in the current time period before the current event is sent to the specified output channel. Once this threshold has been reached for the current time period, events will continue to be sent unless throttled with max_notifications.
  • max_notifications: The maximum number of matching notifications (i.e. events that are actually sent to the specified output channel) that will be sent in the current time period. Note that setting a value of 0 will ensure that events are always throttled.

Both min_events and max_notifications can be specified at the same time.
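
The interaction of period, min_events and max_notifications can be modelled with the Python sketch below. It is an approximation under stated assumptions: counts are kept in memory rather than in Redis, the period is treated as a sliding window, and the class and method names are hypothetical.

```python
from collections import defaultdict


class Throttle:
    """In-memory model of a throttle; the real service stores its counts in Redis."""

    def __init__(self, period, min_events=None, max_notifications=None):
        if min_events is None and max_notifications is None:
            raise ValueError("specify min_events, max_notifications, or both")
        self.period = period
        self.min_events = min_events
        self.max_notifications = max_notifications
        self._events = defaultdict(list)         # key -> timestamps of matching events
        self._notifications = defaultdict(list)  # key -> timestamps of events sent

    def allow(self, key, now):
        """Record a matching event at time `now` (seconds); return True if it is sent."""
        cutoff = now - self.period
        events = [t for t in self._events[key] if t > cutoff]
        sent = [t for t in self._notifications[key] if t > cutoff]
        ok = True
        if self.min_events is not None and len(events) < self.min_events:
            ok = False  # not enough earlier events in this period
        if self.max_notifications is not None and len(sent) >= self.max_notifications:
            ok = False  # notification budget for this period is used up
        events.append(now)
        if ok:
            sent.append(now)
        self._events[key] = events
        self._notifications[key] = sent
        return ok


throttle = Throttle(period=3600, min_events=2, max_notifications=1)
print([throttle.allow("joe", t) for t in (0, 1, 2, 3)])
# prints [False, False, True, False]
```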

If we consider the following filter:

- filter:
    name: filter1
    description: foo
    input_filter:
      app: pdns
      user_id: "?"
      qname: "?"
      type: dnsfilter
    input_exceptions:
      filtertype: botnet
    switch: on
    throttle:
      period: 3600
      max_notifications: 1
    output_channel: Output Channel 1

This can be explained as follows:

The filter ‘filter1’ matches events matching the “pdns” app, and the “dnsfilter” type, but not events containing the ‘botnet’ filtertype, sending no more than one notification per hour for each unique combination of user_id and qname.

For example, the following table shows whether a notification is sent for a set of incoming events (in chronological order, all received within a minute):

Event                                                                         Notification sent?
app: pdns type: dnsfilter user_id: joe qname: facebook.com                    y
app: pdns type: dnsfilter user_id: joe qname: powerdns.com                    y
app: pdns type: dnsfilter user_id: mary qname: facebook.com                   y
app: pdns type: dnsfilter user_id: joe qname: facebook.com                    n
app: pdns type: dnsfilter user_id: mary qname: google.com filtertype: botnet  n (Event will not match filter)
app: pdns type: dnsfilter user_id: mary qname: google.com                     y
app: pdns type: dnsfilter user_id: mary qname: google.com                     n
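
The table can be reproduced with a small Python model in which the fields marked "?" form the key under which notifications are counted. This is a sketch, not the service's logic: the key format and function names are hypothetical, and the period is ignored because all events arrive within a minute.

```python
FILTER = {
    "input_filter": {"app": "pdns", "user_id": "?", "qname": "?", "type": "dnsfilter"},
    "input_exceptions": {"filtertype": "botnet"},
    "max_notifications": 1,
}


def throttle_key(event, input_filter):
    """Return a key built from the matched field values, or None if there is no match."""
    parts = []
    for field, expected in sorted(input_filter.items()):
        actual = event.get(field)
        if actual is None or (expected != "?" and actual != expected):
            return None
        parts.append(f"{field}={actual}")
    return "|".join(parts)


def replay(events, flt):
    """Return 'y' or 'n' for each event, counting notifications per throttle key."""
    sent_per_key = {}
    decisions = []
    for event in events:
        key = throttle_key(event, flt["input_filter"])
        excepted = any(event.get(f) == v for f, v in flt["input_exceptions"].items())
        if key is None or excepted:
            decisions.append("n")  # the filter is skipped for this event
        elif sent_per_key.get(key, 0) < flt["max_notifications"]:
            sent_per_key[key] = sent_per_key.get(key, 0) + 1
            decisions.append("y")
        else:
            decisions.append("n")  # throttled
    return decisions


BASE = {"app": "pdns", "type": "dnsfilter"}
EVENTS = [
    {**BASE, "user_id": "joe", "qname": "facebook.com"},
    {**BASE, "user_id": "joe", "qname": "powerdns.com"},
    {**BASE, "user_id": "mary", "qname": "facebook.com"},
    {**BASE, "user_id": "joe", "qname": "facebook.com"},
    {**BASE, "user_id": "mary", "qname": "google.com", "filtertype": "botnet"},
    {**BASE, "user_id": "mary", "qname": "google.com"},
    {**BASE, "user_id": "mary", "qname": "google.com"},
]
print(replay(EVENTS, FILTER))
# prints ['y', 'y', 'y', 'n', 'n', 'y', 'n']
```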

Aggregation Configuration

Aggregations are also used to send matching events to output channels. Similarly to filters, aggregations must have an input filter, which matches the events to be sent, and an output channel, which decides where the event is sent. However, rather than a throttle, the aggregation defines a multiplier (a positive integer), which determines how aggressively the events will be aggregated; a higher number is more aggressive, and a multiplier of 1 means that no aggregation will be done (i.e. every matching event will be sent). Finally, aggregations also have a switch, which simply enables or disables the aggregation.

Input filters are mandatory and work exactly as described above for filters.

The name field is used to uniquely identify the aggregation, and the description field helps identify its purpose.

The switch field should be set to “off” to disable an aggregation. If the switch field is missing, the aggregation is considered disabled.

The output_channel field must specify an output channel name as defined in the Output Channel Configuration section.

Aggregations must specify a multiplier:

  • multiplier: The multiplication factor used to determine how aggressively events get aggregated. A multiplication factor of 10 means that the number of events that get aggregated will increase by a factor of 10 as the number of events increases (e.g. between 1-100, one in every 10 events will be sent, between 100-1000 one in every 100 events will be sent etc.)

The following are optional:

  • min_events: If the event count is <= min_events, every event will be sent. Defaults to 0.
  • max_aggregate: The maximum number of events that will be aggregated. Use this to limit the multiplication factor, e.g. 1000 will limit to one event being sent for every 1000 events received. Defaults to 0, meaning no limit.
  • cache_timeout: The count of events is stored in redis, with an expiry. If no events are received within this window then the count is expired, i.e. reset to 0. It defaults to 600 seconds.

If we consider the following aggregation:

- aggregate:
    name: aggregation1
    description: foo
    input_filter:
      app: pdns
    switch: on
    min_events: 10
    multiplier: 10
    max_aggregate: 1000
    cache_timeout: 600
    output_channel: webhook

The following table shows which events will cause an aggregated message to be sent to the output channel (assuming all events are received within the cache_timeout window):

Event Number  Event sent?  Number of events aggregated
1             y            1
10            y            1
19            n            n/a
20            y            10
110           y            10
210           y            100
220           n            n/a
1010          y            100
1110          n            n/a
2010          y            1000
3010          y            1000
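
The rows above are consistent with the following Python sketch. The rule is inferred from this table and from the multiplier description; it is not taken from the service's source, so treat the function (whose name is hypothetical) as a model of the documented behaviour rather than the actual algorithm.

```python
def aggregation_factor(count, min_events=0, multiplier=10, max_aggregate=0):
    """Return the number of events aggregated if event number `count` is sent, else None."""
    if count <= min_events or multiplier <= 1:
        return 1  # below the threshold, or no aggregation: every event is sent
    n = count - min_events
    factor = multiplier
    # Step up to the next power of the multiplier as the event count grows.
    while factor * multiplier < n:
        factor *= multiplier
    if max_aggregate:
        factor = min(factor, max_aggregate)  # 0 means no limit
    return factor if n % factor == 0 else None


for number in (1, 10, 19, 20, 110, 210, 220, 1010, 1110, 2010, 3010):
    print(number, aggregation_factor(number, min_events=10, multiplier=10, max_aggregate=1000))
```

With the settings of aggregation1, the loop prints a factor of 1, 1, None, 10, 10, 100, None, 100, None, 1000 and 1000 respectively, where None corresponds to "n" in the table.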

The JSON sent to the output channel will contain an event_count field that contains the number of events aggregated.

Input Field Dictionaries

Currently the event aggregator understands two types of event:

  • DNSMessage Events - These events can have either: type: dnsfilter for queries that are filtered, or type: dnsquery for queries that are not filtered.
  • AlertMessage Events - Currently only NewDeviceMessage subtypes are processed. These events will have type: newdevice.

The following sections list the dictionaries of the possible fields and their values for these events.

dnsfilter Dictionary

The following table lists the possible fields present in an event of type dnsfilter. dnsquery events contain a subset of these fields (e.g. there will never be a ‘filter_type’ field).

Field Name   Type     Possible Values                                      Description
type         String   dnsfilter, dnsquery                                  dnsfilter for filtered queries, dnsquery for unfiltered queries
app          String   pdns                                                 The name of the app, always pdns
user_id      String   Any                                                  The user id of the user
qname        String   Any DNS domain                                       The filtered domain name
device_id    String   Any                                                  The ID of the device that was filtered
device_ip    String   Any v4 or v6 IP Address                              The IP address of the filtered DNS query
filter_type  String   malware, phishing, botnet, blacklist, contentfilter  The type of filtering, only for filtered queries
rule         String   oxp-security-malware                                 There should only ever be one rule that matches an event
tags         Array    Any string                                           Each element in the array is a separate tag. For example: cat:OX-category-porn, foo, bar, rule:oxp-content-blacklist
categories   Array    Any string                                           Each element in the array is a separate category. For example: OXP-category-porn, OX-category-malware
timestamp    Integer  Any integer                                          Represents milliseconds since UNIX Epoch, e.g. 1549469500048

newdevice Dictionary

The following table lists the possible fields present in an event of type newdevice.

Field Name   Type     Possible Values  Description
type         String   newdevice
app          String   pdns             The name of the app, always pdns
user_id      String   Any              The user id of the user
device_id    String   Any              The ID of the device that was detected
device_name  String   Any              The name of the device that was detected
device_type  String   Any              The type of the new device
timestamp    Integer  Any integer      Represents milliseconds since UNIX Epoch, e.g. 1549469500048
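
Because the timestamp field is expressed in milliseconds rather than seconds, consumers need to scale it before converting it to a date. A minimal Python sketch (the function name is hypothetical) that avoids floating-point rounding:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def event_time(timestamp_ms):
    """Convert an event timestamp (milliseconds since the UNIX Epoch) to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=timestamp_ms)


print(event_time(1549469500048).isoformat())
# prints 2019-02-06T16:11:40.048000+00:00
```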

Logstash Schema

The data in the dictionaries described above is sent to ELK (and most output channels) in JSON format.

Here is an example DNS Message:

{
 "app": "pdns",
 "categories": [
                "porn", "gambling", "OXP-platform-facebook"
               ],
 "device_id": "4005ffeeeeddaadddd",
 "device_ip": "2001:db8::1",
 "filter_type": "filter",
 "gambling_tag": "1",
 "porn_tag": "1",
 "OXP-platform-facebook_tag": "1",
 "qname": "min_filter.com",
 "timestamp": 1549465752000,
 "type": "dnsfilter",
 "user_id": "ncook"
}

Here is an example New Device Message:

{
 "app": "pdns",
 "device_id": "4005ffeeeeddaadddd",
 "device_name": "Neil's iPhone",
 "device_type": "Apple iPhone X",
 "timestamp": 1549470440410,
 "type": "newdevice",
 "user_id": "ncook"
}

Here are two example aggregated DNS Messages:

{
 "app": "pdns",
 "categories": [
                "porn", "gambling", "OXP-platform-facebook"
               ],
 "device_id": "4005ffeeeeddaadddd",
 "device_ip": "2001:db8::1",
 "filter_type": "filter",
 "gambling_tag": "1",
 "porn_tag": "1",
 "OXP-platform-facebook_tag": "1",
 "qname": "min_filter.com",
 "timestamp": 1549465752000,
 "type": "dnsfilter",
 "user_id": "ncook",
 "event_count": 1000
}

{
 "app": "pdns",
 "device_id": "4005ffee2938adddd",
 "device_ip": "2001:db8::1",
 "qname": "facebook.com",
 "timestamp": 1549465752000,
 "type": "dnsquery",
 "user_id": "luser",
 "event_count": 100
}

Note that an aggregated event is identical to a normal event except that it contains an event_count field.
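
A consumer of these messages can therefore handle both kinds of event with the same code. The Python sketch below is one way to do this; it assumes that a message without an event_count field stands for exactly one event, and the function name is hypothetical.

```python
import json


def represented_events(raw_json):
    """Return how many original events a received JSON message stands for."""
    event = json.loads(raw_json)
    # Normal events carry no event_count field; treat them as a single event.
    return event.get("event_count", 1)


aggregated = '{"app": "pdns", "type": "dnsquery", "qname": "facebook.com", "event_count": 100}'
normal = '{"app": "pdns", "type": "dnsquery", "qname": "facebook.com"}'
print(represented_events(aggregated))  # prints 100
print(represented_events(normal))      # prints 1
```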