SPOG Architecture Overview

Welcome to the SPOG architecture guide. This document explains the core architectural concepts that make SPOG powerful and flexible for managing distributed PowerDNS infrastructure.

If you want to see the software in action, check out the Quickstart Guide. This guide will help you understand how SPOG components work together and how to customize SPOG for your organization.

What is SPOG?

SPOG (Single Pane of Glass) is a distributed observability and management platform for PowerDNS infrastructure. It provides:

  • Unified Visibility: View and manage multiple DNS clusters from a single interface
  • Flexible Organization: Use labels to organize clusters by region, role, environment, or any other custom dimension
  • Secure Access Control: Fine-grained authorization policies that adapt to your organizational structure
  • Real-Time Monitoring: Live cluster state and health information
  • Customizable Dashboards: Create views tailored to different teams and use cases

Architecture Concepts

SPOG's architecture is built around six core concepts that work together to provide flexible, secure multi-cluster management:

  1. Hub-and-Spoke Architecture: Centralized control plane with distributed user planes
  2. Label-Based Taxonomy: Flexible cluster organization using key-value labels
  3. Filter Query Language: SQL-like syntax for selecting and grouping clusters
  4. REGO-Based Authorization: Policy-driven access control
  5. Dashboards & Playlists: Customizable views of your infrastructure
  6. Navigation: Menu structure that adapts to your organization

Let's explore each concept in detail.


1. Hub-and-Spoke Architecture

SPOG uses a hub-and-spoke model where a central control plane (hub) connects to multiple user planes (spokes) via NATS messaging.

Control Plane (Hub)

In our example the control plane runs in the controlplane namespace and provides:

  • Glass UI: Web interface for viewing and managing clusters
  • Authentication Services: Static, LDAP, or OIDC authentication
  • Policy Engine: REGO-based authorization
  • Middleware Services: Request routing and cluster discovery
  • Global Configuration: Dashboard and navigation definitions

User Planes (Spokes)

Each PowerDNS cluster has a user plane enhanced via Glass Instrumentation. The user plane:

  • Announces itself to the control plane via NATS
  • Reports cluster state (pods, services, readiness)
  • Executes operations (DNS queries, cache clearing, pod management)
  • Streams logs from cluster services

User planes identify themselves with unique labels that define their organizational role:

YAML
# Example user plane cluster
clusterId: "pdns-us-east-prod"
labels:
  region: "us-east"          # Geographic location
  role: "authoritative"      # DNS service type
  environment: "production"  # Deployment stage
  team: "dns-operations"     # Organizational ownership
  tier: "critical"           # Service criticality

NATS Leaf-Node Connectivity

The control and user planes communicate via NATS (a lightweight, high-performance messaging system) using a leaf-node topology. NATS provides reliable message delivery, service discovery, and request-reply patterns that enable real-time communication between the control plane and distributed user plane clusters.

Network topology diagram: The control plane hub at the top contains the NATS server and Glass UI services. Three user plane clusters connect from below via leaf-node links, representing geographically distributed DNS clusters (US East, US West, EU West).

Key Benefits:

  • Centralized Management: Single UI for all clusters
  • Distributed Execution: Operations run locally in each cluster
  • Resilient Communication: NATS provides reliable message delivery
  • Scalable: Add clusters by deploying Glass Instrumentation
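The hub-and-spoke relationship can be modeled in a few lines. This is only an in-memory sketch: the `ControlPlane` and `UserPlane` classes and their methods are hypothetical names invented for illustration, and a plain dictionary stands in for NATS-based discovery and request-reply.

```python
class UserPlane:
    """Stand-in for a cluster running Glass Instrumentation (hypothetical class)."""

    def __init__(self, cluster_id, labels):
        self.cluster_id = cluster_id
        self.labels = labels

    def handle(self, operation):
        # Distributed execution: the work happens inside the cluster itself.
        return f"{self.cluster_id}: executed {operation}"


class ControlPlane:
    """Stand-in for the hub; a dict replaces NATS-based discovery."""

    def __init__(self):
        self.registry = {}

    def announce(self, user_plane):
        # A user plane announces itself together with its labels.
        self.registry[user_plane.cluster_id] = user_plane

    def request(self, cluster_id, operation):
        # Request-reply: route the operation to the owning user plane.
        return self.registry[cluster_id].handle(operation)


hub = ControlPlane()
hub.announce(UserPlane("pdns-us-east-prod", {"region": "us-east"}))
hub.announce(UserPlane("pdns-eu-west-prod", {"region": "eu-west"}))
print(hub.request("pdns-us-east-prod", "clear-cache"))
```

Adding a cluster in this model is a single `announce` call, which mirrors the scalability point above: the hub needs no per-cluster configuration.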

2. Label-Based Taxonomy

Labels are the foundation of SPOG's flexible organization. Every cluster has labels that describe its identity and role in your infrastructure.

What Are Labels?

Labels are key-value pairs attached to clusters. They enable:

  • Multi-dimensional organization: Region, role, environment, team, customer, tier
  • Flexible filtering: Show clusters based on any label combination
  • Authorization boundaries: Grant access based on label matching
  • Custom taxonomies: Define labels that match your organization

Labels Are Completely Arbitrary

The label keys and values shown in these examples (region, role, environment, etc.) are illustrative only. You can define any labels that match your organizational structure and operational needs. SPOG doesn't enforce any specific label schema - your taxonomy should reflect how you think about and organize your infrastructure.

Example Label Configuration

YAML
# Example: Label-Based Cluster Organization

# User Plane Cluster 1: US East Production Authoritative
clusterId: "pdns-us-east-prod"
labels:
  region: "us-east"          # Geographic location
  role: "authoritative"      # DNS service type
  environment: "production"  # Deployment stage

# User Plane Cluster 2: US West Production Authoritative
# clusterId: "pdns-us-west-prod"
# labels:
#   region: "us-west"
#   role: "authoritative"
#   environment: "production"

# User Plane Cluster 3: EU West Production Recursor
# clusterId: "pdns-eu-west-prod"
# labels:
#   region: "eu-west"
#   role: "recursor"
#   environment: "production"

Alternative Schemas:

Beyond regional organization, you can structure labels to match different operational models:

Team-Based Organization:

Organize clusters by the teams that own and operate them. This approach works well when different teams manage separate DNS infrastructure, have distinct operational responsibilities, or need isolated access boundaries.

YAML
labels:
  team: "dns-operations"    # Organizational ownership
  service: "authoritative"  # Service type
  criticality: "high"       # Service criticality level

Use this when: Teams have clear ownership boundaries, different SLAs apply to different services, or you want to delegate operational control to specific groups.

Customer Multi-Tenancy:

Organize clusters by customer or tenant when running DNS as a service for multiple organizations. Each customer gets dedicated clusters with appropriate service tiers and regional deployment.

YAML
labels:
  customer: "acme-corp"     # Customer identifier
  region: "us-east"         # Customer's preferred region
  tier: "premium"           # Service tier (SLA, features)

Use this when: Running DNS infrastructure for multiple customers, implementing service tiers with different SLAs, or isolating customer infrastructure for security/compliance.

Label Best Practices

  1. Be Consistent: Use the same label keys across all clusters
  2. Use Lowercase: Label values are case-sensitive; lowercase avoids confusion
  3. Avoid Spaces: Use hyphens (us-east) or underscores (us_east)
  4. Plan Ahead: Consider future organizational changes when choosing labels
  5. Document: Maintain a label taxonomy document for your team
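The first three practices are easy to check mechanically before labels reach a cluster. The sketch below is a hypothetical helper, not part of SPOG: the function name `lint_labels` and the exact value pattern (lowercase letters and digits separated by hyphens or underscores) are assumptions chosen for illustration.

```python
import re

# Assumed convention: lowercase alphanumerics joined by "-" or "_".
LABEL_VALUE_PATTERN = re.compile(r"^[a-z0-9]+(?:[-_][a-z0-9]+)*$")


def lint_labels(clusters):
    """Return a list of readable problems found in the clusters' labels.

    `clusters` maps a cluster ID to its label dictionary.
    """
    problems = []
    # Practice 1: every cluster should carry the same set of label keys.
    all_keys = set()
    for labels in clusters.values():
        all_keys.update(labels)
    for cluster_id, labels in sorted(clusters.items()):
        for key in sorted(all_keys - set(labels)):
            problems.append(f"{cluster_id}: missing label key '{key}'")
        # Practices 2 and 3: lowercase values without spaces.
        for key, value in sorted(labels.items()):
            if not LABEL_VALUE_PATTERN.match(value):
                problems.append(f"{cluster_id}: label {key}='{value}' breaks the naming convention")
    return problems


clusters = {
    "pdns-us-east-prod": {"region": "us-east", "role": "authoritative", "environment": "production"},
    "pdns-eu-west-prod": {"region": "EU West", "role": "recursor"},
}
for problem in lint_labels(clusters):
    print(problem)
```

Running a check like this in CI is one way to keep a taxonomy consistent as the number of clusters grows.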

3. Filter Query Language

SPOG uses a SQL-like filter query language to select and organize clusters based on labels.

Basic Syntax

Filter queries use label names and values with familiar operators:

Text Only
region = "us-east"                          # Single value match
role in ("authoritative", "recursor")       # Multiple values
region = "us-east" and environment = "production"  # Combined conditions
region = "us-east" group by role            # Group results by role label

Filter Operators

| Operator | Description     | Example                                                 |
|----------|-----------------|---------------------------------------------------------|
| =        | Exact match     | region = "us-east"                                      |
| !=       | Not equal       | environment != "development"                            |
| in       | Match any value | region in ("us-east", "us-west")                        |
| and      | Logical AND     | region = "us-east" and role = "authoritative"           |
| or       | Logical OR      | environment = "staging" or environment = "development"  |
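To make the operator semantics concrete, the following sketch evaluates an already-parsed filter against cluster labels. It does not reproduce SPOG's parser: the tuple-based expression format, the `matches` and `select` functions, and the treatment of a missing label (it never equals a value, so `!=` is true) are all assumptions made for illustration.

```python
CLUSTERS = {
    "pdns-us-east-prod": {"region": "us-east", "role": "authoritative", "environment": "production"},
    "pdns-us-west-prod": {"region": "us-west", "role": "authoritative", "environment": "production"},
    "pdns-eu-west-prod": {"region": "eu-west", "role": "recursor", "environment": "production"},
}


def matches(expr, labels):
    """Evaluate a pre-parsed filter expression against one cluster's labels."""
    op = expr[0]
    if op == "=":
        _, key, value = expr
        return labels.get(key) == value
    if op == "!=":
        _, key, value = expr
        return labels.get(key) != value
    if op == "in":
        _, key, values = expr
        return labels.get(key) in values
    if op == "and":
        return all(matches(sub, labels) for sub in expr[1:])
    if op == "or":
        return any(matches(sub, labels) for sub in expr[1:])
    raise ValueError(f"unknown operator: {op!r}")


def select(expr, clusters=CLUSTERS):
    """Return the sorted IDs of all clusters whose labels satisfy the filter."""
    return sorted(cid for cid, labels in clusters.items() if matches(expr, labels))


# Equivalent of: role = "authoritative" and region in ("us-east", "us-west")
query = ("and", ("=", "role", "authoritative"), ("in", "region", ("us-east", "us-west")))
print(select(query))
```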

Grouping

Use group by to organize results hierarchically:

Text Only
group by region                          # Single level grouping
group by region, role                    # Multi-level hierarchy
group by region, role, environment       # Three-level organization

Grouping organizes clusters into hierarchical structures based on label values. The system creates parent nodes for each unique label value, with matching clusters nested underneath.
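The parent-node structure described above can be sketched as a nested dictionary. The `group_by` function below is illustrative only; in particular, placing clusters that lack a label under an "(unset)" node is an assumption, not documented SPOG behavior.

```python
CLUSTERS = {
    "pdns-us-east-prod": {"region": "us-east", "role": "authoritative"},
    "pdns-us-west-prod": {"region": "us-west", "role": "authoritative"},
    "pdns-eu-west-prod": {"region": "eu-west", "role": "recursor"},
}


def group_by(clusters, keys):
    """Nest cluster IDs under one level per label key, in the order given."""
    if not keys:
        # Leaf level: the matching clusters themselves.
        return sorted(clusters)
    first, rest = keys[0], keys[1:]
    buckets = {}
    for cluster_id, labels in clusters.items():
        # One parent node per unique label value.
        buckets.setdefault(labels.get(first, "(unset)"), {})[cluster_id] = labels
    return {value: group_by(members, rest) for value, members in sorted(buckets.items())}


print(group_by(CLUSTERS, ["role"]))
print(group_by(CLUSTERS, ["region", "role"]))
```

Each extra key in `group by` adds one level of nesting, which is why `group by region, role` yields a region → role → clusters hierarchy.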

Filter Examples

Show all production authoritative servers:

Text Only
role = "authoritative" and environment = "production"

US regions grouped by service type:

Text Only
region in ("us-east", "us-west") group by role

Critical tier clusters by region and environment:

Text Only
tier = "critical" group by region, environment

4. REGO-Based Authorization

SPOG uses REGO (the Rego policy language from the Open Policy Agent project) to implement flexible, policy-driven authorization. Policies control who can see and manage which clusters by matching user attributes against cluster labels.

How Authorization Works

SPOG receives user attributes from your authentication provider and makes them available to REGO policies. These attributes can include:

  • RBAC Roles: Traditional role assignments like admin, operator, observer
  • ABAC Attributes: User properties like department, cost-center, clearance-level
  • Groups: Group memberships such as dns-team, americas-ops, security-reviewers
  • OIDC Scopes: OAuth scopes like read:clusters, write:production, manage:dns

Three Permission Layers

REGO policies provide three layers of access control: cluster visibility, read permissions, and write permissions. You define helper rules in user.rego that implement your organization's matching logic, then use those rules in permission checks.

Permissions Are Cluster-Specific

All permissions in SPOG are evaluated in the context of a specific cluster. REGO policies have access to both user attributes (input.user) and cluster labels (input.cluster.labels), allowing you to grant or deny permissions based on cluster-specific properties. For example, a user might have permission to restart instances in staging environments but not in production.

1. Cluster Visibility - Which clusters appear in the UI for this user?

Rego
# In pdns_permissions.rego
can_see_cluster if {
  user.has_matching_region        # User attribute matches cluster label
  user.has_matching_cluster_role  # User attribute matches cluster label
  user.has_matching_environment   # User attribute matches cluster label
}

2. Read Permissions - Can the user view detailed information and logs?

Rego
read_logs if user.can_observe_cluster  # Requires specific role + cluster access

3. Write Permissions - Can the user perform operations?

Rego
clear_cache if user.can_manage_dns_content         # Requires "content-manager"
restart_instance_set if user.can_manage_instances  # Requires "operator"

Each permission check receives both input.user (user attributes and roles) and input.cluster (cluster ID and labels), allowing fine-grained control based on cluster properties like environment, region, or customer.

Example: Region-Based Access Control

Here's a concrete example showing how user attributes map to cluster access:

User Definition:

YAML
staticUsers:
  regional-operator:
    roles:
      - "americas"        # Maps to US regions
      - "production"      # Environment access
      - "observer"        # Permission level

User Helper (in user.rego):

Rego
# Map role "americas" to specific region labels
has_matching_region if {
  "americas" in input.user.roles
  # The region label is a single string, so test membership directly
  input.cluster.labels.region in ["us-east", "us-west"]
}

Permission Check (in pdns_permissions.rego):

Rego
can_see_cluster if {
  user.has_matching_region
  user.has_matching_environment  # Similar helper, not shown for brevity
}

Result: User regional-operator sees clusters labeled region: "us-east" or region: "us-west" with environment: "production", but not EU clusters or development environments.
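The same decision can be traced outside of REGO to check the expected result. The Python sketch below mirrors the helper rules rather than replacing them. Because the environment helper is not shown above, its logic here (a role named after an environment grants access to that environment) is an assumption.

```python
AMERICAS_REGIONS = {"us-east", "us-west"}


def has_matching_region(user, cluster):
    # Mirrors the REGO helper: role "americas" maps to the US region labels.
    return "americas" in user["roles"] and cluster["labels"].get("region") in AMERICAS_REGIONS


def has_matching_environment(user, cluster):
    # Assumption: a role named after an environment grants that environment.
    return cluster["labels"].get("environment") in user["roles"]


def can_see_cluster(user, cluster):
    return has_matching_region(user, cluster) and has_matching_environment(user, cluster)


regional_operator = {"roles": ["americas", "production", "observer"]}
clusters = [
    {"id": "pdns-us-east-prod", "labels": {"region": "us-east", "environment": "production"}},
    {"id": "pdns-eu-west-prod", "labels": {"region": "eu-west", "environment": "production"}},
    {"id": "pdns-us-east-dev", "labels": {"region": "us-east", "environment": "development"}},
]
visible = [c["id"] for c in clusters if can_see_cluster(regional_operator, c)]
print(visible)
```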

Authorization Rules Are Arbitrary

The helper rules (user.has_matching_region, user.can_observe_cluster, etc.) are completely arbitrary - you define them in user.rego to match your organization's authorization model. SPOG only requires that REGO policies provide the final permission decisions (can_see_cluster, read_logs, etc.). How you implement those decisions is entirely up to you.

Learn More About REGO Policies

For complete examples, detailed policy writing, OIDC integration, and advanced authorization patterns, see the Authentication & Authorization guide.


5. Dashboards & Playlists

Dashboards provide customized views of your infrastructure using filters and visualization widgets.

Default Dashboard

SPOG provides a global overview dashboard at / out-of-the-box, featuring tree-table, heatmap, pie-charts, and cytoscape widgets. Custom dashboards can be added for organization-specific needs like regional views, service type monitoring, or tier-based dashboards.

Custom Dashboards

Custom dashboards are configured in globalConfig.dashboards with filtered views for different audiences. The example below shows a tier-based dashboard that extends the defaults:

YAML
# Example: Custom Dashboards
#
# The chart provides a default global overview dashboard at "/" with
# tree-table, heatmap, pie-charts, and cytoscape widgets.
#
# This example shows how to add custom dashboards for organization-specific
# views like tier-based monitoring, regional views, or customer dashboards.
#
# To disable the default dashboard, set globalConfig.defaults.enabled: false

globalConfig:
  dashboards:
    # Example 1: Tier-based dashboard (not in defaults)
    # Shows only critical infrastructure across all regions
    critical-infrastructure:
      title: "Critical Infrastructure"
      description: "All critical-tier clusters requiring priority attention"
      url: /critical
      requires:
        - "dashboard_critical_infrastructure"
      graphs:
        - widget: "cc-state-tree-table"
          title: "Critical Systems"
          args:
            filter: 'tier = "critical" group by region'

        - widget: "cc-state-readiness-heatmap"
          title: "Critical Health Status"
          args:
            filter: 'tier = "critical" group by region'

    # Example 2: URL parameter dashboard (demonstrates :parameter syntax)
    # Single dashboard definition serves multiple URLs via parameters
    cluster-by-tier:
      title: ":tier Tier Clusters"
      description: "All clusters in :tier tier"
      url: /tier/:tier
      requires:
        - "dashboard_tier_{{tier}}"
      graphs:
        - widget: "cc-state-tree-table"
          title: ":tier Infrastructure"
          args:
            filter: 'tier = ":tier" group by region, role'

        - widget: "cc-state-readiness-pie-charts"
          title: ":tier Service Distribution"
          args:
            filter: 'tier = ":tier" group by role'

    # Example 3: Combined multi-label filter
    # Demonstrates complex filtering with multiple conditions
    production-critical:
      title: "Production Critical Systems"
      description: "Critical tier in production environment"
      url: /production-critical
      requires:
        - "dashboard_production_critical"
      graphs:
        - widget: "cc-state-cytoscape"
          title: "Critical Production Topology"
          args:
            filter: 'environment = "production" and tier = "critical"'
            depth: "product"
            layout: "Hierarchical"

# These dashboards complement the defaults by providing:
# - Tier-based organization (critical, standard, experimental)
# - URL parameter patterns for dynamic filtering
# - Complex multi-label filtering examples
#
# Navigation items for these dashboards should be added in navigation config.

Dashboard Components:

  • Title: Display name shown in UI
  • Description: Brief explanation of dashboard purpose
  • URL: Path to access this dashboard (used later when building the navigation menu)
  • Graphs: Array of visualization widgets

URL Parameters:

The second example (cluster-by-tier) demonstrates URL parameters using :parameter syntax. A single dashboard definition with URL /tier/:tier serves multiple URLs like /tier/critical, /tier/standard, etc. The :tier parameter can be referenced in titles, descriptions, and filters to create dynamic, reusable dashboards.
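As a rough model of how one definition can serve several URLs, the sketch below matches a request path against a `:parameter` pattern and substitutes the bound values into a title or filter. The function names `match_url` and `substitute` are hypothetical, and SPOG's actual routing may differ.

```python
def match_url(pattern, path):
    """Return the parameters bound by `pattern` for `path`, or None if it does not match."""
    pattern_parts = pattern.strip("/").split("/")
    path_parts = path.strip("/").split("/")
    if len(pattern_parts) != len(path_parts):
        return None
    params = {}
    for expected, actual in zip(pattern_parts, path_parts):
        if expected.startswith(":"):
            # A ":name" segment captures whatever the path holds at that position.
            params[expected[1:]] = actual
        elif expected != actual:
            return None
    return params


def substitute(text, params):
    """Replace every ":name" placeholder in `text` with its bound value."""
    # Longer names first, so ":tier" never clobbers a placeholder such as ":tiername".
    for name in sorted(params, key=len, reverse=True):
        text = text.replace(f":{name}", params[name])
    return text


params = match_url("/tier/:tier", "/tier/critical")
print(substitute(":tier Tier Clusters", params))
print(substitute('tier = ":tier" group by region, role', params))
```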

Visualization Widgets

SPOG provides several widget types for displaying cluster state:

| Widget                        | Description                | Use Case                     |
|-------------------------------|----------------------------|------------------------------|
| cc-state-tree-table           | Hierarchical tree view     | Detailed cluster exploration |
| cc-state-readiness-heatmap    | Color-coded readiness grid | At-a-glance health status    |
| cc-state-readiness-pie-charts | Readiness distribution     | Service availability metrics |
| cc-state-cytoscape            | Network topology graph     | Relationship visualization   |

Common Widget Arguments:

All widgets accept a filter argument to select which clusters to display and how to group them.

Cytoscape-Specific Arguments:

The cytoscape widget additionally accepts:

  • depth: Controls how deep into the cluster hierarchy to display. Valid values: "cluster", "product", "instance-set", "pod", "container"

  • layout: Determines the graph rendering algorithm. Valid values: "Radial", "Hierarchical"

Widget Configuration:

A dashboard can contain multiple widgets, allowing you to view the same data in different ways. The UI provides controls to switch between widgets within a dashboard.

YAML
graphs:
  - widget: "cc-state-tree-table"
    title: "Cluster Infrastructure"
    args:
      filter: 'region = "us-east" and environment = "production"'

  - widget: "cc-state-readiness-heatmap"
    title: "Regional Health"
    args:
      filter: 'region = "us-east" group by role'

  - widget: "cc-state-cytoscape"
    title: "Network Topology"
    args:
      filter: 'region = "us-east"'
      depth: "product"      # Detail level
      layout: "Hierarchical"  # Layout algorithm

Grouping in Dashboards

Use group by to organize visualizations. Different widgets respond to grouping in different ways:

  • Tree tables create expandable hierarchies based on group values
  • Heatmaps use grouping to define their axes (rows and columns)
  • Pie charts create one separate chart per group value
  • Cytoscape graphs create group nodes that visually organize clusters in the network topology

YAML
# Single-level grouping
filter: 'group by region'
# Result: Clusters organized by region

# Multi-level grouping
filter: 'group by region, role'
# Result: Hierarchical organization (region → role → clusters)

Playlists

Playlists automatically rotate through multiple dashboard screens, making them ideal for NOC displays, monitoring stations, and status boards.

What are playlists? Think of playlists as automated slideshows of your infrastructure dashboards. Each screen displays a different view (widget) of your clusters, and SPOG cycles through them at a configured interval. Playlists support two types of screens: static screens that show a single widget, and dynamic screens that automatically generate multiple screens based on query results.

Why use playlists? They eliminate manual dashboard switching for teams monitoring multiple cluster dimensions. Operations centers can display a single rotating view that covers all critical infrastructure.

Dynamic screen generation: Playlists can automatically create multiple screens using the dynamic configuration with template variables. For example, query: 'group by region' with three regions generates three separate screens—one for each region—without manually defining each:

YAML
screens:
  - dynamic:
      query: 'group by region'  # Finds all unique regions
      templates:
        - title: "{{region}} Health"
          widget: "cc-state-readiness-heatmap"
          args:
            filter: 'region = "{{region}}"'
        # Automatically creates:
        # Screen 1: "us-east Health" showing us-east clusters
        # Screen 2: "us-west Health" showing us-west clusters
        # Screen 3: "eu-west Health" showing eu-west clusters

This saves configuration effort and ensures new regions appear automatically without updating the playlist definition.
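The expansion step can be pictured as a loop over the unique values of the grouped label. The sketch below is an illustration under assumptions: `expand_dynamic` is a hypothetical function, the flattened screen dictionaries are a simplification of the YAML structure, and screens are emitted in the order values are first seen.

```python
CLUSTERS = {
    "pdns-us-east-prod": {"region": "us-east"},
    "pdns-us-west-prod": {"region": "us-west"},
    "pdns-eu-west-prod": {"region": "eu-west"},
    "pdns-us-east-staging": {"region": "us-east"},
}


def expand_dynamic(group_key, templates, clusters):
    """Generate one screen per template for every unique value of `group_key`."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    values = list(dict.fromkeys(
        labels[group_key] for labels in clusters.values() if group_key in labels
    ))
    placeholder = "{{" + group_key + "}}"
    screens = []
    for value in values:
        for template in templates:
            screens.append({
                "title": template["title"].replace(placeholder, value),
                "widget": template["widget"],
                "filter": template["filter"].replace(placeholder, value),
            })
    return screens


templates = [{
    "title": "{{region}} Health",
    "widget": "cc-state-readiness-heatmap",
    "filter": 'region = "{{region}}"',
}]
screens = expand_dynamic("region", templates, CLUSTERS)
for screen in screens:
    print(screen["title"], "->", screen["filter"])
```

Because the values are read from the clusters at expansion time, a cluster announcing a new region would add a screen without any change to the templates.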

Key playlist features:

  • Automatic rotation: Configure interval from 5 seconds to several minutes
  • Dynamic screen generation: Use dynamic with templates to create screens per group value or cluster
  • Static screens: Use graph for fixed widget displays
  • Animated backgrounds: Optional visual interest for public displays
  • Multiple widget types: Mix tree tables, heatmaps, pie charts, and topology graphs
  • URL-based access: Direct link for dedicated displays

See the Dashboards and Playlists Guide for complete configuration examples and advanced features.

Basic Playlist Structure:

YAML
globalConfig:
  playlists:
    noc-rotation:
      name: "NOC 24/7 Operations"
      url: "/playlist/noc"
      interval: "15s"  # Time per screen
      animatedBackground: true
      screens:
        - graph:  # Static screen
            title: "Global Overview"
            widget: "cc-state-readiness-heatmap"
            args:
              filter: ""
        - dynamic:  # Dynamic screens (one per region)
            query: "group by region"
            templates:
              - title: "{{region}} Status"
                widget: "cc-state-tree-table"
                args:
                  filter: 'region = "{{region}}"'

6. Navigation

Navigation menus provide structured access to dashboards and operations. The navigation structure adapts to your organizational hierarchy.

Default Navigation

SPOG provides default Dashboards and Actions menus out-of-the-box. The Actions menu covers cache management, DNS operations, and AI assistant actions.

Custom menus are added before the default menus, so you only need to configure navigation for your own dashboards and organization-specific needs.

Custom Navigation

Custom navigation is configured in globalConfig.navigation. The example below adds a custom "Priority" menu for tier-based dashboards:

YAML
# Example: Extending Default Navigation
globalConfig:
  navigation:
    menus:
      # Example: Custom menu for tier-based monitoring
      # This complements the default regional/service menus
      - name: "Priority"
        sections:
          - name: "By Tier"
            items:
              - name: "Critical Infrastructure"
                url: "/critical"
                description: "All critical-tier systems"
              - name: "Standard Tier"
                url: "/tier/standard"
                description: "Standard-tier clusters"
              - name: "Experimental"
                url: "/tier/experimental"
                description: "Experimental clusters"

          - name: "Combined Views"
            items:
              - name: "Production Critical"
                url: "/production-critical"
                description: "Critical systems in production"

# Navigation Structure:
# - Menu: Top-level dropdown (e.g., "Priority")
# - Section: Group within menu (e.g., "By Tier")
# - Item: Clickable link (e.g., "Critical Infrastructure")
#
# With this config, the full navigation will be:
# [Priority] [Dashboards] [Actions]
#     ^          ^           ^
#   custom    default     default
#
# URLs must match dashboard url fields exactly.

Navigation Best Practices:

Listed by priority, from most important to least:

  1. Match URLs (Required): Each navigation URL must exactly match a dashboard url field; otherwise the link will break
  2. Mirror organizational structure: Align menus with team/regional boundaries so users find their resources intuitively
  3. Limit top-level menus: Keep to 3-5 menus for manageable navigation; too many tabs overwhelm users
  4. Group related items: Put similar dashboards in the same section to reduce cognitive load
  5. Use clear names: Keep menu names short (1-2 words) for clean UI and easy scanning

Permission-Based Visibility

Navigation items automatically hide when users lack access to the target dashboard. Dashboards can specify requires flags that REGO policies must return as true:

YAML
dashboards:
  production-critical:
    title: "Critical Production Systems"
    url: "/production-critical"
    requires:
      - "production_critical_access"  # REGO flag checked
    graphs: [...]

navigation:
  menus:
    - name: "Production"
      sections:
        - name: "Critical Systems"
          items:
            - name: "Production Critical"
              url: "/production-critical"  # Hidden if user lacks permission

Permission Flag Names Are Arbitrary

The flag names in requires (like "production_critical_access") are completely arbitrary - you define them in your REGO policies to match your authorization model. SPOG checks whether your REGO policies return true for each flag. The flag names should match the permission rules you define in your pdns_global_flags.rego file.
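The hiding behavior amounts to a set comparison between the flags a dashboard requires and the flags the policies granted. The sketch below assumes the flags have already been evaluated into a set. The function `visible_items` is hypothetical, and keeping items that do not point at any dashboard is an assumption.

```python
def visible_items(menus, dashboards, granted_flags):
    """Return the names of navigation items the user is allowed to see."""
    # Index the flags each dashboard requires by its URL.
    required = {d["url"]: set(d.get("requires", [])) for d in dashboards.values()}
    names = []
    for menu in menus:
        for section in menu["sections"]:
            for item in section["items"]:
                # Items without a matching dashboard are kept (an assumption).
                if required.get(item["url"], set()) <= granted_flags:
                    names.append(item["name"])
    return names


dashboards = {
    "production-critical": {
        "url": "/production-critical",
        "requires": ["production_critical_access"],
    },
}
menus = [{
    "name": "Production",
    "sections": [{
        "name": "Critical Systems",
        "items": [{"name": "Production Critical", "url": "/production-critical"}],
    }],
}]
print(visible_items(menus, dashboards, {"production_critical_access"}))
print(visible_items(menus, dashboards, set()))
```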


Putting It All Together

Let's see how all these concepts work together in a complete example.

Scenario: US Regional Operator Access

Imagine a DNS operations engineer responsible for managing production authoritative DNS servers in US regions. They need visibility into cluster health and logs but should not be able to restart production instances. This scenario demonstrates how SPOG's architecture components work together to provide appropriate access.

User: us-operator logs into Glass UI

User Configuration:

YAML
staticUsers:
  us-operator:
    roles:
      - "americas"           # Regional: US regions
      - "authoritative"      # Cluster type
      - "production"         # Environment access
      - "observer"           # Can read logs
      - "operator-non-prod"  # Can manage non-production

User Plane Cluster:

YAML
clusterId: "pdns-us-east-prod"
labels:
  region: "us-east"
  role: "authoritative"
  environment: "production"

Dashboard Configuration:

YAML
dashboards:
  us-east-operations:
    title: "US East Operations"
    url: "/us-east"
    graphs:
      - widget: "cc-state-tree-table"
        args:
          filter: 'region = "us-east"'

Navigation:

YAML
navigation:
  menus:
    - name: "Regional Views"
      sections:
        - name: "Americas"
          items:
            - name: "US East"
              url: "/us-east"

User Experience Flow

  1. Login: User authenticates as us-operator
  2. Navigation: Sees "Regional Views" → "Americas" → "US East" menu
  3. Dashboard: Clicks "US East", sees dashboard filtered to region = "us-east"
  4. Cluster View: Sees cluster pdns-us-east-prod in the tree table
  5. Authorization Check: REGO evaluates access
       • ✓ Can see cluster (region/role/environment match)
       • ✓ Can read logs (has "observer" role)
       • ✗ Cannot restart instances (production environment + operator-non-prod)
  6. Operations: Can view state and logs, but cannot perform destructive operations

Authorization Decision Tree

The diagram shows the complete authorization flow with two scenarios:

Authorization Steps:

  1. User attempts operation - User clicks button or calls endpoint
  2. Check cluster visibility - Can user see this cluster? (checks can_see_cluster policy)
  3. Check permission - Does user have the required permission? (checks permission policies like read_logs or restart_instance_set)
  4. Allow or deny - Grant access or return 403 error

Endpoint-to-Permission Mappings Are Embedded

The mapping from endpoints to permissions (e.g., which operation requires which permission flag) is built into SPOG. When configuring authorization, you only need to define the permission policies and flags in your REGO files. The system automatically maps UI buttons and API endpoints to the appropriate permission checks.

Example Scenarios:

  • Allowed Operation (Reading Logs): User needs read_logs permission → user has "observer" role → permission granted
  • Denied Operation (Restarting Instance Sets): User needs restart_instance_set permission → user only has "operator-non-prod" role and cluster is production → permission denied
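Both scenarios can be replayed with a small model of the authorization steps. This is a sketch, not SPOG's policy engine: the Python functions stand in for REGO rules, and the role-to-permission logic (for example, that "operator-non-prod" only applies outside production) is inferred from the scenario and should be read as an assumption.

```python
AMERICAS = {"us-east", "us-west"}


def can_see_cluster(user, cluster):
    labels = cluster["labels"]
    return ("americas" in user["roles"] and labels["region"] in AMERICAS
            and labels["role"] in user["roles"]
            and labels["environment"] in user["roles"])


def read_logs(user, cluster):
    return "observer" in user["roles"]


def restart_instance_set(user, cluster):
    if "operator" in user["roles"]:
        return True
    # Assumed meaning of the role: operator rights outside production only.
    return ("operator-non-prod" in user["roles"]
            and cluster["labels"]["environment"] != "production")


POLICIES = {"read_logs": read_logs, "restart_instance_set": restart_instance_set}


def authorize(user, cluster, permission):
    """Return an HTTP-style status: 200 when allowed, 403 when denied."""
    # Step 2: the cluster must be visible to the user at all.
    if not can_see_cluster(user, cluster):
        return 403
    # Step 3: the specific permission policy must also pass.
    if not POLICIES[permission](user, cluster):
        return 403
    return 200


us_operator = {"roles": ["americas", "authoritative", "production", "observer", "operator-non-prod"]}
cluster = {
    "id": "pdns-us-east-prod",
    "labels": {"region": "us-east", "role": "authoritative", "environment": "production"},
}
print(authorize(us_operator, cluster, "read_logs"))
print(authorize(us_operator, cluster, "restart_instance_set"))
```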

Complete Configuration Example

For a complete working example showing how to extend the defaults with custom tier-based views, the configuration is split across these files:

Labels (applied to each cluster via glass-instrumentation):

YAML
# Example: Label-Based Cluster Organization
# This example demonstrates how to configure cluster labels for organizing
# your PowerDNS infrastructure using flexible taxonomy.
#
# This shows THREE user plane clusters using regional organization.

# User Plane Cluster 1: US East Production Authoritative
clusterId: "pdns-us-east-prod"
labels:
  region: "us-east"          # Geographic location
  role: "authoritative"      # DNS service type
  environment: "production"  # Deployment stage

# User Plane Cluster 2: US West Production Authoritative
# clusterId: "pdns-us-west-prod"
# labels:
#   region: "us-west"
#   role: "authoritative"
#   environment: "production"

# User Plane Cluster 3: EU West Production Recursor
# clusterId: "pdns-eu-west-prod"
# labels:
#   region: "eu-west"
#   role: "recursor"
#   environment: "production"

Custom Dashboards (extend the default dashboards):

YAML
# Example: Custom Dashboards
#
# The chart provides a default global overview dashboard at "/" with
# tree-table, heatmap, pie-charts, and cytoscape widgets.
#
# This example shows how to add custom dashboards for organization-specific
# views like tier-based monitoring, regional views, or customer dashboards.
#
# To disable the default dashboard, set globalConfig.defaults.enabled: false

globalConfig:
  dashboards:
    # Example 1: Tier-based dashboard (not in defaults)
    # Shows only critical infrastructure across all regions
    critical-infrastructure:
      title: "Critical Infrastructure"
      description: "All critical-tier clusters requiring priority attention"
      url: /critical
      requires:
        - "dashboard_critical_infrastructure"
      graphs:
        - widget: "cc-state-tree-table"
          title: "Critical Systems"
          args:
            filter: 'tier = "critical" group by region'

        - widget: "cc-state-readiness-heatmap"
          title: "Critical Health Status"
          args:
            filter: 'tier = "critical" group by region'

    # Example 2: URL parameter dashboard (demonstrates :parameter syntax)
    # Single dashboard definition serves multiple URLs via parameters
    cluster-by-tier:
      title: ":tier Tier Clusters"
      description: "All clusters in :tier tier"
      url: /tier/:tier
      requires:
        - "dashboard_tier_{{tier}}"
      graphs:
        - widget: "cc-state-tree-table"
          title: ":tier Infrastructure"
          args:
            filter: 'tier = ":tier" group by region, role'

        - widget: "cc-state-readiness-pie-charts"
          title: ":tier Service Distribution"
          args:
            filter: 'tier = ":tier" group by role'

    # Example 3: Combined multi-label filter
    # Demonstrates complex filtering with multiple conditions
    production-critical:
      title: "Production Critical Systems"
      description: "Critical tier in production environment"
      url: /production-critical
      requires:
        - "dashboard_production_critical"
      graphs:
        - widget: "cc-state-cytoscape"
          title: "Critical Production Topology"
          args:
            filter: 'environment = "production" and tier = "critical"'
            depth: "product"
            layout: "Hierarchical"

# These dashboards complement the defaults by providing:
# - Tier-based organization (critical, standard, experimental)
# - URL parameter patterns for dynamic filtering
# - Complex multi-label filtering examples
#
# Navigation items for these dashboards should be added in navigation config.

Custom Navigation (appears before default menus):

YAML
# Example: Extending Default Navigation
#
# The chart provides comprehensive default navigation out-of-the-box:
# - Dashboards menu (Overview, Global Views, Regional, Service Type, Environment, Operations)
# - Actions menu (Cache Management, DNS Operations, AI Assistant)
#
# Custom menus defined here appear BEFORE the default menus.
# This example shows how to ADD CUSTOM navigation for organization-specific needs.
#
# To disable default navigation, set globalConfig.defaults.enabled: false

globalConfig:
  navigation:
    menus:
      # Example: Custom menu for tier-based monitoring
      # This complements the default regional/service menus
      - name: "Priority"
        sections:
          - name: "By Tier"
            items:
              - name: "Critical Infrastructure"
                url: "/critical"
                description: "All critical-tier systems"
              - name: "Standard Tier"
                url: "/tier/standard"
                description: "Standard-tier clusters"
              - name: "Experimental"
                url: "/tier/experimental"
                description: "Experimental clusters"

          - name: "Combined Views"
            items:
              - name: "Production Critical"
                url: "/production-critical"
                description: "Critical systems in production"

# Navigation Structure:
# - Menu: Top-level dropdown (e.g., "Priority")
# - Section: Group within menu (e.g., "By Tier")
# - Item: Clickable link (e.g., "Critical Infrastructure")
#
# With this config, the full navigation will be:
# [Priority] [Dashboards] [Actions]
#     ^          ^           ^
#   custom    default     default
#
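# URLs must match a dashboard's url field. For parameterized dashboards, use the
# pattern with the parameter filled in (e.g. /tier/standard for url: /tier/:tier).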
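The header comment of the navigation example mentions `globalConfig.defaults.enabled: false`. A minimal sketch of a values fragment that turns the defaults off and keeps only a custom menu might look like the following. The nesting of `defaults` under `globalConfig` is inferred from that dotted path, and whether the switch affects anything beyond navigation is not covered here, so check the Helm Chart Reference before relying on it.

```yaml
globalConfig:
  defaults:
    enabled: false   # per the comment above: disables the chart's default navigation
  navigation:
    menus:
      - name: "Priority"
        sections:
          - name: "By Tier"
            items:
              - name: "Standard Tier"
                url: "/tier/standard"   # served by the cluster-by-tier dashboard (url: /tier/:tier)
                description: "Standard-tier clusters"
```

With the defaults disabled, the navigation bar would be expected to contain only the custom `[Priority]` menu.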

REGO Policies (required for authorization):

YAML
# Example: REGO Authorization Policies
#
# This example demonstrates how to configure REGO policies for access control.
# It includes dashboard permission flags for the custom tier-based dashboards
# defined in architecture-dashboard.yaml.
#
# The default dashboards have their own permission flags built-in.
# Custom dashboards need custom permission flags defined in pdns_global_flags.rego.

policy:
  # Static users configuration (when OIDC is disabled)
  # Each user has roles that are evaluated by REGO policies
  staticUsers:
    # Global administrator - full access to all clusters
    admin:
      id: "admin"
      password: "secret"
      sub: "admin"
      name: "Global Administrator"
      roles:
        - "admin"
        - "global"
        - "authoritative"
        - "recursor"
        - "production"
        - "staging"
        - "development"

    # Americas regional director - access to US regions
    us-director:
      id: "us-director"
      password: "secret"
      sub: "us-director"
      name: "US Regional Director"
      roles:
        - "americas"          # Regional role
        - "authoritative"     # Cluster role
        - "recursor"          # Cluster role
        - "production"        # Environment
        - "staging"           # Environment
        - "observer"          # Permission role

    # US East operator - can manage non-prod authoritative clusters.
    # Note: the "americas" role covers both us-east and us-west in has_matching_region.
    us-east-operator:
      id: "us-east-operator"
      password: "secret"
      sub: "us-east-operator"
      name: "US East Operator"
      roles:
        - "americas"                # Regional role
        - "authoritative"           # Cluster role
        - "development"             # Environment
        - "staging"                 # Environment
        - "observer"                # Can read logs
        - "operator-non-prod"       # Can manage instances in non-prod

  # REGO policies - authorization logic
  policies:
    # Permission definitions
    pdns_permissions.rego: |
      package pdns_permissions

      import data.user

      # Basic connection allowed for all authenticated users
      connect if true

      # Read permissions - requires matching region, role, and environment
      read if user.can_see_cluster
      read if user.admin

      # Read logs - requires observer role + cluster access
      read_logs if user.can_observe_cluster

      # Clear cache - requires content-manager role + cluster access
      clear_cache if user.can_manage_dns_content
      clear_cache if user.admin

      # Restart instances - requires operator role + cluster access
      restart_instance_set if user.can_manage_instances
      restart_instance_set if user.admin

      # Delete pod - requires operator role + cluster access
      delete_pod if user.can_manage_instances
      delete_pod if user.admin

      # DNS check - requires observer role + cluster access
      dns_check if user.can_observe_cluster

    # User authorization logic
    user.rego: |
      package user

      roles := input.user.roles

      # Role flags
      admin if "admin" in roles
      regional_director if "regional-director" in roles
      operator if "operator" in roles
      viewer if "viewer" in roles

      # Regional access - user has appropriate regional role
      has_matching_region if {
        some region_label in input.cluster.labels.region
        region_label in ["us-east", "us-west"]
        "americas" in roles
      }

      has_matching_region if {
        some region_label in input.cluster.labels.region
        region_label in ["eu-east", "eu-west"]
        "europe" in roles
      }

      has_matching_region if {
        "global" in roles
      }

      # Cluster role access - user role matches cluster role label
      has_matching_cluster_role if {
        some role in roles
        some cluster_role in input.cluster.labels.role
        role == cluster_role
      }

      # Environment access - user has environment role
      has_matching_environment if {
        some env in input.cluster.labels.environment
        env in roles
      }

      # Can see cluster - must match region, role, and environment
      can_see_cluster if {
        has_matching_region
        has_matching_cluster_role
        has_matching_environment
      }

      # Admin bypass - admins can see clusters with just region and role
      can_see_cluster if {
        admin
        has_matching_region
        has_matching_cluster_role
      }

      # Observer permissions - can read logs and check DNS
      can_observe_cluster if {
        can_see_cluster
        "observer" in roles
      }

      can_observe_cluster if {
        admin
      }

      # Content manager permissions - can clear cache
      can_manage_dns_content if {
        can_see_cluster
        "content-manager" in roles
      }

      # Operator permissions - can restart/delete
      can_manage_instances if {
        can_see_cluster
        "operator" in roles
      }

      # Non-prod operator - only in non-production environments
      # (environment is treated as a collection, as in has_matching_environment)
      can_manage_instances if {
        can_see_cluster
        "operator-non-prod" in roles
        not "production" in input.cluster.labels.environment
      }

      can_manage_instances if {
        admin
      }

    # Dashboard permission flags for custom dashboards
    # Default dashboards have built-in permissions; custom dashboards need these
    pdns_global_flags.rego: |
      package pdns_global_flags

      # Critical infrastructure dashboard - all authenticated users
      dashboard_critical_infrastructure if true

      # Production critical dashboard - requires production access
      dashboard_production_critical if "production" in input.user.roles
      dashboard_production_critical if "admin" in input.user.roles

      # Tier-specific dashboards (used with URL parameters)
      # /tier/critical, /tier/standard, /tier/experimental
      dashboard_tier_critical if true
      dashboard_tier_standard if true
      dashboard_tier_experimental if "development" in input.user.roles
      dashboard_tier_experimental if "admin" in input.user.roles

# Authorization Flow:
# 1. User authenticates (static user or OIDC)
# 2. User has roles assigned (in staticUsers or from OIDC claims)
# 3. User attempts action on cluster
# 4. REGO evaluates input.user.roles against input.cluster.labels
# 5. Action allowed if policy rules evaluate to true
#
# Role Types:
# - Regional: "americas", "europe", "apac", "global"
# - Cluster Role: "authoritative", "recursor" (matches cluster label)
# - Environment: "production", "staging", "development"
# - Permission: "admin", "observer", "operator", "operator-non-prod", "content-manager"
#
# Policy Evaluation Context (input):
# - input.user.roles: Array of role strings
# - input.cluster.labels: Map of cluster labels (region, role, environment, etc.)
# - input.operation: The operation being attempted (read, write, etc.)
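As a worked example of the authorization flow, the sketch below shows a policy input for `us-east-operator` attempting `restart_instance_set` on a staging cluster, followed by how the rules in `user.rego` would evaluate. The exact shape of `input` is an assumption: label values are written as lists because the policies iterate over them with `some ... in`, and the `operation` value is illustrative.

```yaml
# Hypothetical policy input, written as YAML for readability
input:
  user:
    roles: ["americas", "authoritative", "development", "staging", "observer", "operator-non-prod"]
  cluster:
    labels:
      region: ["us-east"]
      role: ["authoritative"]
      environment: ["staging"]
  operation: "restart_instance_set"

# How the rules above would evaluate for this input:
#   has_matching_region        -> true  ("us-east" is an Americas region, user has "americas")
#   has_matching_cluster_role  -> true  (user has "authoritative")
#   has_matching_environment   -> true  (user has "staging")
#   can_see_cluster            -> true
#   can_manage_instances       -> true  (via "operator-non-prod"; environment is not production)
#   restart_instance_set       -> true
#
# Same user, but environment: ["production"]:
#   has_matching_environment   -> false (user lacks the "production" role)
#   can_see_cluster            -> false, so restart_instance_set is not granted
```

Tracing one allowed and one denied request like this is a quick way to check that role assignments and cluster labels line up before deploying a policy change.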

What's Next?

Now that you understand SPOG's architecture, you can:

Deep Dive into Specific Areas

  • Dashboards and Playlists Guide: Learn advanced dashboard configurations, dynamic playlists, and widget options
  • Helm Chart Reference: Explore complete configuration options for Glass UI and Glass Instrumentation

Integration Guides

  • OIDC Integration: Configure enterprise authentication
  • Custom Label Taxonomies: Design label schemas for your organization
  • Advanced REGO Policies: Implement complex authorization rules


Summary

SPOG's architecture combines six core concepts to provide flexible, secure multi-cluster management:

  1. Hub-and-Spoke: Centralized control with distributed execution
  2. Labels: Flexible taxonomy for cluster organization
  3. Filters: SQL-like queries for cluster selection
  4. REGO: Policy-driven authorization with three permission layers
  5. Dashboards: Customized views with powerful visualizations
  6. Navigation: Structured access to resources

These concepts work together to create a platform that adapts to your organizational needs while maintaining security and usability.