Approval workflows

Approvals are the "ask someone else first" layer. They do not replace policy: a user still has to be allowed to connect at all. Instead, they add a second human to the loop for cases where a policy match alone is not enough. This page covers when to require approval, how to route it, and what the audit artefact looks like on the other side.

When to require approval

A good rule of thumb: require approval when the cost of a bad decision is larger than the cost of a two-minute delay. Some concrete cases that are worth the friction:

  • High-risk targets — production databases with customer PII, domain controllers, payments processors.
  • Off-hours or weekend access to production.
  • Break-glass account use — by construction, always.
  • Four-eyes on destructive admin actions: rotating a CA, unfreezing a retired account, promoting a role.
  • First access to a target by a user who has never connected to it.
  • High-risk signals: a new device, a novel geography, a spike in risk score.

Approval routes

An ApprovalRoute resource says "when an approval is needed, ask these people through these channels, with this SLA and this escalation." A policy that requires approval names a route rather than embedding the details — that way you change approvers once and every policy follows.

apiVersion: wardengate/v1
kind: ApprovalRoute
metadata:
  name: prod-sre-leads
  organization: acme
spec:
  quorum:
    kind: one-of
    approvers:
      - kind: Group
        name: sre-leads
  channels:
    - kind: slack
      target: "#sre-approvals"
      webhookSecret: slack-approvals-hook
    - kind: email
      target: "sre-leads@acme.example"
  sla:
    respondWithin: 10m
    onTimeout: escalate
  escalation:
    quorum:
      kind: one-of
      approvers:
        - kind: Group
          name: sre-oncall
    channels:
      - kind: pagerduty
        serviceKey: sre-oncall-pd-key
    onTimeout: deny
  selfApproval: denied
  audit:
    requireReason: true
    minReasonLength: 20

Quorum shapes

Three quorum shapes cover the real cases:

  • one-of-N — any one approver from the pool says yes. Fast; most routine approvals are this shape.
  • N-of-M — at least N distinct people from a set of M must say yes. Four-eyes is the N=2, M=anyone case.
  • all-of — every named approver must say yes. Rare; usually a signal that the process has too many cooks.
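The three shapes reduce to a small counting rule. The sketch below is illustrative only and is not Wardengate's implementation: the Quorum class, the kind labels "n-of-m" and "all-of", the threshold field, and the quorum_met function are all hypothetical names, and removing the requester from the pool when self-approval is denied is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quorum:
    kind: str                  # "one-of", "n-of-m", or "all-of" (labels assumed)
    approvers: frozenset[str]  # resolved approver pool, as user IDs
    threshold: int = 1         # N; only consulted for "n-of-m"


def quorum_met(quorum: Quorum, approvals: set[str], requester: str,
               self_approval: bool = False) -> bool:
    """Return True once enough distinct, eligible approvers have said yes."""
    # Votes from outside the pool are ignored; sets de-duplicate voters.
    eligible = set(approvals) & quorum.approvers
    pool = set(quorum.approvers)
    if not self_approval:
        # Assumption: with self-approval denied, the requester neither
        # counts as a voter nor as a required member of the pool.
        eligible.discard(requester)
        pool.discard(requester)
    if quorum.kind == "one-of":
        return len(eligible) >= 1
    if quorum.kind == "n-of-m":
        return len(eligible) >= quorum.threshold
    if quorum.kind == "all-of":
        return bool(pool) and eligible == pool
    raise ValueError(f"unknown quorum kind: {quorum.kind}")
```

Under these assumptions, a four-eyes route is a Quorum with kind "n-of-m" and threshold 2, and a requester who is also in the pool cannot contribute one of the two approvals.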

Channels

Wardengate sends the approval prompt to the channel the approvers already live in. Supported channels are Slack, Microsoft Teams, email, PagerDuty, ServiceNow (creates a request record), Jira (creates a ticket), and generic webhook. A single route can have multiple channels; the first one that produces a decision wins, and the prompts on the others are cancelled.

Each prompt carries a one-click link back to the Wardengate console, where the approver sees the full context: the requester, the target, the account they will use, the reason they gave, the risk score, and the current in-flight state. The approve-or-deny decision is made inside the console, not on the third-party side, so it is attested by the audit trail rather than by a Slack message that anyone could reconstruct.

SLAs, escalation, and auto-decline

Every route has a response SLA. When it expires, the route takes one of three actions:

  • deny — the request is rejected. Safe default, but annoying when the approver pool is out on a Friday afternoon.
  • escalate — the request re-fires against a broader approver pool or a different channel (commonly PagerDuty).
  • ask-again — the same pool is pinged once more before anything escalates. Useful when the first prompt is easy to miss.
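The three timeout actions can be read as a small state transition. The following is a minimal sketch, not the product's logic: Stage, handle_timeout, and the reminded flag are hypothetical, and falling back to deny when there is nothing left to escalate to is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stage:
    name: str
    on_timeout: str                        # "deny", "escalate", or "ask-again"
    escalation: Optional["Stage"] = None   # broader pool or different channel
    reminded: bool = False                 # has ask-again already fired once?


def handle_timeout(stage: Stage) -> tuple[str, Optional[Stage]]:
    """Return (outcome, next_stage) when a stage's response SLA expires."""
    if stage.on_timeout == "ask-again" and not stage.reminded:
        stage.reminded = True
        return "pending", stage            # ping the same pool once more
    if stage.on_timeout in ("escalate", "ask-again") and stage.escalation:
        return "pending", stage.escalation # re-fire against the next stage
    # Either on_timeout is "deny", or nothing is left to escalate to (assumed).
    return "denied", None
```

With the prod-sre-leads route shown earlier, the first stage would carry on_timeout "escalate" and point at an sre-oncall stage whose own on_timeout is "deny", so an unanswered request ends as denied after two expiries.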

For incidents, escalate to an on-call rotation rather than a static list. Wardengate can read the current on-call from PagerDuty, Opsgenie, or any source that exposes a simple "who is on call" endpoint.

Attaching a route to a policy

apiVersion: wardengate/v1
kind: Policy
metadata:
  name: prod-db-write
  organization: acme
spec:
  principals: { groups: ["sre"] }
  targets:   { tags: ["env=prod", "tier=db"] }
  protocols: ["postgres"]
  accounts:  ["app_rw"]
  constraints:
    approval:
      route: prod-sre-leads
      window: 30m
      requireReasonMinLength: 40
  session:
    maxDuration: 30m
    recording: { mode: full }

The window is how long the approval stays fresh. A user who waits longer than the window after approval before connecting has to ask again. Short windows make approvals meaningful; long windows save clicks. Thirty minutes is a reasonable floor for interactive work.
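The freshness rule amounts to a timestamp comparison. This is a sketch under assumptions: approval_is_fresh is a hypothetical helper, and treating the end of the window as inclusive is a guess rather than documented behaviour.

```python
from datetime import datetime, timedelta, timezone


def approval_is_fresh(approved_at: datetime, window: timedelta,
                      now: datetime) -> bool:
    """True while a connection attempt at `now` still falls inside the window."""
    # Assumption: the boundary instant itself still counts as fresh.
    return approved_at <= now <= approved_at + window


approved = datetime(2024, 1, 8, 14, 0, tzinfo=timezone.utc)
window = timedelta(minutes=30)  # matches `window: 30m` in the policy above

print(approval_is_fresh(approved, window, approved + timedelta(minutes=29)))  # True
print(approval_is_fresh(approved, window, approved + timedelta(minutes=31)))  # False
```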

Self-approval

Self-approval is denied by default and should remain denied for anything privileged. There are two cases where it is defensible: opt-in "break the glass for yourself" flows where the act of clicking "approve" is logged as a first-person decision, and small-team environments where literally no one else is on the rotation. If you turn it on, require a minimum reason length and a high-severity notification to a secondary channel.

Audit trail

Every approval request and decision is its own audit event, tied by ID to the session that grew out of it. The record carries:

  • The requester, target, account, and reason.
  • The route used, the pool at the time, and which channels were hit.
  • Every approver's decision, their user identity, and their timestamp.
  • The final state and, if denied or timed out, the reason.
  • A link to the session — if one was started — and therefore its recording.

The audit events can be exported to a SIEM alongside session events, so a single query can show "every privileged prod-db session and the approval that authorised it" across an arbitrary time window.
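Once both event streams are exported, the session-to-approval link is a join on the shared ID. The sketch below shows the idea over plain dictionaries; the field names (approval_id, session_id, route, state, target) and the sample records are hypothetical and do not reflect the actual export schema.

```python
def sessions_with_approvals(sessions, approvals):
    """Pair each session event with its approval event, or None if it had none."""
    by_id = {a["approval_id"]: a for a in approvals}
    return [(s, by_id.get(s.get("approval_id"))) for s in sessions]


# Illustrative records only; real exported events will carry more fields.
approvals = [
    {"approval_id": "apr-1", "route": "prod-sre-leads", "state": "approved"},
]
sessions = [
    {"session_id": "ses-9", "approval_id": "apr-1", "target": "prod-db-01"},
    {"session_id": "ses-10", "target": "staging-db-01"},  # no approval attached
]

pairs = sessions_with_approvals(sessions, approvals)
for session, approval in pairs:
    print(session["session_id"], approval["route"] if approval else None)
```

In a SIEM the same join would normally be expressed in the tool's own query language; the Python version is only meant to show which key ties the two record types together.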
