Compliance and Drift Assessment

Overview in Rudder

Rudder is built to continuously assess drift compared to defined policies, with or without auto-healing.

By auto-healing, we mean that Rudder can optionally enforce the correct configuration continuously over time, correcting any assessed drift so that your configuration converges towards the desired state. This behavior is optional: Rudder can also be set to only report drift, without changing any configuration. The policy mode (enforce or audit) can be configured per node, rule or directive (see the section called “Policy Mode (Audit/Enforce)” for more details).

Rudder is able to adapt to complex processes and does only the minimal work required for the server to converge to the desired state, whatever its starting state was. Rudder works as a GPS would, adapting the path to your destination depending on the path you actually took. This process is much more resilient to changes than a step-by-step, procedural description of the commands to execute.

Compliance and drift from expected configurations are then reported, with the possibility to drill down into non-compliance issues to identify the root problem.

Of course, one can always correct a drift error by hand, by updating the configuration target and changing the policy mode from "audit" to "enforce".

Compliance and drift reporting

Compliance drifts (non-compliances, enforcement errors, repairs) are reported in Rudder by several means:

  • Compliance is reported in aggregated form globally in the dashboard, and by Rule or by Node (see the example for a Rule below).
  • Compliance data is stored in the Rudder compliance database, and each Rule displays a history of changes, as depicted in "Changes history on a Rule" below.
  • Each drift fires an event which is logged in the file /var/log/rudder/compliance/non-compliant-reports.log and can be used to integrate with a log aggregation engine like Logstash, or with hooks (typically to send notifications to IRC or Slack, send email, etc.). See for example the Slack connector here: https://github.com/Normation/rudder-tools/blob/master/scripts/rudder-notification/forward-non-compliance-to-slack.sh
  • Compliance and drift are also available from the Rudder API to provide deeper integration with your IT infrastructure.
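
As an illustration of the hook-style integration mentioned above, here is a minimal Python sketch that polls the non-compliance log for newly appended lines and hands each one to a callback. Only the log path comes from this documentation. The function names, the polling approach and the `notify` callback are hypothetical, and the sketch deliberately does not parse the lines, since the log format is not described here.

```python
from pathlib import Path
from typing import Callable, List, Tuple

# Path taken from the documentation above.
LOG_PATH = Path("/var/log/rudder/compliance/non-compliant-reports.log")


def read_new_lines(path: Path, offset: int) -> Tuple[List[str], int]:
    """Return the complete lines appended after byte `offset`, plus the new offset."""
    if not path.exists():
        return [], 0
    if path.stat().st_size < offset:
        # The file shrank: assume it was rotated and start again from the top.
        offset = 0
    with path.open("rb") as handle:
        handle.seek(offset)
        data = handle.read()
    # Consume only up to the last newline so a half-written line waits for the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def forward_new_events(path: Path, offset: int, notify: Callable[[str], None]) -> int:
    """Pass every new non-empty log line to `notify` and return the updated offset."""
    lines, new_offset = read_new_lines(path, offset)
    for line in lines:
        if line.strip():
            notify(line)
    return new_offset
```

A caller would typically keep the returned offset and call `forward_new_events(LOG_PATH, offset, notify)` at a regular interval, with `notify` posting to a chat webhook or sending an email.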

Figure 15. Compliance on a Rule

The detailed compliance screen of a Rule also graphs compliance deviations over a recent period and displays a log of deviations for that period.

Figure 16. Changes history on a Rule

How is compliance calculated?

As previously seen, in Rudder you define Rules which target groups of Nodes, and are composed of configuration Directives.

A Directive contains one or more sub-configuration elements, each of which generates reports. For example, for a Sudoers Directive, each user can be such an element.

Reports have states that describe the drift between the expected configuration and the actual configuration. Some states depend on whether the user chose to automatically enforce drift correction or to only report on drift.

Finally, a node can get a global state if reports don’t come at the expected frequency or for the expected policy configuration version.

Below you will find all details about the possible states and their meaning, along with the actual compliance calculation method.

Checking that the node is correctly reporting, at correct frequency

At the node level, we check that the node is sending reports at the expected frequency, and for the version of the configuration currently defined for it.

Based on this information, the node can get one of the following global states:

Applying
When a new set of policies is defined for a node (or an existing one is updated), Rudder waits for reports during a grace period so that the node has time to apply the new policies. During this period, the configuration is said to be Applying.
No report
The system has not sent any reports for a duration incompatible with the agent run interval. Most likely, the node is offline or there is an ongoing network issue between the node and the Rudder server.
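
The node-level check described above can be sketched in Python as follows. This is a simplified illustration, not Rudder's actual implementation. The function name, its parameters and the "Reporting" label (used here for a node whose reports arrive as expected) are hypothetical, and reports sent for an older configuration version are simply ignored.

```python
from datetime import datetime, timedelta
from typing import Optional


def node_reporting_state(
    now: datetime,
    config_changed_at: datetime,
    last_matching_report_at: Optional[datetime],
    grace_period: timedelta,
    max_silence: timedelta,
) -> str:
    """Classify a node from the timing of its reports.

    last_matching_report_at: time of the latest report received for the
        currently defined configuration version, or None if there is none yet.
    grace_period: how long to wait for reports after a policy change (assumed value).
    max_silence: longest gap between reports that is still compatible with
        the agent run interval (assumed value).
    """
    if last_matching_report_at is not None and now - last_matching_report_at <= max_silence:
        # Reports arrive on time for the expected configuration:
        # directive-level states can be evaluated.
        return "Reporting"
    if now - config_changed_at <= grace_period:
        # Policies changed recently: the node still has time to apply them.
        return "Applying"
    return "No report"
```

For instance, a node whose policies changed two minutes ago and which has not reported yet would be classified as "Applying" with a ten-minute grace period, and as "No report" once that period has elapsed without a matching report.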

At directive level: checking for drift and auto-healing

Success or Compliant
The system is already in the desired state. No change is needed. Conformity is reached.
Repaired
When a configuration policy is enforced, this state means that the system was not in the desired state: Rudder applied some changes and repaired what was not correct. The system is now in the desired state.
Error
When a configuration policy is enforced, this state means that the system is not in the desired state and Rudder wasn’t able to repair it.
Non compliant
When a configuration policy is not enforced (audit mode), this state means that the system is not in the desired state. A drift is reported.
Not applicable
A specific configuration may not be applicable on a given node because some preconditions are not met. For example, the specified configuration is only relevant for Linux nodes, and thus is Not applicable on a Windows server.
Unexpected
We have a special kind of report for unexpected states (both for enforce and audit mode). These reports generally mean that the node is sending reports for unexpected configuration components. It may be due to bad parameters for the configuration, or an error in the Technique.
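
To summarize how the policy mode and the observed situation combine into a state, here is a small Python sketch of the decision logic described above. The function and its boolean inputs are hypothetical simplifications. In particular, using "Success" in enforce mode and "Compliant" in audit mode is an assumption, since the text lists both names for the same state without tying each to a mode.

```python
from enum import Enum


class PolicyMode(Enum):
    ENFORCE = "enforce"
    AUDIT = "audit"


def report_state(
    mode: PolicyMode,
    expected: bool,
    applicable: bool,
    in_desired_state: bool,
    repair_succeeded: bool = False,
) -> str:
    """Derive a state name from simplified, hypothetical inputs.

    expected: the report matches a configuration component Rudder expects.
    applicable: the preconditions of the configuration are met on this node.
    in_desired_state: the system already matched the policy when checked.
    repair_succeeded: in enforce mode, whether the correction worked.
    """
    if not expected:
        return "Unexpected"
    if not applicable:
        return "Not applicable"
    if in_desired_state:
        # Assumption: "Success" names this state in enforce mode, "Compliant" in audit mode.
        return "Success" if mode is PolicyMode.ENFORCE else "Compliant"
    if mode is PolicyMode.AUDIT:
        # Audit mode never changes the system, it only reports the drift.
        return "Non compliant"
    return "Repaired" if repair_succeeded else "Error"
```

Note that "Repaired" and "Error" can only occur in enforce mode, and "Non compliant" only in audit mode, which matches the definitions above.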

Compliance calculation

Based on these facts, the compliance of a Rule is calculated like this:

Number of Nodes for which conformity is reached for every Directive of the Rule / Total number of Nodes on which the Rule has been applied
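
As a worked illustration of this ratio, here is a minimal Python sketch. The data layout (one state per Directive, per Node) and the names are hypothetical simplifications. Which states count as "conformity reached" is also an assumption: the text states it explicitly only for Success or Compliant, and this sketch additionally treats Repaired and Not applicable as conforming. Adjust the `conforming` set if that does not match your needs.

```python
from typing import FrozenSet, Mapping

# Assumption: states treated as "conformity is reached" for this sketch.
CONFORMING_STATES: FrozenSet[str] = frozenset(
    {"Success", "Compliant", "Repaired", "Not applicable"}
)


def rule_compliance(
    states_by_node: Mapping[str, Mapping[str, str]],
    conforming: FrozenSet[str] = CONFORMING_STATES,
) -> float:
    """Return the fraction of nodes that conform for every Directive of a Rule.

    states_by_node maps a node name to a mapping {directive name: state}.
    """
    total = len(states_by_node)
    if total == 0:
        raise ValueError("the Rule is not applied to any node")
    conforming_nodes = sum(
        1
        for directive_states in states_by_node.values()
        if all(state in conforming for state in directive_states.values())
    )
    return conforming_nodes / total


# Hypothetical Rule with two Directives applied to four nodes.
example = {
    "node-a": {"sudoers": "Success", "ntp": "Success"},
    "node-b": {"sudoers": "Repaired", "ntp": "Success"},
    "node-c": {"sudoers": "Error", "ntp": "Success"},
    "node-d": {"sudoers": "Success", "ntp": "Not applicable"},
}
print(rule_compliance(example))  # 0.75: three of the four nodes conform on every Directive
```

A single non-conforming Directive is enough to exclude a node from the numerator, which is why node-c does not count even though its second Directive is in Success.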