Revision 1bc98d9b

Added by François ARMAND over 6 years ago

Fixes #11338: Better explain configuration drift reporting

View differences:

00_introduction/20_key_features.txt
resilient to changes than a step by step, procedural description of the commands
to execute.
image::./images/continuous-configuration.png[Continuous Configuration]
Rudder is natively integrated with the supported OS (Linux, Windows, AIX - see
<<node-supported-os>>) so that it provides generic, abstract, OS-independent
primitives to the user who can:
23_configuration_management/36_compliance.txt
=== Compliance
[[compliance-and-drift-assessment]]
=== Compliance and Drift Assessment
A Directive contains one or multiple components. Each component generates
one or multiple reports, based on the number of keys in this component. For
example, for a Sudoers Directive, each user is a key. These states are
available in reports:
Success::
==== Overview in Rudder
The system is already in the desired state. No change is needed. Conformity is
gained.
Repaired::
Rudder is built to continuously assess drift from defined policies, with or without auto-healing. By auto-healing,
we mean that Rudder can optionally enforce configuration continuously over time, correcting the assessed drift so that
your configuration converges toward the desired state (to configure policy mode by node, rule or directive, see <<policy_mode_audit_enforce>>).
The system was not in the desired state. Rudder applied some change and repaired
what was not correct. Now the system is in the desired state. Conformity is
gained.
Rudder can adapt to complex processes and does only the minimal work required for the server to converge to the desired state,
whatever the starting state. Rudder works as a GPS would, adapting the route to your destination depending on the path
you actually took. This process is much more resilient to changes than a step-by-step, procedural description of the commands to execute.
Error::
Compliance and drift from expected configurations are then reported, by configuration and/or by node, with the possibility to drill down
into non-compliance issues to identify the root problem.
The system is not in the desired state. Rudder couldn't repair the system.
Of course, one can always correct a drift error by hand, by updating the configuration target and changing the policy mode from "audit" to "enforce".
Applying::
===== Compliance and drift reporting
When a Directive is applied, Rudder waits 10 minutes for a report.
During this period, the Directive is said to be 'Applying'.
Compliance drifts (non-compliances, enforcement errors, repairs) are reported in Rudder by several means:
No report::
- you are notified about them in the Rudder dashboard.
- they are reported for each Rule, as shown in the "Compliance on a Rule" graph below.
- they are stored in the Rudder compliance database, and each Rule displays a history of changes, as depicted in "Changes history on a Rule" below.
- each drift fires an event which is logged in the file /var/log/rudder/compliance/non-compliant-reports.log and can be used
to integrate with a log aggregation engine like Logstash, or with hooks (typically to send notifications to IRC or Slack, send email, etc.)
- compliance and drift are also available from the Rudder API to provide deeper integration with your IT infrastructure.
The system didn't send any reports. Rudder waited for 10 minutes and no report
was received.
A Directive has gained conformity on a Node if every report for each
component, for each key, is in 'Success' state. This is the only condition.
Based on these facts, the compliance of a Rule is calculated like
this:
Number of Nodes for which conformity is reached for every Directive of the
Rule / Total number of Nodes on which the Rule has
been applied
.Compliance on a Rule
......
The Rule detailed compliance screen will also graph compliance deviations on
a recent period as well as display a deviation log history for this period.
.Compliance history on a Rule
image::./images/rudder-rule-compliance-history.png[Rule compliance history]
.Changes history on a Rule
image::./images/rudder-rule-compliance-history.png[Changes compliance history]
==== How is compliance calculated?
As previously seen, in Rudder you define Rules which target groups of Nodes and are composed of configuration Directives.
A Directive contains one or more sub-configuration elements, each of which generates reports.
For example, for a Sudoers Directive, each user can be such an element.
Reports have states describing the drift between the expected configuration and the actual configuration.
Some states depend on whether the user chose to automatically enforce drift correction
or to only report on drift.
Finally, a node can get a global state if reports don't arrive at the expected frequency or for the expected policy configuration version.
Below you will find details about the possible states and their meaning, along with the actual compliance calculation method.
*Checking that the node is correctly reporting, at the correct frequency*
At the node level, we check that the node sends reports at the
expected frequency, and for the configuration version currently defined for it.
Based on this information, we get a
Applying::
When a new set of policies is defined for a node (or an existing one is updated), Rudder waits for a grace period
so that the node has time to apply the new policies and report.
During this period, the configuration is said to be 'Applying'.
No report::
The system hasn't sent any reports for longer than the agent run interval allows. Most
likely, the node is not online or there is an ongoing network issue between the node and the Rudder server.
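The node-level check described above can be sketched as follows. This is only an illustration, not Rudder's implementation: the function name, its parameters, and the two extra labels `"Reporting"` and `"Unexpected version"` are hypothetical, and the actual grace period and run interval depend on your setup.

```python
from datetime import datetime, timedelta
from typing import Optional


def node_reporting_status(
    now: datetime,
    config_updated_at: datetime,
    last_report_at: Optional[datetime],
    last_report_config_id: Optional[str],
    expected_config_id: str,
    run_interval: timedelta,
    grace_period: timedelta,
) -> str:
    """Return a node-level state from report timing and configuration version.

    All names here are illustrative assumptions, not Rudder identifiers.
    """
    on_expected_config = last_report_config_id == expected_config_id

    # New policies were defined recently and the node has not reported for
    # them yet: give it time to apply them.
    if not on_expected_config and now - config_updated_at <= grace_period:
        return "Applying"

    # Nothing was received within the agent run interval.
    if last_report_at is None or now - last_report_at > run_interval:
        return "No report"

    # Reports arrive on time but for another configuration version
    # (hypothetical label, the source does not name this case).
    if not on_expected_config:
        return "Unexpected version"

    # Hypothetical label for "no node-level problem detected".
    return "Reporting"
```

When the function returns `"Reporting"`, the directive-level states can be examined for that node.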
*At directive level: checking for drift and auto-healing*
Success or Compliant::
The system is already in the desired state. No change is needed. Conformity is reached.
Repaired::
When a configuration policy is "enforced", that state means that the system was not in the desired state.
Rudder applied some change and repaired what was not correct. Now the system is in the desired state.
Error::
When configuration is enforced, it means that the system is not in the desired state and Rudder wasn't able to repair the system.
Non compliant::
When configuration is not enforced, it means that the system is not in the desired state. A drift is reported.
Not applicable::
A specific configuration may not be applicable on a given node because some preconditions
are not met. For example, the specified configuration is only relevant for Linux nodes, and
thus is 'Not applicable' on a Windows server.
Unexpected::
We have a special kind of report for unexpected states (both for enforce and audit mode). These
reports generally mean that the node is sending reports for unexpected configuration components. It
may be due to bad parameters for the configuration, or an error in the Technique.
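As a rough illustration of how the policy mode determines which state is reported, the mapping above could be sketched like this. The function and its parameters are hypothetical and do not reflect Rudder's internal API. The 'Unexpected' state is left out because it describes reports that do not match any expected component.

```python
from typing import Optional


def directive_report_state(
    policy_mode: str,
    applicable: bool,
    in_desired_state: bool,
    repair_succeeded: Optional[bool] = None,
) -> str:
    """Map one check result to a report state.

    policy_mode is "audit" or "enforce". repair_succeeded is only
    meaningful in "enforce" mode when the system was not in the desired state.
    All names are illustrative assumptions.
    """
    if policy_mode not in ("audit", "enforce"):
        raise ValueError(f"unknown policy mode: {policy_mode!r}")

    # Preconditions not met, for example a Linux-only configuration
    # evaluated on a Windows node.
    if not applicable:
        return "Not applicable"

    # Already in the desired state: no change is needed in either mode.
    if in_desired_state:
        return "Success"

    # Audit mode only reports the drift and never changes the system.
    if policy_mode == "audit":
        return "Non compliant"

    # Enforce mode: a change was attempted to correct the drift.
    return "Repaired" if repair_succeeded else "Error"
```

Note that 'Repaired' and 'Error' can only occur in enforce mode, and 'Non compliant' only in audit mode, which matches the definitions above.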
*Compliance calculation*
Based on these facts, the compliance of a Rule is calculated like this:
Number of Nodes for which conformity is reached for every Directive of the
Rule / Total number of Nodes on which the Rule has been applied
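The ratio above can be worked through with a small sketch. The data layout (node name, then Directive name, then a list of report states) and the function name are assumptions made for this example. Which states count as conforming is left as a parameter: the default only accepts 'Success' or 'Compliant', and a Directive with no reports at all is treated as not conforming.

```python
from typing import Dict, FrozenSet, List, Optional

# node name -> directive name -> report states (hypothetical layout)
RuleReports = Dict[str, Dict[str, List[str]]]


def rule_compliance(
    reports: RuleReports,
    conforming: FrozenSet[str] = frozenset({"Success", "Compliant"}),
) -> Optional[float]:
    """Fraction of nodes on which every Directive of the Rule is conforming.

    Returns None when the Rule is applied to no node.
    """
    if not reports:
        return None

    def node_conforms(directives: Dict[str, List[str]]) -> bool:
        # Every Directive must have reports, and all of them must be
        # in a conforming state.
        return all(
            bool(states) and all(state in conforming for state in states)
            for states in directives.values()
        )

    conforming_nodes = sum(1 for d in reports.values() if node_conforms(d))
    return conforming_nodes / len(reports)


example = {
    "node1": {"sudoers": ["Success", "Success"], "ntp": ["Success"]},
    "node2": {"sudoers": ["Success", "Error"], "ntp": ["Success"]},
}
print(rule_compliance(example))  # 1 conforming node out of 2
```

Here `node2` has one report in 'Error' state for the Sudoers Directive, so only one of the two nodes has reached conformity.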
