

Revision 1bc98d9b

Added by François ARMAND over 6 years ago

ID 1bc98d9bb697ef3fdb2fcbea498f48f68f6178ee
Parent df1f34e2
Child e141b49f, 42d5fd7e

Fixes #11338: Better explain configuration drift reporting

     resilient to changes than a step by step, procedural description of the commands
     to execute.
     image::./images/continuous-configuration.png[Continuous Configuration]
     Rudder is natively integrated with the supported OS (Linux, Windows, AIX - see
     <<node-supported-os>>) so that it provides generic, abstract, OS-independent
     primitives to the user who can:

     === Compliance
     [[compliance-and-drift-assessment]]
     === Compliance and Drift Assessment
     A Directive contains one or multiple components. Each component generates
     one or multiple reports, based on the number of keys in this component. For
     example, for a Sudoers Directive, each user is a key. These states are
     available in reports:
     Success::
     ==== Overview in Rudder
     The system is already in the desired state. No change is needed. Conformity is
     gained.
     Repaired::
     Rudder is built to continuously assess drift from defined policies, with or without auto-healing. By auto-healing,
     we mean that Rudder can optionally enforce configuration continuously over time, correcting the assessed drift so that
     your configuration converges toward the desired state (for how to configure policy mode by node, rule or directive, see <<policy_mode_audit_enforce>>).
     The system was not in the desired state. Rudder applied some change and repaired
     what was not correct. Now the system is in the desired state. Conformity is
     gained.
     Rudder can adapt to complex processes and does only the minimal work required for the server to converge to the desired state,
     whatever the starting state was. Rudder works as a GPS would, adapting the route to your destination depending on the path
     you actually took. This process is much more resilient to changes than a step-by-step, procedural description of the commands to execute.
     Error::
     Compliance and drift from expected configurations are then reported, by configuration and/or by node, with the possibility to drill down
     into non-compliance issues to identify the root problem.
     The system is not in the desired state. Rudder couldn't repair the system.
     Of course, one can always correct a drift error by hand by updating the configuration target and changing the policy mode from "audit" to "enforce".
     Applying::
     ===== Compliance and drift reporting
     When a Directive is applied, Rudder waits during 10 minutes for a report.
     During this period, the Directive is said 'Applying'.
     Compliance drifts (non-compliances, enforcement errors, repairs) are reported in Rudder by several means:
     No report::
     - you are notified about them in the Rudder dashboard.
     - they are reported for each Rule, as shown in the "Compliance on a Rule" graph below.
     - they are stored in the Rudder compliance database, and each Rule displays a history of changes, as depicted in "Changes history on a Rule" below.
     - each drift fires an event which is logged in the file /var/log/rudder/compliance/non-compliant-reports.log and can be used
       to integrate with a log aggregation engine like Logstash, or with hooks (typically to send notifications to IRC or Slack, send email, etc.)
     - compliance and drift are also available from the Rudder API to provide deeper integration with your IT infrastructure.
     The system didn't send any reports. Rudder waited for 10 minutes and no report
     was received.
     A Directive has gained conformity on a Node if every report for each
     component, for each key, is in 'Success' state. This is the only condition.
     Based on these facts, the compliance of a Rule is calculated like
     this:
     Number of Nodes for which conformity is reached for every Directive of the
     Rule / Total number of Nodes on which the Rule has
     been applied
     .Compliance on a Rule
-...
     The Rule detailed compliance screen will also graph compliance deviations on
     a recent period as well as display a deviation log history for this period.
     .Compliance history on a Rule
     image::./images/rudder-rule-compliance-history.png[Rule compliance history]
     .Changes history on a Rule
     image::./images/rudder-rule-compliance-history.png[Changes compliance history]
     ==== How is compliance calculated?
     As previously seen, in Rudder you define Rules which target groups of Nodes and are composed of configuration Directives.
     A Directive contains one or multiple sub-configuration elements which generate reports.
     For example, for a Sudoers Directive, each user can be such an element.
     Reports have states describing the drift between the expected configuration and the actual configuration.
     Some states depend on whether the user chose to automatically enforce drift correction
     or to only report on drift.
     Finally, a node can get a global state if reports don't come at the expected frequency or for the expected policy configuration version.
     Below you will find all the details about the possible states and their meaning, along with the actual compliance calculation method.
     *Checking that the node is correctly reporting, at the correct frequency*
     At the node level, we are checking that the node is sending reports according to the
     expected frequency, and for the currently defined version of the configuration for it.
     Based on this information, we get a
     Applying::
     When a new set of policies is defined for a node (or an existing one is updated), Rudder waits for reports during a grace period
     so that the node has time to apply the new policies.
     During this period, the configuration is said to be 'Applying'.
     No report::
     The system has not sent any reports for a period longer than the agent run interval allows. Most
     likely, the node is not online or there is an ongoing network issue between the node and the Rudder server.
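The node-level check described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not Rudder's actual implementation: the function name, its parameters, the "Reporting" label for the normal case, and the exact thresholds are all hypothetical, and the sketch does not track policy configuration versions.

```python
from datetime import datetime, timedelta
from typing import Optional


def node_reporting_state(
    now: datetime,
    last_report_at: Optional[datetime],
    policy_updated_at: datetime,
    run_interval: timedelta,
    grace_period: timedelta,
) -> str:
    """Classify a node as 'Applying', 'No report' or 'Reporting' (hypothetical label)."""
    reported_since_update = (
        last_report_at is not None and last_report_at >= policy_updated_at
    )
    if reported_since_update:
        # Reports exist for the current policies: check they are recent enough.
        if now - last_report_at <= run_interval:
            return "Reporting"
        return "No report"
    # No report yet for the current policies: tolerate it during the grace period.
    if now - policy_updated_at <= grace_period:
        return "Applying"
    return "No report"
```

With a 5 minute run interval and a 10 minute grace period, a node whose policies changed 3 minutes ago and which has not reported since would be classified as 'Applying' in this sketch.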
     *At directive level: checking for drift and auto-healing*
     Success or Compliant::
     The system is already in the desired state. No change is needed. Conformity is reached.
     Repaired::
     When a configuration policy is "enforced", that state means that the system was not in the desired state.
     Rudder applied some change and repaired what was not correct. Now the system is in the desired state.
     Error::
     When configuration is enforced, it means that the system is not in the desired state and Rudder wasn't able to repair the system.
     Non compliant::
     When configuration is not enforced, it means that the system is not in the desired state. A drift is reported.
     Not applicable::
     A specific configuration may not be applicable on a given node because some preconditions
     are not met. For example, the specified configuration is only relevant for Linux nodes, and
     thus is Not applicable on a Windows server.
     Unexpected::
     We have a special kind of report for unexpected states (both for enforce and audit mode). These
     reports generally mean that the node is sending reports for unexpected configuration components. It
     may be due to bad parameters for the configuration, or an error in the Technique.
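The directive-level states listed above amount to a small decision table over the policy mode and the observed outcome. The following Python sketch shows one way to read it. The function and parameter names are hypothetical, the 'Unexpected' state is left out because it depends on which reports the node actually sends, and 'Success' is returned for both modes since the text lists 'Success or Compliant' as a single state.

```python
def report_state(
    policy_mode: str,
    in_desired_state: bool,
    repair_succeeded: bool = False,
    applicable: bool = True,
) -> str:
    """Map an observed outcome to a report state (illustrative sketch only)."""
    if not applicable:
        # Preconditions not met, e.g. a Linux-only configuration on a Windows node.
        return "Not applicable"
    if in_desired_state:
        return "Success"
    if policy_mode == "audit":
        # Drift is only reported, never corrected.
        return "Non compliant"
    if policy_mode == "enforce":
        # Drift was found and a repair was attempted.
        return "Repaired" if repair_succeeded else "Error"
    raise ValueError(f"unknown policy mode: {policy_mode!r}")
```

For example, `report_state("audit", in_desired_state=False)` gives 'Non compliant', while the same drift in "enforce" mode gives 'Repaired' or 'Error' depending on whether the repair worked.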
     *Compliance calculation*
     Based on these facts, the compliance of a Rule is calculated like this:
     Number of Nodes for which conformity is reached for every Directive of the
     Rule / Total number of Nodes on which the Rule has been applied
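As a worked illustration of this ratio, here is a minimal Python sketch. The data shape (node id, then directive id, then a list of report states) is an assumption made for the example, and so is the choice of which states count as reaching conformity: only 'Success' and 'Compliant' are used here, and the set is a parameter so that other readings can be tried.

```python
from typing import Dict, FrozenSet, List

# Hypothetical shape: node id -> directive id -> list of report states.
Reports = Dict[str, Dict[str, List[str]]]

# Assumption: only these states count as "conformity is reached".
CONFORMING_STATES: FrozenSet[str] = frozenset({"Success", "Compliant"})


def node_conforms(
    directives: Dict[str, List[str]],
    conforming: FrozenSet[str] = CONFORMING_STATES,
) -> bool:
    """True if every directive has at least one report and all are conforming."""
    return bool(directives) and all(
        len(states) > 0 and all(state in conforming for state in states)
        for states in directives.values()
    )


def rule_compliance(
    reports: Reports,
    conforming: FrozenSet[str] = CONFORMING_STATES,
) -> float:
    """Conforming nodes divided by all nodes the Rule is applied to."""
    if not reports:
        return 0.0
    ok = sum(1 for d in reports.values() if node_conforms(d, conforming))
    return ok / len(reports)


example: Reports = {
    "node1": {"sudoers": ["Success", "Success"]},
    "node2": {"sudoers": ["Success", "Error"]},
    "node3": {"sudoers": ["Success"], "ntp": ["Compliant"]},
    "node4": {"sudoers": []},  # no report received for this directive
}
print(rule_compliance(example))  # 2 conforming nodes out of 4 -> 0.5
```

A single non-conforming report on a node (as on `node2`) is enough to exclude that node from the numerator, which matches the "for every Directive of the Rule" wording.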
