[rudder-users] [rudder-dev] Any change to the policy is propagated to all nodes at the same time
Benoit Peccatte
benoit.peccatte at normation.com
Wed Mar 15 15:17:00 CET 2017
On 14/03/2017 at 10:30, Janos Mattyasovszky wrote:
> Hi dear Rudder Community,
>
> _The Challenge: _
> The biggest benefit of a Config Management tool can also very quickly
> become the greatest doom of the environment you have to manage. Imagine
> you make a typo and it goes undetected to all your systems. If you hit
> 3000 nodes with a bad policy within 10 minutes, you can easily create
> the biggest IT Outage your company had, even making it go out of
> business... not a good way to become famous... Remember AWS? :-)
>
> _The idea:_
> Currently Rudder only knows one version of Policy that is "current"
> (correct me if I'm wrong) and that is applied to all nodes at once.
> You can of course work around this by not applying a policy to all
> nodes at once, and use "exclude groups" attached to Rules and then
> removing them step-by-step, but that does not solve the question on
> how to modify a rule already applied to all your nodes in an "elegant"
> manner? There is of course the way to unassign it from all nodes,
> wait-for-policy-generation, then modify it, attach it back
> step-by-step, each time wait-for-policy-generation, but that is pretty
> error-prone and also hard to track if you get interrupted.
>
> This OTOH would require each piece of policy to be versioned
> separately and the ability for the Nodes to have different "current"
> versions of Config as their valid policy. This would enable you to
> modify something (that change would increment the version of that
> policy item), and then you could apply that _somehow_ incrementally to
> the designated receivers of the config (the set of groups), by choosing
> some kind of rollout mechanism.
Maybe just having 2 versions of generated policies could be sufficient:
the current one and the last one.
We would also need something to monitor how far behind a node is, to make
sure we do not forget any node in the process.
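
To illustrate the monitoring part, here is a minimal sketch. It is not
Rudder code: `NodeState`, `late_nodes`, the version strings and the
2-hour grace period are all hypothetical names and values chosen for the
example. It assumes each node reports which generated policy version it
runs, and it lists the nodes that have not reached the target version
after the grace period.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class NodeState:
    """Hypothetical per-node record; not a Rudder data structure."""
    node_id: str
    policy_version: str    # version the node last reported running
    assigned_at: datetime  # when the target version was assigned to it


def late_nodes(nodes, target_version, now, grace=timedelta(hours=2)):
    """Return ids of nodes still off `target_version` after `grace`."""
    return [
        n.node_id
        for n in nodes
        if n.policy_version != target_version and now - n.assigned_at > grace
    ]


now = datetime(2017, 3, 15, 15, 0)
nodes = [
    NodeState("web1", "v2", now - timedelta(hours=3)),     # already updated
    NodeState("web2", "v1", now - timedelta(hours=3)),     # late
    NodeState("db1", "v1", now - timedelta(minutes=30)),   # still in grace
]
print(late_nodes(nodes, "v2", now))  # ['web2']
```

With only two versions in play, a single comparison per node is enough
to tell which nodes the rollout has left behind.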
>
> Rudder could take care of rolling out the change in a staged way, like
> "/10 nodes/hour/" or "/10%-25%-75%-100% with safety pauses of 2h/".
> Since Rudder also knows the compliance, it could monitor those
> nodes already having the new version of policy, and if it's over X%,
> it would proceed to the next stage of the rollout.
>
> This _somehow_ is probably the hardest thing to define, since there
> are probably as many "rollout methods" as Rudder users themselves. I
> have come up with some examples, which could probably be used by most
> people, but there are of course also very dedicated ways that are
> very-very specific to an organization, so any feedback on this generic
> idea and possible rollout methods I think is highly welcome.
If you are rolling out in stages to guard against a mistake, a two-step
rollout may be sufficient.
If you do it because there is some level of unknown in your platform (a
platform is rarely uniform), you may prefer a progression based on
confidence, with more and more machines at each step, for example
1 - 10 - 100 - 1000 - 10000.
In this second case, your node selection method also becomes important:
should you prefer preproduction first, or choose a list of machines that
is as diverse as possible?
If you do it because you want to test things manually, you may prefer a
human-driven progression: a human chooses a filter - stop - a human
chooses another filter - stop - ...
You probably also want to automate this and have a machine run the
tests and trigger the next step.
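
The confidence-based progression and the "stop, check, continue" loop
can be sketched as follows. This is only an illustration under stated
assumptions, not an existing Rudder feature: `rollout_stages`,
`run_rollout`, `apply_policy` and `stage_ok` are hypothetical names, and
the sizes 1 - 10 - 100 - 1000 - 10000 are read here as cumulative
targets. The check between stages is a caller-supplied function, so it
can stand for either a human confirmation or an automated test.

```python
def rollout_stages(nodes, sizes=(1, 10, 100, 1000, 10000)):
    """Yield successive batches so the cumulative count follows `sizes`.

    Any nodes left after the last size go into one final batch.
    """
    done = 0
    for target in sizes:
        target = min(target, len(nodes))
        if target > done:
            yield nodes[done:target]
            done = target
    if done < len(nodes):
        yield nodes[done:]


def run_rollout(nodes, apply_policy, stage_ok, sizes=(1, 10, 100, 1000, 10000)):
    """Apply the policy batch by batch; stop after the first failed check.

    `apply_policy(batch)` and `stage_ok(batch)` are hypothetical hooks
    supplied by the caller. Returns the nodes that received the policy.
    """
    updated = []
    for batch in rollout_stages(nodes, sizes):
        apply_policy(batch)
        updated.extend(batch)
        if not stage_ok(batch):
            break
    return updated


nodes = [f"node{i}" for i in range(25)]
print([len(b) for b in rollout_stages(nodes)])  # [1, 9, 15]
```

The order of `nodes` decides who goes first, so the selection question
above (preproduction first, or a deliberately diverse sample) comes down
to how that list is sorted before the rollout starts.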
>
> Thanks for reading,
>
> Best Regards,
> Janos Mattyasovszky
--
------------------------------------------------------------------------
*Benoît Peccatte*
/Architect/
Normation <http://www.normation.com>
------------------------------------------------------------------------
*87, Rue de Turbigo, 75003 Paris, France*
Phone: +33 (0)1 85 08 48 96
------------------------------------------------------------------------