[rudder-dev] Relay synchronisation via rsync

Wed Jun 15 21:33:00 CEST 2016

Hi

Answers inline.

Janos

Benoit Peccatte <benoit.peccatte at normation.com> wrote (on Tue, Jun 14,
2016, 12:26):

> Hello everyone,
>
> When people have a lot of rules or a lot of nodes, they also have a lot
> of generated promises.
> When you run a relay, those promises can take a lot of time to synchronize.
> To avoid this we devised an rsync-based method with the help of some of our
> best users.
> This can reduce synchronization time "a lot".
>
> But let's see how to do it.
>
> *How does it work?*
> - The update promises will be changed to run an rsync command
> instead of a cfengine download if a specific flag is set
> - rsync will connect to the server via ssh as the rudder user
>

I would suggest spawning a dedicated instance of the OS-provided sshd binary
with a custom sshd_config file, so that you have full control over the
settings.
People tend to customize and harden their sshd config with additional
restrictions, since Rudder has full control over everything that is enrolled
in it, so heavy restrictions might be in place that could conflict with your
purpose.

I could imagine a custom config consisting of something like this:

Port 60022
PidFile /var/rudder/run/sshd.pid
HostKey /opt/rudder/etc/ssh/host_key_*
Protocol 2
AllowUsers rudder
PermitRootLogin no
UsePAM no
PasswordAuthentication no
ChallengeResponseAuthentication no
PermitEmptyPasswords no
AllowTcpForwarding no
AllowAgentForwarding no
PermitTunnel no
UsePrivilegeSeparation no
X11Forwarding no
PrintMotd no

Something like this would secure your policy server (root or relay) against
the hosts from which you want to allow the ssh login.
You could then create a rudder-sshd init script to control your own sshd
instance, and you could run it directly as the rudder user, since you do not
need root privileges to bind to a high port, and you do not need to go
through the PAM stack when you only want key-based auth...

> - The key used to connect will be derived from the cfengine key (openssl
> command to be defined)
>

It would make sense to use 4096-bit (4k) keys, at least for new nodes, since
you are going to use them for ssh key derivation. Small change, big gain for
future security.
See https://www.rudder-project.org/redmine/issues/6253

> - The authorized keys will be filled by the webapp on the server since it
> knows the public keys of each relay
> - A script on the server triggered by ssh remote_command will check that
> the remote relay is allowed to synchronize the directory it is asking for,
> on each ssh connection
> - The relay will get its promises over rsync over ssh which can be really
> fast
>
> *Prerequisite: the rudder user*
> First we need a specific rudder user on the server so that we do not use
> root to connect to the rudder server.
> This in itself is a challenge because today we only have a rudder group.
>
> To do this we need to:
> - make the package create a new "rudder" user with the "rudder" group
> - change the group of the generated ncf and generated promises to "rudder"
> - change the permissions of those directories to g+rs, this will make sure
> that we will always have read access to those files from the rudder user
> - change the post install script to fix existing permissions
> - test that this properly works with cf-serverd
>
> This should be sufficient for our current case, but it allows us to
> imagine a day where jetty and cf-serverd could run as the "rudder" user
> instead of "root".
>

This would actually make sense, since AFAIK there is no real need to run them
as root: they only bind to high (unprivileged) ports, and filesystem
permissions can be adjusted to rudder:rudder...

>
> We would like to be able to synchronize the shared-files directory too.
> But since this directory is under direct control of the user and its access
> rights can be used in promises we won't touch its sgid bit nor synchronize
> it by default. If the user wants to synchronize it via rsync he will need to
> make sure he has POSIX ACLs activated on this directory, run a command to
> add the rudder group via those ACLs and then activate the rsync protocol
> specifically for this directory.
>
> *Other steps*
> - Have the rudder user on the server and the rudder group on the relays
> - Create a technique to manage the authorized_keys file on the server (it
> will be based on a system variable that holds uuid/key pairs from the
> webapp)
> - On the relay, create a technique (or maybe a postinstall script would do
> better) that:
>   * creates the ssh key from the cfengine key and puts it in a known place
>   * calls ssh-keyscan to add the server's key to ~/.ssh/known_hosts
>

If you manage your own sshd, you know your host keys. You can also
pre-generate the known_hosts file, since you know which policy server a relay
has to connect to. The relay receives the base policy that sets up the rsync
method from that server anyway, so it is better to trust what arrives that
way than to scan for the key over the network.
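
As a minimal sketch of that pre-generation step, assuming the policy server
already knows its own public host key: the function name and the example host
name are made up for illustration, and port 60022 is simply the one from the
config above. The only format fact relied on is that OpenSSH writes
non-default ports in known_hosts as [host]:port.

```python
def known_hosts_line(host: str, port: int, key_type: str, key_b64: str) -> str:
    """Format one OpenSSH known_hosts entry for a policy server.

    Hypothetical helper: the policy server would ship this line to the
    relay together with the base policy, instead of the relay running
    ssh-keyscan over the network.
    """
    # OpenSSH stores hosts on a non-default port as "[host]:port".
    marker = host if port == 22 else f"[{host}]:{port}"
    return f"{marker} {key_type} {key_b64}"


# Example with a made-up host name and a truncated, fake key.
print(known_hosts_line("policy.example.org", 60022, "ssh-rsa", "AAAAB3Nza"))
```

The resulting line could be written to the rudder user's known_hosts file on
the relay by the same policy that enables the rsync method.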

> - Create an rsync ACL script that takes 1 parameter (the UUID of a host),
> it will:
>   * know that it is running from ssh, so know the remote command used
>   * check that this command is authorized by its parameter (only
> share/<uuid>, ncf and shared-files are authorized)
>   * https://www.samba.org/ftp/unpacked/rsync/support/rrsync can be a good
> source of how to do this
>

Since you are hijacking the rsync command being run over ssh with a
forced command, you actually don't have to use the rsync command in the
format of "*rsync server.fqdn:/path/to/what/I/want*". You can directly use
"*rsync server.fqdn:POLICY /target/dir*",
"*rsync server.fqdn:NCF /target/dir*" and
"*rsync server.fqdn:SHARED /target/dir*", and have the forced-command
wrapper script translate them to the appropriate directory.

This would also simplify the generation of the rsync-command, since the
only variable you need to add to it is the server.fqdn, as the other
directories would be resolved on the policy server to the right directory.
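
A rough sketch of such a wrapper's translation step, under several
assumptions: the directory layout in ALIASES is hypothetical, the UUID is
expected to come from the forced command's parameter in authorized_keys, and
the rsync client is assumed to send its usual server-side command line
("rsync --server --sender <options> . <path>") through SSH_ORIGINAL_COMMAND.

```python
import shlex
from typing import List

# Hypothetical alias table; the real directory layout is an assumption.
ALIASES = {
    "POLICY": "/var/rudder/share/{uuid}",
    "NCF": "/var/rudder/ncf",
    "SHARED": "/var/rudder/shared-files",
}


def translate(original_command: str, uuid: str) -> List[str]:
    """Turn the command requested by the relay into the one to execute.

    Raises PermissionError for anything that is not a read-only rsync
    pull of one of the known aliases.
    """
    argv = shlex.split(original_command)
    if len(argv) < 4 or argv[0] != "rsync" or argv[1] != "--server":
        raise PermissionError("only rsync server mode is allowed")
    if "--sender" not in argv:
        raise PermissionError("relays may only pull, not push")
    alias = argv[-1]
    if alias not in ALIASES:
        raise PermissionError(f"unknown alias: {alias!r}")
    # Trailing slash: send the directory contents, not the directory itself.
    real_path = ALIASES[alias].format(uuid=uuid) + "/"
    return argv[:-1] + [real_path]


# Example: what a relay with the made-up UUID "abc" would end up running.
print(translate("rsync --server --sender -logDtpr . POLICY", "abc"))
```

A real wrapper would read the command from the SSH_ORIGINAL_COMMAND
environment variable, take the UUID from its own first argument, and finish
with os.execvp on the translated argument list.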

> - Add a flag file in generated promises to avoid running rsync if it's not
> necessary (maybe it can be implemented using a filter-from-file option in
> rsync to avoid using cfengine again)
>

The overhead is negligible, so I am not sure it is worth checking for it,
unless you want to use a flag file to disable sync for specific relays
(rudder relay disable $uuid?). That would let you put a relay in maintenance
mode, or make sure an upgrade or a risky rule is not going to break the nodes
connected to a production relay. It would also give you a low-level method to
exclude a subtree of nodes from receiving an updated set of policies.
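
To illustrate the per-relay flag idea, here is a small sketch. The
"<uuid>.disabled" naming and the flag directory are invented for the example;
nothing like this exists in Rudder as far as this thread goes.

```python
from pathlib import Path


def sync_enabled(uuid: str, flag_dir: Path) -> bool:
    """Return False when a flag file disables sync for this relay.

    Hypothetical convention: an empty file named "<uuid>.disabled" in
    flag_dir puts that relay (and the nodes behind it) on hold.
    """
    return not (flag_dir / f"{uuid}.disabled").exists()
```

The forced-command script on the policy server could call this before
serving anything, so a disabled relay simply keeps its last synchronized
policies.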

> - Add an option in the interface to configure synchronization via rsync
> instead of regular synchronization
> - Add an option in the interface to synchronize shared-files too
> - Beware that a relay can become a server for another relay so everything
> on the server must work on another machine
>
> Note:
> During our meeting we talked about a user on the agent but I can't remember
> why.
>
>
> If you see something wrong or if you want to add a comment, just hit reply!
>
>