Bug #7381

Process management issues on nodes hosting LXC containers

Added by Alexis MOUSSET almost 2 years ago. Updated 8 months ago.

Status: Released
Priority: N/A
Category: System integration

Description

When Rudder agents run in LXC containers, the agent on the hosting node sees all the cf-execd processes, including those running inside the containers, treats them as duplicates, and kills them.

[root@localhost amousset]# ps -eo pidns,cgroup:50,pid,user,args --sort pidns | grep cf-exe
4026531836 1:name=systemd:/system.slice/rudder.service         4903 root     /var/rudder/cfengine-community/bin/cf-execd
4026531836 1:name=systemd:/user.slice/user-1000.slice/session  4924 root     grep --color=auto cf-exe
4026532309 10:hugetlb:/lxc/c7m2,9:perf_event:/lxc/c7m2,7:net_  4779 root     /var/rudder/cfengine-community/bin/cf-execd
4026532376 10:hugetlb:/lxc/c7m1,9:perf_event:/lxc/c7m1,7:net_  4786 root     /var/rudder/cfengine-community/bin/cf-execd

[root@localhost amousset]# rudder agent run
R: @@Common@@log_info@@hasPolicyServer-root@@common-root@@00@@common@@StartRun@@2015-11-06 12:55:24+00:00##e06c2cde-94ce-4ba7-8514-ac95697d2d9a@#Start execution
R: @@Common@@result_success@@hasPolicyServer-root@@common-root@@00@@Security parameters@@None@@2015-11-06 12:55:24+00:00##e06c2cde-94ce-4ba7-8514-ac95697d2d9a@#The internal environment security is acceptable
R: @@Common@@result_repaired@@hasPolicyServer-root@@common-root@@00@@Process checking@@None@@2015-11-06 12:55:24+00:00##e06c2cde-94ce-4ba7-8514-ac95697d2d9a@#Warning, more than 2 cf-execd processes were detected. They have been sent a graceful termination signal.
R: @@Common@@result_success@@hasPolicyServer-root@@common-root@@00@@CRON Daemon@@None@@2015-11-06 12:55:24+00:00##e06c2cde-94ce-4ba7-8514-ac95697d2d9a@#The CRON daemon is running
R: @@Common@@result_success@@hasPolicyServer-root@@common-root@@00@@Binaries update@@None@@2015-11-06 12:55:24+00:00##e06c2cde-94ce-4ba7-8514-ac95697d2d9a@#The CFengine binaries in /var/rudder/cfengine-community/bin are up to date
2015-11-06T13:55:26+0100    error: /default/doInventory/commands/'/usr/bin/curl -L -k -1 -s -f --proxy '' -o "/var/rudder/cfengine-community/rudder-server-uuid.txt" https://rudder/uuid'[0]: Finished command related to promiser '/usr/bin/curl -L -k -1 -s -f --proxy '' -o "/var/rudder/cfengine-community/rudder-server-uuid.txt" https://rudder/uuid' -- an error occurred, returned 6
2015-11-06T13:55:26+0100    error: /default/doInventory/commands/'/usr/bin/curl -L -k -1 -s -f --proxy '' -o "/var/rudder/cfengine-community/rudder-server-uuid.txt" https://rudder/uuid'[0]: Fatal CFEngine error: cf-agent aborted on defined class 'could_not_download_uuid'

[root@localhost amousset]# ps -eo pidns,cgroup:50,pid,user,args --sort pidns | grep cf-exe
4026531836 1:name=systemd:/user.slice/user-1000.slice/session  5201 root     grep --color=auto cf-exe
4026532309 10:hugetlb:/lxc/c7m2,9:perf_event:/lxc/c7m2,7:net_  4779 root     /var/rudder/cfengine-community/bin/cf-execd

[root@localhost amousset]# rudder agent version
Rudder agent 3.1.4.release (CFEngine Core 3.6.5)

Happens with Rudder 3.1.4 on CentOS 6.7 and 7.
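
The pidns column in the ps output above is what distinguishes the host's cf-execd from the containers' ones. As a rough illustration only (this is not the fix that was merged), the sketch below filters ps-style output by PID namespace. The function name processes_in_namespace and the simplified column layout (pidns, pid, args, without the cgroup and user columns) are assumptions made for the example.

```python
def processes_in_namespace(ps_output, pidns, name):
    """Return PIDs of processes whose command line contains `name` and
    whose PID namespace inode equals `pidns`.

    `ps_output` is text shaped like the output of `ps -eo pidns,pid,args`.
    Rows whose first two columns are not numeric (e.g. a header) are skipped.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)  # pidns, pid, full command line
        if len(fields) < 3:
            continue
        ns, pid, args = fields
        if not (ns.isdigit() and pid.isdigit()):
            continue  # header or malformed row
        if ns == str(pidns) and name in args:
            pids.append(int(pid))
    return pids


# Sample rows adapted from this ticket (cgroup and user columns dropped).
SAMPLE = """\
PIDNS        PID COMMAND
4026531836  4903 /var/rudder/cfengine-community/bin/cf-execd
4026532309  4779 /var/rudder/cfengine-community/bin/cf-execd
4026532376  4786 /var/rudder/cfengine-community/bin/cf-execd
"""

# Only the cf-execd running in the host namespace is reported.
print(processes_in_namespace(SAMPLE, 4026531836, "cf-execd"))  # [4903]
```

With this kind of filter, the host would count a single cf-execd (PID 4903) instead of three, so the container daemons would not be treated as duplicates.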


Related issues

Related to Rudder - Bug #7189: issues with process management on physical hosting LXC containers Released 2015-09-12
Related to Rudder - Bug #4498: Several issues with process management on Proxmox host (and container) Rejected
Related to Rudder - Bug #7423: If using proxmox, process management fails due to bad options used on vzps Released 2015-12-07
Related to Rudder - Bug #4499: Rudder init script kill all agent on Open VZ (or similar system) Released 2014-02-23
Related to Rudder - Bug #10258: If rudder server component is stopped on Rudder root server, it is never restarted Released
Related to Rudder - Bug #10088: Inventory is not resent in case of error - and agent don't report the error Released

Associated revisions

Revision 51f90897
Added by Alexis MOUSSET about 1 year ago

Fixes #7381: Process management issues on nodes hosting LXC containers

History

#1 Updated by Alexis MOUSSET almost 2 years ago

  • Related to Bug #7189: issues with process management on physical hosting LXC containers added

#2 Updated by Alexis MOUSSET almost 2 years ago

  • Related to Bug #4498: Several issues with process management on Proxmox host (and container) added

#3 Updated by Jonathan CLARKE almost 2 years ago

  • Related to Bug #7423: If using proxmox, process management fails due to bad options used on vzps added

#4 Updated by Alexis MOUSSET over 1 year ago

  • Assignee set to Alexis MOUSSET

#5 Updated by Alexis MOUSSET over 1 year ago

  • Status changed from New to In progress

#6 Updated by Alexis MOUSSET over 1 year ago

  • Status changed from In progress to Discussion
  • Assignee deleted (Alexis MOUSSET)

check-rudder-agent currently includes a check on cf-* processes that does the same thing as our system promises. We can:
  • Add or wait for Linux namespaces support in CFEngine processes promises
  • Remove the cf- processes check from the techniques
  • Document that we do not support running Rudder in a Linux container when the host runs Rudder
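
For reference, the first option amounts to ignoring processes that live in another PID namespace. On Linux, /proc/<pid>/ns/pid is a symbolic link whose target identifies the namespace, so two processes share a PID namespace when their link targets are equal. The sketch below is a minimal illustration under that assumption, not CFEngine code; same_pid_namespace and local_pids are hypothetical helper names, and the readlink parameter exists only so the logic can be exercised without a real /proc.

```python
import os


def same_pid_namespace(pid, readlink=os.readlink):
    """Return True if `pid` is in the same PID namespace as the caller.

    Compares the targets of /proc/<pid>/ns/pid and /proc/self/ns/pid.
    Returns False if the link cannot be read (process gone, or
    insufficient privileges).
    """
    try:
        return readlink(f"/proc/{pid}/ns/pid") == readlink("/proc/self/ns/pid")
    except OSError:
        return False


def local_pids(pids, readlink=os.readlink):
    """Keep only the PIDs that belong to the caller's PID namespace."""
    return [pid for pid in pids if same_pid_namespace(pid, readlink)]
```

Reading these links for processes owned by other users generally requires root, which is how the agent runs in the transcript above.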

#7 Updated by Alexis MOUSSET over 1 year ago

  • Related to Bug #4499: Rudder init script kill all agent on Open VZ (or similar system) added

#8 Updated by Alexis MOUSSET over 1 year ago

  • Status changed from Discussion to In progress
  • Assignee set to Alexis MOUSSET
  • Target version set to 4.0.0~rc2

#9 Updated by Alexis MOUSSET over 1 year ago

  • Status changed from In progress to New

#10 Updated by Alexis MOUSSET about 1 year ago

  • Status changed from New to In progress

#11 Updated by Alexis MOUSSET about 1 year ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from Alexis MOUSSET to Benoît PECCATTE
  • Pull Request set to https://github.com/Normation/rudder-techniques/pull/1069

#12 Updated by Alexis MOUSSET about 1 year ago

  • Status changed from Pending technical review to In progress
  • Assignee changed from Benoît PECCATTE to Alexis MOUSSET

#13 Updated by Alexis MOUSSET about 1 year ago

  • Assignee changed from Alexis MOUSSET to Benoît PECCATTE

#14 Updated by Alexis MOUSSET about 1 year ago

  • Assignee changed from Benoît PECCATTE to Alexis MOUSSET

#15 Updated by Alexis MOUSSET about 1 year ago

  • Status changed from In progress to Pending release
  • % Done changed from 0 to 100

#16 Updated by Alexis MOUSSET 11 months ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 4.0.0, which was released on 10 November 2016.

#17 Updated by Nicolas CHARLES 8 months ago

  • Related to Bug #10258: If rudder server component is stopped on Rudder root server, it is never restarted added

#18 Updated by Benoît PECCATTE 8 months ago

  • Found in version(s) 3.1.0 added

#19 Updated by Benoît PECCATTE 8 months ago

  • Found in version(s) old deleted (3.1.0)

#20 Updated by Nicolas CHARLES 8 months ago

  • Related to Bug #10088: Inventory is not resent in case of error - and agent don't report the error added
