Bug #11587: Ensure service (re)started does not work if systemd hit "start-limit" - Rudder - Issue Tracker

Bug #11587

closed

Ensure service (re)started does not work if systemd hit "start-limit"

Added by Janos Mattyasovszky over 6 years ago. Updated almost 2 years ago.

Status: Released
Priority: N/A
Assignee: Nicolas CHARLES
Category: Generic methods
Target version: 6.0.0
Pull Request: https://github.com/Normation/ncf/pull...
Severity: Minor - inconvenience | misleading | easy workaround
UX impact:
User visibility: Operational - other Techniques | Technique editor | Rudder settings
Effort required: Very Small
Priority:
Name check: To do
Fix check: To do
Regression:

Description

After playing around with a haproxy config templated by Rudder, with an NCF method that restarts the daemon after the file has changed, systemd <3 broke down due to "start-limit" being hit.

Rudder reported:

E| error         HAProxy                   Service check running     haproxy            Check if the service haproxy is started could not be repaired

Because:

systemd[1]: haproxy.service: Failed with result 'start-limit'.

It would be nice if Rudder could handle this case in a "sane" way.

Updated by Janos Mattyasovszky over 6 years ago

Subject changed from Ensure service restarted does not work if systemd hit "start-limit" to Ensure service (re)started does not work if systemd hit "start-limit"

Updated by Janos Mattyasovszky over 6 years ago

This is basically crap, because a service does not only have a started/stopped state, but also a "might be good but go F.you I refuse to restart it" broken one :( pretty hard to get included somehow...

maybe always reset the state with systemctl reset-failed haproxy before trying any service start/restart actions?
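
For illustration, the idea of always clearing the failed state first could look roughly like the sketch below. It is a minimal sketch, not Rudder or ncf code: the names `run_command` and `restart_after_reset` are hypothetical, and the exit code of `systemctl reset-failed` is deliberately ignored on the assumption that the command may fail harmlessly (for example when the unit is not loaded).

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the exit code (hypothetical helper type).
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(list(argv), check=False).returncode


def restart_after_reset(service: str, run: Runner = run_command) -> bool:
    """Clear the unit's failed state, then restart it.

    Returns True when the restart command itself exits with 0.
    """
    # Reset the "failed" state so systemd accepts a new start attempt.
    # The exit code is ignored on purpose (assumption: a failure here is harmless).
    run(["systemctl", "reset-failed", service])
    # Only the restart result decides whether the repair worked.
    return run(["systemctl", "restart", service]) == 0
```

On a systemd host, `restart_after_reset("haproxy.service")` would run the two commands in order. The `run` parameter exists only so the logic can be exercised without systemd.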

Updated by Nicolas CHARLES over 6 years ago

Project changed from Rudder to 41

Updated by Alexis Mousset over 6 years ago

Category set to Generic methods - Service Management

Updated by Benoît PECCATTE over 6 years ago

This looks like a systemd limitation.
It seems that your service is not properly integrated into systemd and needs a better status check.

I think we should first understand what systemd is trying to tell us before working around it.

Updated by Benoît PECCATTE over 6 years ago

Severity set to Minor - inconvenience | misleading | easy workaround
User visibility set to Operational - other Techniques | Technique editor | Rudder settings
Priority changed from 0 to 32

We don't want to do this for every systemd service because it could hide real problems.

As a workaround, since you have restarted the service using ncf, you can also call systemctl reset-failed from ncf.

Updated by Alexis Mousset over 5 years ago

Target version set to 4.1.16
Priority changed from 32 to 27

Updated by Vincent MEMBRÉ over 5 years ago

Target version changed from 4.1.16 to 4.1.17

Updated by Vincent MEMBRÉ over 5 years ago

Target version changed from 4.1.17 to 4.1.18
Priority changed from 27 to 0

Updated by Vincent MEMBRÉ over 5 years ago

Target version changed from 4.1.18 to 4.1.19

Updated by Alexis Mousset about 5 years ago

Target version changed from 4.1.19 to 4.1.20

Updated by François ARMAND about 5 years ago

Target version changed from 4.1.20 to 4.1.21

Updated by Vincent MEMBRÉ about 5 years ago

Target version changed from 4.1.21 to 4.1.22

Updated by Benoît PECCATTE almost 5 years ago

Target version changed from 4.1.22 to 5.0.10

Updated by Vincent MEMBRÉ almost 5 years ago

Target version changed from 5.0.10 to 5.0.11

Updated by Vincent MEMBRÉ almost 5 years ago

Target version changed from 5.0.11 to 5.0.12

Updated by Vincent MEMBRÉ almost 5 years ago

Target version changed from 5.0.12 to 5.0.13

Updated by Vincent MEMBRÉ over 4 years ago

Target version changed from 5.0.13 to 5.0.14

Updated by Vincent MEMBRÉ over 4 years ago

Target version changed from 5.0.14 to 5.0.15

Updated by François ARMAND over 4 years ago

Effort required set to Very Small

Budget: very small. What do we want to do with this ticket? Close it as "won't fix", document the workaround, or implement a workaround?

Updated by Alexis Mousset over 4 years ago

See https://bugzilla.redhat.com/show_bug.cgi?id=1016548 for reference.

We can add systemctl reset-failed {}.service, as it would simply force systemd to actually attempt a restart when Rudder wants one.

Ideally we should only do it just before an actual start or restart.
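
A minimal sketch of that ordering, assuming the reset should be skipped when the service is already running. This is not the ncf implementation: `ensure_started`, `run_command` and the outcome strings are hypothetical names chosen for the example, and only `systemctl is-active`, `systemctl reset-failed` and `systemctl start` are real commands.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the exit code (hypothetical helper type).
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(list(argv), check=False).returncode


def ensure_started(service: str, run: Runner = run_command) -> str:
    """Start the service only if needed, resetting the failed state just before.

    Returns one of the hypothetical outcome labels "kept", "repaired" or "error".
    """
    # "is-active --quiet" exits with 0 when the unit is active: nothing to do,
    # and the failed state is left untouched.
    if run(["systemctl", "is-active", "--quiet", service]) == 0:
        return "kept"
    # A start is really going to happen, so clear the failed state right before it.
    run(["systemctl", "reset-failed", service])
    if run(["systemctl", "start", service]) == 0:
        return "repaired"
    return "error"
```

Keeping the reset inside the "start is needed" branch means a healthy service is never touched, so the failed state of other checks is not hidden unnecessarily.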

Updated by Alexis Mousset over 4 years ago

Status changed from New to In progress
Assignee set to Alexis Mousset

Updated by Alexis Mousset over 4 years ago

Target version changed from 5.0.15 to 6.0.0

Let's target 6.0 as it may alter the behavior of the method. For 5.0, the command may be triggered manually before the restart with a "service action" method.
