Bug #5650

closed

promises can become invalid if copies fail rendering the agent unusable

Added by Nicolas CHARLES over 9 years ago. Updated almost 2 years ago.

Status:
Released
Priority:
1
Category:
System integration
Target version:
Severity:
UX impact:
User visibility:
Effort required:
Priority:
0
Name check:
Fix check:
Regression:

Description

When there is no space left on the device, the inputs may get purged. The inputs may also get purged if somebody manually deletes the files in the inputs folder on the node.
In this case, the node is left in a broken state, and nothing can revive it short of manual intervention.

We should check the syntax of failsafe.cf and, if it is invalid, use the initial promises to fetch repaired promises from the server.
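
As a rough illustration of the check proposed here, the Python sketch below validates failsafe.cf and, when validation fails, copies a pristine set of initial promises back over the inputs directory so that the next agent run can fetch fresh promises from the server. The function names, the two directory arguments, and the choice of `cf-promises -f` as the default validator are assumptions made for this sketch, not Rudder's actual implementation.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable


def cf_promises_ok(policy_file: Path) -> bool:
    """Return True if cf-promises exits with status 0 for this file.

    Assumption: the cf-promises binary is on PATH. If it is not,
    subprocess.run raises FileNotFoundError and nothing is restored.
    """
    result = subprocess.run(
        ["cf-promises", "-f", str(policy_file)],
        capture_output=True,
        check=False,
    )
    return result.returncode == 0


def restore_if_invalid(
    inputs_dir: Path,
    initial_dir: Path,
    is_valid: Callable[[Path], bool] = cf_promises_ok,
) -> bool:
    """Restore the initial promises when failsafe.cf is missing or invalid.

    Both directories are hypothetical parameters; pass the real locations
    used on the node. Returns True if a restore was performed.
    """
    failsafe = inputs_dir / "failsafe.cf"
    if failsafe.is_file() and is_valid(failsafe):
        return False  # failsafe.cf is usable, leave the inputs alone
    # Overwrite broken or missing files with the known-good initial set.
    shutil.copytree(initial_dir, inputs_dir, dirs_exist_ok=True)
    return True
```

Passing the validator as a parameter keeps the restore logic testable on a machine without CFEngine installed. Note that `dirs_exist_ok` requires Python 3.8 or later.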


Related issues 1 (1 open, 0 closed)

Related to Rudder - User story #5641: Make the agent policies update a state machine with integrity check (New)
Actions #1

Updated by François ARMAND over 9 years ago

This seems like it could be one of the checks in step 3 of #5641.

Actions #2

Updated by Jonathan CLARKE over 9 years ago

  • Target version set to 3.1.0~beta1

This will be addressed by #5641.

Actions #3

Updated by Florian Heigl almost 9 years ago

You could additionally consider adding a copy of the most basic initial promises (failsafe.cf?) as part of /opt/rudder/bin/rudder-check.

Actions #4

Updated by Vincent MEMBRÉ almost 9 years ago

  • Target version changed from 3.1.0~beta1 to 3.1.0~rc1
Actions #5

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 3.1.0~rc1 to 3.1.0
Actions #6

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 3.1.0 to 3.1.1
Actions #7

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 3.1.1 to 3.1.2
Actions #8

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 3.1.2 to 3.1.3
Actions #9

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 3.1.3 to 3.1.4
Actions #10

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 3.1.4 to 3.1.5
Actions #11

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 3.1.5 to 3.1.6
Actions #12

Updated by Vincent MEMBRÉ about 8 years ago

  • Target version changed from 3.1.6 to 3.1.7
Actions #13

Updated by Vincent MEMBRÉ about 8 years ago

  • Target version changed from 3.1.7 to 3.1.8
Actions #14

Updated by Vincent MEMBRÉ about 8 years ago

  • Target version changed from 3.1.8 to 3.1.9
Actions #15

Updated by Vincent MEMBRÉ almost 8 years ago

  • Target version changed from 3.1.9 to 3.1.10
Actions #16

Updated by Jonathan CLARKE almost 8 years ago

  • Tags set to Sponsored, Next minor release, Quick and important
  • Subject changed from "When there is no space left on device, promises files can get deleted, rendering the agent unusable" to "promises can become invalid if copies fail rendering the agent unusable"
  • Assignee set to Jonathan CLARKE
  • Priority changed from N/A to 1
  • Target version changed from 3.1.10 to 2.11.21

This can happen in two distinct cases:

  1. The main promises (promises.cf + includes) get broken. An aborted copy can cause this: a new promises.cf gets copied, including a reference to fileX.cf, but the copy stops before fileX.cf is copied, breaking that set of promises.
  2. The backup failsafe.cf can be broken in much rarer cases, like no space left on device, a really bad error in failsafe.cf or update.cf, or possibly neutrino rain (see #5641)

A good, thorough workaround is to implement #5641. A quick and easy workaround is to check for these conditions in check-rudder-agent and fix them.
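
To make the first case concrete, here is a minimal Python sketch of how an incomplete promise set could be detected: it lists the `.cf` files named in promises.cf that are absent from the inputs directory. The function name, the regular expression, and the directory argument are assumptions made for illustration. The regex is a crude heuristic over quoted file names, not a CFEngine parser, and it deliberately skips names containing variables.

```python
import re
from pathlib import Path
from typing import List

# Heuristic: any double-quoted string ending in .cf that contains no "$"
# (names built from CFEngine variables cannot be resolved here).
CF_NAME = re.compile(r'"([^"$]+\.cf)"')


def missing_includes(inputs_dir: Path, entry: str = "promises.cf") -> List[str]:
    """Return the referenced .cf files that do not exist under inputs_dir.

    If the entry file itself is missing, it is reported as the only
    missing item.
    """
    entry_path = inputs_dir / entry
    if not entry_path.is_file():
        return [entry]
    text = entry_path.read_text(encoding="utf-8", errors="replace")
    names = sorted(set(CF_NAME.findall(text)))
    return [name for name in names if not (inputs_dir / name).is_file()]
```

A non-empty result would indicate that the copy stopped partway, as with the fileX.cf example above, and that the promise set should be refreshed rather than executed.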

Actions #17

Updated by Jonathan CLARKE almost 8 years ago

  • Status changed from New to In progress
Actions #18

Updated by Jonathan CLARKE almost 8 years ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from Jonathan CLARKE to Benoît PECCATTE
  • Pull Request set to https://github.com/Normation/rudder-packages/pull/943
Actions #19

Updated by Jonathan CLARKE almost 8 years ago

  • Status changed from Pending technical review to Pending release
  • % Done changed from 0 to 100
Actions #20

Updated by Vincent MEMBRÉ almost 8 years ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 2.11.21, 3.0.16, 3.1.10 and 3.2.3 which were released on 2016-06-01, but not announced.

Actions #21

Updated by Alexis Mousset almost 2 years ago

  • Priority set to 0