Bug #8051

closed

Compliance is not correctly computed if we receive an agent run right after generation

Added by Nicolas CHARLES about 8 years ago. Updated almost 8 years ago.

Status:
Released
Priority:
1
Category:
Web - Compliance & node report
Target version:

Description

I generated promises and ran the agent right after that.
The reports sent are correct.
reportsexecution contains:

 root   | 2016-03-08 20:51:52+00 | t        | 955012795    |         747
 root   | 2016-03-08 20:52:20+00 | t        | -2036910663  |         790

The reports do show up in the technical logs.
nodes_infos is:

 root    | {"-2036910663":"2016-03-08T20:52:16.474Z","-1439985566":"2016-03-08T20:41:40.652Z","963405331":"2016-03-08T20:28:31.928Z","-1553294083":"2016-03-08T20:41:21.995Z","955012795":"2016-03-08T20:41:49.768Z","801954688":"2016-03-08T20:27:15.608Z","1904794682":"2016-03-08T20:11:23.712Z","-1729685780":"2016-03-08T20:39:54.108Z"}

But the explain_compliance logger tells me:

[2016-03-08 20:52:24] TRACE explain_compliance.root - Computing compliance for node root with: [Pending: until 2016-03-08T21:02:16.474Z expected NodeConfigId: -2036910663/[2016-03-08T20:52:16.474Z-now] | last run: nodeConfigId: 955012795/[2016-03-08T20:41:49.768Z-2016-03-08T20:52:16.474Z] received at 2016-03-08T20:51:52.000Z]
[2016-03-08 20:52:24] TRACE explain_compliance.root - Node is Pending with reports from previous run, using merge/compare strategy between last reports from run 955012795/[2016-03-08T20:41:49.768Z-2016-03-08T20:52:16.474Z] and expect reports -2036910663/[2016-03-08T20:52:16.474Z-now]
[2016-03-08 20:52:24] TRACE explain_compliance.root - Compute compliance for node root using: rules for which compliance is based on run reports: [server-roles->2][inventory-all->2][root-DP->2][hasPolicyServer-root->2]; rule updated since run: [32377fd7-02fd-43d0-aab7-28460a91347b->6]

It is as if it did not see the 20:52:20 run.

After the next run, it does tell me:

[2016-03-08 20:55:34] TRACE explain_compliance.root - Computing compliance for node root with: [CheckChanges: expected NodeConfigId: -2036910663/[2016-03-08T20:52:16.474Z-now] | last run: nodeConfigId: -2036910663/[2016-03-08T20:52:16.474Z-now] received at 2016-03-08T20:52:20.000Z | expire at 2016-03-08T21:02:20.000Z]
[2016-03-08 20:55:34] TRACE explain_compliance.root - Using merge/compare strategy between last reports from run -2036910663/[2016-03-08T20:52:16.474Z-now] and expect reports -2036910663/[2016-03-08T20:52:16.474Z-now]

It seems as if the reports used are off by one.


Subtasks 3 (0 open, 3 closed)

Bug #8344: Missing relevant information about why the compliance is what it is on node | Released | Nicolas CHARLES | 2016-05-19
Bug #8363: Some agent runs are not seen in the update process | Rejected | Nicolas CHARLES | 2016-05-24
Bug #8364: Only invalidate cache for node with new completed runs | Released | Nicolas CHARLES | 2016-05-24

Related issues 5 (0 open, 5 closed)

Related to Rudder - Bug #7743: Compliance take into account expired run | Released | Nicolas CHARLES | 2016-01-08
Related to Rudder - Bug #8118: When a node send reports with a wrong config_id it is never marked as unresponsive | Released | Nicolas CHARLES | 2016-03-29
Related to Rudder - Question #8176: All nodes compliance report unexpected/missing except root server. | Resolved | 2016-04-13
Related to Rudder - Bug #8424: When updating runs, hooks should really be async | Released | Benoît PECCATTE | 2016-05-30
Has duplicate Rudder - Bug #8337: Broken compliance summary on a node | Rejected | 2016-05-19
Actions #1

Updated by Nicolas CHARLES about 8 years ago

  • Related to Bug #7743: Compliance take into account expired run added
Actions #2

Updated by Nicolas CHARLES about 8 years ago

  • Related to Bug #7336: Node stuck in "Applying" status added
Actions #3

Updated by Nicolas CHARLES about 8 years ago

OK, this does not always happen, and it is hard to reproduce.

Actions #4

Updated by Vincent MEMBRÉ almost 8 years ago

  • Target version changed from 3.0.15 to 3.0.16
Actions #5

Updated by Alexis Mousset almost 8 years ago

I'm seeing a similar issue: after each generation, I see some missing reporting (between 5 and 100% missing) on some nodes (between 5 and 10 out of around 50). I cannot identify any pattern in the missing reports.

Actions #6

Updated by Jonathan CLARKE almost 8 years ago

  • Tag list set to Next minor release
Actions #7

Updated by François ARMAND almost 8 years ago

  • Status changed from New to In progress
Actions #8

Updated by François ARMAND almost 8 years ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from François ARMAND to Nicolas CHARLES
  • Pull Request set to https://github.com/Normation/rudder/pull/1099
Actions #9

Updated by François ARMAND almost 8 years ago

  • Related to Bug #8118: When a node send reports with a wrong config_id it is never marked as unresponsive added
Actions #10

Updated by François ARMAND almost 8 years ago

  • Related to Question #8176: All nodes compliance report unexpected/missing except root server. added
Actions #11

Updated by François ARMAND almost 8 years ago

So it seems that somehow, the calls to future { } lead to some kind of deadlock, and the corresponding code was never executed. It may be linked to the fact that there was a future{} in a method called from a future{}, or not.

At least, removing the future and making the order of execution stricter leads to behavior that is understandable for the user. If the last run in "select * from reportsexecution order by date desc limit 5;" is marked as finished, then right after that, and BEFORE the next processing of runs, the compliance is reprocessed, and the debug output of

<logger name="com.normation.rudder.services.reports.CachedReportingServiceImpl" level="debug" />
and

<logger name="explain_compliance.root" level="trace" additivity="false">
  <appender-ref ref="OPSLOG" />
  <appender-ref ref="STDOUT" />
</logger>

are consistent.
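
For illustration only, here is a minimal Scala sketch of one well-known way a future called from inside another future can stall: the outer future blocks while waiting for an inner future that is queued on the same bounded thread pool. It is not confirmed that this is what happened in Rudder. The names saveRun, invalidateCache, nested and sequential are hypothetical and are not taken from the Rudder code base; the single-thread pool is an assumption chosen to make the stall reproducible.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeoutException}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object NestedFutureSketch {
  // Hypothetical stand-ins for "store the run" and "invalidate the compliance cache".
  def saveRun(runId: Int): Int = runId
  def invalidateCache(runId: Int): String = s"cache invalidated for run $runId"

  // The outer future blocks on an inner future scheduled on the same pool.
  // If every pool thread is busy in the outer body, the inner one never starts.
  def nested(runId: Int)(implicit ec: ExecutionContext): Future[String] =
    Future {
      val saved = saveRun(runId)
      val inner = Future(invalidateCache(saved))
      Await.result(inner, Duration.Inf)
    }

  // Same two steps in strict order, without any future.
  def sequential(runId: Int): String = invalidateCache(saveRun(runId))

  def main(args: Array[String]): Unit = {
    // Daemon threads so the blocked worker does not keep the JVM alive.
    val daemonThreads = new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r)
        t.setDaemon(true)
        t
      }
    }
    // Assumption: a pool with a single thread, to force the starvation.
    val pool = Executors.newFixedThreadPool(1, daemonThreads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)

    val starved =
      try {
        Await.result(nested(42), 500.millis)
        false
      } catch {
        case _: TimeoutException => true
      }

    println(s"nested future starved: $starved")
    println(sequential(42))
  }
}
```

With the one-thread pool assumed here, the nested variant is expected to time out because the inner future can never be scheduled, while the sequential variant always completes. This mirrors the approach described above of removing the future and making the order of execution stricter.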
Actions #12

Updated by Jonathan CLARKE almost 8 years ago

  • Related to deleted (Bug #7336: Node stuck in "Applying" status)
Actions #13

Updated by François ARMAND almost 8 years ago

  • Status changed from Pending technical review to Pending release
  • % Done changed from 0 to 100
Actions #14

Updated by François ARMAND almost 8 years ago

  • Related to Bug #8424: When updating runs, hooks should really be async added
Actions #15

Updated by François ARMAND almost 8 years ago

  • Related to Bug #8337: Broken compliance summary on a node added
Actions #16

Updated by François ARMAND almost 8 years ago

  • Related to deleted (Bug #8337: Broken compliance summary on a node)
Actions #17

Updated by François ARMAND almost 8 years ago

  • Has duplicate Bug #8337: Broken compliance summary on a node added
Actions #18

Updated by Vincent MEMBRÉ almost 8 years ago

  • Status changed from Pending release to Released

This bug has been fixed in Rudder 3.0.16, 3.1.10 and 3.2.3 which were released on 2016-06-01, but not announced.
