Bug #10446

Policy validation fails

Added by Janos Mattyasovszky 6 months ago. Updated 5 months ago.

Status: Released
Priority: N/A
Category: Web - Config management
Target version:
Target version (plugin):
Severity: Major - prevents use of part of Rudder | no simple workaround
User visibility: Getting started - demo | first install | level 1 Techniques
Effort required:
Priority: 0

Description

I run a vanilla 4.1-rc1 (no modifications) on SLES11 SP4, and I got a validation error after adding many new nodes (in batches of 100/200/500), up to 2000.

Policy update process was stopped due to an error:

⇨ Policy update error for process '41' at 2017-03-16 16:40:08 
⇨ Cannot write configuration node 
⇨ Exit code=1 for hook: '/opt/rudder/etc/hooks.d/policy-generation-node-ready/10-cf-promise-check' with environment variables: [PATH:/sbin:/usr/sbin:/usr/local/sbin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin] [NLSPATH:/usr/dt/lib/nls/msg/%L/%N.cat] [OLDPWD:/] [TERM:linux] [XFILESEARCHPATH:/usr/dt/app-defaults/%L/Dt] [PWD:/opt/rudder/jetty7] [SHLVL:1] [_:/usr/java/latest/bin/java] [RUDDER_GENERATION_DATETIME:2017-03-16T16:39:26.954+01:00] [RUDDER_NODEID:184c352a-e8d4-4827-946c-474c7fc8bb2d] [RUDDER_NEXT_POLICIES_DIRECTORY:/var/rudder/share/184c352a-e8d4-4827-946c-474c7fc8bb2d/rules.new/cfengine-community] [RUDDER_AGENT_TYPE:cfengine-community]. 
Stdout: ' error: Can't stat file '/var/rudder/share/184c352a-e8d4-4827-946c-474c7fc8bb2d/rules.new/cfengine-community/promises.cf' for parsing. (stat: No such file or directory)
' 
Stderr: ''

The error was not shown again after I started a new manual generation.

list_folders (163 KB) Nicolas CHARLES, 2017-03-17 09:55

folder_promises_10 (737 Bytes) Nicolas CHARLES, 2017-03-17 10:15

folder_promises (737 Bytes) Nicolas CHARLES, 2017-03-17 10:15

list_folders (203 KB) Nicolas CHARLES, 2017-03-17 10:15


Related issues

Related to Rudder - Bug #9444: When accepting 100 nodes, I randomly get multiple policy generation New
Related to Rudder - Bug #7689: Error on promise generation after accepting a node Released 2015-12-24
Related to Rudder - Bug #6575: When we accept a new node, we have two promises generation instead of one Released 2015-05-11

Associated revisions

Revision 9f535367
Added by Nicolas CHARLES 6 months ago

Fixes #10446: Policy validation fails

History

#1 Updated by Janos Mattyasovszky 6 months ago

  • Description updated (diff)

#2 Updated by François ARMAND 6 months ago

Thanks for reporting. This seems... interesting. As in "oh, there is I/O, load, and asynchronous processing involved in the fun".

We will try to reproduce.
Perhaps we could add a trap to the script that runs ls on the '/var/rudder/share/184c352a-e8d4-4827-946c-474c7fc8bb2d/rules.new/cfengine-community/' directory on error, to get more information about what goes wrong.

#3 Updated by Janos Mattyasovszky 6 months ago

I currently have 8 cores for the system, but they burn like hell during a complete regeneration.
I'll try to increase the number of cores to ensure every cf-promises process gets a CPU.

#4 Updated by François ARMAND 6 months ago

Well, we are trying to make maximum use of the resources of the underlying machine :)
(at least the JVM, and especially Scala, handles asynchronicity and parallelism nicely :)

#5 Updated by Janos Mattyasovszky 6 months ago

It happened 21 times today:

# grep 'Exit code=1 for hook:' /var/log/rudder/webapp/2017_03_16.stderrout.log -c
21

All having the same error:

Stdout: '   error: Can't stat file '/var/rudder/share/xxxxxxxxxxxxxxx/rules.new/cfengine-community/promises.cf' for parsing. (stat: No such file or directory)
'
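
To see which nodes are affected rather than only how many failures occurred, the same log can be tallied per node ID. This is a minimal sketch, assuming (as in the excerpts in this ticket) that the hook failure and the `RUDDER_NODEID` variable appear on the same log line; the function name `failed_nodes` and the abbreviated sample lines are illustrative only.

```python
import re
from collections import Counter

# Assumption: "[RUDDER_NODEID:<id>]" sits on the same line as the hook failure,
# as in the log excerpts quoted in this ticket.
NODE_ID = re.compile(r"\[RUDDER_NODEID:([^\]]+)\]")


def failed_nodes(lines):
    """Count 'Exit code=1 for hook:' failures per node ID."""
    counts = Counter()
    for line in lines:
        if "Exit code=1 for hook:" not in line:
            continue
        match = NODE_ID.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines abbreviated from the excerpts in this ticket.
sample = [
    "ERROR ... Exit code=1 for hook: '...' with environment variables: "
    "[SHLVL:1] [RUDDER_NODEID:184c352a-e8d4-4827-946c-474c7fc8bb2d] "
    "[RUDDER_AGENT_TYPE:cfengine-community].",
    "INFO ... Start policy generation, checking updated rules",
]
for node_id, count in failed_nodes(sample).items():
    print(count, node_id)
```

On a real server, `failed_nodes` could be given the open log file (for example the `2017_03_16.stderrout.log` file used in the grep above) instead of the sample list.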

#6 Updated by Nicolas CHARLES 6 months ago

  • Category set to Web - Config management
  • Target version set to 4.1.0
  • Severity set to Major - prevents use of part of Rudder | no simple workaround
  • User visibility set to Getting started - demo | first install | level 1 Techniques

Does this happen only at node acceptance?

#7 Updated by Janos Mattyasovszky 6 months ago

OK, new info: it only happened when I added ~100 new nodes, not on a manual full regeneration. It happened with 32 and also with 8 CPUs, so apparently this is not CPU-bound.

The hook was modified to add more debug info:

rudder41rc1:/opt/rudder/etc/hooks.d/policy-generation-node-ready # diff -u 10-cf-promise-check.orig 10-cf-promise-check
--- 10-cf-promise-check.orig    2017-03-16 22:47:54.326238008 +0100
+++ 10-cf-promise-check 2017-03-17 08:50:07.234670923 +0100
@@ -12,6 +12,11 @@
   echo "The directory for node ${RUDDER_NODEID} new policies is empty" 
   exit 1;
 else
+  if [[ ! -f ${RUDDER_NEXT_POLICIES_DIRECTORY}/promises.cf ]]; then
+    echo "oopsie!" 
+    ls ${RUDDER_NEXT_POLICIES_DIRECTORY}
+    exit 5
+  fi
   case "${RUDDER_AGENT_TYPE}" in
     "cfengine-community")
       /var/rudder/cfengine-community/bin/cf-promises -f "${RUDDER_NEXT_POLICIES_DIRECTORY}/promises.cf" 

The errors produced this output (added line breaks for better reading):

[2017-03-17 09:29:43] ERROR com.normation.rudder.batch.AsyncDeploymentAgent$DeployerAgent - Error when updating policy, reason Cannot write configuration node  <- Exit code=5 for hook: '/opt/rudder/etc/hooks.d/policy-generation-node-ready/10-cf-promise-check' with environment variables: 
[PATH:/sbin:/usr/sbin:/usr/local/sbin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin]
[NLSPATH:/usr/dt/lib/nls/msg/%L/%N.cat]
[OLDPWD:/]
[TERM:linux]
[XFILESEARCHPATH:/usr/dt/app-defaults/%L/Dt]
[PWD:/opt/rudder/jetty7]
[SHLVL:1]
[_:/usr/java/latest/bin/java]
[RUDDER_GENERATION_DATETIME:2017-03-17T09:28:50.764+01:00]
[RUDDER_NODEID:68e558bc-3067-11e5-bf83-6b55cf67da3d]
[RUDDER_NEXT_POLICIES_DIRECTORY:/var/rudder/share/68e558bc-3067-11e5-bf83-6b55cf67da3d/rules.new/cfengine-community]
[RUDDER_AGENT_TYPE:cfengine-community].
  Stdout: 'oopsie!
checkGenericFileContent
clockConfiguration
inventory
properties.d
routingManagement
rudder_expected_reports.csv
'
  Stderr: ''

I can reproduce this, so if you need any other debug info from the moment it happens, just tell me what debug code to add to the hook.

#8 Updated by Nicolas CHARLES 6 months ago

I could reproduce it too; apparently it happens only at node acceptance.
The log says:

[2017-03-17 08:50:35] DEBUG com.normation.rudder.services.policies.PromiseGenerationServiceImpl - Checked node configuration updates leading to rules serial number updates and serial number updated in 126 ms
[2017-03-17 08:50:36] INFO  com.normation.rudder.services.policies.nodeconfig.NodeConfigurationServiceImpl - Configuration of following nodes were updated, their promises are going to be written: [49659fcf-da40-49ec-bd46-3baa7fe8bb70, 1773a50a-5a19-49ad-a821-c34bcc960047, a724b68d-7dee-4382-8757-e6deca4c1188, 0b0cc1b6-595b-406d-a5c7-ed2cd7501a5b, 727095ca-6e0d-4e9f-ae77-65f1c82e5833, 3fd250f8-a28a-42a2-8d25-2d08ffa3bc50, 12242f44-c6e0-4fa6-80c5-827817fe5971, 8d47d185-58d4-4695-b30e-38704d040021, b3137f97-95f7-48bf-827b-79416e62fde0, 40b3d9cd-bc22-4f71-a73e-c15a49415683, 3417678c-f6e1-4f2c-89b3-e3975a17bbe9, e7adad42-3726-4bd8-85ba-1fb400477cb1, 943751f7-f128-4c51-9ffb-63c1312d13ba, c6f72748-0814-4737-ad8a-ada65c9d275d, 8bedf955-ccbe-4d3a-ad9d-09a07d81de39, 145ff399-3012-47ec-a7e7-7ee61cc3fd7d, 1236f904-fa42-400a-88cf-a5b54c8446f7, 3e878cc4-5158-4cec-82b2-ff79a1163b73, ee47ad2f-f06e-4d98-8f9d-8ff9ba424394, 3edb0a6f-dd67-4e1e-8a9f-fb730c28bd94, 91619502-c062-45d1-ad71-4d753cea404f, 917047d5-48d8-485f-a314-11909fc8bfdb, f53e1dc4-996f-4f69-9cf7-b8301fd55557, 6bcfdaf1-3ba5-4256-8590-88e4bdfd6d36, 7fbf7df2-6382-47f3-95e3-11c948a97774, 87650f22-3b3b-4e9e-99f6-33e851712cfb, eaab7645-b2eb-4a5d-a703-03b296009d1c, 086648bc-c85d-43b0-92cc-ebe6114b841c, 74704916-6107-45f1-957c-df50e2eb7140, 1f1187f2-c2fd-4b38-a3e9-fcfe361042a7, bb45cd62-8799-4890-a15d-90ddc2bd59d3, 296892c0-a1c7-4444-b601-dcb115093227, 7e48f35c-be50-4453-a613-44ba4ad47a8b, f7c3f1f8-640d-4c21-a9b2-eee0d8481b14, 0b5254df-6977-4ee5-8979-c3d5c722c084, bc13a004-cdb2-4d3c-878b-7c841b6ae86b, 109987c9-52c1-49fa-9670-28fd497fdbbe, 51eb5fd4-de0f-43bb-bfc3-2c883a56e9d4, 8b0cb178-487a-49d4-a1ed-0259638404c8, b012dcf1-9de2-4a01-9a71-deafd3c6d4a5, 484162e6-e6c8-4d7f-8bf8-975b0b5afe1d, 88486e49-8d27-4bca-8ca6-9e67ffea0e7c, 224e17e1-60a9-49a1-8a06-2f202414d94c, fc92ff5f-a26e-4e0d-af79-08f3e2f999ac, 14c9d31f-f18d-4168-995c-4386542f476f, 844e1f6e-03d6-4290-b7b9-333956f1da65, 0b4feb40-09fe-4968-8010-25c3339597cd, 
01c7683b-e1e9-46f6-aab1-024fdf0af7b6, 2b058351-90a9-4e09-b9b1-cd33970f8e86, a40e925a-5b5c-48d7-b712-d4f36f902a31, bceb7cb6-bf06-4fb5-b6d1-6693e9b726e8, 6d85b479-f7b5-454f-a325-54fb69055658, 35d30eff-7951-4a6c-9297-79698edcee59, dcddbaa2-1054-4de8-b2ea-0d21d036ba57, b8ca0353-0393-4ded-855a-35f6f6c58e3a, 0e8f816f-d90a-428c-9c97-ff5671f190c0, 6406a529-c1af-4f00-a445-37ac7469e306, 2a74b86e-9a8c-4066-a38f-b7783c8ad4cb, 36fcc19a-bf39-421f-8f46-fe0d8a312c21, 8f288ee0-1c56-4295-8d01-667d5eba6c74, 27614ef4-35f4-4558-a922-a5e2cf3257e9, 9d982f0e-42c9-4eb0-8fec-c99440d32312, 24d39caf-d475-4187-9bc6-c2c1ab5714b0, 59936a91-6781-4e15-aa83-7d06239b6a1c, d24f1775-715a-4cc3-bd3e-22bbe3cfe4d5, d5e96790-4db6-4b24-8987-e51c9973cd38, cf825793-7869-4480-9db5-ca41c4694ac1, acb10a3a-738b-4abb-b6a0-1225f23e0076, 7e6556bd-f88c-4040-bc88-058010bcae2a, f60be80c-5d0a-4163-9c1d-c9be8f23d97e, 2ef61fb2-a85f-4d9d-b979-1cb49ee0e0f8, 9606621e-bdda-46f4-9511-29620ffd02d4, 5554c4a3-ede8-4945-bebd-2b536b1611be, 2f857797-aec7-43f0-8227-da1f888cfe94, 6e223bad-5ae5-431a-86cc-b1710aa42714, f9125641-9377-4061-a047-caf137174a61, e771d346-20b6-4268-a940-3275144d29e1, 86d94d79-7cf9-4981-8d9a-681c2338fb31, 4b353000-5706-494e-95eb-8e2e8ceec4dd, c843e2c8-7580-4f8e-9f5e-b11b6fb7f6aa, 4d28703e-0821-4048-9d94-015896343ad8, 12f903d7-327e-4727-879c-20ce7098889f, fefad6fa-d253-4994-9f6d-1a31835141d3, a8f26834-fd38-4f7a-a081-b61908fd80dc, 5a871aff-7055-4df4-9a6a-03affa285574, d5245471-f8cf-4aae-8908-2f31b45c568e, b62e6c3f-f081-4b65-862b-b5e5062e3fab, 7a8a1d33-76d6-4760-b24d-4f221d74c00c, 4f126f6d-a446-44bb-996e-a44485c17062, 8eb07627-3ba1-45b7-a647-d033dfd0d68b, 4a7fd241-6447-4ad9-92e2-db7247d12ee3, af85fb64-f3f0-425e-b7ac-0fb76b770c0f, 335c5153-7d8e-44c9-82d0-905e98353f74, 71002409-0e35-4db1-898f-c88529b282ce, 9a16abbb-ad54-4fef-b89e-1734ea914d9b, 1f837d10-3854-4b24-ae01-68f587256058, 0309f8ee-49cc-4938-b9ec-92b78a65b271, 15d6ee6a-8429-4610-885a-fa3345b88151, eab6be29-1e7d-4214-9d77-a9fcd709de31, 
097167b1-93b4-47bf-af14-aa3a30c7af7e, 43c6defa-0ebe-4904-9f65-3c609071c071, ab4c77a8-f345-4a99-bbf8-89ec369866be, 037daaad-917d-45b2-8e72-6ca016f4ebce, 36269ec0-e460-42b9-b233-9ec9f96cf430, 287394a3-c3b3-4d1c-87f3-33c61d9a4488, 5d89afa5-1a59-4987-a71d-969117e42102, 5b90a293-602c-42ca-80d0-f41d31df364c, 0299931e-9f20-4cfa-9432-9d2ecdfaac84, ec3ae7dd-2acd-474e-9270-1ec0bdbaf21e, 1b202ba7-9031-494d-a743-1bf04e93f5aa, aaae1334-683e-496b-8c45-6f0ef025e460, e236476c-67d4-4366-ae30-0c94a069a43a, 832040c4-c8b5-47b6-b5b5-9efc522b4615, a5c54d32-7e57-4ce4-a9f9-477ff7601dd0, fb0eae99-6be3-4bd0-af40-9bf81ae353f1, 6972e03e-71d5-4661-9898-432a08ffd42a, d49f4693-b5bf-4370-af3a-cca6be38ba05, d7f2ee12-ceea-4400-95eb-57f72ca30d6d, 2d407bba-ca0e-4aad-9324-70cc407624d2, ba07e3b5-bcfd-4d99-b3f0-4cb108dd43fb, b0b8b1f4-172a-45b1-a334-4d6065619cab, b26d7334-ec0e-47d3-b0e9-ffe3957131fa, 9868ed84-7f77-4372-95b6-427bdfcf4011, 94b01ded-6710-4e94-b435-23acd0664ff9, a406bab4-01b9-452e-be86-161bb8fadca2, f5daba23-3245-4ed0-9003-02cb04e5d96a, 3551bdd1-9cb2-47d0-942d-a84b6ff99bb0, 7ef55dde-f801-43c3-84a1-6514904aaf7f, b1f5adf5-93dc-46e2-8501-1e4db42ce6ec, f234b032-c30f-45e8-8951-e7d5a8581d2f, 68544240-9844-48d0-962e-e42414abcf3f, 716293ae-8170-4e34-bff3-7e28847475eb, 211178fd-c7a9-4e11-b5e5-12b8d1a6dab1, f5ec0b16-ba31-457c-a639-c368f786ecf4, 979cd32c-bd50-47b6-a448-2cfe55489781, root, 84d4301c-415a-415c-a552-67e330ae210b, 09b4984f-8de5-4f17-9e40-153758a2ce05, 12fc7bbf-620a-483d-8978-f58c3228f8fe, 8330279f-73b8-429a-84e0-6e44a39170f6, e93c8cac-f878-4cb6-89a7-08569bc7361e, 78fceb91-d625-47b9-af18-0aaddc89635c, 4edc1122-df66-458c-ba6d-c27200d94d6c, e1153c25-ed40-4122-8b3a-5758445fee2e, 4e63917b-3ae0-47ba-a8ac-b9bd5cdd0981, 1c2680e8-db9b-4af5-afb0-beb550210d56, 4e794554-2b97-4db0-a10e-cda15b12ddf4, 029c99b0-7686-426b-bb79-ae185789ef63, ce0cf445-a9fc-4a04-840a-0a7e85388a8f, 9d2efdc6-873f-4145-b490-d9d56b889dc8, e0745069-1e27-450d-9ad2-cfef7bda31f5, 2f86a3b3-d72d-49bf-b332-b30650b9ec5b, 
d8b985a5-4e54-4541-a718-183722f6774e, 83899fb2-2cb9-4091-8fa2-9abfe07e6b29, 24c653b7-fdb0-4d25-b93a-ee4ed7c4a23f, 7ef1396e-bc8e-4a4a-a9df-65554a95851c, a68bad79-9fd1-459a-a2d1-7bd110508622, a272c71a-dae9-44ab-8796-76384adb8551, c933b327-8501-49bd-9765-c534d6fb2a9c, 6e3a33ac-d10f-41b1-bcb7-e665d9c61e86, 76af9062-babf-490c-802b-1c6cd3597a2b, 61e7df71-bf0f-4f61-b0df-170a5e8f8719, bb298b31-a8b8-4c5f-999b-8d5e19f96d0f, 4a20f60a-a069-46e1-ad75-e99f3ffa5208, 84895631-a8bc-48ee-937b-ed380804775f, 8f6ce2c4-97f8-4d5c-8ee9-b3e5a54081c9, 8642ea8e-d070-4946-bddd-74c57a2a7472, f619dba1-ee36-464b-a647-59471eb4911d, 9a8e7638-3cbc-4458-a15f-5ba28c4ba8b9, c4bd8ed9-780c-430a-98e2-2b56aae36795, debe0f5f-d278-4acb-8d13-5aec198b6714, 39d12c7a-3cf1-41e5-8dea-2a487a01e5f5, 450c1699-8e2e-4ca9-9700-9413e033312c, 714b2b41-bc66-44d9-8f9e-623b0219fe57, 25e98a4d-8683-436d-a536-eb91fc3ebba0, 6192678f-50e2-4b95-a609-b615508bd9bf, a49ead41-58e5-4f17-bc1d-fbe195e409c7, 0de17bfa-0edc-4512-9095-82fb8bd167f7, 8ca1a2d7-b66f-43fa-9e27-8cf5fbc267ce, 4fa07504-0a15-404c-9fea-b9f380f47f64, b27d06a8-9abf-40fb-b584-b889bbcf7582, 5cfc841b-676f-48a2-aca1-1b6172cc0131, 53348962-efaf-4530-8525-ac50418647b4, 85eaee98-fa3a-48cc-b236-a15ea4c33b12, 9a7c36b3-4305-443c-bb24-daa7f8f5bd1b, 707964c4-e3b4-486a-9590-87534ed0e849, 9c4d80a6-00fb-4a7e-96a2-bda4a27c00ea, 72960988-7a96-4270-b375-2edf1978e1a8, 6cdd0471-2e7f-4de3-be7b-26815b41bd76, 4d380198-84f9-4c13-8158-7fecb8cea9ad, e8329de8-3ff3-4e36-801b-ee9310540e09, 57b73dda-7d6c-4797-9e02-bc239d236522, 3cd6b2b6-9eeb-473f-9d4f-d33cd147fa5a, d5135ac8-b085-4b1c-bb10-d67cdb5b00d3, 8915c20b-cf59-4446-9173-891d6abf457a, cb33eb66-a3b4-4531-a89a-fe1e485c0da4, f662c18b-8e5a-4313-9680-03c57173a828, 5235e103-3680-4f5c-93bb-33796cb4f85f, 3a9329f4-ad5d-49fc-b904-38344f201f1b, 63f204eb-092b-41a5-bc81-2ee1bfb7ffc3, 00e2b90c-c1e8-4d1b-be74-9a8c735ebd68, 6a30d06d-acaa-4683-8d2c-dae23f9c5653, 4f842b14-7bd7-4e59-b876-fb81e59f1192, cf3f7654-afb5-4011-acc0-df204c6189a3, 
181bdf6b-3d77-408b-a0b2-8b5ce3fcff1e]
[2017-03-17 08:50:48] DEBUG com.normation.rudder.services.policies.PromiseGenerationServiceImpl - Policy generation completed in 26724 ms
[2017-03-17 08:50:48] ERROR com.normation.rudder.batch.AsyncDeploymentAgent$DeployerAgent - Error when updating policy, reason Cannot write configuration node <- Exit code=1 for hook: '/opt/rudder/etc/hooks.d/policy-generation-node-ready/10-cf-promise-check' with environment variables: [PATH:/sbin:/usr/sbin:/bin:/usr/bin] [SYSTEMCTL_IGNORE_DEPENDENCIES:] [NLSPATH:/usr/dt/lib/nls/msg/%L/%N.cat] [SYSTEMCTL_SKIP_REDIRECT:] [OLDPWD:/] [TERM:xterm] [XFILESEARCHPATH:/usr/dt/app-defaults/%L/Dt] [PWD:/opt/rudder/jetty7] [SHLVL:1] [_:/bin/java] [RUDDER_GENERATION_DATETIME:2017-03-17T08:50:21.668Z] [RUDDER_NODEID:83899fb2-2cb9-4091-8fa2-9abfe07e6b29] [RUDDER_NEXT_POLICIES_DIRECTORY:/var/rudder/share/83899fb2-2cb9-4091-8fa2-9abfe07e6b29/rules.new/cfengine-community] [RUDDER_AGENT_TYPE:cfengine-community]. 
  Stdout: 'ERROR: file /var/rudder/share/83899fb2-2cb9-4091-8fa2-9abfe07e6b29/rules.new/cfengine-community/promises.cf is not present

The content of /var/rudder/share at that time is shown in the attached file.
There is no .new folder in this list, of course; the .new folder is one level below, sorry.

#9 Updated by Nicolas CHARLES 6 months ago

With proper logs:

[2017-03-17 09:10:29] DEBUG com.normation.rudder.services.policies.PromiseGenerationServiceImpl - Policy generation completed in 34345 ms
[2017-03-17 09:10:29] ERROR com.normation.rudder.batch.AsyncDeploymentAgent$DeployerAgent - Error when updating policy, reason Cannot write configuration node <- Exit code=1 for hook: '/opt/rudder/etc/hooks.d/policy-generation-node-ready/10-cf-promise-check' with environment variables: [PATH:/sbin:/usr/sbin:/bin:/usr/bin] [SYSTEMCTL_IGNORE_DEPENDENCIES:] [NLSPATH:/usr/dt/lib/nls/msg/%L/%N.cat] [SYSTEMCTL_SKIP_REDIRECT:] [OLDPWD:/] [TERM:xterm] [XFILESEARCHPATH:/usr/dt/app-defaults/%L/Dt] [PWD:/opt/rudder/jetty7] [SHLVL:1] [_:/bin/java] [RUDDER_GENERATION_DATETIME:2017-03-17T09:09:54.900Z] [RUDDER_NODEID:795eb177-fc24-4ba0-9707-d76a13d5ff89] [RUDDER_NEXT_POLICIES_DIRECTORY:/var/rudder/share/795eb177-fc24-4ba0-9707-d76a13d5ff89/rules.new/cfengine-community] [RUDDER_AGENT_TYPE:cfengine-community]. 
  Stdout: 'ERROR: file /var/rudder/share/795eb177-fc24-4ba0-9707-d76a13d5ff89/rules.new/cfengine-community/promises.cf is not present
' 
  Stderr: ''
[2017-03-17 09:10:29] ERROR com.normation.rudder.batch.AsyncDeploymentAgent - Policy update error for process '44' at 2017-03-17 09:10:29: Cannot write configuration node

Attached are:

- list of folders with subfolders in /share: list_folders
- content of the generated folder: folder_promises
- content of the generated folder, 10s after the error: folder_promises_10

#10 Updated by Nicolas CHARLES 6 months ago

What happens here is that dynamic groups are not all recomputed at once when accepting nodes, but one by one, and each recomputation triggers a new policy generation.
We are missing at least "common" here, so no promises.cf.

Interestingly, it fixes itself

#11 Updated by Nicolas CHARLES 6 months ago

  • Related to Bug #9444: When accepting 100 nodes, I randomly get multiple policy generation added

#12 Updated by Nicolas CHARLES 6 months ago

  • Related to Bug #7689: Error on promise generation after accepting a node added

#13 Updated by Nicolas CHARLES 6 months ago

  • Related to Bug #6575: When we accept a new node, we have two promises generation instead of one added

#14 Updated by Nicolas CHARLES 6 months ago

Interestingly, this bug is not reliably triggered.

#15 Updated by Nicolas CHARLES 6 months ago

So, when accepting 100 nodes at 10:37:01, I get about 55 nodes correctly updated; meanwhile the group has-policy-server-root is updated (with only 48 added nodes), and policy generation starts because group has-policy-server changed.

dynamic group updated

[2017-03-17 10:37:06] DEBUG com.normation.rudder.batch.UpdateDynamicGroups$LAUpdateDyngroupManager - Dynamic group update started at 2017/03/17 10:37:01, ended at 2017/03/17 10:37:06

[2017-03-17 10:37:06] DEBUG com.normation.rudder.batch.AsyncDeploymentAgent - Flag file '/opt/rudder/etc/policy-update-running' created
[2017-03-17 10:37:06] INFO  com.normation.rudder.services.policies.PromiseGenerationServiceImpl - Start policy generation, checking updated rules

Then 47 nodes are accepted; a new dynamic group update is triggered at the same time, and it includes the last accepted node from the previous batch.

[2017-03-17 10:37:12] DEBUG com.normation.rudder.batch.UpdateDynamicGroups$LAUpdateDyngroupManager - Dynamic group update started at 2017/03/17 10:37:01, ended at 2017/03/17 10:37:12

When we update dynamic groups, the state handling is invalid: we check whether a group changed and, if so, flag that a deployment must be triggered; but if a dynamic group update is still pending, we trigger the deployment anyway and then recompute the groups.
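
The trigger decision described above can be modelled in a few lines. This is only a sketch under stated assumptions: `GroupUpdateState` and both functions are hypothetical names, not Rudder's classes, and the "deferred" variant is one possible correction, not necessarily what the pull request implements.

```python
from dataclasses import dataclass


@dataclass
class GroupUpdateState:
    """Hypothetical model of the dynamic group updater (not Rudder's code)."""
    pending_updates: int = 0      # updates queued while one is running
    groups_changed: bool = False  # a finished update changed some group


def on_update_finished_buggy(state, changed):
    """Behaviour described above: deploy as soon as a group changed."""
    state.groups_changed = state.groups_changed or changed
    if state.groups_changed:
        state.groups_changed = False
        return True  # generation starts even though updates are still queued
    return False


def on_update_finished_deferred(state, changed):
    """Possible correction: only deploy once no update is pending."""
    state.groups_changed = state.groups_changed or changed
    if state.pending_updates > 0:
        state.pending_updates -= 1  # the next queued update runs first
        return False
    if state.groups_changed:
        state.groups_changed = False
        return True
    return False


buggy = GroupUpdateState(pending_updates=1)
fixed = GroupUpdateState(pending_updates=1)
print(on_update_finished_buggy(buggy, changed=True))      # True
print(on_update_finished_deferred(fixed, changed=True))   # False
print(on_update_finished_deferred(fixed, changed=False))  # True
```

In the deferred variant the change flag survives until the queue is empty, so a single generation runs after the last pending update instead of one generation per update.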

#16 Updated by Nicolas CHARLES 6 months ago

  • Status changed from New to In progress
  • Assignee set to Nicolas CHARLES

#17 Updated by Nicolas CHARLES 6 months ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from Nicolas CHARLES to François ARMAND
  • Pull Request set to https://github.com/Normation/rudder/pull/1597

#18 Updated by Nicolas CHARLES 6 months ago

  • Status changed from Pending technical review to In progress
  • Assignee changed from François ARMAND to Nicolas CHARLES

I don't know why it submitted the PR; it should only be WIP.

So, it's probably not only related to 4.1, but also to 3.1.
The code works as expected; however, when accepting gazillions of nodes, it takes quite some time before the popup closes and generation starts. It's probably something we'll need to change (generation started at 2017-03-17T13:41:19.619+01:00, first node accepted at 2017-03-17 13:38, for 320 nodes).

Benefit: only one generation when accepting nodes, with the guarantee that the generated data is correct!

#19 Updated by François ARMAND 6 months ago

So, the full text explanation of the problem is:

- the node is accepted
- but before the dynamic group update has a chance to run, a generation is started (typically one was pending and the former just finished)
- the node is accepted, but NOT in the group "hasPolicyServer-root" (because it is dynamic)
- it is membership of that group that gives a node the system rule "common", which is responsible for the "promises.cf" file
- "promises.cf" is missing, and the generation fails.

So, we should perhaps do three things:

- correct the dynamic group generation logic regarding pending generations (cf. the proposed PR),
- directly add the node to the policy server group during acceptance (without waiting for a dynamic group update to know that),
- exclude nodes that are not in a policy server group (or even in the one corresponding to their policy server) from generation, reporting inconsistency errors for them
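
The failure sequence explained above can be reproduced with a toy model. This is a simplified sketch, not Rudder's generation code: the node IDs, `generated_files`, and `hook_check` are hypothetical, and the only rule modelled is that membership of "hasPolicyServer-root" brings the system rule "common" and therefore promises.cf.

```python
SYSTEM_GROUP = "hasPolicyServer-root"  # group name from the explanation above


def generated_files(node_id, groups):
    """Files written for a node in this simplified model.

    Membership of the policy server group brings the system rule "common",
    which is what provides promises.cf.
    """
    files = {"rudder_expected_reports.csv"}
    if node_id in groups.get(SYSTEM_GROUP, set()):
        files.add("promises.cf")
    return files


def hook_check(files):
    """Stand-in for the 10-cf-promise-check hook: 0 on success, 1 on failure."""
    return 0 if "promises.cf" in files else 1


accepted = {"node-a", "node-b"}       # hypothetical node IDs
groups = {SYSTEM_GROUP: {"node-a"}}   # node-b accepted, group not yet recomputed

# Generation starts before the dynamic group update has run.
before = {n: hook_check(generated_files(n, groups)) for n in sorted(accepted)}

# The dynamic group update runs, then a new generation is started.
groups[SYSTEM_GROUP] = set(accepted)
after = {n: hook_check(generated_files(n, groups)) for n in sorted(accepted)}

print(before)  # {'node-a': 0, 'node-b': 1}
print(after)   # {'node-a': 0, 'node-b': 0}
```

The second run succeeds without any other change, which matches the observation that the error disappears on the next generation.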

#20 Updated by Nicolas CHARLES 6 months ago

François ARMAND wrote:

So, we should perhaps do three things:

- correct the dynamic group generation logic regarding pending generations (cf. the proposed PR),
- directly add the node to the policy server group during acceptance (without waiting for a dynamic group update to know that),
- exclude nodes that are not in a policy server group (or even in the one corresponding to their policy server) from generation, reporting inconsistency errors for them

The second and third solutions are not correct: a node can be in the dynamic group hasPolicyServer-* without having all its groups computed, resulting in invalid generated policies. This can be pretty bad if the missing group is an exclusion group for specific rules.

#21 Updated by Nicolas CHARLES 6 months ago

  • Target version changed from 4.1.0 to 3.1.19

Targeting 3.1

#22 Updated by Nicolas CHARLES 6 months ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from Nicolas CHARLES to François ARMAND
  • Pull Request changed from https://github.com/Normation/rudder/pull/1597 to https://github.com/Normation/rudder/pull/1601

#23 Updated by Nicolas CHARLES 6 months ago

  • Status changed from Pending technical review to Pending release

#24 Updated by Vincent MEMBRÉ 5 months ago

  • Status changed from Pending release to Released
  • Priority set to 0

This bug has been fixed in Rudder 3.1.19, 4.0.4 and 4.1.1 which were released today.
