Bug #10485

Inventory endpoint accepts inventory even if ldap or postgresql connectivity failed

Added by Nicolas CHARLES 6 months ago. Updated 4 months ago.

Status:
Released
Priority:
N/A
Category:
Server components
Target version:
Target version (plugin):
Severity:
Major - prevents use of part of Rudder | no simple workaround
User visibility:
Getting started - demo | first install | level 1 Techniques
Effort required:
Priority:
53

Description

I had a misconfigured web interface that still accepted the root inventory, so the inventory file was deleted afterward, and I ended up with "no machine inventory" for my root server.

Agent log run:

rudder     info: <address>Apache/2.4.18 (Ubuntu) Server at 127.0.0.1 Port 443</address>
rudder     info: </body></html>
rudder     info: Automatically promoting context scope for 'inventory_sent' to namespace visibility, due to persistence
rudder     info: Transformer '/var/rudder/inventories/server-root.ocs.gz' => '/usr/bin/curl -L -k -1 -f -s --proxy '' --user rudder:rudder -T /var/rudder/inventories/server-root.ocs.gz https://127.0.0.1/inventories/' seemed to work ok
rudder     info: Transforming '/usr/bin/curl -L -k -1 -f -s --proxy '' --user rudder:rudder -T /var/rudder/inventories/server-root.ocs.sign https://127.0.0.1/inventories/' 
rudder     info: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
rudder     info: <html><head>
rudder     info: <title>201 Created</title>
rudder     info: </head><body>
rudder     info: <h1>Created</h1>
rudder     info: <p>Resource /inventories/server-root.ocs.sign has been created.</p>
rudder     info: <hr />
rudder     info: <address>Apache/2.4.18 (Ubuntu) Server at 127.0.0.1 Port 443</address>
rudder     info: </body></html>
rudder     info: Automatically promoting context scope for 'inventory_sent' to namespace visibility, due to persistence
rudder     info: Transformer '/var/rudder/inventories/server-root.ocs.sign' => '/usr/bin/curl -L -k -1 -f -s --proxy '' --user rudder:rudder -T /var/rudder/inventories/server-root.ocs.sign https://127.0.0.1/inventories/' seemed to work ok
rudder     info: Transforming '/bin/rm -f /var/rudder/inventories/server-root.ocs.gz' 
rudder     info: Transformer '/var/rudder/inventories/server-root.ocs.gz' => '/bin/rm -f /var/rudder/inventories/server-root.ocs.gz' seemed to work ok
rudder     info: Transforming '/bin/rm -f /var/rudder/inventories/server-root.ocs.sign' 
rudder     info: Transformer '/var/rudder/inventories/server-root.ocs.sign' => '/bin/rm -f /var/rudder/inventories/server-root.ocs.sign' seemed to work ok
rudder     info: Created file '/var/rudder/tmp/inventory_sent', mode 0600
rudder     info: Touched (updated time stamps) for path '/var/rudder/tmp/inventory_sent'
rudder     info: Transforming '/bin/rm -f /var/rudder/tmp/inventory/server-root.ocs' 
rudder     info: Transformer '/var/rudder/tmp/inventory/server-root.ocs' => '/bin/rm -f /var/rudder/tmp/inventory/server-root.ocs' seemed to work ok
E| compliant     Inventory                 inventory                                    The inventory has been successfully sent
rudder     info: Deleted file '/opt/rudder/etc/force_inventory'
   info          Inventory                 inventory                                    An inventory was already sent less than 8 hours ago
rudder     info: Can't stat file '/var/rudder/cfengine-community/inputs/distributePolicy/1.0/nodeslist.json' on 'localhost' in files.copy_from promise
E| compliant     DistributePolicy          Configure ncf                                Configure ncf was correct
   warning       DistributePolicy          Propagate nodeslist                          Cannot copy local nodes list
rudder     info: Transforming '/var/rudder/tools/send-clean.sh http://localhost:8080/endpoint/upload/ /var/rudder/inventories/incoming/server-root.ocs.gz /var/rudder/inventories/received/ /var/rudder/inventories/failed/' 
rudder     info:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
rudder     info:                                  Dload  Upload   Total   Spent    Left  Speed
100  182k  100    62  100  182k     55   163k  0:00:01  0:00:01 --:--:--  163k
rudder     info: Transformer '/var/rudder/inventories/incoming/server-root.ocs.gz' => '/var/rudder/tools/send-clean.sh http://localhost:8080/endpoint/upload/ /var/rudder/inventories/incoming/server-root.ocs.gz /var/rudder/inventories/received/ /var/rudder/inventories/failed/' seemed to work ok
E| compliant     DistributePolicy          Send inventories to CMDB                     Incoming inventories were successfully added to Rudder
E| compliant     server-roles              Check logrotate configur|                    The logrotate configuration is correct
E| compliant     server-roles              Check LDAP in rudder-web|                    The Rudder Webapp configuration files are OK (checked LDAP password)
E| compliant     server-roles              Check LDAP credentials                       The OpenLDAP configuration file is OK (checked rootdn password)
rudder     info: Executing 'no timeout,uid=112' ... '/usr/bin/psql -q -c "ALTER USER rudder WITH PASSWORD '8f12d6dcfb56'"'
rudder     info: Completed execution of '/usr/bin/psql -q -c "ALTER USER rudder WITH PASSWORD '8f12d6dcfb56'"'
E| compliant     server-roles              Check SQL in rudder-weba|                    The Rudder Webapp configuration files are OK (checked SQL password)
E| repaired      server-roles              Check SQL credentials                        The Rudder PostgreSQL user account's password has been changed
E| compliant     server-roles              Check rudder-passwords.c|                    The Rudder passwords file is present and secure
E| compliant     server-roles              Check allowed networks c|                    The Rudder allowed networks configuration is OK
E| compliant     server-roles              Check WebDAV credentials                     The Rudder WebDAV user and password are OK
R: [INFO] Executing is-active-process on apache2 using the systemctl method
E| compliant     server-roles              Check apache process                         Check apache process running was correct
R: [INFO] Executing is-enabled on apache2 using the systemctl method
E| compliant     server-roles              Check apache boot script                     Check apache boot starting parameters was correct
R: [INFO] Executing is-active-process on .*java.*/opt/rudder/jetty7/start.jar using the systemctl method
E| compliant     server-roles              Check jetty process                          Check jetty process running was correct
E| compliant     server-roles              Check configuration-repo|                    The /var/rudder/configuration-repository directory is present
E| compliant     server-roles              Check configuration-repo|                    The /var/rudder/configuration-repository GIT lock file is not present or not older than 5 minutes
rudder     info: Executing 'no timeout' ... '/usr/bin/curl --proxy '' -s http://localhost:8080/rudder/api/techniqueLibrary/reload |/bin/grep -q OK'
   error: Finished command related to promiser '/usr/bin/curl --proxy '' -s http://localhost:8080/rudder/api/techniqueLibrary/reload |/bin/grep -q OK' -- an error occurred, returned 1
rudder     info: Completed execution of '/usr/bin/curl --proxy '' -s http://localhost:8080/rudder/api/techniqueLibrary/reload |/bin/grep -q OK'
   info          server-roles              Check Technique library |                    The /opt/rudder/etc/force_technique_reload file is present. Reloading Technique library...
   warning       server-roles              Check Technique library |                    The Technique library failed to reload. Will try again next time
   error: Method 'root_technique_reload' failed in some repairs
rudder     info: Executing 'no timeout' ... '/usr/bin/curl --proxy '' --max-time 240 -s http://localhost:8080/rudder/api/status |/bin/grep -q OK'
   error: Finished command related to promiser '/usr/bin/curl --proxy '' --max-time 240 -s http://localhost:8080/rudder/api/status |/bin/grep -q OK' -- an error occurred, returned 1
rudder     info: Completed execution of '/usr/bin/curl --proxy '' --max-time 240 -s http://localhost:8080/rudder/api/status |/bin/grep -q OK'
rudder     info: Executing 'no timeout' ... '/bin/systemctl --no-ask-password restart rudder-jetty.service'
rudder     info: Completed execution of '/bin/systemctl --no-ask-password restart rudder-jetty.service'
R: [INFO] Executing restart on rudder-jetty using the systemctl method
R: [INFO] Promise repaired, made a change: Run action restart on service rudder-jetty
R: [INFO] Promise repaired, made a change: Restart service rudder_jetty
E| error         server-roles              Check rudder status                          The http://localhost:8080/rudder/api/status web application failed to respond for the second time. Restarting jetty NOW !
   error: Method 'generic_alive_check' failed in some repairs
rudder     info: Executing 'no timeout' ... '/usr/bin/curl --proxy '' --max-time 240 -s http://localhost:8080/endpoint/api/status |/bin/grep -q OK'
rudder     info: Automatically promoting context scope for 'site_ok' to namespace visibility, due to persistence
rudder     info: Completed execution of '/usr/bin/curl --proxy '' --max-time 240 -s http://localhost:8080/endpoint/api/status |/bin/grep -q OK'
E| compliant     server-roles              Check endpoint status                        The http://localhost:8080/endpoint/api/status web application is running
R: [INFO] Executing is-active-process on /opt/rudder/libexec/slapd using the systemctl method
E| compliant     server-roles              Check slapd process                          Check slapd process running was correct
E| compliant     server-roles              Check PostgreSQL configu|                    There is no need of specific postgresql configuration on this system
R: [INFO] Executing is-active-process on postgres:.* writer process using the systemctl method
E| compliant     server-roles              Check postgresql process                     Check postgresql process running was correct
R: [INFO] Executing is-enabled on postgresql using the systemctl method

rudder-jetty logs

        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.eclipse.jetty.start.Main.invokeMain(Main.java:473)
        at org.eclipse.jetty.start.Main.start(Main.java:615)
        at org.eclipse.jetty.start.Main.main(Main.java:96)
2017-03-22 17:55:50.709:INFO:oejs.AbstractConnector:Started SelectChannelConnector@127.0.0.1:8080
[2017-03-22 17:59:06] INFO  com.normation.inventory.provisioning.endpoint.FusionReportEndpoint - New input inventory: 'server-root.ocs'
[2017-03-22 17:59:07] INFO  com.normation.inventory.provisioning.endpoint.FusionReportEndpoint - Inventory 'server-root.ocs' parsed in 698 milliseconds ms, now checking signature
[2017-03-22 17:59:07] INFO  com.normation.inventory.provisioning.endpoint.FusionReportEndpoint - Inventory 'server-root.ocs' signature checked in 274 milliseconds ms, now saving
2017-03-22 17:59:09.446:INFO:oejs.Server:Graceful shutdown SelectChannelConnector@127.0.0.1:8080

(Note that the webapp is shut down because it did not work correctly.)

Targeting 4.1, but this may happen in every version.

Associated revisions

Revision 773dddb8
Added by François ARMAND 5 months ago

Fixes #10485: Inventory endpoint accepts inventory even if ldap or postgresql connectivity failed

History

#1 Updated by François ARMAND 6 months ago

  • User visibility changed from First impressions of Rudder to Getting started - demo | first install | level 1 Techniques

#2 Updated by Benoît PECCATTE 6 months ago

  • Priority set to 54

#3 Updated by Nicolas CHARLES 6 months ago

  • Target version set to 3.1.19

Happens in 3.1

#4 Updated by Vincent MEMBRÉ 5 months ago

  • Target version changed from 3.1.19 to 3.1.20

#5 Updated by François ARMAND 5 months ago

The problem here is that we have two steps:

- 1/ Check that the inventory is well formed and that its signature is OK. If so, we ACK it, telling the sender "it is OK, I will process it", and put it in a queue. Note that even if an error happens here, only the root node will know.
- 2/ Try to save the inventory. If an error happens here, the agent does not know that there was a problem.

Before accepting an inventory, we could check that LDAP is up and report an error if it is not.
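
To illustrate the idea of checking the backends before the ACK, here is a minimal Python sketch. It is not the actual endpoint code (which is not shown in this ticket): the function names, the `checks` dictionary and the HTTP status codes 202, 400 and 503 are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical names throughout: this is an illustration of the idea,
# not the real Rudder inventory endpoint.
HealthCheck = Callable[[], bool]


def failing_backends(checks: Dict[str, HealthCheck]) -> List[str]:
    """Return the names of backends whose check returns False or raises."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            # A connection error counts as "backend down".
            ok = False
        if not ok:
            failed.append(name)
    return failed


def accept_inventory(
    name: str,
    well_formed: bool,
    signature_ok: bool,
    checks: Dict[str, HealthCheck],
    queue: List[str],
) -> Tuple[int, str]:
    """Step 1 of the flow: validate, check backends, then ACK and enqueue."""
    if not (well_formed and signature_ok):
        return 400, f"inventory '{name}' rejected: malformed or bad signature"
    down = failing_backends(checks)
    if down:
        # Refuse before the ACK, so the sender learns about the problem
        # and does not treat the inventory as delivered.
        return 503, f"inventory '{name}' refused, backend(s) down: {', '.join(down)}"
    queue.append(name)  # step 2 (saving) happens later, from the queue
    return 202, f"inventory '{name}' accepted for processing"
```

With a check of this kind, a sender that relies on the HTTP status (for example curl with `-f`, as used in the agent log above) would see a failure instead of a success when LDAP or PostgreSQL is unreachable, and could keep the inventory for a later retry.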

#6 Updated by François ARMAND 5 months ago

  • Status changed from New to In progress
  • Assignee set to François ARMAND

#7 Updated by François ARMAND 5 months ago

  • Status changed from In progress to Pending technical review
  • Assignee changed from François ARMAND to Nicolas CHARLES
  • Pull Request set to https://github.com/Normation/ldap-inventory/pull/104

#8 Updated by François ARMAND 5 months ago

  • Status changed from Pending technical review to Pending release

#9 Updated by Vincent MEMBRÉ 4 months ago

  • Status changed from Pending release to Released
  • Priority changed from 54 to 53

This bug has been fixed in Rudder 3.1.20, 4.0.5 and 4.1.2 which were released today.
