Architecture #7297

open

Speed up of promise validation

Added by Florian Heigl over 8 years ago. Updated about 6 years ago.

Status: New
Priority: N/A
Assignee: -
Category: Performance and scalability

Description

Currently, promise validation time scales with the number of nodes. Each validation costs:
  • potential name service lookups
  • disk I/O for reading the promises
  • CPU time for the general validation

... multiplied by the number of nodes, run sequentially.

I think I have a solution for this.

You only need to run a validation for

the first node of
each group
containing >= 1 node.
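
A minimal sketch of the selection rule above, assuming groups are available as a plain mapping from group name to an ordered list of node names. The function name `first_node_per_group` and the sample data are hypothetical and are not part of Rudder:

```python
from typing import Dict, List


def first_node_per_group(groups: Dict[str, List[str]]) -> Dict[str, str]:
    """Return the first node of every group that contains at least one node."""
    # Empty groups are skipped: there is nothing to validate for them.
    return {name: nodes[0] for name, nodes in groups.items() if nodes}


# Hypothetical group membership, for illustration only.
groups = {
    "debian-web": ["web01", "web02", "web03"],
    "centos-db": ["db01"],
    "empty-group": [],
}
to_validate = first_node_per_group(groups)
print(to_validate)  # {'debian-web': 'web01', 'centos-db': 'db01'}
```

With this rule the number of validations depends on the number of non-empty groups rather than on the number of nodes.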

Now, I've just put it here and not tried to come up with a twisted confusing explanation.
I hope you're happy


Related issues 1 (1 open, 0 closed)

Related to Rudder - Architecture #4427: cf-promises check on ALL generated promises leads to huge generation time (Status: New, Assignee: Nicolas CHARLES)
Actions #1

Updated by François ARMAND over 8 years ago

You're right, validation time is a major problem right now, and in fact it scales more than linearly with the number of nodes. Tickets like #7265 at least let us pick the low-hanging fruit and take advantage of the full CPU, but as you say, at some point I/O or CPU becomes the bottleneck.

So, the issue is tracked in #4427, and if you are OK with it, I will mark that one as a duplicate, after adding your speed-up idea. Unfortunately, your proposal is not sufficient, because there is a whole set of checks that are node-bound, or even worse, node-runtime-bound.

For example, within a given group you can have node-defined parameters, so that some nodes (not the first, for the sake of the demo) are not valid while others (the first) are.

But even if you are able to identify each group of nodes that are semantically equivalent with regard to their promises (because, after all, not all promises use parameters, and depending on your use case, most of the generated promises may be the same), you may still have a valid check on the server leading to an invalid check on the node. The cause is that cf-promises evaluates AND EXECUTES a whole bunch of things during validation. Yeah.
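
One possible way to identify nodes whose generated promises are the same is to hash each node's generated files and validate one node per distinct digest. The sketch below assumes the generated promises are available in memory as a mapping of node name to `{file path: content}`; that layout and the names `policy_digest` and `group_by_digest` are illustrative assumptions, not how Rudder stores or processes promises. It only catches byte-identical promises, and it does nothing about the runtime problem described above: a check that passes on the server can still fail on the node.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def policy_digest(files: Dict[str, bytes]) -> str:
    """Hash a node's generated files (path and content) in a stable order."""
    h = hashlib.sha256()
    for path in sorted(files):
        name = path.encode("utf-8")
        data = files[path]
        # Length prefixes keep (path, content) boundaries unambiguous.
        h.update(len(name).to_bytes(8, "big"))
        h.update(name)
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def group_by_digest(policies: Dict[str, Dict[str, bytes]]) -> Dict[str, List[str]]:
    """Map each distinct digest to the sorted list of nodes sharing it."""
    classes: Dict[str, List[str]] = defaultdict(list)
    for node in sorted(policies):
        classes[policy_digest(policies[node])].append(node)
    return dict(classes)


# Hypothetical generated promises: node-a and node-b are identical,
# node-c differs because of a node-defined parameter.
policies = {
    "node-a": {"promises.cf": b"bundle agent main { }", "params.cf": b"port=80"},
    "node-b": {"promises.cf": b"bundle agent main { }", "params.cf": b"port=80"},
    "node-c": {"promises.cf": b"bundle agent main { }", "params.cf": b"port=8080"},
}
classes = group_by_digest(policies)
representatives = sorted(nodes[0] for nodes in classes.values())
print(representatives)  # ['node-a', 'node-c']
```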

So, the thing that you really, really want is a clear way (and UX) to know that a set of new promises, from the point of view of a node (or a group of nodes), didn't pass cf-promises AND SO WAS NOT USED AS THE NEW SET. That can be done if we do the things proposed in #4427 and #5641, plus work on the UI to make that kind of error more visible and actionable.

What do you think about that?

Actions #2

Updated by Florian Heigl over 8 years ago

As you can see, I have been thinking about it for a really long time.

...

...

I don't like it.

If an error can be detected prior to sending the policy out to clients then we need to do that.
Gatekeeper etc.
I think a reduced test (even a random test over 0.1% of nodes) does suffice here.
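
A rough sketch of such a reduced test, assuming the server has the list of node names at hand. The function name `sample_nodes`, the rounding rule (round up, but check at least one node) and the node names are assumptions made for illustration:

```python
import math
import random
from typing import List, Optional, Sequence


def sample_nodes(
    nodes: Sequence[str], fraction: float = 0.001, seed: Optional[int] = None
) -> List[str]:
    """Pick a random subset of nodes to validate (0.1% by default)."""
    if not nodes:
        return []
    # Round up and always check at least one node, never more than exist.
    k = min(len(nodes), max(1, math.ceil(len(nodes) * fraction)))
    return random.Random(seed).sample(list(nodes), k)


# Hypothetical fleet of 2500 nodes: 0.1% rounds up to 3 nodes.
nodes = [f"node{i:04d}" for i in range(2500)]
picked = sample_nodes(nodes, seed=42)
print(len(picked))  # 3
```

A random sample catches errors that affect all nodes, but it can miss an error that only affects a few nodes, such as a bad node-defined parameter.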

I do like it if the clients run an independent test to avoid running anything that is not sane.(*)
They're considered autonomous agents so it makes sense to give them the final decision.

Summarizing:
You'd need a strong reason to pass out a policy to all nodes if you already could have spotted the error right on the server.
If both parties can check, they should. But policy issues on the server are easier to spot & we don't even need to pedantically check everything there.
As you explained, it's not even possible.

Footnote / Clarification:
I'm afraid I sound like "keep it the old way" or "do the thing i said i'm in love with it".
No. Really not. I like the independent checking and... I'm ready to be a total fanboi of each client running a "config version" based on their decision & test, with a version suggestion from the server.
It's not in scope here... But there's a wonderful rollout mechanism right there, and a probability that everything goes down in fire as each system picks a different one.
Trap for distributed systems. One could give a max. difference setting to cope with that & stop new "ripples" of configuration if the distribution is getting different speeds.
Is there anything on fluid dynamics in systems yet? *lol

Actions #3

Updated by Florian Heigl over 8 years ago

I would love to know why the footnote is now in fat print.
That totally makes no sense.

Actions #4

Updated by Jonathan CLARKE over 8 years ago

Florian Heigl wrote:

I would love to know why the footnote is now in fat print.
That totally makes no sense.

It's because you included a "*" character at the beginning of "Footnote", and another at the end of "lol". In Redmine's wiki syntax, this means "bold", like *bold*. (Click on the little pen icon to edit your comment to see the "source code")

Actions #5

Updated by Vincent MEMBRÉ over 8 years ago

  • Target version changed from 3.2.0~beta1 to 3.2.0~rc1
Actions #6

Updated by Benoît PECCATTE over 8 years ago

  • Target version changed from 3.2.0~rc1 to 3.2.0~rc2
Actions #7

Updated by Benoît PECCATTE over 8 years ago

  • Target version changed from 3.2.0~rc2 to 3.2.0
Actions #8

Updated by Vincent MEMBRÉ about 8 years ago

  • Target version changed from 3.2.0 to 3.2.1
Actions #9

Updated by Vincent MEMBRÉ about 8 years ago

  • Target version changed from 3.2.1 to 3.2.2
Actions #10

Updated by Vincent MEMBRÉ about 8 years ago

  • Target version changed from 3.2.2 to 3.2.3
Actions #11

Updated by Vincent MEMBRÉ almost 8 years ago

  • Target version changed from 3.2.3 to 3.2.4
Actions #12

Updated by Vincent MEMBRÉ almost 8 years ago

  • Target version changed from 3.2.4 to 3.2.5
Actions #13

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 3.2.5 to 3.2.6
Actions #14

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 3.2.6 to 3.2.7
Actions #15

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 3.2.7 to 3.2.8
Actions #16

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 3.2.8 to 3.2.9
Actions #17

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 3.2.9 to 3.2.10
Actions #18

Updated by Vincent MEMBRÉ over 7 years ago

  • Target version changed from 3.2.10 to 3.2.11
Actions #19

Updated by Vincent MEMBRÉ about 7 years ago

  • Target version changed from 3.2.11 to 339
Actions #20

Updated by Vincent MEMBRÉ about 7 years ago

  • Target version changed from 339 to 4.0.4
Actions #21

Updated by Vincent MEMBRÉ about 7 years ago

  • Target version changed from 4.0.4 to 4.0.5
Actions #22

Updated by Vincent MEMBRÉ almost 7 years ago

  • Target version changed from 4.0.5 to 4.0.6
Actions #23

Updated by Vincent MEMBRÉ almost 7 years ago

  • Target version changed from 4.0.6 to 4.0.7
Actions #24

Updated by Vincent MEMBRÉ almost 7 years ago

  • Target version changed from 4.0.7 to 357
Actions #25

Updated by Alexis Mousset almost 7 years ago

  • Target version changed from 357 to 4.1.6
Actions #26

Updated by Vincent MEMBRÉ over 6 years ago

  • Target version changed from 4.1.6 to 4.1.7
Actions #27

Updated by Vincent MEMBRÉ over 6 years ago

  • Target version changed from 4.1.7 to 4.1.8
Actions #28

Updated by Vincent MEMBRÉ over 6 years ago

  • Target version changed from 4.1.8 to 4.1.9
Actions #29

Updated by Vincent MEMBRÉ over 6 years ago

  • Target version changed from 4.1.9 to 4.1.10
Actions #30

Updated by Benoît PECCATTE about 6 years ago

  • Target version changed from 4.1.10 to Ideas (not version specific)
Actions #31

Updated by Nicolas CHARLES about 5 years ago

  • Related to Architecture #4427: cf-promises check on ALL generated promises leads to huge generation time added