-
Notifications
You must be signed in to change notification settings - Fork 123
Allow quota buckets to be specified in flagd experiment config #10753
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Instead of adding new machinery to block RPCs like this, wdyt about relying on the existing quota implementation for that and just allowing it to be configured via the experiments framework? It'd be handy to be able to push an experiment flag change, have quota pick it up, and for a malicious user to start getting ResourceExhaustedErrors. |
f6eee99 to
7237992
Compare
It'd be quite complex blocking a specific group using a bunch of rate limits, so I kept a setting to always block based on user group. Moving the logic over to the quota_manager did add some complexity, but I think I'm happy with the implementation. A user can configure the same settings that went into the SQL table but do it like this: This example would restrict the rate for Execute RPC, for a specific group {
"quota.buckets": {
"state": "ENABLED",
"variants": {
"default": {},
"limited": {
"rpc:/build.bazel.remote.execution.v2.Execution/Execute": {
"name": "execute-restrict",
"maxRate": {
"numRequests": 50,
"period": "60s"
},
"maxBurst": 5
}
}
},
"defaultVariant": "default",
"targeting": [
{
"if": [
{"in": ["$group_id", ["GR1111"]]}
],
"variant": "limited"
}
]
}
} |
7237992 to
caf0b9c
Compare
3c4accb to
337d4fc
Compare
50c1eef to
e509acf
Compare
a27b1b6 to
eb11963
Compare
d4315f6 to
4f2376d
Compare
| return status.InternalError("experiment flag provider not configured") | ||
| } | ||
|
|
||
| // If the bucket is already loaded, skip it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean existing quota buckets can't be updated in real time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the listenForUpdates to also Subscribe to the flagd updates. So now any change to flagd will refresh the buckets.
| // | ||
| // quota.buckets must have valid structure, for example: | ||
| // | ||
| // "rpc:/google.bytestream.ByteStream/Read": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kinda confused about the structure here, is it supposed to be:
"rpcName": {
"name": "arbitrary name",
"maxRate": {...}
"maxBurst": ...
}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cleaned this up a bit. I think name is not useful, so I'll just generate a name to satisfy the structure.
| return res | ||
| } | ||
|
|
||
| func bucketRowFromMap(namespace string, bucketMap map[string]interface{}) (*tables.QuotaBucket, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wonder if it would make sense to put the quota representation as a type so we can do all the parsing, merging, and type-checking in functions on that type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might make sense if we had more control over the flagd unmarshal. However, since the Object returns us a map[string]interface{}, we need to parse/validate that anyway to a useful form.
I originally looked at going from map -> proto -> bucket which would be more clean, but this code could theoretically be happening on the hot path. The proto has a validateBucket(pbBucket), which is used if a bucket is being added via the API.
If we transition the logic to entirely flagd, then we can rremove the the qpb.Bucket AND the tables.QuotaBucket entriely.
For now, I just used a local structure to decouple from the DB table object.
6c9f5db to
35968ea
Compare
83352df to
96b162b
Compare
| periodDurationUsec: period, | ||
| maxBurst: maxBurst, | ||
| } | ||
| return config, config.validate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it make sense to return config if config.validate() returns an error?
96b162b to
7f911d7
Compare
This PR allows for an easier application of quota buckets using the flagd config, which is evaluated at runtime. When the context of a request matches the config, the quota manager will attempt to create a new quota bucket and store it. If this group+namespace already has a matching bucket, the bucket creation will be skipped and the existing bucket will be used. Quota manager refresh or a restart will clear the buckets, and enable fresh loading from
flagdfor existing buckets, new buckets will be added automatically.Since
flagdevaluates by the request context, the flagd evaluation criteria available to the caller doesn't include information on whether the flag was evaluated by "group_id" or by something else.To keep this safe, we need to ensure that the experiment configs applied are only ever applied per group_id, otherwise we'll have unexpected behavior:
To prevent this, we need to ensure flagd configuration for
quota.bucketsis only applied by group_id.We can do this by adding strict tests like: https://github.com/buildbuddy-io/buildbuddy-internal/pull/6216
Ref: https://github.com/buildbuddy-io/buildbuddy-internal/issues/354
Testing info:
Tested in dev:
https://app.buildbuddy.dev/invocation/e83dfef5-8732-4de6-b076-d2d879c096b9#