I've tried using that, and it doesn't seem to work consistently. Having reviewed my setup, I think I'm using "Apply Throughput Assertion" in a non-standard way, which is probably the root cause of my issues.
Most people would call this before routing the request to some other web server, but in my case the response from the web server defines what the rate limiting should be. The idea is to let any developer of our endpoints define rate limiting for themselves, without having to make any changes in Layer7.
So I get response headers back from the routed endpoint that say what the max per minute should be, along with other values that define the key used for the bucket name.
e.g. the following response headers could come back:
USAGE_COUNTER_TYPE: USER
ALLOWED_USAGE_PER_MINUTE: 10
So for me, if this request came in on the "/test/me" endpoint from user 52, the "counter id" would be calculated as something like the following, which is unique to this endpoint and user:
throttle.endpoint./test/me.by.user.52
and then max quota would be set to "10"
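To make the mapping concrete, here's a rough sketch of how I derive the counter ID and quota from those response headers (Python pseudologic, not Layer7 policy; the `build_counter_id` helper and its parameters are just illustrative names):

```python
def build_counter_id(endpoint, headers, user_id):
    # The routed response tells us how to key the bucket:
    # USAGE_COUNTER_TYPE names the dimension (e.g. USER),
    # ALLOWED_USAGE_PER_MINUTE is the quota for that bucket.
    counter_type = headers["USAGE_COUNTER_TYPE"].lower()      # "user"
    counter_id = f"throttle.endpoint.{endpoint}.by.{counter_type}.{user_id}"
    max_per_minute = int(headers["ALLOWED_USAGE_PER_MINUTE"])
    return counter_id, max_per_minute

cid, quota = build_counter_id(
    "/test/me",
    {"USAGE_COUNTER_TYPE": "USER", "ALLOWED_USAGE_PER_MINUTE": "10"},
    user_id=52,
)
# cid  -> "throttle.endpoint./test/me.by.user.52"
# quota -> 10
```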
Since I only get this info from the routed endpoint's response, I have to do my "Apply Throughput" after the routed response, and then on the next request compare the max against the current counter. If the count has been exceeded at that point, I fail the request before it's routed.
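The flow I'm describing amounts to a per-minute fixed window: increment after the routed response, enforce at the start of the next request. A minimal in-memory sketch of the semantics I'm expecting (my own illustration, not Layer7 internals; function names are made up):

```python
import time

# counter_id -> (window_start, count, max_per_minute)
counters = {}

def check_before_route(counter_id, now=None):
    """On the next request: fail fast if the learned quota was already hit."""
    now = time.time() if now is None else now
    entry = counters.get(counter_id)
    if entry is None:
        return True  # no quota learned yet for this bucket; allow and route
    window_start, count, max_per_minute = entry
    if now - window_start >= 60:
        # This is the expiry I expect to happen automatically after a minute.
        del counters[counter_id]
        return True
    return count < max_per_minute

def apply_after_response(counter_id, max_per_minute, now=None):
    """After the routed response: record one hit against the learned quota."""
    now = time.time() if now is None else now
    window_start, count, _ = counters.get(counter_id, (now, 0, max_per_minute))
    if now - window_start >= 60:
        window_start, count = now, 0
    counters[counter_id] = (window_start, count + 1, max_per_minute)
```

With a quota of 2, two requests in the same minute should exhaust the bucket, and a request after the minute boundary should be allowed again without any extra "priming" request - which is exactly the part that isn't happening for me.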
In theory this should work, and it does basically work - except that sometimes the "counters" don't expire. They live past a minute and won't reset until I run "Apply Throughput" one more time.
So a counter that should have expired requires one more request before it's refreshed.
I've tried this with "Scalability" set to "Consistency" to rule out a caching issue, but that didn't seem to help.