Severe traffic bursts on your APIs and web services can be caused by malicious attackers attempting to disrupt a service (DoS-style attacks), by expected and regular spikes in consumption (daily patterns, for example), or even by “friendly fire” (for example, when a calling application is acting up). Whether intentional or not, traffic bursts, when severe enough, have negative effects on your existing APIs. The most common effect of a traffic burst is an increase in response time latencies, but sustained traffic bursts can go as far as producing major interruptions of service.
One of the key functions of CA API Gateways is the protection of existing web APIs and services. Through various rate limiting and quota enforcement features, policies enforced by CA API Gateways can be tailored to protect against traffic bursts and ensure that services keep operating within their design performance parameters.
The Rate Limit assertion can be used to set limits on backend traffic. In its simplest application, the Rate Limit assertion sets a single limit associated with the node itself, regardless of what is being invoked or by whom. This is illustrated below; in this case, the assertion limits incoming traffic to 1000tps (transactions per second). Beyond this limit, the assertion fails and an error message is returned as defined by the rest of the policy.
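On the gateway, this limit is configured graphically in the assertion's properties rather than in code, but the underlying idea can be sketched as a fixed-window counter. The following Python sketch is illustrative only (class and parameter names are hypothetical, not gateway API); the clock is injectable so the example is deterministic.

```python
import time

class RateLimiter:
    """Minimal fixed-window limiter: at most `limit` transactions per second."""

    def __init__(self, limit, clock=time.time):
        self.limit = limit            # e.g. 1000 tps
        self.clock = clock            # injectable clock, for testing
        self.window = None            # the current one-second window
        self.count = 0

    def allow(self):
        now = int(self.clock())
        if now != self.window:        # a new second began: reset the counter
            self.window = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True               # request proceeds down the policy
        return False                  # limit exceeded: the assertion fails

# Simulate 1500 requests arriving within the same second.
limiter = RateLimiter(limit=1000, clock=lambda: 42.0)
allowed = sum(limiter.allow() for _ in range(1500))
print(allowed)  # 1000 — requests beyond the limit are rejected
```

In the real assertion, requests rejected this way fall through to whatever error handling the rest of the policy defines.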
This 1000tps is a hard limit, but you can also configure the assertion to allow for bursts if the backend application can withstand them. When you check the “Allow burst traffic” check box, you can define a maximum burst period in seconds to control the characteristics of allowable bursts. This way, you can define a global limit but still allow short bursts that are known not to disrupt service, minimizing false positives that would reject legitimate traffic.
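A burst allowance of this kind is commonly modeled as a token bucket: tokens refill at the sustained rate, and the bucket's capacity (rate × burst period) bounds how large an instantaneous burst may be. This is a sketch of that general technique, not the gateway's internal implementation; the simulated clock keeps it deterministic.

```python
class TokenBucket:
    """Token-bucket limiter: a steady rate plus a bounded burst allowance."""

    def __init__(self, rate, burst_seconds, clock):
        self.rate = rate                       # sustained limit, e.g. 1000 tps
        self.capacity = rate * burst_seconds   # burst period converts to capacity
        self.tokens = self.capacity            # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill tokens for the elapsed time, capped at the burst capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Simulated clock so the example is deterministic.
t = [0.0]
bucket = TokenBucket(rate=10, burst_seconds=2, clock=lambda: t[0])
burst = sum(bucket.allow() for _ in range(50))  # 50 requests at the same instant
print(burst)           # 20 — up to rate * burst_seconds served at once
t[0] += 1.0                                     # one second later…
refilled = bucket.allow()
print(refilled)        # True — tokens have refilled at the sustained rate
```

The burst period thus trades a larger momentary spike for the same long-run average rate.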
In this assertion, you can also control the behavior of the gateway when the limit is exceeded. The “Throttle” option fails requests that exceed the limit, meaning an error is returned immediately. The “Shape” option instead queues requests exceeding the limit so that they are processed later, in such a way that the limit is never exceeded. Shaping is useful when you prefer to queue incoming traffic during expected bursts so that the backend effectively never sees the burst. On the front end, messages exceeding the limit are no longer rejected; their response times are increased instead.
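The shaping behavior can be sketched as a scheduler that assigns each request the earliest slot consistent with the rate, then delays it until that slot. This hypothetical sketch (names are illustrative) uses a fake clock and records the sleeps instead of blocking, so the delays are visible:

```python
import time

class ShapingLimiter:
    """Shape rather than throttle: delay over-limit requests instead of failing."""

    def __init__(self, rate, clock=time.time, sleep=time.sleep):
        self.interval = 1.0 / rate    # minimum spacing between forwarded requests
        self.clock = clock
        self.sleep = sleep
        self.next_slot = 0.0

    def acquire(self):
        now = self.clock()
        slot = max(now, self.next_slot)    # earliest time this request may proceed
        self.next_slot = slot + self.interval
        wait = slot - now
        if wait > 0:
            self.sleep(wait)               # front end sees latency, not an error
        return wait

# Deterministic simulation: fake clock, and sleeps are recorded, not performed.
t = [0.0]
waits = []
limiter = ShapingLimiter(rate=2, clock=lambda: t[0], sleep=waits.append)
for _ in range(4):                         # 4 requests arrive at once, limit is 2 tps
    limiter.acquire()
print([round(w, 1) for w in waits])        # [0.5, 1.0, 1.5]
```

The first request proceeds immediately; each subsequent one is delayed just enough that the backend sees a smooth 2tps stream.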
The Rate Limit assertion also lets you impose fine-grained limits instead of global limits. The “Limit Each” setting lets you choose a transaction parameter to which the limit applies, as illustrated below.
For example, you could impose a limit on each user when the policy authenticates an identity as part of the transaction. This lets you impose SLA-type limits where, for example, a 1000tps limit applies to each individual user. You can also impose different levels for different users by branching your policy and using a different Rate Limit assertion instance in each branch. In the policy fragment illustrated below, users with contract level ‘platinum’ are allowed 1000tps, users with contract level ‘gold’ are allowed 500tps, and all other users are allowed 100tps. This contract level could be looked up from an LDAP attribute or a database using a policy-level SQL statement, for example.
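The per-user, per-tier logic of that policy branch can be sketched as follows. The names (`TIER_LIMITS`, `allow`, the user and tier strings) are hypothetical illustrations, and the example simplifies to a single one-second window; the key point is that each authenticated user gets an independent budget sized by contract level.

```python
import collections

TIER_LIMITS = {"platinum": 1000, "gold": 500}   # tps per tier
DEFAULT_LIMIT = 100                             # all other users

counts = collections.Counter()   # per-user count within the current second

def allow(user, contract_level):
    limit = TIER_LIMITS.get(contract_level, DEFAULT_LIMIT)
    if counts[user] < limit:     # each user draws on their own budget
        counts[user] += 1
        return True
    return False                 # this user's tier limit is exhausted

# Two users with different tiers, within the same one-second window:
gold_ok = sum(allow("alice", "gold") for _ in range(600))
plat_ok = sum(allow("bob", "platinum") for _ in range(600))
print(gold_ok, plat_ok)  # 500 600 — alice hits the gold ceiling, bob does not
```

In the actual policy, the branch selection would be driven by the contract level retrieved from LDAP or a database, and each branch would contain its own Rate Limit assertion.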
The “Custom” option of the “Limit Each” parameter lets you define limits that are imposed on an application-level factor. This is where your policy enforcement point leverages its metadata awareness to enforce logical limits. For example, you could require that the same account identifier not be referenced by multiple simultaneous transactions, as illustrated below. This account identifier could be extracted from the message using an XPath expression, for example.
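Logically, that rule is a concurrency guard keyed on an application-level value. A minimal sketch of the idea, assuming the account identifier has already been extracted from the message (class and method names are hypothetical):

```python
import threading

class KeyedConcurrencyLimit:
    """Custom-key limit: at most one in-flight transaction per account id."""

    def __init__(self):
        self.in_flight = set()
        self.lock = threading.Lock()

    def try_enter(self, account_id):
        with self.lock:
            if account_id in self.in_flight:
                return False           # same account already mid-transaction
            self.in_flight.add(account_id)
            return True

    def leave(self, account_id):
        with self.lock:
            self.in_flight.discard(account_id)

guard = KeyedConcurrencyLimit()
first = guard.try_enter("acct-42")   # True  — first transaction for this account
second = guard.try_enter("acct-42")  # False — concurrent duplicate is rejected
guard.leave("acct-42")
third = guard.try_enter("acct-42")   # True  — allowed again once the first ends
print(first, second, third)
```

Any message field reachable by XPath or a context variable could serve as the key, which is what makes the “Custom” option useful for business-level constraints.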
In addition to the Rate Limit assertion, the CA API Gateway policy language has a Throughput Quota assertion. This assertion differs from the Rate Limit assertion in that it enforces a quota based on a counter that is persisted and distributed across a cluster of gateway nodes. Another major difference is that the Throughput Quota assertion uses closed time units. This can be used to enforce hard limits of a contractual nature over a longer history: not only can you set quotas per second, but also per minute, hour, day, and even month.
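A closed time unit means the counter is tied to a calendar bucket (this minute, this day, this month) and resets at the boundary, rather than sliding continuously. The following is a simplified sketch of that bucketing, assuming an in-memory counter where the real assertion uses one persisted and shared across the cluster:

```python
import collections
import datetime

class ThroughputQuota:
    """Quota over closed calendar units (minute/hour/day/month)."""

    def __init__(self, limit, unit):
        self.limit = limit
        self.unit = unit                        # "minute", "hour", "day", "month"
        self.counters = collections.Counter()   # in reality: persisted cluster-wide

    def _bucket(self, ts):
        fmt = {"minute": "%Y-%m-%d %H:%M", "hour": "%Y-%m-%d %H",
               "day": "%Y-%m-%d", "month": "%Y-%m"}[self.unit]
        return ts.strftime(fmt)                 # closed unit: resets at the boundary

    def allow(self, ts):
        key = self._bucket(ts)
        if self.counters[key] < self.limit:
            self.counters[key] += 1
            return True
        return False

quota = ThroughputQuota(limit=2, unit="day")
d1 = datetime.datetime(2024, 1, 1, 9, 0)
d2 = datetime.datetime(2024, 1, 2, 9, 0)
day1 = [quota.allow(d1) for _ in range(3)]
day2 = quota.allow(d2)
print(day1)   # [True, True, False] — third call exceeds the daily quota
print(day2)   # True — a new day opens a fresh counter
```

Because the bucket key is a calendar unit, two transactions a millisecond apart on either side of midnight count against different quotas, which matches the contractual "N calls per day/month" style of limit.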
Finally, another strategy to limit the rate on a backend API is to cache its responses to avoid hitting the backend service on every request. The CA API Gateway has policy-level caching functionality which lets you store messages in memory based on an application-level index and retrieve those messages when the same patterns recur. Whenever you receive a response message from your backend service that is deemed “cacheable”, you can assign a cache key using context variables that identify its unique parameters and set caching parameters such as illustrated below.
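The mechanism amounts to a keyed cache with a maximum age, where the key is composed from context variables of the request. This is a hypothetical sketch of the idea (the key format and response payload are invented for illustration), again with an injectable clock:

```python
import time

class ResponseCache:
    """Minimal keyed response cache with a max age."""

    def __init__(self, max_age_seconds, clock=time.time):
        self.max_age = max_age_seconds
        self.clock = clock
        self.store = {}                         # key -> (stored_at, response)

    def get(self, key):
        entry = self.store.get(key)
        if entry and self.clock() - entry[0] < self.max_age:
            return entry[1]                     # fresh hit: backend is not called
        return None                             # miss or stale: call the backend

    def put(self, key, response):
        self.store[key] = (self.clock(), response)

t = [0.0]
cache = ResponseCache(max_age_seconds=60, clock=lambda: t[0])
# Key built from request parameters, e.g. service name plus account id.
cache.put("getBalance:acct-42", "<balance>100</balance>")
hit = cache.get("getBalance:acct-42")
t[0] += 61.0
stale = cache.get("getBalance:acct-42")
print(hit)     # served from memory without touching the backend
print(stale)   # None — entry expired, so the next request goes to the backend
```

Choosing the key is the critical design decision: it must include every parameter that affects the response, or different requests would wrongly share a cached answer.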
Being able to retrieve the same response message from memory instead of invoking a backend service can be an effective way to reduce the stress on an application while maintaining a high level of performance.