Decouple backend processing from a frontend host, where backend processing needs to be asynchronous but the frontend still needs a clear response: https://docs.microsoft.com/en-us/azure/architecture/patterns/async-request-reply
Context and problem
The client needs a response quickly enough to arrive back over the same connection, without added latency. HTTP polling, which is useful to client-side code, gives the appearance of asynchronous processing (recommended for I/O-bound operations).
But the work done by the backend may be long-running (seconds, minutes, or even hours). In that case, Queue-Based Load Leveling could be helpful: the queue decouples the tasks from the service, and the service can handle the messages at its own pace regardless of the volume of requests from concurrent tasks.
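The decoupling described above can be sketched with Python's standard `queue` and `threading` modules. This is only an in-process illustration, not an Azure implementation: the function name `run_load_leveling`, the message format, and the simulated processing delay are all assumptions made for the example.

```python
import queue
import threading
import time
from typing import List


def run_load_leveling(num_tasks: int, messages_per_task: int) -> List[str]:
    """Concurrent tasks post messages; one service drains them at its own pace."""
    work_queue: queue.Queue = queue.Queue()
    processed: List[str] = []

    def task(task_id: int) -> None:
        # Tasks only enqueue; they never call the service directly.
        for n in range(messages_per_task):
            work_queue.put(f"task{task_id}-msg{n}")

    def service() -> None:
        while True:
            message = work_queue.get()
            if message is None:  # sentinel: no more work will arrive
                break
            time.sleep(0.001)  # simulated (assumed) processing cost
            processed.append(message)

    consumer = threading.Thread(target=service)
    consumer.start()

    producers = [threading.Thread(target=task, args=(i,)) for i in range(num_tasks)]
    for producer in producers:
        producer.start()
    for producer in producers:
        producer.join()

    work_queue.put(None)  # tell the service to stop once the backlog is drained
    consumer.join()
    return processed
```

The tasks finish posting almost immediately, while the service keeps working through the backlog at a steady rate; the queue absorbs the burst.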
The HTTP response is the key to this kind of pattern: the HTTP 202 (Accepted) status code acknowledges that the request has been received for processing.
Solution
If the backend in the Asynchronous Request-Reply (ARR) pattern may be slow (seconds, minutes, or even hours), then the Queue-Based Load Leveling pattern may be the solution.
Challenges
3.1) Asynchronous Request/Reply
In some scenarios, you might want to provide a way for clients to cancel a long-running request. In that case, the backend service must support some form of cancellation instruction.
Return appropriate status code.
Support legacy clients (e.g. implement a facade over the asynchronous API to hide the asynchronous processing from the client; Logic Apps could help with that).
Predict the volume of requests to the service at any time.
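The acceptance, status, and cancellation points above can be sketched without any web framework. Everything here is hypothetical: the in-memory `JOBS` store, the `/status/...` and `/result/...` paths, and the function names. Only the 202 (Accepted) response comes from the notes; the other status codes (200 while pending, 302 when done, 404 for an unknown job, 409 for a cancel that arrives too late) are one plausible convention, not a requirement of the pattern.

```python
import uuid

# Hypothetical in-memory job store; a real backend would persist this state.
JOBS = {}


def submit(payload):
    """Accept the request and return 202 with a status URL the client can poll."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"state": "pending", "payload": payload, "result": None}
    return 202, {"Location": f"/status/{job_id}"}, None


def complete(job_id, result):
    """Called by the backend worker when the long-running work finishes."""
    job = JOBS[job_id]
    if job["state"] == "pending":  # a cancelled job stays cancelled
        job["state"] = "done"
        job["result"] = result


def status(job_id):
    """Polling endpoint: report progress, or redirect once the result exists."""
    job = JOBS.get(job_id)
    if job is None:
        return 404, {}, None
    if job["state"] == "done":
        return 302, {"Location": f"/result/{job_id}"}, None
    return 200, {}, {"state": job["state"]}


def cancel(job_id):
    """Cancellation instruction: only a pending job can still be cancelled."""
    job = JOBS.get(job_id)
    if job is None:
        return 404, {}, None
    if job["state"] != "pending":
        return 409, {}, {"state": job["state"]}
    job["state"] = "cancelled"
    return 200, {}, {"state": "cancelled"}
```

A client would call `submit`, read the `Location` header, and poll `status` until it stops returning 200.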
3.2) Queue-Based Load Leveling
Message queues are a one-way communication mechanism. If a task expects a reply from the service, we need to implement a mechanism that the service can use to send a response.
If we autoscale the services, it may result in increased contention and may diminish the effectiveness of using the queue to level the load.
Adjust the number of queues and the number of service instances to handle the load.
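One possible reply mechanism for the one-way queue is a per-task reply queue plus a correlation ID carried in each message. The sketch below is an in-process illustration with Python's `queue` module; the message fields (`id`, `reply_to`, `body`), the function names, and the "double the value" service logic are assumptions chosen for the example.

```python
import queue
import threading
import uuid


def run_demo(values):
    """Send each value through a request queue and collect the replies."""
    request_queue = queue.Queue()

    def service():
        while True:
            message = request_queue.get()
            if message is None:  # sentinel: shut the service down
                break
            # Reply on the queue named by the sender, echoing its correlation ID.
            reply = {"id": message["id"], "body": message["body"] * 2}
            message["reply_to"].put(reply)

    def call_service(value, timeout=2.0):
        reply_to = queue.Queue()  # private reply queue for this task
        correlation_id = str(uuid.uuid4())
        request_queue.put({"id": correlation_id, "reply_to": reply_to, "body": value})
        reply = reply_to.get(timeout=timeout)  # raises queue.Empty on timeout
        if reply["id"] != correlation_id:
            raise ValueError("reply does not match the request")
        return reply["body"]

    worker = threading.Thread(target=service)
    worker.start()
    try:
        return [call_service(v) for v in values]
    finally:
        request_queue.put(None)
        worker.join()
```

The request queue stays one-way; the response travels over a second channel that the task itself names in the message.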