...
In some scenarios, you might want to provide a way for clients to cancel a long-running request. In that case, the backend service must support some form of cancellation instruction.
Return an appropriate status code.
Support legacy clients: implement a facade over the asynchronous API that hides the asynchronous processing from the client. Logic Apps could help with that.
Predict the volume of requests to the service at any time.
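The cancellation and status-code points above can be illustrated with a minimal in-memory sketch. Everything here is hypothetical: the `JobService` class, the `/jobs/{id}` paths, and the specific status codes (202 while running, 200 when done, 410 after cancellation, 409 when cancelling a finished job) are one plausible mapping, not something the notes prescribe. No real HTTP framework is used; each method simply returns a `(status_code, body_or_headers)` pair.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Job:
    state: str = "running"  # one of "running", "done", "cancelled"
    result: Optional[str] = None


class JobService:
    """Hypothetical in-memory stand-in for a backend with long-running jobs."""

    def __init__(self) -> None:
        self.jobs: Dict[str, Job] = {}

    def submit(self) -> Tuple[int, dict]:
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = Job()
        # 202 Accepted: the work is not finished; tell the client where to poll.
        return 202, {"Location": f"/jobs/{job_id}", "Retry-After": "5"}

    def status(self, job_id: str) -> Tuple[int, dict]:
        job = self.jobs.get(job_id)
        if job is None:
            return 404, {}
        if job.state == "done":
            return 200, {"result": job.result}
        if job.state == "cancelled":
            return 410, {}  # assumption: a cancelled job is reported as Gone
        return 202, {"Retry-After": "5"}  # still running, poll again later

    def cancel(self, job_id: str) -> Tuple[int, dict]:
        job = self.jobs.get(job_id)
        if job is None:
            return 404, {}
        if job.state == "done":
            return 409, {}  # too late to cancel a finished job
        job.state = "cancelled"  # the backend must honour this flag
        return 204, {}

    def complete(self, job_id: str, result: str) -> None:
        """Called by the backend worker; ignored if the job was cancelled."""
        job = self.jobs[job_id]
        if job.state == "running":
            job.state = "done"
            job.result = result
```

A facade for legacy clients could wrap `submit` and `status` in a blocking loop so that the caller only ever sees a single synchronous call.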
3.2) Queue-Based Load Leveling
Message queues are a one-way communication mechanism. If a task expects a reply from the service, we need to implement a mechanism that the service can use to send a response.
Autoscaling the services that read from the queue may increase contention for the resources they share and diminish the effectiveness of using the queue to level the load.
Adjust the number of queues and the number of service instances to handle the load.
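The points above can be sketched with Python's standard `queue` and `threading` modules standing in for a real message broker. This is only an illustration under stated assumptions: the bounded `request_queue` plays the role of the load-leveling buffer, a per-request reply queue plus a correlation ID is one possible response mechanism, and the helper names (`send_request`, `start_workers`, `stop_workers`) and the upper-casing "work" are invented for the example.

```python
import queue
import threading
import uuid

# Bounded buffer between producers and the service; smooths bursts of requests.
request_queue = queue.Queue(maxsize=100)


def worker() -> None:
    """Service instance: drains the queue at its own pace and replies."""
    while True:
        item = request_queue.get()
        if item is None:  # sentinel used to stop this worker
            request_queue.task_done()
            break
        correlation_id, payload, reply_queue = item
        result = payload.upper()  # placeholder for the real work
        # The queue is one-way, so the reply goes back on a separate channel.
        reply_queue.put((correlation_id, result))
        request_queue.task_done()


def send_request(payload: str, timeout: float = 5.0) -> str:
    """Enqueue a request and wait for the matching reply."""
    correlation_id = str(uuid.uuid4())
    reply_queue = queue.Queue(maxsize=1)
    request_queue.put((correlation_id, payload, reply_queue))
    reply_id, result = reply_queue.get(timeout=timeout)
    if reply_id != correlation_id:
        raise RuntimeError("reply does not match request")
    return result


def start_workers(count: int) -> list:
    """Start `count` service instances; tune this number to the load."""
    threads = [threading.Thread(target=worker, daemon=True) for _ in range(count)]
    for t in threads:
        t.start()
    return threads


def stop_workers(threads: list) -> None:
    for _ in threads:
        request_queue.put(None)  # one sentinel per worker
    for t in threads:
        t.join()


if __name__ == "__main__":
    workers = start_workers(2)
    print(send_request("hello"))
    stop_workers(workers)
```

Raising the worker count drains the queue faster, but if all workers share a downstream resource, adding more of them can shift the bottleneck there rather than remove it.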