Batching
Batching with different configurations, as well as when and where to use it, can be confusing. Batching is commonly used for APIs and queues. Although the two share many similarities and tradeoffs, they are subtly different. Let's look into queues. We have a standard setup: Producer → Queue → Consumer. Sometimes the producer and the consumer are the same process, sometimes they are not. The queue might be FIFO or LIFO; it might be Kinesis or Kafka; it does not matter. There are nuances between event driven, short polling, and long polling, but those are topics for another day.
No Batching: Producer tells the queue, “here is a record”, and the queue stores it. Consumer asks the queue, “are there any records available?”, and the queue returns a record.
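This baseline can be sketched with Python's stdlib queue standing in for a real queue service (the names and payload here are illustrative, not from any particular system):

```python
# Minimal sketch of the no-batching flow: one record in, one record out.
import queue

q = queue.Queue()

# Producer: "here is a record"
q.put({"id": 1, "payload": "hello"})

# Consumer: "are there any records available?"
record = q.get()
print(record)  # {'id': 1, 'payload': 'hello'}
```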
Server side batching: The server in this scenario is the queue, and it handles all batching operations. Producer tells the queue, “here are N records”, and the queue stores them. Consumer asks the queue, “are there any records available? I will take up to M”, and the queue returns some records. Notice how producing and consuming have become decoupled. The producer may be writing 10 records at a time while the consumer is only consuming 1 record at a time. Or vice versa: the producer writes records 1 at a time, and the consumer consumes 10 at a time. This style of batching is easy to reason about and implement, and is useful when decoupling workloads through the queue. The queue acts as a short term buffer and offers resiliency between the producer and the consumer.
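A hypothetical sketch of server-side batching, where the queue itself exposes batch put/get operations so producer and consumer batch sizes stay independent (the `BatchQueue`, `put_batch`, and `get_batch` names are made up for illustration, not a real API):

```python
# The queue handles the batching; the producer writes N at a time and
# the consumer asks for up to M at a time, with no coordination needed.
from collections import deque

class BatchQueue:
    def __init__(self):
        self._items = deque()

    def put_batch(self, records):
        # Producer: "here are N records"
        self._items.extend(records)

    def get_batch(self, max_records):
        # Consumer: "I will take up to M" -- may return fewer, or none.
        batch = []
        while self._items and len(batch) < max_records:
            batch.append(self._items.popleft())
        return batch

q = BatchQueue()
q.put_batch(list(range(1, 11)))  # producer writes 10 records at once
print(q.get_batch(3))            # consumer takes up to 3 -> [1, 2, 3]
print(q.get_batch(3))            # -> [4, 5, 6]
```

Note that `get_batch` returns whatever is available up to the cap, which is how real batch-consume APIs typically behave.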
Client side batching: The clients in this scenario are the producer and the consumer, and they handle all batching operations. Producer collects N records and groups them into a single item, then tells the queue “here is 1 record”. The queue has no idea these are batched. The consumer asks the queue “give me a record”, then degroups the record into the N records the producer grouped together. Notice how the queue has no idea that batching is even happening. This access pattern gains some resiliency from the queue, but the workloads of the producer and the consumer are now coupled: the consumer must be able to handle the N records the producer is creating. Many times, the N records are similar in nature and can be compressed efficiently. For instance, when transferring JSON data, it is not uncommon to see 70-90% compression ratios across multiple records. So the cost (monetary, speed, and latency) of transferring 10 client side batched records is roughly the same as transferring a single record.
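A minimal sketch of this pattern, grouping and gzip-compressing JSON records on the producer side (the record shape is invented for illustration; actual compression ratios depend entirely on the data):

```python
# Client-side batching: the producer groups N similar JSON records into
# one compressed item; the queue only ever sees a single opaque record.
import gzip
import json
import queue

q = queue.Queue()

# Producer: collect N records, group and compress them into one item.
records = [{"event": "click", "user": i} for i in range(10)]
blob = gzip.compress(json.dumps(records).encode())
q.put(blob)  # to the queue, this is just 1 record

# Consumer: take one item, degroup it back into the N records.
item = q.get()
recovered = json.loads(gzip.decompress(item))
assert recovered == records
```

Because the records share structure, the compressed blob is typically far smaller than N separately-sent records, which is where the cost savings come from.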
Mixed batching: Mixed batching combines client and server side batching. It becomes difficult to reason about and debug when things go wrong, so it is not recommended. The producer groups N records into M items, and tells the queue “here are M items”. The consumer then asks the queue, “are there any records available? I will take up to K”. The consumer then ungroups the K items back into records. Depending on the sizes of N, M, and K, a lot of different things can happen. Since batch sizes commonly evolve, changing one parameter can drastically impact the system. Use this technique at your own peril.
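To make the N/M/K interplay concrete, here is a toy sketch under assumed sizes (N=6, M=2, K=2); the plain deque stands in for a batch-capable queue, and all names are illustrative:

```python
# Mixed batching: the producer groups N records into M items
# (client-side), then sends all M in one batch (server-side).
# The consumer fetches up to K items and ungroups each one.
import json
from collections import deque

queue_items = deque()

# Producer: N=6 records grouped into M=2 items of 3 records each.
records = [{"n": i} for i in range(6)]
items = [json.dumps(records[i:i + 3]) for i in range(0, len(records), 3)]
queue_items.extend(items)  # "here are M items"

# Consumer: "I will take up to K=2 items", then ungroup each item.
k = 2
taken = [queue_items.popleft() for _ in range(min(k, len(queue_items)))]
out = [record for item in taken for record in json.loads(item)]
```

Notice the coupling: the number of records the consumer ends up holding is the sum of the group sizes inside the K items it took, so changing N, M, or K independently changes the effective batch size in non-obvious ways.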