CP DP
This post is about the control plane and the data plane. You will take away what each one is and why you should design your system around a CP/DP separation. I will then round out with a few common strategies for how the two planes interface.
What is a control plane and a data plane?
The terms come from networking, where the control plane is responsible for configuration while the data plane is responsible for traffic. There are a lot of documents describing the differences between control planes and data planes, but not all APIs seem to fit neatly into one or the other. As a rule of thumb, data plane APIs are expected to be high-throughput, highly available, low-latency operations. If an operation does not fit that description, it is most likely part of the control plane.
Control Plane: The control plane consists of components that control, manage, and configure the infrastructure and services in a cloud environment. It includes systems like cloud controllers/management software, orchestration services, etc. that decide how, when, and where to run workloads and services. The control plane manages the high-level operation of the cloud.
Data Plane: The data plane comprises the infrastructure and components that actually handle the workloads, provide the services, store and process data, etc. This includes the virtual machines, servers, storage systems, routers, switches, and other devices that do the “ground work” in the cloud based on decisions from the control plane. The data plane forwards user data through the cloud system.
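The rule of thumb above can be written down as a tiny classifier. This is only a sketch: the `Operation` type, its three boolean fields, and `plane_for` are names I made up for illustration, and real operations rarely sort this cleanly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """Rough traffic profile of one API operation (hypothetical fields)."""
    name: str
    high_throughput: bool
    high_availability: bool
    low_latency: bool


def plane_for(op: Operation) -> str:
    """All three properties together suggest data plane; anything else, control plane."""
    if op.high_throughput and op.high_availability and op.low_latency:
        return "data plane"
    return "control plane"


create_resource = Operation("create resource", False, False, False)
use_resource = Operation("use resource", True, True, True)

print(plane_for(create_resource))  # control plane
print(plane_for(use_resource))     # data plane
```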
Why should you implement a control plane and data plane in your service?
1/ Separation of concerns. Keeping the control logic separate from the data processing enables each layer to focus on doing one thing well, rather than tightly coupling them.
2/ Scalability. The control and data planes can scale independently as needed. Control plane operations tend to be higher latency and lower throughput, while data plane operations are lower latency and higher throughput.
3/ Availability. Loosely coupled planes allow for failure isolation. If the data plane has issues, the control plane can remain operational to manage recovery.
4/ Modularity. Well-defined interfaces between the planes enable a more modular, reusable architecture. For example, a single control plane can sit in front of a cellularized data plane once the data plane hits scaling limits.
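To make the cellularized example concrete, here is a minimal sketch of one control plane placing resources onto data plane cells. Everything in it is an assumption for illustration: the `ControlPlane` class, the cell names, and the hash-based placement are not taken from any particular service.

```python
import hashlib


class ControlPlane:
    """Hypothetical control plane that assigns each resource to one data plane cell."""

    def __init__(self, cells):
        self.cells = list(cells)
        self.placement = {}  # resource id -> cell name

    def place(self, resource_id: str) -> str:
        # Record the first placement so that adding a cell never moves a resource.
        if resource_id not in self.placement:
            digest = hashlib.sha256(resource_id.encode()).digest()
            index = int.from_bytes(digest[:8], "big") % len(self.cells)
            self.placement[resource_id] = self.cells[index]
        return self.placement[resource_id]

    def add_cell(self, cell: str) -> None:
        # Only resources placed after this call can land in the new cell.
        self.cells.append(cell)


cp = ControlPlane(["cell-1", "cell-2"])
first = cp.place("resource-a")
cp.add_cell("cell-3")            # data plane grows by one cell
print(cp.place("resource-a") == first)  # True
```

Because the data plane only sees "which cell owns this resource", cells can be added without the customer-facing control plane API changing.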
Where do services live?
API - An API can fall into either the control plane or the data plane. A “create resource” API, such as “create lambda”, falls purely into the control plane. The actual creation of the resource happens later in the data plane, but the customer-facing API to create the resource lives within the control plane. A “use resource” API, such as “invoke lambda”, commonly falls into the data plane if it is high throughput and highly available.
Database - Both the control plane and the data plane need databases. However, each plane should own its own database rather than share one, so that load or an outage on one side does not spill over to the other.
Compute - Large compute needs commonly point to a data plane.
Caching - While caching can be implemented in a control plane, it is almost always present on the high-throughput operations that a data plane performs.
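Putting the four pieces together, a rough sketch might look like the following. The class names, the `prefix` setting, and the in-memory dicts standing in for databases are all hypothetical. The point is only that "create" lands in the control plane, "use" lands in the data plane, each plane has its own store, and the cache sits on the hot path.

```python
class ControlPlaneService:
    """Owns resource definitions; low request rate, so no cache in this sketch."""

    def __init__(self):
        self._db = {}  # stand-in for the control plane database

    def create_resource(self, name, config):
        self._db[name] = dict(config)

    def export_config(self, name):
        # Hand out a copy; the data plane never reads _db directly.
        return dict(self._db[name])


class DataPlaneService:
    """Serves the hot path from its own store, fronted by an in-memory cache."""

    def __init__(self):
        self._db = {}     # stand-in for the data plane database
        self._cache = {}
        self.db_reads = 0

    def apply_config(self, name, config):
        self._db[name] = config
        self._cache.pop(name, None)  # drop any stale cached copy

    def use_resource(self, name, payload):
        if name not in self._cache:
            self.db_reads += 1
            self._cache[name] = self._db[name]
        return self._cache[name]["prefix"] + payload


cp = ControlPlaneService()
dp = DataPlaneService()
cp.create_resource("greeter", {"prefix": "hello "})
dp.apply_config("greeter", cp.export_config("greeter"))
print([dp.use_resource("greeter", "world") for _ in range(3)])
print(dp.db_reads)  # 1: only the first call touched the data plane database
```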
How does the control plane interface with the data plane?
1/ Async push based. The control plane asynchronously pushes its configuration to the data plane. This must be asynchronous to keep the planes isolated. Sometimes this is done through a secondary process in the control plane calling a synchronous API on the data plane, and sometimes it is done through a message queue. Either way, the control plane never synchronously calls the data plane in its control flow.
2/ Poll based. The control plane exposes a set of APIs that the data plane can poll for configuration. A data plane polling for configuration is very similar to a customer requesting configuration. When designing this, care must be taken that the data plane's polling does not brown out the control plane.
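A minimal sketch of the push approach, using an in-process queue and a background thread as the "secondary process". The class and method names are hypothetical, the dict stands in for the control plane database, and the retry is deliberately naive.

```python
import queue
import threading


class DataPlane:
    def __init__(self):
        self.config = {}

    def apply_config(self, name, config):
        # Synchronous API, but only the background pusher ever calls it.
        self.config[name] = config


class ControlPlane:
    def __init__(self, data_plane):
        self._db = {}                  # stand-in for the control plane database
        self._outbox = queue.Queue()
        self._data_plane = data_plane
        threading.Thread(target=self._push_loop, daemon=True).start()

    def create_resource(self, name, config):
        # Request path: persist, enqueue, return. No call to the data plane here.
        self._db[name] = config
        self._outbox.put(name)

    def _push_loop(self):
        # Secondary process: drains the outbox and pushes to the data plane.
        while True:
            name = self._outbox.get()
            try:
                self._data_plane.apply_config(name, dict(self._db[name]))
            except Exception:
                self._outbox.put(name)  # naive retry; real code would back off and cap attempts
            finally:
                self._outbox.task_done()

    def wait_until_pushed(self):
        # Helper for the demo: block until the outbox has been fully processed.
        self._outbox.join()


dp = DataPlane()
cp = ControlPlane(dp)
cp.create_resource("greeter", {"prefix": "hello "})
cp.wait_until_pushed()
print(dp.config["greeter"])  # {'prefix': 'hello '}
```

If the data plane is slow or down, `create_resource` still returns immediately and the work waits in the outbox, which is the isolation the push model is after.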
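Here is one way the polling side could be sketched. I am assuming two things that are not in the text above: the control plane keeps a version counter so an unchanged poll is cheap, and each poller adds jitter to its interval so that many data plane hosts do not poll in lockstep. All names are hypothetical.

```python
import random
from typing import Optional, Tuple


class ControlPlane:
    def __init__(self):
        self._version = 0
        self._config = {}

    def update(self, key, value):
        self._config[key] = value
        self._version += 1

    def get_config(self, known_version: int) -> Optional[Tuple[int, dict]]:
        # Cheap answer when nothing changed, so frequent polls stay inexpensive.
        if known_version == self._version:
            return None
        return self._version, dict(self._config)


class DataPlanePoller:
    def __init__(self, control_plane, base_interval_s: float = 30.0):
        self._cp = control_plane
        self._base = base_interval_s
        self.version = 0
        self.config = {}

    def poll_once(self) -> bool:
        """Fetch configuration if it changed; return True when an update was applied."""
        result = self._cp.get_config(self.version)
        if result is None:
            return False
        self.version, self.config = result
        return True

    def next_delay(self) -> float:
        # Jitter of +/-50% spreads pollers out over time.
        return self._base * random.uniform(0.5, 1.5)


cp = ControlPlane()
dp = DataPlanePoller(cp)
cp.update("prefix", "hello ")
print(dp.poll_once())  # True: new version picked up
print(dp.poll_once())  # False: nothing changed since the last poll
```

A real poller would sleep for `next_delay()` between calls to `poll_once()` and keep serving from its last known configuration whenever the control plane is unreachable.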
I personally prefer a mix of the two. The control plane sends a light message to the data plane in its control flow saying that something happened. Then the data plane polls for the appropriate configuration (straight from DDB) and does what it needs to do.
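As a sketch of that mix: the message carries only the resource name, and the data plane fetches the full configuration itself. `ConfigStore` is an in-memory stand-in for the DDB table, and the queue stands in for whatever delivers the light message. Both, along with every class and method name, are assumptions for illustration.

```python
import queue


class ConfigStore:
    """In-memory stand-in for the configuration table."""

    def __init__(self):
        self._items = {}

    def put(self, name, config):
        self._items[name] = dict(config)

    def get(self, name):
        return dict(self._items[name])


class ControlPlane:
    def __init__(self, store, notifications):
        self._store = store
        self._notifications = notifications

    def update_resource(self, name, config):
        self._store.put(name, config)   # 1. persist the full configuration
        self._notifications.put(name)   # 2. light message: just the name


class DataPlane:
    def __init__(self, store, notifications):
        self._store = store
        self._notifications = notifications
        self.config = {}

    def drain_notifications(self) -> int:
        """Handle every pending message; return how many were handled."""
        handled = 0
        while True:
            try:
                name = self._notifications.get_nowait()
            except queue.Empty:
                return handled
            # The message says "something changed"; the store says what.
            self.config[name] = self._store.get(name)
            handled += 1


store = ConfigStore()
notifications = queue.Queue()
cp = ControlPlane(store, notifications)
dp = DataPlane(store, notifications)
cp.update_resource("greeter", {"prefix": "hi "})
cp.update_resource("greeter", {"prefix": "hello "})
print(dp.drain_notifications())  # 2
print(dp.config["greeter"])      # {'prefix': 'hello '}
```

Since the message has no payload, duplicate or reordered messages are harmless: every one of them just triggers a read of the latest stored configuration.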