What kind of architecture is a true A-A balanced architecture, that is, a truly distributed architecture?
毕须说  2024-07-30 17:38   published in China

Architecture debates have run through years of IT. For data technology, the question basically comes down to how to implement dual-active/multi-active load balancing across nodes. There are essentially three types:

1. A-A (Active-Active) balanced architecture. Every node can serve external access with load balancing, for both reads and writes. The usual implementation splits data into slices and balances them: a single LUN is cut into slices of a few dozen MB each, and the slices rotate across the controller nodes. Because the granularity is so small, access to the whole LUN is balanced. It is also combined with a hardware shared front-end port architecture to avoid cross-node forwarding and the latency it adds. A-A is very difficult to implement. Only 2 to 3 manufacturers on this planet have truly realized an A-A balanced architecture.

2. A-A/A (Active-Active/Asymmetric) asymmetric balanced architecture. Externally, a single LUN appears to be accessible with load balancing, but internally there is still the concept of a working controller/node. If I/O is sent to a non-working node, it is forwarded internally to the working node, and this cross-node access increases I/O latency.

3. A-P (Active-Passive) is the master-slave architecture. Only the master node is accessible externally; the slave node is either inaccessible or read-only. When the master node fails, the system switches to the slave node to achieve high availability. The master-slave architecture cannot achieve load balancing by itself. Of course, you can work around this in the configuration policy: split the data into multiple shards or LUNs, have some LUNs work on controller node 1, some on controller node 2, and so on, so that balance is achieved through manual configuration. However, because the LUN granularity is large and each LUN is still served by a single node, the architecture is essentially still A-P.
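The difference between the three types comes down to which node ends up serving a given I/O. The following is only a minimal sketch of that routing decision, not any vendor's implementation. The 64 MiB slice size, the function names, and the `lun_owner` mapping are all assumptions made for illustration; the text above only says slices are "dozens of MB".

```python
from collections import Counter

MIB = 1024 * 1024
SLICE_SIZE = 64 * MIB  # assumed slice size; the article only says "dozens of MB"


def aa_serving_node(offset, num_nodes, slice_size=SLICE_SIZE):
    """A-A: slices rotate across nodes, so the offset alone picks the node."""
    return (offset // slice_size) % num_nodes


def asymmetric_route(receiving_node, working_node):
    """A-A/A: any node accepts the I/O, but only the working node serves it.

    Returns (serving_node, forwarded); forwarded is True when the I/O
    had to cross nodes internally, which is where the extra latency comes from.
    """
    return working_node, receiving_node != working_node


def ap_serving_node(lun_id, lun_owner):
    """A-P with manual balancing: a whole LUN is pinned to one node."""
    return lun_owner[lun_id]


def aa_load(lun_size, num_nodes, slice_size=SLICE_SIZE):
    """Count how many slices of one LUN each node serves under A-A."""
    offsets = range(0, lun_size, slice_size)
    return Counter(aa_serving_node(off, num_nodes, slice_size) for off in offsets)


if __name__ == "__main__":
    # One 1 GiB LUN on 4 controllers: 16 slices, 4 per controller.
    print(aa_load(1024 * MIB, num_nodes=4))
    # I/O lands on node 1 but node 0 is the working node: forwarded.
    print(asymmetric_route(receiving_node=1, working_node=0))
    # Manual A-P balancing: hypothetical LUN names pinned to nodes.
    print(ap_serving_node("lun-a", {"lun-a": 0, "lun-b": 1}))
```

In this toy model a single LUN under A-A is spread evenly over all four controllers, while under A-P the same LUN always lands on one node no matter how the other LUNs are assigned.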

Obviously, the A-A architecture is optimal: it achieves both multi-node load balancing and high-availability switchover. It needs few nodes and keeps utilization high, which avoids wasting resources. This is the approach of Oracle RAC clusters and A-A balanced storage architectures. The A-A/A asymmetric balanced architecture comes second, and the A-P master-slave architecture is the worst.

What is the current database architecture? It is basically master-slave: an A-P architecture with multiple replicas and large, TB-level shards. This is the "distributed architecture" that many manufacturers talk about. In essence it is an A-P master-slave multi-replica architecture. Technically it cannot achieve load balancing, and resource utilization is low: one person works while N people watch, which is a serious waste. It is euphemistically called "distributed architecture", but it has simply taken the worst route.
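The "one person works and N people watch" point can be put as simple arithmetic. This is a rough sketch under a simplifying assumption that is not from the article: it counts only the nodes that accept writes for a given shard and ignores any read traffic the standbys might serve. The function name is hypothetical.

```python
from fractions import Fraction


def write_serving_fraction(num_standbys):
    """Share of nodes accepting writes with 1 master and N standbys.

    Simplification: standbys are counted as idle for writes, even if
    they are readable.
    """
    if num_standbys < 0:
        raise ValueError("num_standbys must be non-negative")
    return Fraction(1, 1 + num_standbys)


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(f"1 master + {n} standbys: {write_serving_fraction(n)} of nodes take writes")
```

With one master and two standbys, only a third of the nodes take writes for that shard, whereas in an A-A layout every node does.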

Look at the defining characteristics of a distributed system. Distribution: nodes are spread arbitrarily in space. Equivalence: nodes are not divided into primary and secondary, and all nodes are equal. Concurrency: shared resources are accessed concurrently. Isn't this exactly the A-A architecture of the storage field? The A-A balanced architecture that storage manufacturers built 20 years ago is the same thing as the distributed architecture that is popular today, just under a different name, so A-A can be considered the ancestor of the distributed architecture. By contrast, what today's "distributed" architectures actually implement is master-slave replicas, which do not meet the equivalence requirement at all. They are fake distributed systems.

The master-slave multi-replica architecture also has the following problems. Cross-node replication demands extremely high network quality, so how do you isolate bit errors and jitter? And what kind of logic is it that a failed disk means the CPU and the whole node have to be kicked out? The replica then has to be copied and rebuilt over the network from other nodes, which affects production performance and is slow. Each node has only a few local disks, which severely limits storage I/O capability and makes backups take a long time. When I/O capability or capacity runs short, you have to expand by adding whole nodes, CPUs included, which costs more. Therefore the next-generation database architecture is bound to evolve toward database nodes sharing data storage, including OceanBase and hivauss. Multiple replicas are eliminated, which saves resources and removes the frequent cross-node synchronization. Disk faults are handed over to professional storage, which ensures reliability, so there is no need to kick nodes out. Database nodes are stateless, hold no persistent data, and can be switched faster. This is the real distributed architecture and the ultimate goal of database architecture, similar to the Oracle RAC shared-storage architecture.
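To see why rebuilding a replica over the network is slow, a back-of-the-envelope estimate helps. The sketch below assumes the copy is limited by whichever is slower, the network link or the local disks, and ignores protocol overhead and competing production traffic. The 2 TB replica, 10 Gb/s link, and 500 MB/s disk throughput are hypothetical figures chosen for illustration, not measurements from the article.

```python
def rebuild_seconds(data_bytes, network_bytes_per_s, disk_bytes_per_s):
    """Lower bound on the time to copy a replica to a replacement node.

    The copy cannot go faster than the slower of the network link and
    the target node's local disks.
    """
    bottleneck = min(network_bytes_per_s, disk_bytes_per_s)
    if bottleneck <= 0:
        raise ValueError("throughput must be positive")
    return data_bytes / bottleneck


if __name__ == "__main__":
    replica = 2 * 10**12      # hypothetical 2 TB replica
    network = 1_250_000_000   # hypothetical 10 Gb/s link, in bytes/s
    disks = 500_000_000       # hypothetical 500 MB/s local disk throughput
    seconds = rebuild_seconds(replica, network, disks)
    print(f"at least {seconds / 3600:.1f} hours to rebuild")
```

Under these assumed numbers the local disks, not the network, are the bottleneck, which matches the article's point that a few local disks per node limit I/O capability. With shared storage, a stateless database node has no replica to copy at all.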

The master-slave replica architecture will come back around to the real A-A architecture. So don't dismiss the technical frameworks of the Silicon Valley giants. Letting professional companies do professional work, and evolving by inheriting what already exists, is the right way forward. Otherwise you thrash around and then have to solve baffling problems: timeouts and hangs caused by high latency, slow switchover, slow backup, operations and maintenance made difficult by piled-up resources, and extremely low utilization. That wastes a great deal of people, money, and time, and only widens the technology gap.

 

Source: Bi xunshuo
