Since the database has been made distributed, why is centralized storage still attached?
毕须说  2024-07-27 15:57   Published in China

I recently took part in an event held by a database technology architecture community. The talks were full of practical substance and very good. During the open discussion, a user from the financial sector asked: now that the database has been rebuilt as a distributed database, why attach centralized storage to it? The question is a common one and has been discussed a great deal. A user from Bank G, one of the major banks, replied that the GaussDB plus FC SAN Dorado storage solution for distributed databases was something the bank itself had pushed the HW product line to build. The bank had run MySQL at scale on servers with integrated local storage and had run into a variety of problems. Attaching external enterprise storage to a distributed database, that is, adopting a compute-storage separation architecture, brings many benefits:

1. Once storage and compute are separated, disk capacity can be scaled up vertically, which is more convenient. A server has only a few dozen local disk slots at most, so adding disks in place is difficult, and capacity has to be grown by adding nodes, which is complex. A large-capacity database may not even fit on local disks. External storage offers far more space and far more disks than a server does: a storage pool can hold hundreds of disks and be expanded on demand, which is flexible and convenient. Take a historical database of tens or even hundreds of TB. On local disks it needs many servers stacked together, yet the data is accessed infrequently and server CPU utilization is low, so resources are badly wasted and the cost is too high. With external storage, fewer database nodes are needed, and capacity is carved into LUNs as required.

2. PC servers fail fairly often. At large scale, a database on local disks will see a server fail every few days, or even every day. How can the operations team cope? When a disk or another component fails, the server has to be replaced and the replica rebuilt, which affects the production network, and rebuilding a replica takes a long time. The failure rate of external storage is much lower. If a server fails, its external storage LUNs are simply mapped to a replacement server and no replica rebuild is needed, which makes O&M much simpler. A failed disk is rebuilt by RAID inside the storage system, transparently to the database, with no database node switchover or recovery involved.

3. At present, replication between primary and standby nodes goes over the network, and the standby has only a few local disks, so log replay can fall far behind. The result is an RTO of tens of minutes or even hours, which is unacceptable. The fix is to offload replication to the storage layer. On the one hand, the storage network is an FC/RoCE lossless network with good quality and low latency. On the other hand, the storage system has many disks, so replay on the standby node hits no I/O bottleneck, the performance ceiling is high, replay is fast, and RTO is very short. In addition, because database nodes hold no local data and all data is external, node switchover is faster.

4. Database technology is still evolving. Centralized databases are easy to use, and their performance is sufficient for most scenarios. For the core transaction systems of small and medium-sized financial institutions, a centralized database is in most cases enough. Distributed databases have to avoid frequent cross-node distributed transactions, their clusters are large, system robustness is at greater risk, and there are few observability tools. How are problems to be located quickly? Scale-out delivers more performance than is needed, at too high a cost in O&M complexity. Simpler setups are easier to operate.
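
The capacity argument in point 1 and the RTO argument in point 3 both come down to simple arithmetic. The sketch below is only a back-of-envelope illustration. The function names and every input figure (replica count, disks per server, disk size, log backlog, replay throughput) are hypothetical assumptions, not numbers from the discussion, and replay time is only one component of a real RTO.

```python
import math

def servers_for_capacity(data_tb, replicas, disks_per_server, disk_tb):
    """Servers needed to hold `replicas` full copies of the data on local disks."""
    raw_tb = data_tb * replicas                  # total capacity across all copies
    per_server_tb = disks_per_server * disk_tb   # raw capacity of one server
    return math.ceil(raw_tb / per_server_tb)

def replay_minutes(backlog_gb, replay_mb_per_s):
    """Minutes a standby needs to replay a log backlog before it can take over."""
    return backlog_gb * 1024 / replay_mb_per_s / 60

# Hypothetical historical database: 300 TB, 3 replicas, 12 x 4 TB disks per server.
print(servers_for_capacity(300, 3, 12, 4))   # 19

# Hypothetical 200 GB backlog: a few local disks vs. a large storage pool.
print(round(replay_minutes(200, 100), 1))    # 34.1
print(round(replay_minutes(200, 2000), 1))   # 1.7
```

Under these assumed figures, the local-disk layout needs 19 mostly idle servers just to hold the data, and a twentyfold difference in replay throughput is the difference between an RTO of tens of minutes and one of a couple of minutes.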

There is also a concept that customers need to get straight: what is centralized storage? The formal international name is Enterprise Storage System, which is how the consulting firm IDC defines it. There is no such category as centralized storage at all. On the contrary, the internal architecture of such a storage system is distributed: data is sliced and balanced across all nodes, compute nodes balance I/O across multiple paths, and the shared front-end interface cards are load-balanced. Data is distributed in slices as small as 64 MB across nodes and 4 MB at the disk level, so every SSD on every node is accessed in a load-balanced way. In the storage field this is called an A-A (active-active) architecture, and it has been in use for 20 years. Enterprise storage systems can scale out horizontally to 2, 4, 8, or 16 nodes.

1.png
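
To make the two granularities concrete, here is a minimal sketch of how a byte offset in a LUN could be mapped to a node and a disk. It assumes plain round-robin placement, which is an illustrative simplification and not the actual algorithm of any storage product. The function `locate` and the 4-node, 8-disk layout are hypothetical. Only the 64 MB and 4 MB sizes come from the text.

```python
from collections import Counter

MB = 1024 * 1024
NODE_SLICE = 64 * MB   # cross-node granularity quoted in the text
DISK_CHUNK = 4 * MB    # disk-level granularity quoted in the text

def locate(offset, nodes, disks_per_node):
    """Map a LUN byte offset to (node, disk) using round-robin placement (assumed)."""
    slice_no = offset // NODE_SLICE
    node = slice_no % nodes                       # slices rotate across nodes
    chunk_in_slice = (offset % NODE_SLICE) // DISK_CHUNK
    local_slice = slice_no // nodes               # index of this slice on its node
    chunks_per_slice = NODE_SLICE // DISK_CHUNK   # 16 chunks per slice
    # Chunks rotate across the node's disks, continuing from slice to slice.
    disk = (local_slice * chunks_per_slice + chunk_in_slice) % disks_per_node
    return node, disk

# Walk a 1 GiB LUN in 4 MB steps over 4 nodes with 8 disks each.
hits = Counter(locate(off, 4, 8) for off in range(0, 1024 * MB, DISK_CHUNK))
print(len(hits), set(hits.values()))   # 32 {8}
```

In this toy layout all 32 disks receive exactly 8 chunks each, which is the kind of even spread across every node and every SSD that the paragraph describes.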

By contrast, what the Internet industry now calls a distributed architecture is a master-slave, multi-replica architecture. It keeps many replicas, but that does not make it reliable. And can such an architecture really be called distributed? It does not meet the peer-equivalence property of a distributed architecture, as shown in the following figure.

2.jpg

Source: Bi xunshuo

