Multi-master scenario analysis for distributed databases
vikingapple  2024-06-29 10:38   Published in China

We know that one of the biggest problems in the current wave of distributed database transformation is that the database cannot support multi-read, multi-write operation. Data therefore has to be sharded in addition to being replicated, which makes business transformation harder and reduces reliability at the same time. On the other hand, many distributed databases, including MySQL from the open-source community, claim to support a multi-master working mode with multiple readers and writers.

What are the characteristics of the multi-master modes these databases describe, and how do they differ? I hope this article provides some answers.

Why do distributed databases need to be multi-master?

Before discussing specific implementations, we need to discuss the scenario first.

Multi-master, as the name implies, is defined relative to the master-slave mode. MySQL is still widely used among open-source databases, so let's take it as the example. In a MySQL master/slave deployment, only the primary accepts read and write requests. The standby fetches the binlog generated by the primary and replays it, so the standby's data is not fully synchronized with the primary in real time. In actual deployments, most services connect directly to the primary. The standby only takes over in case of failure and does not serve business traffic.

062901.png

This wastes resources. A large share of the resources must be reserved for standby machines, so the number of servers grows dramatically. In addition, RPO and RTO between the master and slave databases are difficult to guarantee. Many MySQL-based database vendors are trying to modify the master-slave synchronization implementation, patching it continuously to improve reliability and achieve better RPO/RTO.
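To make the RPO concern concrete, here is a minimal Python sketch of asynchronous log replay. The `Primary` and `Standby` classes are hypothetical illustrations, not MySQL APIs; the sketch only assumes that the standby replays an ordered change log and may lag behind it.

```python
class Primary:
    """Hypothetical primary: applies writes and records them in an ordered log."""

    def __init__(self):
        self.data = {}
        self.binlog = []  # ordered list of (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))


class Standby:
    """Hypothetical standby: replays the primary's log some time later."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries replayed so far

    def replay(self, binlog, max_entries):
        # Replay at most max_entries entries that have not been applied yet.
        for key, value in binlog[self.applied:self.applied + max_entries]:
            self.data[key] = value
            self.applied += 1


primary = Primary()
standby = Standby()
for i in range(5):
    primary.write(f"row{i}", i)

# The standby has only caught up with part of the log when the primary fails.
standby.replay(primary.binlog, max_entries=3)
lost_on_failover = len(primary.binlog) - standby.applied
print("changes missing on the standby:", lost_on_failover)
```

Any change that has not been replayed at the moment of failover is missing on the standby, which is why RPO is hard to guarantee with asynchronous synchronization.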

Some businesses combine multiple MySQL master-slave groups into a distributed solution, typically through distributed middleware. There are many kinds of such middleware, and many Internet companies have released their own. In essence, they all achieve horizontal scaling by sharding databases and tables. Abstracted, the architecture looks roughly like the following.

In this mode, different MySQL master-slave groups hold completely different data. Although the overall cluster can be large, each master-slave group is actually an isolated island. The business layer or middleware layer needs to know which island a piece of data lives on in order to access it correctly. Each MySQL master-slave group faces exactly the same problems as the single master-slave deployment described above. This mode is also not very elastic: to add performance or nodes, you must redistribute data among the MySQL master-slave groups, which requires business participation and awareness, and sometimes even downtime.

Therefore, to achieve a multi-master database, we need to solve several problems. First, the standby machines should take part in the work and raise resource utilization instead of sitting idle. Second, the isolation between master-slave groups should be broken down, so that neither the business nor the middleware has to track where data lives. Third, the system should scale out transparently to the business, with no need to worry about rework or data migration. Finally, large clusters must remain easy to maintain and manage.

With these questions in mind, let's look at how the industry builds multi-master systems.
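The elasticity problem can be sketched in a few lines of Python. This assumes the middleware routes by hash modulo the number of groups, which is only one possible scheme; `shard_for` and the `order-N` keys are hypothetical names used for illustration.

```python
import zlib


def shard_for(key: str, shard_count: int) -> int:
    """Hypothetical routing rule: stable hash of the key, modulo the group count."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


keys = [f"order-{i}" for i in range(1000)]

# Placement with three master-slave groups, then after adding a fourth.
before = {k: shard_for(k, 3) for k in keys}
after = {k: shard_for(k, 4) for k in keys}

# Every key whose group changed must be physically migrated.
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys must move to a different group")
```

Under this assumed scheme, adding a single group changes the home of a large share of the keys, which is the data redistribution that the business has to be aware of.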

Start with writing a book

To make this easier to grasp intuitively, let's set databases aside for the moment and tell a story about writing a book.

Suppose I want to write a book. To get it written, I hired a man named Zhang San, who created a Word document on his computer and started writing. So that the writing would not stop whenever he took leave, I brought in two more people, Zhang A and Zhang B, as his backups. While Zhang San was on duty, however, they could only watch from the side. To keep them from sitting idle, I gave Zhang A and Zhang B a computer each. While Zhang San typed, they each opened a Word document beside him and copied exactly what he wrote. Every day the three of them clattered away at their keyboards in full swing.

062903.png

However, my book is rather long, and Zhang San was too slow writing it alone, so I hired another person, Li Si, to help. But Zhang San and Li Si each use their own computer, so they cannot open the same Word document and write together. What to do? I split up the chapters, moved some of them into a separate Word file, and had Li Si write those on his computer. To cover for Li Si's leave, I wanted Zhang A and Zhang B to write for him when he was away. But the two of them said they were already too busy copying Zhang San's typing every day and had no time to help Li Si. So I had to hire two more backups, Li A and Li B, to copy Li Si's manuscript every day.

However, the publisher still thought I was writing too slowly and kept pressing me. I had to hire Wang Wu as well, split off several chapters for him to write, and bring in Wang A and Wang B to copy what Wang Wu wrote every day.

With more and more people on the team, I needed a project manager, so Zhao Liu joined. Zhao Liu decides how the chapters of the whole book are divided and who writes which. A small team of 10 people was now at work.

062904.png

Readers may have noticed that the basic architecture in this figure is exactly the same as that of distributed MySQL.

What mistake did I make that made writing a book so difficult?

One day, I told Zhao Liu that I needed to add a plot line, which required changes from both Zhang San and Li Si. Two days later there was still no progress. When I asked Zhao Liu, he said he had already passed the task on. Zhang San said he knew nothing about it. Li Si said his part of the manuscript had been finished long ago and nobody had ever come to him about it. I had to call everyone together for a meeting, and it took half a day of discussion to sort the problem out. I told Zhao Liu to coordinate these people properly and not leave me to clean up the mess whenever something goes wrong.

But things going wrong is the norm. Zhang San came back from a day off and quarreled with Zhang A. Zhang San insisted that a passage he had written before his leave had been tampered with by Zhang A. His local disk had somehow failed and all his data was gone, and what Zhang A had now was completely different from what he had written. Zhang A said he had never seen any such passage from Zhang San and had written this one entirely by himself. The two of them made my head buzz. I called Zhao Liu to ask what was going on, but Zhao Liu threw up his hands and said he could not manage this; it was between Zhang San and Zhang A. With a sigh... I had to call Zhang San, Zhang A, and Zhang B together for another afternoon meeting.

I am exhausted. I have hired ten people, including a full-time project manager, just to help me write one book. Why is it so difficult?

Because I made a mistake. I wrongly assumed that several people could not edit one document at the same time. In fact, the technology for many people to share and edit a document online has long been available; I simply did not know about it. Of course, Zhang San, Li Si, Wang Wu, and Zhao Liu never mentioned it to me either.

To improve efficiency, my writing team becomes a multi-master architecture

I decided to stop having each of them write their own document on their own computer. I merged the documents of Zhang San, Li Si, and Wang Wu back into one (it was a single book to begin with, so why did I ever split it?). They each remain responsible for their own chapters and edit them through the shared document.

At the same time, to guard against local disk failures, I bought a shared storage system and stored the document on it.

Hearing this, Zhang A, Zhang B, Li A, Li B, Wang A, and Wang B all came to me. Their job was to copy documents every day. Now that the documents had been merged, whose document were they supposed to copy? I said: you don't need to copy documents anymore, so go home. But on second thought, Zhang San, Li Si, and Wang Wu might still take leave, so I added: Wang A, you can stay. If any of the three takes leave, you will write in their place.

The other five objected when they heard they were about to be dismissed. What if Zhang San and Wang A both fall ill at the same time, they shouted, and there is nobody to replace them? I thought they had a point, so I told them: you don't need to come to the office from tomorrow, and I won't pay you a salary. If we ever run short of people, you can come in at any time and be paid by the day. In this way, my team had five fewer permanent members.

Seeing so many people leave, Zhao Liu panicked a little and quietly asked me what would become of him. I patted him on the shoulder and said the project manager was still needed, but now that the document was shared, I would hand each plot line directly to one of the writers to own, and there would be no more arguing. Zhao Liu, your main task is to watch everyone's workload and condition, and make sure nobody is worn out.

In this way, my project team was cut in half and the quarrels stopped. Everything became much smoother.

062905.png

The model above is a consistent multi-master, multi-write database model. It is multi-write based on a Share Everything design: all database instances share all the data and work on it collaboratively. Each instance can serve business connections and access any database or table as needed. This saves many database instances and many compute servers.

At the same time, because everyone shares the same data, collaboration is far easier than when the data is split into several parts. When you need to scale out, you simply add instances, so the application barely notices and maintenance and management become much simpler.

What about sharded multi-master?

The industry also has a sharded multi-master approach. How does it work? Let's go back to my writing studio.

So that several people can write the book together, everyone still keeps a separate document and writes in it, with no merging. But Zhao Liu publishes a division-of-labor table for me, and everyone has to know how the documents are divided among the three writers. When I need to add a new plot line, I hand the work to any one person (say, Zhang San). Whatever part he is not responsible for, he passes on to the person who is (say, Wang Wu), and when it is written, he (Zhang San) hands the result back to me. In this way, Zhang San, Li Si, and Wang Wu can each receive writing tasks directly and exchange work among themselves according to the table. My job looks simple: I just send a task to any of the three.

062906.png

But a few days later, I checked the manuscript and found a mistake in one passage. It turned out that Zhang San had been given this task, so I called him in and scolded him. Zhang San did not think it was his fault and said that Li Si was the one who got it wrong. I was furious: "The task was given to you. Do you only care about your own part? Don't you know how to follow it through end to end?"

Zhang San spread his hands: "I am not responsible for Li Si's content. I only pass the message along. I am not Li Si's boss." I glared at Li Si, who immediately jumped up: "Boss, I get a pile of work directly from you every day, and Zhang San and Wang Wu forward a pile more to me. Their instructions are unclear and I am swamped. Zhang San and Wang Wu have had little to do lately. Why does everything have to land on me?"

Li Si's excuses nearly made me explode, and then I turned around and saw Zhang A, Zhang B, Li A, and Li B reading newspapers and drinking tea nearby. That made me even angrier. "Why aren't you working?!" The four of them answered with a grin: "Boss, we copy what Zhang San and Li Si write, but they are both in a meeting with you..."

Can improved sharding save me?

I couldn't take it anymore. Send these tea drinkers away! Zhang San, Li Si, Wang Wu, from now on you will copy each other's work. I am not paying people who do no work. Hearing this, Li Si collapsed to the floor. "Boss, are you serious? We can't even keep up with the writing, and now we have to copy too. Do you think copying takes no time?"

I ignored the dazed Li Si, called Zhao Liu, and asked him to break everyone's work down in finer detail and balance the tasks. Zhao Liu said that however the work was split, the total amount stayed the same, and these people really would be too busy to copy each other's work.

I gritted my teeth and divided the chapters yet again so that more people could write and copy together. When Zhao Liu heard that people were going to be added, he said nothing, but his feelings were mixed. Add people? The split documents would have to be split again and the division-of-labor table rebuilt. That meant a long stretch of rework for him, and everyone's writing would have to stop for at least half a day. And the key question: would the quarrels actually stop once people were added?

062907.png

This kind of sharded multi-master solves the idle-resource problem by interleaving masters and standbys across nodes at the shard level, but the extra overhead of the standby work (that is, the copying) is still there. The shards also remain logically isolated islands. Some coordination is possible, but they are not truly connected, and scaling out or in cannot be completely transparent to the business. The simplification this brings to management and operations is therefore far from sufficient.
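The forwarding behavior in the story can be sketched as follows. This is a toy Python model under stated assumptions: `Node`, the `ownership` table, and the chapter names are hypothetical, and each write touches only one shard.

```python
class Node:
    """Hypothetical writer node that owns some chapters and forwards the rest."""

    def __init__(self, name, ownership, cluster):
        self.name = name
        self.ownership = ownership  # shared table: chapter -> owning node name
        self.cluster = cluster      # shared table: node name -> Node
        self.local = {}             # chapters stored on this node only
        self.forwarded = 0          # requests handed over to another node

    def write(self, chapter, text):
        owner = self.ownership[chapter]
        if owner == self.name:
            self.local[chapter] = text
        else:
            # Not our shard: pass the request on to the owner.
            self.forwarded += 1
            self.cluster[owner].write(chapter, text)


ownership = {"ch1": "zhang", "ch2": "li", "ch3": "wang"}
cluster = {}
for name in ("zhang", "li", "wang"):
    cluster[name] = Node(name, ownership, cluster)

# Any node accepts the request, but the data still lives on exactly one node.
cluster["zhang"].write("ch2", "new plot")
print(cluster["li"].local, cluster["zhang"].forwarded)
```

Every node looks like a master to the caller, yet each chapter still has a single home, and adding a node means rewriting the `ownership` table and moving chapters.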

Is MGR a multi-master solution?

MySQL itself also provides a multi-master method called MGR (MySQL Group Replication). How does multi-master work in this approach? Let's invite Zhang San, Li Si, and Wang Wu back on stage...

First, Zhang San, Li Si, and Wang Wu merged the documents into one book, but they did not edit it collaboratively as a shared document. Each of them copied the complete document onto the local disk of their own computer and edited it there. To keep the copies from diverging, I made a rule: after Zhang San finishes writing and before he saves his changes (sharp readers can think about why the notification comes before saving rather than before he starts writing), he must notify Li Si and Wang Wu so that they apply and save the same changes together. The same goes for Li Si, who must notify Zhang San and Wang Wu before saving each change. I figured this should be no different from editing a shared document, so I would not need to buy separate shared storage. They would just spend some time communicating every day.

But soon Zhang San lost his temper. He demanded to know why Li Si and Wang Wu had not saved his changes. Wang Wu was puzzled and said that Li Si had told him to change that paragraph first, and Zhang San's notification had arrived later. Zhang San was furious and asked Li Si why he had changed that paragraph. He had spent the whole afternoon writing it for nothing, and instead of going home he now had to work overtime. Li Si was upset too: "We only notify each other when we are ready to save. Is it my fault you write slowly? Of course I can change it." Zhao Liu came to report the situation to me and added, in passing, that I was not doing my job as the boss. Fine, the blame is all mine. All I could do was coordinate and soothe each of them.

062908.png

This so-called multi-master mode has broken down. Everyone holds a complete copy of the data, so resource utilization is even worse. The multi-copy storage problem is not solved, and every node is still constantly negotiating with the others. Once there is a conflict, the work is wasted and business intervention is needed to resolve it. Managing and maintaining such a setup is also a test of patience. As a result, this method is not widely used, and those who do use MGR have their own tales of woe.
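The wasted afternoon in the story corresponds to a conflict detected at save time. Below is a heavily simplified Python sketch of a "first to save wins" check. The `Group` class and its version map are hypothetical and only loosely modeled on the idea of certifying changes at commit time; this is not how MGR is actually implemented.

```python
class Group:
    """Hypothetical group state: the last agreed version of each paragraph."""

    def __init__(self):
        self.committed_version = {}  # paragraph -> version of the last accepted save

    def certify(self, snapshot, write_keys):
        # Reject the save if any paragraph changed after the writer's snapshot.
        for key in write_keys:
            if self.committed_version.get(key, 0) != snapshot.get(key, 0):
                return False
        # Accept: bump the version of every paragraph that was written.
        for key in write_keys:
            self.committed_version[key] = self.committed_version.get(key, 0) + 1
        return True


group = Group()

# Zhang San and Li Si both start editing the same paragraph from the same state.
zhang_snapshot = dict(group.committed_version)
li_snapshot = dict(group.committed_version)

li_ok = group.certify(li_snapshot, ["paragraph-7"])        # Li Si saves first
zhang_ok = group.certify(zhang_snapshot, ["paragraph-7"])  # Zhang San saves later
print("Li Si saved:", li_ok, "| Zhang San saved:", zhang_ok)
```

Because the check happens only when saving, the losing writer finds out after the work is done, and the application has to retry or resolve the conflict.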

The era of shared-data multi-master distributed databases has arrived

This is the end of the book-writing story. To make writing a book together more efficient, we tried three different approaches: consistent multi-master, sharded multi-master, and MGR-style multi-master.

At this point you may say that the premise of the story is unreasonable. Shared document editing has been around for a long time, and with it the division of labor could have been sensible from the start, with no need for such a complicated process.

True, but in the course of distributed database transformation, many businesses actually work much like the writers in this story, feeling their way forward. Many are also trying to solve the problem with sharded multi-master or MGR-style multi-master. However, neither approach maximizes resource utilization, and both carry high management overhead.

Some readers have asked: efficient, sensible collaboration among many writers depends on shared editing technology, so do databases now have such a "multi-person shared editing" method? They do. As shown in the following figure:

062909.png

All database instances share the same data, which eliminates redundant copies. Built on a global cache and a global resource management system, the service is fully transparent to the business and supports multi-read, multi-write. Currently, some vendors support the MySQL ecosystem non-intrusively: a plug-in lets the solution connect directly to existing database clusters without modification. Support for the GaussDB ecosystem is also expected to be released in the future.
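As a rough illustration of the shared-data idea, the Python sketch below lets three instances update a single copy of the data while a global lock manager serializes access per row. `SharedStorage`, `GlobalLockManager`, and `Instance` are hypothetical names, and using per-row locks is an assumption about what a global resource management system might do, not a description of any vendor's design.

```python
import threading


class SharedStorage:
    """Hypothetical shared storage: a single copy of the data for all instances."""

    def __init__(self):
        self.rows = {}


class GlobalLockManager:
    """Hypothetical global resource manager handing out one lock per row."""

    def __init__(self):
        self._guard = threading.Lock()
        self._row_locks = {}

    def lock_for(self, key):
        with self._guard:
            return self._row_locks.setdefault(key, threading.Lock())


class Instance:
    """Hypothetical database instance; every instance can write every row."""

    def __init__(self, name, storage, locks):
        self.name = name
        self.storage = storage
        self.locks = locks

    def increment(self, key):
        with self.locks.lock_for(key):
            self.storage.rows[key] = self.storage.rows.get(key, 0) + 1


def write_many(instance, key, times):
    for _ in range(times):
        instance.increment(key)


storage = SharedStorage()
locks = GlobalLockManager()
instances = [Instance(n, storage, locks) for n in ("zhang", "li", "wang")]

threads = [
    threading.Thread(target=write_many, args=(inst, "chapter-1", 1000))
    for inst in instances
]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = storage.rows["chapter-1"]
print("updates applied to the single shared copy:", total)
```

Adding a fourth instance to this sketch needs no data movement: it only needs a reference to the same storage and lock manager.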
