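diff --git a/docs/cn/design.md b/docs/cn/design.md
index f97980ed092efef59a99d005bb41ebf61736d289..5b11912bafe3662cacc160793bbe5c8f420efc3c 100644
--- a/docs/cn/design.md
+++ b/docs/cn/design.md
@@ -48,7 +48,7 @@ The rocketmq-remoting module is the module responsible for network communication in the RocketMQ message queue
 
 ![](image/rocketmq_design_3.png)
 
-### 2.2 Protocol Design and Encoding/Decoding
+#### 2.2 Protocol Design and Encoding/Decoding
 To send a message between Client and Server, both sides need to agree on a protocol for the message, so RocketMQ has to define its own message protocol. In addition, to transmit messages efficiently over the network and to read received messages, the messages need to be encoded and decoded. In RocketMQ, the RemotingCommand class encapsulates all data content during message transmission; it contains not only all the data structures but also the encoding and decoding operations.
 
 Header field | Type | Request description | Response description
@@ -176,9 +176,9 @@ Apache RocketMQ has supported distributed transactional messages since version 4.3.0; here RocketMQ
 
 ![](image/rocketmq_design_11.png)
 
-RocketMQ's specific implementation strategy is: if the message being written is a transactional message, its Topic, Queue and other attributes are replaced, and the original Topic and Queue information is stored in the message's properties. Because the topic has been replaced, the message is not forwarded to the consume queue of the original topic, so consumers cannot perceive the message and will not message. In fact, changing the message topic is a common "trick" in RocketMQ; recall how scheduled tasks are implemented.
+RocketMQ's specific implementation strategy is: if the message being written is a transactional message, its Topic, Queue and other attributes are replaced, and the original Topic and Queue information is stored in the message's properties. Because the topic has been replaced, the message is not forwarded to the consume queue of the original topic, so consumers cannot perceive the message and will not consume it. In fact, changing the message topic is a common "trick" in RocketMQ; recall how delayed messages are implemented.
 
-2.If the first-phase message is visible to the user
+2.Commit and Rollback operations and the introduction of the Op message
 
 After the first phase has written a message that is invisible to the user, the second phase must make the message visible if the operation is a Commit, or revoke the first-phase message if it is a Rollback. Consider Rollback first. Since the first-phase message is already invisible to the user, there is no need to actually revoke it (in fact, RocketMQ cannot really delete a message, because the file is written sequentially). However, to distinguish it from a message whose status is undetermined (Pending status, with the transaction still unresolved), an operation is needed to mark the final status of the message. The RocketMQ transactional message design introduces the concept of the Op message, which marks a transactional message as having a determined status (Commit or Rollback). If a transactional message has no corresponding Op message, the status of the transaction cannot yet be determined (the second phase may have failed). With the Op message introduced, an Op operation is recorded for every transactional message, whether it is committed or rolled back. Compared with Rollback, Commit only additionally creates the index of the Half message before the Op message is written.
 
diff --git a/docs/en/design.md b/docs/en/design.md
new file mode 100644
index 0000000000000000000000000000000000000000..4bd788aa99eca5f35323b4a7df961bc9ae6accbc
--- /dev/null
+++ b/docs/en/design.md
@@ -0,0 +1,110 @@
+
+## Design
+### 1 Message Store
+
+![](../cn/image/rocketmq_design_1.png)
+
+
+#### 1.1 The Architecture of Message Store
+
+#### 1.2 PageCache and Memory-Map(Mmap)
+
+#### 1.3 Message Flush
+
+![](../cn/image/rocketmq_design_2.png)
+
+
+### 2 Communication Mechanism
+
+#### 2.1 The class diagram of Remoting module
+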
+![](../cn/image/rocketmq_design_3.png)
+
+#### 2.2 The design of protocol and encode/decode
+
+![](../cn/image/rocketmq_design_4.png)
+
+
+#### 2.3 The three ways and process of message communication
+
+![](../cn/image/rocketmq_design_5.png)
+
+#### 2.4 The multi-thread design of Reactor
+
+![](../cn/image/rocketmq_design_6.png)
+
+
+### 3 Message Filter
+
+![](../cn/image/rocketmq_design_7.png)
+
+### 4 Load Balancing
+
+#### 4.1 Load Balancing of the Producer
+
+#### 4.2 Load Balancing of the Consumer
+
+![](../cn/image/rocketmq_design_8.png)
+
+
+![](../cn/image/rocketmq_design_9.png)
+
+
+
+### 5 Transactional Message
+Apache RocketMQ has supported distributed transactional messages since version 4.3.0. RocketMQ implements transactional messages using the 2PC (two-phase commit) protocol, and adds compensation logic to handle timeouts or failures in the commit phase, as shown below.
+
+![](../cn/image/rocketmq_design_10.png)
+
+#### 5.1 The Process of RocketMQ Transactional Message
+The picture above shows the overall architecture of transactional messages, including the sending of the message (commit-request phase), the sending of the Commit/Rollback (commit phase), and the compensation process.
+
+1. The sending of the message and of the Commit/Rollback.
+    (1) The producer sends the message (called a Half message in RocketMQ).
+    (2) The server responds with the result of writing the Half message (success or failure).
+    (3) The producer executes the local transaction according to that result (the local transaction is not executed if the write failed).
+    (4) The producer sends a Commit or Rollback to the broker according to the result of the local transaction (a Commit generates the message index and makes the message visible to consumers).
+
+2. Compensation process
+    (1) For a transactional message without a Commit/Rollback (that is, a message in pending status), the broker initiates a "back-check" request.
+    (2) The producer receives the "back-check" request and checks the status of the local transaction corresponding to the "back-check" message.
+    (3) The producer redoes the Commit or Rollback based on the local transaction status.
+The compensation phase handles the case where the Commit or Rollback of a message times out or fails.
+
+#### 5.2 The design of RocketMQ Transactional Message
+1. The transactional message is invisible to users in the first phase (commit-request phase)
+
+    In the main process of a transactional message, the first-phase message is invisible to the user. This is also the biggest difference from a normal message. So how do we write the message while keeping it invisible to the user? RocketMQ's solution is as follows: if the message is a Half message, the topic and queueId of the original message are backed up, and the topic is then changed to RMQ_SYS_TRANS_HALF_TOPIC. Since no consumer group subscribes to this topic, consumers cannot consume the Half message. RocketMQ then starts a scheduled task that pulls messages from RMQ_SYS_TRANS_HALF_TOPIC, obtains a channel according to the producer group, sends a back-check to query the local transaction status, and decides whether to commit or roll back the message according to that status.
+
+    In RocketMQ, the storage structure of messages in the broker is as follows. Each message has corresponding index information. The consumer reads the content of a message through the secondary index of the ConsumeQueue. The flow is as follows:
+
+![](../cn/image/rocketmq_design_11.png)
+
+    The specific implementation strategy of RocketMQ is: when a transactional message is written, the topic and queueId of the message are replaced, and the original topic and queueId are stored in the properties of the message. Because the topic has been replaced, the message is not forwarded to the ConsumeQueue of the original topic, so the consumer cannot perceive the message and will not consume it. In fact, changing the topic is a common technique in RocketMQ (recall how delayed messages are implemented).
+
+2. Commit/Rollback operation and introduction of the Op message
+
+    After a message that is invisible to the user has been written in the first phase, there are two cases in the second phase. One is the Commit operation, after which the message needs to become visible to the user; the other is the Rollback operation, after which the first-phase message (Half message) needs to be revoked. In the Rollback case, since the first-phase message is itself invisible to the user, there is no need to actually revoke it (in fact, RocketMQ cannot really delete a message, because the file is written sequentially). However, some operation is still needed to identify the final status of the message and distinguish it from a message in pending status. To do this, the concept of the "Op message" is introduced, which marks a transactional message as having a determined status (Commit or Rollback). If a transactional message does not have a corresponding Op message, the status of the transaction is still undetermined (probably because the second phase failed). With the Op message introduced, RocketMQ records an Op message for every Half message, regardless of whether it is committed or rolled back. The only difference between Commit and Rollback is that, for Commit, the index of the Half message is created before the Op message is written.
+
+3. How the Op message is stored and how it corresponds to the Half message
+
+    RocketMQ writes the Op message to a specific system topic (RMQ_SYS_TRANS_OP_HALF_TOPIC), which is created via the method TransactionalMessageUtil.buildOpTopic(). This topic is an internal topic (like RMQ_SYS_TRANS_HALF_TOPIC) and will not be consumed by the user. The content of the Op message is the physical offset of the corresponding Half message. Through the Op message we can locate the Half message for the subsequent back-check operation.
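
The topic swap and the Op message bookkeeping described in items 1 to 3 can be sketched in a few lines of Java. This is a toy in-memory model, not RocketMQ's broker code: the `HalfMessageSketch` class, its `Msg` type, the method names, and the use of a list index as the "physical offset" are all hypothetical, and the `REAL_TOPIC`/`REAL_QID` property keys are assumed for illustration. Only the two system topic names come from the text above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class HalfMessageSketch {
    static final String HALF_TOPIC = "RMQ_SYS_TRANS_HALF_TOPIC";
    static final String OP_TOPIC = "RMQ_SYS_TRANS_OP_HALF_TOPIC";

    /** Simplified message (hypothetical type): topic, queueId, properties, body. */
    static final class Msg {
        String topic;
        int queueId;
        final Map<String, String> props = new HashMap<>();
        final String body;

        Msg(String topic, int queueId, String body) {
            this.topic = topic;
            this.queueId = queueId;
            this.body = body;
        }
    }

    /** Toy append-only log; the list index stands in for the physical offset. */
    static final List<Msg> LOG = new ArrayList<>();

    static long append(Msg m) {
        LOG.add(m);
        return LOG.size() - 1;
    }

    /** Phase 1: back up topic/queueId in the properties, then redirect to the half topic. */
    static long putHalf(Msg m) {
        m.props.put("REAL_TOPIC", m.topic);               // assumed property key
        m.props.put("REAL_QID", String.valueOf(m.queueId)); // assumed property key
        m.topic = HALF_TOPIC;
        m.queueId = 0;
        return append(m);
    }

    /** Record an Op message whose body is the offset of the Half message it resolves. */
    static void putOp(long halfOffset) {
        append(new Msg(OP_TOPIC, 0, String.valueOf(halfOffset)));
    }

    /** Commit: restore the real topic/queueId as a normal message, then write the Op message. */
    static void commit(long halfOffset) {
        Msg half = LOG.get((int) halfOffset);
        append(new Msg(half.props.get("REAL_TOPIC"),
                Integer.parseInt(half.props.get("REAL_QID")), half.body));
        putOp(halfOffset);
    }

    /** Rollback: nothing is deleted; only the Op message is written. */
    static void rollback(long halfOffset) {
        putOp(halfOffset);
    }

    /** Offsets of Half messages with no Op message, i.e. still in pending status. */
    static List<Long> pending() {
        Set<Long> resolved = new HashSet<>();
        for (Msg m : LOG) {
            if (OP_TOPIC.equals(m.topic)) {
                resolved.add(Long.parseLong(m.body));
            }
        }
        List<Long> out = new ArrayList<>();
        for (int i = 0; i < LOG.size(); i++) {
            if (HALF_TOPIC.equals(LOG.get(i).topic) && !resolved.contains((long) i)) {
                out.add((long) i);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long a = putHalf(new Msg("OrderTopic", 3, "order-1"));
        long b = putHalf(new Msg("OrderTopic", 1, "order-2"));
        putHalf(new Msg("OrderTopic", 2, "order-3")); // never resolved
        commit(a);
        rollback(b);
        System.out.println("pending half offsets: " + pending());
    }
}
```

Running `main` prints the offsets of the Half messages that have no Op message yet, which is the set a back-check would target in this simplified model.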
+
+![](../cn/image/rocketmq_design_12.png)
+
+4. Index construction of Half messages
+
+    When the Commit operation of the second phase is performed, the index of the Half message needs to be built. Since the Half message was written to a special topic (RMQ_SYS_TRANS_HALF_TOPIC) in the first phase of 2PC, it has to be read back from that topic when the index is built; its topic and queueId are replaced with the real target topic and queueId, and it is then written as a normal message that is visible to the user. In short, the second phase restores a complete normal message from the content of the Half message stored in the first phase, and then goes through the normal message-writing process.
+
+5. How is a message handled when the second phase fails?
+
+    If the Commit/Rollback phase fails, for example because a network problem causes the Commit to fail, a strategy is required to make sure the message is eventually committed. RocketMQ uses a compensation mechanism called "back-check". The broker initiates a back-check request for a message in pending status and sends it to the corresponding producer side (a producer in the same producer group as the one that sent the Half message). The producer checks the status of the local transaction and redoes the Commit or Rollback. The broker performs the back-check by comparing the messages in RMQ_SYS_TRANS_HALF_TOPIC with those in RMQ_SYS_TRANS_OP_HALF_TOPIC, and advances the checkpoint (which records the transactional messages whose status is already determined).
+
+    RocketMQ does not back-check the status of a transactional message endlessly. The default number of back-checks is 15. If the transaction status is still unknown after 15 back-checks, RocketMQ rolls back the message by default.
+### 6 Message Query
+
+#### 6.1 Query messages by messageId
+
+#### 6.2 Query messages by message key
+
+![](../cn/image/rocketmq_design_13.png)