Replication factor is the total number of copies of the data stored in an Apache Kafka cluster.
min.insync.replicas is the minimum number of copies of the data that you are willing to have online at any time to continue running and accepting new incoming messages.
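To make the two definitions concrete for myself, here is a small arithmetic sketch. It is only my reading of the definitions above, not Kafka code, and the function name is made up for illustration. It assumes a producer using acks=all, since that is the mode in which min.insync.replicas is enforced.

```python
def max_replicas_down_while_writable(replication_factor: int,
                                     min_insync_replicas: int) -> int:
    """How many replicas of one partition can be out of sync while
    acks=all writes are still accepted (hypothetical helper)."""
    if min_insync_replicas < 1 or min_insync_replicas > replication_factor:
        raise ValueError("need 1 <= min.insync.replicas <= replication factor")
    # Writes are accepted as long as at least min_insync_replicas copies
    # are in sync, so the rest of the copies may be unavailable.
    return replication_factor - min_insync_replicas


print(max_replicas_down_while_writable(3, 2))  # 1
print(max_replicas_down_while_writable(3, 1))  # 2
```

So with a replication factor of 3 and min.insync.replicas of 2, I would expect the partition to keep accepting writes with one replica down, but not with two.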
Suppose I start a 5-node cluster and create a topic with a replication factor of 3, and the producer uses acks=all.
- When I publish a message, will I get the ack only after the data has been replicated to all 3 replicas, every time? What if 3 out of the 5 nodes are down: will the producer wait for the nodes to come back up, replicate the message, and only then receive the ack? I believe min.insync.replicas is 1 by default here?
- If min.insync.replicas is set to 2 and the replication factor is 3, does this mean that the ack is sent back after the data is replicated to the 3 replicas, and that the cluster will make sure the data is present on at least 2 nodes at all times? Is this understanding correct?
- If min.insync.replicas is 2 and the replication factor is 3, will I get the ack as soon as the data is replicated to 2 nodes, with the leader adding it to the 3rd node later, or will the ack be returned only after the data is replicated to all 3 nodes?
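The scenarios in the list above can be written down as a toy model. This is a sketch of the acks=all rule as I understand it from the Kafka documentation, not Kafka's actual implementation: the leader acknowledges once every replica currently in the in-sync replica set (ISR) has the record, and rejects the write with a NotEnoughReplicas error if the ISR is smaller than min.insync.replicas. The names `ProduceResult`, `produce_acks_all`, and `live_isr_size` are hypothetical, and the model ignores timeouts, retries, and unclean leader election.

```python
from dataclasses import dataclass


@dataclass
class ProduceResult:
    acked: bool
    reason: str
    replicas_confirmed: int  # copies holding the record when the ack is sent


def produce_acks_all(live_isr_size: int,
                     min_insync_replicas: int = 1) -> ProduceResult:
    """Toy model of one acks=all produce request to a single partition.

    live_isr_size counts the leader plus the followers that are
    currently in sync (assumption: 0 means no replica is alive).
    """
    if live_isr_size < 1:
        return ProduceResult(False, "no leader available", 0)
    if live_isr_size < min_insync_replicas:
        return ProduceResult(False, "NotEnoughReplicas", 0)
    # The ack waits for every in-sync replica, not just min_insync_replicas.
    return ProduceResult(True, "all in-sync replicas have the record",
                         live_isr_size)


# Replication factor 3, min.insync.replicas 2, all replicas healthy.
print(produce_acks_all(3, 2).replicas_confirmed)  # 3
# One follower has fallen out of the ISR: still acked, by the remaining 2.
print(produce_acks_all(2, 2).acked)               # True
# Only the leader is left: rejected when min.insync.replicas is 2 ...
print(produce_acks_all(1, 2).reason)              # NotEnoughReplicas
# ... but acked by the leader alone with the default of 1.
print(produce_acks_all(1).acked)                  # True
```

If this model is right, min.insync.replicas is a floor below which writes are refused, not the number of copies the ack waits for, and the producer gets an error rather than waiting for brokers to come back.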
Basically, I am interested in the ack latency and in the durability of the data, which is my highest priority, so I am getting confused by some of these concepts.