Multi-Master Replication: Conflict Resolution, CRDTs, Galera, and BDR

Multi-master replication allows writes to multiple database nodes simultaneously. Unlike primary-replica setups, there is no single point for writes. The trade-off is complexity, particularly around conflict resolution.
Why Multi-Master?
The primary reasons to consider multi-master replication are:
-
Multi-region writes : Users in the US and Europe both write with local latency.
-
Zero downtime upgrades : Any node can be taken offline without losing write availability.
-
Read scalability with local writes : Each node can serve both reads and writes with low latency.
Conflict Resolution Strategies
When two nodes concurrently modify the same row, a conflict occurs. Resolution strategies vary by system:
Last-Write-Wins (LWW)
Each row carries a timestamp. The write with the latest timestamp wins. Simple but lossy:
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Cassandra default
UPDATE users SET email = 'new@example.com', updated_at = now() WHERE id = 1;
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- The LWW with the highest timestamp wins
Pros : Simple, always converges. Cons : Loses data silently.
Application-Mediated Conflict Resolution
The database detects conflicts and presents them to the application for resolution. BDR (Bi-Directional Replication) for PostgreSQL supports this:
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Create a conflict handler function
CREATE OR REPLACE FUNCTION resolve_conflict()
RETURNS trigger AS $$
BEGIN
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Custom logic: keep the row with the higher version
IF NEW.version >= OLD.version THEN
RETURN NEW;
END IF;
RETURN OLD;
END;
$$ LANGUAGE plpgsql;
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Apply to the replication set
SELECT bdr.conflict_handler('orders', 'resolve_conflict');
CRDT-Based Reconciliation
Conflict-free Replicated Data Types (CRDTs) mathematically guarantee convergence without coordination. Instead of storing a scalar value, you store a data structure where concurrent operations commute or combine:
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Counter CRDT table
CREATE TABLE page_views (
page_id INTEGER PRIMARY KEY,
counter INTEGER DEFAULT 0
);
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Each replica increments independently
UPDATE page_views SET counter = counter + 1 WHERE page_id = 42;
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- The final value after reconciliation: MAX of all replicas' counters (for grow-only counters)
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- or SUM (for counters implemented as PN-Counters)
More sophisticated CRDTs include:
-
G-Set (grow-only set): Elements can only be added. Union merges converge.
-
LWW-Register : Last-write-wins register (if using wall clocks) or compare-and-swap.
-
OR-Set (observed-remove set): Supports both add and remove without losing concurrent adds.
Galera Cluster (MySQL / MariaDB)
Galera Cluster implements synchronous multi-master replication for MySQL. All nodes coordinate before committing:
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- All nodes are writable
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- On each node:
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT,
email VARCHAR(255) NOT NULL
);
INSERT INTO users (email) VALUES ('alice@example.com');
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- This INSERT is certified on all nodes before returning
Galera Key Properties
-
Synchronous replication : All nodes apply changes simultaneously.
-
Certification-based : Transactions are certified on every node before commit.
-
Automatic node provisioning : New nodes join via State Snapshot Transfer (SST) or Incremental State Transfer (IST).
-
No replication lag : All nodes are consistent at commit time.
Galera Limitations
-
Cluster size penalty: 2-3 nodes is optimal. More nodes increase certification overhead.
-
Full table writes locks must be avoided;
SELECT ... FOR UPDATEon uncached workloads causes deadlocks. -
Network latency between nodes directly impacts write latency (the "slowest node" problem).
-
REPEATABLE READis the default;SERIALIZABLEcauses high conflict rates.
PostgreSQL BDR (Bi-Directional Replication)
BDR extends PostgreSQL's logical replication for multi-master scenarios. It operates at row level with conflict resolution:
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Configure BDR group
SELECT bdr.create_node(
node_name := 'node1',
dsn := 'host=node1.example.com port=5432 dbname=appdb'
);
SELECT bdr.create_node(
node_name := 'node2',
dsn := 'host=node2.example.com port=5432 dbname=appdb'
);
SELECT bdr.create_group(
group_name := 'global_app',
nodes := ARRAY['node1', 'node2']
);
BDR supports multiple conflict resolution modes:
-
latest_timestamp_wins -
latest_version_wins -
apply_remote -
apply_local
CAP Theorem Implications
Multi-master systems live in the CAP trade-off space:
-
Galera : CP (Consistency + Partition tolerance). Sacrifices availability during network partitions because writes must certify on all nodes.
-
Cassandra-style LWW : AP (Availability + Partition tolerance). Sacrifices immediate consistency but always accepts writes.
-
BDR with custom handlers : Tunable. Configure per-table conflict strategies.
When to Avoid Multi-Master
Multi-master replication is not the default for good reasons:
-
Single-region apps : A single PostgreSQL primary with streaming replicas handles the vast majority of workloads.
-
Apps with hot rows : If 90% of writes target 10 rows (e.g., a counter), conflicts are guaranteed.
-
Teams without operational maturity : Multi-master requires monitoring replication lag, conflict rates, and node health.
Monitoring Conflicts
Whichever system you choose, monitor conflict rates:
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- BDR conflict stats
SELECT * FROM bdr.conflict_history;
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Galeria cluster status
SHOW STATUS LIKE 'wsrep_cluster_size';
SHOW STATUS LIKE 'wsrep_local_cert_failures';
A rising conflict rate indicates application-level contention that might be better solved by sharding the data model rather than relying on replication to resolve conflicts.
Multi-master replication is a powerful but complex tool. Start with single-master and add multi-master only when your availability or latency requirements genuinely demand it.
Enjoy this article? Share your thoughts, questions, or experiences in the comments below — your insights help other readers too.
Join the discussion ↓