Multi-Master Replication: Conflict Resolution, CRDTs, Galera, and BDR

Multi-Master Replication: Conflict Resolution, CRDTs, Galera, and BDR

Multi-master replication allows writes to multiple database nodes simultaneously. Unlike primary-replica setups, there is no single point for writes. The trade-off is complexity, particularly around conflict resolution.

Why Multi-Master?

The primary reasons to consider multi-master replication are:

  • Multi-region writes : Users in the US and Europe both write with local latency.

  • Zero downtime upgrades : Any node can be taken offline without losing write availability.

  • Read scalability with local writes : Each node can serve both reads and writes with low latency.

Conflict Resolution Strategies

When two nodes concurrently modify the same row, a conflict occurs. Resolution strategies vary by system:

Last-Write-Wins (LWW)

Each row carries a timestamp. The write with the latest timestamp wins. Simple but lossy:

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Cassandra default

UPDATE users SET email = 'new@example.com', updated_at = now() WHERE id = 1;

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- The LWW with the highest timestamp wins

Pros : Simple, always converges. Cons : Loses data silently.

Application-Mediated Conflict Resolution

The database detects conflicts and presents them to the application for resolution. BDR (Bi-Directional Replication) for PostgreSQL supports this:

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Create a conflict handler function

CREATE OR REPLACE FUNCTION resolve_conflict()

RETURNS trigger AS $$

BEGIN

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Custom logic: keep the row with the higher version

IF NEW.version >= OLD.version THEN

RETURN NEW;

END IF;

RETURN OLD;

END;

$$ LANGUAGE plpgsql;

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Apply to the replication set

SELECT bdr.conflict_handler('orders', 'resolve_conflict');

CRDT-Based Reconciliation

Conflict-free Replicated Data Types (CRDTs) mathematically guarantee convergence without coordination. Instead of storing a scalar value, you store a data structure where concurrent operations commute or combine:

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Counter CRDT table

CREATE TABLE page_views (

page_id INTEGER PRIMARY KEY,

counter INTEGER DEFAULT 0

);

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Each replica increments independently

UPDATE page_views SET counter = counter + 1 WHERE page_id = 42;

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- The final value after reconciliation: MAX of all replicas' counters (for grow-only counters)

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- or SUM (for counters implemented as PN-Counters)

More sophisticated CRDTs include:

  • G-Set (grow-only set): Elements can only be added. Union merges converge.

  • LWW-Register : Last-write-wins register (if using wall clocks) or compare-and-swap.

  • OR-Set (observed-remove set): Supports both add and remove without losing concurrent adds.

Galera Cluster (MySQL / MariaDB)

Galera Cluster implements synchronous multi-master replication for MySQL. All nodes coordinate before committing:

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- All nodes are writable

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- On each node:

CREATE TABLE users (

id INT PRIMARY KEY AUTO_INCREMENT,

email VARCHAR(255) NOT NULL

);

INSERT INTO users (email) VALUES ('alice@example.com');

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- This INSERT is certified on all nodes before returning

Galera Key Properties

  • Synchronous replication : All nodes apply changes simultaneously.

  • Certification-based : Transactions are certified on every node before commit.

  • Automatic node provisioning : New nodes join via State Snapshot Transfer (SST) or Incremental State Transfer (IST).

  • No replication lag : All nodes are consistent at commit time.

Galera Limitations

  • Cluster size penalty: 2-3 nodes is optimal. More nodes increase certification overhead.

  • Full table writes locks must be avoided; SELECT ... FOR UPDATE on uncached workloads causes deadlocks.

  • Network latency between nodes directly impacts write latency (the "slowest node" problem).

  • REPEATABLE READ is the default; SERIALIZABLE causes high conflict rates.

PostgreSQL BDR (Bi-Directional Replication)

BDR extends PostgreSQL's logical replication for multi-master scenarios. It operates at row level with conflict resolution:

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Configure BDR group

SELECT bdr.create_node(

node_name := 'node1',

dsn := 'host=node1.example.com port=5432 dbname=appdb'

);

SELECT bdr.create_node(

node_name := 'node2',

dsn := 'host=node2.example.com port=5432 dbname=appdb'

);

SELECT bdr.create_group(

group_name := 'global_app',

nodes := ARRAY['node1', 'node2']

);

BDR supports multiple conflict resolution modes:

  • latest_timestamp_wins

  • latest_version_wins

  • apply_remote

  • apply_local

CAP Theorem Implications

Multi-master systems live in the CAP trade-off space:

  • Galera : CP (Consistency + Partition tolerance). Sacrifices availability during network partitions because writes must certify on all nodes.

  • Cassandra-style LWW : AP (Availability + Partition tolerance). Sacrifices immediate consistency but always accepts writes.

  • BDR with custom handlers : Tunable. Configure per-table conflict strategies.

When to Avoid Multi-Master

Multi-master replication is not the default for good reasons:

  • Single-region apps : A single PostgreSQL primary with streaming replicas handles the vast majority of workloads.

  • Apps with hot rows : If 90% of writes target 10 rows (e.g., a counter), conflicts are guaranteed.

  • Teams without operational maturity : Multi-master requires monitoring replication lag, conflict rates, and node health.

Monitoring Conflicts

Whichever system you choose, monitor conflict rates:

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- BDR conflict stats

SELECT * FROM bdr.conflict_history;

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Galeria cluster status

SHOW STATUS LIKE 'wsrep_cluster_size';

SHOW STATUS LIKE 'wsrep_local_cert_failures';

A rising conflict rate indicates application-level contention that might be better solved by sharding the data model rather than relying on replication to resolve conflicts.

Multi-master replication is a powerful but complex tool. Start with single-master and add multi-master only when your availability or latency requirements genuinely demand it.