Change data capture in Postgres: How to use logical decoding and wal2json (2023)

Databases don’t exist in isolation. Databases live in an ecosystem of software components: caches, search, dashboards, analytics, other databases, data lakes, web apps, and more. Your Postgres database partners with all these components to deliver the unique value of your application.

How does your ecosystem stay in sync? How do those other components get informed about changes in Postgres? Change data capture, or CDC, refers to any solution that identifies new or modified data.

One solution: Maybe you add a timestamp column to your PostgreSQL tables to record change times. Periodically you run a query to pull all the new data—that timestamp column will help you to identify what’s new since your last pull. It can be a workable solution if you’re okay with that schema change and batched updates—only identifying what’s new on some periodic basis. You have to determine an appropriate batching frequency. Too often may tie up CPU; too infrequent and you’ll fall behind on updates.
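
As a minimal sketch of that polling approach (the orders table, last_updated column, and the timestamp literal below are hypothetical), you would add a tracking column and then query it on each batch run:

ALTER TABLE orders ADD COLUMN last_updated timestamptz NOT NULL DEFAULT now();

SELECT * FROM orders WHERE last_updated > '2023-05-01 00:00:00+00';  -- time of the previous pull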

If what you want is row-by-row streaming of Postgres data changes as they happen, you’ll need a different solution.

Real-time data updates allow you to keep disparate data systems in continuous sync and to respond quickly to new information. Like being able to show an online shopper recommendations that are based on items they’ve placed in their cart so far. Or, for a bank, being able to send customers a notification when there’s an unusual transaction on their credit card. Not to mention being able to use the customer’s response to adjust whether similar subsequent payments go through.

That’s where transaction log-based change data capture comes in. The transaction log naturally keeps track of each data change as it happens. You just need a way to read the log.

Logical decoding is the official name of PostgreSQL’s log-based change data capture feature. If the term logical decoding sounds unfamiliar, you may have heard of wal2json instead. Wal2json is a popular output plugin for logical decoding. People often use ‘wal2json’ to refer to ‘wal2json + logical decoding’. (FYI, Azure Database for PostgreSQL, our managed database service for Postgres, supports logical decoding and wal2json.)

Let’s dive into this powerful Postgres feature.

Decoding “logical decoding” in Postgres

Question: What is “logical decoding” decoding?

Answer: The WAL – PostgreSQL’s write ahead log.

The WAL (or transaction log) keeps track of all committed data transactions: It is the authority on everything that has happened on your Postgres instance. Its primary purpose is to help your Postgres database recover its state in the event of a crash. The WAL is written in a special format that Postgres understands because Postgres is its first and primary customer.

But such a log of all database changes is a handy way for other interested parties to understand what has happened on the database. And not only what happened, but in what order.

So, there’s this treasure trove of data changes in Postgres (the WAL) but it’s written in a format your non-Postgres services won’t understand. Enter logical decoding: the way Postgres enables you to translate (decode!) and emit the WAL into a form you can use.

Here’s what the logical decoding process looks like:

(Diagram: a row change is recorded in the WAL, decoded by an output plugin, and streamed through a replication slot to a consumer.)

When a row is changed in a Postgres table, that change is recorded in the WAL. If logical decoding is enabled, the record of that change is passed to the output plugin. The output plugin changes that record from the WAL format to the plugin’s format (e.g. a JSON object). Then the reformatted change exits Postgres via a replication slot. Finally, there’s the consumer. A consumer is any application of your choice that connects to Postgres and receives the logical decoding output.

Logical decoding in action

To stream data changes with logical decoding, three Postgres parameters need to be set in postgresql.conf:

wal_level = logical
max_replication_slots = 10
max_wal_senders = 10

Setting wal_level to logical allows the WAL to record the information needed for logical decoding. max_replication_slots and max_wal_senders must each be at least 1; set them higher if your server will be using more replication connections.

Then your Postgres server needs to be restarted to apply the changes.
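
Once the server is back up, you can confirm the new values from any SQL session with standard Postgres catalog queries:

SHOW wal_level;
SELECT name, setting FROM pg_settings WHERE name IN ('wal_level', 'max_replication_slots', 'max_wal_senders');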

I’m using an Azure Database for PostgreSQL server. In Azure, instead of individually setting those three parameters, you can set azure.replication_support:

az postgres server configuration set --resource-group rgroup --server-name pgserver --name azure.replication_support --value logical

and restart the server:

az postgres server restart --resource-group rgroup --name pgserver

For our example we’ll be using wal2json as the output plugin. WAL to JSON. As you might have guessed, this output plugin converts the Postgres write-ahead log output into JSON objects. wal2json is an open source project that you can download and install to your local Postgres setup. It is already installed on Azure Database for Postgres servers.

We need a slot and a consumer. pg_recvlogical is a Postgres application that can manage slots and consume the stream from them. pg_recvlogical is included in the Postgres distribution. That means if you’ve ever installed Postgres on your laptop, you probably have pg_recvlogical.

To confirm you have pg_recvlogical, you can open a terminal and run

pg_recvlogical --version

Now connect to your Postgres database with pg_recvlogical and create a slot. Connect with a user that has replication permissions.

pg_recvlogical -h pgserver.postgres.database.azure.com -U rachel@pgserver -d postgres --slot logical_slot --create-slot -P wal2json

Then start the slot.

pg_recvlogical -h pgserver.postgres.database.azure.com -U rachel@pgserver -d postgres --slot logical_slot --start -o pretty-print=1 -f -

This terminal won’t return control to you – it is waiting to receive the logical decoding stream.

So we should give the slot something to stream by making data changes in Postgres. To do that, connect to the Postgres database like you normally do. I’m using psql (in a different terminal). Create a table and modify some rows.

CREATE TABLE inventory (id SERIAL, item VARCHAR(30), qty INT, PRIMARY KEY(id));
INSERT INTO inventory (item, qty) VALUES ('apples', '100');
UPDATE inventory SET qty = 96 WHERE item = 'apples';
DELETE FROM inventory WHERE item = 'apples';

You can see the logical decoding output in the pg_recvlogical terminal. The format of the output is determined by wal2json, the output plugin we selected.

{ "change": [ ]}{ "change": [ { "kind": "insert", "schema": "public", "table": "inventory", "columnnames": ["id", "item", "qty"], "columntypes": ["integer", "character varying(30)", "integer"], "columnvalues": [1, "apples", 100] } ]}{ "change": [ { "kind": "update", "schema": "public", "table": "inventory", "columnnames": ["id", "item", "qty"], "columntypes": ["integer", "character varying(30)", "integer"], "columnvalues": [1, "apples", 96], "oldkeys": { "keynames": ["id"], "keytypes": ["integer"], "keyvalues": [1] } } ]}{ "change": [ { "kind": "delete", "schema": "public", "table": "inventory", "oldkeys": { "keynames": ["id"], "keytypes": ["integer"], "keyvalues": [1] } } ]}

To drop the replication slot,

pg_recvlogical -h pgserver.postgres.database.azure.com -U rachel@pgserver -d postgres --slot logical_slot --drop-slot
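
If you prefer to stay inside psql, Postgres also exposes slot management as SQL functions. A rough equivalent of the create/read/drop cycle above looks like this (note that pg_logical_slot_get_changes consumes the changes; use pg_logical_slot_peek_changes to read without consuming):

SELECT pg_create_logical_replication_slot('logical_slot', 'wal2json');
SELECT data FROM pg_logical_slot_get_changes('logical_slot', NULL, NULL, 'pretty-print', '1');
SELECT pg_drop_replication_slot('logical_slot');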

What data is captured by logical decoding?

Logical decoding can only output information about DML (data manipulation) events in Postgres, that is, INSERT, UPDATE, and DELETE. DDL (data definition) changes like CREATE TABLE, ALTER ROLE, and DROP INDEX are not emitted by logical decoding, and neither is any other command that isn't an INSERT, UPDATE, or DELETE. Remember when we ran CREATE TABLE in our logical decoding test above? The output was blank:

{ "change": [ ]}

For INSERT and UPDATE, a new row (or a new version of a row) is written to the table. This new row data is always sent to the output plugin.

For UPDATE and DELETE, an old row (or an old version of a row) is superseded or removed. Whether that old row’s information gets sent to the output plugin depends on a Postgres table property called REPLICA IDENTITY. By default, only the primary key values of the old row are sent from the WAL.

As an example, let’s look at the update we did earlier.

UPDATE inventory SET qty = 96 WHERE item = 'apples';

The logical decoding output was:

{ "change": [ { "kind": "update", "schema": "public", "table": "inventory", "columnnames": ["id", "item", "qty"], "columntypes": ["integer", "character varying(30)", "integer"], "columnvalues": [1, "apples", 96], "oldkeys": { "keynames": ["id"], "keytypes": ["integer"], "keyvalues": [1] } } ]}

You can see that we get all the new row’s data, [1, "apples", 96]. For old data, we only get the primary key column, id.

REPLICA IDENTITY has other settings that vary the information you can get about updated and deleted rows.
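
For instance, if you want old rows to be emitted with all of their columns instead of just the primary key, one option is to set the table’s replica identity to FULL (at the cost of writing more data to the WAL). The other settings are DEFAULT (primary key only), USING INDEX, and NOTHING.

ALTER TABLE inventory REPLICA IDENTITY FULL;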

Wal2json is one output plugin. There are other options

We’ve already talked about one popular output plugin: wal2json. There are other output plugins to choose from. If you are self-hosting Postgres, you could even make your own.

Two output plugins ship natively with Postgres (no additional installation needed):

  • test_decoding: Available on Postgres 9.4+. Though created to be just an example of an output plugin, test_decoding is still useful if your consumer supports it (e.g. Qlik Replicate).
  • pgoutput: Available since Postgres 10. pgoutput is used by Postgres to support logical replication, and is supported by some consumers for decoding (e.g. Debezium).

An output plugin receives data from the WAL. The plugin then decides what information to keep and how to present that information to you.

For example, new row data is always sent from the WAL to the output plugin. However, wal2json chooses not to output new row data for an UPDATE if the table has no primary key. test_decoding, on the other hand, will publish that row, but its output is not JSON formatted with name/value pairs. Pick the output plugin that suits your scenario.

Six important things to know about slots

Logical decoding outputs data changes as a stream. That stream is called a logical replication slot.

You should keep in mind the following when dealing with slots:

  1. Each slot has one output plugin (you choose which).
  2. Each slot provides changes from only one database.
  3. But a single database can have multiple slots.
  4. Each data change is normally emitted once per slot.
  5. However, if the Postgres instance restarts, a slot may re-emit changes. Your consumer needs to handle that situation.
  6. An unconsumed slot is a threat to your Postgres instance’s availability.
    Strong words, yes, but it’s true. It is critical that you monitor your slots. If a slot’s stream isn’t being consumed, Postgres will hold on to all the WAL files for those unconsumed changes. This can lead to a full storage volume or even transaction ID wraparound.

Postgres has a system view called pg_replication_slots that tracks the state of all replication slots. Keep your eye on the ‘active’ column: if a slot does not have a connection to a consumer, that column will be false.

(Screenshot: querying the pg_replication_slots view and checking the active column.)
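
A query along the following lines shows the same information, plus roughly how much WAL each slot is holding back (pg_current_wal_lsn and pg_wal_lsn_diff assume Postgres 10 or later):

SELECT slot_name, plugin, active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_wal_bytes
FROM pg_replication_slots;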

Another way to watch for the impact of an inactive slot is to have storage alerts configured. If storage is growing and you don’t know why, it may be due to WAL file retention by an unconsumed slot.

Finally, the consumer you choose may come with built-in monitoring, so look for that feature.

You are better off deleting an unused slot than keeping it around.

With logical decoding, Postgres gives you the WAL data. Now what?

Remember that a consumer is any application that can connect to Postgres and ingest the logical decoding stream. We used pg_recvlogical as the consumer in our example earlier.

You can create your own consumer, an app that parses and redirects the Postgres logical decoding stream to other components in your system. For example, in this PGConf.EU presentation, Webedia uses a custom service called walparser to convert wal2json’s output into MQ messages, then sends the messages to RabbitMQ and Elasticsearch. Another example is Netflix’s DBLog.

Or, instead of making your own consumer, you can let someone else do the heavy lifting. There are change data capture connectors available that support Postgres logical decoding as a source and provide connections to various targets.

If you are looking for an open-source offering, Debezium is a popular change data capture solution built on Apache Kafka. Learn more about Debezium from their FAQ or this deep dive into change data capture patterns.

You could also explore paid services like Striim and Qlik Replicate. One advantage of all three consumers, compared to creating your own solution, is they support a variety of other sources like MySQL, Oracle, and SQL Server. If you have data in other database engines, you can use their connectors to integrate data instead of building a custom connector for each one.

Dynamic real-time responses are the hallmark of modern applications

Logical decoding in PostgreSQL provides an efficient way for your other app components to stay up-to-date with data changes in your Postgres database. Write once to the reliable Postgres log, then derive those change events for downstream targets like caches and search indexes. Instead of a pull model where each component queries Postgres at some interval, this is a push model where Postgres notifies you and your application of each change, as it happens. With logical decoding your Postgres database becomes a centerpiece of your dynamic real-time application.

FAQs

How do you do logical replication in PostgreSQL? ›

To perform logical replication in PostgreSQL, you have to set the wal_level = logical parameter. The parameter tells the server to store additional information in the WAL for converting binary changes into logical ones. To change the value of this parameter, you have to edit the postgresql.conf file.

What is logical decoding in Postgres? ›

Logical decoding in PostgreSQL allows you to stream data changes to external consumers. Logical decoding is popularly used for event streaming and change data capture scenarios. Logical decoding uses an output plugin to convert Postgres's write ahead log (WAL) into a readable format.

Does Postgres support change data capture? ›

Change Data Capture (CDC) allows you to track and propagate changes in a Postgres database to downstream consumers based on its Write-Ahead Log (WAL). In this guide, we'll cover how to use Materialize to create and efficiently maintain real-time materialized views on top of CDC data.

How do you implement change data capture in PostgreSQL? ›

In MySQL you do this by turning on the binlog and in Postgres, you configure the Write Ahead Log (WAL) for replication. Once the database is configured to write these logs you can choose a CDC system to help capture the changes. Two popular options are Debezium and Amazon Database Migration Service (DMS).

What is the difference between physical and logical replication? ›

Logical replication is a method of replicating data objects and their changes based upon their replication identity (usually a primary key). We use the term logical in contrast to physical replication, which uses exact block addresses and byte-by-byte replication.

How do you set up logical replication? ›

To set up logical replication with pglogical, logical decoding must be enabled on the primary instance. Set cloudsql.logical_decoding=on on the Cloud SQL instance, or wal_level=logical on an external instance. Additionally, pglogical must be enabled on both the primary and replica instance; set cloudsql.

What does $1 mean in postgres? ›

Arguments to the SQL function are referenced in the function body using the syntax $n: $1 refers to the first argument, $2 to the second, and so on. If an argument is of a composite type, then the dot notation, e.g., $1.name, can be used to access attributes of the argument.
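
As a small illustrative example (the function name is made up), here is a SQL function that adds its two positional arguments:

CREATE FUNCTION add_qty(integer, integer) RETURNS integer AS 'SELECT $1 + $2;' LANGUAGE SQL;
SELECT add_qty(40, 2);  -- returns 42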

What is the difference between vacuum and analyze in PostgreSQL? ›

When a vacuum process runs, the space occupied by these dead tuples is marked reusable by other tuples. An “analyze” operation does what its name says – it analyzes the contents of a database's tables and collects statistics about the distribution of values in each column of every table.
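
To make that concrete, here is how you might run each against the inventory table from earlier (standard Postgres commands):

VACUUM inventory;            -- marks space from dead tuples as reusable
ANALYZE inventory;           -- refreshes column statistics for the planner
VACUUM (ANALYZE) inventory;  -- does both in one pass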

What is logical replication slot in PostgreSQL? ›

A replication slot has an identifier that is unique across all databases in a PostgreSQL cluster. Slots persist independently of the connection using them and are crash-safe. A logical slot will emit each change just once in normal operation.

Does Pg_restore delete existing data? ›

If you use the --clean option of pg_restore, the old tables will be dropped before the new ones are created. If you do not use the --clean option, you will get an error message that the table already exists, but pg_restore will continue processing unless you use the --exit-on-error option.

What encoding does Postgres use? ›

The character set support in PostgreSQL allows you to store text in a variety of character sets (also called encodings), including single-byte character sets such as the ISO 8859 series and multiple-byte character sets such as EUC (Extended Unix Code), UTF-8, and Mule internal code.

What is the best way to transfer the data in a PostgreSQL database? ›

Data export with pg_dump

The idea behind this dump method is to generate a file with SQL commands that, when fed back to the server, will recreate the database in the same state as it was at the time of the dump. PostgreSQL provides the utility program pg_dump for this purpose.

How is change data capture implemented? ›

Change Data Capture (CDC) can be implemented using the following 3 steps: Step 1: Extract the Data. Step 2: Transform the Data. Step 3: Load the Data.

What is the correct method of changing a stored procedure PostgreSQL? ›

To change a procedure's schema, you must also have CREATE privilege on the new schema. To alter the owner, you must also be a direct or indirect member of the new owning role, and that role must have CREATE privilege on the procedure's schema.

What is change data capture in Yugabyte? ›

Change data capture (CDC) in YugabyteDB provides technology to ensure that any changes in data (inserts, updates, and deletions) are identified, captured, and automatically applied to another data repository instance, or made available for consumption by applications and other tools.

How many types of replication are there in PostgreSQL? ›

In PostgreSQL, there are two types of replication features: streaming replication (physical replication) that collectively replicates a database cluster, and logical replication that replicates in units of tables and databases.

What is the difference between synchronization and replication? ›

Replication implies strongly that there are two or more copies of (all) the data. Synchronization implies that two or more copies of data are being kept up-to-date, but not necessarily that each copy contains all of the data (although this is typically the case for database syncing)

What is the difference between streaming replication and logical replication? ›

Streaming replication, available since PostgreSQL 9.0, physically replicates the WAL byte-for-byte to a standby. Logical replication works by copying/replicating data objects and their changes based on their replication identity, which in many cases is a primary key.

What are the different types of replication techniques? ›

Data Replication Methods
  • Log-Based Incremental Replication.
  • Key-Based Incremental Replication.
  • Full Table Replication.
  • Snapshot Replication.
  • Transactional Replication.
  • Merge Replication.
  • Bidirectional Replication.

Why do we add delimiter $$ here? ›

Delimiters other than the default ; are typically used when defining functions, stored procedures, and triggers wherein you must define multiple statements. You define a different delimiter like $$ which is used to define the end of the entire procedure, but inside it, individual statements are each terminated by ; .

How many inserts can Postgres handle per second? ›

When using Postgres, if you need write rates exceeding tens of thousands of INSERTs per second, turn to the Postgres COPY utility for bulk loading. COPY is capable of handling hundreds of thousands of writes per second. Even without a sustained high write throughput, COPY can be handy to quickly ingest a very large set of data.

What is $$ language Plpgsql? ›

PL/pgSQL (Procedural Language/PostgreSQL) is a procedural programming language supported by the PostgreSQL ORDBMS. It closely resembles Oracle's PL/SQL language. Implemented by Jan Wieck, PL/pgSQL first appeared with PostgreSQL 6.4, released on October 30, 1998.
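
A tiny illustrative PL/pgSQL function (the name is made up), showing the $$ dollar-quoting that typically wraps a function body in Postgres:

CREATE FUNCTION double_qty(n integer) RETURNS integer AS $$
BEGIN
  RETURN n * 2;
END;
$$ LANGUAGE plpgsql;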

Will vacuum lock the table? ›

Does a vacuum analyze lock tables? No, it's the VACUUM FULL command that locks tables.

Does vacuum full reindex? ›

A full vacuum doesn't perform a reindex for interleaved tables. To reindex interleaved tables followed by a full vacuum, use the VACUUM REINDEX option. By default, VACUUM FULL skips the sort phase for any table that is already at least 95 percent sorted.

How do I know if Postgres is running a vacuum? ›

If you see the size of any PostgreSQL tables increasing unexpectedly, VACUUM processes may not be executing properly on that table. To confirm if that's the case, we can query the database to determine the last time each of our tables was vacuumed.
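
One way to check, using the standard statistics view pg_stat_user_tables:

SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze FROM pg_stat_user_tables;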

What is difference between vacuum and vacuum full in PostgreSQL? ›

VACUUM FULL, unlike VACUUM, touches data that has not been deleted. On pre-9.0 versions of PostgreSQL, it moves data into spaces earlier in the file that have been freed.

What is difference between snapshot and transactional replication? ›

SQL Server Transactional Replication

Unlike the snapshot agent, which takes a snapshot of the whole article (table, view, etc.), transactional replication has a log reader agent: it reads the logs, looks for transactions marked for publication, and delivers those transactions to the distributor.

What is the difference between replication and sharding? ›

What is the difference between replication and sharding? Replication: The primary server node copies data onto secondary server nodes. This can help increase data availability and act as a backup, in case if the primary server fails. Sharding: Handles horizontal scaling across servers using a shard key.

What is pg_dump and Pg_restore? ›

The pg_dump and pg_restore command line tools are used to export and import PostgreSQL databases. They create PostgreSQL backups and migrate PostgreSQL databases between servers.

How long does a Pg_restore take? ›

During pg_restore, the database size is increasing at a rate of 50 MB/minute estimated using the SELECT pg_size_pretty(pg_database_size()) query. At this rate, it will take approximately 130 hours to complete the restore, which is a very long time.

Does pg_dump lock the database? ›

pg_dump is a utility for backing up a PostgreSQL database. It makes consistent backups even if the database is being used concurrently. pg_dump does not block other users accessing the database (readers or writers).

What is the difference between UTF-8 and Unicode? ›

The Difference Between Unicode and UTF-8

Unicode is a character set; UTF-8 is an encoding. Unicode is a list of characters with unique decimal numbers (code points).

What UTF-8 means? ›

UTF-8 (UCS Transformation Format 8) is the World Wide Web's most common character encoding. Each character is represented by one to four bytes. UTF-8 is backward-compatible with ASCII and can represent any standard Unicode character.

What model of communication is used in PostgreSQL? ›

PostgreSQL uses a message-based protocol for communication between frontends and backends (clients and servers). The protocol is supported over TCP/IP and also over Unix-domain sockets.

How do I transfer data between two databases? ›

Solution
  1. Fire up SSMS and sign into the source DB. ...
  2. Start the export. ...
  3. Select your tables. ...
  4. Configure the export. ...
  5. Set the data to be exported. ...
  6. Hit “OK” -> Next -> and Next. ...
  7. Copy the .sql file contents. ...
  8. Connect to the target database.
8 Jun 2021

Why is copy faster than insert? ›

The COPY FROM command operates much faster than a normal INSERT command because the data is read as a single transaction directly to the target table. On the other hand, it is a very strict format, and the entire COPY procedure will fail if just one line is malformed.
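
A minimal sketch of bulk loading the inventory table from a CSV file (the file path is hypothetical):

COPY inventory (item, qty) FROM '/tmp/inventory.csv' WITH (FORMAT csv, HEADER true);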

How do I transfer data from one database to another? ›

Migrate data from one database to another
  1. Create a source database Connection.
  2. Create a destination database Connection.
  3. Start creating a Flow by selecting Database to database in the Gallery.
  4. Add a new source to destination transformation.
  5. Define the transformation parameters when the source is a database.
21 Sept 2021

What are the three methods of data capture? ›

The best data capture methods are: Optical character recognition. Intelligent Character Recognition. Optical Mark Reading.

What are the basic steps in data capturing? ›

The Six Steps to Optimal Data Capture
  • Step One: File Identification. ...
  • Step Two: OCR and Rendering. ...
  • Step Three: Classification. ...
  • Step Four: Data Extraction. ...
  • Step Five: QA and Reconciliation. ...
  • Step Six: Upload and Output.
6 Sept 2018

Which are the data capturing techniques? ›

The most appropriate data capture method depends on the nature of data to be captured and the application area.
  • Manual Keying. ...
  • Nearshore keying. ...
  • OCR (Optical Character Recognition) ...
  • ICR (Intelligent Character Recognition) ...
  • Barcode/ QR recognition. ...
  • Template based intelligent capture. ...
  • IDR (Intelligent Document Recognition)

What is the correct method of changing a stored procedure? ›

Expand Stored Procedures, right-click the procedure to modify, and then select Modify. Modify the text of the stored procedure. To test the syntax, on the Query menu, select Parse. To save the modifications to the procedure definition, on the Query menu, select Execute.

Which command tells PostgreSQL to make the database changes permanent? ›

To complete the transaction and have PostgreSQL make your changes permanent, execute the COMMIT command.
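
For example, against the inventory table from earlier:

BEGIN;
UPDATE inventory SET qty = qty - 1 WHERE item = 'apples';
COMMIT;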

What command is used to change a particular data of an existing record? ›

The UPDATE statement is used to modify the existing records in a table.

What is CDC and SCD? ›

While some might observe that the difference between slowly changing dimensions (SCD) and Change Data Capture (CDC) might be subtle, there is in fact a technical difference between the two processes. Both processes detect changes in a source database and deliver the changed data to a target database.

How capture data change in SQL? ›

Before changes to any individual tables within a database can be tracked, change data capture must be explicitly enabled for the database. This is done by using the stored procedure sys.sp_cdc_enable_db. When the database is enabled, source tables can be identified as tracked tables by using the stored procedure sys.

What is CDC in data migration? ›

This process is called ongoing replication or change data capture (CDC). AWS DMS uses this process when replicating ongoing changes from a source data store. This process works by collecting changes to the database logs using the database engine's native API. You can migrate views using full-load tasks only.

What is vacuuming in PostgreSQL? ›

A vacuum is used for recovering space occupied by “dead tuples” in a table. A dead tuple is created when a record is either deleted or updated (a delete followed by an insert). PostgreSQL doesn't physically remove the old row from the table but puts a “marker” on it so that queries don't return that row.

What are the two types of replications? ›

At least two key types of replication exist: direct and conceptual. Conceptual replication generally refers to cases where researchers 'tweak' the methods of previous studies [43] and when successful, may be informative with regard to the boundaries and possible moderators of an effect.

What is difference between replication and caching? ›

Replicating data and distributing it as needed is called mirroring. Pure replication differs from caching in the sense that caching systems "pull data" from the origin server, while replication systems tend to "push" data to maintain mirror copies of the same data at various place on the network.

What are the three types of replication? ›

There were three models for how organisms might replicate their DNA: semi-conservative, conservative, and dispersive.

How streaming replication works in PostgreSQL? ›

PostgreSQL streaming replication is based on transferring the WAL files from the primary to the target database. PostgreSQL streaming replication is implemented using a master-slave configuration. The master is known as the primary instance and handles the primary database and its operations.

Which is good replication or sharding? ›

Replication may help with horizontal scaling of reads if you are OK to read data that potentially isn't the latest. Sharding allows for horizontal scaling of data writes by partitioning data across multiple servers using a shard key. It's important to choose a good shard key.

Does sharding improve query performance? ›

Sharding was one of the first ways databases were distributed to improve performance. Recent innovations have made it one of the best. Databases are now given an enviable amount of attention since they manage a company's most important property: data.

How many types of sharding are there? ›

While there are many different sharding methods, we will consider four main kinds: ranged/dynamic sharding, algorithmic/hashed sharding, entity/relationship-based sharding, and geography-based sharding.
