Logical Replication, Database Antipatterns, max_wal_size Setting, Delete vs. Truncate

In this episode of Scaling Postgres, we discuss use cases for logical replication, database anti-patterns, how to set max_wal_size and the difference between delete and truncate.

Content Discussed

YouTube Video

Podcast Audio

Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Well, we are very light on content this week. I assume it's because spring break is happening because spring has started. But nevertheless, we have about half the amount of posts we normally do. Now I should also add that I'm going to change the day when I collect all of the content and I'm going to do it a bit earlier which means for the show next week, it may also be very light on content.

But then I'll be back to normal in terms of seeing a full week of content. But for this week our first piece of content is "'Real-World Logical Replication' at Nordic PGDay 2023". This is from thebuild.com. Now this is a very short post because it essentially just has the link to the slides. So this is a PDF and it talks all about logical replication. It goes over some of the history, how it evolved, what it is, and how it works.

And a lot of the discussion talks about logical decoding which actually came first and how it essentially reads from the WAL stream to replicate the commands logically that are sent to the database. It also follows up by talking about native logical replication which hit in Postgres 10. And he talks about the opportunity to do database upgrades using this technique because now you can do an upgrade by changing the platform you're running on by not worrying about coalitions and things of that nature.

He also talks about different ways you could use a logical replication because maybe you want to create one central data warehouse where a lot of other application databases feed it. You could use a logical replication for that purpose. Or maybe even only as a read replica, not something that you can promote to a primary because that's not its purpose. You should use physical streaming replication for that. But it's great for reporting because you're not going to have query cancellations like you would with physical streaming replication. So a great presentation, I encourage you to check it out.

Next piece of content also from thebuild.com is "'Database Antipatters, and where to find them' at SCaLE 20x". And this is another presentation where he talks about database anti-patterns. He covers things like using the entity attribute value pattern. He talks about having long transactions, adding too many indexes to your database, as well as some dangers of partitioning. So in total, he talks about 10 different anti-patterns. But again a good presentation, I encourage you to check it out.

Next piece of content also from thebuild.com is "The importance of max_wal_size". So this is a configuration option that determines how many WAL segments should be kept before the database has to do a checkpoint. So the question is how large do you make it? Now the problem that he mentioned here, if your max_wal_size is too small, you're going to get frequent checkpoints which can cause a lot of performance problems.

And he mentions here, number one, quote "Checkpoints themselves are expensive, since all of the dirty buffers and shared_buffers need to be written out". And two "The first time a page has changed after a checkpoint, the entire page is written to the WAL rather than just the change. On a busy system this can be a very significant burst of WAL activity". And I've experienced this first hand myself with clients for whom most of the disk access on their system was these full-page image writes.

So it is very important to configure your checkpointing appropriately. Now he has some suggestions for general checkpoint configuration. For example, he suggests a checkpoint timeout at 15 minutes, a checkpoint completion target of 0.9, turning on WAL compression, logging the checkpoints, and a max_wal_size of 16 gigabytes. And he suggests running the system and seeing if checkpoints are happening more frequently than 15 minutes, you're going to want to up your max_wal_size.

Now the only disadvantage is the disk space required for increasing max_wal_size. But I should add there's a configuration option he didn't mention here and that's the checkpoint warning setting. And I actually like to set that about five minutes earlier than the checkpoint timeout. So what happens is if a checkpoint happens before that checkpoint timeout, like if you have the timeout set at 15 minutes, you have the warning set at 10 minutes. If a checkpoint occurs before 10 minutes, it'll actually write a log entry to the Postgres log.

And it says checkpoints are occurring too frequently and it tells you how many seconds from the last checkpoint it started. So that helps you dial in your max_wal_size a little bit better. And I use this setting because on a periodic basis, scan the logs of different database systems I manage. And I like being able to see that to make sure that things aren't checkpointing too early. So that's an additional configuration I like to use to properly set max_wal_size.

And then he follows up with a min_wal_size and what that should be set at. And basically, it helps keep WAL files around so it doesn't have to recreate them as the WAL is being managed by Postgres. So a lot of times I set it to a 10th or a fifth of the size of the max_wal_size. And I agree with him here in His last quote "...the performance impact is small” for this particular setting. But if you want to learn more, definitely check out this blog post.

Next piece of content- "POSTGRESQL: DELETE VS. TRUNCATE". This is from cybertec-postgresql.com. And they're talking about Delete commands versus Truncate commands. And delete is a row-level operation, whereas you can think of Truncate as a table-level operation. So when you do a Delete, it individually deletes each row. And that takes a while to do. He did some timing here and he saw that deleting a million rows took over 700 milliseconds. Whereas when he did a Truncate it took a little over two milliseconds, so much, much faster.

Now be warned, Truncate deletes the entire table, it gets rid of all the data. So if you're only wanting to delete part of the data, then you're going to want to use the delete command. And then lastly, the post covers how Truncate actually works. It actually creates a new file and points the OID to it, as opposed to manipulating the data within the file. That's what makes it so fast to work. And you can even do a rollback of it, they say. But if you want to learn more, you can check out this blog post.

Next piece of content- "Postgres connection pooling: Comparing PgCat and pgbouncer". This is from pganalyze.com. And this is their next episode of "Five minutes of Postgres". And they're covering the post that we discussed last week on adopting PgCat, the NexGen Postgres connection Pooler. So again, this is something I'm super excited about. If you want to learn my perspective on this post, definitely check out last week's Scaling Postgres episode. And you can also watch Lucas's viewpoint on it as well with this piece of content.

Next piece of content is actually a YouTube video and it's "All Postgres Locks Explained- A Deep Dive". And this is from Hussein Nasser's YouTube channel. And he goes into, I agree, a deep dive on Postgres locking that's about 48 minutes in length. So if that's of interest to you, you can definitely check out this blog post.

Next piece of content- "Postgres 16 highlight- More patterns for pg_ident.conf". This is from paquier.xyz. And much like the patch to adjust pg_hba.conf to allow patterns or regexes within the file, you can now do the same thing with the pg_ident file. Now personally, I haven't used pg_ident.conf much because I haven't really used external authentication with Postgres. But if you do, you might find it interesting to check out the different options that you can use with that file coming in Postgres 16.

Next piece of content, there was another episode of Postgres FM this week. This one was on "pgsql versus GUIs", or I assume it's terminal tools versus GUIs. They discuss some different options and some tips and tricks for learning and which one they use and prefer. So if you're interested in that, you can listen to the episode or watch it on YouTube.

And the last piece of content, the PostgreSQL person of the week is Chris Travers. If you're interested in learning more about Chris and his contributions to Postgres, definitely check out this blog post.

Logical Replication, Database Antipatterns, max_wal_size Setting, Delete vs. Truncate | Scaling Postgres 258

Join Over 1,000 Engineers & Get New Episodes Weekly!

Content Discussed

YouTube Video

Podcast Audio

Transcript