
New Postgres Releases, A Hairy Incident, Slow Down To Go Faster, Loadable Module Archiving | Scaling Postgres 202


In this episode of Scaling Postgres, we discuss the new releases of Postgres, a hairy upgrade incident, why slowing down can make you go faster and using loadable modules for WAL archiving.

Content Discussed

YouTube Video

Podcast Audio

Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "PostgreSQL 14.2, 13.6, 12.10, 11.15, and 10.20 Released!". This is from postgresql.org. These releases are mostly bug fixes and improvements; there are no security issues mentioned, so that's good. However, what I'm noticing, and what seems to be a trend, is that index-related items keep appearing on the list of bug fixes. The top two items this time: one is a low-probability scenario of index corruption, and the other is that when reindexing concurrently, there's a possibility for TOAST tables to be corrupted. So basically, the recommendation is to go in and reindex your database's indexes again. But if you want to learn more about the bug fixes and improvements, you can go ahead and check out this post.
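
If you do want to rebuild an index without blocking writes, it looks roughly like this; the index and database names below are placeholders, and which indexes actually need rebuilding depends on the details in the release announcement.

    -- Rebuild one potentially affected index without long exclusive locks
    REINDEX INDEX CONCURRENTLY my_schema.my_index;

    -- Or rebuild every index in the current database
    -- (this can also be driven from the shell with: reindexdb --concurrently --all)
    REINDEX DATABASE CONCURRENTLY my_database;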

The next piece of content- "A Hairy PostgreSQL Incident". This is from ardentperf.com, and this is basically a story walking through an episode of a database issue that this author encountered. So it was at the end of the day and another part of his organization had done a PostgreSQL upgrade to version 11 and suddenly they started having performance problems. He mentions in the post that sometimes you get some regressions that happen after you do an upgrade, but this was causing significant problems and they had to try and figure out what was going on. This blog post goes into detail about, okay, what they suspect was going on here, what the queries look like, and then walking through the process of actually discovering the issue and then how to rectify it.

What was convenient is that they had another database system on the previous version that was operating fine, so they were able to look at what a good query plan was and then ultimately what the problem was. You can tell they were looking at flame graphs and at what processes were running, and ultimately they discovered that a BitmapAnd the planner was using was causing basically 100% CPU usage. Now, from what they could tell, it wasn't a statistics issue. The statistics were relatively up to date and things had recently been vacuumed and analyzed, so they basically couldn't think of any other way to resolve it than to rewrite the query.

What they used is a CTE that basically materialized a subquery and then used it in the main query, and that gave them the result they were looking for with good performance. So this blog post is a great insight into a series of problem-solving steps to find the issue and come up with a solution. Now, it may not be super elegant, and he calls it a total hack, a beautiful hack. But the reality is, if the planner is not doing what you think it should do, the statistics are up to date, and adding extended statistics doesn't help, this is probably the thing you have to do. But if you want to learn more about that, definitely check out this blog post.
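
As a generic sketch of that technique (not the author's actual query; the table and column names here are invented), a materialized CTE acts as an optimization fence, so the subquery is computed once and the planner can't merge it back into the outer query. On Postgres 12 and later you spell this out with AS MATERIALIZED; on 11 and earlier, CTEs are always materialized, so a plain WITH behaves the same way.

    -- Hypothetical rewrite: force the subquery to be planned on its own
    WITH matching_ids AS MATERIALIZED (
        SELECT id
        FROM events
        WHERE account_id = 42
          AND created_at >= now() - interval '1 day'
    )
    SELECT e.*
    FROM events e
    JOIN matching_ids m ON m.id = e.id;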

The next post- "Slow things down to make them go faster [FOSDEM 2022]". This is from vyruss.org and this is actually a YouTube video, a presentation that was done by Jimmy Angelakos and this is an updated presentation that I think was given at another conference probably in December or November of 2022. But this is such a good presentation I definitely wanted to post it and make note of it again because I believe it has some updates to it. What he means by slowing things down to make them go faster is that counterintuitively, sometimes you need to put some constraints or barriers in that will actually make you process more transactions or increase the throughput. For example, maybe you don't want to be explicitly locking things but instead use the serializable isolation level.

Now, a perfectly designed locking solution may beat that, but a lot of times you're not going to come up with a perfect locking solution, and serializable isolation may be faster. And failures will happen: if there's a serialization error, you just redo the work again. So that is slowing down to ultimately give you greater transaction throughput. The next area he talked about is connection pooling. When you have a lot of connections to your database, counterintuitively you may need to put PgBouncer in front of it. That slows down the number of connections that hit the database at one time, because PgBouncer queues them up. But because of the shared resources of the database, this actually lets the database perform work faster. So you slow things down to make them go faster. This presentation is a great education on a lot of different concurrency-related matters with regard to Postgres, and I highly suggest you check it out.
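
A minimal sketch of the serializable pattern (the table and values are made up): run the transaction at the serializable level and have the application retry whenever Postgres raises a serialization failure, SQLSTATE 40001.

    BEGIN ISOLATION LEVEL SERIALIZABLE;

    -- hypothetical money transfer between two rows
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;

    COMMIT;
    -- If any statement (including the COMMIT) fails with
    -- "could not serialize access..." (SQLSTATE 40001),
    -- roll back and rerun the whole transaction from the top.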

The next piece of content- "Waiting for PostgreSQL 15- Allow archiving via loadable modules". This is from depesz.com and I have to agree, what he's saying here is that this is huge and a huge improvement. So historically if you wanted to archive the WAL files you would use an archive command and it would run that command for every file that needed to be archived. Well, you can run into big problems when you're generating a ton of WAL, like thousands or tens of thousands of WAL files in a short span of time because you have to fork and get that process going for each file. Whereas if this is a loadable module, presumably that's going to be able to operate much more efficiently. They do have an example module that's coming in Postgres 15 called Basic Archive that you can use. But he predicts, and I agree with him, that Barman and pgBackRest and these other third-party backup and recovery tools will be implementing this new solution to give them more efficiency. But definitely, a great update is coming to Postgres 15.

Next piece of content- "Server-Side LZ4 Backup Compression". This is from rhaas.blogspot.com, and he's talking about an enhancement coming to Postgres 15 that adds LZ4 server-side backup compression to pg_basebackup. He has an example where, without it, a backup took 16 minutes, whereas with it, it took 1 minute and 50 seconds, which is quite significant. The size went from 1.5 GB down to 169 MB, so quite a high compression ratio. Now, he did say there is a lot of repetition in the data because this is a synthetic database he backed up from running pgbench, so it compresses unusually well. He did another test case using the plain format as opposed to the tar output, and the result went from 15 minutes down to three minutes.

So not as good, but still pretty significant. This is definitely a great enhancement. But the question I'm wondering about: I've actually never used the built-in gzip compression for large clients; we've been using pigz, the parallel gzip tool, to use multiple processes on the server to do the compression. So we've always piped our backups to pigz for that parallel gzip functionality. Now, I know LZ4 is fast, but can it be as fast as gzip running across 16, 32, or 64 processors? I don't know, but it's definitely an interesting enhancement coming to Postgres.
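
For reference, requesting server-side LZ4 compression from pg_basebackup in Postgres 15 looks something like the following; the target directory is a placeholder.

    # Tar-format base backup, compressed with LZ4 on the server before transfer
    pg_basebackup -D /backups/base -Ft --compress=server-lz4

    # Plain format (-Fp) can also request server-side compression; the data
    # travels compressed and the client extracts it on arrival.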

The next piece of content- "Postgres Constraints for Newbies". This is from blog.crunchydata.com, and they're talking about how you can apply constraints to your database. In their example here, they had a users table, a rooms table, and then a linked reservations table that links users to rooms and then the actual times. Now, in terms of the different constraints, they mentioned you can set up foreign key constraints so that the data between the reservations table and the users table and the rooms table can be consistent. They talked about unique constraints, so you can only have a unique room number, for example, in rooms. They talked about the importance of setting cascading for certain operations.

For instance, if you're going to delete a user, you want that delete to actually cascade and delete any of their reservations. They talk about NOT NULL constraints and how you would want to set those on appropriate columns, as well as check constraints so that you can put bounds on the values in certain fields. And lastly, they cover exclusion constraints, which are great for excluding overlapping time ranges when you're doing room reservations, or even bounding boxes when you're talking about geographic information systems. So they cover that topic here as well. But if you want to learn more about constraints in Postgres, definitely check out this blog post.
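
Here's a rough sketch of how those constraints could fit together for the rooms/users/reservations example; the names and types are invented, it's not the schema from the post, and the exclusion constraint needs the btree_gist extension:

    CREATE EXTENSION IF NOT EXISTS btree_gist;

    CREATE TABLE rooms (
        room_number int PRIMARY KEY                -- unique and NOT NULL by definition
    );

    CREATE TABLE users (
        user_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        email   text NOT NULL UNIQUE               -- NOT NULL plus a unique constraint
    );

    CREATE TABLE reservations (
        reservation_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        user_id     bigint NOT NULL REFERENCES users ON DELETE CASCADE,   -- FK with cascade
        room_number int    NOT NULL REFERENCES rooms,                     -- FK to rooms
        during      tstzrange NOT NULL CHECK (NOT isempty(during)),       -- check constraint
        EXCLUDE USING gist (room_number WITH =, during WITH &&)           -- no overlapping bookings per room
    );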

The next piece of content- "5 mins of Postgres E4: Writing your own custom Postgres aggregates with Rust or raw SQL". So this is from pganalyze.com and he's talking about the ability of Postgres to create your own aggregate. So aggregates are like sum or they are the average of a value or even things like being able to aggregate JSON together. Well, you can create a function and then create your own aggregate from that function and do it in PL/PGSQL. But he also mentioned that there's this extension called pgx that lets you easily create extensions in Postgres and you can actually write them in Rust. So you write them in Rust, create your extension from it, and then you can install that extension in Postgres to get all of those aggregation features. So he goes over different blog posts that cover how to do that. So if you want to learn more about that process, definitely check out this episode.

The next piece of content is actually a YouTube channel. It's the United States PostgreSQL Association. About a week ago, they posted all of the talks, it looks like from PgConf New York City 2021. So if you're interested in more video content, you can definitely check out this YouTube channel.

The next piece of content- "Waiting for PostgreSQL 15- Add UNIQUE null treatment option". This is from depesz.com they're talking about how normally when you assign a UNIQUE constraint to a field, you can insert as many NULLs as you want. So in this example here, they created a table and made the code name unique. Well, you can actually insert as many NULL code names as you want by default. But there's a new enhancement coming to 15 where you can actually specify that the NULLS NOT DISTINCT. So what that means is you can only ever insert one NULL value. If you try to insert more than one, it'll give you an error. It says "duplicate key value violates unique constraint". So definitely a customization that you could potentially add to your database if you'd like.

The next piece of content- "Logical replication of all tables in schema in PostgreSQL 15". This is from postgresql.fastware.com and they're talking about an enhancement in Postgres 15 to logical replication, where you can now define all the tables in a schema to be able to be published for logical replication purposes before you could do individual tables or even a whole database. Well, now you can do it at the schema level, and this blog post walks through that enhancement. So if you want to learn more, you can check out this blog post.

The next piece of content: three blog posts have been published about Pgpool-II, "What's new in Pgpool-II 4.3?", in three parts. So if you're interested in learning more about Pgpool-II and the enhancements coming to it, definitely check out these blog posts.

The next piece of content, the PostgreSQL person of the week is Bernd Helmle. If you are interested in learning more about Bernd and his contributions to Postgres, definitely check out this blog post.

The last piece of content. We did have another episode of The Rubber Duck Dev Show this past Wednesday. In this episode, we discussed how we got started coding. So if you want to learn a little bit more about our history, in terms of becoming developers, feel free to check out our show.
