
20,000 Times Faster Order By Limit | Scaling Postgres 366


In this episode of Scaling Postgres, we discuss a top ten dangerous Postgres issues list, importing and exporting statistics, more Postgres 18 beta features and a 20,000 times faster query by fixing an order by limit issue.


Transcript

Imagine you're at work, minding your own business, and then suddenly something catastrophic happens to your database. Well, we're going to talk about 10 issues that could potentially crop up if you're using Postgres. And be sure to stay till the end of the show, where in my Consulting Corner segment I talk about how fixing a problematic order by limit query resulted in a 20,000-fold performance improvement. But I hope you, your friends, family and co-workers continue to do well.

Our first piece of content is actually the most recent episode from Postgres FM, "Top 10 dangerous issues", in which Nikolai and Michael discuss ten dangerous Postgres issues.

The first issue they discuss is heavy lock contention, and what they're describing is doing schema changes. Maybe you need to add a constraint, add a column, or alter a column. These operations can require heavy locks and may lock up your system, so the most important thing to do when making schema changes is to use a lock_timeout that will cancel the schema change if it starts locking up the system; it basically helps protect the system. They also mention queue-like workloads, where you can again run into problems with locks, but the best way to avoid that is to use FOR UPDATE SKIP LOCKED if you're building any kind of queue in Postgres.

The second issue they covered is bloat control and index maintenance. I wouldn't call this too dangerous an issue, but it's definitely commonplace. Watch how frequently you're doing updates, try to keep that under control, and maybe even use an extension like pgstattuple to give you accurate insight into your level of bloat. For index maintenance, you can check for bloated indexes as well and simply reindex those concurrently.

The third is lightweight lock contention, and this situation presents itself once you have a lot of active queries or statements running on the system. This could be contention trying to read or write from the shared buffers area, or, when you're really pushing the load, too many queries may introduce lock manager issues. Some recent versions of Postgres have made enhancements to help with this, but it could still hit you. I've also seen this issue with poorly designed schemas that require a lot of joins and use a lot of indexes, and you can hit it when you have a lot of partitions as well.

Number four is transaction ID wraparound, which is a danger that always exists, so you definitely want to be monitoring it and make sure you don't hit the 2 billion limit. And if you are generating that many transactions, see whether you can batch things together into one commit to minimize the number of transactions you're using.

Number five is four-byte integer primary keys hitting their limit. If you are using integer primary keys, you should probably convert them to bigint now so you don't hit this problem in the future; or, if you do have them, definitely start monitoring them and address this well before you reach the limit.

Number six is replication limits. This is basically due to the single-threaded nature of, I believe, the WAL sender and receiver: you can hit limits because each is essentially one process, and if you add compression on top of that, it can slow things down even more, or at least cause that process to max out a single core.

Number seven is hard limits. This could be the database size you're allowed on a hosting service, the size of a single table in Postgres, or the number of columns allowed in a particular table. As we covered in a blog post last week, when you repeatedly delete a column and add a column, you're actually using up a finite number of column slots. I think for most people these limits are pretty far away, but as your database gets larger and more sophisticated, you do want to be aware of them.

Number eight is data loss. This could be someone accidentally deleting rows, truncating a table, or dropping a table, or even some sort of hardware issue where you actually lose data. If you have replicas and you had to do a failover, you might lose some data in that process, particularly if your replica is asynchronous. So that's always a concern to be aware of.

Number nine is a poor high availability choice. Maybe your solution can introduce split-brain issues, where you have two primaries that each think they're the only primary because one part of your application is talking to one and another part to the other, and now you essentially have two sources of truth.

Number ten is corruption of various kinds. Thankfully this has been rare enough that I actually haven't seen this issue; the database hosting providers' hardware seems to have been relatively resilient to it. But doing things like turning on checksums for your database, as well as using utilities like amcheck, can help you become aware of these issues. If you want to learn more, I definitely encourage you to listen to the episode or watch the YouTube video down here. A couple of the mitigations they mention are sketched below.
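To make a few of those mitigations concrete, here is a minimal SQL sketch of the lock_timeout, SKIP LOCKED, and wraparound-monitoring patterns mentioned above. The table and column names (orders, jobs, status, created_at) are hypothetical, not from the episode.

```sql
-- Heavy lock contention: cap how long a schema change may wait for its lock,
-- so it errors out quickly instead of queueing all other traffic behind it.
SET lock_timeout = '2s';
ALTER TABLE orders ADD COLUMN note text;

-- Queue-like workloads: claim one pending job without blocking other workers
-- that are competing for rows in the same table.
WITH next_job AS (
    SELECT id
    FROM jobs
    WHERE status = 'pending'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
UPDATE jobs
SET status = 'running'
FROM next_job
WHERE jobs.id = next_job.id
RETURNING jobs.id;

-- Transaction ID wraparound: a quick check of how far each database is
-- from the limit (age is measured in transactions).
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```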
Next piece of content: "PostgreSQL 18 Beta Preview: Export or Amend Statistics With Ease". This is from databaserookies.wordpress.com, and they're talking about a change to pg_dump, pg_dumpall, and pg_restore where you can export or restore statistics only. The reason this can be great is that if you have a production system and you try replicating queries in a lower environment that doesn't have as much data, you may get different query plans compared to production. But if you can export production's statistics and load them into that lower or smaller environment, like staging, you should be able to run the same queries and get the same plans as you would in production. So this is a great enhancement, particularly for performance-related work. If you want to learn more, you can check out this blog post.

Next piece of content: "Good time to test io_method for Postgres 18". This is from vondra.me, and if you missed last week's episode, Postgres 18 Beta 1 was released. One of its main features is asynchronous I/O, which can be configured three different ways using the io_method setting. One option is sync, which does synchronous I/O rather than asynchronous I/O; it's basically equivalent to what you get in Postgres 17. Another setting is worker, which is currently the default asynchronous method, and you get three async I/O workers by default with this implementation. The third option is io_uring, which uses the liburing library to pass I/O requests directly to the kernel's io_uring interface. The reason he's writing this blog post is that the community needs to pick what the default will be for Postgres 18, so he encourages everyone to download Beta 1, try it out with your workloads, and try the different io_method settings. You should be aware that io_uring only works on more recent versions of Linux; it doesn't work on other platforms or on earlier Linux versions. He's advocating that everyone try it out and report to the pgsql-hackers list what performance you're seeing with the different settings, because that will help guide the decision on the default value for Postgres 18. If you want to learn more, check out this blog post.
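As a rough sketch of what trying the different settings looks like, assuming a Postgres 18 beta instance you are able to restart (io_method can only change at server start):

```sql
-- See which asynchronous I/O method is currently in use.
SHOW io_method;                      -- 'worker' is the default in Beta 1

-- Switch to one of the other methods: 'sync', 'worker', or 'io_uring'
-- (io_uring needs a reasonably recent Linux kernel and a build with liburing).
ALTER SYSTEM SET io_method = 'io_uring';

-- With the worker method, the size of the I/O worker pool is a separate knob
-- (three workers by default).
ALTER SYSTEM SET io_workers = 3;

-- Restart the server for io_method to take effect, then benchmark your workload.
```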

Next piece of content: "Postgres 18 beta is out: 7 features you should know about". This is from neon.tech, and again, if you missed last week's episode, you can watch that or review this blog post, which covers the changes. They mention the asynchronous I/O I just covered, as well as the improvements for major version upgrades, such as transferring statistics with the feature mentioned above and using the jobs and swap options for pg_upgrade. They mention more detailed EXPLAIN output, new stats for vacuum and analyze, and finally getting a uuidv7() function in Postgres, which is time-ordered at the front and random at the back; that would be great to use for primary keys as opposed to a fully random UUID. They also cover various performance improvements, including skip scan, and they mention OL support as well.
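As a quick illustration of the uuidv7() point, here is a minimal sketch of using it as a primary key default; the table and columns are hypothetical:

```sql
-- uuidv7() puts a timestamp in the high-order bits and random bits at the end,
-- so newly generated keys are roughly insert-ordered.
CREATE TABLE events (
    id         uuid PRIMARY KEY DEFAULT uuidv7(),
    payload    text NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

INSERT INTO events (payload) VALUES ('hello') RETURNING id;

-- Compare with gen_random_uuid(), which is fully random and scatters
-- new rows across the whole primary key index.
```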

Next piece of content: "Understanding PostgreSQL write-ahead logging (WAL)". This is from postgresql.fastware.com, and this blog post has a presentation embedded in it all about understanding write-ahead logging in Postgres, which I thought was pretty interesting. So if you want a deep dive into the WAL, whose purpose is basically crash recovery for Postgres as well as supporting streaming replication and logical replication, you can definitely check out this presentation.

Next piece of content: "Understanding logical replication in Postgres". So after you've looked at the WAL presentation, you can look at this blog post to understand more about logical replication in Postgres. This is from springtail.io, and they describe logical replication in detail, comparing it to physical replication, which is essentially a binary copy of the cluster, whereas logical replication is a logical copy of it. They talk about what you need to get started: basically you set a few PostgreSQL configuration values, set up a publication on the publisher and a subscription on the subscriber, and establish the replication slot and the output plugin you're going to use. They also get into the details of how logical replication works, including the replication protocol, and they talk a little bit about their service, where they basically use logical replication to set up replicas for you. If you want to learn more, you can check out this blog post.
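For reference, the basic setup they describe looks roughly like this; the publication, subscription, table, and connection details are placeholders:

```sql
-- On the publisher: logical decoding requires wal_level = 'logical'
-- (a restart is needed after changing it).
ALTER SYSTEM SET wal_level = 'logical';

-- After the restart, publish the tables you want to replicate.
CREATE PUBLICATION app_pub FOR TABLE users, orders;

-- On the subscriber: the tables must already exist with a matching schema.
CREATE SUBSCRIPTION app_sub
    CONNECTION 'host=publisher.example.com dbname=app user=replicator'
    PUBLICATION app_pub;

-- By default, CREATE SUBSCRIPTION creates the replication slot on the
-- publisher and uses the built-in pgoutput output plugin.
```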

Next piece of content: "Migrating to Postgres". This is from engineering.usemotion.com. They were on CockroachDB, but due to some issues they decided to migrate to Postgres. A lot of this post talks about the issues they were dealing with in CockroachDB and how things improved in Postgres, as opposed to giving details about how they actually did the transition from CockroachDB to Postgres. But if you're interested in that, you can check out this blog post.

Next piece of content: "Neon and Databricks". This is from neon.tech, and the reason for the title is that Neon has apparently been acquired by Databricks. They go into the details of this partnership and acquisition, so check this out if you want to learn more.

And the last piece of content is "How I got started with FerretDB and why we chose Postgres with Peter Farkas". Now, I was a little confused by this because it says it shares the origin story of this open source MongoDB alternative. So FerretDB is apparently a MongoDB alternative, and I was thinking, why is this on a Postgres podcast? But what they did is take Postgres and develop a wire-protocol compatibility layer that works with MongoDB. So basically, if your application uses MongoDB, you can now point it at FerretDB and it will speak MongoDB but use Postgres on the backend. You can listen to this podcast if you want to learn more.

And now it's time for my Consulting Corner segment. I was looking at some slow queries for a customer's database, and this particular query was running really slowly, approximately two seconds. So I looked at the explain plan, and it was choosing a horrendous plan. Basically it was joining two tables, ordering by one of the columns in one of the tables, and doing a LIMIT 1. But the most selective part of the query was a where clause on the first table. Postgres was entirely ignoring that more selective where clause and was instead using an index on the order by column, then joining to the first table to try to find a matching id, thinking, oh, I just need to get one row. But it was taking forever and accessing thousands of buffers. So this looks like the order by limit issue that people sometimes run into in Postgres. As soon as I took off the limit, the query ran in 0.1 milliseconds, basically 20,000 times faster, because it then chose to use the most selective column.

Now, I could also get this result by increasing the limit to 10 or 100 rows, and that would work, but you never really know what the appropriate limit is to make it work, so I didn't want to do that. It's possible that the statistics were off, but given this particular system and their rigid policy on when updates can be done, it was better to choose a query-level solution than to try updating the statistics to see if that would resolve the issue.

So basically I came up with three potential ways to address this. One is to just leave the limit off; the application code then gets back an array of records and can simply take the top result. That's not the most elegant, but it works and is fast, because normally the query only returns one or two rows anyway. The second option is to modify the order by column and make it an expression. If it's a number, maybe you add zero to it; if it's a string, you concatenate an empty string to it. The value is the same, but Postgres now treats it as an expression and will not use the index on that column in its planning, so it will choose the selective index and won't try to drive the query from the order by. The third option is a materialized CTE: put the whole query except for the limit in the CTE portion and then just select from it with a limit of one to get the answer. I will say this is the most severe performance problem I've seen with the order by limit issue, where fixing it resulted in a 20,000-fold improvement. But those are some things you can do if you run into this issue; a sketch of the last two options is below.
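Here is a rough sketch of the second and third workarounds, using hypothetical table and column names (the real query and schema belong to the client and aren't shown here):

```sql
-- Shape of the problem query (slow plan): Postgres walks an index on
-- events(position) to satisfy ORDER BY ... LIMIT 1 and joins back to orders,
-- ignoring the far more selective WHERE clause.
SELECT e.*
FROM orders o
JOIN events e ON e.order_id = o.id
WHERE o.external_ref = 'abc-123'
ORDER BY e.position
LIMIT 1;

-- Option 2: turn the ORDER BY column into an expression (add zero to a number,
-- or concatenate '' to a string) so the planner can no longer use the index on
-- that column for the sort and falls back to the selective index.
SELECT e.*
FROM orders o
JOIN events e ON e.order_id = o.id
WHERE o.external_ref = 'abc-123'
ORDER BY e.position + 0
LIMIT 1;

-- Option 3: materialize the query without the limit in a CTE, then take the
-- top row from the materialized result.
WITH candidates AS MATERIALIZED (
    SELECT e.*
    FROM orders o
    JOIN events e ON e.order_id = o.id
    WHERE o.external_ref = 'abc-123'
    ORDER BY e.position
)
SELECT * FROM candidates
LIMIT 1;
```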
 
