
Postgres 15 Beta 2, Concerning Locks, Vacuum Tuning, Transaction Anomalies | Scaling Postgres 222


In this episode of Scaling Postgres, we discuss the release of Postgres 15 Beta 2, investigating concerning locks, best ways to tune autovacuum and handling transaction anomalies with SELECT ... FOR UPDATE.


Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "PostgreSQL 15 Beta 2 Released!". This is from postgresql.org, and they've announced that beta two has been released, which seems a bit early given it's not going to be until the fall when Postgres 15 is released. But here's the second beta and it mentions some of the changes from the first beta as well. So if you want to learn more about that, check out this blog post.

The next piece of content- "Postgres Locking: When is it Concerning?". This is from crunchydata.com, and they're talking about when you're monitoring your database. They're mentioning some tools here like pganalyze and PgMonitor. The question is, when should you start worrying about locks? Because locks are a fundamental thing that a database does because you have a single source of data and you potentially have multiple consumers of that data that need access to it to update it, change it, delete it, et cetera. They say the types of locks you generally have are kind of in two big buckets when you're doing data changes. One is shared locks, so multiple people can have a shared lock on an object and not interfere with one another. The second is exclusive locks, which means one single session needs to have exclusive access to a particular lock. They also mentioned that once a lock is taken, it's held until a transaction is committed.

So that's just something to keep in mind. They say the best way to monitor locks is by looking at the pg_locks view. Although in terms of spotting issues, I kind of agree with them that pg_stat_activity is a great place to look as well, because you see the active statements, and the wait event column shows what type of lock a running transaction may be waiting on. The pg_locks view can be very difficult to interpret, although they mention the key columns here. One is whether a lock has been granted or not, which is a boolean. Then there's the mode, which is essentially the type of lock being taken, and the pid, or process ID, holding or requesting that particular lock. So those are the important bits of information if you're going to look at the pg_locks view. And like I mentioned, locks happen in the system all the time.

So you can't just look at, say, the count of locks to say there's a particular problem. But one way you may be clued in that there's some type of problem is looking at where locks haven't been granted. So query pg_locks WHERE NOT granted and get a count of those. If you see that number increasing, there's presumably some type of problem and things are backing up. If something has acquired an exclusive lock and another process is waiting for a shared lock, it's essentially in a queue, a lock queue, and WHERE NOT granted is essentially that lock queue. So if you see an increasing number of transactions waiting in the lock queue, that indicates a problem. Now, if you want to investigate further, you can also look at pg_stat_activity for a Lock wait_event_type. You can look at the waiting process itself, but you can also use the pg_blocking_pids(pid) function to see what processes are blocking it.
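As a minimal sketch of what those checks can look like (the exact thresholds you alert on are up to you):

```sql
-- Size of the "lock queue": lock requests that have not been granted yet.
SELECT count(*) AS waiting_locks
FROM pg_locks
WHERE NOT granted;

-- Sessions currently waiting on a lock, and the pids blocking each of them.
SELECT pid,
       wait_event_type,
       wait_event,
       pg_blocking_pids(pid) AS blocked_by,
       query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
```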

Why is it waiting? What's it waiting on? Do you want to kill those processes or get insight into what the issue is? Then he mentions the non-blocking lightweight locks that affect data reading. So these are sessions waiting for data to come back, say, reading something from disk or accessing buffer content. These can happen on occasion if your system is really active, but if you notice consistent issues, then maybe it's time to upgrade your system. Lastly, he covers advisory locks, which are another type of locking mechanism.

Now, Postgres doesn't take these explicitly. These are functions an application developer can use to create their own locks within the application for coordinating access to data. He does point out that if you're using this with PgBouncer in transaction pooling mode, you definitely don't want to use the pg_advisory_lock function, because that is session-based. If you're using transaction pooling with PgBouncer, you're going to want to use the pg_advisory_xact_lock functions instead. But this is a great post explaining how to determine when locks are concerning, with some queries to take a look at them.
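As a minimal sketch of that advice (the lock key 42 is just an application-chosen identifier), the transaction-scoped variant looks like this:

```sql
BEGIN;
-- Transaction-scoped advisory lock: safe with PgBouncer transaction pooling,
-- because it is released automatically at COMMIT or ROLLBACK rather than
-- being tied to a pooled session.
SELECT pg_advisory_xact_lock(42);
-- ... do the application work that needs to be serialized ...
COMMIT;  -- lock released here
```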

The next piece of content- "Importance of PostgreSQL Vacuum Tuning and Custom Scheduled Vacuum Job". This is from percona.com and they're talking about how to deal with autovacuum issues, which tend to become a problem the larger your database gets. Some of the issues that they've seen that they're mentioning are that tables become candidates for auto vacuum during peak hours because the trigger mechanism for vacuuming something is basically the threshold of unvacuumed tuples. So if you have a very active database that's changing a lot of data, that's probably going to trigger that autovacuum. And now it's happening during peak hours. The other problem is starving tables. So your active tables are getting all the vacuums and your less active tables essentially haven't been vacuumed in a while, which can lead to certain problems. The third is no way to control the throttle of autovacuum workers dynamically.

Now they do mention you can adjust the cost limit dynamically and reload the database configuration, but any actively running worker doesn't automatically pick up the change. So if you have a long-running worker that takes hours, it will run to completion without taking this change into account; only new workers will use the new configuration. Fourth is the attempt by DBAs to tune parameters, which often backfires. So basically these are configuration mistakes. One thing they mention is creating a high number of workers, which many people think will vacuum things faster. It won't; it may actually slow things down. Fifth is autovacuum running during the active time window, which defeats its own purpose. Again, this is vacuum being triggered during active periods, which is not really what you want.

And sixth, starved tables trigger the wraparound prevention autovacuum. So there's an aggressive autovacuum that happens to prevent wraparound of the txids, and the longer you go without vacuuming a table, the more you run that risk. Then they go into ways to tune autovacuum, and the advice here is really great. So this is the process you should follow. The two parameters that are most important are the autovacuum_vacuum_cost_limit, which a lot of times is inherited from vacuum_cost_limit, so it's either one or the other, because this determines how much work a vacuum worker can do, and then how long to wait before kicking off the next autovacuum run. Basically, you can reduce that to get autovacuum going faster. Now, one thing to be aware of: if you try to increase the workers, this vacuum cost limit is a single shared pool. So if you increase the workers to 5, 6, or 10, they're all going to share this single cost limit.
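A rough sketch of adjusting those two knobs, assuming they are autovacuum_vacuum_cost_limit and autovacuum_naptime (the values here are illustrative, not recommendations from the article):

```sql
-- Raise the shared work budget that all autovacuum workers split between them.
-- The default of -1 means it inherits vacuum_cost_limit (200).
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;

-- Wake the autovacuum launcher more often so new work is picked up sooner.
ALTER SYSTEM SET autovacuum_naptime = '15s';

-- A reload applies these to new workers; already-running workers keep the
-- settings they started with.
SELECT pg_reload_conf();
```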

So actually, the more workers you have, the slower each individual worker will be, whereas the fewer workers you have, the faster each individual worker will vacuum, because they're all drawing from that common limit. And like we said, DBAs make mistakes. He says "One common mistake I see across many installations is that autovacuum_max_workers is set to a very high value, like 15!". I haven't seen that, but that's going to make each worker really slow at doing vacuums. Now, they say it's fine to keep the default value of 3. I've kept it at 3 or brought it up to 4, 5, or 6, never more than 6, based on how many tables we want to vacuum in parallel. One recommendation they make is that as you adjust your autovacuum settings, keep in mind that you can change them per table. This can be useful if you have two or three tables that are highly active and need to be vacuumed more often than your other tables. You can change those tables' settings to give them a more aggressive vacuum, as shown in the sketch below.
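A minimal sketch of a per-table override (table name and values are hypothetical): vacuum a hot table once roughly 1% of its rows are dead instead of the global default of 20%.

```sql
ALTER TABLE busy_orders SET (
    autovacuum_vacuum_scale_factor = 0.01,  -- 1% of rows instead of the 0.2 default
    autovacuum_vacuum_threshold    = 1000   -- plus a small fixed floor
);
```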

So keep that in mind; that's a good practice. The second thing, which I had not heard of and I really like the recommendation, is to supplement what autovacuum is doing by adding your own vacuum job. By supplementing it, you can target a particular vacuum job to run during off hours. Basically, the command here finds the 100 tables with the oldest transaction IDs that aren't being actively vacuumed, so essentially it's targeting those starved tables and saying 'Okay, just do a vacuum freeze on these tables'. Maybe this job runs nightly, so you may get a different population of tables being vacuumed every evening. The other thing they mention, which is great, is that you're choosing a slightly different parameter to trigger this, and that is the relfrozenxid. So this is great advice and great recommendations. I definitely encourage you to check out this blog post.
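An approximation of that idea (not the article's exact query): find the tables with the oldest relfrozenxid and freeze them from a nightly job.

```sql
-- List the 100 ordinary tables whose relfrozenxid is oldest; a scheduled job
-- would loop over these (skipping any table already being vacuumed) and run
-- VACUUM (FREEZE) on each one during off hours.
SELECT oid::regclass AS table_name,
       age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 100;

-- For each table returned, the job then runs something like:
-- VACUUM (FREEZE, VERBOSE) some_table;
```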

The next piece of content is "TRANSACTION ANOMALIES WITH SELECT FOR UPDATE". This is from cybertec-postgresql.com and they're talking about an anomaly that can happen with SELECT FOR UPDATE. So basically they create a bill table that has a total on it and then an item table that references the bill table with essentially item amounts. So they put one row in the bill table with a total of 60 and then each of those items in the item table adds up to that 60 amount. Now, when you need to add a new item to the bill, essentially you have to insert that new item into the items table and then you need to update the total with the item that was added. So this is a technique to do that here and they're using a transaction to make sure that the total isn't updated without the item or the item is inserted without the total. And when they use this query here to join items and totals together.

You can see what the query returns before that new item is added, and once the $40 item is added, it increments the total and you can see the new item. But you can have a transaction anomaly occur here if you use SELECT FOR UPDATE, which takes a row lock on the bill because you intend to update it. Now, the use case they give is that maybe you want to delete bills that don't have any items connected to them. So that's the particular use case they're thinking about here. Normally you're going to get the same results with this query as the previous one. But if you run it after the inserting transaction has performed the update but before it has committed, you can actually get this result here, where the $40 item is not visible but the total has been updated, which is a bizarre result, and they explain how this can happen. Basically, the query runs, and it does all of the processing on the items based upon the EXPLAIN plan.

Then the last thing it needs to do is lock rows, so essentially it's waiting for that lock. And because this query is waiting for the lock, it's waiting for that bill to be updated to the new value, so it's waiting for it to essentially become $100, and Postgres quote "...proceeds with its operation using the updated version of the row". So it's not using the original version of the row, which was $60, because it's waiting behind that lock. It's waiting to see what that new version of the row is and then goes ahead and uses it. But if you remember, because the items were processed earlier in the query, the new item wasn't there yet, so it's not going to see it. So this is definitely a bit of a head-scratcher, and I encourage you to read this post to get all the details behind it. However, the key way to resolve this type of anomaly is to use a different transaction isolation level.

So you can use SERIALIZABLE, of course, because that will make sure these concurrency anomalies go away altogether. But in this particular case, you can use REPEATABLE READ, because READ COMMITTED just means that anything you're reading has to have been committed, whereas REPEATABLE READ ensures the same snapshot for the whole transaction, as they say here. So even though the total has been updated, the query still only uses the original total for the bill row. And they show you how to do this in a transaction: instead of just saying BEGIN, you do BEGIN ISOLATION LEVEL REPEATABLE READ, and then you do your SELECT FOR UPDATE query. Now, the caveat is that you may run into a serialization error saying "ERROR: could not serialize access due to concurrent update". Basically that means you need to roll back and repeat the transaction. So definitely interesting. If you want to learn all the details about this, I encourage you to read this blog post.
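A sketch of both the anomaly and the fix, with table and column names that only approximate the post's example:

```sql
-- Session A: add a $40 item and bump the bill total, but don't commit yet.
BEGIN;
INSERT INTO item (bill_id, amount) VALUES (1, 40);
UPDATE bill SET total = total + 40 WHERE id = 1;

-- Session B under the default READ COMMITTED: this blocks on A's row lock,
-- and once A commits it proceeds with the updated bill row ($100 total),
-- while the join to item was already evaluated against the old snapshot,
-- so the $40 item is missing.
SELECT bill.total, item.amount
FROM bill
LEFT JOIN item ON item.bill_id = bill.id
WHERE bill.id = 1
FOR UPDATE OF bill;

-- The fix: REPEATABLE READ keeps one snapshot for the whole transaction.
-- If the locked row was changed concurrently, the query instead fails with
-- "could not serialize access due to concurrent update", and the application
-- rolls back and retries.
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT bill.total, item.amount
FROM bill
LEFT JOIN item ON item.bill_id = bill.id
WHERE bill.id = 1
FOR UPDATE OF bill;
COMMIT;
```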

The next piece of content- "ctid and other PostgreSQL table internals". This is from andreas.scherbaum.la. He's talking about all the hidden columns that are on essentially every table. So every table has a tableoid, which indicates essentially the object ID of the table. Now, first you may be thinking, isn't this the same for every row? And that's true except in the case of partition tables. So it actually shows the child table and the origin of where the data came from. The xmin and xmax indicate what transactions can see which rows. So this handles the multiversion concurrency control. There are cmin and cmax that actually reference the commands within each transaction.

So if one transaction has, say, five insert statements, those will be distinguished by cmin and cmax as five different commands within the transaction. The ctid essentially references the physical location of the row in the table. Keep in mind that this can change, so primary keys are really the best way to reference rows; the documentation makes that point as well. Then there are oids, which aren't really used anymore but were in previous versions of Postgres. If you want to learn more about these internals, this is a great post that explains all of them, and he shows different examples of how to use them. So I definitely encourage you to check out this blog post as well.
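These system columns don't show up in a plain SELECT *; you have to name them explicitly. A quick way to look at them on any table (the table name here is hypothetical):

```sql
SELECT tableoid::regclass AS source_table,  -- which (child) table the row lives in
       ctid,                                -- physical location: (page, item)
       xmin, xmax,                          -- inserting / deleting transaction ids
       cmin, cmax,                          -- command ids within those transactions
       *
FROM some_table
LIMIT 5;
```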

The next piece of content- "Introducing publication row filters". This is from postgresql.fastware.com and they're talking about in Postgres 15 the ability to filter what rows a publisher publishes, essentially. So you can create a publication and define a WHERE clause with it and you're only going to be getting those rows back. So they have a few examples where you could do WHERE a price is less than some amount, or WHERE non-managers are in the sales department. We only want to logically replicate those data to the subscriber database and you can create independent publications. So maybe certain subscribers want to subscribe to the USA data, or the UK data or the France data based upon different publications set up with different WHERE clauses in them. And I really didn't think about this, but what is quite interesting here is that updates are handled slightly differently by the publisher compared to how the subscriber receives them.

So for example, if neither the old row nor the new row matches the publisher's filter, nothing is replicated; that's clear. But when a row wasn't in the filter and is then updated on the publisher to be in the filter, that actually needs to be an INSERT on the subscriber, not an UPDATE, because the subscriber never had the data. Whereas if the old row was part of the filter and you've updated the data on the publisher so it no longer matches, that row actually needs to be deleted on the subscriber. And an UPDATE only happens when both the old row and the new row match the filter. They also talk about how it's possible to combine multiple row filters, which is done with an OR. Definitely an interesting feature coming for logical replication in Postgres 15.
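A minimal sketch of the Postgres 15 syntax (publication, table, and connection details are hypothetical):

```sql
-- On the publisher: only rows matching the WHERE clause are replicated.
CREATE PUBLICATION usa_customers FOR TABLE customers WHERE (country = 'USA');

-- On the subscriber: subscribe to that publication as usual.
CREATE SUBSCRIPTION usa_customers_sub
    CONNECTION 'host=publisher dbname=sales user=repl'
    PUBLICATION usa_customers;
```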

Next piece of content- "Introducing PostgreSQL interface for Spanner: availability at scale with the interface you know". This is from cloud.google.com. So basically, Google now offers a PostgreSQL-compatible interface for their product called Spanner, which is basically a distributed SQL-based database. So now you can talk to it as if it was PostgreSQL. So if you want to learn more, definitely check out this post.

The next piece of content- "SQL Isn't That Hard (When You Don't Approach It From an Object-Oriented Point of View)". This is from timescale.com and this is an interview they did with John Pruitt, so if you want to learn about his perspective, you can definitely check out this blog post.

Next piece of content- "SQL Isn't That Hard (If You Have the Freedom to Experiment)". This is another interview on timescale.com with Haki Benita, so if you want to learn more about his perspective on SQL, you can definitely check out this blog post.

The next piece of content. The PostgreSQL person of the week is Mladen Marinović. If you want to learn more about Mladen and his contributions to Postgres, definitely check out this blog post.

And the last piece of content. We did have another episode of The Rubber Duck Dev Show this past Wednesday evening. This one was on "Ruby Fibers", which are lighter-weight than threads, basically a language feature that offers a form of concurrency. So if you're interested in this type of long-form developer content, we welcome you to check out our show.
