No More Vacuum, Zero-Downtime Cut-Over, Network Impact, Not In Optimization | Scaling Postgres 275

In this episode of Scaling Postgres, we discuss getting rid of vacuum, how to do zero-downtime cut-overs, analyzing the impact of slow networks and seeing a not in optimization.

Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "PostgreSQL: No More VACUUM, No More Bloat". This is from oriolidata.com. And they're talking about, I guess you'd call it at this stage, a library called OrioleDB. Or they're calling it an engine that you can use in Postgres. So this would be another storage engine that you could use in Postgres apart from the heap. This is similar to what we've discussed in previous episodes with zheap. It's basically an attempt to create a storage system that uses an undo log to keep track of previous versions of rows, as opposed to storing those versions right in the heap itself, amongst your other data rows.

So with Postgres' current version of multi-version concurrency control, when you update a row, it actually creates a new row, and that old row needs to be removed at some point. And even when you delete a row, it just gets marked for deletion and needs to be cleaned up later. That's what VACUUM does. But this type of storage system also leads to bloat, and it can cause a lot of writing to the WAL at times. However, since Postgres version 12, there is the table access methods feature that allows additional storage engines to be implemented.
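
You can actually watch those dead row versions accumulate yourself using the standard statistics views. Here's a small sketch; the table is purely illustrative, and note that the statistics counters update asynchronously, so the numbers may lag slightly:

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE counters (id int PRIMARY KEY, value int);
INSERT INTO counters VALUES (1, 0);

-- Each UPDATE writes a new row version; the old one becomes a dead tuple.
UPDATE counters SET value = value + 1 WHERE id = 1;
UPDATE counters SET value = value + 1 WHERE id = 1;

-- n_dead_tup shows row versions waiting for VACUUM to clean up.
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'counters';
```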

So this blog post is calling it The Engine of the Future. So clearly, they would prefer to switch from the existing heap to their solution. I do like competition because I actually want to see which one wins out and which will be better. But at these early stages, I don't think there has to be a winner and a loser. Maybe they both could be used in parallel on the same system. I mean frankly, that would be ideal, so that you could put update-heavy content on OrioleDB and all the other content on the heap, since that has proven to be relatively rock solid. Because it's those implementations where you're trying to update a row many times that the heap and VACUUM can just get out of hand, whereas this would be a great solution for that.

And this actually uses two levels of undo log. They have a row-level undo log, which allows them to do updates in place. I think they're showing an example of that here where a particular ID goes through a couple of updates in transaction 9. So the undo log keeps track of all these previous versions in case a session needs to reference them, and eventually they get discarded. But it also has a block-level undo, which it says allows them to evict tuples that have been deleted but are still visible to some transactions. And they're using an example of that here where tuples 3 and 4 were deleted so that 5 and 6 can now be present in the actual index page they're looking at.

And then lastly, they also merge sparse pages, so pages, for example, that are half full. They can actually merge those pages to essentially compress the data and save bloat that way as well. And then we get to the part that I really like, which is the benchmarks that they have here. They created a relatively simple table, put a number of indexes on it, and used as a test query an UPSERT, so it inserts data but also does an update if there happens to be a conflict. Now they didn't say, and I didn't try to figure out, how often updates would be triggered, so I'm not sure what the ratio is, but presumably an update-heavy workload would be in OrioleDB's favor. But take a look at some of these graphs here.
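
Their exact benchmark schema isn't reproduced here, but the general shape of that kind of test query is a standard Postgres upsert (table and column names are illustrative, not theirs):

```sql
-- Sketch of an upsert-style workload. INSERT ... ON CONFLICT inserts
-- the row, or updates the existing one when the primary key collides.
CREATE TABLE accounts (id bigint PRIMARY KEY, balance bigint NOT NULL);

INSERT INTO accounts (id, balance)
VALUES (42, 100)
ON CONFLICT (id)
DO UPDATE SET balance = accounts.balance + EXCLUDED.balance;
```

On the heap, every conflicting run of that statement leaves behind a dead row version, which is exactly the workload where an undo-log engine should shine.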

Not only the pretty impressive increased throughput versus the heap, but also the consistency of the performance. Because you'll see there's this drop-off that happens with Postgres' standard heap. I'm wondering if these are checkpoints; I'm not quite sure what that is. If you happen to know, please put it down in the comment section. The CPU usage is super high but, again, it's very consistent, which demonstrates that all the resources of the system are really being used. But it has a very low IO load compared to PostgreSQL and none of these spikes that Postgres has. And then also the space used: it barely increases with OrioleDB, whereas it's steadily increasing with Postgres, indicating bloat is happening. Now they didn't show how the system was configured, so that's something missing from this.

And you always have to take these kinds of things with a grain of salt, but they say this test demonstrated five times higher transactions per second, two times less CPU load per transaction (so a CPU gets more done), 22 times fewer IOPS per transaction, and no table or index bloat. So the interesting thing is that this is available today, so you can visit GitHub here to try it out. Now, it still requires some patches to Postgres. So it's not just an extension you can install; you would have to try this out with a patched Postgres. But I encourage you to check out the comment section down here because it gives additional detail. What they want to do eventually is roll this into the Postgres core, which is great. But the first step is to get patches into Postgres so that OrioleDB can just become an extension.

But they have to get a number of patches approved and committed into Postgres for that to happen. They're hoping to get a lot of the most important patches in by Postgres 17 and become a pure extension by Postgres 18. So still a ways away, but I love the fact that this is happening. Because at the point that this becomes an extension and is seen as reliable, this could be a great option for people who want to try it for very update-heavy workloads that the heap is not ideally suited for, whereas this looks like it might be. So I definitely encourage you to check out this piece of content.

Now the next piece of content is related to this- "How Table Access Methods in Postgres may offer a future without Bloat". This is from pganalyze.com. And Lukas goes into more depth of this topic as well. So I encourage you to check out his content.

Next piece of content- "Zero-Downtime PostgreSQL Cutovers". This is from tech.instacart.com. This is a blog post describing some technology they've developed at Instacart. Now I didn't see if they're releasing anything open source but they're basically describing what they have set up for their Postgres RDS instances. They want to be able to do a logical replication upgrade relatively seamlessly whereas they say here with zero downtime. So they go over the reason why they built this new tooling. Now the process they use to do this is basically, they do a blue-green implementation. So the existing database system running here, for example, is Postgres 12. They've got a primary to read replicas.

They create a logical replica on Postgres 15, and they have a very specific way they do it, using RDS snapshots to get it up to speed quickly. Because just using raw logical replication to get it up to speed really takes a long time, whereas the snapshot technique was much faster. They consider this their green environment. They get it up and running, and then at the transition point, they redirect the traffic to the green set of instances. Now there are a lot of steps they go through, and they go into detail about how they set this up and things to look out for. So even though they're not offering the tooling, they're describing exactly what they do.
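
At its core, a logical-replication cutover leans on Postgres's built-in publication/subscription machinery. A minimal sketch follows; the names are placeholders, and Instacart's actual tooling does considerably more, including the snapshot-based catch-up rather than a plain initial copy:

```sql
-- On the old (blue) Postgres 12 primary:
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- On the new (green) Postgres 15 instance, once the schema and data exist:
CREATE SUBSCRIPTION upgrade_sub
  CONNECTION 'host=blue-primary dbname=app user=replicator'
  PUBLICATION upgrade_pub;

-- When replication lag reaches zero, pause writes briefly, redirect
-- application traffic to green, then drop the subscription:
DROP SUBSCRIPTION upgrade_sub;
```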

So you could take this and develop your own tooling to mimic what they've been able to build here. I found this super interesting, and I encourage you to check it out if you want to use logical replication as a way to seamlessly do cutovers. Because you want to upgrade your database version, or do patches, or maybe move operating system versions without worrying about collation changes. So definitely a great piece of content this week that I encourage you to check out.

Next piece of content- "How To Measure the Network Impact on PostgreSQL Performance". This is from percona.com. They're talking about how you can use pg_stat_activity and its wait events to determine the ratio of time spent on different operations that take place over the network with Postgres. And they're actually using a stand-alone script they call pg_gather that takes 2,000 samples of pg_stat_activity and the wait events to show a distribution of where time was spent. So you can see, when they're querying a large number of rows, the vast majority of the time is CPU time when the network is fast.

But with the slow network, suddenly you see all the time is spent on client writes, so writing data to the client. And they go over different cases such as bulk data loading, the impact on transactions, and connection utilization. So this is a great tool, this pg_gather script, to be able to just sample pg_stat_activity to see, for long-running processes, what is taking the most time in terms of wait events. So if you're interested in that, check out this blog post.
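
The underlying idea can be approximated with a single query against pg_stat_activity; pg_gather essentially repeats this kind of sampling many times to build its distribution:

```sql
-- One sample of what each client backend is waiting on right now.
-- Repeated sampling of wait_event reveals where time is going;
-- e.g. ClientWrite dominates when sending results over a slow network.
SELECT pid,
       state,
       wait_event_type,
       wait_event
FROM pg_stat_activity
WHERE backend_type = 'client backend';
```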

Next piece of content- "REPRODUCIBLE BUILDS: A POSTGRESQL QUERY OPTIMIZATION EXAMPLE". This is from cybertec-postgresql.com. And this is a project called Reproducible Builds, and they had a query that was looking at different packages to test. This query was taking 45 minutes to run. So someone looked into it and saw the infamous NOT IN. And as you saw in last week's episode of Scaling Postgres, where CYBERTEC was talking about how this makes for a very inefficient query, this shows an example of where you can actually convert it to a NOT EXISTS, and suddenly the query ran in a hundred milliseconds. So a 27,000x speedup. So if you're thinking about using NOT IN, you probably don't want to do it. Here's a real-life example of the issues it can cause.
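
The general shape of that rewrite looks like this (table names are illustrative, not the ones from the Reproducible Builds query):

```sql
-- Slow: NOT IN often can't be planned as an anti-join, and it behaves
-- surprisingly if the subquery can return NULLs.
SELECT p.name
FROM packages p
WHERE p.id NOT IN (SELECT t.package_id FROM tested t);

-- Fast: NOT EXISTS lets the planner use an anti-join and has clean
-- NULL semantics.
SELECT p.name
FROM packages p
WHERE NOT EXISTS (
    SELECT 1 FROM tested t WHERE t.package_id = p.id
);
```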

Next piece of content- "Timeseries with PostgreSQL". This is from alexplescan.com. This is a very simple post on how to create a time series chart in Postgres. He wanted to get the five most recent days, so he did a generate_series from now to now minus four days, decreasing by one day, to print out all of the days you see here. So these are the five most recent days. Then he joined to the data of interest, grouped by the particular day, and got this great data that you can chart. So this is a very simple query that you could use to chart time series data.
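
A minimal version of that pattern might look like this; the `events` table and its columns are my guess at the shape, not his exact code:

```sql
-- Generate the five most recent days, then count events per day.
-- The LEFT JOIN keeps days with zero events in the output.
SELECT d::date      AS day,
       count(e.id)  AS events
FROM generate_series(now() - interval '4 days', now(),
                     interval '1 day') AS d
LEFT JOIN events e ON e.created_at::date = d::date
GROUP BY day
ORDER BY day;
```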

Next piece of content- "Putting Postgres to the Test: How to Create and Run Regression and TAP Tests". This is from highgo.ca. This shows different regression tests that you can make to check your schema, or check the existence of columns or indexes or things of that nature. So check out this blog post if you're interested in that.

Next piece of content- "Building PostgreSQL in a Modern Way- With Meson". This is from highgo.ca. So if you want to learn how to build Postgres with a new Meson build system, you can check out this blog post.

Next piece of content- "PgManage 1.0b2 released". This is from commandprompt.com. So if you want to learn more about the new features, bug fixes, and UI improvements in this release, you can check out this blog post.

Next piece of content- "Citus 12: Schema-based sharding for PostgreSQL". This is from citusdata.com. And Citus, of course, does scale-out Postgres, and normally you choose a shard key within a table and then that table is sharded across multiple servers. But this release shards based on the schema. So if you have tenants separated by schemas, or different use cases in different schemas that you want to shard out that way, you can use their new schema-based sharding to do that.
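
In Citus 12 this is exposed as a function that distributes an existing schema, plus a setting for new schemas. A hedged sketch, with the schema names being illustrative:

```sql
-- Turn an existing tenant schema into a distributed schema, keeping
-- all of its tables colocated on one node group.
SELECT citus_schema_distribute('tenant_acme');

-- Or have newly created schemas become distributed automatically:
SET citus.enable_schema_based_sharding TO on;
CREATE SCHEMA tenant_beta;
```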

Next piece of content- "Say hello to Path To Citus Con, a podcast for developers who love Postgres''. This is from citusdata.com. And this is a new podcast. I don't know if it's going to stop at the point where Citus Con happens, but so far they have published five episodes. So if you're looking for more Postgres content, you can definitely listen to these.

Next piece of content- "Deploy PostgreSQL on Kubernetes Using GitOps and ArgoCD". This is from percona.com. And they're talking about using their Percona operator for PostgreSQL version 2, along with GitOps and ArgoCD to do their deployment and management of Postgres. So check this out if you're interested in that.

Next piece of content. There was another episode of Postgres FM last week, this one was on "Connection poolers" and they discussed quote "...when and why we need them, the king that is PgBouncer, and the many new pretenders to the throne". So you can click the button here to listen to it or visit their YouTube channel to watch the show.

And the last piece of content. We did have another episode of the Rubber Duck Dev Show this past Thursday afternoon. And this was on "Scarpe Diem With Nick Schwaderer". And Scarpe is actually a successor to a Ruby library called Shoes that lets you write desktop apps. So if you're interested in that type of content, we welcome you to check out our show.
