Easy Recursive CTE, Zheap Undo, High Availability, Loading Data | Scaling Postgres 196

In this episode of Scaling Postgres, we discuss easily writing a recursive CTE, the zheap undo capability, high availability considerations and fast ways to load data.

Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "Learn how to write SQL recursive CTE in 5 steps". This is from dev.to in the Yugabyte section. What I thought was interesting about this post is that it really breaks down how to write a recursive CTE into five steps. A lot of times, someone just gives you the finished recursive CTE and not how it's built up, whereas that's exactly what this post does. Step zero is basically creating an employee table; a recursive CTE is simply a CTE that references itself. The example here shows people who are managed by other people. So they have a table of employees, and each employee has a manager ID that points to another employee record. That's how the self-references are set up.
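To make that concrete, here is a minimal sketch of the kind of self-referencing table he describes; the table and column names are illustrative, not copied from the post:

```sql
-- Step 0 (sketch): an employees table where manager_id points back at
-- another row in the same table.
CREATE TABLE employees (
    id         bigint PRIMARY KEY,
    name       text NOT NULL,
    manager_id bigint REFERENCES employees (id)  -- self-reference to the manager's row
);
```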

He even gives an example of a foreign key reference that captures that relationship. The first step is to find the root, or basically level 0, which is where the manager is NULL. So that's essentially the president; no one else is managing him. This is a very simple query to get you exactly that information. Now, you'll see references to Yugabyte here, but of course, this works with Postgres as well. The next step is to define this level 0 in the WITH clause. So he just rewrites this very simple query using a WITH RECURSIVE statement. It basically just does one step, but it gives you the exact same information you were seeing above. Step three is to define the join to the next level. So he takes that one row and joins it to the employee table again to pull out who that top person is managing.
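As a rough sketch of steps one through three, using the same illustrative names as above: find the root, wrap it in a WITH RECURSIVE clause, then join back to the employee table to get the next level:

```sql
WITH RECURSIVE org AS (
    -- steps 1 and 2: level 0 is the employee with no manager
    SELECT id, name, 0 AS level
    FROM employees
    WHERE manager_id IS NULL
)
-- step 3: join level 0 back to employees to find who that person manages
SELECT e.id, e.name, org.level + 1 AS level
FROM org
JOIN employees e ON e.manager_id = org.id;
```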

So now you can see the three records that the top person is managing, just by adding this join within this section. Step four is to concatenate the two levels with a UNION. So the first query showed this record, the next one showed who's managed by him, and he just did a UNION ALL to put those together. Essentially, this gives you the first two levels. But to make the leap to do it recursively, you need to put that UNION ALL inside the recursive CTE itself. So that's what the final step does: he moves the queries and the UNION ALL up into one recursive WITH clause, and that's how you set up a recursive CTE. So I really like the way he laid it out. If you're interested in finding out more, you can check out this blog post.
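Putting it all together, the finished query looks roughly like this (same illustrative names as before); the UNION ALL lives inside the recursive CTE, so the join repeats level after level until no new rows are produced:

```sql
WITH RECURSIVE org AS (
    -- anchor: level 0, the employee with no manager
    SELECT id, name, 0 AS level
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    -- recursive term: everyone managed by someone already in org
    SELECT e.id, e.name, org.level + 1
    FROM org
    JOIN employees e ON e.manager_id = org.id
)
SELECT * FROM org ORDER BY level;
```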

The next piece of content- "ZHEAP UNDO LOGS DISCARDING IN POSTGRESQL''. This is from cybertec-postgresql.com, and this is talking about the zheap project, essentially. Basically, it's creating another information store that's separate from the standard heap where that table history is recorded in an UNDO log, as opposed to being stored directly in the heap using row versions. Now, the benefit of this is that you don't have to use vacuum to handle that. Basically, there's a separate process that would take care of those UNDO records. It's stored in a separate place, much like Oracle does it. So the benefit is avoiding table bloat. This can particularly impact when a table gets tons of updates because essentially in the standard storage engine of Postgres, you create a row for each update that happens.

So you can easily get a bloated table by doing that. Whereas with zheap table storage, the update is done in place, but it creates undo records to reference all of those versions. So this is essentially an update on the project. And he asks, how do you get started with zheap? They provided a container here. Unfortunately, what they're saying is that zheap is too invasive to be done as an extension. But my assumption is that this will eventually become a core part of PostgreSQL once it's been vetted and looks to be working. So right now, this is still a technical preview, but once it's working as expected, I'm assuming it would just be part of core and not considered an extension. It would just be another storage engine you could use. But once you get that docker container set up, you can go ahead and start using it.

He set the default_table_access_method to zheap for his entire session, although you can do it at table creation time if you want to instead. So it's basically just a different way to store and access table information. And then he tested zheap undo when you have an insert load. So he inserted 10 million records. As he shows here, in the base undo directory there's a large number of files that get generated by all of these inserts, because zheap stores the undo in this separate location; it's not part of the table. So until the insert is actually committed, you're going to have all of these files around, and he shows that here. Once it's committed and you take a look at this directory again, it has far fewer files. Now, part of what this post describes is how this removal of the undo happens. It actually uses a separate worker that's been set up for zheap called an undo_discard_worker. So that's the one responsible for discarding undo that's no longer needed, once all changes have been committed and no other connection needs to see that information, because it still has to comply with MVCC (multiversion concurrency control).
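Here is a rough sketch of that kind of insert test, assuming a build with zheap available; the table name and row count are my own, not taken from the post:

```sql
SET default_table_access_method = zheap;           -- for the whole session
-- or pick the access method per table:
CREATE TABLE t_zheap (id int, payload text) USING zheap;

BEGIN;
INSERT INTO t_zheap
SELECT i, 'row ' || i FROM generate_series(1, 10000000) AS i;
-- While the transaction is open, undo files pile up in the undo directory
-- inside the data directory; after COMMIT the discard worker trims them.
COMMIT;
```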

Then he tests the zheap undo with an UPDATE load. So essentially he takes that table and updates it 10 million times. And he says, as you can tell, there's a ton of WAL that gets generated, so it's still write-heavy. Again, this undo also lands in the base undo directory, but once it's committed, the directory shrinks back down in size. The important thing he mentions here is, quote, "The cleanup is not part of a commit, but it's actually done by the discard worker which is in charge of making sure that the log is not killed too early". So again, it has to ensure that there are no other transactions running that need access to that undo information before it discards it. But this is another update on the zheap project, something that I like keeping track of because I think this could prove beneficial for some table use cases with PostgreSQL, hopefully avoiding bloat. But if you're interested in learning more, you can check out this blog post.
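Continuing the sketch above, an update-heavy load on the same hypothetical table would look something like this; the rows are updated in place, but undo records and plenty of WAL are still written until the discard worker cleans up after the commit:

```sql
BEGIN;
UPDATE t_zheap SET payload = payload || ' (updated)';
COMMIT;
```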

Next piece of content- "Postgres HA: roles are dynamic". This is from tapoueh.org, and he's talking about Postgres High Availability. Specifically, he's talking about how the roles of your different nodes in a High Availability setup must be dynamic. Typically in Postgres, you set up a primary and it's always going to be a primary until you essentially decommission it. Then you have a replica that you set up, and it will continue to be a replica until you promote it. That's pretty much how it is. But with a High Availability setup, you actually want the primary to float back and forth between different machines as necessary. Maybe you want to upgrade a particular machine, therefore you want to change which server the primary is, and potentially you may want to flip back to it too. So this post makes the point that when you're talking about High Availability, the roles are dynamic.

So what's primary and what's secondary needs to be able to change dynamically, and essentially you need a tool to handle that, because Postgres by default really doesn't provide that capability. Basically, you can make an old primary catch up as a replica by using pg_rewind, and you can promote a replica to a primary using a simple command. However, being able to change roles back and forth requires additional tooling. Now, this post is by the person who authored pg_auto_failover, so he definitely mentions it more than once in this post as a method for doing HA. So that's definitely something you can check out. But this post also covers a lot of issues related to dynamically changing these roles and things you need to keep in mind, like capacity planning. You need to make sure that your replicas are essentially identical to the primaries or can handle the load, and then it also discusses things that can impact your long-term maintenance of this.
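For reference, the "simple command" side of this in stock Postgres is just a promotion call on the replica; the surrounding orchestration, including running pg_rewind on the old primary so it can rejoin as a standby, is the part a tool like pg_auto_failover automates:

```sql
-- Run on the replica that should become the new primary (Postgres 12+).
SELECT pg_promote();
```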

In particular, he talks about what you choose to use your replicas for. A lot of times, replicas are sitting there with nothing going on, so people want to start using them as a read-only source and start sending traffic to them. But then what happens if you have a failover? Can your replica now handle both the primary traffic and the read-only traffic it's already receiving? Or do you need additional instances to handle that? So there are a lot of different things to consider when you're setting up your High Availability solution. Then he talks about the application side of things, handling connection pooling, as well as different disaster recovery scenarios. So it's definitely a long post that covers a lot of issues with regard to High Availability. If you're interested in that, definitely check out this blog post.

The next piece of content- "The Fastest Way To Load Data Into Postgres With Ruby on Rails". This is from pganalyze.com and the post describes exactly what he says. He's got this convenient table up here that shows you different speed times. So if you're going to insert one record at a time, inserting a million records will take you 1.3 hours, which seems really long. If you do a bulk insert with Activerecord Import, it happens in 5.1 minutes. So what is this? So basically an insert statement can contain one row of information you want to insert. That's pretty much what this is. You're inserting one record at a time. Although an insert statement, you can place many rows of information to be inserted. So you could have an insert statement that has say, 100 rows of data that will be inserted. That's essentially a bulk insert. Specifically, he's using a separate library here called Activerecord Import to do it.

Although you don't need to use this library, you can send a multi-row insert statement to Postgres without a problem from different languages. The next one he talks about is PostgreSQL COPY, and he again uses a separate library called Activerecord-copy. In my experience, COPY is the fastest way to insert data into Postgres, and you don't necessarily need a separate library; I guess this one has some convenience features to it. If I'm receiving data, I would ideally want to have it as a CSV file and use the Postgres COPY command to quickly load it into Postgres. Now, his 1.5 minutes here seems super slow, because I've done a separate tutorial video where I inserted 100,000 records in just over a second. So something else is going on here. Either this library is causing some slowdown, or the other thing to take into account is that maybe there are indexes present. Because of course, the fastest way to insert data is with no indexes, triggers, or things of that nature on the table.
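For comparison, loading a CSV file with COPY looks like this; the path and table are hypothetical, and the file has to be readable by the server process (psql's \copy is the client-side equivalent):

```sql
COPY measurements (device_id, reading, recorded_at)
FROM '/tmp/measurements.csv'
WITH (FORMAT csv, HEADER true);
```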

So if you want to get the maximum speed, you want to drop any triggers or indexes before you do the load, if you can do that. A lot of times you're not going to be able to, but that's definitely the fastest way to insert the data. Then he talks about using background jobs: basically, you kick the work off to a background process so that to the end user it looks like it takes less than 1 second. But when you're inserting data, doing a row at a time is going to be the slowest, a multi-row insert is going to be intermediate, and the Postgres COPY command gets you there the fastest. If you want to do it as fast as you can, drop indexes and triggers; that should get you the most speed. But if you want to learn more about this, you can check out this blog post.
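As a quick sketch of that strip-the-indexes, load, rebuild pattern, assuming the table can tolerate being unindexed during the load (the index name is made up for illustration):

```sql
DROP INDEX IF EXISTS measurements_recorded_at_idx;
ALTER TABLE measurements DISABLE TRIGGER USER;     -- disable user-defined triggers

COPY measurements FROM '/tmp/measurements.csv' WITH (FORMAT csv, HEADER true);

ALTER TABLE measurements ENABLE TRIGGER USER;
CREATE INDEX measurements_recorded_at_idx ON measurements (recorded_at);
```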

The next piece of content- "Explore the new SEARCH and CYCLE features in PostgreSQL 14". This is from aiven.io. He's talking about new features related to recursive queries. Now this post first talks about setting up a database that has tables with different locations and it's basically a traveling problem. It's asking the question, what are the different routes to get to different locations? So he set up a trips table and set up all the different routes between the cities and their cost in terms of Euros. Then he sets up a query here to show you where you can go in Rome. So this is an example of what it looks like. Then as he says, he adds more hops to the journey. So using a recursive CTE from a previous post, he set the budget at €800, which is here to see how far they can travel.

Essentially, that budget query comes up with about 89 rows, and the results look similar to this. But then it gets into discussing SEARCH. That's the new Postgres 14 feature where, as he says, you can define your exploration path. SEARCH allows you to specify either a BREADTH or a DEPTH option. The BREADTH option shows you the shortest routes first, working up to the longer routes, whereas the DEPTH option follows a particular trip as deep as it can go first. In addition to that, he shows the CYCLE option, where you can avoid loops, so you don't go back to the same city; each city only appears once in a given path. So this was an interesting post exploring the new SEARCH and CYCLE capabilities with recursive CTEs in Postgres 14. So if you want to learn more, check out this blog post.
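Here is a rough sketch of what those clauses look like in Postgres 14 on the kind of trips table sketched above; SEARCH controls the ordering of the results, and CYCLE keeps the recursion from revisiting a city:

```sql
WITH RECURSIVE journeys AS (
    SELECT src, dst, price AS total_price
    FROM trips
    WHERE src = 'Rome'

    UNION ALL

    SELECT j.src, t.dst, j.total_price + t.price
    FROM journeys j
    JOIN trips t ON t.src = j.dst
    WHERE j.total_price + t.price < 800
)
SEARCH BREADTH FIRST BY dst SET ordercol   -- or: SEARCH DEPTH FIRST BY dst SET ordercol
CYCLE dst SET is_cycle USING path          -- mark and stop paths where a city repeats
SELECT * FROM journeys ORDER BY ordercol;
```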

The next piece of content- "PostGIS Nearest Neighbor Syntax". This is from blog.cleverelephant.ca. So this is a way to get only one distance calculation and one target literal from an end nearest neighbor search using PostGIS. So if you're interested in that, you can check out this blog post.

The next piece of content- "UK COVID-19 dashboard built using Postgres and Citus for millions of users". This is from citusdata.com, and this is with regard to the ordeal that's been happening in the last couple of years. I don't normally cover posts like this. This is essentially a case study for Citus. However, this had some pretty interesting details about the data, the volumes, how they set things up, dealing with partitioning of the data, and some issues they were encountering with processing it. So I found it a pretty interesting read. But just a warning, it is quite long, but if you want to learn more about this project, you can definitely check out this blog post.

The next piece of content- "MANAGE ENCRYPTION KEYS WITH POSTGRESQL TDE". This is from cybertec-postgresql.com. They have an open-source tool they call PostgreSQL TDE, which does on-disk encryption for Postgres. So it's provided as a separate extension that you can use. This specifically talks about how Postgres handles disk encryption. Basically, on startup, there's a flag called encryption-key-command, and basically, you provide a command that can then retrieve the key from any key source that you want for handling the encryption of your database data. So if you want to learn more about that, you can check out this blog post.

The next piece of content- "High Availability and Disaster Recovery Recipes for PostgreSQL on Kubernetes". This is from percona.com. This talks about different ways of using their Postgres operator for doing High Availability in a multi-data center with multiple availability zones. It discusses how to handle vertical scaling, horizontal scaling, handling, disaster recovery, backup, and restores, as well as continuous restores. Which are kind of like log shipping to a separate region, for example. But you can check out this blog post if you want to learn more.

The next piece of content. The PostgreSQL person of the week is Ashutosh Bapat. So if you're interested in learning more about Ashutosh and his contributions to Postgres, definitely check out this blog post.

The last piece of content. We did have another episode of The Rubber Duck Dev Show this past Wednesday evening. This episode we talked about "Code Quality Analyzers", and more specifically, Ruby Critic for Ruby. But of course, there are other tools that can analyze your code quality. So if you're interested in some long-form, developer-based content, you can definitely check out our show.
