
Zheap Status, Sequence Gaps, Efficient Pagination, Set Up Patroni | Scaling Postgres 175


In this episode of Scaling Postgres, we discuss the current status of zheap, gaps in sequences, how to paginate efficiently, and how to set up Patroni with etcd and HAProxy.


Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "POSTGRESQL ZHEAP: CURRENT STATUS". This is from cybertec-postgresql.com, and this article talks about zheap and its status. Now, zheap, as a reminder, is a different storage engine from Postgres's typical heap. It supports undo operations, similar to how Oracle handles multi-version concurrency control. The way Postgres works now is that when a row needs to be updated, it doesn't actually update that individual row. It creates a new row version and marks the old one for deletion, but it keeps that old version around for other sessions that still need it. In other words, that's its multi-version concurrency control. But the way Oracle works, which is similar to how zheap is intended to work with Postgres, is that it maintains an undo log. So the rows do get updated in place in the table, but the old values of the row get stored in the undo log, so that if you need to roll back or look at a previous version, it can consult that undo log.
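If you want to see that row-versioning behavior on the standard heap for yourself, here is a small illustrative example (mine, not from the article) that watches the ctid system column, which reports the physical location of the current row version:

    -- Each UPDATE on the standard heap writes a new row version; the old one
    -- stays behind until vacuum removes it. The ctid shows the new location.
    CREATE TABLE heap_demo (id int PRIMARY KEY, val text);
    INSERT INTO heap_demo VALUES (1, 'original');

    SELECT ctid, * FROM heap_demo;                      -- typically (0,1)

    UPDATE heap_demo SET val = 'updated' WHERE id = 1;

    SELECT ctid, * FROM heap_demo;                      -- typically (0,2): a new tuple was written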

So this post goes over zheap's current status and it mentions that they're working on logical decoding, which is the process of reading the WAL and decoding it back into SQL. This is what solutions that do change data capture rely on, so zheap will have to support decoding for that feature to work. There is also an UNDO infrastructure that Postgres itself is working on, which may make it into Postgres 15. Basically, zheap needs to work with this UNDO infrastructure because the goal is for that infrastructure to serve more than just zheap. So it looks like a general UNDO infrastructure is being created, and zheap is being adapted to be compatible with it. The other thing they're doing is a lot of bug fixing and merging, and they show all the code that's been developed so far for zheap, as well as continued improvements. For example, one area they're working on now is the discard worker, which essentially goes through the undo chunks and throws away entries that are no longer needed, similar, I guess, to how vacuum works.

Now, they do have a build with zheap enabled that you can check out, and you can create a table using zheap. You just add this when you create the table and it will start using that storage engine for it. Now, the advantage of zheap is that there should be much less overhead for tables that get heavily updated. With Postgres now and its standard heap, if you have a single row and you update it a million times, it's going to create a million row versions in that table, because each update creates a new row and saves the old version, and then all of those need to be vacuumed up. But with zheap, the same row would be updated in place a million times. It would log those million changes in the undo log to be able to roll back or maintain different versions, but those undo records would eventually be discarded and you wouldn't bloat your main table. And in terms of zheap, they are asking for assistance. So if you want to help test it out or lend a hand in any other way, go ahead and reach out to cybertec-postgresql.com.
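As a rough sketch of what that looks like (based on the post; the exact syntax in their zheap-enabled build may differ), the table access method is selected with the USING clause at table creation time:

    -- On a zheap-enabled build, choose zheap as the table access method:
    CREATE TABLE t_zheap (
        id   int,
        data text
    ) USING zheap;

    -- On stock Postgres, the default access method is the regular heap:
    CREATE TABLE t_heap (
        id   int,
        data text
    ) USING heap;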

The next piece of content-"One, Two, Skip a Few". This is from incident.io and they're talking about some incrementing incident numbers. So each customer has their own set of identifiers and they're identified by incident 1, incident 2 and another customer would have incident 1, incident 2. So they're unique for each customer. It's not a globally unique number. Now, to implement this they used sequences but then they ran into a problem of support, they were getting a lot of gaps which if you know anything about sequences, are not guaranteed to be gapless. There are a number of situations where gaps get introduced but in this post, they describe how they set up a trigger and used a sequence to be able to track and give new numbers for each incident and then identified cases where if you have a rollback you can get gaps in the sequence.

So for example, here they insert two values, A and B. Then they insert C and D, but roll back before those are committed. Then they insert E and F, and you can see that C and D are missing, but so are their sequence numbers. So that's one way you can get gaps. But that wasn't the case their end users were hitting. What they identified is that the gap they were seeing was 32. Now, what's interesting is that this number is actually in the Postgres source code: it's how many sequence values are pre-logged at a time, and they quote the source code: "We don't want to log each fetching of a value from a sequence, so we pre-log a few fetches in advance", namely 32, and "In the event of a crash, we can lose or skip over as many values as we pre-logged".
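You can reproduce that kind of rollback gap with a quick example (illustrative, not taken from the post):

    CREATE TABLE items (id bigserial PRIMARY KEY, name text);

    INSERT INTO items (name) VALUES ('A'), ('B');   -- ids 1 and 2

    BEGIN;
    INSERT INTO items (name) VALUES ('C'), ('D');   -- ids 3 and 4 are consumed...
    ROLLBACK;                                       -- ...but the rows are discarded

    INSERT INTO items (name) VALUES ('E'), ('F');   -- ids 5 and 6: a gap at 3 and 4

    SELECT id, name FROM items ORDER BY id;         -- returns 1, 2, 5, 6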

So that is another case: if your Postgres crashes, you could get gaps in the sequences because it's pre-allocating 32 values at a time. But they hadn't had any crashes. So again, they looked into it more, and what they discovered is that they had done an upgrade and switched over to a new system, and that is essentially what caused the gap they experienced. So basically, if you want this type of gapless incrementing, don't use sequences for that purpose. Now, the way they ended up implementing it was getting the max of the ID per customer and then using that value to derive the next identifier for the row. There is, of course, overhead with this, because for every insert you basically need to make a SELECT to identify which identifier to use, but it works in their case. If you want to learn more, you can check out this post.
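A minimal sketch of that max-based approach could look like the following (table and column names are assumed, and a real implementation still has to handle two concurrent inserts computing the same number, for example with a unique constraint plus a retry):

    -- Compute the next per-customer incident number at insert time.
    INSERT INTO incidents (customer_id, incident_number, title)
    SELECT 42,
           COALESCE(MAX(incident_number), 0) + 1,
           'Database is down'
    FROM incidents
    WHERE customer_id = 42;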

The next piece of content- "Efficient Pagination in Django and Postgres. This is from pganalyze.com and they're talking about how for frameworks in general, and in Django, the pagination tends to use limit and offset. Then what it also tends to do is do a SELECT all of the rows from a table or whatever list you're paginating through, because it needs a full count of the list to be able to know how many pages to put out there. This implementation could get really slow as your number of records increases as well as the further you go out in the offset. Because the more records the longer this query will take to run and then the further you go out in the offset it has to scan through all those records to get to it. So they were seeing something earlier taking 89 milliseconds.

Then when they went out to record 5 million of a 10 million record set, they were seeing over 2 seconds for the page and over 4 seconds just to get a count of all the rows, so it's very, very inefficient. Now they made some adjustments, some hacking of the library as it were, to try and speed this up. They tried removing the count query, so pagination is basically open-ended and you don't know how many total pages you're supposed to have, and of course that helped. They also tried approximating the count. You can use reltuples from pg_class to get an estimate, but that's not going to work if you're looking at a subset of the table.
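For reference, that estimate comes from the planner statistics and looks something like this:

    -- Approximate row count for the whole table (refreshed by ANALYZE/autovacuum):
    SELECT reltuples::bigint AS estimated_rows
    FROM pg_class
    WHERE relname = 'items';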

If you have a multi-tenant application where multiple customers each use a segment of the table, that solution won't really work. But the solution that's the most efficient is keyset pagination. Basically, you use some sort of indexed identifier, in this case probably the primary key, an ID. Instead of an offset, you filter on IDs greater than the last ID you saw and then do a limit of ten. That way it can use the index that's in place to quickly find that position and just return the next rows from it. So this is the most efficient solution to this type of pagination problem. They talk a little more about some of the plugins and adjustments they made, but keyset pagination is the way to go. So if you want to learn more about that, you can check out this blog post.
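A generic sketch of keyset pagination looks like this (column names assumed):

    -- Remember the last id returned on the previous page and seek past it.
    -- With an index on id this jumps straight to the right spot instead of
    -- scanning over all of the skipped rows.
    SELECT * FROM items
    WHERE id > 5000000          -- last id from the previous page
    ORDER BY id
    LIMIT 10;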

The next piece of content- "How to Set Up PostgreSQL High Availability with Patroni". This is from arctype.com and this walks through the process using four nodes to set up a highly available PostgreSQL instance using Patroni. So basically on one node, they installed Patroni and Postgres. On a second node, they also installed Patroni and Postgres. So these would operate as the primary and one replica. Then on a third node, they set up etcd. Now they also mentioned you should, of course, have more than one etcd instance set up to potentially avoid split-brain scenarios, as well as a HaProxy instance to identify where to send the primary traffic. This blog post runs through all the commands to run on Ubuntu 20.04. So from installing the prerequisites Postgres, Patroni, etcd, and HaProxy and all the configuration that you would need to do to get it up and running. So it's a pretty comprehensive blog post. So if you're interested in setting up Patroni, I definitely encourage you to check out this blog post.

The next piece of content- "PostgreSQL Extension Catalogs". This is from fluca1978.github.io and he's talking about three views that are available in Postgres. They show you the PG extensions that are installed. They show you the extensions that are available for you to install, and usually the default install of Postgres, at least from the package managers of the operating systems, generally has a lot of versions available that you can install. But then there's also a different version of those. This view, pg_available_extension_version, shows you all the different versions that you can install of those extensions. So I encourage you to go ahead and check out this blog post if you want to learn more about these views and how you can use them to help manage your extensions.

The next piece of content- "The Amazing Buffer Tag in PostgreSQL". This is from highgo.ca. They're talking about the buffer tag in Postgres that allows Postgres to identify a specific file and block it needs to update. Basically, this buffer tag contains five numbers that identify table space, the database, the table, the fork number, and the blocked number. So this blog post goes into how these work and explains how Postgres uses these under the covers to identify where data is stored. So if you're interested in that, you can check out this blog post.

The next piece of content- "Maps with Django (Part 2): GeoDjango, PostGIS and Leaflet". This is from paulox.net. So this post walks through setting up a mapping solution in Django, the web framework for Python, and they use the library GeoDjango. They use PostGIS with Postgres, of course, to store the data and then Leaflet, which is a JavaScript library for interactive maps. Now, this is quite a comprehensive blog post that even has this very comprehensive index here, but basically, at the end of it all, you can get map markers rendered on a map. So if you want to learn how to implement this type of solution. Definitely check out this blog post.

Next piece of content- "Using PostGIS and pg_featureserv with QGIS". This is from blog.crunchydata.com and they're talking about how to use pg_featureserv to output data using the OGC API for features protocol to feed data into a GIS application. So if you're interested in doing that, you can check out this blog post.

The next piece of content, also from Crunchy Data, is "Waiting for PostGIS 3.2: ST_MakeValid". This covers ST_MakeValid in PostGIS 3.2, the function that allows you to correct invalid geometries. So if you're interested in that, you can check out this blog post.
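As a quick illustration (mine, not from the Crunchy post), a classic self-intersecting "bowtie" polygon can be checked and repaired like this; the exact repaired output depends on the PostGIS version and options used:

    -- ST_IsValid flags the bowtie as invalid; ST_MakeValid returns a corrected geometry.
    SELECT ST_IsValid(geom)              AS was_valid,
           ST_AsText(ST_MakeValid(geom)) AS repaired
    FROM (SELECT 'POLYGON((0 0, 1 1, 1 0, 0 1, 0 0))'::geometry AS geom) AS g;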

The next piece of content, also from Crunchy Data, is "Waiting for PostGIS 3.2: ST_Contour and ST_SetZ". These are two additional functions for working with raster data: generating contours and looking up the value of a raster at a point. So if you're interested in that, you can check out this blog post.

The next piece of content: the PostgreSQL person of the week is Rafia Sabih. So if you're interested in learning more about Rafia and her contributions to Postgres, you can check out this piece of content.

And the last piece of content is that we did another episode of The Rubber Duck Dev Show this past Wednesday. This one was on "Object Oriented vs. Functional Programming". So if this type of content is of interest to you, we welcome you to check out our show.
