
PG15 Public Schema, Kubernetes Experiences, Dump Manifests, PgBouncer Fork | Scaling Postgres 231


In this episode of Scaling Postgres, we discuss the PG15 changes to public schema permissions, experiences with Kubernetes for Postgres management, using dump manifests for restores, and a fork of PgBouncer.

Content Discussed

YouTube Video

Podcast Audio

Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "Changes to the public schema in PostgreSQL 15 and how to handle upgrades". This is from ads' corner at andreas.scherbaum.la, and he's talking about an enhancement coming in Postgres 15 where the create privilege on the public schema is revoked by default. He takes a look at Postgres 14 first, and if you inspect the public schema with the \dn+ command, you can see that any user (the entry with no user name in front of the equals sign) has both usage and create rights; the C is the create right. Whereas when you look at it in Postgres 15, any user only has usage rights and no create rights. The schema's owner is also pg_database_owner, as opposed to the name of the actual owner.
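
As a rough illustration (exact column widths and descriptions will vary), the difference shows up in the access privileges column of \dn+, where the entry with nothing in front of the equals sign is the PUBLIC pseudo-role:

```sql
-- Postgres 14: PUBLIC has Usage and Create (=UC); owner is typically the bootstrap superuser
\dn+ public
--  public | postgres | postgres=UC/postgres
--         |          | =UC/postgres

-- Postgres 15: PUBLIC only has Usage (=U); owner is pg_database_owner
\dn+ public
--  public | pg_database_owner | pg_database_owner=UC/pg_database_owner
--         |                   | =U/pg_database_owner
```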

So the question is what happens during an upgrade? Well, based upon what he's saying here, it basically stays the same. In other words, it keeps the old behavior in Postgres 15 if you're upgrading from Postgres 14 or earlier. Now, he does say that if you're doing a dump and restore, you can create the database on Postgres 15 using the "-T" option to specify template0 from the Postgres 15 installation. It will then use the new default, where the owner is pg_database_owner and the create privilege on the public schema is revoked from all users except the database owner.
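
As a minimal sketch of that dump-and-restore path, assuming a database called mydb (the name and dump file are just placeholders), the SQL equivalent of createdb -T template0 is:

```sql
-- On the Postgres 15 server: create the target database from template0 so it
-- picks up the new public schema defaults, then restore the dump into it
-- (for example with pg_restore -d mydb mydb.dump).
CREATE DATABASE mydb TEMPLATE template0;
```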

Now, that's for a pg_dump and restore, which is a logical restore. So if you're doing an upgrade using pg_upgrade, this technique won't work because you can't specify the template; the files are just transitioned over. But you can run two commands afterwards to set it to the Postgres 15 default. Basically, after your upgrade, you run REVOKE CREATE ON SCHEMA public FROM PUBLIC, which revokes the right for essentially all users to create objects in the public schema. And if you want pg_database_owner to be set as the owner, you run ALTER SCHEMA public OWNER TO pg_database_owner. Now your database is set up the way the default comes in Postgres 15. So if you want to learn more about that, definitely check out this blog post.
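
For reference, those two statements, run on the upgraded Postgres 15 database, are:

```sql
-- Remove the ability of all users (the PUBLIC pseudo-role) to create objects in the public schema
REVOKE CREATE ON SCHEMA public FROM PUBLIC;

-- Hand ownership of the public schema to the pg_database_owner pseudo-role
ALTER SCHEMA public OWNER TO pg_database_owner;
```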

The next piece of content- "PostgreSQL on Kubernetes Experiences". This is from proopensource.it. This is actually an assessment and review of the different Postgres operators that work with Kubernetes, and there are a number of them: CloudNativePG, the Percona Operator for PostgreSQL, PGO by Crunchy Data, Postgres-operator by Zalando, and StackGres. The post walks through the latest version of each, the high availability option they use, the backup option they use by default, whether they use PgBouncer, the available extensions, and whether each supports minor or major upgrades of Postgres. Now, this post also shares some experience using them; basically, you can no longer do configuration changes using Ansible scripts or shell scripts.

Basically, everything has to be done through YAML files, through the operator, so it's definitely a different way to work with Postgres. In addition, all of them require some configuration changes, which she mentions down here. Now, I haven't really delved into Kubernetes, especially with Postgres, and actually seeing all these different operators kind of gives me pause. Frankly, I would kind of like to see some of these different operators combine forces, as it were, and create a really good full-featured one. But that's just my opinion from a user's perspective. Or I may just wait to see which one becomes the most popular option, kind of like PgBouncer is clearly the most popular connection pooler. Will there be a Postgres operator that becomes the most popular? At that point, maybe I'll dedicate some time to learning that one. But if you're interested in exploring Kubernetes with Postgres, definitely check out this blog post.

Now related to that, that's actually what this next article is about- "5mins of Postgres E33: Postgres on Kubernetes, choosing the right operator, and handling major version upgrades". This is from pganalyze.com, and Lukas covers the post we just discussed in this episode, as well as a few others we've mentioned in previous episodes. He talks about Kubernetes in general and the current state of managing Postgres with it, shows how you can do upgrades, and gives a little bit of his opinion on the subject. So I definitely encourage you to check out this piece of content as well.

Next piece of content- "Working With PostgreSQL Dump Manifests". This is from percona.com and a manifest file basically tells you what steps a pg_restore will have to take. And the purpose of it is so that you can alter the manifest. So maybe you will only want to restore certain things from a database dump. So to get started with it, you use a pg_dump using the custom option as the format output to make your database dump, and then you use pg_restore and the "-l" command to output to a manifest file. Then you can make changes to it if you want and read back in the manifest file with pg_restore using the "-L" option.

Now, in this example, they used pgbench and added a few other database objects. They did the dump, generated a manifest file using pg_restore, and the manifest file looks like this. They then selectively removed certain rows so that it would only restore the set of data they show here. They did the restore using that modified manifest file and only got the objects they were interested in restoring. So it's basically a little more convenient than specifying individual tables at the command line to load things in; you can get more granular with how you restore things. So check this post out if you want to learn more.

Next piece of content- "Open sourcing our fork of PgBouncer". This is from blog.cloudflare.com, and apparently, they were using PgBouncer in their architecture and they were running into issues with looking up the user to connect to the actual database to. So with PgBouncer, you can use a user list text file in PgBouncer to identify the users that PgBouncer should connect to Postgres as. You can also use the pg_shadow database. So basically PgBouncer queries Postgres to find out the proper username and password for a user. And basically, there were some existing issues with PgBouncer where this wasn't working. So Cloud Flare did some work to actually resolve this issue and they did it on a fork of PgBouncer. Now, once they did that, they discovered that they could add some new features such as dynamically shrinking connection pool limits at Runtime.

So as opposed to, I'm assuming, doing a reload, you can dynamically change the pool limits to give you real-time configuration of PgBouncer, to try to avoid bad tenants taking most of the resources of the database. They show some of the different runtime commands down here. So while this is a great addition, it is a fork of PgBouncer, and kind of like with the operators, I really wish they would join forces with PgBouncer and basically backport the fixes, and ideally this new feature, into the existing PgBouncer. As a community, we're all trying to work toward the same goal: making each piece of open-source software better. Now, I will say I did see this fork being referenced in some of those PgBouncer issues as a way to fix them. So maybe that's happening, and that would be great news. But if you want to learn more, check out this blog post.

Next piece of content- "How PostgreSQL 15 improved communication in logical replication". This is from postgresql.fastware.com and they're discussing some enhancements that are coming to Postgres 15 that make logical replication more efficient. So they're talking about with logical replications you have a WAL sender on the primary and a WAL receiver on where you're replicating to and that their communication between them is based upon a wall receiver timeout. There are keep-alive messages that are sent to make sure that the WAL sender and the WAL receiver are still communicating. In addition, it sends logical replication protocol messages to say this INSERT happened, this UPDATE happened, the Delete, the TRUNCATE as well as transaction control messages.

Now, in addition to these features, you can also filter, so a particular publication only sends certain of these DML messages (INSERT, UPDATE, DELETE, and TRUNCATE) to a subscriber, or only rows matching a condition; you could say where the status is finished or something like that, and only those would get sent to the replica. But the issue they were addressing with this enhancement is that when you do that, meaning you filter out some changes that would otherwise be sent, the transactions are still being created, and they're essentially empty transactions. In other words, there's no data that needs to be sent, but the sender still initiates and commits the transaction, with no changes inside it, because everything is being filtered by the publication. So this was unnecessary, and they basically addressed this problem by putting in a fix.
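
For reference, that kind of publication filter in Postgres 15 looks roughly like this (the table and column names are invented for illustration):

```sql
-- Only rows matching the WHERE clause are sent to subscribers of this publication
CREATE PUBLICATION finished_orders_pub
    FOR TABLE orders WHERE (status = 'finished');
```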

With the fix in place, it no longer needs to send these BEGIN/COMMIT messages for essentially empty transactions. The other thing they did: if the sender is actively working through a long transaction, it's possible to hit that timeout, so they added some modifications so that keepalive messages are sent more frequently to make sure the WAL receiver doesn't time out. Now, they did show some performance impact here, but of course, the performance impact of this change depends on how many empty transactions you have, which comes down to how much you're filtering your publications. The more you filter, the more benefit you'll get from these enhancements. If you're not really filtering and you're logically replicating everything, you're hardly going to see any difference. But if you want to learn more about that, definitely check out this blog post.

Next piece of content- "SQL/JSON is postponed". This is from depesz.com and he's talking about some new features that were mentioned a couple of months ago coming to Postgres 15, adding some JSON features to be more compatible with the SQL standard, and, unfortunately, those have been backed out from both Postgres 15 and the development version of 16. So unfortunately it looks like we're not going to be seeing those definitely in 15 and maybe not in 16. Now they'll eventually be coming, he just doesn't know when yet.

Next piece of content- "POSTGRESQL: ALTER TABLE ... ADD COLUMN ... DONE RIGHT". This is from cybertec-postgresql.com and they're talking about when you add a column to a table, as long as you don't specify anything other than that, that should happen relatively quickly. And he's demonstrating this by putting 100 million rows in a table and he's just adding a column. The size of the table doesn't change that much and it happens very quickly in a matter of milliseconds that a column was added to a table. Now it does require an exclusive lock for a very short period of time and to mitigate any potential issues in production, you're going to want to use a lock timeout so that the adding of that doesn't block anything for some reason, or if it does it'll cancel the ADD COLUMN command.

So you could set a lock timeout of 5 or 10 seconds, or something like that, just to make sure something else isn't already locking the table, preventing this exclusive lock and causing a lock queue to build up behind the ADD COLUMN command. So definitely do that, but as long as it's a relatively simple column add, it should happen pretty much immediately. You can also add a DEFAULT or a NOT NULL constraint, as long as the default you're setting is a constant; so basically not a function or anything nondeterministic. Now, where you run into problems is when you're adding a default that uses a function, like the random function here, or maybe a time function like NOW.
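
A quick sketch of the difference, using a hypothetical big_table (the lock_timeout value is just an example):

```sql
SET lock_timeout = '5s';  -- give up instead of queueing behind other locks

-- Fast: a constant default is only a catalog change, even on a huge table
ALTER TABLE big_table ADD COLUMN flag boolean DEFAULT false;

-- Slow: a volatile default such as random() has to be evaluated per row
ALTER TABLE big_table ADD COLUMN score double precision DEFAULT random();
```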

Now it essentially has to rewrite every single row, and that's going to take forever. So you definitely want to be careful of that. As you can see here, this took over a minute, but if you have a much larger database, it'll take a lot longer. So just be cautious about adding non-constant defaults when you're adding a new column, and you should be okay. Now, in terms of dropping a column, that's also rather fast. It's not going to make any changes to the data; essentially, it's just dropping the reference to the column. But in both cases, adding and dropping, you always want to use a lock timeout. Check this post out if you want to learn more.

Next piece of content- "Postgres Migration Pitstop: Collations". This is from crunchydata.com and they're talking about an issue where Postgres usually uses the glibc library for determining how it's going to sort things, particularly in indexes. And if you change glibc versions, the main way to do this is by upgrading your operating system. Like if you go from Ubuntu 18.04 to 22.04 or maybe even 20.04, you're going to get different versions and you're going to potentially have issues that include missing data when you query it, inconsistent sorting between versions, and undetected unique constraint violations. So basically, this can cause you a world of hurt. You want to be cautious whenever you're switching operating system versions because a lot of time you're relying on the glibc version of the operating system.

And I actually have customers that have stayed on 18.04 longer than they probably want to because of the difficulty of addressing this. So how do you address it, or how do you fix it? Well, basically you need to do a reindex. Now, if you have a terabyte-plus database, that's going to take a while and a lot of planning to figure out how best to do it. I should say this applies if you're just moving the files over, like using pg_upgrade, or if for some reason you do a base backup and restore; since the files haven't been rewritten, you can run into these issues. But if you have a smaller database and you're using pg_dump and pg_restore, the indexes are recreated anyway, so you bypass this problem. Also, as they say here, using logical replication will avoid it as well.

So if you use logical replication as your upgrade process, you can avoid it as well, because those indexes are going to be created fresh; no files are copied over that were already sorted according to a particular collation standard. They show a little bit of how the sorting works, talk a little about what glibc is, and show how you can query to see which collations you're using by looking in the pg_database table, as well as which collation version you're using by querying the pg_collation system catalog.
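
The inspection queries look roughly like this; pg_collation_actual_version() compares what was recorded against what the currently installed library reports:

```sql
-- Default collation settings for each database
SELECT datname, datcollate, datctype FROM pg_database;

-- Recorded collation versions vs. what the current glibc actually provides
SELECT collname, collversion, pg_collation_actual_version(oid) AS actual_version
FROM pg_collation
WHERE collprovider = 'c';  -- 'c' = libc-provided collations
```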

And then they talk about fixing it. Basically, how you fix it is by doing a reindex, ideally reindexing concurrently. Now, there's a way to sidestep this that they don't mention in this post, and that's using ICU collations. So when you're setting up your database, if you use an ICU collation, you can more easily control which collation versions you're using, and you can stay on the same version or choose to upgrade it at a particular point. But a lot of the clients I work with are just using the standard glibc library, so that's unfortunately not an option. But if you want to learn more about that, you can check out this blog post.
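
The fix itself, sketched with placeholder index and database names (REINDEX ... CONCURRENTLY is available from Postgres 12 onward):

```sql
-- Rebuild a single affected index without blocking writes
REINDEX INDEX CONCURRENTLY my_text_index;

-- Or rebuild all indexes in the current database
REINDEX DATABASE CONCURRENTLY mydb;
```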

Next piece of content- "Timescale Cloud Tips: Migrate Your PostgreSQL Production Database Without Downtime". This is from timescale.com, and they're talking about a different way to do an upgrade of your database or a migration of your database instead of using things like pg_upgrade or logical replication, they're advocating writing to the new database and the existing database in parallel and then just backfill the data. Now, this post is about TimescaleDB, but it is also applicable to Postgres. But the thing you have to keep in mind that they don't really mention here is how you handle updates and deletes.

So that's another consideration they didn't really cover, and maybe that's because they're recording time series data, which essentially only appends data. Maybe there aren't a lot of updates happening, so that could be why they didn't address it in this post. But if you want to consider this type of upgrade, you're definitely going to have to figure out how you'll handle the deletes and updates. For that reason, it may actually be easier to do a logical replication upgrade as opposed to this simpler method of doing the transition. But if you have append-only data, this option works great. You can check it out if you want to learn more.

Next piece of content- "Fill Gaps in Statistical Time Series Results". This is from sqlfordevs.com, and he's talking about a very simple technique you can use to fill in gaps in a time series. Like maybe you want to make a graph and if there's no data for a time series, you don't want the value to be empty, you want it to say zero, for example. Well, what you can do in Postgres, as he demonstrates here, is use the generate_series function to generate the series of days you want and then just join it to your data. And if there are any nulls or gaps in that data, it'll just appear as a zero because again, he's coalescing it and setting it to zero when it's null. So this is a great way to solve that problem in the database as opposed to doing code in your application. He even shows a MySQL way to do it as well, using a recursive CTE.

Next piece of content. Postgres FM had its next episode. This one is on "Intro to query optimization", so if you want to listen to that, you can definitely check it out or click the icon down here for the YouTube video version of it.

Next piece of content. The PostgreSQL person of the week is Douglas Hunley. If you're interested in learning more about Douglas and his contributions to Postgres, definitely check out this blog post.

The last piece of content. We did have another episode of The Rubber Duck Dev Show this past Wednesday evening. This one was on "Going Off The Rails" with Drew Bragg. So basically, this is about developers who get together around a particular framework, and what happens when you veer off of that framework and start going essentially off the rails it provides. Is that a good idea? Is that a bad idea? We talk through that process. So if you're interested in that, we welcome you to check out our show.
