
SQL Functions, Explain Analyze Buffers, Debug Autovacuum, RLS Multi-Tenancy | Scaling Postgres 226


In this episode of Scaling Postgres, we discuss PG14's new SQL function syntax, including buffers when doing explain analyze, how to debug autovacuum and using row-level security to handle multi-tenancy.


Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "BETTER SQL FUNCTIONS IN POSTGRESQL 14". This is from cybertec-postgresql.com and they're talking specifically about SQL functions. So not procedural-language functions, but pure SQL functions, meaning all they contain is Structured Query Language. Basically, in version 14 there was an enhancement that improved error handling as well as security. Typically, you would have a function that looked like this. In this example, he wants to create a new function called mangle that does a lower() of an unaccent() of a given text. So he has to create the unaccent extension. Then he creates this function that returns text, using the language of SQL, as this statement. And when you run the function, it unaccents the text as well as representing it in lowercase.
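As a rough sketch, the traditional style of definition he describes looks something like this (the parameter name here is my guess, not necessarily what the post uses):

```sql
-- Required for the unaccent() function
CREATE EXTENSION unaccent;

-- Classic SQL function: the body is just a string constant
CREATE FUNCTION mangle(t text) RETURNS text
    LANGUAGE sql
    AS 'SELECT lower(unaccent(t))';
```

Note that the body between the quotes (often written with dollar quoting, `$$...$$`) is stored as plain text and only parsed when the function runs.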

Now, this function body is stored as literal text; as he says here, "...the function body of a PostgreSQL function is simply a string constant". Essentially, that string is parsed again when it comes time to execute the function. But this has at least two issues. One, there's a lack of dependency tracking. For example, he shows a scenario here: if he just drops the unaccent extension that this function depends upon and you try running the function again, you're going to get an error. Ideally, with dependency tracking, it would say, oh, you can't drop this extension because it's being used in this function. But because the function simply stores a text string, Postgres can't do any dependency handling like that. Second, you can run into search_path problems depending on how the functions have been set up, so there are potential security issues around that.

But with Postgres 14, there's a new SQL function syntax where, as opposed to storing a string literal, the function body is quote "...parsed at function definition time and stored in parsed form in the prosqlbody column of the pg_proc system catalog". So basically it does the parsing ahead of time, which means it can do things like dependency tracking to make sure the function has everything it needs. And he shows that example here: when you drop the extension, it gives you an error saying you can't drop it. It also fixes some of these search_path issues. Now, this is a new way to define functions, and as he notes, your clients need to support it as well: if you're using older versions of psql or older versions of pgAdmin, they'll need to add support for this syntax before you can work with these new-style functions.

Now, our second post is related to this: "BEGIN ATOMIC: A BETTER WAY TO CREATE FUNCTIONS IN POSTGRESQL 14". This is from jkatz05.com and he's talking about the same new feature. He shows here how, again, functions were typically recorded as string literals; that's what the double dollar signs delimit. But if you use BEGIN ATOMIC, you'll see that there are no string literals: you simply say BEGIN ATOMIC, put one or more statements in, and then end it with END. He talks about the benefits of dependency tracking as well as creation-time parsing checks that catch additional problems that can occur. So this is a new feature I was not aware of in version 14, and you can check both of these posts out if you want to learn more.
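Rewriting the earlier example in the new Postgres 14 syntax would look roughly like this (again, the function and parameter names are illustrative):

```sql
-- New-style SQL function: no string literal, the body is
-- parsed at definition time and stored in pg_proc.prosqlbody
CREATE FUNCTION mangle(t text) RETURNS text
    LANGUAGE sql
BEGIN ATOMIC
    SELECT lower(unaccent(t));
END;
```

Because the body is parsed up front, attempting `DROP EXTENSION unaccent;` now fails with a dependency error instead of silently breaking the function.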

The next piece of content- "Using BUFFERS for query optimization". This is from pgmustard.com and they're saying that when someone does optimization work, they use EXPLAIN ANALYZE and track the timings of the different steps. But they suggest that this has a few weaknesses. One, timings vary from run to run, which is absolutely true, so you have to kind of get an average. Two, they are dependent on the cache, which is also true. And three, timings alone can hide efficiency issues, for example with parallelism; it doesn't necessarily hide them, but it may be harder to discern how many loops are happening with something. Whereas asking for buffers as well when you do an EXPLAIN ANALYZE can help give a sense of how much I/O is being done and let you know how much is being served from the cache. It should also be relatively more consistent, since it reflects how much data you need to pull back, and it avoids some of those timing issues.

They go into what the different buffer statistics mean, looking at the shared blocks area, the temp blocks area, and the local blocks area, as well as what a hit, a read, and a dirtied block are, and when a block has been written. And they give an example here where they do an EXPLAIN ANALYZE on a particular table and you can see it is doing an index scan. The buffers in the shared area show a hit of three, so that data was found in the shared buffers, but it did have to read one block from the disk. They feel that using buffers can help with: one, spotting when operations are doing way more I/O than you expected; two, getting a sense of the total I/O of the query; three, spotting operations spilling to disk, although a lot of times that's pretty obvious; and four, spotting signs of cache performance problems. So I would say the buffers definitely help optimize your queries, because any additional information you have is beneficial. If you want to learn more about that, you can definitely check out this blog post.
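To see these statistics yourself, you just add the BUFFERS option; a minimal sketch (the table and column here are made up for illustration):

```sql
-- BUFFERS adds block-level I/O counts to each plan node
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE order_id = 42;

-- The plan output then includes lines like:
--   Buffers: shared hit=3 read=1
-- hit  = blocks found already cached in shared_buffers
-- read = blocks that had to be fetched from outside shared_buffers
```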

The next piece of content- "Debugging Postgres autovacuum problems: 13 tips". This is from citusdata.com and they're talking about addressing issues that come with autovacuum. This is similar to other vacuum and autovacuum posts that we've had, so some of the content will be familiar, but they cover three main problem areas and some possible fixes for each. So what do you do when vacuum is not being triggered, or your table is not being vacuumed? Well, one, you can decrease the autovacuum scale factor, which reduces the number of rows that need to be changed for a vacuum to kick off. You can set this globally or for a specific table. Also, you can decrease the autovacuum insert scale factor.

So this is a configuration option in newer versions where you can separately set the scale factor for inserts. Next, what do you do if your vacuum is too slow? They say you can reduce the impact of cost limiting. Now, the main lever for this, in my opinion, is increasing your vacuum_cost_limit or your autovacuum_vacuum_cost_limit. You can also reduce your vacuum_cost_delay or your autovacuum_vacuum_cost_delay. Those two things allow vacuum or autovacuum to use more resources to vacuum a table faster. The next thing you can do is increase the number of autovacuum workers. Now, the issue is that the cost limit affects all workers; it's one setting shared by your whole pool of workers. So if you increase the number of workers, each worker will work more slowly, because the budget is split based on this global setting. So if you have a small number of very large tables, it makes sense to keep your autovacuum_max_workers low.
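The knobs mentioned so far can be sketched like this; the specific values are illustrative only, not recommendations from the post:

```sql
-- Global changes (could also go in postgresql.conf); values are examples
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.05;        -- default 0.2: trigger sooner
ALTER SYSTEM SET autovacuum_vacuum_insert_scale_factor = 0.05; -- insert-driven trigger (PG 13+)
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;          -- allow more work per cycle
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '1ms';         -- shorter pause between cycles
SELECT pg_reload_conf();

-- Or override per table for a hot table (hypothetical name)
ALTER TABLE big_events SET (autovacuum_vacuum_scale_factor = 0.01);
```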

But if you have, say, a lot of partitioned tables that are relatively small, then it may make sense to increase your autovacuum_max_workers to be able to address those tables more quickly. Another area that helps increase vacuum speed is improving the scan speed through prefetching and caching. Now, I hadn't heard of this technique, but they're basically suggesting using pg_prewarm to get some of the heap into shared buffers, or just increasing the shared_buffers configuration so more of the heap is cached and vacuum doesn't have to pull as much from disk. The next way to speed things up is increasing memory to store more dead tuples.

So basically, increase your maintenance_work_mem so it can hold more of those dead tuples and doesn't have to do multiple index-cleanup passes within one vacuum operation. The other way to speed things up is to actually vacuum your indexes in parallel. Now, you can't do this with autovacuum, so it requires a manual vacuum, but a manual vacuum can process indexes in parallel, so you do have that advantage. You may also want to, as he suggests here, adjust your max_parallel_maintenance_workers to allow more parallelism in those vacuum operations. The third problem is that vacuum isn't cleaning up dead rows. So basically, something is holding onto a row and preventing vacuum from cleaning it up. You could have long-running transactions.

You could have long-running transactions on a standby that hold back cleanup because you have hot_standby_feedback on, since you have replicas that need it. Another possible cause is unused replication slots; generally, you would also see your disk space ever-increasing because WAL can't be cleaned up. Or uncommitted prepared transactions: if you're using two-phase commit and only the first phase has completed, that can't be resolved until the prepared transaction is committed or rolled back. So having these orphaned prepared transactions does cause vacuum problems, and that's something you can look at. This is another great post about how to configure vacuum and autovacuum for Postgres, so I definitely suggest checking it out.
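If you suspect one of these three culprits, a few catalog queries along these lines can help you check (the one-hour threshold is an arbitrary example):

```sql
-- Long-running transactions that can hold back vacuum's cleanup horizon
SELECT pid, state, xact_start, query
FROM pg_stat_activity
WHERE xact_start < now() - interval '1 hour';

-- Replication slots that are no longer active
SELECT slot_name, active, restart_lsn
FROM pg_replication_slots
WHERE NOT active;

-- Orphaned prepared transactions from two-phase commit
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts
ORDER BY prepared;
```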

The next piece of content- "Shipping multi-tenant SaaS using Postgres Row-Level Security". This is from thenile.dev and I was interested in looking at this post to see how they did it: is this a technique I would want to explore? But I really didn't see anything that persuaded me it's an option I would want to pursue. A lot of the performance concerns I've seen in so many other pieces of content still exist. Now, maybe things are changing with newer versions that will alleviate some of the performance concerns, but having this security overhead on every query still concerns me. And even in this post, further down, they say that their access policy is relatively simple at this time, but quote "More mature access control policies such as RBAC/ABAC require their own schema design and can be more challenging to integrate with RLS and even more challenging to make performant".

So I think the key thing is that I keep hearing, with regard to row-level security, about a struggle to make it performant. Now again, that's not to say you can't use it, but if you want something performant, maybe you would want to avoid it for multi-tenancy. I should also add that the technique they're using here is essentially storing the user in a session variable, which is read to determine what tenant they belong to. Now, as you scale, presumably you're going to have to start using some sort of connection pooler such as PgBouncer. And if you did that, you would be restricted to session pooling, which is definitely not as efficient as transaction pooling, where any particular transaction can be sent to any particular server connection. So with this technique, you could not use the more performant transaction pooling with PgBouncer; you'd have to stay on session pooling, which negates some of the benefits of PgBouncer.
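The general session-variable pattern they describe can be sketched like this; the table, policy, and variable names are hypothetical, not taken from the post:

```sql
-- Tenant-scoped table (illustrative schema)
CREATE TABLE documents (
    tenant_id bigint NOT NULL,
    body      text
);

-- RLS applies to non-owner roles; the application should connect
-- as a role without BYPASSRLS
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON documents
    USING (tenant_id = current_setting('app.current_tenant')::bigint);

-- The application sets the tenant once per session...
SET app.current_tenant = '42';
SELECT * FROM documents;  -- only tenant 42's rows are visible
```

Because the variable is session state, it is what ties you to session pooling in PgBouncer (with SET LOCAL inside a transaction being the usual workaround people discuss for transaction pooling).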

Now, this is the exact post that pganalyze.com covers in their "5mins of Postgres E28: Row Level Security, security invoker views and why LEAKPROOF functions matter". He talks about this post in general, and I highly suggest watching his episode as well. But he also covers some security considerations you should take into account, because most views will bypass row-level security, and you have to use some of the newer features, like security invoker views, to actually get it working well. He mentions how you need to be cautious with functions, and may even need to use specific LEAKPROOF functions. And with regard to performance, he shows someone posting that they tried using row-level security and ran into performance issues even with relatively simple queries. So again, there's that performance issue I keep hearing about with row-level security. But if you want to learn more about this, definitely check out both of these posts.

The next piece of content- "State of PostgreSQL 2020—13 Tools That Aren't psql". This is from timescale.com, and again, they're revealing some more results from the State of Postgres survey. They show here that psql is far and away the most-used tool for queries and administration tasks. But they also mention pgAdmin, DBeaver, DataGrip, IntelliJ, and a few others that they list further down. They also talk about the third-party Postgres tools used for application development: the top response is that they don't use any, and the second is PgBouncer, which is not surprising. People also use Depesz's explain, pganalyze, pgBackRest, pgMustard, and some others. And then they cover some visualization tools that people use, such as Grafana, pgAdmin, or DBeaver. But if you want to learn more about the results of the survey, you can check out this blog post.

The next piece of content is actually a YouTube channel: the Percona YouTube channel. This week they posted approximately 60 or 70 videos from Percona Live. So a lot of video content. Now, not all of it is Postgres-related; they cover a number of different database solutions, but there are numerous Postgres-specific talks as well, such as Oracle to Postgres or Solving Postgres Wicked Problems (although that one showcases another database). So if you're looking for some video content, I definitely suggest checking out this channel.

Next piece of content- "Partitioning with Native Postgres and pg_partman". This is from crunchydata.com and they're talking about partitioning: how to get started and the different benefits of partitioning, predominantly around data lifecycle and cost management. Basically, if you can partition and maybe archive to another area, your working data set can be much smaller. Also, even if you keep all of the data in the database, partitioning can make the vacuum process much easier, since vacuum maintains each of those smaller partitions individually. Now, there are performance benefits too; they're not as great as you might expect, but you do see some improvements. They talk about the different types of partitioning and how to actually get started, and then, in terms of managing those partitions, like keeping them maintained, creating new ones, et cetera, they cover the pg_partman extension as well. So if you want to learn more, you can check out this one.
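For reference, native range partitioning looks roughly like this minimal sketch (table name and date ranges are made up):

```sql
-- Parent table: rows are routed to partitions by created_at
CREATE TABLE events (
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

-- One partition per month; old months can be detached and archived
CREATE TABLE events_2022_06 PARTITION OF events
    FOR VALUES FROM ('2022-06-01') TO ('2022-07-01');
```

Creating and retiring partitions like these on a schedule is the kind of maintenance pg_partman automates.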

The next piece of content- "Running PostgreSQL on Docker". This is from endpointdev.com, and it's a pretty comprehensive post about running Postgres on Docker. It covers things such as handling persistent storage so that your data stays around after the container gets restarted, supporting port forwarding from the host to a container, and managing the container with Docker Compose. So if you're interested in that, you can check out this blog post.

The next piece of content: the PostgreSQL person of the week is Elizabeth Garrett Christensen. So if you're interested in learning more about Elizabeth and her contributions to Postgres, definitely check out this blog post.

The last piece of content, we did have another episode of The Rubber Duck Dev Show this past Wednesday evening. This one covered "The Agile Manifesto's 12 Principles". So if you're interested in that type of long-form developer discussion, we welcome you to check out our show.
