Connection Scaling, Cached Sequences, Message Queuing, Harmful Subtransactions | Scaling Postgres 181

In this episode of Scaling Postgres, we discuss connection scaling, how to cache your sequences, a message queuing implementation and the harms of subtransactions.

Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "When Less is More: Database Connection Scaling". This is from richyen.com and he's talking about the scenario where you have the max_connections setting in Postgres. Sometimes, people who manage the database want to set it very high, even though they may not be using all of those connections. The thing to keep in mind is that Postgres is process-based. Even on a pretty big machine, he uses the example of one with 128 CPU cores, possibly hyperthreaded, you have some limit on how much work you can do at any given time. So setting max_connections to 10,000, 20,000, or 30,000 doesn't make a lot of sense because you won't be able to use them all. But there's a problem even if those connections are all idle, and that's what he mentions in the next section.

Even when the connections are all idle, there's still work that needs to be done by the shared components of Postgres to maintain the fact that these connections could potentially be used. This was demonstrated in a great post a number of months ago that is linked in the conclusion. I highly suggest you check out that post if you haven't already. It talks about making improvements to the scalability of Postgres connections and shows that when you have a high number of unutilized connections, your performance drops as you scale up those idle connections. That's what this post does here: it essentially recreates that test. In his example, he was reaching 1,500 transactions per second with a very low number of idle connections. Then, as he increased the number of connections, you can see the transactions per second of the running workload started to decline.

Somewhere between 10,000 and 15,000 connections, the throughput basically dropped to a third of what it was when there were fewer connections. Now, this was mitigated somewhat by removing autovacuum from the picture, but it eventually caught up with that case as well. So the key takeaway is to not set your max_connections setting higher than you need, because you could potentially run into this problem. That's another reason to check out the linked post, because they have made some improvements to Postgres coming in version 14. In the future, they hope to push the number of supported connections out even further and alleviate some of these problems. But of course, you can always rely on a connection pooler such as pgBouncer to support many thousands of client connections without having to set max_connections too high on your actual database system. But if you want to learn more, you can check out this blog post.
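
As a quick aside, if you want to see how many of your own connections are actually sitting idle, a couple of quick commands like these give a rough picture (this is just a sketch, not something from the article):

    -- Count client connections by state (active, idle, idle in transaction, ...).
    SELECT state, count(*)
    FROM pg_stat_activity
    WHERE backend_type = 'client backend'
    GROUP BY state
    ORDER BY count(*) DESC;

    -- And the configured connection limit.
    SHOW max_connections;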

The next piece of content- "UUID or cached sequences?". This is from dev.to in the Yugabyte section. He's talking about where people decide whether to use a UUID or an integer associated with a sequence to have as a primary key for their tables. In deciding between them, a lot of people choose UUID for its scalability, but he makes the argument that integers with sequences can be pretty darn scalable as well. Particularly if you use bigints for the integers as well as sequences have the ability to cache values. So what that means is that you can have a session store in memory a certain number of sequence values, not just one. So when you request a sequence, it doesn't give you one sequence number, it gives you say, the next 50. Or in his case, what he's potentially advocating here is, essentially, two bytes worth, around 32,000 sequence numbers per request.

These will all be cached in memory for that particular session, so it can very quickly grab the next one without asking the central sequence generator. He shows you how to implement this and has an example with Python code. You can see the first thread pulls those 32,000 sequence numbers, thread four starts at around 32,000, and thread five starts at about 65,000. The next set, thread two, is at about 98,000, and thread three is at about 131,000. So each thread is pulling two bytes' worth of sequence numbers and doesn't have to request more from the central sequence generator until it has exhausted the range that it initially retrieved. Now, of course, one downside of this is that if your session gets disconnected, you won't be able to use those cached sequence numbers again.

You would have to do another sequence request and get another 32,000. So if you're constantly making connections and disconnections, this won't work as a great solution. But if you have long-running connections to the database, using some sort of connection pooler, this could be a solution to give you scalability with sequences. Now, in my database applications, I tend to use integers for my primary keys and I only use UUIDs when a unique ID is generated within some application, like a JavaScript frontend. That's the only time I would use a UUID for that purpose: basically, when it's an ID generated outside of the database. But if you want to learn more about this, you can check out this post.
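
If you want to try this out, a cached sequence is just a matter of the CACHE clause. Here's a minimal sketch, with table and sequence names of my own rather than anything from the article:

    -- Each session that calls nextval() pre-allocates a block of 32767 values
    -- and serves the rest of that block from memory.
    CREATE SEQUENCE order_id_seq AS bigint CACHE 32767;

    CREATE TABLE orders (
        id         bigint PRIMARY KEY DEFAULT nextval('order_id_seq'),
        created_at timestamptz NOT NULL DEFAULT now()
    );

    -- Only goes back to the shared sequence once the cached block runs out.
    SELECT nextval('order_id_seq');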

The next piece of content- "Devious SQL: Message Queuing Using Native PostgreSQL". This is from blog.crunchydata.com. They're talking about wanting to set up a message queuing system within PostgreSQL. So basically, they want to be able to capture some inputs stored in a table and then be able to have some processor worker pull off that work without conflicting with other workers and then be able to delete it. So their general requirements here are a table to hold events or items to be processed, something to enqueue or put the items in the table, something to dequeue and consume these items and do so without locking. So if you have multiple workers that are dequeuing the work to be done, make sure they don't step on each other's toes. So to set up the table, they set up this basic queue table with the ID as the primary key, a timestamp as well as a payload to contain what needs to be processed.

Now, to select what rows are going to be processed by each worker, he came up with a query that selects rows from the queue table, limits it to ten, so you get ten records at a time, and uses FOR UPDATE. That means those rows get locked while you're pulling them. But the problem is that this locks the rows and prevents others from pulling from the table at the same time. He did it as part of a transaction and you can see that the second backend essentially hangs; it can't do anything. The way you get around that is with SKIP LOCKED. So instead of just FOR UPDATE, you do FOR UPDATE SKIP LOCKED. That way each backend will pull its ten rows and skip over any that are already locked. So backend one can pull the unique ten rows it needs and backend two can pull a different ten. That's a way to pull the data off.
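
The locking pattern described looks roughly like this (a sketch, assuming the hypothetical queue_table above):

    BEGIN;

    -- Grab up to ten rows that nobody else has locked; rows already claimed
    -- by another worker are simply skipped instead of blocking.
    SELECT *
    FROM queue_table
    ORDER BY id
    LIMIT 10
    FOR UPDATE SKIP LOCKED;

    -- ...process the rows, then COMMIT (or ROLLBACK) to release the locks.
    COMMIT;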

But then the question is: how do you then delete that data? What's the most efficient way to do it? The way they came up with is to use a DELETE statement with a join, returning the rows that are deleted, and that is what gets consumed. So, for example, you do a DELETE FROM the queue table, so it's a DELETE statement. Then you use USING to effectively join to the SELECT statement, because it's the SELECT statement that limits it to ten rows and does the FOR UPDATE SKIP LOCKED. So the USING clause essentially gives you a join that provides the limit and the row locks and determines what will be deleted.

Then the WHERE clause matches the USING subquery's ID to the queue_table ID, and RETURNING gives back the entire contents of the deleted queue_table rows. So you're getting all of those rows back from the DELETE statement. Essentially, this is a way to delete the rows that you need and get all of the data from those deleted rows to be processed by the worker. If there's any issue with the worker or something crashes, this should be rolled back and those rows would still be available for the next worker to pick up and process. So overall this is a pretty simple implementation of how you could do a message queue within PostgreSQL without any additional applications or tools. So if you're interested in that, you can check out this blog post.
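
Putting the pieces together, the dequeue step described looks roughly like this (again a sketch against the hypothetical queue_table, not the article's exact query):

    -- The subquery takes the row locks (limited to ten, skipping locked rows),
    -- the DELETE joins to it via USING, and RETURNING hands the deleted rows
    -- back to the worker for processing.
    DELETE FROM queue_table
    USING (
        SELECT id
        FROM queue_table
        ORDER BY id
        LIMIT 10
        FOR UPDATE SKIP LOCKED
    ) AS batch
    WHERE queue_table.id = batch.id
    RETURNING queue_table.*;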

The next piece of content- "PostgreSQL Subtransactions Considered Harmful". This is from postgres.ai and they're talking about subtransactions. So subtransactions are transactions within existing transactions. So when you have a transaction, you can actually fire off another subtransaction by doing a save point and that allows you to roll back to a given save point. If you want to save, work in the process of a long transaction. Here's a more visual cue where you have the main transaction and within it, you can have a save point, which is the subtransaction that you can choose to roll back to or not. Now, in terms of who is using subtransactions, he mentions a number of popular object-relational mapping frameworks such as Django, SQL Alchemy, Ruby on Rails with its Active Record, Sequel, et cetera. There's a number of them that use that.

The other thing to keep in mind is that if you're using PL/pgSQL code, an exception block also uses a subtransaction. Now, there are a number of problems you can encounter when using subtransactions. The first one he mentions is XID growth, that is, transaction ID growth, because every time a subtransaction does work, it consumes a new transaction ID. So if you have a lot of these being generated, you're going to dramatically increase the number of transaction IDs that you're using in the system, and you want to be cautious of that because autovacuum may have trouble keeping up and freezing those old XIDs. The second issue is a per-session cache overflow. By default, a given session caches 64 subtransactions, so each session can have one transaction happening with up to 64 subtransactions tracked within it. If you exceed that, you get a dramatic decrease in performance, and he shows some examples of that here.
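
To make the PL/pgSQL point concrete, here's a sketch of the kind of loop that can blow through that limit (the audit_log table is hypothetical). Every time a block with an EXCEPTION clause is entered, a new subtransaction starts, and because the INSERT does work, each one gets its own transaction ID:

    DO $$
    BEGIN
        FOR i IN 1..100 LOOP
            BEGIN
                -- each pass through this block opens a subtransaction
                INSERT INTO audit_log (note) VALUES ('iteration ' || i);
            EXCEPTION WHEN others THEN
                RAISE NOTICE 'iteration % failed: %', i, SQLERRM;
            END;
        END LOOP;
    END;
    $$;

By the time this single transaction has run 100 iterations, it has used well over 64 subtransaction XIDs.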

But there's also a great post from cybertec-postgresql.com called "POSTGRESQL SUBTRANSACTIONS AND PERFORMANCE" that you should check out, and it was covered on a previous episode of Scaling Postgres as well. The thing to keep in mind with this 64 limit is that if you have a loop in PL/pgSQL and you're doing an exception block within it, you could potentially hit this limit very quickly and cause performance problems for your application. The next problem he mentions is the unexpected use of multixact (multi-transaction) IDs. These are generally used when you're locking a row more than once, like the example of SELECT FOR SHARE, but you can also hit it with things like SELECT FOR UPDATE if subtransactions are used as part of it. He has some examples where someone had a lot of performance problems when they were basically doing a SELECT FOR UPDATE, creating a savepoint, and then updating the same row.

Because they were using Django, which automatically uses savepoints, they were running into a lot of problems. So that's something you need to be aware of when using subtransactions with things like SELECT FOR UPDATE. The next problem is subtransactions and SLRU overflow. In this case, it was basically a significant performance degradation on standbys caused by using subtransactions in a long-running transaction on the primary. They saw the standby stop responding, and it finally came back and started responding again once the transaction on the primary concluded. So that's an issue you could potentially encounter. He also mentions that the cause of Amazon going down during Prime Day 2018 was actually subtransactions.

Now, they were actually using Aurora Postgres, but still, they had this issue. He then gives some recommendations. First of all, you want good monitoring of things like transaction ID wraparound, wait event analysis from pg_stat_activity, and looking out for long-running transactions to try to identify potential issues. You can also check if you have subtransactions within your code base and look for cases where you have exception blocks in your PL/pgSQL code. His final recommendation is to experiment in order to predict and prevent failures by stress testing your systems. Now, he does say that there are a number of things being worked on to improve some of the areas he mentions here, but they are still in progress. But this was a very comprehensive post and if you're interested in this content, I highly suggest you check it out.

The next piece of content- "How to create (lots!) of sampled time series data with PostgreSQL generate_series()". This is from blog.timescale.com and this is a great post for learning how to use generate_series to generate a series of data for your tables for testing purposes or load testing purposes. So they start off very basically where you can say "generate a series using integers" and it just prints out a series of numbers. With a third parameter, you can specify what increments should be used when generating those numbers. You can even use dates. You can set a start date, you can set a stop date as well as a third parameter. Do the interval of how often to generate this series of date times. Now with that, you can also include other columns, of course, when you're generating a series.

So, for example, they print out a string alongside the series that was generated. You can even include random numbers to give you a sense of, say, CPU utilization for testing purposes. Then they show an example of how you can do Cartesian products, basically a cross join, to be able to generate a lot of rows at one time. For every row on one side, you generate the number of rows indicated on the other, so ten rows times two gives you 20 rows. You can use this technique to generate many, many rows. Their final example kind of puts it all together: they're using generate_series to generate a timestamp, a generate_series of 1 to 4 for a device ID, and then a random number in the third column to simulate CPU usage. They do some other examples of how to use date times to do this as well. This was a great introductory post about generate_series, so if you want to learn more, I definitely suggest checking this out.
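
The combined query is roughly along these lines (my own version with hypothetical column aliases, not the post's exact query): one row per device per five-minute interval, with a random CPU reading.

    SELECT ts AS time,
           device_id,
           random() * 100 AS cpu_usage
    FROM generate_series(
             '2021-09-01 00:00:00+00'::timestamptz,
             '2021-09-01 01:00:00+00'::timestamptz,
             '5 minutes'::interval
         ) AS ts,
         generate_series(1, 4) AS device_id;   -- the comma is an implicit cross join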

The next piece of content- "Partitioning use cases with PostgreSQL". This is from anayrat.info. He's predominantly talking about declarative partitioning and the different features it has and when it's a good time to use them. So he talks about how it can support partitioning by list, hash, and intervals. It can do multi-level partitioning or nested partitioning. You can partition on multiple columns and they also use foreign and primary keys. Now, he mentions two mistakes here that people use when they want to do partitioning. One is "Partitioning is necessary as soon as the size is important" and he says that is not necessarily the case. It's not like once you have so many gigabytes or so many terabytes that you have to partition. However, I'll push back a little bit here because I think this is one area where I see partitioning frequently used when the management of the table becomes more difficult.

For instance, autovacuum, even after some optimization, still takes a very long time to run because there's just so much data in the table. Partitioning that table will make the management a lot easier, since vacuum can run much more quickly on, say, the active partitions than on ones that are much older, for example, if you've partitioned by date. So I think that is a prime case for partitioning. The second reason he mentions not to do it is partitioning to spread data over disks. I agree with this. With the advent of all the different cloud storage systems, spreading data over disks using partitioning is not a great use case, because you can easily spread your data over multiple disks with current hardware. The use cases for partitioning he does mention start with partitioning to manage retention.

So if you need to keep data for a certain period of time and you're partitioning by date, you can easily remove the old data without having to go in and run delete statements; you just drop the old partition. The next is partitioning to control index bloat. Any time that data is inserted or updated, there's a chance for an index to become bloated. Now, you can run REINDEX, but if there's a particular load on your table, I could see how it could be advantageous to partition such that it's easier to reindex the older partitions that are no longer active. The next one he mentions is partitioning for low cardinality. For me, this seemed similar to the reindex use case, so I haven't seen a lot of reason to partition for this.

Generally, I would rely on a partial index to cover use cases such as this. The next area he mentions is partitioning to get more accurate statistics. I could see this as a use case if you want to collect statistics at a more granular level due to the layout of your data; partitioning may assist with that. He also mentions the benefits of partition-wise joins and partition-wise aggregates, although the performance benefit there is not significant, so I don't know if that alone would be a reason to do it. The last reason he mentions is storage tiering: being able to put, say, recent partitions on very fast storage and older partitions on slower storage. I definitely see that as a use case. But if you want to learn more, you can definitely check out this post.
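
For the retention use case in particular, the pattern looks something like this (a minimal sketch with table and partition names of my own):

    -- A table partitioned by date range.
    CREATE TABLE events (
        id         bigint NOT NULL,
        created_at timestamptz NOT NULL,
        payload    jsonb
    ) PARTITION BY RANGE (created_at);

    CREATE TABLE events_2021_08 PARTITION OF events
        FOR VALUES FROM ('2021-08-01') TO ('2021-09-01');

    CREATE TABLE events_2021_09 PARTITION OF events
        FOR VALUES FROM ('2021-09-01') TO ('2021-10-01');

    -- Retention: drop the oldest partition instead of running DELETE statements.
    DROP TABLE events_2021_08;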

The next piece of content- "Logical Replication Tablesync Workers". This is from postgresql.fastware.com and they're talking about some improvements that have been made to Postgres with regard to logical replication in the publisher/subscriber model. This goes into quite a bit of detail about the apply worker and tablesync workers. It's all, essentially, the internal workings of logical replication and enhancements that have been done to potentially improve the initial copy time for logical replication and error handling. So if you want to learn more about the internals and details of Postgres, you could definitely check out this blog post.

The next piece of content- "Solving the knapsack problem in PostgreSQL". This is from dev.to by Francesco Tisiot. He's talking about a way to solve the knapsack problem, which is basically you have a number of items that have particular values and each has a particular weight, and you have a knapsack essentially with a capacity of 20. So what are the best items to put into the knapsack and get the highest value? So with that, he uses a recursive CTE to solve this problem and here's the overall recursive CTE that is able to do that type of calculation for you. He walks through exactly how it works and shows you the results. Here where the combination with the most value is the socks, the hat, and the shoes. So if you want to learn more about how to do this within PostgreSQL with a recursive CTE, definitely check out this blog post.

The next piece of content- "POSTGRESQL: THE POWER OF A SINGLE MISSING INDEX". This is from cybertec-postgresql.com and this is a very long poem about a missing index in Postgres. So this is quite interesting. If you want to learn more about this poem about Postgres, I definitely suggest you check out this blog post.

The next piece of content, the PostgreSQL person of the week is Dmitry Dolgov. So if you're interested in Dmitry and his contributions to Postgres, you can check out this blog post.

The last piece of content. We had another episode of The Rubber Duck Dev Show this past Wednesday. This episode was on "Modern Web Application Front Ends". So if you're a developer and want some more long-form content, you can definitely check out our show.
