
psql Shortcuts, Debugging Deadlocks, Find & Stop Queries, Using SCRAM | Scaling Postgres 221


In this episode of Scaling Postgres, we discuss psql shortcuts, how to debug deadlocks, how to find & stop queries and how to understand and use SCRAM authentication.

Content Discussed

YouTube Video

Podcast Audio

Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "PSQL Helper: Managing Connections and Simplifying Queries". This is from elephas.io. He's talking about configuration changes you can make to psql, predominantly by altering some of the configuration files it looks at. So they're talking about having a .pgpass file that you can create in the home directory of wherever you're running psql, and it will consult that. That's an easy way to avoid having to fill out passwords, but you still have to put in this long connection string when you're connecting to particular databases. However, there's this other file called .pg_service.conf that you can configure to define different servers that you can connect to. This kind of looks like a config file for SSH, but basically, you give a particular database a name in square brackets, and then you can define the host, DB name, port, user, password, application name, and all the different parameters that you can pass into psql.

You just state them here. So to connect to the service, you just start psql and pass in an environment variable called PGSERVICE, setting it equal to the service you want to connect to. Now, I've used .pgpass, but I haven't used .pg_service.conf, probably because each client I tend to do consulting with has their database on one system and I'm usually just dealing with that one database, so I just connect to it. The next file they mention is the .psqlrc file. This of course allows you to adjust different configuration settings of psql, but it also enables you to create something like an alias. So for example, they did \set activity and gave it a particular query. And now just by typing :activity, you can run that particular query. So this is a great way to create aliases for frequently run queries for your database. And if you want to find out more, definitely check out this blog post.
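
To make that a bit more concrete, here is a rough sketch of what those files can look like; the service name, connection details, and the :activity query are hypothetical examples, not taken from the post.

    # ~/.pg_service.conf -- define a named service (all values here are made up)
    [reporting]
    host=db.example.com
    port=5432
    dbname=reports
    user=report_reader

    # connect to it by setting the environment variable, e.g.:
    #   PGSERVICE=reporting psql

    -- ~/.psqlrc -- create an alias for a frequently run query
    \set activity 'SELECT pid, state, wait_event_type, query FROM pg_stat_activity;'
    -- typing :activity at the psql prompt then runs that query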

Next piece of content- "DEBUGGING DEADLOCKS IN POSTGRESQL". This is from cybertec-postgresql.com. They're talking about deadlocks, where one session is holding a lock of some sort. The other one tries to access the same lock, but each is locking something different and the system can't resolve the lock contention. So it basically results in a deadlock. Now, for the example they've shown here, this is actually a type of deadlock situation I haven't necessarily encountered. When I see deadlocks, it's normally because there's a long-running process that acquires locks on potentially multiple tables. Doing a body of work, essentially, hasn't been ordered appropriately, so maybe one worker starts from the beginning and starts going through its locking process.

Another worker takes the reverse sort order and goes through its process, and eventually one process ends up locking something the other needs at the same time that it's holding a lock on something the other process needs. So 95% of the time in my consulting, I've been able to resolve deadlocks by just telling the clients to order the work that they're doing, and that essentially resolves the deadlocks. Now there is ordering involved with this one, but it's a little bit different. So basically they had a parent table and they created a child table. They started a transaction and inserted a particular value into the child table. Now this actually takes a lock, because there's a foreign key constraint referencing the parent table, so that parent row gets locked to keep it from being deleted. Then a second session starts and it inserts another child value. This will create another shared lock on that parent table row, but it won't be blocking.

But then it's going to lock that same parent row with a FOR UPDATE. And when they try to do the same thing in session one, you get a deadlock; the locks can't resolve themselves. Now, you'll see this kind of information in the log and it tells you the exact statements that caused the deadlock: it was this one and this one. But normally these would work fine. You can request FOR UPDATE locks on a particular row from different sessions; they just wait for each other. The problem is that each session already holds an existing lock and is trying to lock the same thing. But that information, the fact that it's holding a lock due to this INSERT against a foreign key that references the parent, is not listed in this log output. So it becomes hard to debug what's causing this type of deadlock.
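
A rough reconstruction of that sequence looks something like this; the table and column names are mine, not necessarily the ones used in the post.

    -- one parent row referenced by child rows via a foreign key
    CREATE TABLE parent (pid bigint PRIMARY KEY);
    CREATE TABLE child  (cid bigint PRIMARY KEY, pid bigint REFERENCES parent (pid));
    INSERT INTO parent VALUES (1);

    -- session 1
    BEGIN;
    INSERT INTO child VALUES (100, 1);               -- key-share locks parent row 1

    -- session 2
    BEGIN;
    INSERT INTO child VALUES (101, 1);               -- also key-share locks parent row 1
    SELECT * FROM parent WHERE pid = 1 FOR UPDATE;   -- blocks, waiting on session 1

    -- back in session 1
    SELECT * FROM parent WHERE pid = 1 FOR UPDATE;   -- waits on session 2: deadlock detected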

Now they say for this particular scenario, if you do a SELECT FOR NO KEY UPDATE instead, that will actually allow you to avoid the deadlock. But of course, this post is talking about how to debug deadlocks. One approach is doing a lot of logging on the application side to determine what transactions have led up to the deadlock, because it's really the historical locking that eventually results in a deadlock. Another way to try to figure out what's going on is annotating the queries on the application side. So putting in comments about where this code came from in the application to understand how it got locked and how this query is being run.

Another is trying to do more logging on the database side by altering the log_line_prefix and logging all statements. But that can be a performance dog; it definitely makes things very slow. Another interesting solution I hadn't explored before is actually increasing the deadlock_timeout. I mean, I tend to keep it at 1 second, but they're talking about extending it to 15 minutes. Maybe that's a bit long, but set it long enough that you can actually do a diagnosis to figure out what's locked. You could look at the pg_locks view and try to figure out what locks exist that are causing this issue. But even with these techniques, it can be hard to determine what's causing a deadlock. But again, like I said, 95% of my cases are ones where something needs to be processed in an ordered fashion. And once that's done, it usually avoids the deadlock situation. But check out this blog post if you want to learn more.
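
As a minimal sketch of that deadlock_timeout approach, assuming you have superuser access and can tolerate a long timeout on this system while you investigate:

    -- give yourself time to inspect the situation before the deadlock detector kills a session
    ALTER SYSTEM SET deadlock_timeout = '15min';
    SELECT pg_reload_conf();

    -- while the sessions are stuck, see who holds and who waits for which lock
    SELECT pid, locktype, relation::regclass, mode, granted
    FROM pg_locks
    ORDER BY granted, pid;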

The next piece of content- "How to Find and Stop Running Queries on PostgreSQL". This is from adamj.eu and basically he talks about how you can stop queries in two steps. The first thing is you find the pid and typically you would consult pg_stat_activity for that to find the right process ID. Then you can call one of two commands, either pg_terminate_backend, that essentially closes the whole database connection to the server for that client. The other option is pg_cancel_backend. That will just cancel the running statement on that client. It would still keep that session open so the database session doesn't get canceled, only the running query, whereas _terminate terminates the whole session. Then he looks at the different ways you can cancel sets of queries that potentially may be blocking.

Then he looks at different ways you can cancel sets of queries that potentially may be blocking: basically, consulting pg_stat_activity and then rendering out, say, the pg_terminate_backend or pg_cancel_backend commands. The first one he looks at is queries blocking a particular process. So in his example, maybe you want to ALTER a table, but there are things blocking it. Well, you can find the process ID of that statement, then use the pg_blocking_pids function with that ID, and it will find all the processes that are causing it to be blocked, and then you terminate those. Then he looked at queries against a particular table, a way to cancel those, or connections open longer than some number of seconds.
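
Hedged sketches of those patterns might look like the following; the pid and the five-minute cutoff are arbitrary examples.

    -- terminate everything that is blocking pid 12345 (say, a stuck ALTER TABLE)
    SELECT pg_terminate_backend(blocker.pid)
    FROM unnest(pg_blocking_pids(12345)) AS blocker(pid);

    -- or cancel all statements that have been running longer than five minutes
    SELECT pg_cancel_backend(pid)
    FROM pg_stat_activity
    WHERE state = 'active'
      AND now() - query_start > interval '5 minutes';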

Now, he also mentions that you can stop queries through the operating system. You can send a kill -SIGTERM pid signal, which essentially terminates the database connection, or a kill -SIGINT pid command, which cancels the active running statement. He also mentions the Windows equivalents here as well. He does mention killing query processes with the kill -KILL pid command, and that is definitely not advisable. He doesn't point this out here, but that basically causes downtime, because the database system hasn't had a chance to get to a consistent state and you potentially may have shared memory inconsistencies, so it has to restart the whole system. That could take seconds or minutes depending upon the state of your system when you run it, even against just a single database connection. So you definitely don't want to do this, or I would advise against it. But if you want to learn more, you can definitely check out this blog post.
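
For reference, the Linux versions of those signals look roughly like this; 12345 is again a placeholder pid.

    # cancel the active statement (same effect as pg_cancel_backend)
    kill -SIGINT 12345

    # terminate the backend and its connection (same effect as pg_terminate_backend)
    kill -SIGTERM 12345

    # do NOT SIGKILL an individual backend; it forces the whole server through crash recovery
    # kill -KILL 12345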

The next piece of content is actually a YouTube video and it is "Get Your Insecure Postgresql Passwords to SCRAM!". And this is on the San Francisco Bay Area PostgreSQL Users Group channel. This talk goes into a lot of detail about SCRAM, how it's implemented within Postgres and the different processes it goes through to do password authentication. And basically, SCRAM allows a client and a server to maintain a secret, but then do authentication where that secret is never passed across. They use cryptographic techniques to authenticate one another without actually sharing the secret. And he goes into a lot of depth explaining that. Frankly, the first 35 minutes of the talk is about that, and then there's about 1 minute of talk about how to start using it. So it's very easy to get started using SCRAM or convert from using MD5 to SCRAM as long as you don't have too many users. But this is a great piece of content if you want to learn more about how SCRAM works.
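
The usual steps for that conversion, sketched from my understanding rather than quoted from the talk, look something like this:

    -- 1. store newly set passwords as SCRAM verifiers instead of MD5 hashes
    ALTER SYSTEM SET password_encryption = 'scram-sha-256';
    SELECT pg_reload_conf();

    -- 2. have each user re-set their password (for example with \password in psql)
    --    so a SCRAM verifier gets stored for them

    -- 3. check which roles still have old MD5 hashes
    SELECT rolname FROM pg_authid WHERE rolpassword LIKE 'md5%';

    -- 4. once everyone is converted, change the auth method in pg_hba.conf
    --    from "md5" to "scram-sha-256" and reload the configuration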

The next piece of content- "PostgreSQL Upgrades are hard". This is from andreas.scherbaum.la. He's talking about a presentation that was done at Postgres Vision 2022, and there's a link to the YouTube video here, but this is a text-based review of that talk. And with regard to minor version upgrades, they basically say, well, those are pretty easy to do. It does require a restart of the database because you're just changing the binaries and there are no new features with minor version upgrades, usually just bug fixes and security upgrades. So it's pretty easy to do these types of upgrades. The next one is major version upgrades, and those take a lot of work to validate that everything works as expected.

In particular, you can run into issues with extensions, of course, because these extensions are essentially third-party tools, so you need to validate that all of those work on the new version, as well as decide on the particular upgrade process to use. Now, the default way to do upgrades of PostgreSQL is using pg_upgrade. That's what I use to do all of my Postgres upgrades, and I use the link method so that it converts things pretty quickly. It looks like they also talked a little bit about the support policy: right now the community supports five major versions, so essentially five years of versions. Apparently, there was a discussion on whether they should adopt a model like, I guess, Ubuntu, where they have long-term support releases, but I think five years is fine.

And then people should upgrade their database every couple of years because you get better performance and a bunch of new features. The next area they talk about is cluster awareness, and that is a downside of Postgres: you may have one primary and three replicas, but when you're doing an upgrade, the replicas have knowledge of who the primary is, but not of the other replicas. So when trying to upgrade a whole cluster of servers, there's no easy way to do it. Now, there's a particular sequence you should adopt and follow, but there's not really such a thing as a cluster-aware upgrade. With regard to this, a thing to keep an eye on is Kubernetes. Different vendors' Postgres operators are offering techniques to actually do coordinated upgrades, using the operator in Kubernetes to coordinate version changes.

I think the Crunchy Data Postgres operator mentioned that in a previous article that they published. Another consideration they talked about is the downtime required for an upgrade. So that's definitely always a consideration. Then they talked about different upgrade strategies. There's running pg_upgrade in link mode, which is what I tend to do. You can run pg_upgrade in copy mode, but that takes forever. They talked about logical replication as a solution, so basically logically replicating from the old version to a new one. Now this takes a lot of time and testing to do correctly, but it is another viable route.
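
For reference, the pg_upgrade link-mode run mentioned above looks roughly like this; the paths and version numbers are hypothetical and will differ on your system.

    # dry run first with --check, then repeat without it for the real upgrade
    pg_upgrade \
      --old-bindir  /usr/lib/postgresql/14/bin \
      --new-bindir  /usr/lib/postgresql/15/bin \
      --old-datadir /var/lib/postgresql/14/main \
      --new-datadir /var/lib/postgresql/15/main \
      --link --check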

They talked about Slony, which uses a trigger-based mechanism. Frankly, I'd probably choose logical replication over Slony at this point. They say here that the development of Slony stopped a while ago, so that's another reason to probably avoid it. And then, of course, there's pg_dump and pg_restore. If you have a small database, this is definitely a viable path because it's super easy to do. But if you have a larger database, then you're probably looking at pg_upgrade. So you can definitely check out this blog post, or use the link above to look at the YouTube video if you want to get all the details.

Next piece of content- "Ordinality in function queries". This is from fluca1978.github.io and he's talking about the clause with ordinality. So when you develop a function and it returns a table, in this case, he's returning a three-column table from this function that he developed. And when you actually call that function, so SELECT all from the function that he created, it outputs this data. That is a table with three columns. If you use the WITH ORDINALITY operator, it actually adds an additional column called ordinality and it returns a bigint in an iterative order for each row.

Now this is identical to the PK column here, you'll see, but he just fabricated this PK column and it just happens to be identical to the ordinality column. Then he showed how the ordinality operates on the output of the function. So you can see that when you order by random, the ordinality values still stay in alignment with what the primary key field says. You can also give the ordinality column an alias. So you can actually rename all the columns here and you can still do WHERE and ORDER BY against it to pull out the exact data that you want. So in this case, he's only looking at the even rows. So if you want to learn more about this clause, definitely check out this blog post.
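
If you haven't used WITH ORDINALITY before, a generic example, not the exact function from the post, looks like this:

    -- number the rows coming out of a set-returning function
    SELECT t.word, t.n
    FROM unnest(ARRAY['alpha', 'beta', 'gamma', 'delta'])
         WITH ORDINALITY AS t (word, n)
    WHERE t.n % 2 = 0     -- keep only the even rows, as in the post
    ORDER BY t.n;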

The next piece of content- "H3 INDEXES FOR PERFORMANCE WITH POSTGIS DATA". This is from rustprooflabs.com, and they're talking about the H3 hex grid extension, which helps map out data using Hexagons. This particular post looks at specific columns like an h3_ix column, offered by the H3 extension as well as an H3 index on a particular table. And it actually helped accelerate nearest neighbor style searches as well as regional analysis. Looking at the conclusions below, they saw nearest neighbor searches performed 73-77% faster with these H3 indexes and their regional analysis was 99% faster. So definitely a benefit of using this extension in these particular types of indexes. But if you want to learn more, definitely check out this blog post.

Next piece of content- "5mins of Postgres E23: Fuzzy text search & case-insensitive ICU collations in Postgres". This is from pganalyze.com and he goes over in depth the fuzzy text search article and case-insensitive ICU correlations that we mentioned last week on Scaling Postgres. So if you want Lukas's perspective on it, definitely check out this piece of content.

The next piece of content- "How We Made Data Aggregation Better and Faster on PostgreSQL With TimescaleDB 2.7". This is from timescale.com, and primarily what this post is talking about is the feature called continuous aggregates that's in Timescale. Now by continuous aggregates, my interpretation of that is that you have a time series database, maybe you have three years of data in it, and you want to run a query frequently during the day for three years of data, including the most recent one. Well, a continuous aggregate, I'm assuming you are maintaining an aggregate of that three years of data, maybe minus one day or minus two days, or they're bucketed in such a way that historical data is pre-aggregated.

So you just have to throw buckets of data together quickly, or it's all in one bucket and then you just add the most recent data. I'm not sure about the implementation, but it's a way to maintain these aggregates so you don't have to scan the whole three years of data. So they've added a new enhancement to this feature that has led to some pretty impressive speed improvements. On the low end, it looks like a 2.6x improvement with a particular workload, but it goes as high as thousands of times: there's an over 1,000 times faster aggregation result and a 44,000 times faster aggregation result, as well as some smaller ones. But if you want to learn more about this new feature and some of the benefits, definitely check out this blog post.
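
For context, defining a continuous aggregate in TimescaleDB looks roughly like this; the hypertable and column names are made up, and the speedups in the post come from internal changes in 2.7, not from anything in this definition.

    -- "readings" is assumed to already be a TimescaleDB hypertable
    CREATE MATERIALIZED VIEW daily_temperature
    WITH (timescaledb.continuous) AS
    SELECT time_bucket('1 day', recorded_at) AS bucket,
           device_id,
           avg(temperature) AS avg_temp
    FROM readings
    GROUP BY bucket, device_id;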

The next piece of content: the PostgreSQL person of the week is Hervé Schweitzer. If you're interested in Hervé and his contributions to Postgres, definitely check out this blog post.

The last piece of content, we did have another episode of The Rubber Duck Dev Show this past Wednesday evening. This one was on "Ruby 3 Keyword Arguments". So if you're interested in the changes made with keyword arguments in Ruby 3, we definitely welcome you to check out our show.
