PL/Rust, Row Locks, Postgres Errors, SQL 2023 | Scaling Postgres 260
In this episode of Scaling Postgres, we discuss PL/Rust 1.0 release, row locks, Postgres errors and the new SQL 2023 standard.
- PL/Rust 1.0
- PL/Rust 1.0 and its trusted language mode
- Row locks in PostgreSQL
- Assigning Blame For PostgreSQL Errors
- SQL:2023 is finished: Here is what's new
- PGSQL Phriday #007
- Postgres 16 highlight - Control of SCRAM iterations
- Run AlloyDB anywhere - in your data center, your laptop, or in any cloud
- Time Ranges without Overlapping
- PostgreSQL Unlogged Sequences - Look Mum, no WAL!
- PostgreSQL: Pgpool-II Use Cases and Benefits
- Ultimate Guide to Citus Con: An Event for Postgres, 2023 edition
- Peter Zaitsev
- Katharine Saar
- Safety Nets For Rails Apps With David Teren | Rubber Duck Dev Show 80
All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is actually a release on GitHub of PL/Rust. So this is procedural language Rust, and it's an extension that runs in Postgres that lets you write user-defined functions in Rust. Now, what makes Rust a little different from some of the other languages is that it is compiled. And as they say here, this compilation can be slow, but once it's compiled, the execution is very fast. And they say, quote "...often x10 faster than PL/pgSQL!".
So essentially, whatever functions you want to write in Postgres, you can now write them in Rust with this new extension. Some of the things they mention are that it is statically typed, and another advantage of compilation is that it can actually catch a lot of errors before the function ever runs. So this is great news if you want highly performant Postgres functions.
And another piece of content that has a lot to say about this is Lukas in his "Five minutes of Postgres". And this is "PL/Rust 1.0 and its trusted language mode". This is from pganalyze.com. And he goes into a lot more depth on this release, including talking about its trusted language mode.
This means that any user can use Rust, but as a consequence, to be considered a trusted language, certain standard features of Rust had to be disabled for it to be able to run within Postgres. Because you're not supposed to allow trusted languages to access the operating system or Postgres internals. But if you want to learn a lot more about this, definitely check out this piece of content as well.
Next piece of content- "ROW LOCKS IN POSTGRESQL". This is from cybertec-postgresql.com. And this post specifically talks about row locks. Now they are different from table locks, and a row lock is of course held on a single row. But what you may not know is that, typically, you can't see them in the pg_locks view. So if you look in pg_locks on an active database, you can see all these different table locks that are occurring, and other types of locks, but not row locks.
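As a quick illustration, here's a minimal query you could run against pg_locks to see what it does show — table-level and other heavyweight locks, but not individual row locks:

```sql
-- Heavyweight locks (table locks, transaction locks, etc.) show up here,
-- but ordinary row locks on individual tuples do not:
SELECT locktype, relation::regclass AS relation, mode, granted
FROM pg_locks
WHERE relation IS NOT NULL;
```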
And they talk about how there are basically four different kinds of row locks: FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, and FOR KEY SHARE, the last of which is generally what foreign key constraints use. And what he says is that a lot of people use SELECT...FOR UPDATE to explicitly lock a row against concurrent updates. But most of the time, to improve your concurrency, what you really want to do is use FOR NO KEY UPDATE locks.
Because he says, quote "...so that you do not block inserts into tables referencing the table you're updating". Because a lot of times you're not going to be updating the key when you're doing these updates. And then he talks about where the row locks are stored. Now they're not stored in the shared memory lock table. They're actually stored on the row itself. That helps Postgres lock as many rows as it needs for an operation, but as a consequence, the table rows can become dirty and may then be written out to disk during a checkpoint.
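To make the distinction concrete, here's a sketch of the two locking modes he compares, using a hypothetical accounts table:

```sql
-- FOR NO KEY UPDATE: blocks concurrent updates and deletes of the row,
-- but still allows the FOR KEY SHARE locks that foreign key checks take,
-- so inserts into tables referencing this one are not blocked:
SELECT * FROM accounts WHERE id = 42 FOR NO KEY UPDATE;

-- FOR UPDATE: the strongest row lock; it also conflicts with FOR KEY SHARE,
-- so concurrent inserts into referencing tables will have to wait:
SELECT * FROM accounts WHERE id = 42 FOR UPDATE;
```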
So that is one small disadvantage of Postgres' implementation. Then he goes into a very detailed discussion on how Postgres acquires row locks and how the system works. I won't get into that here today, but I definitely recommend checking out this post to really understand how row locks work if you're interested. And he says that while row locks aren't usually present in the pg_locks view, there is an extension called pgrowlocks that lets you examine them. And he shows an example of that here. So if you want to learn more about row locks, definitely check out this piece of content.
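A minimal sketch of what using pgrowlocks might look like, again assuming a hypothetical accounts table:

```sql
CREATE EXTENSION pgrowlocks;

-- Session 1: take a row lock and hold it open:
BEGIN;
SELECT * FROM accounts WHERE id = 42 FOR UPDATE;

-- Session 2: inspect the row locks currently held on the table:
SELECT locked_row, locker, modes FROM pgrowlocks('accounts');
```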
Next piece of content- "Assigning Blame For PostgreSQL Errors". This is from rhaas.blogspot.com. And he's talking about the errors you'll see when you're looking in your PostgreSQL logs. It may be an error that a check constraint was violated. Or there could be an error that a file could not be opened because permission was denied. Now each of these is an error, but they're actually caused by different things. The first one is basically considered a user or programmer error, where invalid data was submitted to the database.
But this other type of error is considered more of a configuration error or maybe an operational error, because the file does not have the proper permissions for the query that was made to be handled. So this blog post goes into some of the other types of errors you can find in Postgres, primarily related to operating system problems. Now, I actually find this a bit frustrating because when I'm doing DBA work, I really don't care about the first class of errors, where it's essentially a programmer error or an application error where invalid data is sent.
That's not really an error in Postgres to me. Or it's a different class of error, and they're combined with the operational errors. So when I'm looking for errors of the class where, operationally, there's something wrong, I have to filter through all of these essentially application-level errors. Now this blog post doesn't talk about this, but personally, I think it would be great if there were a way to actually distinguish between these. But if you want to learn more, check out this blog post.
Next piece of content- "SQL:2023 is finished: Here is what's new". This is from peter.eisentraut.org. And in terms of what's new and the changes, it is a lot. So I can't hope to go over all the different changes because as you can see, they are substantial. So if you're interested in all of the changes that made it into the SQL 2023 standard, definitely check out this blog post.
Next piece of content. It's time again for PGSQL Phriday, and the deadline for writing a blog post is actually this Friday, April 7th. This month's topic is "Triggers". This post is from mydbanotebook.org. And with regard to triggers, the host wants to know "Do you love them? Do you hate them?". And basically, how do you use them and why. Also related to this is the announcement on pgsqlphriday.com, announcing this month's blog post topic.
Next piece of content- "Postgres 16 highlight- Control of SCRAM iterations". This is from paquier.xyz. And he's talking about how, with Postgres 16, you can now control how many iterations are done to generate and also authenticate SCRAM-based passwords. And what's interesting to note here is that 4,096 iterations is the Postgres default, whereas RFC 7677 recommends a default of 15,000 iterations. But of course, if you're going to increase it to essentially three times that amount, it's going to take longer to create those passwords as well as to do an authentication.
And so what he says here is basically that a low value of iterations means you can authenticate very quickly, but passwords are more sensitive to brute force attacks. Whereas a high value protects you from those attacks, but it takes a lot longer to authenticate, so making a connection to Postgres will take longer. And he demonstrates that down here where he runs pgbench, and with the standard 4,096 iterations, the TPS was 236.
And the average connection time was four milliseconds. Whereas with 10 million iterations, the TPS was 0.15 transactions per second, which is essentially nothing, and the average connection time took over six seconds compared to four milliseconds. So when you really ramp up the iterations, it can really slow things down. If you do raise them to protect against brute force attacks, he suggests using something like a connection pooler to keep connections around longer. But if you want to learn more, definitely check out this blog post.
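As a sketch of how this looks in Postgres 16, the iteration count is controlled by the scram_iterations parameter and applies when a password is set (the user name here is hypothetical):

```sql
-- Raise the iteration count for this session; passwords set afterwards
-- are hashed with this count and verified with it at authentication time:
SET scram_iterations = 15000;
ALTER USER app_user PASSWORD 'new-secret';

-- Check the current setting:
SHOW scram_iterations;
```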
Next piece of content- "Run AlloyDB anywhere- in your data center, your laptop, or in any cloud". This is from cloud.google.com. And AlloyDB is the Postgres variant that Google developed and they offer this as a cloud service. But now, with their new product called AlloyDB Omni, you can actually download it and run it anywhere. So if you're interested in that, you can definitely check out this blog post.
Next piece of content- "Time Ranges without Overlapping". This is from sqlfordevs.com. And he's essentially talking about exclusion constraints. So these are great when you're trying to reserve a room, a car, or things of that nature. And he's using booking rooms as an example. He uses a time range, the btree_gist extension, and creates an exclusion constraint using a GiST index to avoid overlapping room reservations. So if you want to learn more about how to do that, definitely check out this blog post.
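Here's a minimal sketch of the technique, with hypothetical table and column names:

```sql
CREATE EXTENSION btree_gist;

CREATE TABLE room_reservations (
    room_id int NOT NULL,
    reserved_during tstzrange NOT NULL,
    -- Reject any two rows for the same room whose time ranges overlap:
    EXCLUDE USING gist (room_id WITH =, reserved_during WITH &&)
);

INSERT INTO room_reservations
VALUES (1, '[2023-04-10 09:00, 2023-04-10 11:00)');

-- This overlaps the reservation above for the same room, so the
-- exclusion constraint rejects it with a conflict error:
INSERT INTO room_reservations
VALUES (1, '[2023-04-10 10:00, 2023-04-10 12:00)');
```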
Next piece of content- "PostgreSQL Unlogged Sequences- Look Mum, no WAL!". This is from crunchydata.com. And he's talking about sequences that you can create unlogged, which means changes to them are not written to the WAL files. Now primarily, this was done in version 15 so that when you have unlogged tables, they could have unlogged sequences as well. But in practice, there's no good use case for them beyond that. This post goes over how they work, as well as how much data actually gets saved to the WAL with regard to sequences, and it's not much. But if you want to learn a lot more about sequences in general and unlogged sequences, you can definitely check out this blog post.
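For reference, creating one looks like this on Postgres 15 and up (the names here are hypothetical):

```sql
-- A standalone unlogged sequence: its state is not written to WAL,
-- so it restarts after a crash:
CREATE UNLOGGED SEQUENCE counter_seq;

-- Identity columns on an unlogged table get unlogged
-- sequences automatically:
CREATE UNLOGGED TABLE events (
    id bigint GENERATED ALWAYS AS IDENTITY,
    payload text
);
```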
Next piece of content- "PostgreSQL: Pgpool-II Use Cases and Benefits". This is from percona.com. They're talking about the different features that Pgpool-II has. Now, generally, I think of it as a connection pooler, but it does things like load balancing, connection pooling, and query caching, so your database doesn't have to serve the same query again. Although I'm not quite sure how they expire the cache. They also talk about high availability features that enable you to direct connections to alternate servers, as well as doing online recovery. So if you're interested in learning more about Pgpool, you can check out this blog post.
Next piece of content- "Ultimate Guide to Citus Con: An Event for Postgres, 2023 edition". This is from citusdata.com. And this is a blog post listing all about the upcoming Citus Con event and all the different talks that are expected. So if you want to learn more about that, check out this blog post.
The next piece of content, there was another episode of Postgres FM last week. This one was with Peter Zaitsev. He's from percona.com, and they're talking about all sorts of issues, from MySQL vs. Postgres to open-source licenses, databases on Kubernetes, et cetera. So you can definitely listen to their episode or check out the YouTube video here.
Next piece of content. The PostgreSQL person of the week is Katharine Saar. If you're interested in learning more about Katharine and her contributions to Postgres, definitely check out this blog post.
And the last piece of content, we did have another episode of the Rubber Duck Dev Show this Thursday afternoon. This one was "Safety Nets For Rails Apps With David Teren". Basically, what kind of constraints can you put into place for your application so that you can avoid it going off the rails. So these are things like linters and code quality tools. But if you want to learn more about that, we welcome you to check out our show.