
Postgres 13 Released, What's New In Postgres 13, Debugging PL/PGSQL, Monitoring | Scaling Postgres 133


In this episode of Scaling Postgres, we discuss the release of Postgres 13, what is new in Postgres 13, how to debug PL/PGSQL and Postgres monitoring best practices.

Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about Postgres 13 released, what's new in Postgres 13, debugging PL/pgSQL, and Postgres monitoring. I'm Creston Jamison and this is Scaling Postgres, episode 133.

[00:00:19] Alright, I hope you, your family, friends, and coworkers continue to do well. Our first piece of content is "PostgreSQL 13 Released", and this is from the PostgreSQL.org website. So it was released on the 24th, and this news announcement goes over some of the new features, such as: deduplication in B-tree indexes; extended statistics system improvements, so that statistics on dependencies between columns can now be used for OR clauses and IN/ANY lookups; improvements to hash aggregate functionality that allow large aggregates and grouping sets to spill to disk as opposed to being done just in memory; partitioned table enhancements with regard to improved pruning, as well as being able to do more partition-wise joins; a parallelized vacuum for indexes (again, not for autovacuum, but when you manually run vacuum); and replication slots that allow you to define how much WAL to retain, so you don't run out of disk on your primary. Then they mention some application development conveniences, such as the datetime function added to SQL/JSON path support, the included gen_random_uuid() function so you don't need an extension, partitioned tables supporting logical replication as well as BEFORE row-level triggers, and then FETCH FIRST WITH TIES, in addition to some security enhancements: trusted extensions, being able to require channel binding with SCRAM authentication, and additional features such as that. So I encourage you to check out this post as well as the release notes link here, which of course goes into detail on all the different features that have been released with PostgreSQL 13.

Now, there's a number of other posts I'm going to follow up with that cover essentially what's new in PostgreSQL 13. The first one is "Lessons learned from running Postgres 13: better performance, monitoring and more". So they ran this in their staging environment on the beta and, I believe, the release candidate. With regard to index deduplication, they found about a threefold improvement in space savings, which could be a huge win for performance, because smaller indexes have a higher probability of being cached and, being smaller, are much easier to search through. Then they talked about the extended statistics improvements and some of the benefits 13 brings with that. They talk about the parallel vacuum and, again, how this doesn't apply to autovacuum, but it does work when you run a manual vacuum, because they assume you want maximum performance when running that vacuum. Although I wonder if they're going to make autovacuum support parallel vacuum in the next version, or maybe the version after. They talked about the incremental sort improvement, where if you presort something and then you limit by it, or do another sort and have subqueries, the planner will detect that it's already been sorted and won't try to re-sort the whole thing again; it supports an incremental sort to give you better performance. They mentioned some of the monitoring changes, where you can now track WAL usage to see how much WAL a given query is generating. So for example, here they did a DELETE FROM users and asked for the WAL usage, and you can see how many WAL records were generated, how many were full page images, and the number of bytes involved. Then, in particular, pg_stat_statements can also track this information. So now you can get insight into what's writing all of these WAL records, which is something I've been asked about on occasion.
And with Postgres 13, it should be much easier to find out this information. Then they talk about the sampled statement logging that is supported now. So you can take a sample, not just log all slow statements, and you can even log parameter values for failed statements, so you can get more insight into why a statement failed. And then when you're analyzing, you can also include buffers to see, for example, what has been hit with regard to shared buffers. And this is also reflected in pg_stat_statements as well. And then they cover the inclusion of the gen_random_uuid() function for built-in UUID support and some psql improvements. So again, another great post, from pganalyze.com, talking about the new features in Postgres 13.
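To make that concrete, here is a minimal sketch of the new WAL-tracking options, reusing the users table from their example (the column and WHERE clause here are just illustrative):

    -- Ask EXPLAIN to report WAL usage (new in Postgres 13) along
    -- with buffer usage. Note: ANALYZE actually runs the DELETE,
    -- so wrap it in a transaction and roll back to keep the data.
    BEGIN;
    EXPLAIN (ANALYZE, WAL, BUFFERS)
    DELETE FROM users WHERE created_at < now() - interval '1 year';
    ROLLBACK;

    -- pg_stat_statements gains matching columns, so you can find
    -- which statements are generating the most WAL server-wide.
    SELECT query, wal_records, wal_fpi, wal_bytes
    FROM pg_stat_statements
    ORDER BY wal_bytes DESC
    LIMIT 10;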

[00:04:40] The next one is a post on what's new in PostgreSQL 13 from Enterprisedb.com. They talk about the faster vacuum, due to being able to process the indexes in parallel. They talk about the security features avoiding man-in-the-middle attacks, now that the client can require channel binding with SCRAM authentication. Then, being able to reduce errors in production: they have a pg_catcheck utility to check your catalog and make sure everything is good. Being able to verify backups, because of the newly included pg_verifybackup utility, so you can verify you have all the information required to restore a full backup. Going back to joins of partitioned tables, these can happen more frequently now, as well as logical replication for partitioned tables. Then they also mention the deduplication of B-tree indexes, as well as other security enhancements.

[00:05:33] So the next post also covering new features is "New Features in PostgreSQL 13: Full Webinar Video". And there's a full webinar video, I included the video right here, it's on YouTube. It's about an hour and 18 minutes in length. And it covers things like the partitioning enhancements, B-tree deduplication, incremental sorting, hash aggregation enhancements, the backup manifests that enable that backup verification, trusted extensions, replication slot size limits to avoid running out of disk space on your primary, the parallel vacuum capabilities, security enhancements, as well as general improvements for large databases. So a lot of content related to all the new features in Postgres 13 that you can check out and analyze to see which ones you want to use when you decide to upgrade to Postgres 13.
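As a quick sketch of two of those features, assuming a hypothetical table named events: a manual vacuum can be given a number of parallel workers for index processing, and max_slot_wal_keep_size caps how much WAL a replication slot may retain:

    -- Process this table's indexes with up to 4 parallel workers
    -- (Postgres 13); applies to manual VACUUM, not autovacuum.
    VACUUM (PARALLEL 4, VERBOSE) events;

    -- Cap the WAL retained by replication slots so a stalled
    -- replica can't fill up the primary's disk.
    ALTER SYSTEM SET max_slot_wal_keep_size = '50GB';
    SELECT pg_reload_conf();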

[00:06:22] The next piece of content is "Debugging PL/pgSQL: GET STACKED DIAGNOSTICS". This is from Cybertec-postgresql.com, and they're talking about a scenario where you have a function, they called it a broken function, which does a divide by zero, which results in an error. And this is called from within another function, but you can't really know why it crashed, and you don't get a backtrace or any debug output. And he says one way you can handle this is to create another function, he called it getstack, although you can name it whatever you want, that runs the function of interest and then has an exception handler where you do GET STACKED DIAGNOSTICS and print out the information related to the error. So when you wrap your function in something like he has done here, you get this type of output: you can see what the message was (a division by zero), the context related to that division by zero, and then what line numbers, in terms of the stack trace, things occurred on. So this is a great way to help you debug your PL/pgSQL code. So if you're interested in that, check out this post.
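Here is a minimal sketch of that wrapper technique; the function and variable names are illustrative rather than exactly what the post uses:

    -- A deliberately broken function that divides by zero.
    CREATE FUNCTION broken_function() RETURNS integer AS $$
    BEGIN
        RETURN 1 / 0;
    END;
    $$ LANGUAGE plpgsql;

    -- A wrapper that catches the error and reports the message
    -- plus the stack context via GET STACKED DIAGNOSTICS.
    CREATE FUNCTION getstack() RETURNS void AS $$
    DECLARE
        v_message text;
        v_context text;
    BEGIN
        PERFORM broken_function();
    EXCEPTION WHEN OTHERS THEN
        GET STACKED DIAGNOSTICS
            v_message = MESSAGE_TEXT,
            v_context = PG_EXCEPTION_CONTEXT;
        RAISE NOTICE 'message: %', v_message;  -- "division by zero"
        RAISE NOTICE 'context: %', v_context;  -- line-by-line stack
    END;
    $$ LANGUAGE plpgsql;

    SELECT getstack();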

[00:07:29] The next piece of content is actually a YouTube video, and it's "Monitoring PostgreSQL", a PostgreSQL talk. This is from the Percona YouTube channel, and it's given by Jason Yee at Datadog. It talks about different aspects of monitoring your PostgreSQL installation. In addition to enumerating some things you should be monitoring, it also covers best practices on how to do it. So if you're interested in learning more about monitoring, you can check out this YouTube video.

The next piece of content is "Simple Anomaly Detection Using Plain SQL". This is from Hakibenita.com, and he's talking about a technique he used to detect anomalies in a set of log entries. They were looking at status codes and wanted to be alerted to unusual ones. So he goes through the process of doing that. First, how does he detect anomalies? He looks at the data and then does a calculation in SQL to get a standard deviation, and from the standard deviation he then gets a Z-score. A Z-score helps you determine how many standard deviations each data point is from the mean. So you can see in this post that the only value whose Z-score is more than one standard deviation from the mean is this twelve value, so clearly it is the outlier. Essentially he's using "outlier" as a term for an anomaly. And then he goes into analyzing the server log and the different status codes, and he goes through the process of getting those standard deviations and calculating a Z-score from them, so that you can then identify anomalous status codes. Then the question becomes: what Z-score should you use? Should you use one standard deviation from the mean? Two, three, four? What makes sense? And he advocates doing backtesting. So back-test previous log information that you've been collecting to determine what you're seeing and what you want to alert on, what you call an anomaly or not. And he goes through the process of determining that for his set of data, and how to eliminate essentially repeat alerts. Then he goes into talking about increasing accuracy through weighted means, potentially using a median or the median absolute deviation. So this does involve a little bit of statistics, but a lot of SQL code in terms of doing this type of anomaly detection. So if you're interested in that, you can check out this post.
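The core of the Z-score technique can be sketched in a few lines of SQL; status_counts below is a hypothetical table with one row per status code and the number of log entries seen for it:

    -- Compute each status code's Z-score: how many standard
    -- deviations its entry count sits from the mean.
    WITH stats AS (
        SELECT avg(entries)    AS mean,
               stddev(entries) AS sd
        FROM status_counts
    )
    SELECT s.status_code,
           s.entries,
           round((s.entries - st.mean) / st.sd, 2) AS z_score
    FROM status_counts s
    CROSS JOIN stats st
    WHERE st.sd > 0  -- avoid dividing by zero on uniform data
    ORDER BY abs((s.entries - st.mean) / st.sd) DESC;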

[00:10:42] The next piece of content is "In-Memory Table with Pluggable Storage API". This is from Highgo.ca, and this is something they're developing that I assume might become an extension to Postgres: a pluggable storage solution that only uses memory to store tables and information. They basically want the buffer to become that memory storage mechanism, or at least to intercept its writes to disk and store them in this memory structure they're creating. So they've gone through the process of developing an initial prototype for it, and for 20 million rows they're seeing a 50% performance increase for inserts, a 70% increase for updates, a 60% increase for deletes, and a 30% increase in vacuum operations. So that's a pretty good improvement, but it's something they're still working on. If you're interested in learning more, check out this post.
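For context, the pluggable storage API (available since Postgres 12) lets a table be created with an alternative table access method via the USING clause; the extension and access method names below are hypothetical stand-ins for whatever this project ends up shipping:

    -- Hypothetical extension providing an in-memory table
    -- access method, used in place of the default heap.
    CREATE EXTENSION IF NOT EXISTS in_memory_am;

    CREATE TABLE hot_counters (
        id      bigint PRIMARY KEY,
        counter bigint NOT NULL
    ) USING in_memory_am;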

[00:11:27] The next piece of content is "Announcing Crunchy Bridge: A Modern Postgres as a Service". So Crunchydata.com has released a service called Crunchy Bridge, a Postgres-as-a-service offering that provides production-ready Postgres out of the box: automated backups, high availability, point-in-time recovery, and encryption in transit and at rest. This is supported on AWS and Azure right now, and they can basically work between different clouds, if that's something of interest to you, or alongside on-premise solutions. So if you're interested in a managed solution that is cloud agnostic, maybe you want to check out this offering from Crunchy Data.

[00:11:42] The next piece of content is "Postgres Data Types in PL/Python". They're talking about translating Python data types to Postgres data types when using PL/Python. So if you're interested in that, you can check out this post from Crunchy Data.
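As a tiny illustration of the kind of type mapping the post covers, assuming the plpython3u extension is available: Postgres integers arrive in the Python body as Python ints, and the return value is converted back:

    -- PL/Python is an untrusted language, so installing it
    -- requires superuser privileges.
    CREATE EXTENSION IF NOT EXISTS plpython3u;

    CREATE FUNCTION py_max(a integer, b integer)
    RETURNS integer AS $$
        # a and b arrive as Python ints; the int returned here
        # is converted back to a Postgres integer
        return max(a, b)
    $$ LANGUAGE plpython3u;

    SELECT py_max(2, 3);  -- returns 3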

Next piece of content is "Using Postgres and pgRouting to Explore the Smooth Waves of Yacht Rock". Basically, pgRouting is typically used for geographic information system calculations, but you can use it when you have any sort of interrelationships between pieces of data, anything that defines relationships. So you can look at one artist here at one hop, and then go to two hops, three hops, four hops, and they're using the pgRouting extension to do that. So if you're interested in doing this type of analysis, check out this post from Crunchydata.com.
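A hedged sketch of the idea: pgRouting's pgr_dijkstra only needs an edge list with id, source, target, and cost columns, so any relationship table can be fed to it. The artist_edges table and the vertex ids below are hypothetical:

    -- Shortest path between two artists, where each edge is a
    -- collaboration and every hop costs 1.
    SELECT seq, node, edge, agg_cost
    FROM pgr_dijkstra(
        'SELECT id, source, target, 1::float AS cost
           FROM artist_edges',
        101,    -- starting artist's vertex id
        202,    -- target artist's vertex id
        false   -- undirected: relationships go both ways
    );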

[00:12:16] The next piece of content is "Diary of an Engineer: Delivering 40x Faster Percentiles Using Postgres, Citus and t-digest". This is from Citusdata.com. Now, this particular post is talking about a use case with Citus, which is an extension for Postgres that enables you to scale out Postgres. But they're talking about an extension called tdigest for doing estimates of percentile calculations, and how with it they were able to achieve a 40x improvement in performance. So if you calculate these types of percentiles, maybe you want to check out this post.
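A rough sketch of what the tdigest extension's aggregates look like, assuming a hypothetical response_times table; the middle argument is the t-digest compression (accuracy) parameter:

    CREATE EXTENSION IF NOT EXISTS tdigest;

    -- Approximate 99th-percentile latency in a single pass,
    -- without the full sort that exact percentile_cont() needs.
    SELECT tdigest_percentile(duration_ms, 100, 0.99)
    FROM response_times;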

[00:12:52] The next piece of content is "How We Are Building a Self-Sustaining Open-Source Business in the Cloud Era (Version 2)". So this is from Timescale.com, and it's about TimescaleDB, an extension that does time-series data in Postgres; they've basically altered their license to make it applicable to more use cases. So if you're interested in a time-series database using Postgres, maybe you want to check out this post to see what they're planning.

[00:13:18] And the last piece of content is: the PostgreSQL Person of the Week is Dan Langille. So if you're interested in learning more about Dan and his contributions to Postgres, definitely check out this blog post. That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can subscribe via YouTube or iTunes. Thanks.
