
Show Plans, WAL Monitoring, Using UUIDs, Default Logging | Scaling Postgres 109


In this episode of Scaling Postgres, we discuss how to show live query plans, how Postgres 13 allows WAL monitoring, how to use UUIDs in your app, and how logging is set up by default.


Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about show plans, WAL monitoring, using UUIDs and default logging. I'm Creston Jamison, and this is Scaling Postgres episode 109.

[00:00:22] Alright, given the unusual times that we live in, I hope you, your family and coworkers are safe and doing well. Our first piece of content is A Detailed Look at the New PostgreSQL Troubleshooting Extension, pg_show_plans. This is from cybertec-postgresql.com, and we had discussed this before. This new extension, pg_show_plans, allows you to see what plans live queries are using. So this goes into a little bit more detail about the extension. Basically, the idea is kind of like pg_stat_activity, but for each of those queries you can see what the plan is. And it says, what does this do exactly? Quote: it copies query execution plan info from the private session backend context to shared memory context when the query starts, so that it is accessible to other users. Now, in terms of installation, you can install from source code, but they also look to be offering RPM and DEB packages. And it works from 9.1 up to the 13 beta. Now of course, once you have it installed, you do need to add pg_show_plans to shared_preload_libraries, and it does have a few different configuration options you can set for it. And of course it will require a restart of your database. So here they show an example of it in action. They've set up pgbench here, they've created the extension, and then run the query to be able to look at the output. And here you can see the plan for this query. So the intent is, if you have really long running queries, you could use this to be able to see, okay, what plan is this taking? Why is this particular query taking so long? Now, in the previous announcement they mentioned something about performance, and here they're saying that for very short statements, say less than 0.1 milliseconds, there's a penalty of about 20%, which is not insignificant. But for normal, more slowish real life queries, the performance hit was small enough to be ignored. So based upon this, it looks like your mileage may vary. But it also looks like they have functions that allow you to disable or enable pg_show_plans, so perhaps you could have the extension installed and then dynamically enable or disable it, depending on whether you have problems you need to investigate. So again, some more details on this new extension. And if you're wanting to learn more, definitely check out this blog post.
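
As a rough sketch of what that looks like in practice, a minimal session might be the following (the view columns shown here follow the CyberTec write-up, so treat the exact names as an assumption):

    -- pg_show_plans has to be preloaded, which is why a restart is needed:
    --   shared_preload_libraries = 'pg_show_plans'

    -- create the extension in the database you want to inspect
    CREATE EXTENSION pg_show_plans;

    -- join the live plans to pg_stat_activity to see which query each plan belongs to
    SELECT p.pid, a.query, p.plan
      FROM pg_show_plans p
      JOIN pg_stat_activity a ON a.pid = p.pid
     WHERE p.level = 0;   -- top-level plan only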

[00:02:46] The next piece of content is New in PG 13: WAL Monitoring, and this is from rjuju.github.io. So basically, in PG 13 it looks like they are starting to track WAL generation statistics. It says, quote: each backend will track various information about WAL generation, such as the number of WAL records, the size of WAL generated, and the number of full page images generated. Now this is great, because I've had to do some analysis of WAL files to determine essentially why they were so large, and being able to track this per backend is a huge benefit. And then you can get per-query WAL activity with pg_stat_statements, so essentially some of these columns will be added to pg_stat_statements. So this is huge. This is a great feature I'm looking forward to. And he goes through how some of it is implemented. So if you're wanting to learn more about that, definitely check out this blog post.
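
For reference, here is a minimal sketch of what the per-query view could look like once those columns land in pg_stat_statements (wal_records, wal_fpi and wal_bytes are the column names in the PG 13 development documentation):

    -- find the statements generating the most WAL
    SELECT query,
           calls,
           wal_records,   -- number of WAL records generated by the statement
           wal_fpi,       -- number of full page images written
           wal_bytes      -- total WAL volume generated, in bytes
      FROM pg_stat_statements
     ORDER BY wal_bytes DESC
     LIMIT 10;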

[00:03:42] And from the same site is New in PG 13: Monitoring the Query Planner. So most of the counters in pg_stat_statements just give you execution times, but with this addition you can also get planning times. So for example, it looks like you can enable pg_stat_statements.track_planning, and that will track the planning time for your queries. So if you have queries that are quite complex and take a while in the planning stage, you can actually collect that information now with this new feature. So if you're wanting to learn more, check out this blog post.
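
A minimal sketch of enabling it and reading the results, assuming the PG 13 column names total_plan_time and total_exec_time:

    -- planning statistics are off by default; enable and reload
    ALTER SYSTEM SET pg_stat_statements.track_planning = on;
    SELECT pg_reload_conf();

    -- compare planning time to execution time per statement
    SELECT query,
           total_plan_time,   -- cumulative planning time, in ms
           total_exec_time    -- cumulative execution time, in ms
      FROM pg_stat_statements
     ORDER BY total_plan_time DESC
     LIMIT 10;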

[00:04:16] The next piece of content is Empowering a Rails Application with UUID as Default Primary Key, and this is from Better Programming. So we talked a little bit about BDR in the last episode and how they had the concept of a shadow master, and I said, well, one thing you're going to have to do is centrally create any primary keys, be they IDs or UUIDs. So this is a way, using a Ruby on Rails application, to set up UUIDs. And they show you an example of how you would typically pull up a given article using an ID as the primary key, whereas you can switch it to a UUID, so it looks like this. Some of the benefits he's mentioning are that you get to obfuscate information in the URL, so perhaps it's a little bit of a security layer, although it's security by obscurity, of course. And another benefit of UUIDs is that you can actually generate them from client applications; they don't all have to be generated within the database. So he shows you how to do what they call a database migration in Rails to be able to set up the extension. Here he's using the pgcrypto extension to do it, although there is another one you can use as well, as well as setting the primary key type to be UUID. Now, what he also mentions is that this framework, Ruby on Rails, also has the concept of being able to pull up the first record or the last record from a given set you're pulling, and generally it orders that by the primary key. Well, that may not be what you're intending if you switch over to using UUIDs, but he actually has a way to handle this where you can specify a different implicit order within the application framework, say the created at date, so that the first and the last record make more intuitive sense. And of course he makes the disclaimer to be super careful if you're going to be transitioning from, say, integers to UUIDs for existing projects. But if you're looking to make this change, here's an article on how you could potentially do it.
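
At the SQL level, the migration he describes boils down to something like this (the articles table is just an illustration, not the article's exact schema):

    -- gen_random_uuid() is provided by the pgcrypto extension
    CREATE EXTENSION IF NOT EXISTS pgcrypto;

    -- hypothetical table with a UUID primary key generated by the database
    CREATE TABLE articles (
        id         uuid PRIMARY KEY DEFAULT gen_random_uuid(),
        title      text,
        created_at timestamptz NOT NULL DEFAULT now()
    );

On the Rails side, the first/last trick is then to point the model's implicit order column at created_at instead of the UUID primary key.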

[00:06:20] The next post is Debian Default Logging Explained, and this is from mydbanotebook.org. This kind of goes into how Debian tends to set up PostgreSQL when you're using their packages, and this would also apply to, for example, an Ubuntu server. So when you actually call pg_current_logfile() in Postgres, it's actually blank, even though you know it's logging somewhere, and by default it logs into /var/log/postgresql. So he goes over a little bit of an investigation on how to find it. So for example, checking the logging settings in postgresql.conf: log_destination was not set, meaning it was set to stderr, and logging_collector was off. So according to Postgres, there's no logging to this location. So how is it doing it? So he looks into, I believe it's a Perl wrapper, pg_ctlcluster, that they set up on Debian systems. So he gets the right one, and then he looks at this particular code, and it looks like it launches the systemd Postgres service. And here's the service definition for it, or I should say the template unit file for it. And basically it works because it's using the -l, or log file, switch of pg_ctl, because essentially this calls pg_ctl. And he said, of course, that means if you use the -l log file switch of pg_ctl, the pg_current_logfile() function won't be able to help you find where your log file is. So if you use PostgreSQL on a Debian system, you may want to check out this blog post to get a better understanding of how Postgres is set up in the Debian packages.
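
If you want to see this for yourself, a quick check from psql looks something like the following (the example log file name is hypothetical; yours depends on the cluster version and name):

    -- what Postgres itself reports about logging on a stock Debian install
    SHOW log_destination;          -- 'stderr'
    SHOW logging_collector;        -- 'off'
    SELECT pg_current_logfile();   -- NULL, since the logging collector isn't running

    -- yet the server log still lives under /var/log/postgresql/,
    -- e.g. /var/log/postgresql/postgresql-12-main.log, because pg_ctlcluster
    -- starts the server with pg_ctl's -l switch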

[00:07:55] The next post is Stored Procedures in PG 11: Better Late Than Never. This is from highgo.ca, and they're basically explaining stored procedures and what they are. He goes over the difference between functions and stored procedures. Basically, both are executable code and can be written in different languages. Functions cannot have transactions, whereas stored procedures can. And you execute functions just by running them in a query, whereas a stored procedure you actually call. And he goes into a little bit of discussion of how you can return a value from a stored procedure, how it gets invoked, its general structure, which is very similar to a function, and that it even supports overloading. So if you want to get more insight into the differences between stored procedures and functions, definitely check out this blog post.
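
As a small illustration of the differences he describes, here is a sketch assuming PG 11 or later; the table and routine names are made up for the example:

    CREATE TABLE demo_log (val integer);

    -- a function: invoked inside a query, cannot COMMIT or ROLLBACK
    CREATE FUNCTION add_one(i integer) RETURNS integer
    LANGUAGE sql AS $$ SELECT i + 1 $$;

    SELECT add_one(41);

    -- a stored procedure: invoked with CALL and allowed to control transactions
    CREATE PROCEDURE batch_insert(n integer)
    LANGUAGE plpgsql AS $$
    BEGIN
        FOR i IN 1..n LOOP
            INSERT INTO demo_log (val) VALUES (i);
            IF i % 1000 = 0 THEN
                COMMIT;   -- a procedure can commit mid-run; a function cannot
            END IF;
        END LOOP;
    END $$;

    CALL batch_insert(5000);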

[00:08:42] The next piece of content is Parallel Query Inside Function, and this is from Abdul Yadi's blog. He's talking about a query he was running that had a bunch of unions. When he just runs the raw SQL, it runs in parallel, something similar to this, and then appends everything together and gives the ultimate result, and that takes around 1.9 seconds. But if you put this exact SQL code in a PL/pgSQL function or in a pure SQL function, it actually takes much longer, about 3.4 seconds here and again about 3.4 seconds here. So the question is why? So he actually looked at his CPU output, and you can tell the plain SQL query looks to be running in parallel, it's using all the CPUs, whereas the PL/pgSQL function only seems to be using one CPU core, and the same thing for the SQL function. So how do you get it to be parallel? Now, he actually did kind of a hack to get it to work, where instead of returning the query, the result set is saved into a temporary array in memory and then returned once finished. However, in the comments they also mention that you can add PARALLEL SAFE to your function to get it to work in parallel. So this is a quick post, but it's just something to keep in mind: if you're using functions and you want them to run in parallel, maybe you need to use PARALLEL SAFE.
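
For what that marker looks like, here is a minimal sketch; the function and table are invented for the example, not the author's actual query:

    -- without PARALLEL SAFE, a function defaults to PARALLEL UNSAFE, which forces
    -- any query referencing it to run serially
    CREATE FUNCTION total_amount() RETURNS numeric
    LANGUAGE sql
    STABLE
    PARALLEL SAFE
    AS $$ SELECT sum(amount) FROM big_table $$;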

[00:10:12] The next piece of content is Optimize PostgreSQL Server Performance Through Configuration, and this is from Crunchydata.com. They're basically going over some of the settings that you most want to change as you're optimizing server performance and increasing the size of your hardware. Namely, the four that they've chosen are shared_buffers, wal_buffers, effective_cache_size and maintenance_work_mem. So they go over each of the configuration options, what they do, and some recommendations for settings. So if you want a refresher on that, definitely a blog post to check out.
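
As a hedged illustration, changing those four might look like this; the numbers below are placeholders rather than the article's recommendations, and the right values depend on your RAM and workload:

    ALTER SYSTEM SET shared_buffers = '8GB';           -- often sized around 25% of RAM
    ALTER SYSTEM SET wal_buffers = '16MB';
    ALTER SYSTEM SET effective_cache_size = '24GB';    -- planner hint: memory available for caching
    ALTER SYSTEM SET maintenance_work_mem = '1GB';     -- helps VACUUM and index builds

    -- effective_cache_size and maintenance_work_mem take effect on reload;
    -- shared_buffers and wal_buffers require a restart
    SELECT pg_reload_conf();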

[00:10:48] The next piece of content is actually a YouTube video, and it's Postgres Pulse Insights: How to Solve a Bloated Postgres Database and Improve Database Performance. This is from the EnterpriseDB YouTube channel, and it looks to be on a weekly basis that they're going to have this live stream, I believe, where they answer questions that they find on various online forums. So if you're wanting a live stream or a video covering questions that the community has about Postgres, definitely a piece of content to check out.

[00:11:21] The next piece of content is the new PostgreSQL Person of the Week, which is Michael Paquier. My apologies if I'm pronouncing that incorrectly. So if you're interested in learning more about Michael and his contributions and work in PostgreSQL, definitely check out this new post.

[00:11:39] The next piece of content is Polygon Averaging in PostGIS, and this is from Crunchydata.com. They show you how to average polygons in PostGIS, so if you're interested in that, check out this blog post.

[00:11:52] And the last piece of content is related to machine learning. It's How to Use the Random Forest Machine Learning Model with 2UDA, PostgreSQL and Orange, Part 3. So this is part three of the article series; if you're interested in that, definitely a blog post to check out.

[00:12:11] That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you could subscribe via YouTube or iTunes. Thanks.

