fsync Stopgap, CTE Changes, autovacuum_naptime, Postgres Community

In this episode of Scaling Postgres, we review articles covering a fsync stopgap, tuning autovacuum_naptime, upcoming CTE / WITH clause changes and the Postgres community.

Content Discussed

YouTube Video

Podcast Audio

Transcript

[00:00:00] In this episode of Scaling Postgres we talk about F Sync stopgap, auto vacuum naptime width and CTE changes and postgres community. I'm creston. Jameson. And this is scalingpostgres episode 52.

[00:00:22] I hope everyone is doing well. Unfortunately have a bit of a cold. So my apologies formy voice this week. But the first thing I wanted to mention is that there is a live show that'scoming up this Thursday. So if you go to Scalingpostgres.com, at the very top any of the pagesthere's a way to click the check it out button and there's a way to go ahead and claim your spotfor the free webinar. And we're going to be discussing scaling your postgres configuration. Sobasically from 10GB to ten terabytes, what are the different configuration options you need tochange for your database? And again, this will be happening on Thursday, February 20 eigth at02:00 p.m. Eastern Standard Time. Or eleven, actually, a M Pacific Standard Time. So I'll includea link in the show notes but feel free to go ahead and sign up to attend the webinar. Our firstpiece of content this week is actually new releases of PostgreSQL. So eleven point 210.7, etcetera. Now one of the big highlight changes is changing the behavior of F sync. So we coveredthis on a more recent episode talking about that F sync issue where if there's an error thatoccurs during Fsync, the data potentially may get lost. So the change that they've made isactually causing a panic which would basically cause the PostgreSQL instance to stop andrestart itself. And then PostgreSQL can then replay the data from the write ahead log to helpensure the data is written. So I would call this kind of a stopgap measure. They're potentiallythinking about things to maybe do for Twelve that may be a little bit different. But basically theychose this path to say we want to avoid data loss insofar as creating a panic to avoid thatpotential of data loss. So normally I would suggest go ahead and install your point releasespretty quickly, but definitely you want to monitor this better after making this changes in caseyou're having problems with your file system because this may bring some things to light. Now inaddition to that, there are tons of other bug fixes and improvements that are listed here that I'lllet you look at from the link in the show notes.

[00:02:37] The next post is Tuning Auto Vacuum Nap Time and this is from the Robert Haas blogand basically he talks about tuning the configuration parameter Auto Vacuum Nap Time andwhat it is and what it does. And I always like his in depth blog posts where he talks about certainfeatures like this. So I highly suggest checking this out. But basically this triggers when an autovacuum launcher process is going to start a new auto vacuum worker to scan a database in thecluster to determine what are the next tables that need to be vacuumed. Now he says by defaultthis is 1 minute, but he's had some cases where he's seen it increase to an hour or even onecase one day. And he says that extending it can cause huge problems because basically you'relike turning auto vacuum off. And how this configuration works, it's like how soon does it wait toactually start up another auto vacuum worker. So if it's set for example to a day that first day itmay only bring up one worker to auto vacuum one table, the next day it'll bring up anotherworker, the next day it bring up another worker. So that's getting close to the position of havingauto vacuum turned off. So you definitely don't want to extend this too long if you need to. Andhe said there's only one case where he's actually made it shorter than a minute, that it actuallyoffered some improvement. That was for a table that was basically being used as a job queueand there were a lot of inserts and deletes happening on that and in that situation it helped butotherwise it probably makes sense to leave it where it is or just tweak it slightly if you arenoticing issues. So definitely blog post to check out. The next post is with queries, present andfuture and this is from Crunchydata.com and it's talking about CTEs which are common tableexpressions or with queries and basically it helps make your SQL more readable. Now one thingthat they're mentioning here is that there's a new patch that has been made with PostgreSQLtwelve. And what this does is it changes some previous behavior and actually pushes downqueries into the CTE. So for example, in a previous episod --e of Scaling Postgres, we talked about this post, be careful with CTE and PostgreSQLbasically using the with clause, they noticed that CTEs and PostgreSQL is that the database willevaluate the query inside the CTE and store the results. And we have this example here whereyou have a with, he just called it CTE. So a with clause select all from Foo. Then he says selectfrom the CTE where the IED equals an integer. It's going to do a full sequential scan on the Footable and not use an index that exists. So it materialized. The CTE? It actually created this as adata structure and did not push down essentially this where statement to be able to execute thequery faster and use the index. And they were saying whereas you can do a subselect and thistechnique it does use the index scan. So going back to this original post, they basically made achange in PostgreSQL twelve that's projected to be released when that's released where incertain conditions it will push down those where statements to give you better performance andthey go through and do a bunch of tests to show you how it gets pushed down and in whatsituations. Now you can control the behavior to a certain extent by saying whether a CTE shouldbe materialized or not. So there is a materialized keyword and a not materialized keyword tospecify whether you want that CTE materialized or not. So overall this is a great feature andshould improve a lot of queries. But on the other hand, because it's changing the defaultbehavior, it actually may negatively impact some queries. Now again, this is not coming untilPostgreSQL twelve and we'll have to see what's included in there. But this could be somethingto watch over when you make the migration to twelve on what the impact will be to your queries.

[00:06:46] The next post is if PostgreSQL is the fastest growing database, then why is thecommunity so small? And this is from the Timescale.com blog, and he talks about how for thesecond year in a row, PostgreSQL is still the fastest growing database management system.And then they reference these different articles on all these organizations that have startedstandardizing or converting to using PostgreSQL. But what's interesting, the number ofattendees at PostgreSQL conferences seems rather small, at least compared to others. Like forexample, they mentioned here the two biggest PostgreSQL conferences had about 1100attendees, whereas MongoDB World had over 2000 attendees. And then Oracle Open Worldhas about 60,000 attendees. But the point about this Oracle Open World I believe they sponsorJava. There's a lot more other content that's being discussed rather than just the Oracledatabase, so I can kind of understand that. But it's still interesting that the PostgreSQLcommunity is not larger. Looking at some of the other statistics anyway, they're proposing howyou could potentially get more involved with the community in terms of joining the PostgresSlack channel, following on Twitter, or joining a Meetup or a Postgres users group. So I'd beinterested in what you think. Why do you think the community is potentially smaller than itpossibly should be? Do you know if a lot of developers follow PostgreSQL? Anyway, I thoughtthis was an interesting blog post to check out. The next post is PROJ six in PostGIS. Now, I'mdefinitely not an expert in PostGIS, I haven't really used it, but they do have a reprojectionsupport library called PROJ, and they're talking about changes that they've started making inorder to support PROJ Six. So if you use PostGIS, and potentially this utility, definitely a blogpost to check out the next post is georedundancy of PostgreSQL database backups withBarman. So if you're using Barman, they've introduced a way to do georedundant backups. So ifyou have a database server in one location being backed up by Barman, and another databaseserver in another geographic location being backed up by Barman, you can actually transfer thebackups between them to ensure georedundancy. So if you use Barman or want to use thisfeature, that's a blog post to check out. And the last post is monitor PG backrest backups withNagios. So again, this is specific. If you're using PG backrest to backup your PostgreSQLinstance and you want to better monitor using Nagios, definitely a blog post to check out.

[00:09:25] That does it. For this episode of Scaling Postgres, you could get links to all thecontent mentioned in the show notes. Be sure to head --over to Scalingpostgres.com where you can sign up to receive weekly notifications of eachepisode. Or you could subscribe via YouTube or itunes. Thanks. --

fsync Stopgap, CTE Changes, autovacuum_naptime, Postgres Community | Scaling Postgres 52

Join Over 1,000 Engineers & Get New Episodes Weekly!

Content Discussed

YouTube Video

Podcast Audio

Transcript