
Just Upgrade, Vacuum Updates, Connection Strings, Postgres Showcase | Scaling Postgres 80


In this episode of Scaling Postgres, we discuss why you should just upgrade, vacuum updates in Postgres 12, psql connection strings and a Postgres showcase.


Transcript



[00:00:00] In this episode of Scaling Postgres we talk about just upgrading, vacuum updates, connection strings and a Postgres showcase. I'm Creston Jamison and this is Scaling Postgres, episode 80.

[00:00:20] Alright, I hope you're having a great week. Our first piece of content is Just Upgrade: How PostgreSQL 12 Can Improve Your Performance. This is from the Crunchydata.com blog. Now, this blog post starts off talking about what the major feature or features of this release are, and what he says here, quote, is that "many of the features and enhancements in PostgreSQL 12 will just make your applications run better without doing any work other than upgrading". Hence the title: just upgrade. So he talks about some of these different areas, and the first area he mentions is major improvements to indexing. Now, this is something I wasn't aware of or hadn't read about, but he says PostgreSQL 12 makes significant improvements to how B-tree indexes work, and experiments using a TPC-C-like test showed a 40% reduction in space utilization on average. So workloads that update tables regularly could see noticeable improvements in disk utilization, and in turn that would mean less space required for caching those indexes, presumably. The other big benefit he mentions here, of course, is being able to reindex indexes concurrently, and you may actually need to do some of that to gain some of the space-utilization benefits listed here. He also notes that version 12 reduces the WAL overhead for GiST, GIN, and SP-GiST indexes while an index is building, so that is another potential advantage. Then he talks about partitioning improvements, significantly in the area of being able to process tables that have thousands of partitions. So if you're storing time series data across many partitions, those will presumably be more efficient, not just when querying a few tables out of those partition sets, but also in terms of insert speed. Along with this, he mentions there have also been some boosts to COPY. Then he mentions CTEs being inlined, which is called out as a big boon in the section on queries getting a big boost, along with the fact that JIT is now enabled by default. And then of course there are all the additional features, but the ones he mentions here, mostly of a performance or space-saving nature, are all really great additions. So definitely a blog post to check out if you want to learn more.
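To make two of those items concrete, here is a minimal sketch of the PostgreSQL 12 syntax involved; the index, table, and column names are hypothetical:

    -- Rebuild a bloated index without blocking writes (new in version 12).
    REINDEX INDEX CONCURRENTLY orders_customer_id_idx;

    -- CTEs are now inlined into the outer query by default; the old
    -- optimization-fence behavior is available by opting in explicitly.
    WITH recent_orders AS MATERIALIZED (
        SELECT * FROM orders WHERE created_at > now() - interval '7 days'
    )
    SELECT count(*) FROM recent_orders;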

[00:02:55] The next post, given that PostgreSQL 12 is around the corner, is a primer on PostgreSQL upgrade methods, and this is from Cybertec-postgresql.com. He is basically covering the three main upgrade methods that exist. The first one is just a logical dump and restore: dump all the table data and then restore it. Then he covers binary in-place upgrades using pg_upgrade, which can happen really fast using the link method. And then he talks about using logical replication as an upgrade method. He has a convenient table here that says, essentially, that the logical dump and restore is simple, safe, and somewhat flexible; the downsides are that it's the slowest method of doing an upgrade, and the per-database approach has some pitfalls. Using pg_upgrade, basically the binary in-place method, is, as he says here, very fast. If you're doing the link method you still have to have some downtime, but it's still pretty fast, and the old instance won't be affected in the default mode, in other words, when you're not using the link. The downsides are that it's more complex than the dump and restore method; it can be risky in link mode because you're actually reusing the same data files, just via Linux hard links; you're possibly going to lose your standby servers, since there's no default way to upgrade them, and you have to do a lot of work with rsync to sync them back up; and if you're using the default mode, essentially you're going to need double the disk space. The last method is using logical replication, so just transferring data from one version to a newer version of PostgreSQL. It can give the shortest downtime possible, it's safe with the possibility of thorough live tests, and it's essentially very flexible because you're just determining how you want to move the data over. But this is the most complex method: you may need some schema changes, it can possibly be slow performance-wise because you're having to send everything over using logical replication, and it's always per-database, so you can't do a whole cluster that way; you need to do each database. So if you're looking to upgrade your version of PostgreSQL, definitely a blog post to check out to help you determine the best method for upgrading your database.
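As a rough illustration of the binary in-place method, a pg_upgrade run generally looks something like the following; the directory paths are hypothetical, and --check performs a dry-run compatibility check without changing anything:

    # Verify the old and new clusters are compatible (makes no changes).
    pg_upgrade --check \
        --old-bindir=/usr/lib/postgresql/11/bin \
        --new-bindir=/usr/lib/postgresql/12/bin \
        --old-datadir=/var/lib/postgresql/11/main \
        --new-datadir=/var/lib/postgresql/12/main

    # Run the actual upgrade; --link uses hard links instead of copying
    # data files, which is fast but reuses the old cluster's files.
    pg_upgrade --link \
        --old-bindir=/usr/lib/postgresql/11/bin \
        --new-bindir=/usr/lib/postgresql/12/bin \
        --old-datadir=/var/lib/postgresql/11/main \
        --new-datadir=/var/lib/postgresql/12/main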

[00:05:25] The next post is PG Friday: Postgres 12 Revs Up Vacuum. So basically they're talking about vacuum. They cover some basics and note that when you're doing a vacuum, it also needs to vacuum each of the indexes, because they can bloat as well. So they did a test here, running a full vacuum on a pgbench example table, which I think had about 10 million rows in it, and it took about 21 seconds. But there's a feature in version 12 where you can avoid cleaning up the indexes: when you set INDEX_CLEANUP to false, it actually finishes in about half the time, about 12 seconds. So that's pretty great. You're probably thinking, but you always need to eventually vacuum the indexes, otherwise they're going to bloat, and I imagine there are issues if you never vacuum those indexes. And I don't think there's a way to configure autovacuum to skip the index cleanup, necessarily, but they have some suggestions on how to potentially use this a little lower down in the blog post. Then they added another index, and you can see the time extended to 28 seconds; with INDEX_CLEANUP false it was a little higher than before, but still about half the time of vacuuming the indexes. And if you have tons of indexes on a table, I imagine the benefits could be even greater. Then they go over the disadvantage of skipping indexes: basically, you don't want to do this as a long-term thing, because your indexes do need to be cleaned up. Maybe you could have some scenario where you're doing index rebuilds alongside vacuums that skip the index cleanup. They also talk about transactions and the 2 billion XID limit, which is one of the primary reasons vacuum exists in the first place: to freeze visible rows so that they remain visible to everyone. So they get into using vacuum and the vacuumdb command line tool, where you can actually say only vacuum tables whose minimum XID age is above a certain value. So when the transaction age is at, say, 98,000 for each of these tables, you can set the threshold at 100,000 and basically no work will be done when you run a vacuum of the database. However, if you put the minimum at 50,000, then all of these tables will be picked up and vacuumed, and that finished in 37 seconds. What's interesting is when you combine this with actually doing a freeze. For example, they had one table with a max transaction ID age of about 100,000 and other tables close to 200,000. So they did a vacuumdb freeze with a minimum transaction age of 100,000, and it froze just those tables whose transaction age was greater than 100,000. So, as they talk about in the conclusion, this enables you to freeze certain tables at a certain level on a periodic basis, basically doing the work of vacuum in smaller bits and pieces, but more frequently, maybe as part of maintenance jobs that you schedule. Now, autovacuum should still be running, but this lets you fine-tune and do more work when the database is less busy. And even with the index cleanup flag, maybe you need to get a vacuum done on a table but want to defer the cleanup of the indexes. You'll eventually need to do it, but there are probably one-time maintenance tasks or issues you could address with this feature. The suggestion here, which makes sense, is don't always run your vacuums with index cleanup disabled; you need to eventually vacuum your indexes as well.
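As a rough sketch of the two features discussed, assuming a hypothetical pgbench database named bench:

    # Vacuum a table but skip the index-cleanup phase (new in version 12).
    psql -d bench -c "VACUUM (INDEX_CLEANUP FALSE) pgbench_accounts;"

    # Only vacuum and freeze tables whose transaction ID age exceeds
    # 100,000; tables below the threshold are skipped entirely.
    vacuumdb --freeze --min-xid-age=100000 bench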
So this is a great blog post talking about some of the new vacuum features coming in version 12, and I encourage you to check it out. The next post is Postgres Connection Strings and psql. This is talking about the psql command line utility and how it can use convenient connection strings when connecting. Instead of using the parameters you may typically see, like -U for the user, -d for the database, -h for the host, and -p for the port, you can build a connection string, similar to what you may see in some application development environments, and pass it in essentially as the database name, and psql will go ahead and connect. So essentially there are three different connection methods shown here, and each of them gets you the same connection. He also gives an example using Python with a library that uses libpq. So just a quick little post on connection strings for PostgreSQL.
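For illustration, the three equivalent connection styles generally look like this; the host, user, and database names are hypothetical:

    # Individual parameters
    psql -h localhost -p 5432 -U app_user -d app_db

    # A URI passed in place of the database name
    psql "postgresql://app_user@localhost:5432/app_db"

    # A key/value connection string
    psql "host=localhost port=5432 user=app_user dbname=app_db"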

[00:10:20] The next post is actually on a new blog that has been started at theartofpostgresql.com, the site for the book that was recently released by Dimitri Fontaine, and he has started doing some blogging about SQL. Now, it's not really about PostgreSQL administration, but just discussing SQL in general, using PostgreSQL for the examples. So if you feel this would be of interest to you, definitely check out this new blog.

[00:10:48] The next post is Updates for the Postgres Showcase project.

[00:10:53] So this is a list of topics covered, mostly of an SQL-related nature: how you create tables and databases, and in more detail, how to connect, transaction methods, creating and altering tables, partitioning, other table modifiers, constraints, views, stored procedures, and triggers. So, a lot of information related to how you set up table structures and objects within the database. You can see the GitHub project here, Postgres Showcase, and it consists of very short sets of code. So, a lot of code that shows you how to do things; for example, the section on views explains what they are and shows you some examples, as sketched below. So if you're interested in using this as a resource, it's definitely a blog post and a GitHub project you may want to check out.
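To give a flavor of the kind of short snippets the showcase contains, here is a minimal view example; the table and view are hypothetical and not taken from the project itself:

    -- A view is a stored query that can be selected from like a table.
    CREATE TABLE orders (
        id         bigint PRIMARY KEY,
        total      numeric,
        created_at timestamptz
    );

    CREATE VIEW recent_orders AS
        SELECT id, total
        FROM orders
        WHERE created_at > now() - interval '30 days';

    SELECT * FROM recent_orders;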

[00:11:46] The next post is Remote Backup and Restore with pgBackRest. Again, this is from Cybertec-postgresql.com. This is a follow-up to a previous blog post about pgBackRest where they just installed it and did a backup on the same database server. This one actually sets up a separate pgBackRest backup server that then connects to a database server to do the backup, and it even tests a restore. It goes through all the different commands you would need to install it and get it running, and does the backup and restore. So if getting to know pgBackRest is of interest to you, definitely a blog post to check out. The next post is Cloud Vendor Deep-Dive: PostgreSQL on Google Cloud Platform.
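The general shape of that setup, heavily abbreviated and with hypothetical host names and paths (the post itself has the full walkthrough), looks roughly like this on the dedicated backup server:

    # /etc/pgbackrest.conf on the backup server (assumed layout):
    #   [global]
    #   repo1-path=/var/lib/pgbackrest
    #
    #   [demo]
    #   pg1-host=db1.example.com
    #   pg1-path=/var/lib/postgresql/12/main

    # Initialize the stanza, verify connectivity, then take a backup.
    pgbackrest --stanza=demo stanza-create
    pgbackrest --stanza=demo check
    pgbackrest --stanza=demo backup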

[00:12:30] And this is from the Severalnines.com blog. It is a comprehensive examination of Google Cloud's implementation of its PostgreSQL service.

[00:12:40] So it's a very long and comprehensive post covering all the different features that are available, along with the advantages and disadvantages, touching on almost every consideration you can think of for using PostgreSQL on the Google Cloud Platform. So if that is of interest to you, definitely a blog post to check out. And the last post is Creating a PostgreSQL Cluster with Kubernetes CRDs, where CRDs are custom resource definitions. So this is a post about the PostgreSQL Operator, which is a tool developed by Crunchydata.com, to my knowledge, that eases the creation of PostgreSQL clusters on Kubernetes. So if you're interested in doing that, definitely a blog post to check out.

[00:13:24] That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or iTunes. Thanks.
