background

Postgres 13 Beta 2, System Stats Extension, Tuned, Join Strategy | Scaling Postgres 120

Join Over 1,000 Engineers & Get New Episodes Weekly!

In this episode of Scaling Postgres, we discuss Postgres 13 Beta 2, a system stats extension, tuning your operating systems for Postgres and different join strategies.

Content Discussed

YouTube Video

Podcast Audio

Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about postgres 13 beta two systemstats extension. Tuned and joined strategy. I'm Kristen Jameson. And this is scaling postgresepisode 120.

[00:00:22] All right, I hope you, your family and co workers continue to do well in these times.Our first piece of content is PostgreSQL 13. Beta Two is released. So this is leading up to therelease of Postgres 13 in the fall, and it mentions some of the changes since Beta Onepredominantly bug fixes, but a couple of different changes. So if you're interested, go ahead andcheck that out. Now, to go along with this, there has been a recent YouTube video posted on theSan Francisco Bay Area PostgreSQL Users Group YouTube channel called A First Look atPostgreSQL version 13. And this talk was given by Christophe Pettis, and it's a goodsummarization of all the new features in Postgres 13, particularly the ones that are of interest tohim. Now, the video is about 30 minutes in length, but it's really the first 2022 minutes ofpresentation and then questions at the end. So it's a pretty short overview that gives you all thedifferent changes coming. So if you're interested, check out this YouTube video. The next pieceof content is System Stats Extension for PostgreSQL. And this is from Pgsnake bloodspot.com,which is Dave's postgres blog. And it's actually an extension for Postgres that actually looks atvarious operating system and hardware statistics. Basically, they've given you SQL access viathese new system tables to be able to query what the state of the hardware in the state of theOS system is doing. So they present operating system info, CPU info, usage statistics for CPU,memory info, as well as i, O and Disk info, system load averages, process info, network info,and memory by process info. So this is a great addition. I'm definitely going to be checking outthis system extension to see if I could potentially use that to help with monitoring. So if you'reinterested, check out this extension as well as his blog post about it.

[00:02:22] The next piece of content is tuned PG and U. This is from Hunleyd GitHub IO, andhe's talking about doing tuning to your OS that PostgreSQL is running on. And he advocatesusing this tool called Tuned that kind of helps with dynamically making changes to your system.And I believe it's written and maintained by Red Hat. And he advocates using Tuned oversomething like making changes or editingctl.com instead. And he goes through the process oftaking some existing tuning configurations and adjusting it to come up with a PostgreSQL tuneconfiguration. So this is not only beneficial to see if you want to use tune, but to look at his OSconfiguration changes that he does for some of the systems. And of course, he said he usesbenchmarks to also determine what these changes should be. So definitely something to takeinto account. But if you're interested, definitely check out this blog post.

[00:03:21] The next piece of content is Join Strategies and Performance in PostgreSQL. Now,this post goes over joins in the three different join types that are typically used with a Btreeindex. So they talk about the nested loop join strategy, the hash join strategy, and the merge joinstrategy, and it talks about how each of these are used. The algorithm used, the different typesof indexes that can help in certain situations and whether it's a good strategy. So for example,the nested loop join is a good strategy to use if the outer table is small. A hash join using a hashtable is advantageous if it can fit into work mem. And then a merge join can be useful if both ofthe tables are large. But they do mention that if it's not an equality but something different thanequality, then pretty much things just drop back to a nested loop join. Now what one thing theymention here is a way to be sure you get the best strategy is to make some of the changes theysuggest here. And number one, of course, is having your statistics be up to date. So eitheranalyzing frequently or making sure your default statistics target is set appropriately at thedatabase level and at the table level if you need to. Also, you could potentially look to increaseyour work mem if that makes sense for you to get the cheaper hash joins, as well ascommunicating to PostgreSQL about your hardware and resources, tuning things like yourrandom page costs, effective cache size, and effective I O concurrency. So that way the p --lanner may make better decisions on which types of plan to use. And he also mentions thatquote you can speed up nested loop and merge joins with index only scans, so adding morecolumns to an index may be beneficial. And with versions eleven and above, you can justinclude columns onto that index without actually having it in the index. So if you want to learnmore about these strategies and how they could potentially help your performance, definitelycheck out this blog post.

[00:05:19] The next piece of content is actually a PDF and it's talking about JSON unificationwith regard to postgres. So what they're talking about is that around 2014, JSON B wasintroduced in postgres that attracted some NoSQL users. You can tell the rapid rise of postgreshere in terms of popularity from DB engines and they're arguing as a result of this, JSON hasbeen added to the actual SQL standard. But the issue being that we have a JSON data type anda JSON B data type, but the standard is going for one common standard. So this is talking aboutunifying the JSON and the JSON b and how we could potentially do that. Now, this is asubstantial piece of content and it's over 89 slides, but it goes into the history of JSON. Thedifferent ways you can query it and also how to approach this unification process. So it's verydense in terms of content and material, but if you're interested in learning more about the JSON,the JSON being potentially this unification, that will have to be done with postgres to be inalignment with the SQL standard, definitely check out this piece of content.

[00:06:35] Next piece of content is PostgreSQL lTree versus with Recursive and they're talkingabout they had done a previous post talking about lTree and hierarchical data and querying it.They said well what about performance and what about with Recursive? So this blog postactually looks at doing something similar in order to optimize to get really fast queries. Hebasically does the same thing still using lTree but using a materialized view that can berefreshed and therefore you can index that to get better performance. And after some of thesechanges he got relatively low execution times less than a millisecond for doing some of thesequeries. So if you have hierarchical data and you potentially want to look into lTree or withRecursive and getting better performance by leveraging materialized views, maybe you want tocheck out this blog post from CyberTech Postgresql.com.

[00:07:32] The next piece of content is Advanced Active Record using subqueries in Rails. Now.This is from Pginalyze.com. Now with Rails they're talking about Ruby on Rails which is anapplication framework that uses Ruby and ActiveRecord is the Orm that Ruby on Rails uses. Soit's an object relational mapper. So as such it has its own syntax but sub queries aren't theeasiest thing to figure out in it. But this post goes over some different techniques that you canuse to do it. So for example, this is an example where you have a sub query in a where clauseyou can actually have query that you can then insert into the SQL. And again, this is the activerecord. Ruby syntax. So this is the equivalent to doing something like this query here with asubquery as a part of the where clause and he goes over different ways you can do things suchas where not exist subqueries select subqueries as well as from sub queries and havingsubqueries. So if you have a need to use subqueries in your Ruby on Rails project, maybe youwant to check this blog post to see how best to do that.

[00:08:44] The next post is composite primary keys PostgreSQL and Django. This is fromCrunchydata.com and they were doing a database design where they ideally wanted to have acomposite primary key but when trying to set it up using Django's orm they had some difficultiesand basically it was a lot of difficulties but they were able to work through and get somethingworking to their satisfaction. So if you are interested in learning how they were able to rectify thesituation with the Django RM, definitely check out this piece of content.

[00:09:18] The next piece of content is Types of Indexes in PostgreSQL. This is from Higo CA,and it goes over all the different indexes of which there are many of PostgreSQL, starting with ofcourse the Btree index. Talking about hash, gist, spgist, gen, brin, multicolumn indexes, uniqueindexes, expression indexes, partial indexes index only scans as well as covering indexes --. So if you want to learn all about indexes on PostgreSQL, definitely check out this blog post.

[00:09:50] The next two pieces of content are in relation to Oracle and Migrating to PostgreSQL.The first one is Oracle to PostgreSQL Reasons to Migrate, and it covers some of those heresuch as foreign data wrappers versus federation, all the different languages available to writeprocedures and functions, and compared to PL, SQL and Oracle as well as others. So if you'reinterested in this type of content, check out this blog post. And this is from a secondquadrant.com.

[00:10:17] The second one is set up Aura to PG for Oracle to postgres migration. So this is anopen source utility called Aura Two PG that helps you convert your data from Oracle to postgres.And this is from Crunchydata.com. So if you're interested in doing that, check out this blog post.

[00:10:36] The next piece of content is the PostgreSQL Person of the Week is Simon Riggs. Soif you're interested in learning more about Simon and his contributions to PostgreSQL, definitelycheck out this blog post.

[00:10:48] The next piece of content is how to monitor PostgreSQL Twelve performance withOmnidb part One this is from Second Quarter, and they're talking about using their monitoringtool Omnidb to monitor postgres. So if you want to learn more about Omnidb and what it does interms of monitoring, check out this blog post.

[00:11:08] Next piece of content is authenticating PG Pool Two with LDAP. This is from Higo CAand LDAP authentication is a new feature to my understanding with a new version of PG PoolTwo. So this shows how to get that working with the newer versions of PostgreSQL and PG PoolTwo.

[00:11:28] And the last piece of content is how to use Neural Network Machine learning modelwith two EDA PostgreSQL and Orange Part Seven. So part Seven in a Post talking aboutnetwork machine learning with PostgreSQL. So if you're interested in that, check out this postfrom secondquadron.com.

[00:11:47] That does it. For this episode of Scaling Postgres, you can get links to all the contentmentioned in the show notes. Be sure to head over to Scalingpostgres.com where you can signup to receive weekly notifications of each episode, or you can subscribe via YouTube or bye Byewas. --

episode_image