background

Materialized, PgBouncer Auth, Implicit vs Explicit, JIT | Scaling Postgres 40

Join Over 1,000 Engineers & Get New Episodes Weekly!

In this episode of Scaling Postgres, we review articles covering materialized views in Rails, pgbouncer auth_user, implicit vs explicit joins and JIT.

Content Discussed

YouTube Video

Podcast Audio

Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about materialized views, PG Bouncerauthentication, implicit versus explicit and configuring JIT I'm Creston Jameson and this isScaling Postgres, episode 40.

[00:00:24] We have very little content this week. I only have four posts to discuss. Basically it'sThanksgiving here in the United States, and I guess that is kind of the reason why there's not asmuch content this week. So our first post is the benefits of materialized views and how to usethem in your Ruby on Rails project. And this is from Ryan rebo on medium.

[00:00:50] And basically they had a query that was slow and they wanted to use materializedviews to essentially speed it up and use it as a form of caching. So we showed how he does adatabase migration here in Rails where he's creating the materialized view to basically make aphysical table out of a view and select the exact query that he's looking for. Now some of thetechnique used here, I'm not sure how well this will scale as your database grows in size and asyour data size increases. But what he chose to do in this example is he creates a trigger onparticular tables that will refresh the materialized view when data is updated. So first he createda function that's refresh the ready song version and it's going to be used as a trigger and itbasically refreshed the materialized view. Now, one thing you may want to consider doing ifyou're looking at this technique, is to actually refresh the view concurrently so it doesn't blockread access to the view while it's being rebuilt. So that's one thing that wasn't mentioned hereyou may want to consider doing.

[00:02:02] And then secondly, for each statement that updates a particular column or deletes ortruncates from a given table, execute this procedure basically refresh the view. Now, if you havea table that's being updated frequently, I'm not sure how well this would perform. I mean, hesays that the materialized view refreshes fast, but it's basically probably depending on howmuch data you have in it. And if you have a lot of data and are looking to scale. In a previousepisode of Scaling Postgres, I believe episode 39, or potentially 38, we talked about materializedview versus roll ups, and that was done by the Citrusdata.com blog. So using roll ups may be amore appropriate use case than materialized views, but it depends on how frequently your datais changing. And then he goes into Rails and how to set it up for their use case, where theycould actually just trigger a refresh of a materialized view when they needed to through thisRuby class. And basically once they got everything set up, he did a timing benchmark in a query,went from 7.8 seconds to 1.7 seconds. So that's a pretty good win. But there may be othercases where you could use optimized indexes to potentially achieve similar performance. Andactually in the comments they talk about hey, why not use a partial index? So this was aninteresting use case, but definitely consider other potential options if you're looking to scale. Likemaybe you want to use roll ups or if you want to use materialized View, maybe do it concurrentlyso you're not blocking read access. And if your table is really active, you may not want to do atrigger for every statement that updates the data. You may want to refresh it on a periodic basislike maybe through a cron job or something. The next post is PG. Bouncer Pro tip. Use authuser.So simplest way to set up Pgbouncer, which is essentially a connection pooler for postgres, is tohave usernames and passwords in a UserList text that defines who can connect and what's theirpassword. But there's another way to do it using Auth User and Auth query. So basically youdefine a user in your target database that has sufficient privileges to query the PG shadow tablewhich is basically where usernames and passwords are stored and you define a query to beable to look up this table. So for example, they created a role my Auth user with a particularpassword and then established a function that that could use where it selects the username andpassword from the PG shadow table. And once you add in this configuration to Pgbouncer,every time a user connects to Pgbouncer, it will actually look up the username and password tosee if they are authenticated in the destination database. So if you have a lot of usersconnecting to Pgbouncer, this is potentially a better way to manage that process. Now of --course, one thing to keep in mind with postgres ten and eleven it supports scram and a lot ofpeople are potentially moving to that authentication system and I'm not so sure that Scrum willwork with Auth user and Auth query. And to my knowledge Pgbouncer doesn't support Scrum atall actually. But if you want to use this technique to potentially make user management easier foryour PG bouncer install definitely a blog post to check out.

[00:05:31] The next post is PostgreSQL implicit versus explicit joins. So it basically talks aboutthe differences and an implicit join is where you're just saying select all from and you just have acomma between the tables and in the where clause you state how those should be joined. Anexplicit join is where you literally say table join to another table on and define the columns thatthey're joined on for each table. You could also use a using clause as long as the column namesare identical in two tables. So this is considered an explicit joint using the join syntax and implicitis basically using a comment and stating it in the where. Now personally I use explicit all thetime. I don't use Implicit because I tend to favor being explicit in what I want to have happen andit just makes things much more readable for me. And in this blog post he talks about people areasking well, is there any performance difference and not really, but he says there is some casewhere there's a join collapse limit where it actually may start reordering join order once you getup to like nine tables that you're joining. So it's only in rare circumstances that you may get adifference. But normally the query plans between implicit and explicit joins are identical.

[00:06:56] The last post is installing and configuring JIT or Just In Time compilation inPostgreSQL Eleven. This is from the Procona.com blog and basically this is a new feature inEleven where they're using Just In Time compilation and you have to actually install it a certainway and then enable it in Postgres. So this talks about okay, how do you install it and firstexample is installing it from a package repository and he shows some examples in CentOS andRed Hat and the different packages and the process you need to install it. And then also forUbuntu users, which this is definitely the simplest one to do, essentially you trust the Postgresrepository and just do Sudo app install PostgreSQL Eleven and the package for Ubuntu fromPostgres already has legit features in it. And then also if you actually want to build from sourcehere's how you do that. And lastly, to actually use it and turn it on, you need to enable JIT andeven once it's enabled, it may only be triggered in certain use cases. And they talk about settingcertain costs here to be able to get it to start working. So if you want to try out the JITcompilation, definitely a blog post to check out that does it. For this episode of Scaling Postgres,you can get links to all the content mentioned in the show notes. Be sure to head over toScalingpostgres.com where you can sign up to receive weekly notifications of each episode oryou can subscribe via YouTube or itunes. Thanks. --

episode_image