background

Tips & Tricks, Faster APIs, Full Text Search, pgbouncer_fdw | Scaling Postgres 97

Join Over 1,000 Engineers & Get New Episodes Weekly!

In this episode of Scaling Postgres, we discuss tips & tricks, how to get faster APIs, full text search considerations and pgbouncer_fdw.

Content Discussed

YouTube Video

Podcast Audio

Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about tips and tricks, faster APIs, fulltext search, and PG bouncer. FDW. I'm Kristen Jameson, and this is Scaling. Postgres episode97 one.

[00:00:22] Alright, I hope you're having a great week. Our first piece of content is Tips and Tricksto Kickstart postgres year 2020. This is from Cybertechn Postgresql.com and it's basically just alist of tips and tricks, and some of these I wasn't aware of, so it was great to be able to get thiscontent.

[00:00:42] The first tip is you can do dynamic SQL loops with Psql's Gxec command. Sobasically if you have the need to query the database and then do something from that query, soyou want to select a set of tables and vacuum them all. You can use the Gxec command and itbasically runs through the return value of that query and then executes that command. Sobasically you're building an SQL statement in what you're returning and you can use Gxec toactually execute that statement. So this makes it easy to execute statements essentially over aloop and they make note here. Be sure to know that this is not an SQL command, but a psqlcommand. The next tip is that you can use session variables, so you can set under a particularscope a variable to enter a value. And they say that the data types aren't enforced, so you couldset it to an integer or a text value and then you can do show that variable to actually print outwhat it is. So you can use current setting to get that setting and use it in an SQL query. And theysay something might change in version 13 with something called schema variables, but as ofright now, this is something that works in current versions. The next tip is that in addition to PGSleep that sleeps for a certain number of seconds, you specify you can actually use PG SleepFour, which indicates essentially a certain interval. So like five minutes, two days or whatever,you put in a more human friendly version. Or you could do PG Sleep until a specific time, sothey're using Tomorrow at 03:00 A.m., or you could do a specific date and time. So again, theseare functions I was not aware of. The next tip is that you can set Planner constants per function.So for example, for a particular function you can alter that function and then specify, for example,you want to enable the sequential scan to off. Now normally people use SQL statements, sothey don't have a lot of functions, but this would be a way that you could change how a query ispotentially executed, or you can alter the amount of work memory it has or set a particularstatement timeout so I could see some use cases where this could be useful. The next tip isDuplicating databases for testing purposes. So probably you mostly, as they say, here, do PGdumps to backup and then restore elsewhere. But if you want to do it within the same cluster,you can use the Create database command and do it from a template. The thing to watch out foris that it actually locks up the source database while it's doing this. So it's something definitelyyou need to be aware of and no connection should be active. But this is another alternative forduplicating a database that should be much faster than pgdump. The next tip is debian specificand I believe the next two are or also Ubuntu is that you can define cluster and initializationdefaults. So there actually is a cluster configuration at this location, etc. PostgreSQL commoncreatecluster.com and it defines all sorts of configuration functions for setting up clusters. So thiscould be really useful to be able to alter this if you want to change how clusters get created. Thenext tip is that there are sample configuration files located in this location for debian systems andthis location for Red hat based systems. So if you've altered your configuration files and youwant what the original one was, you can consult these directories. The next set of tips is withregard to randomly sampling rows and they're using the Table sample SQL command. So youcan use Table sample system and zero one for a zero 1% sample of rows. It's pretty fast. Or ifyou want more randomness, albeit slower, you can use a Bernoulli as they demonstrate here,and they give a few other examples of how you can use Table Sample. And last but not least,the tip is you can save characters when doing a not null and instead of doing is not null, you canactually just do not null. So again, yet another tip I was not familiar with. So I'd say thi --s is definitely a blog post you're going to want to keep in reference to be able to use some ofthese tips and tricks. A great blog post to check out the next post is Faster Ruby APIs withPostgres and this is from Goyabada blog. Now this is talking about Rails in Ruby. However thisis applicable what the technique they're using is applicable for any application framework. Andwhat they're basically saying is that for API purposes, that is, returning JSON, you can do aselect of the database, pull all that data into Ruby, then build objects with it as the Orm for activerecord would do, and then format that output as JSON. An alternative is do a SQL query directlyto Postgres and then have it return JSON to you. So they do this for the Ruby on Rails casewhere they're using the JSON build object function to be able to build out the JSON that theywant along with the JSON aggregate function. And it essentially returns the data that you wouldexpect and it essentially can return the same data that you could using ruby on Rails or anyother application framework, and it does it much more efficiently. So they have some graphsdown here that show essentially it's about tenfold faster using this technique compared to goingthrough essentially the Orm and having Ruby build the JSON. And this essentially holds truewhen looking at single tables or a join table or even increasing the number of elements. It's stillabout ten times faster. Now, other frameworks that may be faster than Ruby probably won't beten times as fast, but still it's going to be relatively significant. Now this may involve morecomplex SQL queries, but if you have an API that requires high performance, then queryingpostgres directly and returning JSON from it and then just passing it on through the frameworkshould be the most efficient option. So if you're interested in using this technique, definitelycheck out this blog post. The next post is Two Edge cases in PostgreSQL. Full text search. Sothey're using Full Text Search and they have a TS vector column, this TSV that they've builtalong with their table. So if they have a title and they want to query it, they went ahead and builta two TS vector value here. And they do it using a trigger function. And they were using theEnglish catalog. And as you can see, it reduces things like painting to paint, roofing to roof,heating and cooling to cool in heat. And that's fine. And you can query like Roof and it will findthe record. So whatever you're searching, you do it to a TS query. However, it won't find roofing.Now, their solution to this was to actually use additional catalogs, and in this case they could usea simple catalog. So what they did, they changed their function that actually concatenates theresults of two different catalogs, the English catalog and the simple catalog. And thoseconversions converted painting into paint and painting roofing into roof and roofing. And nowwhen you do a query, you get the one record expected. So if you're not getting necessarily whatyou expect with Full Text Search, maybe consult using a different PG catalog to see if that cangive you the results you're looking for. And this blog post is from Jetrockets Pro.

[00:08:34] The next post. How to simplify PG Bouncer Monitoring with PG Bouncer FDW. AndFDW stands for foreign data wrapper. This is from Crunchydata.com. So this is a new tool,essentially a foreign data wrapper that allows you to query and monitor PG Bouncer because bydefault PG Bouncer, you get details on its running by actually connecting to it with a specificadministrator or operator user, and then you show commands. So for example, here it's showingthe different pools to get the output. What they wanted to do is actually be able to do SQLqueries to query the different metrics of PG Bouncer to be able to easily monitor it. So they setup essentially a foreign data wrapper. They're using the DB Link foreign data wrapper and theygo through the process of doing it. Then they're adding foreign server, doing all the granting forthe selects. And now you could do something as select all from PG Bouncer underscore Poolsand it returns all the same data. So basically you're querying your main database and it goesand talks to the PG Bouncer to get its metrics for monitoring purposes. And they also mentionedthat they've incorporated this into their PG Monitor open source monitoring utility. So this seemslike a really great addition. I really lik --e this and if you are interested in learning more, definitely check out this blog post.

[00:10:04] The next post is effectively using materialized views in Ruby on Rails and this is fromPganalyze.com. Now again, this is a general subject about views and materialized views. It isspecific to Ruby on Rails, but you can learn more about views and materialized views if youwant to consult this post. Basically they cover what is a view, essentially a virtualized table and ithelps minimize complexity. So if you have a very complex query, you can put it into a view and itwill do that query real time. Alternatively, you can use a materialized view which actually createsan underlying table of all of this data. It has materialized this virtual view, so essentially it'sgetting a one time snapshot of it. Now as a consequence, you have to refresh that snapshotperiodically. So essentially it's like a cache. So it's like a cache for your query. So they do needto be refreshed periodically. But if you want to learn more about views and materialized viewsand how you can use them specifically with a Ruby on Rails or a Ruby application framework,definitely a blog post to check out.

[00:11:16] The next post is actually a YouTube video and it's why developers love postgres. Now,this was presented at Microsoft Ignite that happened recently and this presentation is about 38minutes in length and it covers for about the first 15 to 20 some of the reasons that developersare loving Postgres. The latter half talks about how an organization switched over from Oracle toPostgres and the benefits that they garnered from it. So if you're interested in viewing thispresentation, go ahead and check it out.

[00:11:50] The next post is Webinar BDR latest features and updates follow up. And this is fromSecondquadron.com and this is a webinar that you can register for. There's a link right here tolearn more about their bi directional replication product. I haven't had the opportunity to look at it,but if you are interested in learning more about bi directional replication, definitely check out thiswebinar.

[00:12:14] And the last post is waiting for PostgreSQL 13 Alter Table Alter Column DropExpression and they're mentioning here that with Postgres Twelve generated columns featurethat allows you to do, for example, calculations based on other columns. It's a generatedcolumn. There's actually no way to remove it once you add it to a table. So this new commandactually enables you to do that. So if you want to learn more about it, definitely check out thisblog post that does it. For this episode of Scaling Postgres, you can get links to all the contentmentioned in the show notes. Be sure to head over to Scalingposgriz.com, where you can signup to receive weekly notifications of each episode, or you can subscribe via YouTube or itunes.Thanks. --

episode_image