background

PGCon, Tips, PostGIS, Strings | Scaling Postgres 67

Join Over 1,000 Engineers & Get New Episodes Weekly!

In this episode of Scaling Postgres, we discuss videos from PGCon, Postgres tips & tricks, PostGIS parallel performance and using strings.

Content Discussed

YouTube Video

Podcast Audio

Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about Pgcon Tips, postgres and strings.I'm Kristen Jameson and this is scaling postgres episode 67.

[00:00:20] All right, I hope you're having a great week. So our first set of content is actually a lotof videos have been posted to the Pgcon YouTube channel. So I believe this is Pgcon thathappened in Ottawa, Canada, and it appears there have been 33 or so videos that have beenuploaded so far. So definitely a lot of content this week. I haven't had the opportunity to look atany of these yet, but I definitely will be doing that over the next number of weeks. So definitelyplenty of content. If you're interested in postgres with this channel, the next set of content iseven more videos. So there was apparently an event called Percona live in Austin, Texas. Sothey have posted a number of videos.

[00:01:10] Some of the more recent ones aren't necessarily PostgreSQL related, but a fairnumber of the ones in and around this event have been posted. I have watched Zheep, the NextGeneration storage engine for Postgres because I'm super interested in that because I think thathas huge performance benefits, potentially with the pluggable storage architecture and usingZheep kind of using an undo kind of the way Oracle does it. So I'm definitely keeping track ofthat. They say that they plan to release it around PostgreSQL 13, or that's the objective forversion one and then a version two with additional enhancements following on, but definitelyinterested in keeping track of that. But some more videos for you to check out.

[00:01:55] The next post is actually a third presentation that has a YouTube video with it. So thisWas Put On by EDB Postgres by Bruce Momgian, and it's titled Exploring Postgres Tips andTricks. So we cover six different areas and has 15 different points of reference. So basicallyhardware things to take into account there, the internals of how PostgreSQL works, somedifferent things to do in terms of monitoring, some configuration considerations, how to usesome particular features of SQL, and of course, talking a little bit about clients and of coursepsql. So this is the presentation and he basically just has references that go back as far as 2009,it appears, talking about the different issues. And then the video with the presentation occurshere. It's only about 35 or 37 minutes of content. So if you kind of want a little bit of a deep diveinto some tips and tricks with Postgres in terms of its operation, definitely an interesting video tocheck out. The next post is parallel PostGIS and pgSQL twelve. Now this post is a follow on toone that was mentioned last week, and he's talking about the automagical parallelization ofmany common spatial queries. And in this post he actually does some performance checks. Sohe, as he states here, acquired a 16 core machine on AWS, an M Xlrge, and installed thedevelopment snapshots of PostgreSQL and PostGIS. So versions twelve and three, heconfigured the workers to use 16. In all cases, normally you would put the gather a little bitlower, but I guess he definitely wanted to get all of the cores being used for particular queries hewas testing. And he shows some of the performance for the different summarizations of twodifferent tables with a number of different polygons and points. And you can see that it definitelyimproves in performance as it goes down, but it definitely trails off pretty quickly. Like, forexample, he has this representation of scan time versus the number of workers used. And whenhe was using one worker, it took 318 going down to four. Essentially the four workers drops it bya third, but then when he goes from four to eight, it drops it by less than half. So you definitelyhave diminishing returns here. So it's not just throwing cores on, it will magically makeeverything super, super fast in parallel, at least with these development versions that he's using,maybe there's more optimization that needs to be done. And then looking at Join performance,he saw about the same thing. You do have some quick benefits, some quick gains going fromsay, one to four and even a little bit down to six, but then once you hit eight, it's pretty muchdiminishing returns. You don't get double the performance for doubling the workers, clearly,because it goes from 5 seconds down to 4 seconds. So you do get a benefit, but it's not verymuch. And of course his conclusions are there's a limit t --o how much advantage adding workers to a plan will gain you. And also the limit feelsintuitively lower than I expected, giving the CPU intensity of the workloads. So if you'reinterested in using postgas and want to look at some performance benchmarks, definitely a blogpost to check out.

[00:05:18] The next post is Beautifulthings Strings. And this is from second quadrantpostgresql.com. And this is a post all about using strings with PostgreSQL. So they talk offtalking about the different character types, character varying, essentially varicare character aswell as text, and how most people usually just use text, unless you're wanting to actually limithow much you want a text string to contain. And then they go into pattern matching using likeand I like as well as similar to and showing you all the different ways you can manipulate textthan going into regular expressions, as well as using specific functions for working with stringssuch as substring, regex, replace and some others. So if you want to learn a bit more about howPostgreSQL uses strings, definitely a blog post to check out.

[00:06:12] The next post is the contributors team. And this is from the Robert Haas blog and in ithe's talking about quote, recently the PostgreSQL project spun up a contributors team whosemission is to ensure that the PostgreSQL contributors list is up to date and fair. So this isbasically information about the management of the PostgreSQL project and how they're wantingto get contributors list more up to date and fair, representing everybody who helps contribute toPostgreSQL. So if you want to get insight into how this new contributors list is going to be set upand working, as well as insight to how the PostgreSQL project is managed, definitely a blog postI encourage you to check out.

[00:06:55] The next post is Indexes in PostgreSQL nine Brin. So this is from Haber.com andagain, I believe the original post, as we've seen some of these from Haber, is from thePostgrespro Ru site, which is in Russian. And we've talked about numerous different posts onIndexes and this one talks about the Brin index. So as we've discussed before, this is great ifyou have basically the physical layout of the data on the disk closely matches how you want toquery it. So if it's by primary key and you query by primary key, a Brin index could get you a lot ofstorage savings, how it's constructed. So if you're wanting to get super in depth with exactly howthe Britain index works and how it's set up and used, definitely a very comprehensive blog postto check out.

[00:07:46] The next post is PostgreSQL incremental backup and point in time recovery. And thisis from the PG IO blog and they're talking about how to get an incremental backup and point intime recovery set up. So if you're still just doing a logical backup with pgdump, this blog postbasically walks you through how to set up the wall archiving and doing a backup and a point intime recovery. Now, this is a test text based version, but I did do a tutorial a little while ago thattalks about PostgreSQL backup and point in time recovery through a video tutorial. So if youwanted to check out this tutorial, I'll include a link in the description. It's basically in the tutorialsection of Scaling Postgres.

[00:08:31] The next post is how to optimize PostgreSQL logical replication. And this is from theseveral nines.com blog. So basically they go over how logical replication works. It basicallylogically replicates one table to another table to another database system, and they talk aboutsome of the different configuration parameters you can adjust and how to tune them. So it's apretty short post, but if you want to get logical replication set up, definitely a blog post to checkout.

[00:08:58] The last post is power. Four changes in power. Archivist So this is from Arjuju GitHubIO and this post is the next in the series of Power Four beta posts and he's talking aboutchanges done in the Power archivist. So if you use Power or if you want to potentially considerusing it, definitely a blog post to check out to see what's coming in version four.

[00:09:25] That does it for this episode. Of scaling postgres. You can get links to all the contentmentioned in the show notes. Be sure to head over to Scalingposgres.com, where you can sign

episode_image