background

Parallelism, JIT Compiling, Indexes, Administration | Scaling Postgres 30

Join Over 1,000 Engineers & Get New Episodes Weekly!

In this episode of Scaling Postgres, we review articles covering parallellism in Postgres 11, JIT compiling, indexes in depth and administration.

Content Discussed

YouTube Video

Podcast Audio

Transcript



[00:00:00] In this episode of Scaling Postgres we talk about parallelism in Postgres Eleven JITcompiling indexes and postgres administration. I'm Kristen Jamison, and this is Scaling postgresepisode 30.

[00:00:22] Alright, I hope everyone had a great week. Our first piece of content is Parallelism inPostgreSQL Eleven. This is from Thomas Monroe and it was presented at Postgres Open in SanFrancisco in 2018. So this is a series of slides from a talk he gave and sometimes it's hard todiscern what's going on, but this was a really good presentation. I hope that the video is postedup on YouTube eventually. So he goes into a history of parallelism with Postgres and he has afew slides I'm just going to pull out here I thought were interesting. And this is just part of thehistory portion of his talk. And here he's showing this quote which appears to be from HerbSutter that says the free lunch is over. So basically around 2004 is when we stopped gettingdramatic increases in CPU frequencies. The single thread performance started flattening out.And basically at that point the way to get more processing power was putting more cores in eachCPU. So basically this is the point at which doing parallel operations started to become moreimportant. And then he shows this other slide here called the Parallel Gold Rush and a lot ofdifferent database platforms and how they started offering parallel query execution. And you cansee Postgres has been pretty late to the party in 2016. Now you may think that's not great, butthe reality is that if you have a lot of users connecting to the database, like a typical OLTPtransactional load, each of those processes for each connection can be serviced by differentCPUs or different cores of a machine. So it would only be used in analytical processing whereyou would want to paralyze complex queries. So in terms of the postgres sweet spot OLTP,these kinds of things aren't needed as much. But when you're talking about analytical processingit becomes very important and which is why in previous episodes of Scaling Postgres we'veseen a number of vendors that forked Postgres and developed analytical databases based uponits core. But this is a very technical presentation that goes into a lot of the background ofparallelism and how it works. So if you're interested in that, this is definitely a post to check out. Iwill show one of the later slides are some things that prevent or limit parallelism. And the firstone here is CTE. But to get around it you can rewrite it as a subselect. And he mentioned someother things like full outer joins, no foreign data wrappers currently support parallelism cursorsmax rows queries that writer lock rows functions not marked parallel safe and serializabletransaction isolation. So even with postgres Eleven keeps adding parallelism improvements.There's still more that needs to be done. But again, if you're interested in the parallelism featuresof postgres eleven and what's coming in the future. Definitely presentation to check out the nextpost is parallel PostGIS and pgSQL eleven. This is from the cleverelifent CA blog by PaulRamsay. So he's evaluating the parallelism of postgres and basically he says from 9.6 to tenthere was a possibility of getting parallel plans and he has seen some minor improvements inPostgreSQL eleven by modifying the cost for certain functions. But basically there's still morework to be done. So if you use PostGIS and you're interested in the parallel capabilities of it, thisis definitely a blog post to check out.

[00:04:06] The next post is PostgreSQL eleven and Just In Time compilation of queries. And thisis from the Citusdata.com blog. So basically this post checks some benchmarks on how the newJIT compiler in PostgreSQL eleven can lead to some query improvements. So he used the TPCbenchmark H in terms of report query one, which he shows a representation of it. Here he set upa database and ran some tests against it for PostgreSQL 9.610 and eleven with JIT compilationenabled and we can see the increases in performance for the different versions. And he didmention that he did turn off parallel query execution so that that wouldn't be a factor. Andbasically for this query, PostgreSQL eleven is 30% faster processing this query, which is quitesignificant. And he goes over into the tooling and the setup he used to use these benchmarks.So definitely a pretty thorough blog post. Now in terms of the release notes fo --r PostgreSQL eleven and specifically for the Just in Time compilation, in order to enable JIT, itlooks like you have to compile it from source using the with LLVM flag. So it depends on how youinstall postgres on whether this will be enabled. And if you do it from source you can use this flagto make sure it's enabled. I haven't tested specifically, but my assumption it would be offnormally. So in terms of this blog post you can reference how he set up his tests in order to do it.So again, this is another case where this is probably best for sophisticated analytical queries likethis. So if you think the new JIT compilation feature could potentially improve your performance,definitely a blog post to check out.

[00:05:54] The next post is actually a webinar, and this is webinar database security andPostgreSQL follow up. And this is from the second quadrant, PostgreSQL blog. So they had awebinar about database security and PostgreSQL. And there's a link right here, it says availablehere and it will take you and you have to register, but then at that point you can immediatelywatch the video.

[00:06:17] It's about 45 minutes long. About the first half is a presentation, the latter half isquestions. So basically this presentation covers defense in depth for your PostgreSQL instance.So what kind of controls can you put in place at the network layer at the host layer at theprocess layer on the box. What can you do about the file system? How can you secure accessto the data within the database? For example, using grant revoke permissions, it talked aboutrow level security and how to implement that, as well as general application security, because anapplication that needs to access the data in the database basically goes straight through allthose layers. So I did find it to be a good presentation. So if you're looking to increase thesecurity of your database, definitely a webinar that you should watch.

[00:07:09] The next post is Indexes and Postgres The Long Story or Crocodiles Going to theDentist by Louise Granjonic. So this again is a slideshow presentation and it covers the differentindex types of postgres. What's significant about this is this is probably the most technicalpresentation on how indexes work in postgres I've seen. So there are a few slides that covergeneral advice for a developer. But if you're really wanting to know how postgres indexes workat a granular level, this is probably the best presentation I've seen on it. So if you're interested inthat, definitely a piece of content to check out.

[00:07:51] The next post is Setting Up Streaming Replication in PostgreSQL, and this is from theProcona Database Performance Blog, and in it they basically give you the steps on how to setup replication. It's a relatively short post, but they go through all the details.

[00:08:08] If you want a video format of this, I have done a tutorial showing how to set upPostgreSQL streaming replication, so if you want a little bit different format, you're welcome tocheck out this piece of content as well.

[00:08:22] The next post is actually a PDF presentation and it's called Mastering PostgreSQLAdministration by Bruce Momgen. So I have seen a reference to this presentation before, but it'sa very comprehensive presentation going into all the different aspects of doing PostgreSQLadministration. It's 112 slides worth. If you do PostgreSQL administration and you haven't seenthis presentation yet, definitely a link to check out.

[00:08:52] The last post is announcing Timescale DB 10, the first enterprise ready time seriesdatabase to support full SQL and scale. So I've mentioned Timescale DB in previous episodesof Scaling Postgres, but I thought this announcement that it had reached 1.0 is notable. So ifyou're looking for a Timescale series database that can work as an extension for PostgreSQL,definitely a product and a blog post to check out, that does it. For this episode of ScalingPostgres, you can get links to all the content mentioned in the show Notes. Be sure to head overto Scalingpostgres.com where you can sign up to receive weekly notifications of each episode,or you could subscribe via YouTube or itunes. Thanks. --

episode_image