
PgBouncer SCRAM, Art of PostgreSQL, Distributed Time-Series, Window Functions | Scaling Postgres 79


In this episode of Scaling Postgres, we discuss PgBouncer SCRAM support, The Art of PostgreSQL, making Postgres a distributed time-series DB and window functions.


Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about PgBouncer, The Art of PostgreSQL, distributed time series and window functions. I'm Creston Jameson, and this is Scaling Postgres, episode 79.

[00:00:22] Alright, I hope you're having a great week. Our first piece of content is that PgBouncer 1.11.0 has been released, and the major new feature is support for SCRAM authentication. So this is pretty big. Previously it only supported MD5, of course, and a few other methods, but it finally has SCRAM authentication. The changelog here talks about adding SCRAM authentication; handling auth_type=password when the stored password is MD5, like a PostgreSQL server would; adding an option log_stats to disable printing stats to the log; adding a timezone to log timestamps; and putting the PID in brackets in the log prefix. It also lists a number of fixes, even one for PostgreSQL 12. So definitely a great set of enhancements, particularly now that SCRAM authentication is offered for PgBouncer. So that's great.
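
As a side note for anyone trying this out, here is a minimal sketch of the server-side piece, assuming PostgreSQL 10 or later and PgBouncer 1.11 configured with auth_type = scram-sha-256; the role name and password are just placeholders, not anything from the announcement.

```sql
-- Store the role's password as SCRAM-SHA-256 so SCRAM authentication can work
-- end to end (role name and password are placeholders).
SET password_encryption = 'scram-sha-256';
ALTER ROLE app_user WITH PASSWORD 'change-me';

-- Verify how the password is stored; it should begin with 'SCRAM-SHA-256$'.
SELECT rolname, left(rolpassword, 14) AS hash_prefix
FROM pg_authid
WHERE rolname = 'app_user';
```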

[00:01:17] The next piece of content is that a book, The Art of PostgreSQL, has been released, with the subtitle "Turn thousands of lines of code into simple queries". Now, I'm not too familiar with this book, but it's another piece of educational content. So if learning more about PostgreSQL this way seems interesting to you, check out the website, and it looks like you can download a free chapter to see if it's something you'd like.

[00:01:43] The next post is "Building a distributed time-series database on PostgreSQL", and this is from the Timescale.com blog. They make TimescaleDB, which is a database extension for PostgreSQL focused on time-series data, and they're introducing a distributed version. It enables you to distribute the data across multiple database systems. So where before you could essentially only scale up, now you can scale out to multiple systems, and it looks like this is something in private beta right now. This post is an announcement about why they're doing it. They're not necessarily doing sharding but rather chunking, because they operate at the level of what they call chunks, and that lets them do some things differently than sharding typically would, as they explain here. You can have more chunks of data being managed in a scale-up scenario, elasticity in being able to move chunks around, partitioning flexibility in terms of changing the chunk sizes or partitioning dimensions, and deleting chunks older than a threshold. So maybe they're kind of like partitions, but not really; they're their own thing. They also show some benchmarks: with a single node, the insert performance for the cluster was 2.7 million metrics processed per second, and with eight data nodes plus one access node that routes to those data nodes, they get up to 12.8 million metrics per second. So if you work with a lot of time-series data and you're looking to scale beyond a single instance, this may be something you'll want to take a look at. It's a pretty long post that goes into their rationale for structuring it the way they did and how they feel it's different from traditional sharding. So definitely a blog post to check out.
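
To give a feel for what a chunk is, here is a minimal single-node TimescaleDB sketch; the table and column names are illustrative, and since the multi-node setup described in the post was still in private beta, this only shows the single-node side of chunking.

```sql
-- Assumes the timescaledb extension is available; names are illustrative.
CREATE EXTENSION IF NOT EXISTS timescaledb;

CREATE TABLE metrics (
    time   timestamptz      NOT NULL,
    device int              NOT NULL,
    value  double precision
);

-- Turn the plain table into a hypertable: TimescaleDB then manages it as
-- time-based chunks rather than classic shards.
SELECT create_hypertable('metrics', 'time', chunk_time_interval => INTERVAL '1 day');
```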

[00:03:32] The next post is actually a webinar, "Webinar: Business Intelligence with Window Functions in PostgreSQL - Follow Up". You can register for the webinar right here, or look for the recording right here, and it basically covers window functions. I haven't had the opportunity to watch the whole thing yet due to other time constraints this week, but I'm always interested in learning more about window functions because they're the kind of thing I don't use on a daily basis, and anything that keeps me fresh on different capabilities for analyzing data I like to keep up on. So if you'd like to do that, definitely a webinar to check out.
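
If you want a quick refresher before watching, here is a small example of the kind of thing window functions do; the orders table and its columns are made up purely for illustration.

```sql
-- For each order, show a per-customer running total and the customer's average,
-- without collapsing rows the way GROUP BY would.
SELECT customer_id,
       order_date,
       amount,
       sum(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS running_total,
       avg(amount) OVER (PARTITION BY customer_id)                     AS customer_avg
FROM orders
ORDER BY customer_id, order_date;
```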

[00:04:10] The next post is "Postgres Table Partitioning", and this is from Enterprisedb.com. It describes the partitioning feature of PostgreSQL, talking about range partitioning, list partitioning and the newly added hash partitioning, and in which scenarios you would use each. Range is typically for dates; list is for when you have a set of known values you can specify; and hash uses a hash function on a wide variety of data to break it into roughly equal buckets. They go through how, with declarative partitioning, you create the base table and then create each partition, along with inserting data. They talk about the default partition, then the performance benefit of partition exclusion when running queries, and they give some advice on when to use partitioned tables. So if you're looking into partitioning, definitely a blog post to check out. There's a quick sketch of declarative range partitioning after this segment if you want to see the syntax.

The next post, again from the Enterprisedb.com blog, is "Pgpool vs PgBouncer". This takes a very rational look at each of these products. PgBouncer is marketed as a lightweight connection pooler, and that's pretty much what it does. They say here, quote, "PgBouncer executes pooling correctly out of the box, whereas Pgpool-II requires fine tuning of certain parameters for ideal performance and functionality". So Pgpool-II has a lot more features compared to PgBouncer, but it requires more configuration. So if you're looking for this kind of evaluation, or potentially want to switch from one connection pooling solution to another, definitely a blog post to check out.

The next post is "Tips for Postgres from a Postgres Insider". These are a set of nine tips, and they're not all specifically about PostgreSQL itself; some are tangential to it. In other words, they're things that impact the database but aren't necessarily in the database. The first thing mentioned here is SMART tools for assessing hardware. That's not a PostgreSQL configuration, but if you have direct access to the hardware, it's something you'll want in order to monitor the health of your storage, as they say here. They also mention pg_test_fsync to determine which fsync method is fastest for your operating system and hardware. They talk about the backend flowchart, which actually communicates how data flows through PostgreSQL, if you want some extra knowledge in that area. And they go over some others, such as command-line control, GUC (grand unified configuration) levels, where you can set configuration at different levels, how to set work_mem optimally, transactional DDL, which is one of Postgres's more interesting capabilities as they mention here, the advent of virtual columns in PostgreSQL 12, and of course time zones. So if you want to get more in depth into some of these tips, definitely check out this blog post.
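
Here is that minimal declarative range-partitioning sketch mentioned above. It isn't taken from the EDB post, and the table names are illustrative, but it shows the base table, a monthly partition, a default partition, and a query that benefits from partition exclusion.

```sql
-- Parent table partitioned by a timestamp range (DEFAULT partitions need PostgreSQL 11+).
CREATE TABLE events (
    id         bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

-- One partition per month, plus a default partition for anything outside the ranges.
CREATE TABLE events_2019_08 PARTITION OF events
    FOR VALUES FROM ('2019-08-01') TO ('2019-09-01');
CREATE TABLE events_default PARTITION OF events DEFAULT;

-- Queries that filter on the partition key can skip irrelevant partitions
-- (partition exclusion / pruning), which is the performance benefit discussed.
EXPLAIN SELECT count(*)
FROM events
WHERE created_at >= '2019-08-01' AND created_at < '2019-09-01';
```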

[00:07:04] The next piece of content is actually a YouTube video called "What's New in Postgres 12?". This is from a webinar, and they basically outline some of the new upcoming features for 12, including the partitioning improvements, B-tree index improvements, most-common-value statistics, the fact that it inlines many CTE queries by default now, prepared plan control, just-in-time compilation, checksum control and REINDEX CONCURRENTLY. So if you want to learn more about the Postgres 12 features that are coming, definitely a webinar to check out.
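
A couple of those PostgreSQL 12 changes are easy to show in isolation; the table and index names below are illustrative, not from the webinar.

```sql
-- In PostgreSQL 12, simple CTEs are inlined into the outer query by default.
-- Adding MATERIALIZED keeps the old "optimization fence" behavior when you want it.
WITH recent AS MATERIALIZED (
    SELECT * FROM events WHERE created_at > now() - interval '1 day'
)
SELECT count(*) FROM recent;

-- Also new in 12: rebuild an index without taking locks that block writes.
REINDEX INDEX CONCURRENTLY events_created_at_idx;
```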

[00:07:38] The next post is "pgBackRest to backup PostgreSQL", and it's exactly what it says: it explains the pgBackRest solution, how to get it installed, and how to configure it for doing backups of your system. So if you're interested in moving to pgBackRest or evaluating it, definitely a blog post to check out.

[00:07:57] The next post is "Computing day working hours in PL/pgSQL". This is a relatively simple thing to do: basically, calculate the number of working days in a particular month. He actually designed a custom function in PL/pgSQL for it, so if you want to get more experience writing PL/pgSQL functions, here's a blog post you can check out. Related to that, the next post is "Why you need plpgsql_check (if you write procedures in PL/pgSQL)". What this does is run validity checks on the code you're writing that the built-in tooling does not have, and it even has a profiler as well as some performance warnings. So it looks like a great tool for PL/pgSQL development. Definitely check out this blog post if you tend to do a lot of that, because maybe you'd like to use this tool to make your job easier.
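
To give a sense of what such a function could look like, here is a minimal PL/pgSQL sketch, not the author's function, that counts the weekdays (Monday through Friday) in the month containing a given date; the final comment shows how plpgsql_check could be pointed at it, assuming that extension is installed.

```sql
CREATE OR REPLACE FUNCTION working_days_in_month(p_day date)
RETURNS integer
LANGUAGE plpgsql
AS $$
DECLARE
    v_start date := date_trunc('month', p_day)::date;
    v_end   date := (date_trunc('month', p_day) + interval '1 month')::date;
    v_count integer;
BEGIN
    -- Generate every day of the month and keep Monday (1) through Friday (5).
    SELECT count(*)
      INTO v_count
      FROM generate_series(v_start, v_end - 1, interval '1 day') AS d
     WHERE extract(isodow FROM d) < 6;
    RETURN v_count;
END;
$$;

-- Usage: SELECT working_days_in_month('2019-08-15');  -- weekdays in August 2019
-- With the plpgsql_check extension installed, you could then run:
--   SELECT * FROM plpgsql_check_function('working_days_in_month(date)');
```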

[00:08:55] The last post is "Waiting for PostGIS 3: GEOS 3.8". Basically, what they're talking about here is that PostGIS version 3 is coming, and a lot of what it does is link to specialized libraries, as they say here, to handle particular problems. One of those is GEOS, used for computational geometry, and there have been some issues. They talk about some functions that are backed by GEOS, including the ones listed here, and they've had some problems where the library sometimes throws errors, but they're working on improvements. With PostGIS 3, apparently you'll really see the benefits with GEOS 3.8 in terms of it being more reliable. So if you use PostGIS, definitely something to look forward to with the upcoming PostGIS 3.

[00:09:42] That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to ScalingPostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or iTunes. Thanks.
