background

Sharding, Giving Back, SCRAM, Partitioning Evolution | Scaling Postgres 36

Join Over 1,000 Engineers & Get New Episodes Weekly!

In this episode of Scaling Postgres, we review articles covering PG11 sharding, giving back, SCRAM authentication and the evolution of partitioning.

Content Discussed

YouTube Video

Podcast Audio

Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about sharding, giving back scram andpartitioning evolution. I'm creston. Jameson and this is scaling postgres episode 36.

[00:00:21] Alright, I hope everyone is doing well. So this week, our first article is why Citus Datais donating 1% equity to PostgreSQL organizations. This is from the Citusdata.com blog and itbasically is exactly as they've described to contribute to PostgreSQL. Citus Data has contributed1% of its stock to the project and specifically it's the PostgreSQL Association, which is in the US.And that will work with PostgreSQL Europe to support the growth, education and futureinnovation of postgres in both the US and Europe. So again, if you're not familiar, citus Data asan extension to Postgres for Sharding your database and they also offer it as a service. And Idefinitely encourage you to read this blog post because it's really great and I really like some ofthe things that they say here. And I found this part interesting.

[00:01:13] PostgreSQL is the world's only independent database and they talk about it'ssignificant because it's unique and that no single company backs or represents the Postgresproject. So it's essentially a community, a coalition of companies, organizations and individualsthat have made and will continue to make PostgreSQL the database that it is. I definitelyencourage you to check out this blog post and I really find it interesting the investment that CitusData has made here.

[00:01:45] The next post is sharding your data with PostgreSQL eleven. And this is from the PGIO blog. Now, as we've discussed in previous episodes of Scaling Postgres, one of the big timefeatures of PostgreSQL Eleven is all the different partitioning enhancements that have beenadded. And one of the ones that hasn't quite been mentioned in a lot of the posts is a change forforeign data wrappers that kind of lets you start setting up your own sharding system, if that's ofinterest to you, and they walk through how to do it in this blog post. Basically you're going to beusing a declarative partitioning introduced in version Ten. So in their example here, they createda temperatures table and partitioned it by a range of dates and created some partitions for it byyear. So they did 2017, 2018, and this is just your standard partitioning. But the interesting partis getting into the foreign data wrapper. So they create the extension Postgres foreign datawrapper FDW, and they essentially set up a second server system they're calling box Two. Soit's a separate server instance. And they also map a user for that second server instance. Onthat second box, they're going to create a temperatures table for 2016. So the primary databaseserver, as it will, has 2017 2018. So they create 2016 on their separate instance. Then withintheir primary, they create that foreign table and make it a partition table and they indicate it's onthe second server. And now you can do inserts into temperatures with the old date on theprimary server and it will write that data to the second server. So it's communicating over thenetwork to store the data for this partition child table on that second server. And they say here aquote being able to insert rows into a remote partition is new in version eleven. With this featureyou can now have your data sharded logically via partitions and physically by the foreign datawrapper. So again, Sharding is not something that you typically hear about as a mainstayfeature of PostgreSQL, but we've been slowly moving towards it and this kind of puts someadditional pieces in play where you could potentially start using these techniques now. But witheach release we're going to get more features that make Sharding a reality for more use cases.So if you're potentially interested in Sharding, this is definitely a blog post to check out.

[00:04:14] The next post is how to set up Scramshaw 256 authentication in PostgreSQL. Andthis is from the Hacksoclock Blogspot.com blog and basically most everybody's using MD Fiveauthentication. However, Scram support was added in PostgreSQL ten and basically it's muchmore secure and you should look to move to Scram as soon as you're able. And one of the maindelays you might get here, some libraries may not support this level of authentication yet. Sobefore you choose to go this route, definitely confirm that all of your libraries that need tocommunicate with PostgreSQL support th --is type of authentication. And he just goes over the steps, how to set up and the process torun through and it was all very clear to me. So if you're wanting to increase your security byusing Scram, definitely a blog post to check out the next post is PostgreSQL eleven partitioningevolution from postgres 9.6 to eleven and basically table partitioning has existed for a long timein postgres and it wasn't until ten that they added declarative partitioning and then much morefeatures in eleven. And I keep saying this in different episodes and talking about blog posts, butthis kind of lays out a feature comparison you may see for a product that's on sale and you couldsee the significant enhancements that have been added via eleven compared to ten. And evenin 9.6 had hardly anything, only a little bit of optimizer partition elimination. So again, thisreiterates it in a very visible form, the number of enhancements that have been added inPostgreSQL eleven with regard to partitioning. And the other thing they talk about here is theperformance. Now here they're loading partition data, 10 million rows worth. With 100 partitionswe could see a dramatic increase in speed from version 9.6 to ten and eleven, so really bigimprovements.

[00:06:15] And here they're looking up a single record and performing some DML statementsand you can see going from 9.6 to ten to eleven. Again, dramatic improvements in selectsinserts. It looks in terms of updates we lost a little bit with Ten but have gained it back in elevenand then Deletes are about the same. But definitely it talks a lot about partitioning and how ifyou're not using it, PostgreSQL eleven is definitely a version to start trying it out if you think it'llhelp you scale your database better.

[00:06:45] The next post is actually the YouTube channel I've discussed for a few episodes ofScaling Postgres, but it's the Postgres Open SV 2018 YouTube channel. I've actually gonethrough a few of these videos and I want to kind of mention the top one I've seen thus far as wellas a runner up, I would say. So out of the ones I've seen thus far, the one that I would give kindof the top honors is Dude, where's my byte? And basically he traces where is your data storedactually in postgres when it stores it. And he talks about where on the file system looking in thedifferent directories and the base directories, what's stored there. And he talks about how thatdata is there and how it gets there. Like when you create a database, where is that actuallycreated? When you create a table, where is that in the file system? And then he talks about theimportance of when you insert data, why does it take a while to get there. And he talks about thedifferent processes in terms of background writing and checkpoints and wall. And basically hegoes through a lot of the rationale and the engineering behind PostgreSQL. So it is a bit of aWhirlwind tour but it gives you a good overview on kind of how postgres works and kind of why itworks the way it does. So out of the videos I've seen thus far, it's not specifically related toscaling, but I would say this is one of the better presentations I've seen thus far from thatconference. And what I'll do is over the next few weeks as I've watched more of these videos, I'lldefinitely share what I think are some of the top videos from that YouTube channel. Now, therunner up is the evolution of postgres high availability. So again, this talk goes from postgresversions prior to, I believe, eight and going up to all the different changes. And essentially whatit's mostly talking about is replication. So at first you had wall shipping where you just take yourwall files and you ship them to another server and restore them there to be able to do real timephysical streaming and then moving into logical replication and logical decoding of the wall filesand ending up talking about bi directional replication. Now, this is not something that thecommunity PostgreSQL does, but Second Quadrant does have their BDR, their bi directionalreplication product and he talks about that and some of the trade offs in using that. So if you'rewanting to learn more about replication or maybe altering or changing how you're doing? Someof it definitely a presentation to watch. The next post is PostgreSQL locking part Twoheavyweight locks. And this is from the Procona.com blog. And basically this is just an overviewof all sorts of different locks that are availab --le in the system. So he talks about Row sharing locks, exclusive locks, different Share locks,Share Update, exclusive, Access Exclusive, and talk about PG advisory locks and Deadlocks.So this is definitely a great review for learning all about the different locks that are in PostgreSQLthat the system uses as well as you could use to lock certain objects.

[00:09:39] The last post is basically an announcement that Pipeline DB has reached 1.0.0, andthis is a high performance times aggregation for PostgreSQL. And the TLDR is Pipeline DB isnow a postgres extension. So similar to Timescale DB, they've made their time series databaseessentially now a PostgreSQL extension. It used to be a fork, but now it's moved back to anextension, and they classify it as a enabling real time reporting use cases at scale where onlysummary data is needed. And it's now licensed under the liberal Apache 20 license. So if you'repotentially interested in Pipeline DB or a time series database, that's essentially just anextension of PostgreSQL, definitely a blog post to check out.

[00:10:27] That does it. For this episode of Scaling Postgres, you can get links to all the contentmentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can signup to receive weekly notifications of each episode, or you can sign up via YouTube or itunes.Thanks. --

episode_image