background

Constraints, User Order, PG Terrible, Logical Upgrades | Scaling Postgres 5

Join Over 1,000 Engineers & Get New Episodes Weekly!

In this episode of Scaling Postgres, we review articles covering database constraints, user defined ordering, how PostgreSQL is terrible, TimeScaleDB and using logical replication for upgrades.

Content Discussed

YouTube Video

Podcast Audio

Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about constraints. User base sortingpostgres is terrible time series databases and zero downtime upgrades. I'm creston jamison.And this is scaling postgres episode five.

[00:00:24] Okay, the first post is Database Constraints in Postgres, the last line of defense, andthis is from the Citusdata.com blog, and it goes over the importance of constraints in terms ofrestricting the type of data you insert into your database. Now, depending on the application thatyou're developing, usually a framework that you're using allows you to place constraints on theapplication to ensure only certain data gets inserted into your database. However, what he'stalking about here are actual database constraints. So these are constraints that you apply onthe tables and columns of your data to ensure that only the right data gets inserted in there.Because if you place your constraints at the application level, there are always other ways youcan using Directsql get in and alter the data. So having constraints on the database is animportant thing to set. And one of the first examples he uses here of a constraint are basicallyunique indexes. Now, he's specifically talking about partial unique indexes, but just having aunique index itself is a type of constraint. But the partial unique index example he uses is sayyou have a user's table and you want the emails within that user's tables to be user table to beunique only when a deleted at column is null. So he is saying a partial unique index is a tool youcan use for this, and then they go over other basic constraints you can set. So for example, youcan set, okay, the price is greater than zero, or the sales price is greater than zero. Soestablishing a classical check constraint on the data being inserted to ensure that prices aregreater than zero, and you can even add other checks, such as the price will always be greaterthan what the sales price is. And they also go into more sophisticated check constraints you canestablish for your database. They also discuss exclusion constraints and these help preventoverlap of particular data sets. And then they talk about simple data types being databaseconstraints. So you could use a text column for everything, but it can be really helpful to use aboolean if all you're looking for is a true false value, or if you need to restrict the length of aparticular column to actually use a fixed size Vercare field. Or there are IP column types you canuse.

[00:03:03] So this is a good article to look at for developers, especially if you are looking to dothings to help increase your data's integrity.

[00:03:13] The next article is User defined order in SQL. And this is from the BegriffsBegriffs.com blog.

[00:03:24] And this blog post discusses a particular use case where you have an ordered list ofitems and the user can move those items and arrange them in any order that they want and howbest to accomplish it from an efficiency standpoint, a robustness standpoint and an elegancestandpoint. And he goes over four different approaches that you can use to support user basedordering. He talks about using an integer method, using a decimal position method, an approachusing true fractions or rational numbers that's supported by a database extension he created, aswell as using that technique true fractions, but doing it as floats. So if this is a particular featureset that your application has user based ordering, you might want to check out this post to see ifimplementing some of these methods make sense to you. Now, his database extension is calledPG Rational. I'm not 100% sold on using it where I'm using this feature in my applications.However, I would encourage you to check out this blog post to see if it's something you might beinterested in looking into.

[00:04:45] The next post is PostgreSQL speeding up analytics and windowing functions. And thisis from the CyberTech Postgresql.com blog. And the subheading can be Improving SQLPerformance or how to Trick the Optimizer. So in this example he sets up an array aggregatewindow function that orders once by ID and then once by ID descending and then orders thefinal result by ID. And here's the explain plan for it. But what's interesting is by simply moving thecolumn order in the select, he is able to eliminate one of the sorting operations in the explainplan. So this is an example of where a simple change can get you a p --erformance boost in your queries. So if you're using window functions that are utilizing multipleorder bys in this way, you might want to check out this blog post to see how you can potentiallymake them more efficient. The next article is actually a presentation that was given at Nordic PGDay in 2018 from Christophe Pettis, I believe from PostgreSQL experts, and the title is Why isPostgreSQL Terrible? So it's always a little bit difficult to interpret presentations from slides.However, this presentation, you can pretty much get the gist of it by reading through it. I hope atsome point they post it on YouTube because it is a really interesting presentation and it basicallydiscusses the pain points that people typically have with PostgreSQL from the perspective ofquery planning, vacuum upgrades, connection management. So this is how many connections tohave to your PostgreSQL server as well as are you using PG Bouncer or not? Or PG Pool, highavailability and of course of special interest to this channel scaling. So this post talks a lot aboutthe problems that PostgreSQL has or that people bring up that it has. And it doesn't necessarilycome up with exact solutions, but a general direction in where the community generatedPostgreSQL should follow. So if you use PostgreSQL for any purpose, I highly suggest youcheck out this presentation because it gives you a good perspective on not only wherePostgreSQL is right now, but how it can be improved for the future. The next article is actually aYouTube video and its name is time series database lectures number six Mike Friedman,Timescale DB.

[00:07:37] And this is from the channel. The Carnegie Mellon University Database group. Now,this didn't happen this week, this was back in November. However, I just happened upon it whilelooking for other types of content and I had heard a little bit about Time Series databases, so Ithought I would take a look at it. I had seen another Time Series database where basically theyexplained what they were doing and it sounded a lot like they were developing features thatalready existed in PostgreSQL. For example, they were relying on the operating system for a lotof the know inserts happened very fast, whereas updates, they were relatively slow, which istypically what PostgreSQL is like deletion happens via tombstone records, which is kind ofsimilar. They had a process they went through called Compaction, which sounds like vacuumessentially, but they weren't an SQL database. They had an SQL like language, they had writeahead logs, they had indexes in memory, but in the next version they were going to disk, so theyhad to cache them in memory. So a lot of it, it looked like they were replicating the kinds ofthings that PostgreSQL already does. However, this particular presentation is on Timescale DB.Now, this is a PostgreSQL extension, so it uses all the power and the features of PostgreSQL,but then it adds on additional capabilities to support its time oriented feature. Basically a lot ofhigh write rates in time oriented optimizations, as they say here. So if you've heard about TimeSeries DBS and are kind of interested in what they are, and if you're using PostgreSQL, Idefinitely suggest checking out this presentation because if I were ever to need these types offeatures, I would definitely reach out for this particular extension as opposed to changing theentire database that I'm using. But it gives you a good overview about the direction they're takingin terms of this extension.

[00:09:48] The next article is an overview of logical replication in PostgreSQL, and this is fromthe Several nines.com blog and it's exactly what it says. It gives you an overview of logicalreplication in PostgreSQL version ten. Its features, its use cases, limitations, and then breaksout how to actually use it by setting it up with the commands here. So if you're thinking aboutlogical replication, definitely check out this blog post. Now, related to that, there is another blogpost on the second Quadrant.com blog, and it's near zero downtime. Automated upgrades ofPostgreSQL clusters in the cloud. And this is actually part one. And he goes over a presentationthat he did discussing exactly what the title near Zero downtime Automated Upgrades ofPostgreSQL Clusters in the Cloud. And they say automated because they were using Ansible todo this upgrade. So the blog post goes over some of the options for doing the upgrades andthen --goes more into depth about using logical replication for upgrades. But the actual presentation,the slides have been posted at this link that will be in the links in the show notes and thetechnique that they're using for a nine five to nine six upgrade actually uses PG logical, which tomy understanding is the precursor to the logical replication feature in version ten. And essentiallyin this example you had a primary database server with three standbys and Pgbouncer isconnecting to the primary database. When you want to do the upgrade, you take one of thestandbys or an entirely new server and you set up version 9.6 and you do logical replication ofall the data from the primary at 9.5 to the primary at 9.6.

[00:11:44] And then you set up one or more standbys on that version 9.6 using standardphysical replication for example. And then at the point that you need to do the application switch,you simply transfer the Pgbouncer to point to the new 9.6 and then stand up the additionalstandbys that are needed. So if you're considering using logical replication as a way to upgradeyour PostgreSQL database clusters, I definitely suggest checking out this blog post and then thepresentation that it links to.

[00:12:18] The next article is from the Azure Microsoft.com blog and this is announcing generalavailability of Azure database services for MySQL and PostgreSQL. So basically, much likeAmazon, these are now generally available to be used on Azure and there's a presentation herethat goes over the feature set as well as a lot of items mentioned in the blog post. So if you arelooking for a hosting service, Azure is now another possibility for you.

[00:12:50] The next article is using PG underscore Bad Plan to create statistics is a good plan,and this is from Joel on SQL blog post@joelonsql.com. And this is a post related to apresentation that explained how to manually compare the predicted rows a query plan would useversus the actual rows. So if there's too much of a divergence with what the planner thinksshould be there statistically versus what's actually there when it does the query, it will make youaware that these variances exist so that you can potentially use Create statistics. Again, thiswould be a version ten feature in PostgreSQL that would allow you to define dependencies. Sofor example, here they're looking at the dependencies for a zip code table with city, state andcountry. So generally certain zip codes reside in certain cities, states and countries. So if yoususpect that you may have queries or data that could benefit from this, you may want to check itout.

[00:14:00] The next article is PostgreSQL replication slots and this is an in depth tutorial done byme that discusses how to set up PostgreSQL replication slots for physical replication if this issomething you're thinking about implementing, I definitely suggest checking it out.

[00:14:18] That does it. For this episode of Scaling Postgres, you can get all the links to thecontent presented in the show. Notes be sure to head over to Scalingpostgres.com, where youcan sign up to receive weekly notifications of each episode. You can also subscribe via YouTubeor itunes. Thanks. --

episode_image