
Ten Things, Audit Triggers, Schema Design, High Availability | Scaling Postgres 108


In this episode of Scaling Postgres, we discuss ten things to hate about Postgres, audit trigger performance, designing a schema and the future of high availability.


Transcript



[00:00:00] In this episode of Scaling Postgres we talk about ten things, audit triggers, schema design and high availability. I'm Creston Jameson. And this is Scaling Postgres episode 108.

[00:00:21] Alright, I hope you, your coworkers and family are still doing well given the situation that's still with us. But our first piece of content is ten things I hate about PostgreSQL. This is from Rick Branson on Medium, and I'm just going to run through the different things that he hates about PostgreSQL. Now, ultimately he does like PostgreSQL; these are the pain points that he sees and hopefully they will be fixed eventually. Number one is disastrous XID, or transaction ID, wraparound. Basically, you can't accumulate more than about 2 billion transaction IDs' worth of unfrozen history, and vacuum takes care of handling this, but if you hit that 2 billion mark you basically have to take the table offline in order to repair it, and that may mean taking the database offline as well. So it's definitely something you should be monitoring, and I agree it is an unfortunate thing having to monitor for this, particularly for larger databases. And hopefully at some point, as he mentions here, the XIDs will use something like a 64-bit integer, but until then we're pretty much stuck with it. So I definitely have to agree with him on this one.
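
As a side note, if you do want to monitor for this, the age of each database's oldest unfrozen transaction ID is easy to check; a minimal sketch of that kind of query (thresholds and alerting are up to you):

    -- how close each database is to the ~2 billion XID wraparound limit
    SELECT datname,
           age(datfrozenxid) AS xid_age
    FROM pg_database
    ORDER BY xid_age DESC;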

[00:01:35] Two is failover will probably lose data. Now, he's talking about when you're using streaming replication, so this is streaming the WAL files from a primary to a replica, and normally the default way to set that up is asynchronous, so there's a possibility that you're going to lose some data when, say, the primary goes down. Now, you can do synchronous replication, and you can even do it with a quorum commit, where it has to commit on so many replicas, but then you're going to be taking a performance hit, and particularly if you're trying to spread your databases across, say, availability zones in the case of AWS, or multiple regions, you're going to take a further performance hit if you try to do that using synchronous replication. So if you're going to use it, it probably makes sense to do it within the same availability zone or within the same region, and then if you're synchronizing to other regions, do that asynchronously if you're going to attempt it. I'll show a small sketch of what the quorum commit configuration looks like at the end of this discussion. But basically he's saying there's not really a great solution for this. He mentions some other types of clustering solutions you may want to investigate if this is important to you. And we also cover something about bi-directional replication later that has its own set of issues, but that would be another way you could investigate trying to really avoid losing data. We'll cover that in an upcoming piece of content in this episode. So I do have to agree this is an issue that people will have, and there is a performance hit with synchronous replication. I'm not quite sure how to necessarily avoid that. Three is inefficient replication that spreads corruption. Now, I wasn't really quite sure what he was referring to here. He talks about the WAL, and then with a large database they had something that led to hardware-induced data corruption on the master much more easily propagating to replicas. So I was a little confused. Some hardware caused the corruption, so I would blame the hardware as opposed to Postgres. And yes, if you are replicating data from a primary to a replica, any data corruption will probably be passed through to the replica. So I don't really get where he's going with this. And this is with essentially WAL-based replication, doing streaming physical replication; he says logical replication may avoid some of this, but I've never seen what he's mentioning here, if I'm interpreting it correctly. Number four is MVCC garbage frequently painful. Essentially he's talking about vacuum. When you're, say, doing a lot of updates on a certain table, it's actually creating additional rows, and the dead rows essentially have to be vacuumed up. And I definitely have to agree with him on this, because many developers engineer something with a lot of updates, and Postgres can run into vacuum issues when you set it up that way. But he does mention zheap. This is a storage solution that adopts a technique that MySQL and Oracle use, using redo or undo logs. Essentially, using this type of log allows you not to have to create a whole new row; you can do an update in place and indicate that that row has been updated. So if you need to reference what that value was, or do a rollback, it's within the redo or undo logs. So I do have to agree that vacuum can be painful, but there seems to be hope with zheap. So we'll see what the future holds with regard to that.
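
If you want to keep an eye on this on your own tables, the statistics views will show you which tables are accumulating dead rows and when autovacuum last touched them; a minimal sketch:

    -- tables with the most dead rows, and when autovacuum last ran on them
    SELECT relname,
           n_live_tup,
           n_dead_tup,
           last_autovacuum
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC
    LIMIT 10;
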
Next is process per connection equals pain at scale. Essentially, because it's a process for every connection, when you get up to a couple of hundred connections it uses a lot of resources, and basically people look to connection poolers to help mitigate that and to get more connections to the database. For example, PgBouncer is the connection pooler I have experience with. But it would be great if Postgres eventually had a threaded means of supporting more connections out of the box without having to rely on PgBouncer. Because even though PgBouncer is great, it has some of its own things you need to be aware of. For example, it itself is single-threaded, so if you have a multi-CPU box and want to run PgBouncer on it and utilize all the cores or CPUs of the system, then you have to run multiple instances of PgBouncer and funnel your clients through each of those PgBouncers running on different ports. So I have to agree: not great. It would be ideal to have a better solution than what exists currently. Number six is primary key index is a space hog. This is saying that an index exists separate from the physical rows of the table, whereas other database systems do index-organized tables, so the table is organized on disk by the primary key, and if you do it that way you can save some space, particularly if there aren't that many columns in the table. His example is a primary key that covers three columns in a table that only has four columns, which lets you save a lot of space. But how often does that really happen? In my implementations this is very rare, so I'm not really sure how much space this would save. So on this one I'm a little up in the air about agreeing with it or not. Number seven, major version upgrades can require downtime. And he says, I quote, "some major version upgrades require hours of downtime to convert data for a large database". Now, I know with pg_upgrade I've been able to do it in less than a minute, or worst case a few minutes. What is an issue is upgrading any active replicas, because you can just upgrade the primary, but then you need to essentially do a whole restore to get replicas back online, unless you follow a particular procedure to be able to upgrade those as well. So I can see that being an issue, and if there were a resolution for that, that would be great. But actually upgrading the primary database, in my experience, can be done really quickly. Number eight, somewhat cumbersome replication setup. I don't know that I've had difficulty with it. You do need to have experience and kind of know what you're doing, but I don't find it particularly cumbersome to set up replication. Nine is the ridiculous no-planner-hints dogma. So this is the case where you actually can't use planner hints within your SQL code; you always have to rely on whatever decision the planner makes. Now, I do have to agree with him that it would be nice to have this capability, for this particular reason here. He says, "this philosophy seems brutally paternalistic when you're watching a production database spiral into a full meltdown under a sudden and unexpected query plan shift". I've seen this, where you have a production database running just fine, and then something happens with the data, something's changed, and the plan changes, and now suddenly you have poor performance for queries that were just fine before. But you have to do some investigation to figure out what's wrong and what the problem is. If there were a way to do a hint as a temporary workaround, although of course those probably always end up being longer term, I could definitely see the benefit of that.
So I can see the benefits of being able to use planner hints in SQL queries in certain use cases. And lastly, ten is no block compression. He says PostgreSQL does automatically compress some large values, but it isn't useful for the most common ways data is stored in a relational database, and what's important is actually to have block-level compression. Now, I've seen other people use things like he mentions here, ZFS or other file systems that do compression, which enable them to boost their performance as well as significantly reduce the storage space required for the database. So I wouldn't say this is something I hate, but it would be beneficial to have this capability as an option. So it's something to look forward to, but for now you can do file system compression for where your database files reside by running them on a different file system. So overall I would say I agree with more than half of his points as things that you need to be aware of or that can be a pain, and hopefully the PostgreSQL community will be working toward addressing them. So it's definitely a blog post I suggest you check out.
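
And going back to item two for a moment, here's the small sketch of the quorum commit configuration I mentioned; it assumes two standbys whose application_name is set to replica_a and replica_b, which are placeholder names:

    -- run on the primary; the ANY quorum syntax needs PostgreSQL 10 or newer,
    -- and synchronous_commit is already 'on' by default
    ALTER SYSTEM SET synchronous_standby_names = 'ANY 1 (replica_a, replica_b)';
    SELECT pg_reload_conf();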

[00:10:19] The next post is performance differences between normal and generic audit triggers. This is from Cybertec-postgresql.com, and they're referencing a previous post talking about row change auditing for PostgreSQL, where they proposed a couple of different approaches. One way was setting up per-table triggers and log tables for auditing the data being changed. The other was using a generic, common table and having the triggers on each of the logged tables write there. And he wanted to know what the performance differences were, because his interpretation was that the table-specific triggers and audit tables would lead to better performance. So he goes through his testing methodology here, how he set it up, his hardware setup, and here are the TPS and the latencies. And you can see here that the generic solution, which is the one I tend to prefer, was actually 35% faster as opposed to explicit triggers for each table, which surprised him, and I'm surprised as well. Now, he said the disk footprint is larger for the generic triggers, but he does mention in the last paragraph that there's a way you could probably mitigate that and not write as much data, so definitely check that out. Now, one reason you may not want to do it is that it is more difficult to query the generic table versus the explicit tables. But performance-wise he was very impressed by how efficient this generic auditing solution is. So if you're interested in learning more about that, definitely check out this blog post.
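
To give a sense of what the generic approach looks like, here's a minimal sketch of a shared audit table plus a trigger function; the table and column names are my own for illustration, not necessarily what the CYBERTEC post uses, and orders is just a placeholder table:

    CREATE TABLE audit_log (
        id         bigserial   PRIMARY KEY,
        table_name text        NOT NULL,
        action     text        NOT NULL,   -- INSERT, UPDATE or DELETE
        changed_at timestamptz NOT NULL DEFAULT now(),
        row_data   jsonb       NOT NULL
    );

    CREATE OR REPLACE FUNCTION audit_trigger_fn() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'DELETE' THEN
            INSERT INTO audit_log (table_name, action, row_data)
            VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(OLD));
            RETURN OLD;
        ELSE
            INSERT INTO audit_log (table_name, action, row_data)
            VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(NEW));
            RETURN NEW;
        END IF;
    END;
    $$ LANGUAGE plpgsql;

    -- attach the same function to every table you want audited
    -- (EXECUTE FUNCTION needs PostgreSQL 11+; older versions use EXECUTE PROCEDURE)
    CREATE TRIGGER orders_audit
        AFTER INSERT OR UPDATE OR DELETE ON orders
        FOR EACH ROW EXECUTE FUNCTION audit_trigger_fn();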

[00:12:00] The next post is PostgreSQL schema design. This is from Graphile.org, and they're setting up a schema design for a forum. Now, what's interesting about this is that a lot of application frameworks have their own way of setting up the schema and you use their DSL for setting it up, but this goes through and shows you a scenario where you build more in on the database side in terms of data protections and schemas and unique keys and things of that nature, where sometimes the application framework wants to handle that. So I thought this was a good post to look at to get a sense of what's possible with Postgres if you're not as familiar with how you can configure your tables. For example, they're showing a person table here in its own custom schema, and they have a check constraint set on the first name to limit the character length, and on the last name as well. And the created at column actually has a default, which is the function now(). In a lot of frameworks they never apply this type of check constraint to the database; that's a check they do within the application itself, and a lot of times the created at default isn't in the database either, the application framework handles doing it. But if you have it in the database, then any other application accessing the database doesn't have to build this in, because it's already a part of the database. So this is a good post to review to see what kinds of things are possible using the capabilities that exist within your database. They also cover, say you want to use random UUIDs instead of serial, this is how you would set it up; how you can add table documentation; how you can set up custom types; and how you can set up database functions, for example to concatenate a person's first name and last name together, which you could do in the database as opposed to your application. So there are all sorts of different ways of building an application where you're using a lot more database features. So if this is of interest to you, definitely a blog post to check out.
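
To give a flavor of what they're describing, here's a rough sketch of that kind of person table; the schema name, length limits and comment text are from my reading of it, so the tutorial's actual definitions may differ a bit:

    CREATE SCHEMA forum_example;

    CREATE TABLE forum_example.person (
        id         serial PRIMARY KEY,
        first_name text NOT NULL CHECK (char_length(first_name) < 80),
        last_name  text NOT NULL CHECK (char_length(last_name) < 80),
        created_at timestamptz NOT NULL DEFAULT now()
    );

    -- table documentation lives in the database too
    COMMENT ON TABLE forum_example.person IS 'A user of the forum.';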

[00:14:06] The next post is actually a YouTube video, and it's The Next Generation of Postgres High Availability (PostgreSQL HA). This is on the Database Month YouTube channel, and it's presented by Tom Kincaid of 2ndQuadrant. Now, what he's actually talking about is their BDR product, or bi-directional replication, and he perceives this as the next generation of Postgres high availability. He talks about what's current now: you have a primary that is streaming to one or more replicas, you have some means of determining whether that primary is still active, and you set up automated processes to be able to promote a replica once the primary has gone down. So he goes through the process of setting that up. And some of the disadvantages, of course, are that you can run into a split brain situation, and it can take up to a minute or more in order to make that switch. And he argues that if your uptime is super important and you want to be able to switch to another active server, you may want to investigate going master to master, so bi-directional replication. Now, there are of course a number of disadvantages with that, because you can have conflicts: if the same entity is updated on each database, say, the last one wins. Now, you can't determine or alter the decision process for that, but it's still something you have to take into account. But what I found interesting is he mentioned that some customers opt to keep a shadow master. So it's there, it's constantly replicating, but it's not actively being used; it's on the sidelines, as it were. And within seconds you could choose to switch to that one being the master and not the other. Now, the advantage of that is a faster switchover, and second, because you're not actively using both databases, you avoid these conflict possibilities. Now of course, you still have to take into account that if you're doing something like master-master, you need a centralized way of handling ID creation, meaning you can't really use serial columns anymore that have sequences within each database; you need to use a UUID or some other central ID creation method. But it's definitely an interesting perspective. I hadn't considered this situation of a shadow master. So if you're interested in learning more about this and their bi-directional replication product, definitely a video to check out.
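
On that ID creation point, the common approach is a UUID primary key with a default; a minimal sketch, where account is just a placeholder table (gen_random_uuid() is built into PostgreSQL 13 and newer, and before that it comes from the pgcrypto extension):

    CREATE EXTENSION IF NOT EXISTS pgcrypto;  -- only needed before PostgreSQL 13

    CREATE TABLE account (
        id    uuid PRIMARY KEY DEFAULT gen_random_uuid(),
        email text NOT NULL UNIQUE
    );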

[00:16:40] The next post is useful queries for PostgreSQL index maintenance, and this basically runs through different queries that you could use to manage your indexes. Now, it's interesting, they start off pretty basic here, just seeing how many indexes your table has. You can do this with the \d command for a table, or \d+ to find out more information about the table and its indexes, but they're showing a way to get just that information and the system views you would need to query in order to get it. They also show how you can determine if an index is unique.

[00:17:16] They also cover what the size of an index is, what the size of a table's indexes and the total size of all of them together is, what statement built the indexes, determining if there are limitations on what kinds of indexes can be built, looking for unused indexes, as well as looking for duplicate indexes. So if you wanted a few queries that can help you better manage your indexes, definitely a blog post to check out.
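
These won't be the exact queries from the post, but as a sketch of the kind of thing it covers, here is one way to list a table's indexes and to look for ones that are never scanned; orders is a placeholder table name:

    -- the indexes on a table and the statements that define them
    SELECT indexname, indexdef
    FROM pg_indexes
    WHERE tablename = 'orders';

    -- non-unique indexes that have never been scanned, with their size
    SELECT s.relname       AS table_name,
           s.indexrelname  AS index_name,
           pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
    FROM pg_stat_user_indexes s
    JOIN pg_index i ON i.indexrelid = s.indexrelid
    WHERE s.idx_scan = 0
      AND NOT i.indisunique;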

[00:17:41] The next post is PostgreSQL WAL evolution, part one. There is a part one here and a part two here covering the evolution of the write-ahead log: going from 7.1, where it just protects against a crash, version 8 adds point-in-time recovery, version 9 adds replication capabilities, version 10 adds logical replication capabilities, and then of course there's what's going on in the future. And they look at it from the perspective of the different configuration variables and what's possible to set in each one, so they go through each version. The first post here goes through 8.3 and the second one continues on with version 9 and continues up through version 12. So if you're interested in learning more about the history and all the different capabilities of configuring your write-ahead log, definitely a pair of blog posts to check out. Oh, and I should say this is from Highgo.ca.
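
If you want to see which WAL-related variables exist, and how they're set, on the version you're actually running, a query along these lines works; the handful of extra names listed are just examples:

    -- WAL-related settings on the running server, with a short description of each
    SELECT name, setting, short_desc
    FROM pg_settings
    WHERE name LIKE 'wal_%'
       OR name IN ('max_wal_senders', 'archive_mode')
    ORDER BY name;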

[00:18:42] The next post is can't connect to Postgres. This is from Mydbainotebook.org, and it's a follow-on blog post about the case where you've tested that Postgres is running, you know what port to connect to, but you still can't connect to it. So he has a flowchart for you to try different things. One thing he mentions is that it could be your listen addresses. The other thing is, again, checking your logs to see what may be going wrong, and if you see a particular error message, some solutions to hopefully figure out what's going on. So it's a very brief post, but it's meant to help you be able to connect to Postgres when you're having difficulty.
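
On the listen addresses point, here's a minimal sketch of checking it and what the typical fix looks like; the '*' value is just an example and you may want something more restrictive:

    -- from a session that can connect locally, see what the server is bound to
    SHOW listen_addresses;
    SHOW port;

    -- if it's only 'localhost' and you need remote connections, something like:
    ALTER SYSTEM SET listen_addresses = '*';
    -- note that changing listen_addresses requires a restart, not just a reload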

[00:19:21] The next post is that the PostgreSQL Person of the Week is Sarah Connor Schner. Forgive me if I mispronounce that. So if you want to review Sarah's work and experience and contributions to the PostgreSQL community, definitely check out this blog post.

[00:19:38] The next post is PostgreSQL GSSAPI authentication with Kerberos, part three: the status of authentication, encryption, and user principal. So this is part three, the final part, of setting up Kerberos with PostgreSQL. So if you're interested in doing that, definitely a blog post to check out.

[00:19:57] And lastly, the last blog post is importing spatial data to Postgres. This shows you a couple of different ways that you can import mapping data, as well as where to find a few data sets, and get it into PostgreSQL to use with PostGIS. So if you have an interest in doing that, definitely a blog post to check out.

[00:20:19] That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode. Or you could subscribe via YouTube or iTunes. Thanks.
