
Ten Things, Audit Triggers, Schema Design, High Availability | Scaling Postgres 108
In this episode of Scaling Postgres, we discuss ten things to hate about Postgres, audit trigger performance, designing a schema and the future of high availability.
Content Discussed
- 10 Things I Hate About PostgreSQL
- Performance differences between normal and generic audit triggers
- PostgreSQL Schema Design
- [YouTube] The Next Generation of Postgres High Availability (PostgreSQL HA)
- Useful Queries For PostgreSQL Index Maintenance
- PostgreSQL_WAL_Evolution_Part_I
- PostgreSQL_WAL_Evolution_Part_II
- Can't connect to Postgres
- Sarah Conway Schnurr
- PostgreSQL GSSAPI Authentication with Kerberos part-3: the status of authentication, encryption and user principal
- Importing spatial data to PostGIS
YouTube Video
Podcast Audio
Transcript
Alright, I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "10 Things I Hate About PostgreSQL". This is from Rick Branson on medium.com and I'm just going to run through the different things that he hates about PostgreSQL. Now, ultimately he does like PostgreSQL; these are the pain points that he sees and hopefully they will be fixed eventually. Number one is disastrous XID, or transaction ID, wraparound. Basically, you can't have more than about 2 billion transaction IDs in play, and vacuum takes care of handling this, but if you actually hit that 2 billion mark you have to take the table offline in order to repair it, which may of course mean taking the database offline as well. So it's definitely something you should be monitoring. I agree it is unfortunate to have to monitor for this, particularly for larger databases. Hopefully at some point, as he mentions here, XIDs will move to something like a 64-bit integer, but until then, we're pretty much stuck with it.
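As an aside, if you want to keep an eye on this, a query along these lines (my own sketch, not something from the article) shows how close each database is to wraparound:

```sql
-- Sketch: how old is the oldest unfrozen transaction ID in each database,
-- and how close is it to the ~2 billion wraparound limit? Values approaching
-- autovacuum_freeze_max_age (200 million by default) deserve attention.
SELECT datname,
       age(datfrozenxid) AS xid_age,
       round(100.0 * age(datfrozenxid) / 2000000000, 2) AS pct_of_wraparound
FROM pg_database
ORDER BY xid_age DESC;
```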
So I definitely have to agree with him on this. Two is that failover will probably lose data. Here he's talking about streaming replication, where WAL is streamed from a primary to a replica, and the default way to set that up is asynchronous, so there's a possibility that you're going to lose some data when the primary goes down. You can do synchronous replication, and you can even do it with a quorum commit, where a transaction has to be confirmed by a certain number of replicas, but then you're going to take a performance hit. In particular, if you're trying to spread your databases across availability zones or even regions, in the case of AWS, synchronous replication makes the performance hit even worse. So if you're going to use it, it probably makes sense to do it within the same availability zone or region, and synchronize to other regions asynchronously if you want to attempt that. But basically he's saying there's not really a great solution for this.
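For reference, quorum-based synchronous replication (PostgreSQL 10 and later) is configured roughly like this; the replica names here are hypothetical and have to match each standby's application_name:

```sql
-- Require any one of the two named standbys to confirm each commit.
ALTER SYSTEM SET synchronous_standby_names = 'ANY 1 (replica_a, replica_b)';
-- synchronous_commit controls how long a commit waits on the standbys
-- (this is the durability versus performance trade-off discussed above).
ALTER SYSTEM SET synchronous_commit = 'on';
SELECT pg_reload_conf();
```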
He mentions some other clustering solutions you may want to investigate if this is important to you. He also covers bi-directional replication later, which has its own set of issues, but that would be another avenue to investigate for avoiding data loss; we'll cover that in an upcoming piece of content in this episode. So I do have to agree this is an issue people will have, and there is a performance hit with synchronous replication; I'm not quite sure how to avoid that. Three is inefficient replication that spreads corruption. Now, I wasn't really quite sure what he was referring to here. He talks about the WAL, and that with a large database they had hardware-induced data corruption on the master that propagated all too easily to the replicas. So I was a little confused: the hardware caused the corruption, so I would blame the hardware as opposed to Postgres. And yes, if you are replicating data from a primary to a replica, any data corruption will probably be passed through to the replica.
So I don't really get where he's going with this. This is with WAL-based, physical streaming replication; he says logical replication may avoid some of it, but I've never seen the kind of issue he's describing here, if I'm interpreting it correctly. Number four is that MVCC garbage is frequently painful. Essentially he's talking about vacuum. When you're doing a lot of updates on a certain table, Postgres actually creates additional row versions, and the dead rows then have to be vacuumed up. I definitely have to agree with him on this, because many developers engineer something with a lot of updates, and Postgres can run into vacuum issues when you set things up that way. But he does mention zheap. This is a storage solution that adopts the technique MySQL and Oracle use with redo or undo logs. Essentially, using this type of log means you don't have to create a whole new row; you can do an update in place and indicate that the row has been updated.
So if you need to reference what that value was or do a rollback, it's within the redo or undo logs. So I do have to agree vacuum can be painful, but there seems to be hope with zheap, so we'll see what the future holds with regard to that. Next is that process per connection equals pain at scale. Essentially this is when you get up to a couple of hundred connections: because it's a process for every connection, it uses a lot of resources, and people look to connection poolers to help mitigate that and get more connections to the database. For example, PgBouncer is the connection pooler I have experience with. But it would be great if Postgres eventually had a threaded means of supporting more connections out of the box without having to rely on PgBouncer, because even though PgBouncer is great, it has some quirks of its own you need to be aware of. For example, it itself is single-threaded.
So if you have a multi-CPU box and want to run PgBouncer on it and utilize all the cores of the system, you have to run multiple instances of PgBouncer and funnel your clients through each of those PgBouncers running on different ports. So I have to agree: not great. It would be ideal to have a better solution than what exists currently. Number six is that the primary key index is a space hog. This is saying that an index exists separate from the physical rows of the table, whereas other database systems offer index-organized tables, where the table is physically organized on disk by the primary key, and doing it that way can save some space, particularly if there are not that many columns in the table. In his example, the primary key covers three columns of a four-column table, which lets you save a lot of space. But how often does that really happen?
In my implementations, this is very rare, so I'm not really sure how much space this would save. So with this one, I'm a little up in the air on whether I agree or not. Number seven is that major version upgrades can require downtime. He says, quote "Some major version upgrades require hours of downtime to convert data for a large database". Now, I know with pg_upgrade, I've been able to do it in less than a minute, or worst case, a few minutes. What is an issue is upgrading any active replicas, because once you upgrade the primary, you essentially need to do a whole restore to get the replicas back online, unless you follow a particular procedure to upgrade those as well. So I can see that being an issue, and if there were a resolution for that, that would be great. But actually upgrading the primary database, in my experience, can be done really quickly.
Number eight is a somewhat cumbersome replication setup, and I can't say I've had much difficulty with it. You do need to have experience and kind of know what you're doing, but I don't find it particularly cumbersome to set up replication. Nine is the ridiculous no-planner-hints dogma. So this is the case where you can't use planner hints within your SQL code; you always have to rely on whatever decision the planner makes. Now, I do have to agree with him that it would be nice to have this capability, for this particular reason: he says "...this philosophy seems brutally paternalistic when you're watching a production database spiral into a full meltdown under a sudden and unexpected query plan shift". So I've seen this, where you have a production database running just fine, and then something happens with the data. Something's changed, and the plan changes.
Now suddenly, you have poor performance for queries that were just fine before, and you have to do some investigation to figure out what the problem is. If there were a way to apply a hint as a temporary workaround, even though of course those workarounds tend to end up being long term, I could definitely see the benefit of that. So I can see the benefit of being able to use planner hints in SQL queries in certain use cases. And lastly, ten is no block compression. He says PostgreSQL does automatically compress some large values, but that isn't useful for the most common ways data is stored in a relational database, and what's important is actually block-level compression.
Now, I've seen other people use things he mentions here, like ZFS or other file systems that do compression, to boost their performance; it also significantly reduces the storage space required for the database. So I wouldn't say this is something I hate, but it would be beneficial to have this capability as an option. It's something to look forward to, and in the meantime you can do file system compression by putting your database files on a file system that supports it. So overall, I would say I agree with more than half of his points as things you need to be aware of or that can be a pain, and hopefully the PostgreSQL community will keep working on them. So definitely a blog post I suggest you check out.
Next post is "PERFORMANCE DIFFERENCES BETWEEN NORMAL AND GENERIC AUDIT TRIGGERS". This is from cybertec-postgresql.com and they're referencing a previous post talking about row change auditing purposes for PostgreSQL and they proposed a couple of different ways. One way they proposed was to be able to set up per-table triggers and logs for auditing the data being changed. The other was using a generic table, a common table, and having triggers in each of the logged tables being able to write there. He wanted to know what the performance differences were because his interpretation was that the table-specific triggers and audit tables would lead to better performance. So he goes through his testing methodology.
Here is how he set it up, his hardware setup, and here are the TPS numbers and the latencies. You can see that the generic solution, which is the one I tend to prefer, was actually 35% faster than explicit triggers for each table, which surprised him, and I'm surprised as well. Now, he said the disk footprint is larger for the generic triggers, but he does mention in the last paragraph that you could probably mitigate that by not writing as much data, so definitely check that out. One reason you may not want to do it is that it is more difficult to query the generic table versus the explicit tables. But performance-wise, he was very impressed by how efficient this generic auditing solution is. So if you're interested in learning more about that, definitely check out this blog post.
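To give a sense of what the generic approach looks like, here's a minimal sketch of a shared audit table and trigger function; this is illustrative only, not the exact code benchmarked in the article, and the orders table is just a stand-in:

```sql
-- One shared audit table for every logged table.
CREATE TABLE audit_log (
    table_name text        NOT NULL,
    action     text        NOT NULL,
    changed_at timestamptz NOT NULL DEFAULT now(),
    row_data   jsonb
);

-- A single generic trigger function: record the affected row as jsonb.
CREATE OR REPLACE FUNCTION generic_audit() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO audit_log (table_name, action, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(OLD));
        RETURN OLD;
    ELSE
        INSERT INTO audit_log (table_name, action, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(NEW));
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;

-- Attach the same function to any table you want audited
-- (use EXECUTE PROCEDURE instead of EXECUTE FUNCTION on PostgreSQL 10 and older).
CREATE TABLE orders (id bigserial PRIMARY KEY, total numeric);
CREATE TRIGGER orders_audit
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION generic_audit();
```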
The next post is "PostgreSQL Schema Design". This is from graphile.org and they're setting up a schema design for a forum. Now, what's interesting about this is that a lot of application frameworks have their own way of setting up the schema and you use their DSL for setting it up. But this goes through and shows you a scenario where you can build more on the database side in terms of data protection and schemas and unique keys and things of that nature, where sometimes the application framework wants to handle that. So I thought this is a good post to look at to get a sense of what's possible with Postgres if you're not as familiar with how you can configure your table. So for example, they're showing a person here, it's in its own custom schema, and they have a check constraint set on the first name to limit the character length and on the last name as well. The created_at column actually has a default, which is the function NOW. So like in a lot of frameworks, they never apply this type of check constraint to the database.
That's a check that they do within the application itself, and a lot of times the created_at default isn't in the database either; the application framework handles setting it. But if you have it in the database, then any other application accessing the database doesn't have to build this in, because it's already part of the database. So this is a good post to review to see what kinds of things are possible using the capabilities that exist within your database. They say you want to use a random UUID instead of serial, and they show how you would set that up, how you can add table documentation, how you can set up custom types, and how you can set up database functions, for example one that concatenates a person's first name and last name together. You could do this in the database as opposed to your application. So there are all sorts of ways of building an application where you're using a lot more database features. If this is of interest to you, definitely a blog post to check out.
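To give a flavor of the kind of DDL being described, here is a rough sketch along those lines; the names are illustrative rather than the tutorial's exact code, and gen_random_uuid() needs the pgcrypto extension before PostgreSQL 13:

```sql
CREATE SCHEMA forum_example;

-- Constraints and defaults live in the database rather than the application.
CREATE TABLE forum_example.person (
    id         uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    first_name text NOT NULL CHECK (char_length(first_name) < 80),
    last_name  text CHECK (char_length(last_name) < 80),
    created_at timestamptz NOT NULL DEFAULT now()
);

-- Table documentation stored alongside the table itself.
COMMENT ON TABLE forum_example.person IS 'A user of the forum.';

-- A database function for a derived value, instead of application code.
CREATE FUNCTION forum_example.person_full_name(person forum_example.person)
RETURNS text AS $$
    SELECT person.first_name || ' ' || person.last_name;
$$ LANGUAGE sql STABLE;
```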
The next post is actually a YouTube video, "The Next Generation of Postgres High Availability (PostgreSQL HA)". This is on the Database Month YouTube channel and it's presented by Tom Kincaid of 2ndQuadrant. What he's actually talking about is their BDR product, or bi-directional replication. He perceives this as the next generation of Postgres high availability, and he talks about what's current today: you have a primary that is streaming to one or more replicas, some means of determining whether that primary is still active, and automated processes to promote a replica once the primary has gone down. He goes through the process of setting that up. Of course, some disadvantages are that you can run into a split-brain situation and it can take up to a minute or more to make that switch.
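As an aside, on PostgreSQL 12 and later the promotion step itself is a single call run on the replica; most of the time he's describing goes into detecting the failure and deciding to act:

```sql
-- Promote this standby to primary (PostgreSQL 12+); older versions use
-- pg_ctl promote or a trigger file instead.
SELECT pg_promote();
```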
He argues that if your uptime is super important and you want to be able to switch to another active server, you may want to investigate going from master to master. So bi-directional replication. Now there are of course a number of disadvantages with that because you have conflicts. So if the same entity is updated on each database, say the last one wins. Now, you can't determine or alter the decision process for that, but it's still something you have to take into account. But what I found interesting is he mentioned that some customers opt to keep a shadow master. So it's there, it's constantly replicating, but it's not actively being used. So it's on the sidelines, as it were. Within seconds, you could choose to switch to that one being the master and not the other.
Now, the advantages of that are, first, a faster switchover, and second, because you're not actively using both databases, you avoid these conflict possibilities. Of course, you still have to take into account that if you're doing something like master-master, you need a centralized way of handling ID creation. This means you can't really use serial columns anymore, since those rely on sequences within each database; you need to use a UUID or some other central ID generation method. But it's definitely an interesting perspective. I hadn't considered this situation of a shadow master. So if you're interested in learning more about this and their bi-directional replication product, definitely a video to check out.
The next post is "Useful Queries for PostgreSQL Index Maintenance". This is from percona.com. This basically runs through different queries that you could use to manage your indexes. Now it's interesting, they start off pretty basic here. We're just seeing how many indexes your table has. Now you can do this with the \d command for a table or the \d+ to find out more information with regard to the table and the indexes. But they're showing a way to get just that information and the system views you would need to query in order to get it. But they also show how you can determine if an index is unique. What's the size of the index? What's the size of the tables, indexes, and the total size of all of them together? What query built the indexes? Shows how to determine if there are limitations to what kind of indexes can be built. Looking for unused indexes as well as looking for duplicate indexes. So if you want a few queries that can help you better manage your indexes, definitely a blog post to check out.
The next post is "PostgreSQL_WAL_Evolution_Part_1". This is from highgo.ca. Now there is part one here and part two that covers the evolution of the write-ahead log. So going from 7.1 where it just protects against a crash, version 8 adds point-in-time recovery, and version 9 adds replication capabilities. Version 10 adds logical replication capabilities and then of course, what's going on in the future. They look at it from the perspective of the different configuration variables and what's possible to be set in each one. So they go through each version. The first post here goes through 8.3 and the second one continues on with version 9 through version 12. So if you're interested in learning more about the history and all the different capabilities of configuring your write-ahead log, definitely a pair of blog Postgres to check out.
The next post is "Can't connect to Postgres". This is from mydbanotebook.org and this is a follow-up blog post where he's talking about how you've tested that Postgres is running, you know what port to connect to, but you still can't connect to it. So he has a flowchart for you to try different things. One other thing he mentions is that it could be your listening addresses. And the other thing is again checking out your logs to see what may be going wrong and if you see a particular error message, some solutions to hopefully figure out what's going on. So a very brief post, but meant to help you be able to connect to Postgres when you're having difficulty.
The next post, the PostgreSQL person of the week is Sarah Conway Schnurr. So if you want to review Sarah's work, experience, and contributions to the PostgreSQL community, definitely check out this blog post.
The next post is "PostgreSQL GSSAPI Authentication with Kerberos part-3: the status of authentication, encryption and user principle". This is from highgo.ca. So this is part three, the final part of setting up Kerberos with PostgreSQL. So if you're interested in doing that, definitely a blog post to check out.
The last blog post is "Importing spatial data to PostGIS". This is from techcommunity.microsoft.com. So this shows you a couple of different ways that you can import mapping data as well as where to find a few data sets and get it into PostgreSQL to use with PostGIS. So if you have an interest in doing that, definitely a blog post to check out.