
Scaling Advice, Vicious Cycle, Improve Observability, Addressing Corruption | Scaling Postgres 150


In this episode of Scaling Postgres, we discuss scaling advice, a vicious cycle of WAL writes, improving observability, and addressing database corruption.

Content Discussed

YouTube Video

Podcast Audio

Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about scaling advice, a vicious circle, improving observability and addressing corruption. I'm Creston Jamison and this is Scaling Postgres, episode 150.

[00:00:21] All right, I hope you, your friends, family and coworkers continue to do well. Our first piece of content is Postgres Scaling Advice for 2021. This is from cybertec-postgresql.com and, in a nutshell, most of the advice is: don't prematurely optimize. So if you're wanting to scale, don't think of creating some huge distributed database system; just start with a single server, because Postgres can get you a long way on just that single server. It covers things such as the fact that all distributed systems are inherently complex and difficult to get right, so the more you try to scale out or become distributed, the more complicated things get. And he mentions that with a good server, a single PostgreSQL instance can easily do hundreds of thousands of transactions per second. He did some read-only tests on his local system and got 32,000 transactions per second, then on another workstation he got around 45,000 write transactions per second. So he says a single node can typically do tens of thousands of write transactions per second, and a single Postgres instance can easily handle dozens of terabytes of data, which is true. And also, a single-node instance is literally bulletproof as far as data consistency is concerned. But he says that to help achieve this, you need to declare your constraints so that your data stays in a good state, don't fool around with the fsync or synchronous_commit settings (so that things get committed to the disks appropriately), and use good disk systems, because we're actually going to cover a post in a little bit that talks about corruption, and one of the main ways corruption can happen is through hardware issues. He also covers how to be prepared for when you need to scale. So if you need to scale out, what do you need to do? The first thing he mentions is don't be afraid to run your own database; you can migrate to the cloud later if that's something you feel you need to do. Second, don't make the mistake of having your entire data architecture centered around one huge table; be sure to use plenty of tables to spread things out. And third, make sure you bake in a proper sharding key in case that's a direction you have to go. In other words, make sure you could partition pretty much all of your tables by a shard key; if all of your tables include that key, then presumably you could shard your data that way. So this post doesn't have a lot of specific scaling advice, because that's covered in other posts; it's more of a thought piece about not prematurely optimizing your database. If you're interested in learning more, you can check out this post.
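
As a rough illustration of that last point, here is a minimal, hypothetical schema sketch (the table and column names are made up, not from the post) where every table carries the same shard key and constraints are declared up front:

    -- Hypothetical example: customer_id acts as the future shard key and
    -- appears in every table; constraints are declared from the start.
    CREATE TABLE customers (
        customer_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        email       text NOT NULL UNIQUE
    );

    CREATE TABLE orders (
        order_id    bigint GENERATED ALWAYS AS IDENTITY,
        customer_id bigint NOT NULL REFERENCES customers (customer_id),
        total_cents integer NOT NULL CHECK (total_cents >= 0),
        created_at  timestamptz NOT NULL DEFAULT now(),
        PRIMARY KEY (customer_id, order_id)  -- shard key leads the key
    );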

[00:02:59] The next piece of content is The Vicious Circle of Huge Writes. This is from mydbanotebook.org, and they're talking about a cycle of WAL writing that you can get into. If you have a lot of write activity, a lot of WAL files are being written, which can cause checkpoints to occur more frequently, which of course triggers more full-page image writes, which gives you more WAL. So more WAL leads to more frequent checkpoints, which leads to more WAL being generated, which leads to more checkpoints. It's a vicious circle. And then you have autovacuum kicking in when you're updating a lot of rows, for example, which also causes more WAL to be written, thus propagating the vicious circle. There's a graph here that depicts it: you have huge writes on a database in a short time, which causes a lot of WAL to be written. That triggers checkpoints more frequently, which causes more WAL to be written because of the full-page image writes. And while this is going on, it also triggers autovacuum to happen more frequently, which causes still more WAL to be written, and it becomes a constant vicious circle. Now, in terms of how to solve it, the number one thing she's advocating is looking at what your application is doing. For example, if you have a lot of updates, is there a way you can reduce them? Maybe you insert a lot of this data and then update it less frequently, or architect your writes a little differently. The other thing you can do is increase max_wal_size so that checkpoints don't happen as frequently, or change checkpoint_timeout or your other checkpoint settings so that you checkpoint less frequently, which gives you fewer full-page image writes. You could also potentially do some tweaks to autovacuum. So if you're interested in learning more about this, you can check out this post.
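
As a sketch of the kind of settings being discussed, these are illustrative postgresql.conf values only, not recommendations from the post; the right numbers depend on your write volume and hardware:

    # Space checkpoints out so fewer full-page image writes are generated.
    max_wal_size = 16GB
    checkpoint_timeout = 30min
    checkpoint_completion_target = 0.9
    # Autovacuum can also be tuned so it keeps up without bursts of WAL;
    # this value is purely an example.
    autovacuum_vacuum_cost_delay = 2ms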

[00:04:48] The next piece of content is Configuring PostgreSQL for Observability, so this is about being able to determine what's going on in your database. The first thing they mention doing is changing your log_line_prefix to add a bit more information, such as the application name of the connecting client, the username, the database connected to, and the IP address, so you can potentially identify the host and things of that nature. This gives you more observability into what's going on when, say, a particular error gets written to the logs. The next thing they mention is logging your slow queries, so you could use log_min_duration_statement; they propose 1 second, so any query longer than a second gets logged. Now, if that produces too many logs, you can actually use sampling in more recent versions of Postgres, and they give a proposal of how you could sample, say, 25% of the slow statements. In addition to that, they mention logging your locks, for example turning on logging of lock waits, and they also mention logging autovacuum, specifying a duration so that when a vacuum takes too long you know what's going on with the process, as well as logging checkpoints. You also might want to alter, say, track_io_timing or track_commit_timestamp. In addition to configuration settings, you could also add the pg_stat_statements extension so that you can get a sense of all the different statements being run, and you could use the auto_explain extension, which will automatically put an explain plan in the log when queries take too long; they show you how to configure that here. And then lastly, on the client side, you can specify the application_name parameter so that you can uniquely identify which clients are connecting to the database. For example, they set an application name for the client that runs the weekly revenue report, so they'll know exactly what connection is causing particular issues if there's a problem in the logs. So, an interesting blog post that you can check out from PG IO.
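
Here is a hedged sketch of what such a configuration might look like; the values and the sampling parameters (which assume PostgreSQL 13 or later) are illustrative, not taken verbatim from the post:

    # postgresql.conf
    log_line_prefix = '%m [%p] app=%a user=%u db=%d host=%h '
    log_min_duration_statement = 1000      # always log queries slower than 1s
    log_min_duration_sample = 100          # PG 13+: consider queries over 100ms...
    log_statement_sample_rate = 0.25       # ...but log only a 25% sample of them
    log_lock_waits = on
    log_autovacuum_min_duration = 1000     # log autovacuum runs longer than 1s
    log_checkpoints = on
    track_io_timing = on
    track_commit_timestamp = on
    shared_preload_libraries = 'pg_stat_statements,auto_explain'
    auto_explain.log_min_duration = '1s'

And on the client side, the session can be tagged with SQL so it shows up in the logs and in pg_stat_activity:

    SET application_name = 'weekly_revenue_report';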

[00:06:49] The next piece of content is Avoiding, Identifying and Dealing with PostgreSQL Database Corruption, Part One. They talk a little bit about what database corruption is and some of its causes: mainly bad hardware, though bad software is also a culprit. One thing they mention here is that CFS has been known to have some issues on Linux; I don't know how true this is anymore, but that's another possible cause, along with misconfiguration and user error. Then they go into symptoms of a corrupted database. Maybe you have duplicate rows that shouldn't be there, or rows that were there are missing and not appearing anymore, or you have system crashes when attempting to read the data, or even error messages that refer to internal tables or internal file structures and aren't related to client activity. Those could all be indications of corruption. Now, in terms of best practices to avoid data corruption, they first talk about backups and restores. Of course, this won't prevent corruption, but it will help you recover from it in the worst-case scenario. In terms of actually avoiding it: keep fsync on, keep full_page_writes on, and set an appropriate value for wal_sync_method. Additionally, if you want to protect yourself even more, you can turn on checksums for your data. The last thing they cover is some other best practices. For example, they mention doing a pull-the-plug test of your system to make sure it can recover successfully, so you know all your hardware and/or software is working as you expect. Next, never modify your data files manually; that's very bad to do. Don't go in and delete the postmaster.pid file (the process ID file), don't run antivirus on the data directory, and don't kill any PostgreSQL processes; use pg_cancel_backend or, if you really need to, pg_terminate_backend instead. So this is a great list of recommendations for trying to avoid corruption of your database, and I definitely suggest you check it out. This post is from highgo.ca.
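
As a quick sketch of those recommendations (settings shown at their safe values; paths and the pid are hypothetical):

    # postgresql.conf: leave the durability settings at their safe values.
    fsync = on
    full_page_writes = on
    # wal_sync_method is best left at the platform default unless you have
    # a tested reason to change it.

    # Data checksums are enabled when the cluster is created, for example:
    #   initdb --data-checksums -D /var/lib/postgresql/data

And rather than killing backend processes at the operating system level, the SQL-level functions are:

    SELECT pg_cancel_backend(12345);     -- cancel the backend's current query
    SELECT pg_terminate_backend(12345);  -- terminate the backend entirely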

[00:08:52] The next piece of content is Our Journey to PostgreSQL 12. This is from coffeemeetsbagel.com, and they're talking about an upgrade they did from 9.6 to 12.4. Now, I found this one interesting, and they give a description of their architecture here: basically they have three replicas, an additional background replica and an ETL replica, they have PgBouncer in the mix, and they needed to upgrade this system. The reasons for the upgrade were that the primary database had been running for about 3.5 years and had a number of oddities. For example, quote, new services would refuse to run in systemd, so they had to run things like the Datadog agent in a screen session, and sometimes when CPU usage went above 50% it would be completely unresponsive to SSH. So that's a serious concern. And their disk usage was an issue because they were running out of space, so they needed to address that. When doing the upgrade they wanted minimal downtime, of course, to roll out the new clusters on new instances to replace the current aging fleet, and to upgrade their instances so as to have a little bit more headroom. Now, they considered doing a logical dump and restore, but that would take way too long. Then they considered pg_upgrade, but they didn't want to have any sort of downtime at all, so they chose logical replication. Specifically, they decided to use pglogical, because I believe it has some additional niceties, such as being able to synchronize sequences, which native logical replication normally does not. They went through the process of doing it: basically, they first brought up a new Postgres 12.4 primary that was receiving data via pglogical replication, and then they set up new 12.4 replicas from that new 12.4 primary using standard streaming replication. Once everything looked good, they put the site into maintenance mode, switched the DNS record to the new primary database, forced a sync of all the primary key sequences (I believe this is a pglogical capability), ran a manual checkpoint on the old primary, performed some data validation tests against the new primary, and brought the system back up. And it happened successfully. Now, they did have a couple of issues that they mention. One is that slow synchronization can be dangerous. During the initial pglogical synchronization they had to copy a lot of data, and that was causing WAL to build up, because WAL is retained while that initial data copy is going on. That was a concern for them, and they had to cancel it. What they did to make it go more quickly was drop all the indexes on the tables being synchronized, set fsync to off (I would be really concerned about setting that, given what we understand about database corruption), set max_wal_size to 50GB, and set checkpoint_timeout to 1 hour. So basically they spread out those checkpoints to minimize the amount of WAL writing, and with that they were able to get the data transferred over. They also mention that every update was logged as a conflict, so they went ahead and set pglogical.conflict_log_level to DEBUG to avoid those messages. So if you're interested in learning more about how they did this, go ahead and check out this blog post.
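
For orientation only, a minimal pglogical sketch along the lines of what they describe might look like the following; the node names, DSNs and database names are hypothetical, and the real migration involves more steps (replication sets, sequence sync, monitoring) than shown here:

    -- On the old 9.6 primary (the provider):
    CREATE EXTENSION pglogical;
    SELECT pglogical.create_node(
        node_name := 'provider',
        dsn       := 'host=old-primary dbname=app'
    );
    SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);

    -- On the new 12.4 primary (the subscriber):
    CREATE EXTENSION pglogical;
    SELECT pglogical.create_node(
        node_name := 'subscriber',
        dsn       := 'host=new-primary dbname=app'
    );
    SELECT pglogical.create_subscription(
        subscription_name := 'upgrade_sub',
        provider_dsn      := 'host=old-primary dbname=app'
    );
    -- pglogical can also copy sequence state, e.g. via
    -- pglogical.synchronize_sequence(), which native logical replication
    -- does not handle.
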
The next piece of content is Monitoring Replication: pg_stat_replication. This talks about looking at the pg_stat_replication view on the primary to determine what replication is happening, and as mentioned here, it only knows about the replicas that are connected directly to that primary. So if you have a cascading streaming setup, where the primary streams to some replicas and one of those replicas has further replicas of its own, the primary doesn't know anything about those downstream replicas. In their example, the primary only has two entries in pg_stat_replication, whereas the replica that has three replicas attached to it has those three entries. So that's just something to keep in mind if you're using this view to understand what's going on with your replication. You have four LSN indicators. One is sent_lsn: has the primary sent this LSN to the replica? Second is write_lsn: has it been written out by the replica's operating system? Not necessarily flushed to disk, but the OS knows it's a write that needs to be done. Third is flush_lsn: has that LSN been flushed to disk on the replica? And finally replay_lsn: the LSN that has been replayed into the database files. If you query this view you get back that data, and they also talk about replication lag. They do have lag columns, measured by time, in the pg_stat_replication view, so you can see the write lag, the flush lag and the replay lag, and know how closely the replicas are keeping up with the primary. So if you're interested in learning more, you can check out this post from cybertec-postgresql.com.
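
A simple way to see those columns (this is the standard view as of PostgreSQL 10 and later, not code from the post):

    -- Run on the primary, or on any standby that has its own standbys attached.
    SELECT client_addr,
           application_name,
           state,
           sent_lsn,     -- last WAL position sent to the standby
           write_lsn,    -- written to the standby's OS, not necessarily flushed
           flush_lsn,    -- flushed to durable storage on the standby
           replay_lsn,   -- replayed into the standby's database files
           write_lag,
           flush_lag,
           replay_lag
    FROM pg_stat_replication;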

[00:13:46] The next piece of content is How to Run Some Tasks Without User Intervention, at Specific Times, Part Two. This is part two of a post from depesz.com, and this one talks about pg_cron. So it's another extension you can use with Postgres to do cron-style scheduling of tasks. If you're interested in that, you can check out this blog post.
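
A minimal pg_cron sketch, assuming the extension is installed and loaded via shared_preload_libraries (the job and schedule here are made up, not from the post):

    CREATE EXTENSION pg_cron;
    -- Run a nightly maintenance command at 03:00.
    SELECT cron.schedule('0 3 * * *', 'VACUUM ANALYZE');
    -- Inspect and remove scheduled jobs.
    SELECT jobid, schedule, command FROM cron.job;
    SELECT cron.unschedule(1);  -- 1 is the jobid returned above (hypothetical)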

[00:14:08] The next piece of content is actually a YouTube video: Managing PostgreSQL with Ansible.

[00:14:14] This is from the San Francisco Bay Area PostgreSQL Users Group, and it's an approximately 50-minute presentation on how to use Ansible to configure and set up Postgres.

[00:14:27] The next piece of content is How to Create PostgreSQL Custom Builds and Debian Packages. This is from Percona.com, and it does exactly that: it walks you through the process of creating your own Debian packages for custom builds of Postgres.

[00:14:43] The next piece of content is A Preliminary Exploration of Dynamic Hashing in PostgreSQL, and this is from highgo.ca. Now, this is a very internals-focused post about Postgres, so if that's of interest to you, you can check it out.

[00:14:58] The next piece of content is Deploy PostgreSQL with TLS and Kubernetes. So if you use Kubernetes and want to set up TLS with Postgres, check out this post from Crunchydata.com.

[00:15:10] Also from Crunchydata.com is Deploy TLS for PgBouncer in Kubernetes. So if you are setting up PgBouncer with Kubernetes, you can follow this post to determine how to set that up.

[00:15:22] The next piece of content is Does PostGIS 3.1 Mean the End of TopologyException? Apparently, when you're doing spatial analysis, certain operations can fail when geometries don't comply with OGC validity rules, and the hope was that with this update those exceptions could be reduced. Unfortunately, in the testing that was done here, the conclusion was that there are still some issues, so he basically says you probably shouldn't hurry to update PostGIS. If you want to learn more about that, check out this post from elephanttamer.net.

The next piece of content is Better OpenStreetMap Places in PostGIS. This is from Rustproof Labs, and they have been talking about the new flex output available for loading OpenStreetMap data. Here they cover some additions and changes since 0.0.3; they were looking at 0.7. So if you're interested in some of those additions and changes, you can check out this blog post.

And the last piece of content: the PostgreSQL Person of the Week is Franck Pachot. So if you're interested in learning more about Franck and his contributions to Postgres, definitely check out this blog post.

That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or iTunes. Thanks.
