
New Releases, Reconnect After Failure, Babelfish, Troubleshoot Disk & Memory | Scaling Postgres 152


In this episode of Scaling Postgres, we discuss new PostgreSQL releases, reconnecting after a connection failure, Babelfish SQL Server compatibility and troubleshooting disk and memory issues.


Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about Postgres releases, reconnecting after failure, Babelfish, and troubleshooting disk and memory. I'm Creston Jamison and this is Scaling Postgres, episode 152. Alright, I hope you, your friends, family and coworkers continue to do well. Our first piece of content is PostgreSQL 13.2, 12.6, 11.11, 10.16, 9.6.21 and 9.5.25 Released. This is from Postgresql.org, and the reason for the release is primarily two security issues. One is "partition constraint violation errors leak values of denied columns". So basically it looks like some columns can leak through on certain operations to someone who shouldn't have the permission to view them. And the other is "single-column SELECT privilege enables reading all columns". So again, a kind of column leakage. This latter one looks like it only affects PostgreSQL version 13, and the former affects versions 11 through 13. But overall there have been a number of bug fixes and improvements across all the releases. And it's always good to look over these because some of them require additional work. For example, there was an issue that was fixed with GiST indexes, and they suggest you reindex those GiST indexes. Or there were certain cases where CREATE INDEX CONCURRENTLY had issues if you're using prepared transactions, and if your installation uses those, then you should reindex any concurrently built indexes. So go ahead and consult this post to look at all the different changes that have been made and resolved in Postgres.
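As a quick illustration of the kind of follow-up work those release notes can call for, here is a minimal sketch of rebuilding an affected index; the index name is made up, not from the release notes.

-- Rebuild a GiST index without blocking writes
-- (REINDEX ... CONCURRENTLY is available in Postgres 12 and later)
REINDEX INDEX CONCURRENTLY my_gist_idx;

-- On older versions a plain REINDEX works too, but it locks out writes:
REINDEX INDEX my_gist_idx;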

[00:01:51] The next piece of content is Reconnecting your application after a Postgres failover. This is from Citusdata.com, and they're talking about how, when you're setting up a Postgres high availability solution, one thing you should also consider in addition to the server infrastructure is your client or application side, and make sure it can handle disconnects. Because when a primary server fails over to a secondary server, you are going to get dropped connections; it doesn't transparently hand them off. So your application needs to handle that. And it discusses some techniques for doing that and even gives you a way to do some testing. In terms of tools you can use: if you're not using something like Pgpool or PgBouncer, which have some techniques to handle transitions between servers, you could use the multi-host connection strings that are part of the libpq library. Now, even if your application doesn't use libpq directly, it may rely on libraries that do. For example, they mention that the Postgres drivers in Python, Ruby, PHP and even Node.js actually use libpq under their standard libraries. But there are others that use a more direct implementation, and you need to check whether they support these multi-host connection strings. Basically, it lets you specify multiple nodes in the host connection string, and the client will try one after the other to make a connection. So if node one is the primary and it goes down, when the client tries to reconnect and is unable to reach node one, it'll try node two. In addition, you can also set target session attributes (target_session_attrs) and specify that this particular connection should be read-write. So if one of the nodes it connects to only supports read-only connections, it will move on to the next node. So this is a simple way for a client to find the PostgreSQL primary, say, after a failover. Now, of course, they did mention you can use PgBouncer and Pgpool, and there are a number of different ways to handle that. But then what it talks about is a way to have your application be prepared to handle those types of transition failures. Now, Citus Data has developed pg_auto_failover, which is an open source high availability solution for automatically failing over your Postgres instance. And they've actually packaged it up into a Docker container that lets you run it very simply in your development environment and test how your application behaves through a Postgres failover. Now, of course, you wouldn't use this Docker container in production, but it gives you a basis to do some testing to make sure that the client side of the application can handle these types of transitions. But of course, you can choose to implement pg_auto_failover for your production environment or use another solution. So if you're interested in learning more about this and the capabilities of pg_auto_failover, definitely check out this post from Citusdata.com.
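As a rough sketch of those multi-host connection strings (the host, database and user names here are made up, not from the post), a libpq connection URI can list several hosts and require a read-write session, so the client skips any node that is read-only:

# psql uses libpq, so it accepts a multi-host URI directly
psql "postgresql://appuser@node1:5432,node2:5432/appdb?target_session_attrs=read-write"

# The keyword/value form is equivalent
psql "host=node1,node2 port=5432 dbname=appdb user=appuser target_session_attrs=read-write"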

[00:04:53] The next piece of content is Babelfish: the Elephant in the Room? This is from postgresql.fund, and it's a discussion piece about the presence of Babelfish. So Babelfish is, quote, "an endpoint to PostgreSQL that understands the SQL Server wire protocol", the tabular data stream, so it can interpret things like T-SQL commands and other things meant for SQL Server. I love this quote that he includes here: "What this means is that Babelfish will be able to impersonate an SQL Server database. Applications may be able to run unchanged believing that they are connecting to SQL Server when they will actually be connecting to PostgreSQL." Now, from my understanding, AWS developed this for their Aurora database, which is based upon Postgres but is not Postgres, and they have pledged to open source it. This post talks about that possibility. Basically, it's trying to rally the PostgreSQL community to help support this, to try to implement it as an extension to Postgres, and to make changes to Postgres that more easily support this type of extension giving SQL Server compatibility, because he doesn't want to get into a situation where they have to create a fork of Postgres to support it, or where they just use it for Aurora, I suppose, and potentially don't offer it for PostgreSQL. So I found this a very interesting post, and I think you should take time to review it to understand his perspective and why he thinks the PostgreSQL community should help and assist Babelfish to work better with PostgreSQL. So if you're interested, you can check out this post.

[00:06:36] The next piece of content is Troubleshooting Performance Issues Due to Disk and RAM. This is from Highgo.ca, and it basically covers a number of tools you can use to diagnose performance issues as they relate to, say, disk or memory. The first thing he mentions is top, in order to check CPU utilization, then checking memory using the free command and disk space with the df command. But if you want to look more into disk performance, he mentions three other tools here: dstat, iotop and the sysstat package, including tools such as iostat and sar, and also reviewing /proc/meminfo to get more detailed information than what free gives you. So if you want to learn some Linux tools to help you diagnose disk or memory issues as they relate to operating Postgres, you can definitely check out this post.
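As a quick, hedged sketch of that tooling (standard invocations shown; exact flags can vary by distribution):

# CPU utilization and per-process activity
top

# Memory and disk space in human-readable units
free -h
df -h

# Extended per-device I/O statistics every 2 seconds (iostat and sar ship with sysstat)
iostat -x 2

# More detailed memory breakdown than free provides
cat /proc/meminfo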

[00:07:29] The next piece of content is Starting with Pg: where is the config? This is from Depesz.com, and he's talking about when you've been introduced to a new Postgres database and you need to find out where its configuration is. By configuration he basically means the postgresql.conf file, the pg_hba.conf file and maybe the pg_ident.conf file. Now, if you have access to a psql terminal, you can do SHOW config_file and it will show you the location of that configuration file. Or you can do SHOW hba_file or SHOW ident_file and it gives you those file locations, he says. But what if you don't have access to a psql prompt? Well, you can actually take a look at the process running on the server. If you do a ps command, you can pull out the running Postgres process, and it tells you where the data directory is located as well as the configuration file. So you can use that to see what its configuration is. The other thing to keep in mind is that there is also the postgresql.auto.conf file, and there may be some configuration in there; it's typically located in the root of the data directory. So if you want to learn a little bit more about where to find different configuration in Postgres, you can check out this post.
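A minimal sketch of those lookups from a psql prompt (the example paths are typical Debian-style locations, not guaranteed):

SHOW config_file;     -- e.g. /etc/postgresql/13/main/postgresql.conf
SHOW hba_file;        -- e.g. /etc/postgresql/13/main/pg_hba.conf
SHOW ident_file;      -- e.g. /etc/postgresql/13/main/pg_ident.conf
SHOW data_directory;  -- where postgresql.auto.conf normally lives

And without a psql prompt, the postmaster's command line usually shows the -D data directory and often a config_file argument:

ps aux | grep postgres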

[00:08:43] The next piece of content, also from the same website, is Why is there a database named postgres? Basically, when you bring up a cluster, it comes with three databases: template0, template1 and postgres. Template1 is basically the template from which all new databases are created, and you can make alterations to this template that will then be present in the databases you create. Template0 exists in case anything happens to template1, so you can recreate template1 from template0. But of course, what is this postgres database? Because if you look inside it and do a \d, it has no objects in it. Basically, it's described as a way just to be able to connect to some database to execute some commands, even to create a new database. So you can connect to the postgres database to run commands, because if you connect to template1, there are certain commands that don't work. He does some different tests connecting as different users so you can see what the different behavior is, and there are also some comments below that discuss it. So if you're interested in that, you can check out this blog post.
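As a small illustration (the database names here are hypothetical), CREATE DATABASE copies template1 by default, and template0 can be named explicitly to get a pristine copy:

-- Copies template1, including any objects you have added to it
CREATE DATABASE appdb;

-- Copies the untouched template0 instead, e.g. to choose a different encoding
CREATE DATABASE appdb_clean TEMPLATE template0 ENCODING 'UTF8';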

[00:09:53] The next piece of content is Exploring SQL command changes in PostgreSQL 14. This is from Highgo.ca, and they're listing different changes at the SQL level coming to Postgres. The first is that the AS keyword to give a new name to a column is now no longer necessary in most cases. So before you had to write, say, SELECT loc AS analyze; now you can just write SELECT loc analyze and Postgres 14 will interpret it correctly. The AS keyword is optional, although some keywords still need it explicitly because they're reserved. The next is added support for leading/trailing trimming of bytea values, which is a new feature; allowing CURRENT_ROLE in the GRANTED BY clause; and allowing OR REPLACE when you're working with triggers. So before you had to drop a trigger and then create a new one; this now allows CREATE OR REPLACE TRIGGER, which is a great addition. There's also support for reindexing concurrently on partitioned tables, which it did not do previously. A few of these are sketched below. So if you're interested in some of these SQL changes coming to Postgres 14, definitely check out this post from Highgo.ca.
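Here are a few of those, written as hypothetical Postgres 14 statements (the table, column and trigger names are made up):

-- AS is now optional before most keyword column labels
SELECT loc analyze FROM places;

-- trim() now accepts bytea with LEADING/TRAILING
SELECT trim(LEADING '\x00'::bytea FROM payload) FROM packets;

-- Triggers can be replaced in place instead of drop-and-recreate
CREATE OR REPLACE TRIGGER places_audit
  AFTER UPDATE ON places
  FOR EACH ROW EXECUTE FUNCTION record_audit();

-- REINDEX ... CONCURRENTLY now works on partitioned tables
REINDEX TABLE CONCURRENTLY places_partitioned;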

[00:11:06] The next piece of content is PostgreSQL TOAST data corruption: error "unexpected chunk number". This is from fluca1978.github.io. Now, he's describing an issue where data has been corrupted. I personally haven't seen this, and he actually says down here, quote, "so far I've only experienced human caused damages". So he hasn't experienced TOAST corruption from PostgreSQL itself while running; rather, someone has gone into the data directory of Postgres and altered something, or some program has run and impacted the data directory. That's where he sees these corruption cases. Well, he's developed a function that can identify and find where these are, and he's placed it in a GitLab repository. He goes through the process of testing it out by actually corrupting a database, so definitely do this on a test system. But having done that, you can then use the function to identify the data that has the issue. So if you're interested in that, you can check out this blog post.
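His actual function is in the linked repository; purely as a hedged sketch of the general technique (reading each row and forcing the TOASTed column to be detoasted, with a made-up table and column), something like this reports rows whose TOAST chunks fail to read:

DO $$
DECLARE
  rec record;
  dummy text;
BEGIN
  FOR rec IN SELECT ctid FROM my_table LOOP
    BEGIN
      -- The cast forces detoasting; a damaged chunk raises an error here
      SELECT payload::text INTO dummy FROM my_table WHERE ctid = rec.ctid;
    EXCEPTION WHEN OTHERS THEN
      RAISE NOTICE 'possible corruption at ctid %: %', rec.ctid, SQLERRM;
    END;
  END LOOP;
END $$;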

[00:12:12] The next piece of content is Deep PostgreSQL Thoughts: The Linux Assassin. This is from Crunchydata.com, and he's basically talking about the out-of-memory (OOM) killer. There are various configurations you can apply to try to avoid the OOM killer on your Postgres system. One of the main ones is setting vm.overcommit_memory to 2 and then adjusting your vm.overcommit_ratio as well, to make sure you're using the optimum amount of memory. Failing that, there is a way to change the OOM score adjustment, so that's also a possibility. Then it goes into how there are host-level OOM killer mechanics as well as cgroup-level ones, so you can apply some of these out-of-memory parameters at the cgroup level. Then he talks about why it's important to avoid this: basically, you don't want to bring down your database or require a restart, because that has its own host of issues. And he goes into a lot of detail on this. But the post then turns to Kubernetes and how Kubernetes actively sets vm.overcommit_memory to 1, rather than the recommended setting of 2, which turns overcommit off. It also has different cgroup out-of-memory behavior, you can't really change the score adjustment, and where swap could absorb some of this overcommitted memory, he says that Kubernetes turns swap off. So running Postgres in Kubernetes has some particular challenges, and he goes into detail on some of these and gives you a few options for mitigation. He also says that they're actively working to minimize these issues in future versions. So if you're interested in this content, you can check out this post from Crunchydata.com.
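As a hedged sketch of the host-level knobs he discusses (the values are illustrative, and the postmaster PID is a placeholder to fill in):

# Turn off overcommit so allocations fail cleanly instead of waking the OOM killer
sysctl -w vm.overcommit_memory=2
sysctl -w vm.overcommit_ratio=80

# Failing that, make the postmaster the least attractive OOM target (run as root)
echo -1000 > /proc/<postmaster_pid>/oom_score_adj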

[00:13:52] The next piece of content, also from Crunchydata.com, is Kubernetes Pod Tolerations and Postgres Deployment Strategies. So if you want to learn a bit more about Kubernetes and its pod toleration capabilities with regard to Postgres and their upgraded PostgreSQL Operator, you can check out this blog post.

[00:14:10] The next piece of content is Patroni Environment Setup: PostgreSQL High Availability for Windows, and it discusses how you can use a graphical installer to set up Patroni on Windows. So if you're interested in that, you can check out this post from Cybertec-postgresql.com.

[00:14:28] The next post, also from Cybertec, is Catchment areas with PostgreSQL and PostGIS. So if you want to learn more about computing these catchment areas, definitely check out this blog post.

[00:14:40] The next piece of content is check_pgbackrest 2.0 has been released. This is a monitoring tool for the pgBackRest backup tool, so if you're interested in learning more about it, you can check out this post from pgstef.github.io. And the last piece of content is: the PostgreSQL Person of the Week is Hubert Lubaczewski. So if you're interested in learning more about Hubert and his contributions to Postgres, definitely check out this blog post. That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or iTunes. Thanks.

