
pg_receivewal, Application Failover, pg_checksums, pgBouncer | Scaling Postgres 87


In this episode of Scaling Postgres, we discuss the benefits of pg_receivewal, how to setup simple application failover, new pg_checksums and setting up pgBouncer.


Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about pg_receivewal, application failover, pg_checksums and PgBouncer. I'm Creston Jameson, and this is Scaling Postgres episode 87.

[00:00:21] Alright, I hope you're having a great week. Our first piece of content is "Never lose a PostgreSQL transaction with pg_receivewal", and this is from cybertec-postgresql.com. This article is talking about pg_receivewal. Basically, your PostgreSQL database is constantly producing WAL files, or write-ahead log files, and you need to keep those around so that you can restore your database up to a point in time if you have enabled archive log mode. Now, the typical way to do it is to just archive them with the archive_command. That command, as they mention here, copies the file to some local drive and then sends it somewhere else, or sends it to some network-attached storage like NFS, or secure-copies it somewhere, rsyncs it somewhere, or even hands it to some backup software. Basically, you eventually want to get the WAL off the database onto some other storage medium, so that it's safe in case something happens to the database it was generated on. Now, one thing they mention here, under the section on when archive_command isn't good enough, is that you can still lose some committed transactions, because archiving waits for a whole WAL file to fill up, which by default is 16 megabytes per WAL file. It waits until that WAL file is full before it archives it. So you could have a partially filled file, meaning you could be missing transactions if your system goes down and you only have what the archive_command has produced. But that is something the pg_receivewal utility can handle. It's actually designed to stream the WAL to a separate location. So it generates WAL files: it is not in and of itself a database, but it uses streaming replication to read the WAL stream and produce WAL files where it's running. An added benefit is that it also writes partial files: it uses a .partial extension for the file it is currently filling up as data is streamed through replication. You should also note that it's called pg_receivewal as of version 10; prior to version 10, it was called pg_receivexlog. And the section here about this ability to write partial files notes that pg_receivewal is an alternative to WAL archiving that avoids the gap between the current and the archived WAL location. So that is one extra benefit. I've used it to easily get WAL files off of the primary database server, because you can set up a dedicated system to run pg_receivewal, and that automatically gets the WAL files off of, say, your primary database system. It enables your primary database to have more resources to use for other purposes, since WAL archiving is handled by a separate system. They say that by default, the replication with pg_receivewal is asynchronous. You can make it synchronous, but if you do, the WAL has to be written to the pg_receivewal location before a transaction can be committed, so you can slow down your write performance. And if anything happens to that pg_receivewal process, your system won't accept any commits at all, which essentially means you're down. So what they suggest here is that, to avoid this type of problem, you need at least two synchronous pg_receivewal processes. This is a great post that talks about pg_receivewal and some of the benefits it has over the archive_command. So if you're not yet using it, definitely a blog post to check out.
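
As a rough sketch of what running it looks like, assuming a dedicated archive host (the host name, user, directory and slot name here are placeholders, not values from the article):

    # create a replication slot so the primary retains WAL until it has been streamed
    pg_receivewal --create-slot --slot=archive_slot -h primary-db -U replicator
    # stream WAL (including .partial files) into a local directory
    pg_receivewal -D /var/lib/wal-archive --slot=archive_slot -h primary-db -U replicator
    # adding --synchronous makes it flush and report write positions immediately;
    # as the article warns, only do that with at least two receivers running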

[00:03:55] The next post is "Seamless application failover using libpq features in PostgreSQL", and this is from Percona.com. This is talking about the ability of libpq, and by extension the psql client, to let you list multiple hosts to connect to as part of the connection string, and it will try each one in turn. So it gives you a way to do application failover, because a lot of application environments use libpq to connect to Postgres. You can also specify whether the connections you're making should go to a server in a read-write state or a read-only state. They give an example here where they have three servers, a primary and two standbys, and you can make a connection string like this, where you put a comma between each host you want to contact, and then you specify target_session_attrs=read-write. What it's going to do is run the command on the first read-write server it finds, so you can see it's connecting to 10: the first one listed in the string here is 20, and the second one is 10, which is the read-write server, the primary, so that's how it knows to connect to it. And secondly, you can connect to any server for reads. That example uses target_session_attrs=any, and it selects the first one in the list. So if something happened to go wrong with one of these servers, it would of course connect to the second one, and they show how they stop PostgreSQL on the 10 server, and with this connection string you can see it automatically connects to 20. Now, they're saying you're going to have some sort of delay as it tries to connect to that first one, but it will eventually connect to the second one. Then they give examples using Python and PHP of how you can use these connection strings in those application environments, and of course this exists for others as well. In those examples, you can see they're specifying target_session_attrs=read-write as well. This feature is native to libpq as of, I believe, around PostgreSQL 10, so it gives you a bare-bones way to do application failover. So if this is of interest to you, definitely a blog post to check out.
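
For reference, a multi-host connection string of the kind described here looks like the following; the addresses and database name are placeholders, not the ones from the Percona post:

    # connect to whichever listed host currently accepts writes (the primary)
    psql "host=10.0.0.20,10.0.0.10 port=5432,5432 dbname=appdb target_session_attrs=read-write"
    # connect to any reachable host, e.g. for read-only traffic
    psql "host=10.0.0.20,10.0.0.10 port=5432,5432 dbname=appdb target_session_attrs=any"
    # the same string works from any libpq-based driver, e.g. psycopg2 in Python:
    #   psycopg2.connect("host=10.0.0.20,10.0.0.10 dbname=appdb target_session_attrs=read-write")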

[00:06:15] The next post is "Auto Failover with PostgreSQL 12", and this is from the tech community on Microsoft.com, and they're talking about pg_auto_failover. With the release of PostgreSQL 12 and the changes to the recovery.conf file, utilities such as these needed to be updated. Basically, they've said they have updated it, and their version 1.0.5 will work with PostgreSQL 12. This seems to be a relatively simple solution for doing auto failover. They talk a little bit about what it can do currently, as well as what they have planned for the future of pg_auto_failover. So if you're looking for a failover solution, definitely a solution to check out that is compatible with PostgreSQL 12.
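
As a very rough sketch of how pg_auto_failover is driven (the paths, host name and monitor URI are placeholders; consult the project's documentation for the exact invocations):

    # initialize the monitor node that orchestrates failover decisions
    pg_autoctl create monitor --pgdata /var/lib/pgaf/monitor
    # initialize a Postgres node and register it with the monitor;
    # the second node registered this way is set up as the standby
    pg_autoctl create postgres --pgdata /var/lib/pgaf/node1 \
        --monitor postgres://autoctl_node@monitor-host/pg_auto_failover
    # run the keeper process, which keeps the node's state in sync with the monitor
    pg_autoctl run --pgdata /var/lib/pgaf/node1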

[00:07:00] The next post is "pg_checksums 1.0 released", and this is from credativ.com, and this is a utility that verifies, activates or deactivates data checksums in PostgreSQL instances.

[00:07:15] Now, there is already a pg_checksums utility in PostgreSQL, but this one allows online verification of checksums. That's pretty much its primary feature: you don't have to bring down the database to do the checksum process. It's also compatible with all supported versions of PostgreSQL, as opposed to just the most recent one, and they've made some other additions to pg_checksums as well, such as progress reporting during operation and I/O rate limiting. So if you want a more robust way of doing checksums with PostgreSQL, definitely a utility to check out.
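
To give a flavor of the command line, it looks something like the following; the flag names here mirror the in-core pg_checksums tool, so treat the exact spelling for credativ's version as an assumption and check its README:

    # verify checksums, with this fork optionally while the cluster is online
    pg_checksums --check --progress -D /var/lib/postgresql/data
    # activate or deactivate checksums on an existing cluster
    pg_checksums --enable -D /var/lib/postgresql/data
    pg_checksums --disable -D /var/lib/postgresql/data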

[00:07:52] The next post is "PgBouncer tutorial: installing, configuring and testing persistent PostgreSQL connection pooling". This is from Enterprisedb.com, and this is the first in what they say is going to be a multi-part series on using PgBouncer and how to handle failovers of the database and things of that nature. This first post is relatively simple: it's just installing PgBouncer, connecting it up to a Postgres instance, and then seeing, if you bounce the network connection, what impact that has. Can PgBouncer still reconnect to the database without losing its client connections? So, a relatively simple getting-started post, but I'm interested to see what's going to be coming in the second and third installments of this blog post series.
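
A minimal pgbouncer.ini for this kind of getting-started setup might look like the following; the database name, addresses and auth file are placeholders:

    [databases]
    ; route connections for "appdb" to the Postgres server behind PgBouncer
    appdb = host=10.0.0.10 port=5432 dbname=appdb

    [pgbouncer]
    listen_addr = 0.0.0.0
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    ; session pooling is the default; transaction pooling instead frees the
    ; server connection back to the pool at each commit or rollback
    pool_mode = session

Clients then point at port 6432 instead of 5432, and PgBouncer holds the server-side connections open across client reconnects.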

[00:08:41] The next post is "Monitoring PostgreSQL clusters in Kubernetes", and this is from Crunchydata.com. They're talking about using the PostgreSQL Operator developed by Crunchy Data, which manages PostgreSQL clusters on Kubernetes, and they tell you how you can set up monitoring for it using pgMonitor, which is an open source solution using Grafana, with Prometheus as a data store, to track reporting information from PostgreSQL. This post goes into how to set this up for a set of PostgreSQL clusters in Kubernetes, so if that's of interest to you, definitely a blog post to check out. And if you're looking at monitoring solutions in general, they do have a link to pgMonitor as a way to build something up relatively quickly, and I believe by default it works most easily with CentOS and Red Hat Linux.

[00:09:39] The next post is "A beginner's guide to formatting dates in SQL". This is from the Panoply.io blog, and it's a very simple post talking about timestamps and years, how to store certain dates, converting dates into a string, extracting certain parts of a date, as well as truncating to certain parts of a date. So if you're interested in learning some more about the date/time functions in PostgreSQL, definitely a blog post to check out.
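
As a quick illustration of the kinds of functions such a post covers (these are standard PostgreSQL date/time functions, not examples taken from the article itself):

    -- format a timestamp as a string
    SELECT to_char(now(), 'YYYY-MM-DD HH24:MI:SS');
    -- extract one component of a date
    SELECT extract(year FROM now());
    -- truncate a timestamp to the start of the month
    SELECT date_trunc('month', now());
    -- parse a string back into a date
    SELECT to_date('05 Dec 2019', 'DD Mon YYYY');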

[00:10:08] The next post is "PostGIS 3.0.0 is released". So now that PostgreSQL 12 has been fully released, they have released PostGIS 3.0.0. If you're interested in this, definitely a blog post and a project to check out. The next series of posts are a little bit older, but they're talking about text search in Postgres. The first post is "Using PostgreSQL to create an efficient search engine", and they compare the different ways you may want to do a search; sketches of each pattern follow below. For example, if you're looking for a prefix in a fuzzy query, you can use a B-tree index, and they show how to specify those searches. If you're looking for a suffix in a fuzzy query, you can do your searches using a reverse-column expression and a B-tree index. If you want to do a prefix and/or suffix in a fuzzy query, you can use the pg_trgm extension and GIN indexes, or you can use GIN and RUM indexes for full text search. The post goes into a very comprehensive discussion of full text searching in general: how to break words apart, defining search syntax, sorting algorithms, and other types of information. So if you want to get more information about text searching in Postgres, definitely a post to check out. Follow-on posts related to it cover the RUM index and full text search using PostgreSQL. That one compares storing data as an array or in a text field, using a GIN index or a RUM index, and seeing how the performance changes with different types of queries. The next post in the series is on using the built-in and custom ranking algorithms of PostgreSQL for full text searches. This goes into ranking algorithms using a tsvector or using a multidimensional array. And the last post is on term frequency statistics in PostgreSQL full text searches: how frequently or infrequently a particular term appears in a series of text. So if you're interested in these types of Postgres posts, definitely ones to check out.
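
To make those index choices concrete, here are sketches of each pattern; the table and column names are made up, and the RUM index needs the separately installed rum extension:

    -- B-tree supports anchored prefix matches: WHERE body LIKE 'abc%'
    CREATE INDEX ON docs (body text_pattern_ops);
    -- B-tree over reverse(body) supports suffix matches:
    --   WHERE reverse(body) LIKE reverse('%xyz')
    CREATE INDEX ON docs ((reverse(body)) text_pattern_ops);
    -- pg_trgm with a GIN index handles prefix/suffix/fuzzy matching
    CREATE EXTENSION IF NOT EXISTS pg_trgm;
    CREATE INDEX ON docs USING gin (body gin_trgm_ops);
    -- full text search via a GIN (or, with the extension, RUM) index
    CREATE INDEX ON docs USING gin (to_tsvector('english', body));
    SELECT * FROM docs
    WHERE to_tsvector('english', body) @@ to_tsquery('english', 'postgres & scaling');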

[00:12:16] That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or iTunes. Thanks.
