
arm64 with apt, Contributors, Backup Manifest, Now Functions | Scaling Postgres 113


In this episode of Scaling Postgres, we discuss arm64 package support for apt, the annual Postgres contributor rankings, backup manifests & verification, and the different now functions.

Content Discussed

YouTube Video

Podcast Audio

Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about arm64 with apt, contributors, backup manifests, and now and timestamps. I'm Creston Jamison, and this is Scaling Postgres, episode 113.

[00:00:22] Alright, I hope you, your family and coworkers continue to do well with the situation that continues to be with us. But our first piece of content is arm64 on apt.postgresql.org, and this is from Christoph Berg's blog. Basically, the apt repository for PostgreSQL now covers the arm64 architecture. So these are for Arm processors, and the Raspberry Pi is a popular one. They're not supporting all Arm processors, but they are doing arm64, which is the 64-bit version. So these are now available, and they're targeting Debian buster, bullseye, and sid, as well as bionic and focal for Ubuntu. They also mention that the focal repository for Ubuntu 20.04 is now available, along with a note about old PostgreSQL versions. So if you potentially want to use an Arm processor, maybe you want to check this out now. Now, normally this has been for smaller devices, but I know that Amazon EC2 instances, not that I'm necessarily advocating them, but I know they've had A1 instances for a while that have a custom-built Graviton processor that uses the Arm architecture, and they also have a newer M6g for medium workloads that gets up to 256GB of memory. They're using their second version of the Graviton processor, and they state that they deliver up to 40% better price performance over the current-generation M5 instances, which generally use Intel chips. Now, I see that it looks like there's a greater than 20% savings, so maybe there's more performance with these, I'm not really sure, but definitely having another processor architecture available is interesting.
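If you want to try it, here is a minimal sketch of adding the PGDG apt repository on an arm64 machine; it works the same as on amd64. The release codename and package version here are assumptions, so adjust them for your distribution:

    # install prerequisites and the repository signing key
    sudo apt-get install curl ca-certificates gnupg
    curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

    # "focal" is assumed here; substitute your distribution's codename
    sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt focal-pgdg main" > /etc/apt/sources.list.d/pgdg.list'

    # apt selects the arm64 build automatically on arm64 hosts
    sudo apt-get update
    sudo apt-get install postgresql-12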

[00:02:14] The next post is Who contributed to PostgreSQL development in 2019? And this is from Robert Haas' blog. This is, as he says, the fourth annual post on who contributes to PostgreSQL development. Contributors are ranked here by the number of lines contributed; the percentage of lines is also represented, along with the number of commits. So you can see who's taking the top spots here, as well as committers of non-self-authored code and those people with at least 80 emails to the pgsql-hackers list. So definitely thank you to everyone who contributes to PostgreSQL and makes it what it is today. And this is from the blog at rhaas.blogspot.com.

[00:03:00] The next post is Backup manifests and pg_verifybackup in PostgreSQL 13. So this is something to look forward to in 13, where they are now providing a backup manifest to track what files have been backed up, as well as a means to verify the backup. This post from 2ndQuadrant.com walks through this new feature set. So basically, what is the content of the backup manifest file? It's a JSON object with the following keys: the manifest version, all the different files included in the backup, WAL ranges that include information such as the timeline and the LSNs of the backup start and backup end, as well as a checksum. So he runs through the process. He does a pg_basebackup, he's not streaming the WAL here, and specifies the directory. He looks at the content of the backup directory and there's a new file called backup_manifest, and here's what the JSON file looks like. And then of course, with this file you can verify the backup with pg_verifybackup, and he goes through and does that validation. Now, he took this backup without the WAL files because he was handling those separately. That's what the -X none means: do not back up the WAL files. So it's going to result in an error when trying to verify it, and he actually skips the validation of the WAL files using the -n option, and you can see it was verified. My assumption is that you could specify the location of the WAL files if you didn't record them, but that wasn't mentioned in this post. And here he actually goes in and modifies a WAL file to essentially corrupt it, and you can see it does result in an error when doing a pg_verifybackup. So, definitely an interesting feature coming in Postgres 13, and if you're interested, check out this blog post.
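To make the workflow concrete, here is a minimal sketch of the steps described; the backup directory path is made up for illustration:

    # take a base backup without including WAL (the post's -X none example)
    pg_basebackup -D /tmp/backup1 -X none

    # the manifest is a JSON file at the top level of the backup directory
    cat /tmp/backup1/backup_manifest

    # verify files and checksums; -n skips WAL parsing since WAL wasn't included
    pg_verifybackup -n /tmp/backup1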

[00:04:54] The next piece of content is PostgreSQL: now() vs 'now'::timestamp vs clock_timestamp(). So now() is a function that returns what the current time is. 'now'::timestamp, on the other hand, is actually a constant. So whenever you execute now() it will return the current timestamp, but 'now' is a constant that stores the value of what the current timestamp is at the moment it's evaluated. In the first example he shows here, when it's part of a transaction, both will still return the same value. For both of these methods, the function and the constant, it returns the exact same timestamp. Then, still within this transaction, if you sleep for 10 seconds, it will still return that exact same timestamp; you can see the identical fractional seconds repeated. So essentially, time according to now() is frozen inside this transaction. So where do you see differences with the constant? Well, if you use it as part of creating a table with a default. For example, if you say DEFAULT now(), it will always return what the current timestamp is, so if you examine this table you can see the default is now(). Whereas if you specify the constant, it will take the time now and store that as the default; here you can see it's actually storing that fixed time. So I think this is probably the more limited use case for 'now'::timestamptz. Now, the difference between now() and clock_timestamp() is that clock_timestamp() gets constantly updated. So for that previous example, within the transaction, clock_timestamp() would continue to be updated with what the current clock time is, whereas now() maintains the same value. So this is a pretty good review of these different timing functions, and if you want to learn more and look more in depth at his examples, definitely check out this blog post from CYBERTEC at cybertec-postgresql.com.
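Here is a minimal sketch of the behaviors described; the table names are made up for illustration:

    BEGIN;
    SELECT now(), clock_timestamp();  -- both show the current time
    SELECT pg_sleep(10);
    SELECT now(), clock_timestamp();  -- now() unchanged, clock_timestamp() advanced
    COMMIT;

    -- DEFAULT now() is re-evaluated for every insert
    CREATE TABLE t_func (created timestamptz DEFAULT now());

    -- DEFAULT 'now'::timestamptz is evaluated once, when the table is created
    CREATE TABLE t_const (created timestamptz DEFAULT 'now'::timestamptz);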

[00:06:48] The next piece of content is actually Bruce Momjian's site at momjian.us, where he posts nuggets of wisdom from time to time. They're very short, but one was particularly noteworthy this week that I wanted to mention, and that is that with PostgreSQL 12, CTEs or WITH clauses are now inlined by default, and there is a MATERIALIZED keyword you can use to determine whether to inline them or not. So this is one of the few optimizations you can add to a SQL query within Postgres. Normally there is no way to add optimizer hints, but with your CTEs you can specify whether they should be materialized or not. So that's just something to keep in mind: once you eventually do upgrade to Postgres 12, if you have some CTEs or WITH queries that aren't as performant as you think they should be, check out the MATERIALIZED keyword to see whether materializing them or not gets better performance for your queries.
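As a quick illustration of the syntax (the table and filter here are hypothetical):

    -- Postgres 12 default: the CTE may be inlined into the outer query,
    -- letting the customer_id filter push down into the scan
    WITH recent AS (
        SELECT * FROM orders WHERE created_at > now() - interval '7 days'
    )
    SELECT * FROM recent WHERE customer_id = 42;

    -- force the pre-12 behavior: compute the CTE once, as an optimization fence
    WITH recent AS MATERIALIZED (
        SELECT * FROM orders WHERE created_at > now() - interval '7 days'
    )
    SELECT * FROM recent WHERE customer_id = 42;

There is also a NOT MATERIALIZED form to explicitly request inlining.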

[00:07:51] The next post is Index corruption in PostgreSQL: the hidden cost of your queries. So basically it's talking about index corruption, how to potentially identify and see it, and then, of course, what to do about it. Now, I personally haven't encountered this, knock on wood, but the main way to see it is that queries aren't using the index for whatever reason, and the way to resolve it is to rebuild the index. If you're on version 12, you can REINDEX CONCURRENTLY; on previous versions, just create the index again concurrently and then drop the old one once the new index is active. So if you're interested in learning more about this, definitely check out this post from EnterpriseDB.com.
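For reference, a sketch of the rebuild options mentioned; the index, table, and column names are placeholders:

    -- Postgres 12 and later
    REINDEX INDEX CONCURRENTLY my_index;

    -- earlier versions: build a replacement without blocking writes, then swap
    CREATE INDEX CONCURRENTLY my_index_new ON my_table (my_column);
    DROP INDEX CONCURRENTLY my_index;
    ALTER INDEX my_index_new RENAME TO my_index;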

[00:08:32] The next post is The Best Medium-Hard Data Analyst SQL Interview Questions, and this is from a document that's on Quip.com. I don't know if there's a related post somewhere, but it goes over some different interview questions and how to approach them in SQL. For example, number one is a month-over-month percentage change, number two is tree structure labeling, number three is retained users per month, number four is cumulative sums, number five is rolling averages, number six is multiple join conditions, and then there are several window function practice problems as well. So if you're really wanting to understand more of SQL, and again these are medium-hard problems, definitely check out this piece of content.
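As a taste of the first problem, a month-over-month percentage change can be computed with a window function; the monthly_revenue table and its columns are hypothetical:

    SELECT month,
           revenue,
           round(100.0 * (revenue - lag(revenue) OVER (ORDER BY month))
                 / lag(revenue) OVER (ORDER BY month), 1) AS pct_change
    FROM monthly_revenue
    ORDER BY month;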

[00:09:23] The next post is Local Persistent Volumes and PostgreSQL usage in Kubernetes. This is from 2ndQuadrant.com, and they're talking about using Kubernetes and the type of storage. They're talking about setting up a local persistent volume to check its performance versus network storage, and they believed that the direct storage would result in better performance, so they wanted to verify this. They have their test system set up here, and then they ran different performance benchmarks, and here are the results. So, looking at sequential reads and writes, these are the disks as declared by the manufacturer, and then this is their bare-metal result. And I believe they did RAID these with a RAID 1, so that explains why the reads are twice as fast as what the disk normally is. So this is about twice as fast. Using local persistent volumes, you can tell it's within a few percentage points, just a hair slower than bare metal, and the same thing for the writes. Whereas if you're looking at the OpenEBS cStor pool volume, a more network-oriented solution, you can tell the performance dropped dramatically: essentially one-twentieth of the read performance and one-tenth of the write performance. So what this communicates to me is that if you're wanting to use PostgreSQL on Kubernetes, you should focus on using local persistent volumes versus the cStor pool volume. But if you're wanting to learn more about that and the methodologies they used for testing, definitely check out this post from 2ndQuadrant.com.

[00:10:55] The next post is Multi-Kubernetes Cluster PostgreSQL Deployments. So this is talking about deploying PostgreSQL in multiple regions or data centers using Kubernetes and federation, and it runs through the process of doing it for two Kubernetes clusters. Now, they're having to use an active-standby scenario, but it goes through and describes how you can do this using their Crunchy Data PostgreSQL Operator. And this post is from Crunchydata.com. So if you're interested in learning how to do that, check out this post.

[00:11:28] The next post, also from Crunchydata.com, is Deploy pgAdmin 4 with PostgreSQL on Kubernetes. So if you're wanting to do that to help manage the multiple instances of Postgres you're going to be running, if you're using Kubernetes, you can check out this blog post.

[00:11:44] The next piece of content is that Jimmy Angelakos is the next PostgreSQL Person of the Week. So if you're wanting to learn more about Jimmy and his contributions to Postgres, definitely check out this post.

[00:11:57] The next piece of content is a review of four top PostgreSQL books. This is from the EnterpriseDB.com blog, and I mentioned a YouTube video on this previously. Well, this is basically just a text version of it. It mentions the different books that were covered as well as a synopsis of each. So if you're wanting a more text-based version of that content, you can check out this blog post.

[00:12:20] And lastly is Routing with PostgreSQL and Crunchy Spatial. So this is from Crunchydata.com, and they're discussing the pgRouting extension to do routing on dynamically generated graphs. So if you're wanting to calculate routing using this in conjunction with PostGIS, definitely check out this blog post.
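As a taste of what that looks like, here is a shortest-path lookup with pgr_dijkstra; the road_edges table and the vertex ids are hypothetical, and pgRouting expects id, source, target, and cost columns in the edge query:

    SELECT seq, node, edge, cost
    FROM pgr_dijkstra(
        'SELECT id, source, target, cost FROM road_edges',
        1,    -- start vertex id
        100   -- end vertex id
    );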

That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you could subscribe via YouTube or iTunes. Thanks.