
Connection Scalability, Benchmarking Performance, Community Impact, zheap Progress | Scaling Postgres 135


In this episode of Scaling Postgres, we discuss connection scalability, benchmarking Postgres performance, the community impact of an acquisition and the progress of zheap.

Content Discussed

YouTube Video

Podcast Audio

Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about connection scalability, benchmarking performance, community impact, and zheap progress. I'm Creston Jamison, and this is Scaling Postgres episode 135.

[00:00:22] All right, I hope you, your friends, family and coworkers continue to do well. Our first piece of content is "Analyzing the limits of connection scalability in Postgres". This is from Citusdata.com, and they're talking about the limit on the number of connections you can have with Postgres. It does pretty well in the hundreds range, but once you get into the thousands of connections you usually have to reach for a pooler such as PgBouncer or Pgpool to handle the higher connection counts. Of course, that comes with downsides. For example, you have to use transaction pooling, and as a consequence you can't really use prepared statements or temporary tables, and you can't set things within a session, so it precludes using a fair number of features. It looks like Citus Data, now that it is part of Microsoft, is working to find out which areas need improvement to support a much larger number of connections, maybe in the 10,000 or even 100,000 range. Some of the issues they're considering: is it memory usage, with connections using up too much memory? Is it snapshot scalability, that is, taking the snapshots that determine what a particular connection can see with regard to MVCC? Or does the connection model need to change fundamentally to support higher numbers of connections? He went through several scenarios looking at memory usage. There is some constant per-connection overhead, but when he looked at this in more depth, a connection generally uses below two megabytes based on the analysis he has done. He said it's really hard to measure and easy to overstate, and there's actually a second post I'll mention, "Measuring the memory overhead of a Postgres connection", by the same author on his "Postgres from below" blog. There he goes through the process of analyzing how much memory connections actually take, and while the details are in that post, the conclusion is that the per-connection overhead is relatively small, so that, he said, is pretty manageable. The next area is cache bloat. When you have one connection touching many different objects, that connection's caches grow, and he demonstrates some of the size increases that cause this. Still, he doesn't consider cache bloat a major issue at the moment, and, quote, "a common solution for the cache bloat issue is to drop old connections from the application / connection pooler after a certain age. Many connection pooler libraries / web frameworks support that." So that basically addresses the cache bloat issue. The third area is query memory usage, which concerns work_mem. But when you want a lot of connections, say for OLTP workloads, you're probably not going to set a very high work_mem, so he didn't attribute a significant cost to that. Then he looked at snapshot scalability and the fact that just having a large number of connections, even if they're not being utilized, has a measurable impact on performance. He shows this where you have one active connection with different levels of idle connections: performance drops by more than half going from one active connection with no idle connections to one active connection with 10,000 idle connections. The same thing happens with 48 active connections and 10,000 idle ones; again it's nearly a twofold drop in transactions per second. So even though they're idle, those connections still cost something. He used a CPU profile to find out what was responsible, and it looked like it was the GetSnapshotData() function, which does the bulk of the work necessary to provide readers with transaction isolation. He says these snapshots are built very frequently, at least once per transaction and commonly more often. So he identified this as the significant limit to work on moving forward.
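As a quick way to see how this applies to your own system, you can check how many of your connections are actually doing work at any moment. This is a minimal sketch using the standard pg_stat_activity view; what counts as "too many" idle connections is, of course, workload-dependent:

```sql
-- Count client connections by state to see how many are sitting idle
SELECT state, count(*) AS connections
FROM pg_stat_activity
WHERE backend_type = 'client backend'
GROUP BY state
ORDER BY connections DESC;
```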

[00:04:25] The next thing he took into account is the connection model, particularly context switches and how that model is fundamentally built. He said there are areas here that are ripe for improvement, but it would be a huge amount of work. So moving forward he's focusing on the snapshot scalability portion to get some early wins. The next link shows him working on this as part of a patch that looks like it was started all the way back in March, so it has been actively worked on to try to get Postgres to support larger numbers of connections without necessarily having to rely on something like PgBouncer or Pgpool. Overall this was a very interesting blog post and I definitely suggest you check it out, as well as the companion piece mentioned earlier, "Measuring the memory overhead of a Postgres connection", if you're interested.

[00:05:19] The next piece of content is "OLTP performance since PostgreSQL 8.3". This is from 2ndquadrant.com, and it assesses many different Postgres versions and their performance on a single piece of hardware, going back to 8.3. The first example here is an NVMe drive with a read-only workload on a relatively small database, 1.6 GB. It shows each version in clusters of data points, each cluster with an increasing number of connections, I believe. You can see that something significant happened in 9.2; from there performance has fluctuated across versions, but with no major regressions. When you bump the scale up by ten to a 16 GB database, which still fits in memory but not in shared buffers, you can see performance improvements at 9.2 and again at 9.5. Similarly, when you go up to a 160 GB database, and this system had a consistent 64 GB of RAM, so the data doesn't fit in memory, you mostly just see the big improvement that happened at 9.5. But notice, looking at some of these graphs, that the performance of 13 has dropped relative to 12. I have noticed this in a number of the graphs, so it definitely depends on the workload: even though there have been performance improvements for some workloads, it's not always a win going from, say, 12 to 13. Next is the NVMe drive with a read-write workload, and again this is pretty consistent across recent versions at a small, medium and even a large database size, but there is a slight regression going from 12 to 13. Maybe there will be further optimizations in later point releases, but at least that's what the data indicates for this pgbench test he carried out. He also covers a SATA drive to look at the performance there. He has other posts looking at the future of performance as well, one on improved scalability with many connections, which refers back to the post we were just talking about, and one on support for non-volatile WAL buffers. That one is particularly interesting: using non-volatile memory in the server to buffer the WAL for better performance, a patch that's being developed and worked on. So if you're interested in performance, particularly across versions, this is definitely an interesting blog post to check out.

[00:08:02] The next piece of content is "PostgreSQL TPCC benchmarks: PostgreSQL 12 vs. PostgreSQL 13 performance". Here they compared 12 and 13 on the same hardware, both tuned and untuned, and they used HammerDB to do the comparisons, not pgbench, ramping up the number of users. For untuned performance the two versions were pretty much identical, just a little bit higher on 13. But there's a dramatic difference when you actually tune the server, comparing untuned to tuned on both 12 and 13, and you can see the advantage 13 gives, particularly at the sweet spot of around 250 users for this workload. So here they're showing 13 a little bit ahead of 12 in terms of performance. Again, it seems to depend on your workload what difference you can expect with Postgres 13 versus Postgres 12. Another interesting benchmarking post.
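The episode doesn't list the exact tuning used in the benchmark, but "tuned" in this context generally means moving a handful of memory and checkpoint settings off the conservative defaults. A rough sketch, with purely illustrative values that would need to be sized for your own hardware:

```sql
-- Illustrative values only; size these for your own RAM and workload
ALTER SYSTEM SET shared_buffers = '8GB';              -- requires a restart to take effect
ALTER SYSTEM SET effective_cache_size = '24GB';
ALTER SYSTEM SET work_mem = '32MB';
ALTER SYSTEM SET max_wal_size = '16GB';
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
SELECT pg_reload_conf();                              -- applies the reloadable settings
```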

[00:09:03] The next piece of content is "Community impact of 2ndQuadrant purchase". This goes back to something I mentioned before: last week it was announced that EDB acquired 2ndQuadrant. That has some benefits, because now there's a larger organization to advocate for Postgres, but the disadvantage is that there's now less diversity in the community, and this very short post speaks to just that. He mentions that EDB's representation on the core team is now 60%, so more than half of the team is from EDB, and he says the core team is working on a solution for this, so it's good that they're acknowledging it. He also talks about how that diversity is a strength of the Postgres community, and he even mentions something interesting: the risk that an even larger company wanting to hurt Postgres could acquire EDB, taking it in a direction that is neutral or negative for the Postgres community, which wouldn't be too great. On the other side of this, it is interesting that the work discussed in the first post is actually happening at Microsoft, so that's another large organization furthering Postgres. I do hope the diversity of companies supporting Postgres increases. Related to this is another post, "EDB to community: acquisition statement", from Enterprisedb.com, and it speaks to the exact same thing, making a commitment to the open source nature of Postgres. They state, quote, "fortunately, internal PostgreSQL community rules prevent any one company from taking control of the project." So again, it's something they are aware of and are paying attention to, but it's something to keep in mind. Here are a couple of blog posts speaking to this potential issue for the Postgres community, so check those out if you're interested.

[00:10:51] The next piece of content is "zheap: reinvented PostgreSQL storage". This is from Cybertec-postgresql.com, and they're referring to the new storage system that was actually started by EDB a number of years ago. I hadn't heard anything about it recently; as a matter of fact, I was looking up some information on it yesterday because I have a client with a very heavy update workload, and I had noticed it's been kind of quiet for a while. Then this post popped up and gives a status update. They say that, quote, "to make zheap ready for production", they're proud to announce that their partners at Heroic Labs have committed to fund the further development of zheap and release all code to the community, and Cybertec has decided to double that amount of money and put up additional expertise and manpower to move zheap forward. This is great news in my opinion: this new storage system is continuing to be developed and will hopefully be released at some point. To remind you, zheap is a reimagining of the storage system as a way to potentially avoid vacuum. The way Postgres works now, when you have an update or a delete, the old row version gets marked for future removal; it's hidden from new transactions but still remains there for others, and later, once those old rows are no longer needed for visibility purposes, they are removed by vacuum. The core of what zheap does is keep this information in a separate area. So when an update happens, it updates the row in place, and the old information is stored in an undo log. If the transaction needs to roll back, it consults that separate storage area to undo the change, and if an older transaction needs visibility into what the row contained, it also consults the undo log. As they say here, the key design goals are: perform updates in place so you don't have to vacuum, have smaller tables because the fundamental storage format has been re-engineered, reduce writes as much as possible, and reuse space more quickly. They go over the basic design of zheap and how most of the transaction information that the current storage system stores with each tuple is instead stored at the page level in something they call slots, and the post discusses this at a technical level. I don't have time to cover it all here, but feel free to review the post to understand how it works. They also talk about the undo log and how inserts, updates and deletes are each handled differently. Basically, the goal is to not have to vacuum up all of these dead tuples when an update or delete happens: you store things in the undo log, and when they're no longer needed they get removed from that storage area. So this could be very beneficial for heavy update workloads, as opposed to, say, heavy insert workloads. If you're interested in learning more about zheap, definitely check out this blog post.
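For context, the zheap prototypes are built on the table access method interface added in Postgres 12, so switching a table over is expected to look something like the following. This is a hypothetical sketch based on how the prototype is described, not a released feature:

```sql
-- Hypothetical usage: create a table with the zheap access method instead of
-- the default heap (only works on builds that actually include zheap)
CREATE TABLE heavy_update_table (
    id      bigint PRIMARY KEY,
    counter bigint NOT NULL
) USING zheap;
```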

[00:13:38] The next piece of content is "How to fix PostgreSQL performance issues with PG Extras". This is from Paweł Urbanek, and he's talking about a tool called PG Extras that's available for a number of different languages, from Ruby to Elixir to Node.js to Python. It's essentially a utility that, from those languages, consults the system views to pull out information related to performance. You do have to enable the pg_stat_statements extension to get some of the relevant information, but it does things like help you check your cache hit ratios with a very simple call such as cache_hit, as opposed to writing a query against the system tables yourself. It also helps you identify unused indexes to remove, add missing indexes, look for too many sequential scans, identify deadlocks and what locks are occurring, get rid of unnecessary bloat by giving you a bloat calculation, and remove unneeded objects by checking the size of database tables. So if you're interested in a quick tool to get some of this information, you can check out this blog post.
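Under the hood these helpers run queries against the statistics views. For example, a cache hit check similar to PG Extras' cache_hit command can be approximated directly in SQL like this (the exact query the tool runs may differ):

```sql
-- Rough table and index cache hit ratios from the statistics views
SELECT 'table hit ratio' AS metric,
       round(sum(heap_blks_hit)::numeric
             / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0), 4) AS ratio
FROM pg_statio_user_tables
UNION ALL
SELECT 'index hit ratio',
       round(sum(idx_blks_hit)::numeric
             / nullif(sum(idx_blks_hit) + sum(idx_blks_read), 0), 4)
FROM pg_statio_user_indexes;
```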

[00:14:46] The next piece of content is "Logical decoding of large in-progress transactions in PostgreSQL". This is a post from Enterprisedb.com about a new feature they're working toward. Typically a subscriber waits for a transaction to be committed before it applies all of that transaction's changes to the destination database, but with this feature the data can start streaming to the subscriber before the final commit, and once that commit happens, it is committed on the subscriber as well. So this could be advantageous for very long-running or large transactions. It looks like this is scheduled to go into Postgres 14, so if you're interested, check out this blog post.
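In the Postgres 14 implementation this surfaces as a subscription option, so on the subscriber it is expected to look roughly like this (connection details and object names are placeholders):

```sql
-- Hypothetical subscriber-side setup that opts in to streaming
-- of large in-progress transactions (Postgres 14+)
CREATE SUBSCRIPTION big_tx_sub
    CONNECTION 'host=primary.example.com dbname=appdb user=replicator'
    PUBLICATION big_tx_pub
    WITH (streaming = on);
```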

[00:15:31] The next piece of content is "Webinar: Highway to zero downtime PostgreSQL upgrades - follow up". This is a webinar from 2ndquadrant.com, and you can click the link to get access to it. They talk about the different ways to do an upgrade and how to get down to zero downtime. They cover using pg_dump, then pg_upgrade, and then using logical replication, which is the best way to get zero downtime. But there are certain things you need to take into account when doing that: sequences don't get copied over, and you may have DDL changes you need to be aware of. Their tool, pglogical, helps alleviate some of these problems and makes things easier. If you're interested, you can check out this webinar.
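The general shape of a logical-replication upgrade, whether you use built-in logical replication or pglogical, is a publication on the old cluster and a subscription on the new one, plus manual handling of the things that don't replicate, such as sequences. A minimal sketch with hypothetical object names:

```sql
-- On the old cluster (publisher):
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- On the new cluster (subscriber), after restoring the schema:
CREATE SUBSCRIPTION upgrade_sub
    CONNECTION 'host=old-primary.example.com dbname=appdb user=replicator'
    PUBLICATION upgrade_pub;

-- Sequences are not replicated, so set them manually just before cutover, for example:
SELECT setval('orders_id_seq', (SELECT max(id) FROM orders));
```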

[00:16:16] The next piece of content is "Using CTEs to do a binary search of large tables with non-indexed correlated data". This is from Endpoint.com, and it was an interesting use case. They had a database they didn't really have permission to add indexes to, but they needed to query it. It had an ID column backed by an incrementing sequence and a created_at date. The created_at column was not indexed, whereas the ID column was, and filtering and sorting by created_at gave poor performance. So they used the ID column to determine which rows to pull. To identify which ID corresponded to a particular date, they developed a WITH RECURSIVE query that essentially does a binary search to find the ID range, and then ran the real query by ID so it could use the index. So they recursively probed different dates to find the IDs, and then queried using just the ID, as shown in the sketch below. It's quite an interesting use case, and if you want to find out more, check out this blog post.
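The blog post has the full query; a simplified sketch of the idea looks like the following. Table and column names are hypothetical, and it assumes created_at increases along with id and that the target date falls inside the table's range. Each step of the recursion probes a single row by the indexed id to decide which half of the id window to keep:

```sql
-- Binary-search the indexed id range for the first id created on or after a target date
WITH RECURSIVE search(lo, hi, depth) AS (
    SELECT min(id), max(id), 0
    FROM events
  UNION ALL
    SELECT CASE WHEN probe.created_at < DATE '2020-01-01'
                THEN (lo + hi) / 2 + 1 ELSE lo END,
           CASE WHEN probe.created_at < DATE '2020-01-01'
                THEN hi ELSE (lo + hi) / 2 END,
           depth + 1
    FROM search
    CROSS JOIN LATERAL (
        SELECT created_at
        FROM events
        WHERE id >= (search.lo + search.hi) / 2
        ORDER BY id
        LIMIT 1
    ) AS probe
    WHERE lo < hi
)
SELECT hi AS first_id_on_or_after   -- boundary id for the target date
FROM search
ORDER BY depth DESC
LIMIT 1;
```

The boundary id it returns can then be used in an ordinary WHERE id >= ... query, which the primary key index handles efficiently.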

[00:17:25] The next piece of content is "PostgreSQL monitoring for application developers: the vitals". He's talking about monitoring the vital statistics of CPU, memory, disk and network utilization, why these are the core metrics he likes to use, and how they can help you monitor your Postgres. If you're interested, check out this blog post. The next piece of content is "Heap file and page in details". This is from Highgo.ca, and they're talking about the low-level structure of the heap and the page format being used. It references a lot of C code and gives examples of how things work in the storage system. So if you're interested in that level of detail, check out this blog post.
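If you want to poke at that on-disk page structure yourself, the pageinspect extension that ships with Postgres exposes it from SQL. A small example (superuser access is required, and the table name is a placeholder):

```sql
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Line pointers and tuple headers from the first page of a table's heap
SELECT lp, lp_off, lp_len, t_xmin, t_xmax, t_ctid
FROM heap_page_items(get_raw_page('my_table', 0));
```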

[00:18:06] The next piece of content is "Configuring Pgpool-II watchdog: it's going to be a lot easier". They're talking about configuring Pgpool-II and its watchdog, which determines whether an instance is down or not. Apparently the configuration used to require a different configuration file for each node, and they're moving to a unified configuration file that applies to all nodes, which is definitely easier. If you're interested in learning more, you can check out this blog post.

[00:18:35] The next piece of content is "How to set up PostgreSQL monitoring in Kubernetes". This is from Crunchydata.com, and I believe it uses their PostgreSQL Operator tool to set this up. So if you're interested in that, you can check out this blog post.

[00:18:50] The next piece of content is "Waiting for PostgreSQL 14: support for OUT parameters in procedures". Apparently with Postgres 14 you can now use OUT parameters to return something from a procedure that's run, so you can check this out for more details.
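As a quick illustration of the feature, a procedure with an OUT parameter might look like this in Postgres 14. This is a trivial made-up example; when calling from plain SQL you pass a placeholder such as NULL for the OUT argument:

```sql
CREATE PROCEDURE add_numbers(a integer, b integer, OUT total integer)
LANGUAGE plpgsql
AS $$
BEGIN
    total := a + b;
END;
$$;

-- Returns a result row with total = 5
CALL add_numbers(2, 3, NULL);
```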

[00:19:06] And the last piece of content: the PostgreSQL person of the week is Andreas Kretschmer. If you're interested in learning more about Andreas and his contributions to Postgres, definitely check out this blog post. That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode, or you can subscribe via YouTube or iTunes. Thanks.
