Index Deduplication, Fast Hierarchy, More Monitoring, Cloud-Native

In this episode of Scaling Postgres, we discuss index deduplication in Postgres 13, fast hierarchical access, more essential monitoring and cloud native Postgres.

Content Discussed

YouTube Video

Podcast Audio

Transcript

All right. I hope you, your friends, family, and coworkers continue to do well. Our first piece of content is "DEDUPLICATION IN POSTGRESQL V13 B-TREE INDEXES". This is from cybertec-postgresql.com. They're talking about an enhancement that's coming in version 13 where they're reducing duplication in B-tree indexes. So basically allowing the index to not track every single value if it exists as a duplicate. And with this, you get some pretty significant space savings. So they say it's similar to how a gin index works: if the index keys for different table rows are identical, a gin index will store it as a single index entry. Now, that's not necessarily how this works, but they're using the same concept of not having to store every single identical value in the index to make storage more efficient.

They link to the commit here and they say, quote "The big difference is that duplication index entries can still occur in B-tree indexes". They even include in the bottom here where they're testing it out, they show the table, the index, and they inserted what looks like 20 million rows and then they compared the size of the indexes. Now, if you look at the gin index, that's pretty small compared to the B-tree index. In general, just looking at version 12, 51 megabytes for the gin index, and 400 megabytes for the B-tree index. But access is a little bit slower compared to the B-tree index, at least in this use case. But in version 13, the B-tree index is 126 megabytes.

So not as small as gin as the gin index because again, it still does indeed duplicate everything. But there's a significant savings going from 408 megabytes down to 126 megabytes. So in this example, a 3x savings in space, which is huge. So that means this index could fit in a smaller amount of memory, which may speed up access to it. So it's another great enhancement. Now, to get the benefits you need to do something similar to how the index changes for version twelve were done, which is that you need to reindex. So if you do, say, a reindex concurrently in your indexes, you should get space savings for those indexes once you upgrade to version 13. But if you're interested in learning more, definitely check out this blog post.

The next post, also from cybertec-postgresql.com, is "POSTGRESQL: SPEEDING UP RECURSIVE QUERIES AND HIERARCHIC DATA". So the focus of this is data that exists in the hierarchy, and a lot of times you can use the WITH RECURSIVE CTE to query it. But they have another option here to speed up these types of queries using something called ltree, which is part of the contrib package. He says, quote "Ltree implements a data type ltree for representing labels of data stored in a hierarchical tree-like structure". So since this is an extension, you go ahead and install it. He created a table and inserted some data into a table linking a parent and child to one another.

So for example, B is the child, A is the parent and he built this hierarchy here. Then using ltree you can get a listing of the hierarchies. So for example, if you say SELECT using ltree, it represents each label separated by a dot and you can append rows to the end of the hierarchy. But he says to be careful if the column is NULL because then you can't append it whereas if it's blank then you can and in terms of the label you have to use essentially letters because special characters like a dot will result in an error. Now, he made a materialized view to precalculate these values so he used the WITH RECURSIVE to create them and he's using the ltree so that the table looks like this and the path to it is shown in the ltree column here and then you can query this data.

So query the hierarchy using where the statement contains, for example, this WHERE and it pulls out all of that set of hierarchy. So as he says this is looking for everybody in A, B, C, and G and then you can also use an index with this. Now he doesn't mention but I believe he's using a GiST index to be able to use this. I believe it contains an operator so that you can speed up access to it. So if you have a need for querying hierarchical data and you want to speed up access to it, maybe check out the ltree extension to see if it can assist you with doing that.

Next post is "Essential PostgreSQL Monitoring - Part 2" and "Essential PostgreSQL Monitoring - Part 3". These two posts are from pgdash.io. So in a previous edition of Scaling Postgres, we covered Part 1 which covered cluster-level parameters for PostgreSQL. Part 2 covers database-level ones. So for example, it says one thing you want to monitor is connected clients. Well, it shows you exactly how to do that by just looking at the pg_stat_activity table, and what query you would run if you want to track the size of your database. This is the query you would use. How to track table bloat across all tables in index bloat and they have some tools that you would need to do to pull that out.

Monitoring long-running transactions, looking at deadlocks, and then looking at the oldest vacuum and oldest ANALYZE and what tables you need to query. Just be able to see that. Part 3 covers table index and system-level metrics to monitor, including the table size, table bloat, how many sequential scans are occurring as well as indexes their size bloat, as well as their cache hit ratio in terms of system level. It covers the memory used, the load average, and then the free disk space. So these have been a great list of things to monitor for PostgreSQL. I definitely suggest checking out these blog posts to keep them as a handy reference to see what else you should potentially be monitoring in your PostgreSQL installation.

The next piece of content is "Webinar: Cloud Native BDR and PostgreSQL [Follow Up]". So this is a webinar that they produced at 2ndquadrant.com and they're talking about their Cloud Native BDR and PostgreSQL. So Cloud Native basically means container enabled, I've come to understand. So it's basically using Kubernetes to manage PostgreSQL or their bi-directional replication product, BDR. It goes over an interesting presentation where they're talking about a number of decades ago, we were running on, essentially, bare metal servers, and then we started running on virtualized environments for the last decade or so and now we're kind of moving into the container realm. Now I definitely understand it from an application perspective in databases.

I'm still not really running them on containers again, because of some of the disadvantages that potentially exist with local disk access or fast disk access, which you definitely want with PostgreSQL. But again, that has been continuously improving and they're talking about using the operator that they developed to run PostgreSQL in a Kubernetes cluster and have it working and performing well. Now, what I think is more of a game changer personally is their BDR product because that is essentially a multi-master database. There, it makes a whole lot of sense to run a workload like that on Kubernetes because essentially one database is the same as another within the cluster.

But again, with that product, you have to keep in mind, you have to develop your application differently. You can't do certain things in that type of environment compared to normal PostgreSQL replication where there's always a primary and they can do things like utilize sequences to produce IDs and things of that nature. But I found it a very interesting webinar that kind of looks toward where the future may lead. So if you're interested in checking that out, definitely check out this webinar. You just need to click the link here, submit some contact information, then you get to view the webinar.

The next post is "Optimizing SQL: Simplifying Queries with Window Functions". This is from highgo.ca. Again, another post I like to read about is Window Functions. So this is a pretty basic post that talks about understanding window functions. So they set up an environment within employees that have departments. They insert a bunch of data into it and then they start querying it. The first way they said you can get an average salary is by someone using subqueries, and then they compared it to using a window function to do it. Then they go through and describe all the different window functions that are available in terms of row number, rank, dense rank, lag lead, first value, and last value, and then they show a query with a number of these different window functions in it. One to get the row number by department, one to get the rank by department, and one to get the dense rank by department. Then they start looking at Lag and Lee to give you more information about the salary amount and follow up with the running total. So if you're interested in learning more about window functions, definitely check out this blog post.

The next post is "Removing PostgreSQL Bottlenecks Caused by High Traffic". This is from percona.com and they're basically going over a list of things that you should check out and consider changing to get better performance out of your PostgreSQL installation. They compare each of the items that they mentioned from a complexity standpoint and from a potential impact standpoint. So the first thing they mentioned is tuning your performance parameters. So those are the PostgreSQL parameters listed. The primary ones are the effective_cache_size, shared_buffers, and the work_mem. They go over how you would potentially want to set those and have some estimates that are listed here. So definitely the first thing to check out and do the next, they talk about session connections and how to manage those.

Basically, how many connections you're having, and once you start getting a lot of traffic, you're probably going to want to use a connection pooler like PgBouncer. So they discuss that and the different pooling options that are available. Then they're talking about optimizing the autovacuum. Now, interesting, they say some of the parameters that you should be changing are the autovacuum_max_workers, maintenance_work_mem, autovacuum_freeze_max_age, and whereas you could do this, I generally prefer to make as a first step, make autovacuum more aggressive and most importantly, change the vacuum_cost_limit or the autovacuum_cost_imit so that more work gets done with the autovacuum. Because the problem with adjusting this is that all the workers use that same vacuum_cost_limit.

And if you just add more workers, less work gets done for each worker process. So the most important value to change in my opinion, is the cost_limit because that allows each worker to do more work in a given unit of time. And then if you feel you need some more workers, go ahead and increase that. But know that if you do that, each worker is going to actually do less work because the governor on that is the cost_limit. So in my opinion, the number one parameter to change is the cost limit. Then they talk about advanced autovacuum and it basically looks like delving into table-based parameters, which you can definitely do to have particular tables vacuum more often. Next they cover bloat and their solution for that is using the extension pg_repack, but you can also minimize bloat by autovacuuming or vacuuming more often to make sure that dead tuples are cleaned out frequently.

This post also discusses how to avoid data hotspots. Basically, optimize your indexes and make sure you don't have indexes you don't need so you get potentially more heap-only tuple updates. If you have very large tables with a lot of rows, consider table partitioning, optimizing for parallel querying, and potentially using denormalization techniques. Another area they mentioned is competing application processes. Basically, if you're running your application processes on the database, consider separating out those workloads so you have a dedicated database and a dedicated application server or servers.

They're talking about replication latency and here they're referring to synchronous replication as having a bottleneck of sorts because you have to have a COMMIT happen on two different servers potentially crossing a network domain. So running asynchronously gets you more performance. As I say, last but not least, they talk about the server environment. So basically things you can do on the hardware itself in terms of adjusting RAM CPU drive systems as well as potentially a disk partitioning. So if you want to learn more about their perspective on performance areas you can adjust to help manage high traffic, definitely check out this blog post.

The next post is "PostgreSQL Write-Ahead Logging (WAL) Trade-offs: Bounded vs. Archived vs. Replication Slots". This is from enterprisedb.com. So this is talking about when you're setting up replication how that WAL gets to the replica, essentially. So you can bound it by specifying a certain number of WAL keep segments on the primary so that the replica has them available to pull them over to keep replication going. The second method is archiving them so you archive them on the primary server and ship them somewhere so that the replica has access to them so they can be replayed. Now, that's not streaming, that's essentially log shipping. But it is a valid option to do. And then lastly are replication slots. That's where the primary and the replica are actually interacting with one another that are communicating.

The primary is aware that this replica exists and it's going to keep all of those WAL files until it knows the replica has retrieved them and then it knows it can delete them. Now they go over the different trade-offs. Basically, a bounded WAL where you're using walkeep_segments the primary can go on forever and it's not going to run out of disk space or shut down because it ran out of disk space but it's not aware of any of the replicas. Whereas if you're using replication slots, essentially it has to retain those WALs. So if there's some problem with the replica you could start running out of disk space on the primary because the replica is essentially not consuming them. So again, they go over some of the pros and cons of these.

But essentially the wal_keep_segments allows your primary to move forward without any risk. The disadvantage is that your standbys or your replicas can fall behind and you have to manage that. With the replication slots, you can be assured that everything is going to keep up to date, but you potentially may run out of disks on the primary or be able to mitigate that. Then the archived WALs are really for log shipping, not really streaming. So the author's perspective is to use replication slots and very closely manage your disk space monitor for it. Have alerts set up for it as well as archiving the wall. But again, it's up to you. They go over the different pros and cons in this article. So if you're interested, go ahead and check out this blog post.

The next post is "Locks in PostgreSQL: 3. Other locks". This is from habr.com, and I believe it's from something that may have been posted back in 2015 on postgrespro.ru So, a Russian site. So if you're interested in learning more about other locks, such as deadlocks, predicate locks, or advisory locks, definitely check out this post.

The next piece of content, the PostgreSQL person of the week is Markus Winand. So if you want to learn about Markus and his contributions to PostgreSQL, definitely check out this blog post.

The final posts are "Spatial Constraints with PostGIS and PostgreSQL - Part 2" and "Part 3". Both are from crunchydata.com. So they're talking about, of course, database constraints in regards to PostGIS with PostgreSQL. So if you're interested in learning more about different types of constraints for postgres data, definitely check out these blog posts.

Index Deduplication, Fast Hierarchy, More Monitoring, Cloud-Native | Scaling Postgres 117

Join Over 1,000 Engineers & Get New Episodes Weekly!

Content Discussed

YouTube Video

Podcast Audio

Transcript