background

Multiranges, Missing Metrics, Newbie PostGIS, Conference Videos | Scaling Postgres 211

Join Over 1,000 Engineers & Get New Episodes Weekly!

In this episode of Scaling Postgres, we discuss working with multiranges, missing Postgres metrics, PostGIS for newbies and videos from CitusCon.

Content Discussed

YouTube Video

Podcast Audio

Transcript

All right. I hope you, your friends, family. and coworkers continue to do well. Our first piece of content is "MULTIRANGES IN POSTGRESQL 14". This is from cybertec-postgresql.com. So you can create a range by just using the function int4range and give it two integers and then it will output it here. If you notice one side is a bracket, and one side is a parenthesis. The bracket means it's inclusive, the parenthesis means it's exclusive. So essentially there are values 10 to 19 within this range, so it does not include the 20. It does the things to check that things are in the range too where they show here where they try to show a high to a low range, it gives you an error that's an invalid range. Now once you have a range, you can then use this operator that is kind of like a contains operator to tell you if a value is within that given range. And you could see when you do a check for 17, it gives you a true when it's in the range of 10 to 20.

And of course, you can also use date times. That's probably the more frequent use case for using ranges and then they get into showing you the multirange capabilities. So you can use that same int4range and combine two of them together to generate an int4multirange. So you can see the results here which includes the two ranges that were established here. And you can add as many ranges as you want to this. I'm not quite sure what the limit is, but what is interesting about that is that if you include overlapping ranges, it will actually resolve the overlaps to just include what the ultimate set of ranges are. So here you're concatenating three ranges together but it only ends up storing in the final data type as it were, two ranges, because some of them were overlapping.

Then they mentioned you can use these range types within a table and they give that example right here. They also mention how it can handle infinity from either the starting or the ending range just by including NULL in the range. And you can also add those ranges together in a multirange to get basically an infinity at the low end and the high end as well. You can even do operations on ranges that you calculate the difference between two ranges. So for example, in the range of 1 to 20, if you subtract the range of 4 to 6, it's going to be missing those in the multirange that gets returned. Similarly, you can do an intersection of two ranges to get what the results are.

And here you're doing an intersection of those same set of ranges to give you the intersection of them. You can even do things such as aggregating ranges together on the fly. So in this example, it's reading the from and before in separate columns, but it aggregates them into a common range using a rang_ agg function. So it operates across multiple rows. Similarly, you can also UNNEST a multi-range into multiple rows. So this is a great post that shows you all the different manipulations you can do with ranges and multiranges, and I definitely encourage you to check it out.

The next piece of content- "Missing metrics required to gain visibility into Postgres performance". This is from coroot.com, and what they're exploring is that when they're doing performance monitoring, they like to be able to show the success rate of operations as well as the latency of operations. In terms of Postgres, that's usually talking about queries. But it's kind of hard to get both of these metrics with the tools that Postgres has by default. They're showing some examples of this. Here how you have to look into multiple places to kind of get the answer you're looking for. Now the three different places you have to look are pg_stat_statements, pg_stat_activity, and what is being logged in the database.

So when you look at pg_stat_statements, it gives you a historical record of the queries that were run; both long and short queries. But it doesn't record any failed queries and it doesn't show you anything that's running currently in the system. In contrast, pg_stat_activity only shows you those queries that are running right now. You don't know if they're going to ultimately be successful or fail, but they show you all the currently running queries. And then lastly, what usually gets logged is queries that are longer than a particular time frame based upon the configuration log_min_duration_statement. So these are the long queries that are getting logged.

Short queries don't get logged, but you also get failed queries that get logged. So to get the universe of all the queries running, failed queries, short and long, historical and current queries, you basically have to look at these two system views, pg_stat_statements, and pg_stat_activity, as well as what's being logged. So they have developed an open-source Prometheus agent to get this information together that gives you a record of success rates and latencies. They show a little bit how they're able to do that here. And using their tool, they ultimately show you how you can track the errors and query execution times. But the interesting thing is that these are open-source agents you can use in the Prometheus agent. So presumably, you could use these in your own environment to track these sorts of metrics. So if you're interested in doing that, you can definitely check out this blog post.

The next piece of content- "PostGIS for Newbies". This is from blog.crunchydata.com and they're talking about getting started with PostGIS. Basically, it is for geographical information systems and they basically read spatial data primarily. Usually, to get started, you're going to have the extension PostGIS installed on a Postgres database. You might interact with it with a desktop GIS program. Maybe you're pulling the data to perform some data science functions and you can also display things out on the web to an end user. So for this example, post, they're talking about using Postgres and you can use any particular version. They're using their hosted Postgres version.

You need the PostGIS extension that you need to install in the database. They're using pgAdmin to be able to get a GUI interface because it actually has a Geometry Viewer feature to show you some maps in the views. They're looking at QGIS for the desktop app for interacting with the GIS data. They're also talking about the open-source tool they developed called pg_featureserv to actually display Postgres data to the web through an API. They go through the process of loading data, showing you how it looks in pgAdmin with the Geometry Viewer. Then they talk about different ways you can calculate areas or distance with the data that they loaded, as well as get started with some QGIS, as well as publishing a view to the web. So if you want to get started very simply with PostGIS, definitely check out this blog post.

The next piece of content is actually a YouTube playlist, and it's a YouTube playlist of Citus Con that recently happened a number of days ago. So this contains all the different videos that happened at Citus Con. Now a lot of these videos are about Citus, which is an extension for Postgres, implementing scale-up capabilities, but a lot of it covers general Postgres as well.

The next piece of content- "5mins of Postgres E13: New SQL/JSON and JSON_TABLE features in Postgres 15". This is from pganalyze.com. We covered a lot of these posts in last week's episode of Scaling Postgres, talking about the new JSON features coming to Postgres 15. He goes into more depth about these blog posts and explains the different features coming. So if you want a different perspective on that, you can definitely check out his episode here.

Next piece of content-"pg_ivm: a PostgreSQL extension providing Incremental View Maintenance feature". This is from yugonagata-pgsql.blogspot.com, and over the past year he's been talking about a feature that they've been wanting to add to Postgres called Incremental View Maintenance. So this is a materialized view that updates itself. Normally with a materialized view, it's basically a view that has actually been made into a table. You create a materialized view to create it, so the table actually exists. But that data of course comes from base tables. And as that data changes, you need to refresh the materialized view. Well, Incremental View Maintenance is a way of using triggers to keep the view up to date from the base tables.

So basically, it keeps the materialized view up to date. Now, there's an overhead with doing that, but there may be use cases that could be beneficial. So this post is actually talking about an extension, pg_ivm, they created. So their intent was to make it a feature, but because people were interested in it, they actually went ahead and released it as an alpha or a beta extension. So you could actually start using it in current versions of Postgres today. Basically, you create an Incremental View Maintenance using these specific functions here once you've installed the extension. So if you're looking for a way to keep your materialized view up to date incrementally, maybe you'd want to check out this new extension.

The next piece of content- "Update on the Trademark Actions Against the PostgreSQL Community". This is from postgresql.org, and this is an update on the trademark issues that have been discussed in the Postgres community between two different Postgres organizations. And if you want to learn more about what's going on with this, I definitely encourage you to check out this recent post.

The next piece of content- "pgBackRest multi-repositories Tips and Tricks". This is from pgstef.github.io, and they're talking about the capabilities of pgBackRest to interact with multiple repositories. This is like interacting with Amazon S3 or Google's Cloud Storage. Well, this particular post talks a lot about the impact of Postgres archiving of the WAL and how that's handled when working with multiple repositories. So if you're interested in that, you can check out this blog post.

The next piece of content. The PostgreSQLperson of the week is Jehan-Guillaume De Rorthais. My apologies for that pronunciation, but if you'd like to learn more about his contributions to Postgres, definitely check out this blog post.

The last piece of content, we had another episode of The Rubber Duck Dev Show this past Wednesday evening. This one was on the various small projects we're working on. So if you're interested in that type of long-form, developer-based content, definitely welcome you to check out our show.

episode_image