
Tablespaces, Streaming Replication, More Postgres 14, Security | Scaling Postgres 159


In this episode of Scaling Postgres, we discuss when to use tablespaces, setting up streaming replication, features coming in Postgres 14 and implementing security.


Transcript



[00:00:00] In this episode of Scaling Postgres, we talk about tablespaces, streaming replication, more Postgres 14 and security. I'm Creston Jamison, and this is Scaling Postgres episode 159.

[00:00:17] All right, I hope you, your friends, family and coworkers continue to do well. Our first piece of content is "When to use tablespaces in PostgreSQL". This is from cybertec-postgresql.com. They're talking about when you should use tablespaces, and they compare it to Oracle, where Oracle basically has its own file system it uses to store the data, whereas Postgres relies on whatever operating system it's using to store its data. And they have a terminology chart of Oracle versus Postgres, where they say an Oracle tablespace is similar to an operating system file system, an Oracle data file is equivalent to a logical or physical volume in a file system, a segment in Oracle is equivalent to the data files that are part of a table, and an extent in Oracle is equivalent to a segment or a data file in Postgres. Now, I don't necessarily 100% agree with this chart, because it doesn't list where the tablespace exists. A tablespace is a logical representation of where you're storing the data that is connected to a physical location. So, for example, you logically create a tablespace, but you specify a physical location for it, and when you create a new table in that tablespace, it is actually stored in that location. Postgres normally just uses a data directory and stores everything in there. You can direct, say, logs to a different location using symlinks or other methods like that, and you can determine to a certain extent where some temporary space goes, but mainly everything is in the data directory, and tablespaces allow you to store things outside of that data directory. And when you create objects, you need to specify the tablespace. Now, they do say that when you specify a table to go in a particular tablespace and then create indexes, you need to specify that tablespace as well; it doesn't automatically go with the table. But if you put a whole database on a particular tablespace, it will stay within that tablespace. Now, the blog talks a little bit about backups and how you need to take tablespaces into account, and then asks the question: when should you create a tablespace? They list some reasons here, but a lot of these reasons don't necessarily apply anymore. They applied more in a time when we were using a lot of physical drives and magnetic storage; with SSDs and virtual file systems in a cloud environment, a lot of these reasons don't really exist anymore. I've only used tablespaces when we were running out of space where the primary data directory is located and there wasn't an easy way to expand it. So what we typically did is brought up a new tablespace, created large indexes on that tablespace, and then deleted them on the primary tablespace, or the default data directory, to free up some space to buy us time to then address it. But it is a bit of a hassle doing backups when you have tablespaces, so I definitely wouldn't recommend it. And that's what this post advocates as well: you probably shouldn't be reaching to use tablespaces a lot. And they make a quote here: "Since almost everybody uses virtualization these days for their Postgres installations, tablespaces are becoming an increasingly irrelevant PostgreSQL feature."
So whereas they mentioned that in Oracle it's more of a core consideration, with tablespaces in Postgres it's really only certain use cases, which they iterate here. Mainly, the only way I've used them is when you're running out of space on the main data directory and cannot easily move or expand that data directory. But if you want to learn more, definitely check out this blog post.
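As a rough illustration of the kind of maneuver described above, here is a minimal SQL sketch; the mount point, table, and index names are hypothetical and not taken from the article.

    -- Create a tablespace backed by a separate volume (path is hypothetical).
    CREATE TABLESPACE extra_space LOCATION '/mnt/extra_volume/pgdata';

    -- New objects go to the default tablespace unless you say otherwise.
    CREATE TABLE measurements (id bigint, reading numeric) TABLESPACE extra_space;

    -- Indexes do not follow the table automatically; specify the tablespace again.
    CREATE INDEX measurements_id_idx ON measurements (id) TABLESPACE extra_space;

    -- Or move an existing large index off the default tablespace to free up space.
    ALTER INDEX some_big_index SET TABLESPACE extra_space;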

[00:04:05] The next piece of content is "Setting up streaming replication in PostgreSQL 13 and streaming replication internals". This is from migops.com, and they're talking about basically physical or streaming replication. They talk a little bit about the internals: essentially, the WAL files that are being generated are streamed to a standby server to create a replica of the primary database. Then they go into all the different commands that are used to get this set up and to get two systems up and running, replicating data from the primary to a replica. So if you're interested in that, you can check out this blog post.
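To make that a bit more concrete, here is a minimal sketch of the primary-side SQL involved in this kind of setup; the role name and password are placeholders rather than commands from the article, and the standby is typically seeded with pg_basebackup, which is a shell step rather than SQL.

    -- On the primary: create a role the standby can use for streaming replication.
    -- (pg_hba.conf also needs a "replication" entry allowing this role to connect.)
    CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'change_me';

    -- The standby is usually cloned with something along the lines of:
    --   pg_basebackup -h primary_host -U replicator -D /path/to/standby/data -R -P
    -- where -R writes standby.signal and the primary_conninfo setting for you.

    -- Back on the primary, confirm the standby is connected and streaming.
    SELECT client_addr, state, sent_lsn, replay_lsn FROM pg_stat_replication;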

[00:04:44] The next piece of content is "Waiting for PostgreSQL 14 – Add pg_database_owner default role". This is a new default role that's been added, and the intent is to grant it privileges in things like template1. template1 is the database that all new databases are created from, so it's a template for you to create new ones. In his example here, he created a security definer function to be able to query pg_stat_activity and then granted it to pg_database_owner in template1, so that when a new database is created with a particular owner, that owner can run the function and get the running queries against the database they've created, because they are the owner. So it's a way to give more permissions to users at a more granular level. If you're interested in learning more about this new feature, you can check it out at depesz.com. Now there are two other posts related to new PostgreSQL 14 features. The next one is "Add date_bin function". This is similar to the date_trunc function, where you can take a date and truncate it down to a particular resolution, for example microseconds, milliseconds, days, months, decades, et cetera. Well, now you can bin it to a particular time interval and then offset it, so it's not always truncated to a fixed boundary; it's binned at an offset time interval. So if you're interested in that, maybe you want to check out this blog post. And the third post is "Add unistr function". This allows you to decode a string with Unicode escape sequences. So if you're interested in that, you can check out this one.
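To make those features a little more concrete, here is a brief sketch: the unistr call mirrors the PostgreSQL 14 documentation example, the date_bin values are just illustrative, and the function being granted is a hypothetical security definer function like the one described above.

    -- date_bin: bin a timestamp into 15-minute buckets measured from a chosen origin.
    SELECT date_bin('15 minutes', TIMESTAMP '2021-03-18 10:38:43', TIMESTAMP '2021-03-01 00:02:30');
    -- returns 2021-03-18 10:32:30

    -- unistr: decode Unicode escape sequences embedded in a string.
    SELECT unistr('d\0061t\+000061');
    -- returns 'data'

    -- pg_database_owner: in template1, grant privileges that the owner of every
    -- newly created database will then hold (function name is hypothetical).
    GRANT EXECUTE ON FUNCTION get_running_queries() TO pg_database_owner;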

[00:06:26] The next piece of content is "Is Postgres secure?" This is from Crunchydata.com, and the answer to the question, of course, as they say here, quote, "the short answer is yes", but it's all based upon your implementation and how you set up Postgres. They have a number of resources linked here to give you guidance on how to set up Postgres in a secure fashion. The first thing they cover is the Common Criteria for Information Technology Security Evaluation standard, and that there is a protection profile for database management systems that you can apply to a deployment of Postgres. They also talk about work done with the Center for Internet Security and the United States Defense Information Systems Agency to develop more guidelines for deployment, configuration and administration, and they include those here, along with the Security Technical Implementation Guides. They also talk about Postgres SQL injection and encryption requirements. So if you're interested in making a more secure Postgres installation, definitely check out this blog post. The next piece of content is, again, more webinars that have been posted to the EDB YouTube channel. So if you're interested in video content, you can check that out here.
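As a small taste of the SQL-level hardening that guides like those tend to recommend, here is a hedged sketch; these are common recommendations rather than quotes from the linked documents, and the table in the prepared-statement example is hypothetical.

    -- Prefer SCRAM over MD5 for password hashing (PostgreSQL 10+).
    ALTER SYSTEM SET password_encryption = 'scram-sha-256';
    SELECT pg_reload_conf();

    -- Don't let every user create objects in the public schema.
    REVOKE CREATE ON SCHEMA public FROM PUBLIC;

    -- Use bind parameters instead of string concatenation to avoid SQL injection,
    -- e.g. via PREPARE/EXECUTE or your driver's parameterized queries.
    PREPARE find_user (text) AS SELECT * FROM app_users WHERE email = $1;
    EXECUTE find_user('alice@example.com');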

[00:07:36] The next piece of content is "Preventing FreeBSD to kill PostgreSQL", also known as out-of-memory killer prevention. FreeBSD is different from Linux; we have covered articles talking about the out-of-memory killer in Linux, and there's specific documentation for Postgres on how to handle that. But this covers FreeBSD and the different facilities it has to do the same thing, like the protect command, and how to configure them. So if you run Postgres on FreeBSD, maybe you want to check out this blog post.

[00:08:08] The next piece of content is "Logging of Pgpool-II on Kubernetes". So if you run Pgpool-II, want to do it on Kubernetes, and are interested in the logging, definitely check out this blog post from b-peng.blogspot.com.

[00:08:22] The next piece of content is "Dumping a bytea with psql". This is from cleverelephant.ca, and he's talking about using binary data in Postgres, specifically the bytea type for storing binary data. He was using that to do some debugging, creating a pipeline within Postgres to take raster images that are stored as binary in Postgres and examine them. So if you're interested in learning more about the pipeline he put together here, in terms of working with bytea data, definitely check out this blog post.
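As a hedged sketch of one way to pull a bytea value out through psql (the table, column, and paths here are hypothetical, and this is not necessarily the exact pipeline from the post):

    -- In psql: drop headers and alignment so only the raw value is written out.
    \t on
    \a
    \o /tmp/raster.hex
    SELECT encode(rast_data, 'hex') FROM rasters WHERE id = 42;
    \o
    -- The hex dump can then be converted back to binary outside psql,
    -- for example with: xxd -r -p /tmp/raster.hex > /tmp/raster.bin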

[00:08:56] And the last piece of content: the PostgreSQL Person of the Week is Jan Karremans. So if you're interested in learning more about Jan and his contributions to Postgres, definitely check out this blog post. That does it for this episode of Scaling Postgres. You can get links to all the content mentioned in the show notes. Be sure to head over to Scalingpostgres.com, where you can sign up to receive weekly notifications of each episode. Or you can subscribe via YouTube or iTunes. Thanks.
