Reclaiming Table Space with pg_repack

For tables that frequently undergo archive or purge operations, the table will keep growing in size unless we periodically reclaim the space.

However, PostgreSQL's built-in VACUUM FULL blocks both reads and writes while it reclaims space, so it cannot be run directly in production.

Therefore, the table-shrinking tools we commonly use in production are pg_squeeze and pg_repack.

Project page: https://github.com/reorg/pg_repack

How it works: create an identical shadow table, copy the original table's data into it, and finally rename it to replace the original table.
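
Conceptually, the process is roughly equivalent to the SQL below. This is only a sketch with a placeholder table name t; the real tool keeps the table online by capturing concurrent changes in a log table via triggers and applying them before the swap.

begin;
create table t_shadow (like t including all);  -- identical shadow table
insert into t_shadow select * from t;          -- copy the live rows
drop table t;
alter table t_shadow rename to t;              -- swap the shadow table in
commit;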

Note: the table to be processed must have a primary key.
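
To confirm that a table has a primary key before repacking it, you can query pg_constraint (again with the placeholder name t); one row back means a primary key exists:

select conname from pg_constraint
 where conrelid = 't'::regclass and contype = 'p';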

yum install centos-release-scl-rh
yum install llvm-toolset-7-clang

cd /home/postgres

tar xf pg_repack-ver_1.4.4.tar.gz 

export PATH=/usr/local/pgsql-11.5/bin:$PATH   # add pg's bin to PATH, otherwise the build may fail to find pg_config

cd pg_repack-ver_1.4.4

make && make install

In addition, the build produces an executable: /home/postgres/pg_repack-ver_1.4.4/bin/pg_repack
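
As a quick sanity check of the build, the binary can report its version:

/home/postgres/pg_repack-ver_1.4.4/bin/pg_repack --version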

Edit the configuration file:

vim /usr/local/pgsql-11.5/data/postgresql.conf

shared_preload_libraries = 'pg_repack'
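
If shared_preload_libraries already lists other libraries, append pg_repack with a comma instead of overwriting the setting; pg_stat_statements below is just an illustration:

shared_preload_libraries = 'pg_stat_statements,pg_repack'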

Then restart the PostgreSQL server.
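
A minimal restart sketch, assuming the data directory shown above:

pg_ctl restart -D /usr/local/pgsql-11.5/data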

Usage:

create database db1;

\c db1

create extension pg_repack;
 
create table testdata (id integer,course int,grade numeric(4,2),testtime date);
alter table testdata add primary key (id);

insert into testdata 
 select generate_series(1,10000000) as id,
 10 as course,
 10.11 as grade,
 '2017-07-06' as testtime;

We can then check that the physical size of the PG datadir has grown from 1.1GB to 1.6GB.
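
A simple way to check, assuming the same datadir as above:

du -sh /usr/local/pgsql-11.5/data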

Next, run delete from testdata where id between 5000000 and 10000000; to delete half of the data in testdata. You will see that the physical files do not shrink at all, because a plain DELETE only marks the tuples dead.
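
This can also be confirmed from inside the database; pg_relation_size reports the on-disk size of the table's main fork, which stays the same after the DELETE:

select pg_size_pretty(pg_relation_size('testdata'));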

Then, from outside the database, use pg_repack to reclaim space on the testdata table:

cd /home/postgres/pg_repack-ver_1.4.4/bin

./pg_repack -h 127.0.0.1  --port 5434 -Upostgres -d db1 -t testdata -j 2 -D -k
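
Here -j 2 uses two parallel jobs, -D (--no-kill-backend) avoids killing conflicting backends on lock timeout, and -k (--no-superuser-check) skips the client-side superuser check. Afterwards, re-running the size query from above should show the table at roughly half its previous on-disk size (exact numbers will vary):

select pg_size_pretty(pg_relation_size('testdata'));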

pg_repack options:

  -a, --all                 repack all databases
  -t, --table=TABLE         repack specific table only
  -I, --parent-table=TABLE  repack specific parent table and its inheritors
  -c, --schema=SCHEMA       repack tables in specific schema only
  -s, --tablespace=TBLSPC   move repacked tables to a new tablespace
  -S, --moveidx             move repacked indexes to TBLSPC too
  -o, --order-by=COLUMNS    order by columns instead of cluster keys
  -n, --no-order            do vacuum full instead of cluster
  -N, --dry-run             print what would have been repacked
  -j, --jobs=NUM            Use this many parallel jobs for each table
  -i, --index=INDEX         move only the specified index
  -x, --only-indexes        move only indexes of the specified table
  -T, --wait-timeout=SECS   timeout to cancel other backends on conflict
  -D, --no-kill-backend     don't kill other backends when timed out
  -Z, --no-analyze          don't analyze at end
  -k, --no-superuser-check  skip superuser checks in client
  -C, --exclude-extension   don't repack tables which belong to specific extension
Connection options:
  -d, --dbname=DBNAME       database to connect
  -h, --host=HOSTNAME       database server host or socket directory
  -p, --port=PORT           database server port
  -U, --username=USERNAME   user name to connect as
  -w, --no-password         never prompt for password
  -W, --password            force password prompt
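
For example, to preview what pg_repack would do without changing anything, the -N/--dry-run flag can be combined with the same connection options used earlier:

./pg_repack -h 127.0.0.1 --port 5434 -U postgres -d db1 -t testdata -N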