在金融、能源、政务等关键信息基础设施领域,Oracle/MySQL 商业替换已成为政策强制要求。PG 凭借协议兼容 MySQL(pg_dump、FDW)、ACID 完整、插件生态丰富,成为信创替换首选。
核心选型维度对比:
| 维度 | Oracle | MySQL(社区版) | PostgreSQL 16 |
|---|---|---|---|
| ACID 支持 | 完整 | 仅 InnoDB | 完整(MVCC 全链路) |
| 复杂查询能力 | 最强 | 弱 | 强(窗口函数、CTE) |
| JSON 支持 | 需付费 | 基础 | 原生 JSONB(索引优化) |
| 高可用方案 | RAC(昂贵) | MGR/PXC | 流复制+Patroni(免费) |
| 国产适配 | 关闭授权 | 部分 | UOS/Kylin 旗舰适配 |
// 推荐架构(3节点)
// +------------------+ +------------------+ +------------------+
// | pg-node-1 | | pg-node-2 | | pg-node-3 |
// | (Primary) |<--->| (Replica) |<--->| (Replica) |
// | 192.168.1.10 | | 192.168.1.11 | | 192.168.1.12 |
// | :5432 | | :5432 | | :5432 |
// +--------+---------+ +------------------+ +------------------+
// |
// +--------+---------+
// | etcd-1 | // etcd 集群(3节点,保存 leader 状态)
// | 192.168.1.20 |
// +-------------------+
// 访问层:HAProxy --> 自动路由到 Primary :5432
// etcd 自动主从切换原理:
// 1. 节点挂掉 → etcd 失去心跳 10s
// 2. etcd 发起 Leader Election
// 3. 剩余节点数 >= 2 时可选出新 Primary
// 4. Patroni 执行 promote,新主从备升至 Active
// 5. HAProxy 自动更新后端(通过 API 或 Consul Template)
// 每台服务器执行(替换 hostname)
// pg-node-1: export NODE_NAME=pg1; PG_VERSION=16
// pg-node-2: export NODE_NAME=pg2
// pg-node-3: export NODE_NAME=pg3
// 1. 安装依赖(以 Ubuntu/Kylin 为例)
apt-get update && apt-get install -y \
postgresql-16 postgresql-contrib-16 postgresql-16-repack \
etcd patroni python3-yaml haproxy curl
// 2. 配置 etcd(/etc/etcd/etcd.conf)
ETCD_NAME=$NODE_NAME
ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.1.20:2379"
ETCD_INITIAL_CLUSTER="pg1=http://192.168.1.10:2380,pg2=http://192.168.1.11:2380,pg3=http://192.168.1.12:2380"
ETCD_INITIAL_CLUSTER_TOKEN="pg-ha-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
// 3. 启动 etcd
systemctl enable etcd && systemctl start etcd
etcdctl --endpoints=http://192.168.1.10:2379 member list
// 预期:3个成员状态 healthy
# /etc/patroni.yml (pg-node-1 示例)
scope: pg-ha-cluster
namespace: /service/
name: pg1
restapi:
listen: 192.168.1.10:8008
connect_address: 192.168.1.10:8008
etcd:
hosts: 192.168.1.10:2379,192.168.1.11:2379,192.168.1.12:2379
bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576 # 1MB,防止脑裂
postgresql:
use_pg_rewind: true
parameters:
max_connections: 200
shared_buffers: 4GB
wal_level: replica
hot_standby_feedback: on
max_wal_senders: 10
max_replication_slots: 10
postgresql:
listen: 0.0.0.0:5432
connect_address: 192.168.1.10:5432
data_dir: /var/lib/postgresql/16/main
authentication:
replication:
username: replicator
password: "replicator_pass" # 生产用 Vault 管理
superuser:
username: postgres
password: "your_secure_password"
rewind:
username: rewind_user
password: "rewind_pass"
# /etc/haproxy/haproxy.cfg
global
log /dev/log local0
maxconn 2000
defaults
log global
mode tcp
timeout connect 10s
timeout client 1m
timeout server 1m
frontend pg_front
bind *:5432
default_backend pg_back
backend pg_back
option httpchk GET /master
http-check expect status 200
server pg1 192.168.1.10:5432 check port 8008
server pg2 192.168.1.11:5432 check port 8008 backup
server pg3 192.168.1.12:5432 check port 8008 backup
以 统信 UOS 1060a 和 麒麟 V10 SP1 为测试基准,PG16 均适配良好。但仍需关注:
| 验证项 | Ubuntu 22.04 | UOS 1060a | 麒麟 V10 |
|---|---|---|---|
| apt 安装 PG16 | 原生支持 | 需添加 UOS 源 | 需添加 Kylin 源 |
| systemd 服务 | 可用 | 可用(rc.d) | 可用(rc.d) |
| glibc 兼容性 | 2.35 | 2.31(需验证) | 2.31(需验证) |
| openEuler 适配 | N/A | N/A | 原生支持 |
| 国密插件(sha2) | 需自编译 | 需自编译 | 需自编译 |
// UOS 1060a 添加 PostgreSQL 官方源
cat <<'EOF' | sudo tee /etc/apt/sources.list.d/pgdg.list
deb http://apt.postgresql.org/pub/repos/apt/ focal-pgdg main
EOF
wget -qO - https://www.postgresql.org/media/keys/ACCC4CF8.asc | \
sudo apt-key add -
sudo apt update
sudo apt install -y postgresql-16 postgresql-client-16
// 验证
pg_lsclusters
# 预期:Ver Cluster Port Status Owner Data directory
# 16 main 5432 online postgres /var/lib/postgresql/16/main
# postgresql.conf 核心调优(生产环境)
shared_buffers = 4GB # 约总内存的 25%(16G机器)
work_mem = 64MB # 复杂排序/哈希用
maintenance_work_mem = 512MB # VACUUM/CREATE INDEX
effective_cache_size = 12GB # 约总内存的 75%
random_page_cost = 1.1 # SSD 下调(默认4.0机械盘)
effective_io_concurrency = 200 # SSD
# 信创环境强制
ssl = on
ssl_cert_file = '/etc/ssl/certs/server.crt'
ssl_key_file = '/etc/ssl/private/server.key'
# 监控
shared_preload_libraries = 'pg_stat_statements,pg_stat_kcache'
-- 查找最耗时的 SQL(Top 10)
SELECT
substring(query, 1, 60) AS short_query,
round(total_exec_time::numeric, 2) AS total_time,
calls,
round(mean_exec_time::numeric, 2) AS mean_time,
round(max_exec_time::numeric, 2) AS max_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
-- 发现慢查询后,创建索引
CREATE INDEX CONCURRENTLY idx_orders_created_at
ON orders (created_at);
// 在 pg-node-1 (主) 上执行,模拟主节点宕机
patronictl pause pg-ha-cluster # 暂停自动选主(观察用)
// 在 pg-node-2 上:手动提升
patronictl switch pg-ha-cluster --master pg2 # 强制切换到 pg2
// 确认 pg2 状态
patronictl list pg-ha-cluster
# 预期:Member -- Leader -- Member state
# pg1 -- pg2 -- Stop leader's gracefull switchover
# pg2 -- pg2 -- Master -- running
# pg3 -- pg2 -- Secondary -- running
// 恢复 pg1:
patronictl resume pg-ha-cluster
# pg1 会自动跟随 pg2 变为 Replica,无需手动干预
synchronous_commit = on + synchronous_standby_names = 'ANY 2 (pg2, pg3)'PG HA 集群的监控需要覆盖两层:PG 实例层和 Patroni 协调层:
# pg_exporter 采集的配置(prometheus.yml)
scrape_configs:
- job_name: 'postgresql'
static_configs:
- targets: ['pg-node-1:9187', 'pg-node-2:9187', 'pg-node-3:9187']
- job_name: 'patroni'
static_configs:
- targets: ['pg-node-1:8008', 'pg-node-2:8008', 'pg-node-3:8008']
金融和政务场景通常要求数据库层网络隔离:PG 集群仅允许内网访问,外部通过安全网关路由:
# pg_hba.conf 限制:仅允许 Patroni 节点和 HAProxy 访问
# TYPE DATABASE USER ADDRESS METHOD
local all all peer
host all replicator 192.168.1.11/32 md5
host all replicator 192.168.1.12/32 md5
host all proxy_user 192.168.1.100/32 md5 # HAProxy only
host all all 0.0.0.0/0 reject # 拒绝所有其他连接
# 同时配置主机防火墙
iptables -A INPUT -p tcp --dport 5432 -s 192.168.1.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 5432 -j DROP
信创环境下的 PG HA 方案已经非常成熟:Patroni 提供自动选主与故障转移,etcd 提供集群状态共识,HAProxy 提供透明路由。关键指标:RTO(恢复时间目标)< 30秒(自动切换,依赖 etcd 心跳)、RPO(恢复点目标)= 0(全同步复制模式)。经过充分故障演练后,该架构可支撑信创系统生产级 SLA。