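Kubernetes Production Configuration: A Complete Guide from Getting Started to Production-Grade Deployment

As part of the operations tutorial series on 冉冉博客, this post walks through a complete configuration workflow for a production Kubernetes cluster. The notes come from running real clusters, so they are worth bookmarking.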

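1. Cluster Planning and Architecture

Key points when designing a production Kubernetes architecture:

Node Planning

  • Control-plane (master) nodes: at least 3, each with 4 CPU cores and 8 GB of RAM
  • Worker nodes: sized to the workload, at least 3, each with 8 CPU cores and 16 GB of RAM
  • etcd: deployed separately from the control plane in production, with at least 3 members (a 3-member cluster tolerates the loss of 1 member)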
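
If the cluster is bootstrapped with kubeadm (an assumption; the guide does not name an installer), the layout above could be expressed roughly as follows. This is only a sketch: the load balancer hostname, the etcd IP addresses, and the Kubernetes version are placeholders, and the certificate paths are the ones kubeadm conventionally uses for an external etcd cluster.

```yaml
# Sketch only: kubeadm ClusterConfiguration pointing at an external 3-member etcd cluster.
# Hostname, IPs and version below are placeholders, not values from this guide.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.29.0
# Stable address (load balancer or VIP) in front of the 3 API servers
controlPlaneEndpoint: "k8s-api.example.internal:6443"
networking:
  podSubnet: 10.244.0.0/16          # same range as the Calico IPPool in this guide
etcd:
  external:
    endpoints:
      - https://10.0.0.11:2379
      - https://10.0.0.12:2379
      - https://10.0.0.13:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
```

The same file would be passed to `kubeadm init --config` on the first control-plane node; the other two control-plane nodes join against the `controlPlaneEndpoint` address.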

Network Plan

# Calico network configuration
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4
spec:
  cidr: 10.244.0.0/16
  natOutgoing: true
  nodeSelector: all()

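2. Core Component Configuration

API Server Security Hardening

# kube-apiserver.yaml
--authorization-mode=Node,RBAC
# PodSecurityPolicy was removed in Kubernetes v1.25, and listing it there stops the API server from starting.
# On v1.25+ keep only NodeRestriction (Pod Security Admission is enabled by default);
# on v1.24 or older, PodSecurityPolicy can still be appended to this list.
--enable-admission-plugins=NodeRestriction
--tls-min-version=VersionTLS12
# Audit logs are only written when --audit-policy-file and --audit-log-path are also set.
--audit-log-maxsize=100
--audit-log-maxbackup=5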
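
Because PodSecurityPolicy no longer exists on Kubernetes v1.25 and later, pod-level restrictions on those versions are usually applied through Pod Security Admission, which is driven by namespace labels. A minimal sketch, assuming a namespace called `prod` (a placeholder name):

```yaml
# Sketch: enforce the built-in "restricted" Pod Security Standard on one namespace.
# The namespace name "prod" is a placeholder.
apiVersion: v1
kind: Namespace
metadata:
  name: prod
  labels:
    pod-security.kubernetes.io/enforce: restricted        # reject pods that violate the profile
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted           # also return warnings to kubectl users
    pod-security.kubernetes.io/audit: restricted          # and annotate audit log entries
```

Starting with `warn` and `audit` only, and adding `enforce` once workloads pass, is a gentler way to roll this out on an existing namespace.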

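Controller Manager High Availability

# kube-controller-manager.yaml
# Run one instance per control-plane node; leader election keeps exactly one of them active.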
--leader-elect=true
--leader-elect-lease-duration=15s
--leader-elect-renew-deadline=10s
--leader-elect-retry-period=5s

3. Storage Configuration

PersistentVolumeClaim Configuration

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  storageClassName: nfs-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi

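StatefulSet Deployment

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql        # must match a headless Service named "mysql"
  replicas: 3               # three independent instances; MySQL replication needs extra setup
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql          # required: must match spec.selector
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret   # assumed Secret, not defined in this guide
              key: root-password
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:     # one PVC per replica: data-mysql-0, data-mysql-1, ...
  - metadata:
      name: data
    spec:
      storageClassName: nfs-storage
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 50Gi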
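
The `serviceName: mysql` field in the StatefulSet refers to a headless Service that the guide does not show. A minimal sketch of what that Service could look like, assuming MySQL listens on its default port 3306:

```yaml
# Sketch: headless Service that gives each StatefulSet pod a stable DNS name.
apiVersion: v1
kind: Service
metadata:
  name: mysql            # must equal spec.serviceName of the StatefulSet
spec:
  clusterIP: None        # headless: DNS resolves to the individual pod IPs
  selector:
    app: mysql
  ports:
    - name: mysql
      port: 3306
```

With this in place, each pod is reachable under a name of the form `mysql-0.mysql.<namespace>.svc.cluster.local` (assuming the default cluster domain).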

4. Network Policies

Production environments should always define network policies:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: app-policy
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
    - Ingress
    - Egress    # no egress rules are defined, so all outbound traffic (including DNS) is denied
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api
      ports:
        - protocol: TCP
          port: 8080
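
Since the policy above lists `Egress` without any egress rules, the `app: web` pods cannot even resolve DNS names. Network policies are additive, so a second policy can open just that path. The sketch below assumes the cluster DNS runs in `kube-system` with the label `k8s-app: kube-dns` (the kubeadm default); both should be checked against the actual cluster.

```yaml
# Sketch: allow DNS lookups from the app=web pods to the cluster DNS pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-allow-dns
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
    - Egress
  egress:
    - to:
        # namespaceSelector and podSelector in the same list item are ANDed together
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Any other outbound dependency of the web pods (a database, an external API) would need its own egress rule in the same style.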

5. Monitoring and Alerting

Prometheus Configuration

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  replicas: 2
  retention: 15d
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: ssd
        resources:
          requests:
            storage: 100Gi

Alert Rules

groups:
- name: k8s-alerts
  rules:
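  - alert: HighCPUUsage
    # Per-container CPU usage in cores (not node utilization); fires above 0.8 cores.
    expr: rate(container_cpu_usage_seconds_total[5m]) > 0.8
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Container CPU usage above 0.8 cores"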
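
The expression above measures per-container CPU time in cores. For an alert on node-level CPU utilization, and for loading rules through the Prometheus Operator used in the previous snippet, a `PrometheusRule` object is one option. This is a sketch that assumes node_exporter is being scraped; the object name and the `role: alert-rules` label are hypothetical and would have to match the `ruleSelector` of the Prometheus resource.

```yaml
# Sketch: node-level CPU alert packaged as a PrometheusRule (Prometheus Operator CRD).
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-cpu-alerts
  labels:
    role: alert-rules        # hypothetical label; must match spec.ruleSelector of the Prometheus object
spec:
  groups:
    - name: node-alerts
      rules:
        - alert: NodeHighCPUUsage
          # Fraction of non-idle CPU time per node, averaged over all cores
          expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.8
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Node {{ $labels.instance }} CPU usage above 80%"
```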

6. Rolling Update Strategy

apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  minReadySeconds: 10
  progressDeadlineSeconds: 600
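
The fragment above omits the selector and pod template that a Deployment requires. A fuller sketch is shown below to make the percentages concrete; the replica count, image, port, and probe path are placeholders rather than values from this guide.

```yaml
# Sketch: a complete Deployment using the rolling update settings above.
# Image, port and probe path are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 4
  selector:
    matchLabels:
      app: webapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%         # rounds up:   ceil(4 * 0.25) = 1 extra pod during the rollout
      maxUnavailable: 25%   # rounds down: floor(4 * 0.25) = 1 pod may be unavailable
  minReadySeconds: 10       # a new pod must stay Ready for 10s before it counts as available
  progressDeadlineSeconds: 600
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
        - name: webapp
          image: registry.example.com/webapp:1.0.0
          ports:
            - containerPort: 8080
          readinessProbe:   # without a probe, "Ready" only means the container has started
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 5
```

The rounding matters at other sizes too: with 10 replicas, 25% gives a surge of 3 pods (rounded up) but only 2 unavailable pods (rounded down).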

7. Resource Quotas and Limits

apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi

---
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - max:
      cpu: "8"
      memory: 16Gi
    min:
      cpu: 100m
      memory: 128Mi
    default:
      cpu: 500m
      memory: 1Gi
    defaultRequest:
      cpu: 200m
      memory: 512Mi
    type: Container
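
To illustrate how the two objects interact, assume both are created in the same namespace. Once a quota covers CPU and memory requests and limits, a pod that sets none of them would be rejected, unless a LimitRange fills in defaults. The pod below is a hypothetical example (name and image are placeholders):

```yaml
# Sketch: a pod that declares no resources and relies on the LimitRange defaults.
apiVersion: v1
kind: Pod
metadata:
  name: quota-demo
spec:
  containers:
    - name: app
      image: nginx:1.25
      # No resources block here. On admission the LimitRange injects:
      #   requests: cpu 200m, memory 512Mi   (from defaultRequest)
      #   limits:   cpu 500m, memory 1Gi     (from default)
      # These values are then counted against the ResourceQuota.
```

With these defaults the quota admits at most 80 such containers: 40Gi / 512Mi = 80 on memory requests, 40 / 0.5 = 80 on CPU limits, and 80Gi / 1Gi = 80 on memory limits, while CPU requests alone would allow 100.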

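8. Backup and Disaster Recovery

Production environments need reliable backups:

  • Regular etcd snapshot backups
  • Snapshots of PV data
  • Configuration files managed through GitOps
  • Regular restore drills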
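
One way to schedule the etcd snapshots from inside the cluster is a CronJob that runs `etcdctl snapshot save`. The sketch below makes several assumptions that are not from this guide: the image name is a placeholder for any image that ships `etcdctl` and a shell, the endpoint IP is an example, and the `etcd-client-certs` Secret and `etcd-backup` PVC are hypothetical objects that would have to exist beforehand.

```yaml
# Sketch: take an etcd snapshot every 6 hours.
# Image, endpoint, Secret and PVC names are placeholders.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-snapshot
  namespace: kube-system
spec:
  schedule: "0 */6 * * *"
  concurrencyPolicy: Forbid          # never run two snapshots at once
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: snapshot
              image: registry.example.com/etcdctl:3.5
              command:
                - /bin/sh
                - -c
                # snapshot save talks to a single endpoint, so only one member is listed
                - >
                  etcdctl --endpoints=https://10.0.0.11:2379
                  --cacert=/certs/ca.crt --cert=/certs/client.crt --key=/certs/client.key
                  snapshot save /backup/etcd-$(date +%Y%m%d-%H%M).db
              volumeMounts:
                - name: certs
                  mountPath: /certs
                  readOnly: true
                - name: backup
                  mountPath: /backup
          volumes:
            - name: certs
              secret:
                secretName: etcd-client-certs
            - name: backup
              persistentVolumeClaim:
                claimName: etcd-backup
```

A snapshot is only useful if it can be restored, so the restore drill in the list above should include restoring one of these files into a scratch etcd instance.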

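Configuring Kubernetes for production is a process of continuous tuning. With these core points in hand, you can build a stable, reliable cluster. For more operations content, keep following 冉冉博客.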
