
Emergency Handling When an ES Cluster Becomes Unwritable After Disk Usage Exceeds the Watermark

  • Check disk space usage
  • Clean up old index data

When the application wrote data to the ES cluster, it received the following error:

ElasticsearchStatusException[Elasticsearch exception [type=cluster_block_exception, reason=index [esx_busin_sdx_index_test] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];]]

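Because disk usage exceeded the flood-stage watermark, Elasticsearch placed the index under a read-only-allow-delete block: the index can still be read and deleted, but writes are rejected.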
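The thresholds behind this block can be inspected through the cluster settings API. The following is a minimal sketch, assuming the cluster is reachable at localhost:9200 as in the commands in this article. `ES_URL`, `ES_AUTH`, and the function name `show_watermarks` are placeholders introduced here for illustration. Unless overridden, the low, high, and flood-stage watermarks default to 85%, 90%, and 95%.

```shell
# Assumed placeholders: ES_URL (cluster address) and ES_AUTH (user:password).
ES_URL="${ES_URL:-localhost:9200}"

# Print every disk watermark setting, including defaults that were never overridden.
# include_defaults exposes built-in values; flat_settings puts each key on one line.
show_watermarks() {
    curl -s -u "$ES_AUTH" \
        "$ES_URL/_cluster/settings?include_defaults=true&flat_settings=true&pretty" \
        | grep watermark
}
```

Calling `show_watermarks` should list keys such as `cluster.routing.allocation.disk.watermark.flood_stage` together with their current values.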

Check disk space usage

Check the cluster health and disk usage:

[root@eshost ~]# curl -u esadmin:adminPass123 -X GET "localhost:9200/_cluster/health?pretty"
{
  "cluster_name" : "escluster",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 47,
  "number_of_data_nodes" : 41,
  "active_primary_shards" : 11721,
  "active_shards" : 23296,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 95,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 99.59386088666581
}
[root@eshost ~]# curl -u esadmin:adminPass123 -X GET "localhost:9200/_cluster/stats?pretty" | grep disk
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 11544  100 11544    0     0  38217      0 --:--:-- --:--:-- --:--:-- 38225
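The `grep disk` command above shows only curl's progress meter, which is written to stderr when `-s` is not given; no line of the stats output appears to have matched. A more direct per-node view is available from the `_cat/allocation` API. The sketch below assumes the same host as above; `ES_URL`, `ES_AUTH`, and `node_disk_usage` are names introduced here for illustration.

```shell
# Assumed placeholders: ES_URL (cluster address) and ES_AUTH (user:password).
ES_URL="${ES_URL:-localhost:9200}"

# List data nodes sorted by disk usage, fullest first.
# -s silences curl's progress meter so only the table is printed.
node_disk_usage() {
    curl -s -u "$ES_AUTH" \
        "$ES_URL/_cat/allocation?v&h=node,shards,disk.percent,disk.used,disk.avail,disk.total&s=disk.percent:desc"
}
```

Running `node_disk_usage | head` should show the nodes closest to the watermark at the top.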

Check disk usage on each node:

[root@eshost ~]# curl -u esadmin:adminPass123 -X GET "localhost:9200/_nodes/stats/fs?pretty"
[root@eshost ~]# ansible -i /home/xuser/hosts/hosts_es es_uat -m shell -a "df -Th | grep es"
22.23.55.85 | CHANGED | rc=0 >>
Filesystem                     Type      Size  Used Avail Use% Mounted on
/dev/mapper/esdata0-lv_esdata0 xfs       6.0T  5.9T  110G  99% /esdata0
/dev/mapper/esdata1-lv_esdata1 xfs       6.0T  5.9T  115G  99% /esdata1
22.23.55.25 | CHANGED | rc=0 >>
Filesystem                        Type      Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup1-lv_es_node1 xfs       6.0T  5.9T  117G  99% /es_node1
/dev/mapper/VolGroup3-lv_es_node3 xfs       6.0T  5.8T  235G  97% /es_node3
/dev/mapper/VolGroup2-lv_es_node2 xfs       6.0T  5.8T  232G  97% /es_node2
...

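Sure enough, the disks are nearly full: the data volumes are 97% to 99% used. If the local disks cannot be expanded, the alternative is to delete old data that is no longer needed.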
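With dozens of nodes, scanning the `df` output by eye is error-prone. As a small illustration, the filter below reads `df -Th`-style lines and prints only the mount points at or above a chosen usage percentage. The function name `full_mounts` and the default threshold of 95 are assumptions made for this sketch; it expects the seven-column layout shown above.

```shell
# Read `df -Th`-style lines on stdin and print mount points whose Use% is
# at or above the threshold given as $1 (default 95).
# Assumed columns: Filesystem Type Size Used Avail Use% Mounted-on
full_mounts() {
    threshold="${1:-95}"
    awk -v t="$threshold" '
        $6 ~ /^[0-9]+%$/ {              # skips the header and ansible status lines
            pct = $6
            sub(/%/, "", pct)
            if (pct + 0 >= t + 0) print $7, $6
        }'
}
```

For example, `df -Th | full_mounts 95` on a data node, or the ansible output above piped into `full_mounts 98`.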

Clean up old index data

List the indices sorted by size in descending order to see how much disk space each one uses:

[root@eshost ~]# curl -u esadmin:adminPass123 -s "localhost:9200/_cat/indices?v&s=store.size:desc" | head -n 10
health status index                                           uuid            pri rep docs.count docs.deleted store.size pri.store.size
green  open   esx_infra_security_msbgfilelog_2024-10        IXEblLAlRlmxxxxxx  10   1 3526854010            0        3tb          1.5tb
green  open   esx_infra_security_msbgfilelog_2025-01        jczfgnyaRAWxxxxxx  10   1 3557913663            0      2.9tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2025-03        WgobLQ_dSxyxxxxxx  10   1 3544630550            0      2.9tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2024-12        mZFA8PSMQE6xxxxxx  10   1 3496172725            0      2.9tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2025-05        RT40X4fJQRSxxxxxx  10   1 3438523516            0      2.8tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2025-04        5O023SGLTSGxxxxxx  10   1 3385972439            0      2.8tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2024-09        _mXc1NOjRryxxxxxx  10   1 3326446791            0      2.8tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2024-11        yZv3UrPFTryxxxxxx  10   1 3279153784            0      2.8tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2025-02        HEeQwDMuTbCxxxxxx  10   1 3204650774            0      2.6tb          1.3tb

Where:

  • store.size: total size of the index (primary shards + replicas).
  • pri.store.size: size of the primary shards only.
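To judge which index family is worth cleaning up, it can help to total `store.size` per family rather than per monthly index. The sketch below is one way to do that, assuming index names end in `_YYYY-MM` as in the listing above; the function name `size_by_family` is introduced here for illustration. It expects two columns on stdin: the index name and its size in bytes.

```shell
# Read "index bytes" pairs on stdin, strip a trailing _YYYY-MM suffix,
# and print the total size per index family in GiB, largest first.
size_by_family() {
    awk '{
        name = $1
        sub(/_[0-9][0-9][0-9][0-9]-[0-9][0-9]$/, "", name)
        sum[name] += $2
    }
    END {
        for (n in sum) printf "%s %.1f\n", n, sum[n] / 1073741824
    }' | sort -k2,2nr
}
```

The input can come from the cat API with byte units, for example `curl -s -u esadmin:adminPass123 "localhost:9200/_cat/indices?h=index,store.size&bytes=b" | size_by_family`.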

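Because the esx_infra_security_msbgfilelog_ indices are created monthly, with the month as the name suffix, the old data from 10 months ago and earlier (here, the indices for 10, 11, and 12 months ago) can be deleted as follows: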

echo $(date --date='12 months ago' +%Y-%m)
echo $(date --date='11 months ago' +%Y-%m)
echo $(date --date='10 months ago' +%Y-%m)
curl -u esadmin:adminPass123 -X DELETE "localhost:9200/esx_infra_security_msbgfilelog_$(date --date='12 months ago' +%Y-%m)"
curl -u esadmin:adminPass123 -X DELETE "localhost:9200/esx_infra_security_msbgfilelog_$(date --date='11 months ago' +%Y-%m)"
curl -u esadmin:adminPass123 -X DELETE "localhost:9200/esx_infra_security_msbgfilelog_$(date --date='10 months ago' +%Y-%m)"
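Deleting an index cannot be undone, so it is worth printing the exact index names before running any DELETE. Relative month arithmetic in GNU `date` can also land in an unexpected month when run on the 29th to 31st, because the day is normalized after the month is adjusted. The sketch below sidesteps that by computing the month with integer arithmetic and only printing the commands. The function names `months_before` and `print_delete_commands`, and the `$ES_AUTH` placeholder in the printed commands, are assumptions made for this example.

```shell
# Print the YYYY-MM that lies $2 months before the YYYY-MM given as $1,
# using integer arithmetic only (no dependence on the current day of month).
months_before() {
    y=${1%-*}
    m=${1#*-}
    m=${m#0}                       # "08" -> "8", so it is not parsed as octal
    total=$(( $y * 12 + $m - 1 - $2 ))
    printf '%04d-%02d\n' $(( total / 12 )) $(( total % 12 + 1 ))
}

# Dry run: print (do not execute) one DELETE command per month.
# $1 = index prefix, $2 = base month (YYYY-MM), $3..$4 = range of months back.
print_delete_commands() {
    prefix=$1
    base=$2
    n=$3
    while [ "$n" -le "$4" ]; do
        printf '%s\n' "curl -u \"\$ES_AUTH\" -X DELETE \"localhost:9200/${prefix}$(months_before "$base" "$n")\""
        n=$(( n + 1 ))
    done
}
```

For example, `print_delete_commands esx_infra_security_msbgfilelog_ "$(date +%Y-%m)" 10 12` prints three commands that can be reviewed and then run by hand.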

Verify that the deletion succeeded:

[root@eshost ~]# curl -u esadmin:adminPass123 -s "localhost:9200/_cat/indices?v&s=store.size:desc" | head -n 10
health status index                                         uuid            pri rep docs.count docs.deleted store.size pri.store.size
green  open   esx_infra_security_msbgfilelog_2025-01      jczfgnyaRAWZxxxxx  10   1 3557913663            0      2.9tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2025-03      WgobLQ_dSxypxxxxx  10   1 3544630550            0      2.9tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2024-12      mZFA8PSMQE6Ixxxxx  10   1 3496172725            0      2.9tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2025-05      RT40X4fJQRSYxxxxx  10   1 3438601248            0      2.8tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2025-04      5O023SGLTSG-xxxxx  10   1 3385972439            0      2.8tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2024-11      yZv3UrPFTry2xxxxx  10   1 3279153784            0      2.8tb          1.4tb
green  open   esx_infra_security_msbgfilelog_2025-02      HEeQwDMuTbChxxxxx  10   1 3204650774            0      2.6tb          1.3tb
green  open   esx_infra_server_linux_2025-05              HWfsGcr0TMyexxxxx  10   1 3124772748            0        2tb            1tb
green  open   esx_infra_server_linux_2024-12              jo2U-VyYRjusxxxxx  10   1 2985056711            0      1.3tb        680.3gb
[root@eshost ~]# curl -u esadmin:adminPass123 -X GET "localhost:9200/_cluster/health?pretty" | grep status
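Freeing space does not necessarily make the indices writable again immediately. According to the Elasticsearch documentation, from version 7.4 onward the read-only-allow-delete block is released automatically once disk usage falls below the high watermark, while older versions require removing it by hand; this is version-dependent and worth verifying against the cluster in use. The sketch below shows the manual reset through the index settings API, assuming the same host as above. `ES_URL`, `ES_AUTH`, and `clear_readonly_block` are names introduced here for illustration.

```shell
# Assumed placeholders: ES_URL (cluster address) and ES_AUTH (user:password).
ES_URL="${ES_URL:-localhost:9200}"

# Remove the read-only-allow-delete block from the indices matching $1.
# Setting the value to null resets the setting to its default (no block).
clear_readonly_block() {
    curl -s -u "$ES_AUTH" -X PUT "$ES_URL/$1/_settings" \
        -H 'Content-Type: application/json' \
        -d '{"index.blocks.read_only_allow_delete": null}'
}
```

For the index from the error message, this would be `clear_readonly_block esx_busin_sdx_index_test`; if disk usage is still above the flood-stage watermark, the block is expected to be applied again.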