Linux|错误集锦|prometheus Error on ingesting samples that are too old or are too far into the future的解决
2023-09-14 09:13:14 时间
前情回顾:
二进制prometheus部署完成后,在prometheus的web界面进行一些数据验证工作。
下面这个是我已经恢复正常的,其实是查询不到数据的
grafana也接收不到任何数据
问题排查:
查看系统日志 /var/log/messages ,有很多飘红,内容如下:
2023-03-05T10:04:43.421936+08:00 EULER1 prometheus: ts=2023-03-05T02:04:43.420Z caller=scrape.go:1604 level=warn component="scrape manager" scrape_pool=prometheus target=http://192.168.123.60:39090/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=585
2023-03-05T10:04:43.422722+08:00 EULER1 prometheus: ts=2023-03-05T02:04:43.420Z caller=scrape.go:1338 level=warn component="scrape manager" scrape_pool=prometheus target=http://192.168.123.60:39090/metrics msg="Append failed" err="out of bounds"
2023-03-05T10:04:46.332122+08:00 EULER1 prometheus: ts=2023-03-05T02:04:46.331Z caller=scrape.go:1604 level=warn component="scrape manager" scrape_pool=server target=http://192.168.123.60:39100/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=489
2023-03-05T10:04:46.332823+08:00 EULER1 prometheus: ts=2023-03-05T02:04:46.331Z caller=scrape.go:1338 level=warn component="scrape manager" scrape_pool=server target=http://192.168.123.60:39100/metrics msg="Append failed" err="out of bounds"
2023-03-05T10:04:53.220008+08:00 EULER1 grafana: logger=context userId=1 orgId=1 uname=admin t=2023-03-05T10:04:53.218709394+08:00 level=info msg="Request Completed" method=GET path=/api/live/ws status=-1 remote_addr=192.168.123.1 time_ms=34 duration=34.438488ms size=0 referer= handler=/api/live/ws
2023-03-05T10:04:58.421342+08:00 EULER1 prometheus: ts=2023-03-05T02:04:58.420Z caller=scrape.go:1604 level=warn component="scrape manager" scrape_pool=prometheus target=http://192.168.123.60:39090/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=585
2023-03-05T10:04:58.422096+08:00 EULER1 prometheus: ts=2023-03-05T02:04:58.420Z caller=scrape.go:1338 level=warn component="scrape manager" scrape_pool=prometheus target=http://192.168.123.60:39090/metrics msg="Append failed" err="out of bounds"
2023-03-05T10:05:01.324456+08:00 EULER1 prometheus: ts=2023-03-05T02:05:01.323Z caller=scrape.go:1604 level=warn component="scrape manager" scrape_pool=server target=http://192.168.123.60:39100/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=489
2023-03-05T10:05:01.324838+08:00 EULER1 prometheus: ts=2023-03-05T02:05:01.323Z caller=scrape.go:1338 level=warn component="scrape manager" scrape_pool=server target=http://192.168.123.60:39100/metrics msg="Append failed" err="out of bounds"
2023-03-05T10:05:13.423583+08:00 EULER1 prometheus: ts=2023-03-05T02:05:13.422Z caller=scrape.go:1604 level=warn component="scrape manager" scrape_pool=prometheus target=http://192.168.123.60:39090/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=585
2023-03-05T10:05:13.424362+08:00 EULER1 prometheus: ts=2023-03-05T02:05:13.422Z caller=scrape.go:1338 level=warn component="scrape manager" scrape_pool=prometheus target=http://192.168.123.60:39090/metrics msg="Append failed" err="out of bounds"
2023-03-05T10:05:16.330791+08:00 EULER1 prometheus: ts=2023-03-05T02:05:16.329Z caller=scrape.go:1604 level=warn component="scrape manager" scrape_pool=server target=http://192.168.123.60:39100/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=489
以上报错的意思是由于时间不正常,时序数据库出现异常。
查看系统时间,是正常的,并且已经同步过阿里云的时间服务器,同步命令如下:
[root@EULER1 tsdb_data]# ntpdate ntp.aliyun.com
11 Mar 20:08:29 ntpdate[53501]: adjust time server 203.107.6.88 offset 0.001621 sec
整个就是无语,完全没有想法的状态。
解决方案:
后面仔细考虑了一下部署前后的操作,其中有一个细节,是prometheus server服务启动后,查看了服务状态,确认状态正常后,又修改了时区,因为系统时间不知道怎么回事,又回到了西八区,因此,再次改为东八区,命令如下:
rm -rf /etc/localtime
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
最终确定是由于时区更改的问题,造成prometheus的时序数据库出现数据异常,因此需要重置prometheus的时序数据库。
最终解决方案:
cat >/etc/systemd/system/prometheus.service <<EOF
[Unit]
Descriptinotallow=Prometheus Monitoring System
Documentatinotallow=Prometheus Monitoring System
[Service]
ExecStart=/opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml --web.listen-address=:39090 \
--storage.tsdb.path="/opt/prometheus/tsdb_data"
# 这里的路径按实际填写
[Install]
WantedBy=multi-user.target
EOF
启动脚本内增加了--storage.tsdb.path="/opt/prometheus/tsdb_data" /opt/prometheus/tsdb_data目录需要新建
再次重启prometheus服务后,发现一切恢复正常。
虽然prometheus的数据仍然是存放在/data 目录下,并不在/opt/prometheus/tsdb_data目录下,但神奇的是prometheus服务竟然恢复了。具体原因还不清楚,留待以后在研究吧。
相关文章
- Linux内核补丁:为系统增添保护(linux内核patch)
- 日志Linux排查错误之旅:查看错误日志(linux查看错误)
- 安装Vue on Linux:轻松搭建开发环境(linux安装vue)
- Linux快速开启之路:Direct Boot(linux直接启动)
- Linux路径符号:精准掌握归途(linux路径符号)
- 优化Linux系统,助力运维工作(linux系统维护)
- 不正确警告:Linux中输入文件名错误(linux输入文件名)
- Linux发展历程:从分支到成熟(linux的分支)
- 解决Linux系统密码错误问题(linux密码错误)
- Linux之家:各种分支的Linux系统探索(linux的分支)
- Linux系统中的空闲进程研究(linux的空闲进程)
- Linux运维:每天都是一段旅程(linux运维日常工作)
- Linux查看域名的简易方法(linux怎么查看域名)
- 关闭Linux防火墙:一种安全实践(关闭linux的防火墙)
- 关闭Linux系统防火墙:一步一步进行!(关闭linux的防火墙)
- 快速上手:Linux创建文件的简单方法(linux创建文件的方法)
- 轻松管理Linux:简单指南(管理linux)
- Linux超时:如何避免网络通信中的失效和错误?(linux超时)
- Linux系统Too many open files错误解决方法(linux打开文件过多)
- 「轻松管理无压力,Linux批量管理工具绝对实用」(linux批量管理工具)
- Installing Tarball Packages on Linux Systems(linux安装tgz)
- Linux意外关机:如何避免数据丢失?(linux意外关机)
- Exploring the Benefits of Linux System on 32bit Architecture(linux系统32位)
- Efficiently Monitor JVM Memory on Linux with These Simple Tips(linux监控jvm内存)
- Effective Ways to Combat Malware on Linux Platforms: A Guide to Trojans Removal(linux查杀木马)
- Exploring the Benefits of VM on a Linux System(linux系统vm)
- Linux 英文文献:学习和使用 Linux 的必备参考资源(linux英文文献)
- Linux如何挂载NTFS格式的硬盘(linux挂载ntfs硬盘)
- 轻松配置Linux系统的固定IP(设置linux固定ip)