prometheus-rancher-monitoring-prometheus-0 keeps going into CrashLoopBackOff, then runs again

ts=2023-10-17T09:26:31.763Z caller=main.go:539 level=info msg=“Starting Prometheus Server” mode=server version=“(version=2.38.0, branch=HEAD, revision=818d6e60888b2a3ea363aee8a9828c7bafd73699)”
ts=2023-10-17T09:26:31.763Z caller=main.go:544 level=info build_context=“(go=go1.18.5, user=root@e6b781f65453, date=20220816-13:23:14)”
ts=2023-10-17T09:26:31.763Z caller=main.go:545 level=info host_details=“(Linux 5.13.0-1022-azure #26~20.04.1-Ubuntu SMP Thu Apr 7 19:42:45 UTC 2022 x86_64 prometheus-rancher-monitoring-prometheus-0 (none))”
ts=2023-10-17T09:26:31.763Z caller=main.go:546 level=info fd_limits=“(soft=1048576, hard=1048576)”
ts=2023-10-17T09:26:31.763Z caller=main.go:547 level=info vm_limits=“(soft=unlimited, hard=unlimited)”
ts=2023-10-17T09:26:31.766Z caller=web.go:553 level=info component=web msg=“Start listening for connections” address=0.0.0.0:9090
ts=2023-10-17T09:26:31.767Z caller=main.go:976 level=info msg=“Starting TSDB …”
ts=2023-10-17T09:26:31.768Z caller=repair.go:56 level=info component=tsdb msg=“Found healthy block” mint=1697440068542 maxt=1697500800000 ulid=01HCY0F5ME18J0WSZX088745HQ
ts=2023-10-17T09:26:31.768Z caller=tls_config.go:231 level=info component=web msg=“TLS is disabled.” http2=false
ts=2023-10-17T09:26:31.768Z caller=repair.go:56 level=info component=tsdb msg=“Found healthy block” mint=1697522400000 maxt=1697529600000 ulid=01HCYE0HQNSMMWDPSKCTX45ZWN
ts=2023-10-17T09:26:31.769Z caller=repair.go:56 level=info component=tsdb msg=“Found healthy block” mint=1697500800000 maxt=1697522400000 ulid=01HCYE0Z4F369QDYRSG6MYYRWR
ts=2023-10-17T09:26:31.769Z caller=dir_locker.go:77 level=warn component=tsdb msg=“A lockfile from a previous execution already existed. It was replaced” file=/prometheus/lock
ts=2023-10-17T09:26:31.843Z caller=head.go:495 level=info component=tsdb msg=“Replaying on-disk memory mappable chunks if any”
ts=2023-10-17T09:26:31.843Z caller=head.go:538 level=info component=tsdb msg=“On-disk memory mappable chunks replay completed” duration=2.9µs
ts=2023-10-17T09:26:31.843Z caller=head.go:544 level=info component=tsdb msg=“Replaying WAL, this may take a while”
ts=2023-10-17T09:26:38.568Z caller=head.go:580 level=info component=tsdb msg=“WAL checkpoint loaded”
ts=2023-10-17T09:26:38.739Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=158 maxSegment=173
ts=2023-10-17T09:26:38.876Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=159 maxSegment=173
ts=2023-10-17T09:26:40.622Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=160 maxSegment=173
ts=2023-10-17T09:26:40.926Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=161 maxSegment=173
ts=2023-10-17T09:26:41.822Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=162 maxSegment=173
ts=2023-10-17T09:26:42.531Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=163 maxSegment=173
ts=2023-10-17T09:26:43.227Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=164 maxSegment=173
ts=2023-10-17T09:26:44.032Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=165 maxSegment=173
ts=2023-10-17T09:26:44.423Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=166 maxSegment=173
ts=2023-10-17T09:26:44.824Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=167 maxSegment=173
ts=2023-10-17T09:26:46.023Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=168 maxSegment=173
ts=2023-10-17T09:26:46.722Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=169 maxSegment=173
ts=2023-10-17T09:26:48.722Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=170 maxSegment=173
ts=2023-10-17T09:26:49.430Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=171 maxSegment=173
ts=2023-10-17T09:26:50.433Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=172 maxSegment=173
ts=2023-10-17T09:26:50.433Z caller=head.go:615 level=info component=tsdb msg=“WAL segment loaded” segment=173 maxSegment=173
ts=2023-10-17T09:26:50.433Z caller=head.go:621 level=info component=tsdb msg=“WAL replay completed” checkpoint_replay_duration=6.724838744s wal_replay_duration=11.865680264s total_replay_duration=18.590534408s
ts=2023-10-17T09:26:51.342Z caller=main.go:997 level=info fs_type=EXT4_SUPER_MAGIC
ts=2023-10-17T09:26:51.343Z caller=main.go:1000 level=info msg=“TSDB started”
ts=2023-10-17T09:26:51.343Z caller=main.go:1181 level=info msg=“Loading configuration file” filename=/etc/prometheus/config_out/prometheus.env.yaml
ts=2023-10-17T09:26:51.358Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.358Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.358Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.358Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.358Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.359Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.359Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.359Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.359Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.359Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.359Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.360Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.360Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.360Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.360Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.361Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.361Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.361Z caller=kubernetes.go:326 level=info component=“discovery manager scrape” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.361Z caller=kubernetes.go:326 level=info component=“discovery manager notify” discovery=kubernetes msg=“Using pod service account via in-cluster config”
ts=2023-10-17T09:26:51.733Z caller=main.go:1218 level=info msg=“Completed loading of configuration file” filename=/etc/prometheus/config_out/prometheus.env.yaml totalDuration=390.671401ms db_storage=1µs remote_storage=1.5µs web_handler=600ns query_engine=1.2µs scrape=249.407µs scrape_sd=3.687195ms notify=26.001µs notify_sd=228.106µs rules=371.551206ms tracing=7µs
ts=2023-10-17T09:26:51.733Z caller=main.go:961 level=info msg=“Server is ready to receive web requests.”
ts=2023-10-17T09:26:51.733Z caller=manager.go:941 level=info component=“rule manager” msg=“Starting rule manager…”

@paredescedric3

The logs posted so far do not show why the container is restarting. Please share the previous container logs and the recent events for the pod.

To get the previous container logs:

kubectl -n <namespace> logs <pod-name> --previous

To get the pod's recent events:

kubectl -n <namespace> describe pod <pod-name> 
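The two requests above (previous container logs and recent pod events) can be sketched as a small dry-run helper that only prints the kubectl commands, so they can be reviewed before being run against the cluster. The function names `prev_logs_cmd` and `pod_events_cmd` are hypothetical, the namespace `cattle-monitoring-system` is an assumption (substitute the namespace your monitoring stack is actually deployed in), and the optional container name `prometheus` is likewise an assumption about how the pod is laid out.

```shell
#!/bin/sh
# Hypothetical dry-run helpers: they print kubectl commands, they do not run them.

# Print the command that fetches logs from the previous (crashed) container.
# $1 = namespace, $2 = pod name, $3 = optional container name
prev_logs_cmd() {
  if [ -n "${3:-}" ]; then
    printf 'kubectl -n %s logs %s -c %s --previous\n' "$1" "$2" "$3"
  else
    printf 'kubectl -n %s logs %s --previous\n' "$1" "$2"
  fi
}

# Print the command that shows the pod description, including recent events.
# $1 = namespace, $2 = pod name
pod_events_cmd() {
  printf 'kubectl -n %s describe pod %s\n' "$1" "$2"
}

# Assumed namespace; the pod name is the one from the logs in this thread.
NAMESPACE="cattle-monitoring-system"
POD="prometheus-rancher-monitoring-prometheus-0"

prev_logs_cmd "$NAMESPACE" "$POD" prometheus
pod_events_cmd "$NAMESPACE" "$POD"
```

Once the printed commands look right, they can be copied into a terminal, or the script output can be piped to `sh`. In the `describe` output, the "Last State" and "Events" sections are usually the most useful parts to paste back here.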


Here is the latest log

@paredescedric3

We need the pod's previous logs to debug the issue.

@paredescedric3 Could you please provide the pod's previous logs so the issue can be debugged? Only then will @syed.salman be able to guide you properly.