Commit 2902f46
[BACKPORT 2024.2][#24566] docdb: Throttle log when tablet is stuck in bootstrapping state
Summary:
If a tablet is stuck in the bootstrapping state due to bugs like #20977, the tserver
repeatedly emits the following message:
```
I1019 16:05:33.204833 112611 consensus_peers.cc:187] T e28f9138850f4bd29b6ee097d2557e12 P d494d8235d1444e0978ed79d7037ce5c -> Peer fb68a8df689340fabe86c775ba95f289 ([host: "x.x.x.x" port: 9100], []): Found a RPC call in stuck state - timeout: 3.000s, last_rpc_start_time: 125060143.166s, stuck threshold: 10.000s, force recover: 0, call state: OutboundCall(0x0000000db30567a0 -> RPC call yb.consensus.ConsensusService.UpdateConsensus -> { remote: x.x.x.x:9100 idx: 0 protocol: 0x000000000437b7b0 -> tcpc } , state=SENT.): RPC call yb.consensus.ConsensusService.UpdateConsensus -> { remote: x.x.x.x:9100 idx: 0 protocol: 0x000000000437b7b0 -> tcpc } , state=SENT., start_time: 125060143.166s, sent_time: 125060143.166s, callback_time: 0.000s, now: 125487372.619s, connection: 0x000000003010a3d8 -> Connection (0x000000003010a3d8) client x.x.x.x:39958 => x.x.x.x:9100
```
This change throttles the message to at most once per second. The message is now DFATAL, so
debug builds fail immediately, but we still do not want the resulting ERROR logs to fill up
the disk if the problem finds its way to production clusters.
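For illustration only, here is a minimal standalone sketch of once-per-interval log throttling. It is not the code in this commit: `LogThrottler`, `ShouldLog`, and `ReportStuckCall` are hypothetical names, and the sketch uses only the C++ standard library rather than the logging macros in the YugabyteDB tree.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <limits>
#include <string>

// Hypothetical helper (not YugabyteDB code): allows at most one log line per interval.
class LogThrottler {
 public:
  using Clock = std::chrono::steady_clock;

  explicit LogThrottler(std::chrono::nanoseconds interval)
      : interval_ns_(interval.count()) {}

  // Returns true if the caller may log now. Safe to call from multiple threads.
  bool ShouldLog(Clock::time_point now = Clock::now()) {
    const int64_t now_ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
        now.time_since_epoch()).count();
    int64_t last = last_ns_.load(std::memory_order_relaxed);
    // Retry while the interval has elapsed; only one thread wins the exchange.
    while (last == kNever || now_ns - last >= interval_ns_) {
      if (last_ns_.compare_exchange_weak(last, now_ns, std::memory_order_relaxed)) {
        return true;
      }
    }
    suppressed_.fetch_add(1, std::memory_order_relaxed);
    return false;
  }

  // Number of calls that were suppressed so far.
  uint64_t suppressed() const { return suppressed_.load(std::memory_order_relaxed); }

 private:
  static constexpr int64_t kNever = std::numeric_limits<int64_t>::min();

  const int64_t interval_ns_;
  std::atomic<int64_t> last_ns_{kNever};
  std::atomic<uint64_t> suppressed_{0};
};

// Hypothetical call site: the stuck-call message is emitted at most once per second.
void ReportStuckCall(const std::string& details) {
  static LogThrottler throttler(std::chrono::seconds(1));
  if (throttler.ShouldLog()) {
    std::cerr << "Found a RPC call in stuck state - " << details << '\n';
  }
}
```

The throttle state lives next to the call site, so a hot loop that keeps detecting the same stuck call pays only an atomic load on the suppressed path.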
Original commit: 87afcd8 / D50000
Test Plan: Jenkins
Reviewers: mhaddad, hsunder
Reviewed By: mhaddad
Subscribers: ybase
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D50095
1 parent b8e7a24 commit 2902f46
1 file changed, +1 -1 lines changed (line 188)