Skip to content

Improvements to call cache cleanup#6476

Merged
lutter merged 6 commits intomasterfrom
lutter/call-cache
Apr 3, 2026
Merged

Improvements to call cache cleanup#6476
lutter merged 6 commits intomasterfrom
lutter/call-cache

Conversation

@lutter
Copy link
Copy Markdown
Collaborator

@lutter lutter commented Apr 1, 2026

No description provided.

lutter added 2 commits April 1, 2026 16:18
The inner deletion loop in clear_stale_call_cache used a hardcoded
batch size of 10,000. Replace it with AdaptiveBatchSize that starts
at 100 and self-tunes based on actual query duration, avoiding both
excessive lock contention on large caches and unnecessary round-trips
on small ones.
Add catalog::table_has_index() to check for index existence, and
use it in clear_stale_call_cache to ensure a btree index on
contract_address exists before running deletion queries. The index
is created concurrently if missing, avoiding sequential scans on
large call_cache tables.
@lutter lutter requested a review from dimitrovmaksim April 1, 2026 23:58
Comment on lines +1679 to +1685
conn.batch_execute(&format!(
"create index concurrently if not exists {} \
on {}(contract_address)",
Self::CALL_CACHE_CONTRACT_ADDRESS_INDEX,
table_qname
))
.await?;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failure on create index concurrently may leave an invalid index, that will not be used, but will exist as a name on the table. We should probably match on error here and execute drop index if exists.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed that

loop {
let current_size = batch_size.size;
let start = Instant::now();
let deleted_count = sql_query(&delete_cache_query)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually oversight on my part when implementing this (sorry about that 🙏🏻), I would be nice to add logs before/after the delete query to show progress.

// self-tune based on query duration.
let contracts_batch_size: i64 = 2000;
let cache_batch_size: usize = 10000;
let mut batch_size = AdaptiveBatchSize::with_size(100);
Copy link
Copy Markdown
Member

@dimitrovmaksim dimitrovmaksim Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be an edge case (depending on shard/replica setup), but now that batch_size is unbounded, couldn't it grow to a point it creates very large WAL files per batch, which may cause replication lag on read replicas and affect query performance? (tbf 10k already probably had significant WAL/lag impact)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's a big deal, at least if the adaptive batch size works since it will try to keep transactions to under 3 minutes

lutter added 3 commits April 2, 2026 13:44
Refactor the code to separate the gymnastics caused by the two separate
storage schemes from the overall control loop
Introduce an env variable to control that, and lower the default to 100
Copy link
Copy Markdown
Member

@dimitrovmaksim dimitrovmaksim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving not to block on nits

);
break;
}
if !idx_valid {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This would trigger on a missing index as well, so the log may be a little misleading

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch. I'll just not log; it's not that important to do that

@lutter lutter force-pushed the lutter/call-cache branch from e53be77 to 6a388cb Compare April 3, 2026 01:01
@lutter lutter merged commit 6a388cb into master Apr 3, 2026
6 checks passed
@lutter lutter deleted the lutter/call-cache branch April 3, 2026 01:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants