
fix: Do not mutate shared _gemm_output_3d in CpuGemmConv2d::run() #1275

Open

morgolock wants to merge 1 commit into main from pr/fix_shared_tinfo_cpugemm

Conversation

@morgolock (Contributor)

CpuGemmConv2d::run() was mutating the shared member _gemm_output_3d by extending its padding before soft_init()/import_memory().

When the same operator instance is reused across runs, this can cause later extend_padding() calls to fail. It is also unsafe when the operator is used from multiple threads.

Use a local TensorInfo copy in run() for padding extension and soft_init()/import_memory(), leaving _gemm_output_3d unchanged.
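The pattern behind the fix can be sketched with toy types (this is a simplified model, not the actual arm_compute::TensorInfo or CpuGemmConv2d code; the names only mirror the library's API):

```cpp
#include <stdexcept>

// Toy stand-in for a tensor info object: padding may only grow while the
// info is still resizable; importing memory "locks" it.
struct TensorInfo
{
    int  padding   = 0;
    bool resizable = true;

    void extend_padding(int p)
    {
        if (!resizable)
            throw std::runtime_error("info is locked");
        if (p > padding)
            padding = p;
    }
    // Models the effect of soft_init()/import_memory().
    void lock()
    {
        resizable = false;
    }
};

struct Conv2dBuggy
{
    TensorInfo _gemm_output_3d; // shared member, reused across run() calls

    void run()
    {
        _gemm_output_3d.extend_padding(4); // mutates the shared member...
        _gemm_output_3d.lock();            // ...and leaves it locked afterwards
    }
};

struct Conv2dFixed
{
    TensorInfo _gemm_output_3d;

    void run()
    {
        TensorInfo local = _gemm_output_3d; // local copy; member stays untouched
        local.extend_padding(4);
        local.lock();
    }
};
```

In this model, a second run() on the buggy operator throws because the shared info was left locked by the first run, while the fixed operator can be run repeatedly since only the local copy is ever modified.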

Added a new test: RepeatedRunDoesNotReuseImportedGemm3dTensorInfo.

Change-Id: I3e4e2d25cabf85724ecf126b1c93df6733ee7d48

Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
@morgolock force-pushed the pr/fix_shared_tinfo_cpugemm branch from f823e19 to a1d5f71 on April 2, 2026 at 15:24
* - The second run does not throw
* - Both runs compute the same output
*/
TEST_CASE(RepeatedRunDoesNotReuseImportedGemm3dTensorInfo, framework::DatasetMode::ALL)
@gunes-arm (Contributor) commented on Apr 2, 2026:

Is this really failing without this fix? I've been testing it but couldn't make it fail. I think that's because TensorAllocator's destructor marks the TensorInfo as resizable again: when run() finishes, the soft-initialized allocator is destructed and re-marks the info object as resizable. That would make this problem visible only in situations where the same object is used from multiple threads.
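The lifecycle the comment describes can be sketched with a toy model (hypothetical types, not the real TensorAllocator; the destructor behaviour is the assumption under discussion):

```cpp
// Toy info object: extend_padding() only succeeds while resizable.
struct TensorInfo
{
    bool resizable = true;

    bool extend_padding()
    {
        return resizable;
    }
};

// Models a soft-initialized allocator: locks the info while alive, and its
// destructor marks the info resizable again (the behaviour the comment
// attributes to TensorAllocator).
struct ScopedAllocator
{
    TensorInfo &info;

    explicit ScopedAllocator(TensorInfo &i) : info(i)
    {
        info.resizable = false;
    }
    ~ScopedAllocator()
    {
        info.resizable = true;
    }
};

struct Op
{
    TensorInfo shared;

    bool run()
    {
        if (!shared.extend_padding())
            return false;               // fails only if another run holds the lock
        ScopedAllocator alloc(shared);  // lock released when run() returns
        return true;
    }
};
```

Under this model, strictly sequential reuse always succeeds because each run()'s allocator destructor restores the resizable flag before the next run starts; the failure can only be observed when two runs on the same object overlap in time.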
