Exec separable filter#707

Open
AdityaShome wants to merge 16 commits into kornia:main from
AdityaShome:exec-separable-filter

@AdityaShome
Contributor

📝 Description

Adds ExecutionStrategy support to separable_filter and the operations built on it (box_blur, gaussian_blur, sobel), enabling both serial and parallel execution modes. The convolution loops are shared between strategies through the run_horizontal and run_vertical macros.

Fixes/Relates to: #600


🛠️ Changes Made

  • Added SeparableFilter struct with precomputed offsets for kernel convolution
  • Applied ExecutionStrategy parameter to separable_filter() function signature
  • Added run_horizontal and run_vertical macros for efficient convolution
  • Implemented serial execution path (without Send + Sync trait bounds)
  • Implemented parallel execution paths for ParallelElements, AutoRows, and Fixed strategies
  • Updated all call sites in ops.rs (box_blur, gaussian_blur, sobel) to use ExecutionStrategy
  • Updated test cases to pass an ExecutionStrategy and to verify that output values match across strategies
  • Passed an ExecutionStrategy argument in the examples to match the API change
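
For readers unfamiliar with the two-pass structure, here is a minimal sketch of a separable filter with precomputed tap offsets. It is not the PR's code: it assumes a single channel, f32 pixels, zero padding at the borders, and a hypothetical `apply_serial` that takes raw slices instead of `Image`.

```rust
/// Hypothetical, simplified stand-in for the PR's `SeparableFilter`:
/// single channel, f32 pixels, zero padding at the borders.
pub struct SeparableFilter {
    kernel_x: Vec<f32>,
    kernel_y: Vec<f32>,
    // Offset of each tap relative to the centre pixel, computed once.
    offsets_x: Vec<isize>,
    offsets_y: Vec<isize>,
}

fn centered_offsets(len: usize) -> Vec<isize> {
    let half = (len / 2) as isize;
    (0..len as isize).map(|i| i - half).collect()
}

impl SeparableFilter {
    pub fn new(kernel_x: &[f32], kernel_y: &[f32]) -> Self {
        Self {
            kernel_x: kernel_x.to_vec(),
            kernel_y: kernel_y.to_vec(),
            offsets_x: centered_offsets(kernel_x.len()),
            offsets_y: centered_offsets(kernel_y.len()),
        }
    }

    /// Horizontal pass into a temporary buffer, then vertical pass into `dst`.
    pub fn apply_serial(&self, src: &[f32], dst: &mut [f32], rows: usize, cols: usize) {
        assert_eq!(src.len(), rows * cols);
        assert_eq!(dst.len(), rows * cols);
        let mut temp = vec![0.0f32; src.len()];

        // Horizontal pass: convolve every row with kernel_x.
        for r in 0..rows {
            for c in 0..cols {
                let mut acc = 0.0f32;
                for (&k, &off) in self.kernel_x.iter().zip(&self.offsets_x) {
                    let x = c as isize + off;
                    if x >= 0 && x < cols as isize {
                        acc += src[r * cols + x as usize] * k;
                    }
                }
                temp[r * cols + c] = acc;
            }
        }

        // Vertical pass: convolve every column of `temp` with kernel_y.
        for r in 0..rows {
            for c in 0..cols {
                let mut acc = 0.0f32;
                for (&k, &off) in self.kernel_y.iter().zip(&self.offsets_y) {
                    let y = r as isize + off;
                    if y >= 0 && y < rows as isize {
                        acc += temp[y as usize * cols + c] * k;
                    }
                }
                dst[r * cols + c] = acc;
            }
        }
    }
}
```

Each output row of either pass depends only on read-only input, which is what makes a per-row parallel split possible.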

🧪 How Was This Tested?

  • Unit Tests: All existing 95 tests in kornia-imgproc passed

    • test_separable_filter_f32 - validates f32 filtering with Serial strategy
    • test_separable_filter_u8 - validates u8 filtering with Serial strategy
    • test_separable_filter_u8_max_val - validates u8 clamping behavior
    • test_parallel_strategies_consistency - verifies that Serial, Fixed(4), AutoRows, and ParallelElements match exactly for f32 images with a normalized Gaussian-like kernel
    • test_parallel_strategies_u8 - verifies that all 4 strategies match for u8 images with a box kernel
  • Manual Verification:

    • Compiled successfully with no warnings using cargo check --package kornia-imgproc and cargo check --package feature.
  • Performance/Edge Cases:

    • Zero-size image check added to prevent empty buffer allocation
    • Parallel execution shares the temp buffer through an immutable reference to avoid data races
    • Send + Sync bounds ensure thread-safe pixel types on the parallel paths; the serial path omits them so that non-thread-safe pixel types can still be used.
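
The data-race point can be illustrated with a std-only sketch. It assumes a hypothetical `vertical_pass_parallel` helper using `std::thread::scope` rather than the PR's actual parallel backend, a single channel, and zero padding. The temp buffer is only ever read, while each thread receives a disjoint mutable slice of `dst`.

```rust
use std::thread;

/// Vertical pass of a separable filter split across `num_threads` scoped threads.
/// Hypothetical helper: single channel, f32 pixels, zero padding.
pub fn vertical_pass_parallel(
    temp: &[f32],
    dst: &mut [f32],
    kernel_y: &[f32],
    rows: usize,
    cols: usize,
    num_threads: usize,
) -> Result<(), String> {
    if num_threads == 0 {
        return Err("num_threads must be > 0".to_string());
    }
    if temp.len() != rows * cols || dst.len() != rows * cols {
        return Err("buffer length does not match rows * cols".to_string());
    }
    // Zero-size image: nothing to do, and chunks_mut(0) would panic.
    if rows == 0 || cols == 0 {
        return Ok(());
    }
    let half = (kernel_y.len() / 2) as isize;
    // Rows handled by each thread, rounded up.
    let rows_per_chunk = (rows + num_threads - 1) / num_threads;

    thread::scope(|s| {
        // chunks_mut hands out disjoint &mut slices of dst; temp is shared as &[f32].
        for (chunk_idx, chunk) in dst.chunks_mut(rows_per_chunk * cols).enumerate() {
            s.spawn(move || {
                let first_row = chunk_idx * rows_per_chunk;
                for (local_r, out_row) in chunk.chunks_mut(cols).enumerate() {
                    let r = (first_row + local_r) as isize;
                    for (c, out) in out_row.iter_mut().enumerate() {
                        let mut acc = 0.0f32;
                        for (i, &k) in kernel_y.iter().enumerate() {
                            let y = r + i as isize - half;
                            if y >= 0 && y < rows as isize {
                                acc += temp[y as usize * cols + c] * k;
                            }
                        }
                        *out = acc;
                    }
                }
            });
        }
    });
    Ok(())
}
```

Because the borrow checker sees one shared borrow of `temp` and non-overlapping mutable borrows of `dst`, no locking or unsafe code is needed for this split.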

🕵️ AI Usage Disclosure

Check one of the following:

  • 🟢 No AI used.
  • 🟡 AI-assisted: I used AI for boilerplate/refactoring but have manually reviewed and tested every line.
  • 🔴 AI-generated: (Note: These PRs may be subject to stricter scrutiny or immediate closure if the logic is not explained).

🚦 Checklist

  • I am assigned to the linked issue (required before PR submission)
  • The linked issue has been approved by a maintainer
  • This PR strictly implements what the linked issue describes (no scope creep)
  • I have performed a self-review of my code (no "ghost" variables or hallucinations).
  • My code follows the existing style guidelines of this project.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • (Optional) I have attached screenshots/recordings for UI changes.

💭 Additional Context

  • I have not changed the ExecutionStrategy struct definition or threshold.rs, as both were reviewed earlier.
  • CI may currently fail because of an sccache version mismatch; the checks pass with action version v0.13.0. If any GitHub Actions checks still fail once that is fixed, I will address them immediately.

Copilot AI review requested due to automatic review settings February 13, 2026 03:30
@qodo-code-review
Contributor

Review Summary by Qodo

Add ExecutionStrategy support to separable filters, blur operations, and threshold functions

✨ Enhancement 🧪 Tests

Walkthroughs

Description
• Add ExecutionStrategy parameter to separable filter and blur operations
  - Enables serial, parallel elements, auto rows, and fixed thread pool execution modes
  - Consolidates execution strategies using global macros for efficient convolution
• Implement parallel execution paths with proper error handling
  - Adds Parallel error variant to ImageError for execution failures
  - Validates thread counts and image sizes before processing
• Extend ExecutionStrategy support to threshold operations
  - Introduces ExecuteExt trait for flexible strategy-based execution
  - Updates threshold_binary to accept execution strategy parameter
• Add comprehensive benchmarks and tests for parallel strategies
  - New bench_threshold.rs benchmark suite comparing execution strategies
  - Tests verify consistency across Serial, Fixed, AutoRows, and ParallelElements strategies
Diagram
flowchart LR
  A["ExecutionStrategy API"] --> B["Separable Filter"]
  A --> C["Blur Operations"]
  A --> D["Threshold Operations"]
  B --> E["Serial Execution"]
  B --> F["Parallel Strategies"]
  C --> E
  C --> F
  D --> E
  D --> F
  F --> G["Fixed Thread Pool"]
  F --> H["AutoRows"]
  F --> I["ParallelElements"]

File Changes

1. crates/kornia-image/src/error.rs (Error handling, +4/-0): Add Parallel error variant to ImageError
2. crates/kornia-imgproc/src/parallel.rs (✨ Enhancement, +191/-6): Introduce ExecutionStrategy enum and ExecuteExt trait
3. crates/kornia-imgproc/src/filter/separable_filter.rs (✨ Enhancement, +393/-45): Add ExecutionStrategy support with parallel macros
4. crates/kornia-imgproc/src/filter/ops.rs (✨ Enhancement, +61/-7): Add strategy parameters to blur and sobel functions
5. crates/kornia-imgproc/src/threshold.rs (✨ Enhancement, +16/-17): Integrate ExecutionStrategy into threshold_binary function
6. crates/kornia-imgproc/benches/bench_filters.rs (🧪 Tests, +74/-2): Add parallel strategy benchmarks for gaussian blur
7. crates/kornia-imgproc/benches/bench_threshold.rs (🧪 Tests, +84/-0): Create new threshold operation benchmark suite
8. crates/kornia-imgproc/Cargo.toml (⚙️ Configuration changes, +4/-1): Register bench_threshold benchmark configuration
9. examples/binarize/src/main.rs (✨ Enhancement, +14/-2): Update threshold_binary calls with ExecutionStrategy
10. examples/features/src/main.rs (✨ Enhancement, +8/-1): Update threshold_binary calls with ExecutionStrategy

Contributor

Copilot AI left a comment

Pull request overview

This PR extends the ExecutionStrategy API introduced in #600 to separable filter operations (box_blur, gaussian_blur, sobel), enabling runtime control over parallelization. The implementation adds flexible serial and parallel execution modes with strategies including Serial, ParallelElements, AutoRows, and Fixed thread pools.

Changes:

  • Implemented SeparableFilter struct with execution strategy support and optimized convolution macros
  • Added ExecutionStrategy parameter to separable_filter() and created separable_filter_serial() for non-thread-safe types
  • Updated box_blur, gaussian_blur, and sobel with _with_strategy variants while maintaining backward compatibility through Serial defaults
  • Added comprehensive tests verifying consistency across all execution strategies
  • Updated examples and benchmarks to use the new API

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| crates/kornia-imgproc/src/parallel.rs | Adds ExecutionStrategy enum, ParallelError types, and ExecuteExt trait (from PR #600) |
| crates/kornia-imgproc/src/filter/separable_filter.rs | Core implementation with execution strategies, macros for convolution, tests for strategy consistency |
| crates/kornia-imgproc/src/filter/ops.rs | Updates box_blur, gaussian_blur, sobel with _with_strategy variants and Serial defaults |
| crates/kornia-imgproc/src/threshold.rs | Updates threshold_binary to use ExecutionStrategy (from PR #600) |
| crates/kornia-imgproc/benches/bench_filters.rs | Adds benchmarks for Serial, ParallelElements, and AutoRows strategies |
| crates/kornia-imgproc/benches/bench_threshold.rs | New benchmark file for threshold operations with different strategies |
| crates/kornia-imgproc/Cargo.toml | Adds bench_threshold benchmark configuration |
| crates/kornia-image/src/error.rs | Adds Parallel error variant for execution errors |
| examples/features/src/main.rs | Updates threshold_binary calls with ExecutionStrategy::Serial |
| examples/binarize/src/main.rs | Updates threshold_binary calls with ExecutionStrategy::Serial |

Comment on lines 296 to +322
dst: &mut Image<T, C, A2>,
kernel_x: &[f32],
kernel_y: &[f32],
strategy: ExecutionStrategy,
) -> Result<(), ImageError>
where
T: FloatConversion + Clone + Zero + Send + Sync,
{
if kernel_x.is_empty() || kernel_y.is_empty() {
return Err(ImageError::InvalidKernelLength(
kernel_x.len(),
kernel_y.len(),
));
}

if src.size() != dst.size() {
return Err(ImageError::InvalidImageSize(
src.cols(),
src.rows(),
dst.cols(),
dst.rows(),
));
}

let filter = SeparableFilter::new(kernel_x, kernel_y);
filter.apply(src, dst, strategy)
}

Copilot AI Feb 13, 2026

Breaking API change: The separable_filter function signature has been modified to require a strategy: ExecutionStrategy parameter. This breaks backward compatibility for any code using separable_filter directly. Consider either: (1) keeping the original separable_filter signature defaulting to ExecutionStrategy::Serial and adding separable_filter_with_strategy for the parameterized version (consistent with the pattern used for box_blur, gaussian_blur, and sobel), or (2) clearly documenting this as a breaking change in the PR description and considering a major version bump.
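
Option (1) can be sketched as follows. This is an illustration of the wrapper pattern only: `Strategy`, `scale`, and `scale_with_strategy` are hypothetical names, the operation is a trivial per-element gain rather than a separable filter, and `std::thread::scope` stands in for whatever parallel backend the crate uses.

```rust
use std::thread;

/// Hypothetical, reduced strategy enum used only for this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Strategy {
    Serial,
    Fixed(usize),
}

/// Parameterized entry point: the caller picks the strategy.
pub fn scale_with_strategy(
    src: &[f32],
    dst: &mut [f32],
    gain: f32,
    strategy: Strategy,
) -> Result<(), String> {
    if src.len() != dst.len() {
        return Err("src and dst must have the same length".to_string());
    }
    match strategy {
        Strategy::Serial => {
            for (d, s) in dst.iter_mut().zip(src) {
                *d = s * gain;
            }
        }
        Strategy::Fixed(0) => return Err("thread count must be > 0".to_string()),
        Strategy::Fixed(n) => {
            // Elements per thread, rounded up and never zero.
            let chunk = ((src.len() + n - 1) / n).max(1);
            thread::scope(|scope| {
                for (d, s) in dst.chunks_mut(chunk).zip(src.chunks(chunk)) {
                    scope.spawn(move || {
                        for (d, s) in d.iter_mut().zip(s) {
                            *d = s * gain;
                        }
                    });
                }
            });
        }
    }
    Ok(())
}

/// Original signature, kept source-compatible by defaulting to Serial.
pub fn scale(src: &[f32], dst: &mut [f32], gain: f32) -> Result<(), String> {
    scale_with_strategy(src, dst, gain, Strategy::Serial)
}
```

Existing callers of `scale` keep compiling unchanged, while new callers opt into parallelism through the `_with_strategy` variant.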

&mut dst_auto,
&kernel_x,
&kernel_y,
ExecutionStrategy::AutoRows(0),

Copilot AI Feb 13, 2026

AutoRows(0) is being tested with a stride of 0, which should trigger an InvalidRowStride error in the ExecuteExt trait implementation. However, the separable_filter implementation ignores the stride parameter for AutoRows and uses cols * C instead. This creates an inconsistency: the ExecuteExt::execute_with would reject stride=0, but separable_filter's AutoRows branch doesn't validate or use the provided stride value at all. Either the test should use a valid stride (like AutoRows(10) for width 10), or the AutoRows implementation in separable_filter should validate the stride parameter.

Suggested change
-            ExecutionStrategy::AutoRows(0),
+            ExecutionStrategy::AutoRows(1),

&mut dst_auto,
&kernel_x,
&kernel_y,
ExecutionStrategy::AutoRows(0),

Copilot AI Feb 13, 2026

AutoRows(0) is being tested with a stride of 0, which should trigger an InvalidRowStride error in the ExecuteExt trait implementation. However, the separable_filter implementation ignores the stride parameter for AutoRows and uses cols * C instead. This creates an inconsistency: the ExecuteExt::execute_with would reject stride=0, but separable_filter's AutoRows branch doesn't validate or use the provided stride value at all. Either the test should use a valid stride (like AutoRows(8) for width 8), or the AutoRows implementation in separable_filter should validate the stride parameter.

Suggested change
-            ExecutionStrategy::AutoRows(0),
+            ExecutionStrategy::AutoRows(1),

Comment on lines +230 to +238
ExecutionStrategy::AutoRows(_) => {
// Horizontal
temp.par_chunks_mut(cols * C)
.enumerate()
.for_each(|(r, row)| run_horizontal!(self, r, row, src_data, cols, C));

// Vertical
dst_data
.par_chunks_mut(cols * C)

Copilot AI Feb 13, 2026

The AutoRows execution strategy ignores the stride parameter provided by the user. The stride value in ExecutionStrategy::AutoRows(_) is not validated or used; instead, the implementation hardcodes cols * C as the chunk size. This is inconsistent with the ExecutionStrategy API design where AutoRows(usize) should allow the user to specify the row stride. Either remove the parameter and make it AutoRows (auto-computing the stride), or use the provided stride value.

Suggested change
-            ExecutionStrategy::AutoRows(_) => {
-                // Horizontal
-                temp.par_chunks_mut(cols * C)
-                    .enumerate()
-                    .for_each(|(r, row)| run_horizontal!(self, r, row, src_data, cols, C));
-                // Vertical
-                dst_data
-                    .par_chunks_mut(cols * C)
+            ExecutionStrategy::AutoRows(row_stride) => {
+                if row_stride == 0 {
+                    return Err(ImageError::Parallel("row_stride must be > 0".to_string()));
+                }
+                let row_width = cols * C;
+                let chunk_size = row_width
+                    .checked_mul(row_stride)
+                    .ok_or_else(|| ImageError::Parallel("row_stride too large".to_string()))?;
+                // Horizontal
+                temp.par_chunks_mut(chunk_size)
+                    .enumerate()
+                    .for_each(|(r, row)| run_horizontal!(self, r, row, src_data, cols, C));
+                // Vertical
+                dst_data
+                    .par_chunks_mut(chunk_size)

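
One detail worth checking if the stride is honored: once a chunk spans several rows, the index produced by `enumerate()` is a chunk index, not a row index. Assuming the macros treat their `r` argument as a row index, each chunk would need to be split back into rows. A std-only sketch of the validation and the index mapping, with hypothetical helper names `chunk_len` and `label_rows`:

```rust
/// Chunk length in elements for `row_stride` rows per chunk.
/// Rejects a zero stride, zero-sized rows, and arithmetic overflow.
pub fn chunk_len(cols: usize, channels: usize, row_stride: usize) -> Result<usize, String> {
    if row_stride == 0 {
        return Err("row_stride must be > 0".to_string());
    }
    cols.checked_mul(channels)
        .and_then(|row_width| row_width.checked_mul(row_stride))
        .filter(|&n| n > 0)
        .ok_or_else(|| "invalid chunk size".to_string())
}

/// Writes each pixel's global row index into `buf` (single channel),
/// visiting the buffer chunk by chunk the way a strided row split would.
pub fn label_rows(buf: &mut [usize], cols: usize, row_stride: usize) -> Result<(), String> {
    let len = chunk_len(cols, 1, row_stride)?;
    for (chunk_idx, chunk) in buf.chunks_mut(len).enumerate() {
        for (local, row) in chunk.chunks_mut(cols).enumerate() {
            // Global row index: chunk index scaled by the stride, plus the
            // position of this row inside the chunk.
            let r = chunk_idx * row_stride + local;
            row.fill(r);
        }
    }
    Ok(())
}
```

With 5 rows of 2 columns and a stride of 2, the chunks cover rows 0-1, 2-3, and 4, and every pixel still receives its true row index.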
Comment on lines 296 to +299
dst: &mut Image<T, C, A2>,
kernel_x: &[f32],
kernel_y: &[f32],
strategy: ExecutionStrategy,

Copilot AI Feb 13, 2026

The documentation for separable_filter is incomplete. The strategy parameter is not documented in the doc comment. Add documentation for the strategy parameter explaining what execution strategy to use.

///
/// * `src` - Source image
/// * `dst` - Destination image (must have same size as source)
/// * `kernel_x` - Horizontal filter kernel

Copilot AI Feb 13, 2026

Trailing whitespace detected at the end of this line. Remove the trailing spaces.

Suggested change
-    /// * `kernel_x` - Horizontal filter kernel
+    /// * `kernel_x` - Horizontal filter kernel

@qodo-code-review
Contributor

qodo-code-review bot commented Feb 13, 2026

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (3) 📎 Requirement gaps (0)


Action required

✅ 1. test_fixed_threadpool_validation unwraps 📘 Rule violation ⛯ Reliability
Description
The newly added test uses unwrap() and unwrap_err() on fallible image creation and error
extraction, which can panic and obscures error handling intent. This violates the compliance
requirement to avoid unchecked assumptions and propagate/handle errors using Result/?.
Code

crates/kornia-imgproc/src/filter/separable_filter.rs[R760-774]

+        let img = Image::<f32, 1, _>::from_size_val(size, 0.5, CpuAllocator).unwrap();
+        let mut dst = Image::<f32, 1, _>::from_size_val(size, 0.0, CpuAllocator).unwrap();
+        let kernel = vec![1.0];
+
+        // Fixed(0) should error
+        let result = separable_filter(
+            &img,
+            &mut dst,
+            &kernel,
+            &kernel,
+            ExecutionStrategy::Fixed(0),
+        );
+        assert!(result.is_err());
+        let err_msg = result.unwrap_err().to_string();
+        assert!(
Evidence
PR Compliance ID 11 requires avoiding unwrap()/expect() (including in tests that touch fallible
APIs). The new test calls Image::from_size_val(...).unwrap() and then calls result.unwrap_err()
to access the error message.

crates/kornia-imgproc/src/filter/separable_filter.rs[760-774]
Best Practice: Repository guidelines

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new test `test_fixed_threadpool_validation` uses `unwrap()`/`unwrap_err()` on fallible calls, which can panic and violates the no-unwrap compliance rule.
## Issue Context
This test already uses assertions on `Result`; it can be made panic-free by returning `Result` and matching on `Err(e)`.
## Fix Focus Areas
- crates/kornia-imgproc/src/filter/separable_filter.rs[755-778]
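
A panic-free version of such a test could look like the sketch below. `FilterError` and `check_thread_count` are hypothetical stand-ins for `ImageError` and the `Fixed(0)` validation, since the real types are not reproduced here; in a real test module the function would carry `#[test]`.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type standing in for the crate's error enum.
#[derive(Debug, PartialEq)]
pub enum FilterError {
    InvalidThreadCount(usize),
}

impl fmt::Display for FilterError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FilterError::InvalidThreadCount(n) => write!(f, "invalid thread count: {n}"),
        }
    }
}

impl Error for FilterError {}

/// Hypothetical validation mirroring the "Fixed(0) should error" case.
pub fn check_thread_count(n: usize) -> Result<usize, FilterError> {
    if n == 0 {
        Err(FilterError::InvalidThreadCount(n))
    } else {
        Ok(n)
    }
}

/// Test body that returns a Result instead of unwrapping.
pub fn fixed_zero_threads_is_rejected() -> Result<(), Box<dyn Error>> {
    // Fallible setup is propagated with `?` instead of `unwrap()`.
    let threads = check_thread_count(4)?;
    assert_eq!(threads, 4);

    // The error is extracted by matching rather than `unwrap_err()`.
    let err = match check_thread_count(0) {
        Err(e) => e,
        Ok(n) => return Err(format!("expected an error, got Ok({n})").into()),
    };
    assert!(err.to_string().contains("invalid thread count"));
    Ok(())
}
```

A failing setup step then surfaces as a returned error with its message instead of a panic at an `unwrap()` call site.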


✅ 2. AutoRows stride unused 📘 Rule violation ✓ Correctness
Description
In SeparableFilter::apply, the ExecutionStrategy::AutoRows(stride) parameter is ignored (_),
so callers cannot control/validate row stride as documented. This creates misleading behavior and
misses validation of boundary values like stride == 0.
Code

crates/kornia-imgproc/src/filter/separable_filter.rs[R230-241]

+            ExecutionStrategy::AutoRows(_) => {
+                // Horizontal
+                temp.par_chunks_mut(cols * C)
+                    .enumerate()
+                    .for_each(|(r, row)| run_horizontal!(self, r, row, src_data, cols, C));
+
+                // Vertical
+                dst_data
+                    .par_chunks_mut(cols * C)
+                    .enumerate()
+                    .for_each(|(r, row)| run_vertical!(self, r, row, temp, rows, cols, C, T));
+            }
Evidence
PR Compliance ID 2 requires identifiers/parameters to reflect actual behavior; PR Compliance ID 3
requires explicit handling of boundary values. The match arm pattern
ExecutionStrategy::AutoRows(_) discards the stride value entirely and performs a fixed
par_chunks_mut(cols * C) approach with no stride validation.

Rule 2: Generic: Meaningful Naming and Self-Documenting Code
Rule 3: Generic: Robust Error Handling and Edge Case Management
crates/kornia-imgproc/src/filter/separable_filter.rs[230-241]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`ExecutionStrategy::AutoRows(stride)` is implemented as `AutoRows(_)` in `SeparableFilter::apply`, so the provided `stride` is ignored and `stride == 0` is not validated/handled. This makes the API misleading and breaks boundary/edge-case expectations.
## Issue Context
`ExecutionStrategy::AutoRows(usize)` is documented as requiring a row stride. The implementation should either (a) honor the stride, or (b) redefine the variant to not accept a stride and update call sites/tests accordingly.
## Fix Focus Areas
- crates/kornia-imgproc/src/filter/separable_filter.rs[203-241]


3. Morphology example compile break 🐞 Bug ✓ Correctness
Description
threshold_binary() now requires an ExecutionStrategy argument, but examples/morphology still calls
the old 4-argument signature. Since examples/* are workspace members, this will break building the
workspace examples.
Code

crates/kornia-imgproc/src/threshold.rs[R44-48]

   dst: &mut Image<T, C, A2>,
   threshold: T,
   max_value: T,
+    strategy: ExecutionStrategy,
) -> Result<(), ImageError>
Evidence
The PR changed threshold_binary’s signature to include a new required strategy: ExecutionStrategy
parameter. The morphology example still calls threshold_binary with 4 parameters, which will no
longer type-check. The workspace includes examples/*, so this is a build breaker for the workspace
member.

crates/kornia-imgproc/src/threshold.rs[42-48]
examples/morphology/src/main.rs[44-51]
Cargo.toml[1-8]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`threshold_binary` now requires a new `ExecutionStrategy` argument, but `examples/morphology` still calls the old signature. This causes a compile failure for the `examples/morphology` workspace member.
### Issue Context
Other examples were updated in this PR, but `examples/morphology/src/main.rs` still uses the old 4-arg call.
### Fix Focus Areas
- examples/morphology/src/main.rs[44-51]
- crates/kornia-imgproc/src/threshold.rs[42-48]
### Suggested change
Pass a strategy argument (likely `ExecutionStrategy::Serial` for deterministic/small example runs).



Remediation recommended

4. apply() allocates temp early 📘 Rule violation ➹ Performance
Description
SeparableFilter::apply allocates the temporary buffer before checking the strategy and immediately
returning for Serial, causing avoidable allocation overhead in a hot path. This conflicts with the
compliance requirement to avoid avoidable overhead in hot code paths.
Code

crates/kornia-imgproc/src/filter/separable_filter.rs[R197-206]

+        let rows = src.rows();
+        let cols = src.cols();
+        let src_data = src.as_slice();
+        let dst_data = dst.as_slice_mut();
+        let mut temp = vec![0.0f32; src_data.len()];

-                for (&k, &off) in self.kernel_y.iter().zip(self.offsets_y.iter()) {
-                    let y = r as isize + off;
-                    if y >= 0 && y < rows as isize {
-                        let idx = y as usize * cols * C + c * C;
-                        for (ch, acc_val) in acc.iter_mut().enumerate().take(C) {
-                            *acc_val += unsafe { *temp.get_unchecked(idx + ch) } * k;
-                        }
-                    }
+        match strategy {
+            ExecutionStrategy::Serial => {
+                return self.apply_serial(src, dst);
+            }
Evidence
PR Compliance ID 15 requires avoiding avoidable overhead in hot paths. The code allocates
`temp = vec![0.0f32; src_data.len()]` before the `match strategy`, but the `Serial` branch returns
`apply_serial` without using `temp`, making the allocation unnecessary for serial execution.

crates/kornia-imgproc/src/filter/separable_filter.rs[197-206]
Best Practice: Repository guidelines

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`SeparableFilter::apply` allocates a temporary buffer even when `ExecutionStrategy::Serial` is selected, then immediately returns `apply_serial` without using that buffer. This introduces avoidable overhead.
## Issue Context
This function is part of the filtering hot path; avoiding unnecessary allocations improves performance and reduces memory pressure.
## Fix Focus Areas
- crates/kornia-imgproc/src/filter/separable_filter.rs[197-228]


5. Serial still needs Send+Sync 🐞 Bug ✓ Correctness
Description
The public separable_filter() API requires T: Send + Sync even when using ExecutionStrategy::Serial,
limiting use with non-thread-safe pixel types. This undermines the stated goal of supporting serial
execution without Send+Sync (even though separable_filter_serial exists).
Code

crates/kornia-imgproc/src/filter/separable_filter.rs[R296-303]

   dst: &mut Image<T, C, A2>,
   kernel_x: &[f32],
   kernel_y: &[f32],
+    strategy: ExecutionStrategy,
+) -> Result<(), ImageError>
+where
+    T: FloatConversion + Clone + Zero + Send + Sync,
+{
Evidence
The implementation explicitly documents that apply_serial avoids Send+Sync, and a
separable_filter_serial wrapper exists without these bounds. However, the main separable_filter
entrypoint still enforces Send+Sync for all strategies, including Serial, preventing callers from
simply selecting Serial in the strategy to work with non-Send/Sync types.

crates/kornia-imgproc/src/filter/separable_filter.rs[126-142]
crates/kornia-imgproc/src/filter/separable_filter.rs[294-303]
crates/kornia-imgproc/src/filter/separable_filter.rs[334-342]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`separable_filter()` requires `T: Send + Sync` regardless of `ExecutionStrategy`, so `ExecutionStrategy::Serial` still cannot be used with non-thread-safe pixel types. This conflicts with the goal/documentation around a serial path without these bounds.
### Issue Context
A `separable_filter_serial()` API exists and supports non-Send/Sync types, but the strategy-based API name `separable_filter()` suggests `Serial` should be sufficient.
### Fix Focus Areas
- crates/kornia-imgproc/src/filter/separable_filter.rs[294-323]
- crates/kornia-imgproc/src/filter/separable_filter.rs[324-361]
- crates/kornia-imgproc/src/filter/separable_filter.rs[126-142]
### Suggested change
Choose one of:
1) **Documentation/API clarity fix (minimal):** Update docs on `separable_filter()` to explicitly state it requires `Send + Sync` and that non-thread-safe types must call `separable_filter_serial()`.
2) **API split (stronger):** Rename the strategy-based function to something like `separable_filter_with_strategy` (Send+Sync) and keep/restore `separable_filter` as serial-only without Send+Sync.
Either path removes confusion and makes the contract clear.
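
To make the trade-off concrete, here is a small sketch of the split in option 2. It uses a per-pixel sum rather than a filter, and `LocalPixel`, `sum_serial`, and `sum_parallel` are hypothetical names. The serial function has no thread-safety bounds, so it accepts a pixel type that is neither `Send` nor `Sync`; the parallel function requires both.

```rust
use std::marker::PhantomData;
use std::thread;

/// Hypothetical pixel type that is neither Send nor Sync
/// (the raw-pointer marker opts it out of both auto traits).
#[derive(Clone, Copy)]
pub struct LocalPixel(pub f32, PhantomData<*const ()>);

impl LocalPixel {
    pub fn new(value: f32) -> Self {
        Self(value, PhantomData)
    }
}

impl From<LocalPixel> for f32 {
    fn from(p: LocalPixel) -> f32 {
        p.0
    }
}

/// Serial path: no Send + Sync bounds, so any convertible pixel type works.
pub fn sum_serial<T: Copy + Into<f32>>(pixels: &[T]) -> f32 {
    let mut acc = 0.0f32;
    for &p in pixels {
        let v: f32 = p.into();
        acc += v;
    }
    acc
}

/// Parallel path: pixels are shared across threads, so T must be Send + Sync.
pub fn sum_parallel<T: Copy + Into<f32> + Send + Sync>(pixels: &[T], threads: usize) -> f32 {
    let threads = threads.max(1);
    let chunk = ((pixels.len() + threads - 1) / threads).max(1);
    thread::scope(|s| {
        // Spawn all workers first, then join them, so they run concurrently.
        let handles: Vec<_> = pixels
            .chunks(chunk)
            .map(|c| s.spawn(move || sum_serial(c)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<f32>()
    })
}
```

Calling `sum_parallel` with a `&[LocalPixel]` is rejected at compile time, whereas `sum_serial` accepts it. A single strategy-taking entry point cannot offer that distinction, because its bounds must cover the most demanding strategy.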


Comment on lines +760 to +774
let img = Image::<f32, 1, _>::from_size_val(size, 0.5, CpuAllocator).unwrap();
let mut dst = Image::<f32, 1, _>::from_size_val(size, 0.0, CpuAllocator).unwrap();
let kernel = vec![1.0];

// Fixed(0) should error
let result = separable_filter(
&img,
&mut dst,
&kernel,
&kernel,
ExecutionStrategy::Fixed(0),
);
assert!(result.is_err());
let err_msg = result.unwrap_err().to_string();
assert!(

Action required

1. test_fixed_threadpool_validation unwraps 📘 Rule violation ⛯ Reliability

The newly added test uses unwrap() and unwrap_err() on fallible image creation and error
extraction, which can panic and obscures error handling intent. This violates the compliance
requirement to avoid unchecked assumptions and propagate/handle errors using Result/?.
Agent Prompt
## Issue description
The new test `test_fixed_threadpool_validation` uses `unwrap()`/`unwrap_err()` on fallible calls, which can panic and violates the no-unwrap compliance rule.

## Issue Context
This test already uses assertions on `Result`; it can be made panic-free by returning `Result` and matching on `Err(e)`.

## Fix Focus Areas
- crates/kornia-imgproc/src/filter/separable_filter.rs[755-778]

Comment on lines +230 to +241
ExecutionStrategy::AutoRows(_) => {
// Horizontal
temp.par_chunks_mut(cols * C)
.enumerate()
.for_each(|(r, row)| run_horizontal!(self, r, row, src_data, cols, C));

// Vertical
dst_data
.par_chunks_mut(cols * C)
.enumerate()
.for_each(|(r, row)| run_vertical!(self, r, row, temp, rows, cols, C, T));
}

Action required

2. AutoRows stride unused 📘 Rule violation ✓ Correctness

In SeparableFilter::apply, the ExecutionStrategy::AutoRows(stride) parameter is ignored (_),
so callers cannot control/validate row stride as documented. This creates misleading behavior and
misses validation of boundary values like stride == 0.
Agent Prompt
## Issue description
`ExecutionStrategy::AutoRows(stride)` is implemented as `AutoRows(_)` in `SeparableFilter::apply`, so the provided `stride` is ignored and `stride == 0` is not validated/handled. This makes the API misleading and breaks boundary/edge-case expectations.

## Issue Context
`ExecutionStrategy::AutoRows(usize)` is documented as requiring a row stride. The implementation should either (a) honor the stride, or (b) redefine the variant to not accept a stride and update call sites/tests accordingly.

## Fix Focus Areas
- crates/kornia-imgproc/src/filter/separable_filter.rs[203-241]

Comment on lines 44 to 48
dst: &mut Image<T, C, A2>,
threshold: T,
max_value: T,
strategy: ExecutionStrategy,
) -> Result<(), ImageError>

Action required

3. Morphology example compile break 🐞 Bug ✓ Correctness

threshold_binary() now requires an ExecutionStrategy argument, but examples/morphology still calls
the old 4-argument signature. Since examples/* are workspace members, this will break building the
workspace examples.
Agent Prompt
### Issue description
`threshold_binary` now requires a new `ExecutionStrategy` argument, but `examples/morphology` still calls the old signature. This causes a compile failure for the `examples/morphology` workspace member.

### Issue Context
Other examples were updated in this PR, but `examples/morphology/src/main.rs` still uses the old 4-arg call.

### Fix Focus Areas
- examples/morphology/src/main.rs[44-51]
- crates/kornia-imgproc/src/threshold.rs[42-48]

### Suggested change
Pass a strategy argument (likely `ExecutionStrategy::Serial` for deterministic/small example runs), e.g.:
```rust
use kornia::imgproc::parallel::ExecutionStrategy;
...
threshold::threshold_binary(&gray_single, &mut binary, 128u8, 255u8, ExecutionStrategy::Serial)?;
```

@AdityaShome
Contributor Author

Execution Strategy Performance Comparison:

Small Image (256x224), Kernel Size 5:

| Strategy | Time (µs) | vs Serial | Speedup |
| --- | --- | --- | --- |
| Serial | 858 | baseline | - |
| Parallel | 790 | 0.92x | 1.09x faster |
| AutoRows | 283 | 0.33x | 3.03x faster |
| Fixed(4) | 366 | 0.43x | 2.34x faster |

Medium Image (512x448), Kernel Size 5:

| Strategy | Time (µs) | vs Serial | Speedup |
| --- | --- | --- | --- |
| Serial | 3,515 | baseline | - |
| Parallel | 2,967 | 0.84x | 1.18x faster |
| AutoRows | 941 | 0.27x | 3.74x faster |
| Fixed(4) | 1,110 | 0.32x | 3.17x faster |

Medium Image (512x448), Kernel Size 11:

| Strategy | Time (µs) | vs Serial | Speedup |
| --- | --- | --- | --- |
| Serial | 6,500 | baseline | - |
| Parallel | 4,071 | 0.63x | 1.60x faster |
| AutoRows | 1,672 | 0.26x | 3.89x faster |
| Fixed(4) | 1,954 | 0.30x | 3.33x faster |

Large Image (1024x896), Kernel Size 17:

| Strategy | Time (ms) | vs Serial | Speedup |
| --- | --- | --- | --- |
| Serial | 63.0 | baseline | - |
| Parallel | 28.4 | 0.45x | 2.22x faster |
| AutoRows | 11.2 | 0.18x | 5.61x faster |
| Fixed(4) | 16.3 | 0.26x | 3.87x faster |

AutoRows is the fastest strategy in all four configurations above, reaching a 5.6x speedup over Serial on the large image.

Note: AutoRows uses 1 row per parallel chunk for optimal granularity.
Benchmarks were run on Gaussian blur using the separable filter implementation.
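
For reference, the "vs Serial" and "Speedup" figures in these results are reciprocal ratios of the measured times. A tiny sketch with hypothetical helper names:

```rust
/// Strategy time as a fraction of the serial time (the "vs Serial" column).
pub fn relative_time(serial: f64, strategy: f64) -> f64 {
    strategy / serial
}

/// How many times faster the strategy is than serial (the "Speedup" column).
pub fn speedup(serial: f64, strategy: f64) -> f64 {
    serial / strategy
}
```

For the small image, `relative_time(858.0, 283.0)` is about 0.33 and `speedup(858.0, 283.0)` is about 3.03, matching the AutoRows row.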

Member

@edgarriba edgarriba left a comment

good progress on the execution strategy for separable filters.

some concerns:

  1. unsafe everywhere: the macros run_horizontal! and run_vertical! use get_unchecked — i understand the perf motivation but we should minimize unsafe. can you benchmark with bounds-checked access first? if the difference is negligible, prefer safe code. if it matters, at least add // SAFETY: comments explaining the invariants.

  2. wide dependency: i see wide 1.1.1 added to Cargo.lock — is this actually used in your changes? i don't see it in the separable filter code. if it's from another PR leaking into your branch, rebase.

  3. macro duplication: run_horizontal! and run_vertical! have very similar structure. consider a single generic approach or at least extracting the inner loop.

  4. benchmarks: good that you added strategy benchmarks. can you share actual numbers? specifically: for a 1920x1080 image, what's the speedup of ParallelElements vs Serial for gaussian_blur?

  5. the apply_serial vs apply split is clean. good that serial doesn't require Send + Sync.

  6. the _with_strategy suffix pattern works but makes the API a bit verbose. consider making strategy the default parameter and having the no-strategy version as the convenience wrapper (which you did — that's fine).
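
On point 1, a minimal sketch of what the two variants could look like for one row of a 1D convolution, so they can be benchmarked side by side. The function names are hypothetical, and it assumes f32 pixels and zero padding rather than the PR's actual macros.

```rust
/// Bounds-checked 1D convolution of one row (zero padding at the borders).
pub fn convolve_row_safe(src: &[f32], dst: &mut [f32], kernel: &[f32]) {
    assert_eq!(src.len(), dst.len());
    let half = (kernel.len() / 2) as isize;
    for (c, out) in dst.iter_mut().enumerate() {
        let mut acc = 0.0f32;
        for (i, &k) in kernel.iter().enumerate() {
            let x = c as isize + i as isize - half;
            if x >= 0 && (x as usize) < src.len() {
                acc += src[x as usize] * k;
            }
        }
        *out = acc;
    }
}

/// Same computation with unchecked reads and a documented invariant.
pub fn convolve_row_unchecked(src: &[f32], dst: &mut [f32], kernel: &[f32]) {
    assert_eq!(src.len(), dst.len());
    let half = (kernel.len() / 2) as isize;
    for (c, out) in dst.iter_mut().enumerate() {
        let mut acc = 0.0f32;
        for (i, &k) in kernel.iter().enumerate() {
            let x = c as isize + i as isize - half;
            if x >= 0 && (x as usize) < src.len() {
                // SAFETY: the branch condition guarantees 0 <= x < src.len(),
                // so `x as usize` is a valid index into `src`.
                acc += unsafe { *src.get_unchecked(x as usize) } * k;
            }
        }
        *out = acc;
    }
}
```

In a loop shaped like this one the explicit range test sits right next to the access, so the compiler may already remove the bounds check from the safe version; whether that holds for the real macros is something only the benchmark can show.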

@sidd-27 can you cross-review the parallel execution parts?

@AdityaShome
Contributor Author

Benchmark for 1920x1080: strategy comparison (kernel size 5) at full HD resolution.

| Strategy | Unsafe (Fast) | Safe (Slow) | Regression |
| --- | --- | --- | --- |
| Fixed (8) | 8.5 ms | 9.2 ms | +8.2% |
| Parallel | 19.8 ms | 21.9 ms | +10.6% |
| AutoRows | 26.8 ms | 36.9 ms | +37.7% |
| Serial | 33.8 ms | 37.0 ms | +9.5% |

I have kept unsafe indexing for now for performance, but will switch to safe indexing if required.
Could you please take another look?

@AdityaShome
Contributor Author

AdityaShome commented Feb 26, 2026

@sidd-27 could you take a look? I ran the benchmarks for all 4 strategies and they give similar results. Once this is approved, I can extend the approach to other ops.
I think this PR is getting large enough that adding more loosely related changes would be risky, so handling the remaining ops and fixes in a follow-up might be a better idea. Thoughts?

@AdityaShome AdityaShome requested a review from sidd-27 February 28, 2026 01:22
@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs within 7 days. Thank you for your contributions!

@github-actions github-actions bot added the stale label Mar 15, 2026