Conversation

@mattsu2020
Contributor

Overview

- Disable SIMD-accelerated paths when `GLIBC_TUNABLES` removes AVX/AVX512, so `wc` falls back to the naive counters.
- Add a hidden `--debug` flag whose output reports whether hardware acceleration is active, disabled by tunables, or unavailable at runtime.
- Cache SIMD policy decisions and reuse them within the fast-path code to avoid repeated environment parsing.
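
As a rough illustration of the first point, the sketch below parses a `GLIBC_TUNABLES` value and decides whether SIMD should stay enabled. The function names `disabled_hwcaps` and `simd_allowed` are hypothetical and not taken from this PR; the rule "any masked AVX2 or AVX512 feature turns SIMD off" is an assumption based on the description above.

```rust
use std::collections::BTreeSet;

/// Collect the CPU features that a GLIBC_TUNABLES value switches off.
/// Tunables are separated by ':' and hwcaps entries by ','; a leading '-'
/// marks a feature as disabled.
fn disabled_hwcaps(tunables: &str) -> BTreeSet<String> {
    let mut disabled = BTreeSet::new();
    for tunable in tunables.split(':') {
        // Only the glibc.cpu.hwcaps tunable matters for this decision.
        let Some(list) = tunable.trim().strip_prefix("glibc.cpu.hwcaps=") else {
            continue;
        };
        for entry in list.split(',') {
            if let Some(name) = entry.trim().strip_prefix('-') {
                if !name.is_empty() {
                    disabled.insert(name.to_string());
                }
            }
        }
    }
    disabled
}

/// Hypothetical policy check (an assumption, not the PR's code): SIMD is
/// refused as soon as AVX2 or any AVX512 feature has been masked out.
fn simd_allowed(tunables: Option<&str>) -> bool {
    let disabled = tunables.map(disabled_hwcaps).unwrap_or_default();
    !disabled
        .iter()
        .any(|f| f == "AVX2" || f.starts_with("AVX512"))
}
```

A caller would typically pass `std::env::var("GLIBC_TUNABLES").ok().as_deref()` to `simd_allowed`; taking the value as a parameter keeps the parsing easy to test without touching the process environment.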

Testing

- `cargo test -p uu_wc`
- `cargo clippy -p uu_wc -- -D warnings`
- Spot-check `wc -l` with and without `GLIBC_TUNABLES='glibc.cpu.hwcaps=-AVX2,-AVX512F'`

Add CPU feature detection to conditionally use SIMD-accelerated bytecount functions for chars and lines, falling back to naive methods when SIMD is disabled by environment or unavailable on CPU. Introduce a hidden --debug flag to output SIMD status information, aiding in troubleshooting performance issues.
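
The fallback described in this commit can be sketched with the standard library alone. The real code calls into the `bytecount` crate; here `fast_count_lines` is only a hypothetical stand-in for an accelerated path, and `count_lines` is an assumed name for the dispatching wrapper.

```rust
/// Naive line counter: one pass over the bytes, no SIMD.
fn naive_count_lines(data: &[u8]) -> usize {
    data.iter().filter(|&&b| b == b'\n').count()
}

/// Naive UTF-8 character counter: every byte that is not a continuation
/// byte (bit pattern 10xxxxxx) starts a new character.
fn naive_count_chars(data: &[u8]) -> usize {
    data.iter().filter(|&&b| (b & 0xC0) != 0x80).count()
}

/// Hypothetical stand-in for an accelerated counter. It walks the input in
/// fixed-size chunks so that it is a distinct code path, but it uses no
/// real SIMD instructions.
fn fast_count_lines(data: &[u8]) -> usize {
    data.chunks(64)
        .map(|chunk| chunk.iter().filter(|&&b| b == b'\n').count())
        .sum()
}

/// Dispatch on the policy decision: both paths must return the same count.
fn count_lines(data: &[u8], simd_allowed: bool) -> usize {
    if simd_allowed {
        fast_count_lines(data)
    } else {
        naive_count_lines(data)
    }
}
```

Whatever the policy decides, the observable result has to be identical, which is the property the regression test added later in this PR checks.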
…lity

Swap the if-else blocks in the wc function's debug output to check for SIMD allowance first, improving code flow without changing functionality.
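
A minimal sketch of the reordered branch, checking SIMD allowance first. The `SimdStatus` struct and `debug_line` function are hypothetical. Only the "unavailable on this CPU" message appears in this PR's diff; the other two message texts are illustrative placeholders.

```rust
/// Hypothetical summary of a SIMD policy decision.
struct SimdStatus {
    allowed: bool,
    enabled: Vec<&'static str>,
    disabled: Vec<&'static str>,
}

/// Build the debug line, testing the "allowed" case first.
fn debug_line(status: &SimdStatus) -> String {
    if status.allowed {
        if status.enabled.is_empty() {
            // Message text as quoted in the review excerpt of this PR.
            "wc: debug: hardware support unavailable on this CPU".to_string()
        } else {
            // Illustrative wording, not confirmed by the source.
            format!(
                "wc: debug: using hardware support (features: {})",
                status.enabled.join(", ")
            )
        }
    } else {
        // Illustrative wording, not confirmed by the source.
        format!(
            "wc: debug: hardware support disabled by GLIBC_TUNABLES ({})",
            status.disabled.join(", ")
        )
    }
}
```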
- Add asimd, ASIMD, hwcaps, tunables, TUNABLES to .vscode/cspell.dictionaries/jargon.wordlist.txt
- Prevents spellchecker from flagging valid technical terms used in the codebase
@mattsu2020 changed the title from "fix(wc):GNUwc-cpu.sh" to "fix(wc):GNU wc-cpu.sh" on Nov 4, 2025
@codspeed-hq

codspeed-hq bot commented Nov 4, 2025

CodSpeed Performance Report

Merging #9144 will not alter performance

Comparing mattsu2020:wc_compatibility (1a25344) with main (b2feb82)

Summary

✅ 126 untouched
⏩ 6 skipped [1]

Footnotes

  1. 6 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@github-actions

github-actions bot commented Nov 4, 2025

GNU testsuite comparison:

Congrats! The gnu test tests/wc/wc-cpu is no longer failing!

naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Nov 14, 2025
Add shared CPU hardware capability detection in uucore to prevent
code duplication across utilities. This provides a unified interface
for detecting CPU features (AVX512, AVX2, PCLMUL, SSE2, ASIMD) and
respecting GLIBC_TUNABLES environment variable.

This unblocks PR uutils#9088 (cksum --debug) and PR uutils#9144 (wc --debug) by
providing a common implementation that both utilities can use.

Features:
- CPU feature detection with caching (singleton pattern)
- GLIBC_TUNABLES parsing for hwcaps restrictions
- Cross-platform support (x86/x86_64, aarch64)
- Comprehensive test coverage
- Zero-cost abstractions using std::arch

Implementation details:
- Uses std::arch feature detection (no external deps for detection)
- Adds cfg-if dependency for conditional compilation
- Feature-gated behind "hardware" feature flag
- Android excluded (no CPUID access in sandboxed environment)

Related: uutils#9088, uutils#9144

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
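
The "detection with caching" idea from this commit message can be sketched as follows. This is not the `uucore` implementation: `detect_features` and `cached_features` are hypothetical names, and `std::sync::OnceLock` is an assumed way of realizing the singleton. Only the `std::arch` detection macros are standard-library facts.

```rust
use std::sync::OnceLock;

/// Probe the running CPU using only the std::arch detection macros.
#[allow(unused_mut)]
fn detect_features() -> Vec<&'static str> {
    let mut found = Vec::new();
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx512f") {
            found.push("AVX512F");
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            found.push("AVX2");
        }
        if std::arch::is_x86_feature_detected!("pclmulqdq") {
            found.push("PCLMUL");
        }
        if std::arch::is_x86_feature_detected!("sse2") {
            found.push("SSE2");
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            found.push("ASIMD");
        }
    }
    found
}

/// Process-wide cache: the probe runs at most once.
static FEATURES: OnceLock<Vec<&'static str>> = OnceLock::new();

/// Return the cached feature list, detecting on first use.
fn cached_features() -> &'static [&'static str] {
    FEATURES.get_or_init(detect_features).as_slice()
}
```

On architectures other than x86/x86_64 and aarch64 the list is simply empty, which a caller could report as hardware support being unavailable.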
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Nov 15, 2025
RenjiSann pushed a commit to naoNao89/coreutils that referenced this pull request Nov 24, 2025
RenjiSann pushed a commit that referenced this pull request Nov 24, 2025
mattsu2020 and others added 3 commits November 24, 2025 21:06
- Add "hardware" feature to uucore dependency in Cargo.toml
- Replace local cpu_features module with uucore::hardware::simd_policy
- Update debug output to use enabled_features() method for consistency
Reorder the imports within the `use uucore::{ ... };` block for consistency and better readability. The previous order had `hardware::simd_policy` misplaced; it is now sorted alphabetically (e.g., error, format_usage, hardware, parser, etc.). No functional changes were made.
@github-actions

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/wc/wc-cpu is no longer failing!

Reorder [dependencies] and [target.'cfg(unix)'.dependencies] sections in src/uu/wc/Cargo.toml to follow alphabetical order, improving file organization and consistency without changing functionality.
@github-actions

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/wc/wc-cpu is no longer failing!

naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Nov 24, 2025
@RenjiSann
Collaborator

Can you add a test to avoid regressions in the future?

Add a new Unix-specific test to ensure that SIMD optimizations in `wc` correctly respect GLIBC_TUNABLES, disabling SIMD paths for AVX2/AVX512 when specified. The test verifies both debug output accuracy and functional equivalence of line count results with and without these tunables, addressing potential correctness issues when hardware acceleration is restricted.
@github-actions

GNU testsuite comparison:

Congrats! The gnu test tests/wc/wc-cpu is no longer failing!

Use `content.clone()` for the first two `pipe_in` calls to avoid moving the string, and `content` directly for the third to consume it, ensuring proper Rust ownership handling in the test. This prevents borrow checker errors and allows the test to compile correctly.
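
The ownership pattern described here can be shown in isolation. In this sketch `pipe_in` is a hypothetical stand-in that takes its input by value; the signature of the real test helper is not shown in this conversation.

```rust
/// Hypothetical stand-in for a helper that consumes its input.
/// It returns the number of lines so the calls have a visible result.
fn pipe_in(input: String) -> usize {
    input.lines().count()
}

/// Feed the same content to three consumers.
fn run_three_times(content: String) -> [usize; 3] {
    // The first two calls get their own copy, so `content` stays usable.
    let first = pipe_in(content.clone());
    let second = pipe_in(content.clone());
    // The last call can take ownership: no further use, so no clone needed.
    let third = pipe_in(content);
    [first, second, third]
}
```

Passing `content` by value in the first call instead would move the string, and the later uses would be rejected by the compiler as a use after move.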
@sylvestre
Contributor

job fails with:
ERROR taplo:format_files: the file is not properly formatted path="/home/runner/work/coreutils/coreutils/src/uu/wc/Cargo.toml"

@github-actions

GNU testsuite comparison:

Congrats! The gnu test tests/wc/wc-cpu is no longer failing!

RenjiSann pushed a commit to RenjiSann/coreutils that referenced this pull request Nov 28, 2025
martinkunkel2 pushed a commit to martinkunkel2/coreutils that referenced this pull request Nov 30, 2025
Reformat the uucore dependency features in src/uu/wc/Cargo.toml from an inline array to a multiline array with proper indentation for improved readability and consistency.

No functional changes; this is solely a styling update.
…e allocations

Use std::fmt::Write and fold with writeln! to build content string efficiently,
reducing memory allocations in test_simd_respects_glibc_tunables for better performance.
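
A minimal sketch of that technique, assuming a hypothetical `build_content` helper and an arbitrary line format (the actual test content is not shown in this conversation):

```rust
use std::fmt::Write;

/// Build a multi-line string by appending to one accumulator, instead of
/// allocating a temporary String per line and joining them afterwards.
fn build_content(lines: usize) -> String {
    (0..lines).fold(String::new(), |mut acc, i| {
        // Writing into a String cannot fail, so unwrap is safe here.
        writeln!(acc, "line {i}").unwrap();
        acc
    })
}
```

For example, `build_content(3)` yields three newline-terminated lines, and each `writeln!` writes straight into the accumulator's buffer.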
@github-actions

github-actions bot commented Dec 1, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/wc/wc-cpu is no longer failing!

if policy.allows_simd() {
    let enabled = policy.enabled_features();
    if enabled.is_empty() {
        eprintln!("wc: debug: hardware support unavailable on this CPU");
Contributor

Please use the `translate!` macro.

Add debug messages for hardware acceleration in English and French locales, and update Rust code to use translate! macro instead of hardcoded strings. This ensures debug output is language-agnostic and translatable.
@github-actions

github-actions bot commented Dec 1, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/wc/wc-cpu is no longer failing!
