fix(wc):GNU wc-cpu.sh #9144
Add CPU feature detection to conditionally use SIMD-accelerated bytecount functions for chars and lines, falling back to naive methods when SIMD is disabled by environment or unavailable on CPU. Introduce a hidden --debug flag to output SIMD status information, aiding in troubleshooting performance issues.
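The dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `cpu_has_simd`, `count_naive`, `count_fast`, and `count_newlines` are hypothetical names, and `count_fast` is only a stand-in for the SIMD-accelerated bytecount call so that the example stays dependency-free.

```rust
// Hypothetical helper names; the PR's real functions may differ.

/// Runtime CPU check on x86/x86_64 using the standard library's detection macro.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn cpu_has_simd() -> bool {
    std::arch::is_x86_feature_detected!("avx2") || std::arch::is_x86_feature_detected!("sse2")
}

/// Runtime CPU check on aarch64 (Rust names the ASIMD feature "neon").
#[cfg(target_arch = "aarch64")]
fn cpu_has_simd() -> bool {
    std::arch::is_aarch64_feature_detected!("neon")
}

/// Other architectures: assume no accelerated path.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64")))]
fn cpu_has_simd() -> bool {
    false
}

/// Naive fallback: inspect every byte.
fn count_naive(data: &[u8]) -> usize {
    data.iter().filter(|&&b| b == b'\n').count()
}

/// Stand-in for the accelerated path (assumption: real code would call a SIMD routine here).
fn count_fast(data: &[u8]) -> usize {
    data.chunks(32)
        .map(|chunk| chunk.iter().filter(|&&b| b == b'\n').count())
        .sum()
}

/// Use the fast path only when the environment allows it and the CPU supports it.
fn count_newlines(data: &[u8], env_allows_simd: bool) -> usize {
    if env_allows_simd && cpu_has_simd() {
        count_fast(data)
    } else {
        count_naive(data)
    }
}
```

Both branches must return the same count; only the speed should differ.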
…lity Swap the if-else blocks in the wc function's debug output to check for SIMD allowance first, improving code flow without changing functionality.
- Add asimd, ASIMD, hwcaps, tunables, TUNABLES to .vscode/cspell.dictionaries/jargon.wordlist.txt
- Prevents spellchecker from flagging valid technical terms used in the codebase
CodSpeed Performance Report: Merging #9144 will not alter performance.
Co-authored-by: Dorian Péron <[email protected]>
Add shared CPU hardware capability detection in uucore to prevent code duplication across utilities. This provides a unified interface for detecting CPU features (AVX512, AVX2, PCLMUL, SSE2, ASIMD) and respecting GLIBC_TUNABLES environment variable.

This unblocks PR uutils#9088 (cksum --debug) and PR uutils#9144 (wc --debug) by providing a common implementation that both utilities can use.

Features:
- CPU feature detection with caching (singleton pattern)
- GLIBC_TUNABLES parsing for hwcaps restrictions
- Cross-platform support (x86/x86_64, aarch64)
- Comprehensive test coverage
- Zero-cost abstractions using std::arch

Implementation details:
- Uses std::arch feature detection (no external deps for detection)
- Adds cfg-if dependency for conditional compilation
- Feature-gated behind "hardware" feature flag
- Android excluded (no CPUID access in sandboxed environment)

Related: uutils#9088, uutils#9144

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
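As a rough sketch of the GLIBC_TUNABLES parsing and the cached singleton mentioned in this commit message: the function and static names below are hypothetical and are not the `uucore::hardware` API. The sketch assumes the documented glibc format, where tunables are colon-separated `name=value` pairs and `glibc.cpu.hwcaps` takes a comma-separated list in which a leading `-` disables a feature.

```rust
use std::collections::HashSet;
use std::sync::OnceLock;

/// Collect the features switched off via `glibc.cpu.hwcaps=-FEATURE,...`.
/// Hypothetical helper, shown only to illustrate the parsing step.
fn disabled_features(tunables: &str) -> HashSet<String> {
    let mut disabled = HashSet::new();
    // Tunables are colon-separated `name=value` pairs.
    for entry in tunables.split(':') {
        let Some((name, value)) = entry.split_once('=') else {
            continue;
        };
        if name.trim() != "glibc.cpu.hwcaps" {
            continue;
        }
        // Within hwcaps, a leading '-' marks a feature as disabled.
        for token in value.split(',') {
            if let Some(feature) = token.trim().strip_prefix('-') {
                disabled.insert(feature.to_ascii_uppercase());
            }
        }
    }
    disabled
}

/// Process-wide cache so the environment is read and parsed at most once.
static DISABLED: OnceLock<HashSet<String>> = OnceLock::new();

fn cached_disabled_features() -> &'static HashSet<String> {
    DISABLED.get_or_init(|| {
        std::env::var("GLIBC_TUNABLES")
            .map(|value| disabled_features(&value))
            .unwrap_or_default()
    })
}
```

`OnceLock` keeps the first computed result for the lifetime of the process, so later changes to the environment variable are deliberately ignored.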
- Add "hardware" feature to uucore dependency in Cargo.toml
- Replace local cpu_features module with uucore::hardware::simd_policy
- Update debug output to use enabled_features() method for consistency
Reorder the imports within the `use uucore::{ ... };` block for consistency and better readability. The previous order had `hardware::simd_policy` misplaced; it is now sorted alphabetically (e.g., error, format_usage, hardware, parser, etc.). No functional changes were made.
Reorder [dependencies] and [target.'cfg(unix)'.dependencies] sections in src/uu/wc/Cargo.toml to follow alphabetical order, improving file organization and consistency without changing functionality.
Can you add a test to avoid regressions in the future?
Add a new Unix-specific test to ensure that SIMD optimizations in `wc` correctly respect GLIBC_TUNABLES, disabling SIMD paths for AVX2/AVX512 when specified. The test verifies both debug output accuracy and functional equivalence of line count results with and without these tunables, addressing potential correctness issues when hardware acceleration is restricted.
Use `content.clone()` for the first two `pipe_in` calls to avoid moving the string, and `content` directly for the third to consume it, ensuring proper Rust ownership handling in the test. This prevents borrow checker errors and allows the test to compile correctly.
job fails with:
Reformat the uucore dependency features in src/uu/wc/Cargo.toml from an inline array to a multiline array with proper indentation for improved readability and consistency. No functional changes; this is solely a styling update.
…e allocations Use std::fmt::Write and fold with writeln! to build content string efficiently, reducing memory allocations in test_simd_respects_glibc_tunables for better performance.
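A minimal sketch of the `fold` plus `writeln!` pattern this commit describes. The function name `build_lines` and the line contents are assumptions for illustration; the test's actual content is not shown on this page.

```rust
use std::fmt::Write;

/// Build `n` newline-terminated lines in a single String.
/// Hypothetical helper; the real test builds its own content.
fn build_lines(n: usize) -> String {
    (0..n).fold(String::new(), |mut acc, i| {
        // Appends straight into the accumulator instead of creating
        // a temporary String per line.
        writeln!(acc, "line {i}").expect("writing to a String does not fail");
        acc
    })
}
```

Compared with `map(|i| format!(...)).collect::<String>()`, this avoids allocating one intermediate `String` for every line.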
src/uu/wc/src/wc.rs
Outdated
    if policy.allows_simd() {
        let enabled = policy.enabled_features();
        if enabled.is_empty() {
            eprintln!("wc: debug: hardware support unavailable on this CPU");
please use the translate! macro
Add debug messages for hardware acceleration in English and French locales, and update Rust code to use translate! macro instead of hardcoded strings. This ensures debug output is language-agnostic and translatable.
Overview
- Disable SIMD-accelerated paths when GLIBC_TUNABLES removes AVX2/AVX512 so wc falls back to the naive counters.
- Add hidden --debug flag output that reports whether hardware acceleration is active, disabled by tunables, or unavailable at runtime.
- Cache SIMD policy decisions and reuse them within the fast path code to avoid repeated environment parsing.
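The three outcomes the --debug flag reports could be modelled as below. This is a sketch under stated assumptions: `SimdStatus` and `simd_status` are hypothetical names, and the two boolean-style inputs stand in for what `policy.allows_simd()` and `policy.enabled_features()` return in the reviewed code. The wording of the actual debug messages is not reproduced here.

```rust
/// Hypothetical model of the three states the debug output distinguishes.
#[derive(Debug, PartialEq)]
enum SimdStatus {
    /// Acceleration is in use, with the listed CPU features.
    Active(Vec<String>),
    /// GLIBC_TUNABLES switched the accelerated paths off.
    DisabledByTunables,
    /// The environment allows SIMD, but the CPU reports no usable feature.
    Unavailable,
}

/// Check the environment policy first, then the detected feature list.
fn simd_status(allows_simd: bool, enabled: &[&str]) -> SimdStatus {
    if !allows_simd {
        SimdStatus::DisabledByTunables
    } else if enabled.is_empty() {
        SimdStatus::Unavailable
    } else {
        SimdStatus::Active(enabled.iter().map(|f| f.to_string()).collect())
    }
}
```

Keeping the decision in one pure function makes each state easy to cover in a unit test without touching the process environment.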
Testing
- `cargo test -p uu_wc`
- `cargo clippy -p uu_wc -- -D warnings`
- Spot-check `wc -l` with and without `GLIBC_TUNABLES='glibc.cpu.hwcaps=-AVX2,-AVX512F'`