
Conversation

@loci-dev

Mirrored from ggml-org/llama.cpp#17553

Fix gguf_new_metadata.py for non-native endian files

gguf_new_metadata.py reads tensor data through the reader, and
the reader does not byteswap tensors to native endianness.
The writer, however, expects tensors in native endianness so that
it can convert them to the requested endianness.

There are two ways to fix this: update the reader to convert tensors
to native endianness (and back on write), or skip the endianness
conversion in the writer for this particular use case.

The second approach is taken here, skipping conversions that are
redundant in this case.
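The failure mode can be demonstrated with plain NumPy (a minimal sketch with toy data, not the gguf-py API):

```python
import numpy as np

# A float32 tensor as stored in a big-endian GGUF file.
file_bytes = np.array([1.0, 2.0, 3.0], dtype=">f4").tobytes()

# The reader hands the bytes back unswapped, so on a little-endian host
# the buffer is still big-endian even though the dtype claims native order.
tensor = np.frombuffer(file_bytes, dtype=np.float32)

# Writer's usual path: byteswap "native" data to emit a big-endian file.
# Applied to already-big-endian bytes, the extra swap corrupts the data.
assert tensor.byteswap().tobytes() != file_bytes

# The fix for this use case: pass the bytes through untouched.
assert tensor.tobytes() == file_bytes
```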

Fix gguf_editor_gui.py for non-native endian files

Since the GUI doesn't allow viewing or editing tensor data,
simply skip byteswapping when writing data back to the file.

If the capability to view or edit tensor data is eventually added,
tensor data should instead be byteswapped when reading it.
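If that reader-side fix is ever needed, it would look roughly like this (a sketch assuming float32 data; `to_native` is a hypothetical helper, not part of gguf-py):

```python
import numpy as np

def to_native(buf: bytes, file_order: str) -> np.ndarray:
    """Interpret raw tensor bytes in the file's byte order ('<' or '>')
    and return a native-endian array, byteswapping only when needed."""
    arr = np.frombuffer(buf, dtype=np.dtype(np.float32).newbyteorder(file_order))
    return arr.astype(np.float32)  # astype to native dtype swaps if needed

# Big-endian bytes come back as correct native-endian values on any host.
big_endian_bytes = np.array([1.0, 2.0], dtype=">f4").tobytes()
assert to_native(big_endian_bytes, ">").tolist() == [1.0, 2.0]
```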

These changes can be verified even on little-endian systems.
Two copies of a file are needed:
the first copy is modified and then byteswapped to non-native endianness;
the second copy is byteswapped to non-native endianness first and then modified in the same way.
After these steps both copies should be byte-for-byte identical.
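The recipe works because metadata edits and tensor byteswaps touch disjoint parts of the file, so the two operations commute. A toy model of that argument (not the real GGUF layout):

```python
import numpy as np

# Toy "file": a metadata dict plus one float32 tensor.
def edit_metadata(meta, tensor):
    return {**meta, "general.name": "edited"}, tensor

def byteswap_tensors(meta, tensor):
    return meta, tensor.byteswap()

meta = {"general.name": "original"}
tensor = np.array([1.0, 2.0], dtype=np.float32)

# Order 1: edit first, then byteswap.  Order 2: byteswap, then edit.
m1, t1 = byteswap_tensors(*edit_metadata(meta, tensor))
m2, t2 = edit_metadata(*byteswap_tensors(meta, tensor))

# Both orders yield identical results -- the property the check relies on.
assert m1 == m2 and t1.tobytes() == t2.tobytes()
```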

@loci-agentic-ai

Explore the complete analysis inside the Version Insights

Performance Analysis Summary: PR #348

Analysis Overview

PR #348 introduces correctness fixes for endianness handling in Python GGUF utilities (gguf_writer.py, gguf_editor_gui.py, gguf_new_metadata.py). Performance analysis shows 0.0% change across all 16 binaries. No C++ inference code was modified.

Condition Assessment: Condition 1 applies: no performance metric changes detected.

Performance Impact

Binary Analysis:
All binaries show zero measurable change in power consumption:

  • libllama.so: 190,887 nJ (0.0% change)
  • libggml.so: 4,031 nJ (0.0% change)
  • libggml-cpu.so: 115,243 nJ (0.0% change)
  • All CLI tools: 0.0% change

Inference Performance:
No impact on tokens per second. Changes are isolated to Python metadata manipulation utilities and do not affect inference functions (llama_decode, llama_encode, llama_tokenize).


Conclusion: This PR fixes data corruption in non-native endian GGUF file handling without affecting runtime performance.

@loci-dev loci-dev force-pushed the upstream-PR17553-branch_AlekseiNikiforovIBM-s390x_modifying_scripts branch from 479cda8 to e3bd936 Compare November 28, 2025 14:37
@loci-agentic-ai

Explore the complete analysis inside the Version Insights

Performance Analysis Summary - PR #348

Analysis Overview

PR #348 introduces endianness handling fixes in Python utility scripts (gguf_writer.py, gguf_editor_gui.py, gguf_new_metadata.py). Performance analysis confirms zero impact on compiled binaries. All 16 analyzed binaries show 0.0% power consumption change. No function-level performance deltas detected between versions 1e15ddf2-b59d-49cc-b843-9f0661a545a2 and 04c91645-829d-48f6-9e12-61790e0ebdc2.

The changes modify Python-only code paths for GGUF file manipulation, adding explicit tensor_endianess parameters to prevent data corruption during cross-endian workflows. No modifications to C++ inference engine, model loading, tokenization, or batch processing components.
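A sketch of what such an explicit endianness parameter might look like (names and signature hypothetical, following the `tensor_endianess` parameter mentioned above; not the actual gguf-py API):

```python
import sys
import numpy as np

def pack_tensor(tensor: np.ndarray, file_order: str, tensor_order: str = "=") -> bytes:
    """Return tensor bytes in file_order ('<' or '>'), given tensor_order,
    the byte order the data is already in ('=' means native)."""
    native = "<" if sys.byteorder == "little" else ">"
    have = native if tensor_order in ("=", "|") else tensor_order
    want = native if file_order == "=" else file_order
    # Swap only when the current and requested orders actually differ.
    return tensor.tobytes() if have == want else tensor.byteswap().tobytes()

# Already-big-endian data bound for a big-endian file passes through:
be = np.array([1.0], dtype=">f4")
assert pack_tensor(be, ">", tensor_order=">") == be.tobytes()

# Native little-endian data bound for a big-endian file is swapped once:
le = np.array([1.0], dtype="<f4")
assert pack_tensor(le, ">", tensor_order="<") == be.tobytes()
```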

Inference Performance Impact: None. Core inference functions (llama_decode, llama_encode, llama_tokenize) remain unchanged with identical response time and throughput measurements. Tokens per second throughput is unaffected as no inference path modifications occurred.

Power Consumption: All binaries maintain identical power consumption profiles, including libllama.so (193,182 nJ), libggml-cpu.so (115,347 nJ), and inference utilities.

@loci-dev loci-dev force-pushed the main branch 11 times, most recently from e4a4e1d to d0b408b Compare November 30, 2025 02:46