rz-test: assemble/disassemble/lift asm tests in-process #6442
Draft
notxvilka wants to merge 1 commit into
The asm test suite (db/asm, ~20k instruction lines) was by far the most
process-heavy part of rz-test. For every instruction line rz-test spawned
up to three separate rz-asm processes -- one to assemble, one to
disassemble, and one to lift+validate the IL -- so a full run launched on
the order of 30k short-lived rz-asm processes. Each launch dynamically
loads the entire librz plugin stack, and that fixed per-process startup
cost is what dominates the asm tests, especially under ASAN (large
mappings make fork/exec and the loader much more expensive) and on
Windows (CreateProcess is intrinsically costly).
Instead of shelling out, drive RzAsm/RzAnalysis directly from the test
runner. rz_test_run_asm_test() now creates an RzAsm and an RzAnalysis,
configures them from the test's arch/cpu/bits/endianness exactly like the
rz-asm CLI does, and:
- assembles via rz_asm_rasm_assemble() and compares the raw bytes,
- disassembles via rz_asm_mdisassemble() (output trimmed and '\n' replaced
by ';', as before),
- lifts each instruction with RZ_ANALYSIS_OP_MASK_IL and stringifies +
validates the IL, mirroring print_and_check_il() in librz/main/rz-asm.c.
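The disassembly post-processing mentioned in the list (trim, then '\n' to ';') can be sketched as below. This is only an illustration of the described step: normalize_disasm() is a hypothetical helper name, not a function from rizin or rz-test, and the exact whitespace handling in the real runner may differ.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of rizin: returns a newly allocated copy of
 * `s` with leading/trailing whitespace removed and every remaining '\n'
 * replaced by ';'. The caller frees the result. Returns NULL on OOM. */
static char *normalize_disasm(const char *s) {
	assert(s != NULL);
	/* Skip leading whitespace. */
	while (*s && isspace((unsigned char)*s)) {
		s++;
	}
	/* Drop trailing whitespace (including the final newline). */
	size_t len = strlen(s);
	while (len > 0 && isspace((unsigned char)s[len - 1])) {
		len--;
	}
	char *out = malloc(len + 1);
	if (!out) {
		return NULL;
	}
	/* Join multi-instruction output onto one line. */
	for (size_t i = 0; i < len; i++) {
		out[i] = (s[i] == '\n') ? ';' : s[i];
	}
	out[len] = '\0';
	return out;
}
```

Because the helper is a pure string transform, applying the same normalization in-process as the subprocess path applied to rz-asm's stdout is what lets the recorded EXPECT strings stay byte-for-byte comparable.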
The post-processing of the disassembly and IL strings is identical to the
old subprocess path, so the recorded EXPECT values keep matching. This
links rz-test against rz_arch (RzAsm/RzAnalysis) and rz_il.
A simple stdin-batching scheme (feeding many instructions to one rz-asm)
was considered and rejected: rz-asm's file/stdin paths run the whole input
through rz_asm_massemble()/stream disassembly, which merges instruction
boundaries, so the per-line results each asm test asserts cannot be
recovered.
Driving the library in-process is the only way to both remove the spawns
and keep exact per-instruction results.
Tradeoffs:
- The per-test timeout (config->timeout_ms) is no longer applied to asm
tests. These are bounded computations over a single, already-parsed
instruction; a hang would be a real bug, better surfaced than hidden.
- Crash isolation is reduced: a crashing assembler/lifter now takes down
the worker instead of a child process. The BROKEN mechanism still
covers known failures, and asm ops over a single instruction are far
simpler than the full-binary analysis cmd tests already run.
The RzAsm/RzAnalysis pair is created per test for now; hoisting it to one
instance per worker thread (reconfiguring per test) would remove the
remaining per-test setup cost and is a natural follow-up.
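A rough sketch of what that follow-up could look like, assuming C11 thread-local storage is available to the worker threads. WorkerAsmCtx and the worker_ctx_*() functions are hypothetical names invented for this illustration; a real version would hold the RzAsm/RzAnalysis objects and call their setup functions rather than storing plain fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the RzAsm/RzAnalysis pair; a real version would
 * hold the two library objects instead of plain configuration fields. */
typedef struct {
	char arch[32];
	int bits;
	bool big_endian;
	unsigned configured; /* number of times this context was reconfigured */
} WorkerAsmCtx;

/* One context per worker thread, created lazily on first use (C11). */
static _Thread_local WorkerAsmCtx *tls_ctx = NULL;

static WorkerAsmCtx *worker_ctx_get(void) {
	if (!tls_ctx) {
		tls_ctx = calloc(1, sizeof(*tls_ctx));
	}
	return tls_ctx; /* NULL only if the allocation failed */
}

/* Reconfigure the existing context for the next test instead of rebuilding it. */
static bool worker_ctx_configure(WorkerAsmCtx *ctx, const char *arch, int bits, bool big_endian) {
	assert(ctx != NULL && arch != NULL);
	if (strlen(arch) >= sizeof(ctx->arch)) {
		return false; /* arch name does not fit; leave the context untouched */
	}
	strcpy(ctx->arch, arch);
	ctx->bits = bits;
	ctx->big_endian = big_endian;
	ctx->configured++;
	return true;
}

/* Release the calling thread's context; call once when the worker exits. */
static void worker_ctx_free(void) {
	free(tls_ctx);
	tls_ctx = NULL;
}
```

The main risk with this shape is state leaking between tests: reconfiguration has to reset everything an earlier test could have changed, otherwise results start to depend on test order.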
Test plan