Fix #79700: Bad performance with namespaced nodes due to wrong libxml assumption #11376
+79
−16
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fixes https://bugs.php.net/bug.php?id=79700 (that issue description also contains the benchmark!).
The issue talks about a wrong assumption, but I don't think it's a wrong assumption per-se. See also GH-5719 for more context and the emails from https://mail.gnome.org/archives/xml/2020-June/msg00000.html.
The purpose of oldNs AFAIK is actually to store the namespaces which can't be stored on the document nodes. They can't be freed because these nodes may be reused (and they don't refcount or anything). Also: not storing them on the oldNs list will leak memory when all relevant XML nodes are destroyed.
The problem in the original issue is the ever-growing list of duplicate ns nodes.
This PR solves this.
Commit list:
Use a prepending strategy instead of appending in dom_set_old_ns()
improves performance of the list management (see commit description).Reuse namespaces from doc->oldNs if possible in dom_get_ns()
solves the actual problem by reusing ns nodes.Targeting master because of potential impact of risky code changes.
Marking as draft for now because I need to double check this stuff.