Skip to content

Add Internal observers #9024

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 30, 2022
Merged

Add Internal observers #9024

merged 1 commit into from
Jul 30, 2022

Conversation

bwoebi
Copy link
Member

@bwoebi bwoebi commented Jul 15, 2022

Adding observers for internal functions:

There are two main motivations to this:
a) The logic for handling internal and userland observation can be unified.
b) Unwinding of observed functions on a bailout does notably not include observers. Even if users of observers were to ensure such handling themselves, it would be impossible to retain the relative ordering - either the user has to unwind all internal observed frames before the automatic unwinding (zend_observer_fcall_end_all) or afterwards, but not properly interleaved.

This PR is stacked on top of #9062 (merged and rebased).

Copy link
Member

@krakjoe krakjoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks okay to me, I wouldn't mind another pair of eyes with karma having a look ...

@bwoebi bwoebi force-pushed the internal-observers branch from efa4400 to 5d44f9f Compare July 29, 2022 13:17
Copy link
Member

@arnaud-lb arnaud-lb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me apart from the map_ptr issue

Zend/zend_API.c Outdated
@@ -2694,6 +2694,10 @@ ZEND_API zend_result zend_register_functions(zend_class_entry *scope, const zend
internal_function->scope = scope;
internal_function->prototype = NULL;
internal_function->attributes = NULL;
ZEND_MAP_PTR_NEW(internal_function->run_time_cache);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we can not do this at runtime, otherwise the opcache will not see it and may allocate the same slots again in other processes. We can do this only during startup, or during an opcache hook.

Clashes may happen if these slots are imported into the current process during include:

dl("dl_test.so");
include "script.php";

An other unfortunate effect would be that the slot is used only temporarily, but will never be freed or reused again.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But ... zend_declare_class_constant_ex is also doing ZEND_MAP_PTR_NEW(ce->mutable_data) - code which is also invoked when a module declaring class constants is loaded via dl() at run-time. Or is that broken as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's broken as well :D (or I'm wrong about ZEND_MAP_PTR_NEW).

Here ZEND_MAP_PTR_NEW() is only called for internal classes with AST constants (e.g. internal enums), which are probably rare, which would explain why we haven't seen the problem yet. Also I've just noticed that dl()ing an extension with internal enums crashes for an other reason.

Copy link
Member Author

@bwoebi bwoebi Jul 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn :-D I was looking at that code, saw it, and thought … okay, must be safe then.
However, how would one then fix this category of issues? I mean … doing a check for MODULE_TEMPORARY each time run_time_cache is to be accessed is obviously not viable for performance reasons.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would only need to do this check before ZEND_MAP_PTR_NEW().

Apparently ZEND_MAP_PTR_GET / ZEND_MAP_PTR_SET can handle heap-allocated pointers. In this case the map ptr is a real pointer, not a CG(map_ptr_base) offset:

# define ZEND_MAP_PTR_GET(ptr) \
((ZEND_MAP_PTR_IS_OFFSET(ptr) ? \
ZEND_MAP_PTR_GET_IMM(ptr) : \
((void*)(ZEND_MAP_PTR(ptr)))))

Here we could do a ZEND_MAP_PTR_INIT(internal_function->run_time_cache, zend_arena_alloc(...)) instead of ZEND_MAP_PTR_NEW() + ZEND_MAP_PTR_SET() when EG(active).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be to use a .reserved slot, or to introduce a new zend_internal_function field.

The run_time_cache / map_ptr mechanism is useful for user functions because they are immutable. We don't really need this for internal functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied the change, I guess looks good now?

Copy link
Member Author

@bwoebi bwoebi Jul 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that installed observers are different from request to request, so this needs run-time mutable data, hence the map_ptr mechanism. The big benefit of the map_ptr mechanism is that it's just an offset into some run-time memory, so that one can easily jump from the immutable (internal function) data structure to mutable run-time data. A .reserved slot or new fixed field on zend_internal_function does not suffice in itself for this purpose; they would also have to contain offsets into some per request to be allocated structure, essentially ending up being what map_ptrs currently are anyway providing.

Further, it perfectly aligns with the existing userland run_time_cache mechanism, reusing the same variables, saving extra conditions (avoiding having to check whether its user code or internal code), and having also the same access pattern for internal and userland run_time_caches does nothing but make the devs life a bit easier.

Additionally, at least while writing extension code, I also had the need to attach runtime information to internal functions. The workaround is doing a hashtable lookup with the function address, but obviously, this has a performance penalty.
The usefulness for core itself is probably a bit more limited, but the run-time overhead of initializing run_time cache is avoided if cache size is basically zero.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A .reserved slot or new fixed field on zend_internal_function does not suffice in itself for this purpose; they would also have to contain offsets into some per request to be allocated structure, essentially ending up being what map_ptrs currently are anyway providing.

I'm not sure about this. map_ptr is needed for structures that may live in SHM, because we don't store pointers to non-SHM in SHM. But zend_internal_functions leave on the heap and don't have that restriction. We can store a pointer to an array of observers in a zend_internal_function, and modify the array independently in each process or request.

I don't object though, with the ZEND_MAP_PTR_SET change this looks good to me 👍

@bwoebi bwoebi force-pushed the internal-observers branch from 5d44f9f to ce4727f Compare July 29, 2022 20:29
There are two main motivations to this:
a) The logic for handling internal and userland observation can be unified.
b) Unwinding of observed functions on a bailout does notably not include observers. Even if users of observers were to ensure such handling themselves, it would be impossible to retain the relative ordering - either the user has to unwind all internal observed frames before the automatic unwinding (zend_observer_fcall_end_all) or afterwards, but not properly interleaved.

Signed-off-by: Bob Weinand <[email protected]>
@bwoebi bwoebi force-pushed the internal-observers branch from ce4727f to d829cce Compare July 30, 2022 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants