Skip to content

Added zend_simd.h #18413

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Added zend_simd.h #18413

wants to merge 8 commits into from

Conversation

SakiTakamachi
Copy link
Member

@SakiTakamachi SakiTakamachi commented Apr 24, 2025

For SIMD using 128-bit vectors, add a header file to use common code between SSE2 and ARM Neon.

I don't intend to replace everything with these.
The goal is to avoid having to copy a lot of "almost identical code" for non-SIMD code on ARM.

There are other places could use it, but it would involve too many changes, so I'll address that in a follow-up PR.

@SakiTakamachi SakiTakamachi marked this pull request as ready for review April 24, 2025 04:52
Copy link
Member

@dstogov dstogov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion about this.

On one hand this enables NEON code on AArch64 almost for free.
On the other hand this requires usage of our own undocumented abstraction layer.
Most probably this layer is more limited than SSE or NEON by their selves.

Personally, I would prefer something like NEON2SSE (e.g. https://github.com/intel/ARM_NEON_2_x86_SSE)

@SakiTakamachi
Copy link
Member Author

@dstogov

Thank you for checking!

There isn’t much SIMD code in php-src yet, so for now I chose not to use any external libraries.
However, as you mentioned, using a library would provide greater flexibility, and I have absolutely no problem switching to a library if needed.

Since the SIMD code currently in php-src is based on SSE, NEON2SSE might not fit our purpose, as the direction is reversed.

We probably need something like sse2neon:
https://github.com/DLTcollab/sse2neon

Or alternatively:
https://github.com/simd-everywhere/simde

What do you think about these libraries?

@dstogov
Copy link
Member

dstogov commented Apr 28, 2025

Since the SIMD code currently in php-src is based on SSE, NEON2SSE might not fit our purpose, as the direction is reversed.

OK I didn't look too deep. Anyway, all these libraries are more complex then we really need. It may be simpler to create our own neon<-->sse wrapper that implements SSE intrinsics through NEON instructions. Similar to the API you proposed, but without introducing new "naming". This way the subset of supported SSE code should start to work on NEON out of the box.

Finally, it may be better to duplicate code with native NEON optimization instead of trying reusing SSE optimized code.

These are just my thoughts. I'm not sure what solution is the best.

@SakiTakamachi
Copy link
Member Author

@dstogov

It may be simpler to create our own neon<-->sse wrapper that implements SSE intrinsics through NEON instructions. Similar to the API you proposed, but without introducing new "naming". This way the subset of supported SSE code should start to work on NEON out of the box.

Sounds simple and good.
I’ll try to revise it based on that idea!

Finally, it may be better to duplicate code with native NEON optimization instead of trying reusing SSE optimized code.

If the differences in code due to optimization are likely to be significant, I think it’s better to write them separately.
If the differences are small, a wrapper might be sufficient.

@SakiTakamachi
Copy link
Member Author

Since bcmath had its own simd.h, I reverted those changes and modified it to use zend_simd.h instead.

As for other parts, I updated them where the current definitions in zend_simd.h are sufficient for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants