-
Notifications
You must be signed in to change notification settings - Fork 7.8k
ext/opcache: remove option huge_code_pages #10301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
1bf858c
to
ec1ee3d
Compare
Rebased & fixed conflict. |
Travis failure unrelated to this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This technology definitely makes some speed improvement (not the same as 7 years ago, but still visible). All PHP workers inherit the same read-only mapping of huge code pages. It's done configurable on purpose and may be disabled/enabled by admin. I don't see any reason to remove this.
$ perf stat -e cycles:u -e instructions:u -e dTLB-load-misses -e iTLB-load-misses sapi/cgi/php-cgi -d opcache.jit=0 -d opcache.huge_code_pages=0 -T 5,10000 /var/www/html/bench/wordpress/index.php > /dev/null
Elapsed time: 25.027888 sec
Performance counter stats for 'sapi/cgi/php-cgi -d opcache.jit=0 -d opcache.huge_code_pages=0 -T 5,10000 /var/www/html/bench/wordpress-3.6/index.php':
92,924,210,183 cycles:u
119,988,584,666 instructions:u # 1.29 insn per cycle
5,208,210 dTLB-load-misses:u
7,231,349 iTLB-load-misses:u
25.124572589 seconds time elapsed
20.503810000 seconds user
2.395449000 seconds sys
$ perf stat -e cycles:u -e instructions:u -e dTLB-load-misses -e iTLB-load-misses sapi/cgi/php-cgi -d opcache.jit=0 -d opcache.huge_code_pages=1 -T 5,10000 /var/www/html/bench/wordpress-3.6/index.php > /dev/null
Elapsed time: 24.738100 sec
Performance counter stats for 'sapi/cgi/php-cgi -d opcache.jit=0 -d opcache.huge_code_pages=1 -T 5,10000 /var/www/html/bench/wordpress/index.php':
91,544,779,717 cycles:u
119,989,768,215 instructions:u # 1.31 insn per cycle
4,431,370 dTLB-load-misses:u
6,454,243 iTLB-load-misses:u
24.837880346 seconds time elapsed
20.398353000 seconds user
2.240638000 seconds sys
@dstogov, I tried the same WordPress test, but could not get consistent results in repeated runs. On an otherwise idle machine (with cpu governor=performance, of course), Unlike my previous benchmark attempts, it shows the expected wins for huge pages on the That is with a single PHP process running alone on a computer.
That is only true if all PHP processes are clones of a single initial PHP process, e.g. if you have a single system-wide FPM daemon which is shared by all users and all applications. All theoretical advantages collapse and turn to the opposite once you start a second PHP process, for example in a container (which is a very common thing nowadays). //
Most admins probably don't understand what the feature means and don't know the caveats - why would they, the documentation doesn't tell them, it only says "should improve performance", and who does not want to "improve performance"? Delegating responsibility for this under-documented and complex feature, one that can easily reduce performance, to admins doesn't sound like a plan that really works. Instead of removing the feature, you can improve the documentation to describe the limitations of the feature, to keep people from using it unless they really meet the narrow preconditions that may bring some theoretical sub-1% performance gain in some situations. As it is now, the feature is harmful, that's why I suggest removing it. |
Just personal opinion but instead of removing it for 8.3, what if we document the feature (much) better about the possible pros and cons and then revisit the decision whether we remove it or not for 9.0, I think it will be smoother especially user-wise ? |
I don't know the PHP roadmap, I only see the "master" branch but I don't know what it will become and when - but improving documentation is something that is hard to disagree on. And for this feature, the documentation is certainly misleading. I'll try to find where the documentation resides (doc-en?) and submit a PR over there. If the feature may be removed for 9.0 or whatever version, we can keep this PR open and you merge it whenever you feel like it fits into the roadmap. |
doc-en indeed I believe. |
@MaxKellermann You are quite so adamant on removing the feature even though it is an opt-in one. I would rather argue that it should be off by default on production ini and improve the documentation. That should do it. |
I believe this is an anti-feature and I am convinced not having this option is better than having it. It has disadvantages, but no advantage. I evaluated the feature last year - yes, of course, I want a performance improvement! For our web hosting product I was going to squeeze out the very last millisecond of performance (look at my older PRs here, 8.2 has benefited already a lot from my work, some of which I have successfully upstreamed). Not only did PHP actually become slower with I felt that all the time spent on this "feature" was completely wasted, and I wanted to save others from wasting time with it, therefore I decided to suggest removing it. Here I am. |
There are only very narrow circumstances under which this option has been reported to provide 1% performance gain due to reduction of TLB misses. In many setups, this option only increases memory usage, and will actually decrease performance. To avoid this, let's leave it disabled by default, and let it be an explicit decision to enable it. For a discussion, see php#10301
There are only very narrow circumstances under which this option has been reported to provide 1% performance gain due to reduction of TLB misses. In many setups, this option only increases memory usage, and will actually decrease performance. To avoid this, let's leave it disabled by default, and let it be an explicit decision to enable it. For a discussion, see php#10301
c6afa5d
to
5b7b07d
Compare
There are only very narrow circumstances under which this option has been reported to provide 1% performance gain due to reduction of TLB misses. In many setups, this option only increases memory usage, and will actually decrease performance. To avoid this, let's leave it disabled by default, and let it be an explicit decision to enable it. For a discussion, see php#10301
There are only very narrow circumstances under which this option has been reported to provide 1% performance gain due to reduction of TLB misses. In many setups, this option only increases memory usage, and will actually decrease performance. To avoid this, let's leave it disabled by default, and let it be an explicit decision to enable it. For a discussion, see php#10301
There are only very narrow circumstances under which this option has been reported to provide 1% performance gain due to reduction of TLB misses. In many setups, this option only increases memory usage, and will actually decrease performance. To avoid this, let's leave it disabled by default, and let it be an explicit decision to enable it. For a discussion, see #10301 Closes GH-10336
5b7b07d
to
a953060
Compare
There are only very narrow circumstances under which this option has been reported to provide 1% performance gain due to reduction of TLB misses. In many setups, this option only increases memory usage, and will actually decrease performance. To avoid this, let's leave it disabled by default, and let it be an explicit decision to enable it. For a discussion, see php#10301 Closes phpGH-10336
This feature is useless at best, but is fragile and increases memory usage because it creates a copy of the PHP executable in RAM, when another copy is already in the kernel's page cache. Additionally, each PHP process has its own copy, multiplying the memory usage with the number of PHP processes. Even worse: under memory pressure, the kernel would usually just discard those code pages, and reload them from disk on demand; but with `huge_code_pages`, the kernel cannot simply discard those pages, instead it has to allocate room in the swap partition and then has to write the pages to the swap partition before being able to discard them. This adds disk I/O to an already extremely constrained situation. The feature was added 7 years ago in commit 669f0b3, and the commit message stated that this "provided 2% improvement on real-life apps, because of 2-3 times reduction in number of iTLB misses", but did not tell how these results were produced. I tried, but failed to measure a difference. None of my tests went any faster with `huge_code_pages` enabled. In a second attempt, I tried `perf stat` with events `itlb_misses.miss_causes_a_walk`, `itlb_misses.stlb_hit`, `itlb_misses.walk_completed`, `itlb_misses.walk_duration` and could not see any improvements here either. The results were unstable, but nothing suggested it was better with `huge_code_pages`. Quite contrary, this operation delays PHP startup by 5 ms (measured on an Intel Xeon E5-2630). phpinfo() with opcache.huge_code_pages=0: 17.92 msec task-clock:u # 0.985 CPUs utilized 29,804,809 cycles:u # 1.663 GHz 44,512,668 instructions:u # 1.49 insn per cycle phpinfo() with opcache.huge_code_pages=1: 23.20 msec task-clock:u # 0.988 CPUs utilized 35,709,593 cycles:u # 1.539 GHz 45,940,779 instructions:u # 1.29 insn per cycle In order to reduce PHP's memory size, reduce the load during PHP startup, simplify debugging and profiling, I propose removing this feature.
a953060
to
b6a1e1c
Compare
There are only very narrow circumstances under which this option has been reported to provide 1% performance gain due to reduction of TLB misses. In many setups, this option only increases memory usage, and will actually decrease performance. To avoid this, let's leave it disabled by default, and let it be an explicit decision to enable it. For a discussion, see php#10301 Closes phpGH-10336
This feature is useless at best, but is fragile and increases memory usage because it creates a copy of the PHP executable in RAM, when another copy is already in the kernel's page cache.
Additionally, each PHP process has its own copy, multiplying the memory usage with the number of PHP processes.
Even worse: under memory pressure, the kernel would usually just discard those code pages, and reload them from disk on demand; but with
huge_code_pages
, the kernel cannot simply discard those pages, instead it has to allocate room in the swap partition and then has to write the pages to the swap partition before being able to discard them. This adds disk I/O to an already extremely constrained situation.The feature was added 7 years ago in commit 669f0b3, and the commit message stated that this "provided 2% improvement on real-life apps, because of 2-3 times reduction in number of iTLB misses", but did not tell how these results were produced.
I tried, but failed to measure a difference. None of my tests went any faster with
huge_code_pages
enabled.In a second attempt, I tried
perf stat
with eventsitlb_misses.miss_causes_a_walk
,itlb_misses.stlb_hit
,itlb_misses.walk_completed
,itlb_misses.walk_duration
and could not see any improvements here either. The results were unstable, but nothing suggested it was better withhuge_code_pages
.Quite contrary, this operation delays PHP startup by 5 ms (measured on an Intel Xeon E5-2630).
phpinfo() with opcache.huge_code_pages=0:
phpinfo() with opcache.huge_code_pages=1:
In order to reduce PHP's memory size, reduce the load during PHP startup, simplify debugging and profiling, I propose removing this feature.