FIXED: The issue turned out to be related to Shopware 6.5.
After weeks of unsuccessful debugging, we are now hoping for help here.
We have a system that lets us create test instances for Shopware 5 and Shopware 6. The applications are unremarkable and run on a standard LAMP stack without any special configuration.
Specifically, after an unpredictable amount of time and for no apparent reason, the PHP-FPM child processes crash and fail to restart. As a result, the instances return a blank page. They only start working again after we run systemctl restart php8.1-fpm or systemctl reload php8.1-fpm.
What makes this puzzling is that we run several PHP versions side by side (FPM 5.6, 7.2, 7.4, 8.0, 8.1, and 8.2), yet the problem only occurs with PHP 8.1 FPM; none of the other versions are affected. The configuration is identical across all PHP versions.
The PHP packages come from the Sury repository and are all up to date (8.1.18 in the case of PHP 8.1).
We have already set the FPM log level to debug and get the following log messages when a page is requested while the error is occurring:
[08-Jun-2023 14:02:36.220848] DEBUG: pid 613368, fpm_children_make(), line 407: blocking signals before child birth
[08-Jun-2023 14:02:36.221667] DEBUG: pid 613368, fpm_children_make(), line 431: unblocking signals, child born
[08-Jun-2023 14:02:36.221679] NOTICE: pid 613368, fpm_children_make(), line 437: [pool hidden] child 666067 started
[08-Jun-2023 14:02:36.323769] DEBUG: pid 613368, fpm_event_loop(), line 440: event module triggered 1 events
[08-Jun-2023 14:02:36.327076] DEBUG: pid 613368, fpm_got_signal(), line 82: received SIGCHLD
[08-Jun-2023 14:02:36.327089] DEBUG: pid 613368, fpm_event_loop(), line 440: event module triggered 1 events
[08-Jun-2023 14:02:36.327103] WARNING: pid 613368, fpm_children_bury(), line 258: [pool hidden] child 666067 exited with code 70 after 0.105426 seconds from start
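When collecting these entries over a longer period, it can help to extract the exit code and lifetime of every child from the log. The following is a minimal sketch, assuming the WARNING lines always have the shape shown above; the function name parse_child_exits is hypothetical. As far as we can tell, exit code 70 corresponds to EX_SOFTWARE from sysexits.h (an internal software error), which would suggest the child aborts on its own during startup rather than being killed by a signal. That reading is an assumption, not a confirmed diagnosis.

```python
import re

# Matches the WARNING line php-fpm writes when a child exits, for example:
# "[pool hidden] child 666067 exited with code 70 after 0.105426 seconds from start"
CHILD_EXIT = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] child (?P<pid>\d+) exited with code (?P<code>\d+) "
    r"after (?P<seconds>\d+\.\d+) seconds from start"
)

# Assumption: 70 is EX_SOFTWARE as defined in sysexits.h.
EX_SOFTWARE = 70


def parse_child_exits(lines):
    """Return one dict per 'child ... exited with code ...' log line."""
    exits = []
    for line in lines:
        match = CHILD_EXIT.search(line)
        if match:
            exits.append({
                "pool": match["pool"],
                "pid": int(match["pid"]),
                "code": int(match["code"]),
                "seconds": float(match["seconds"]),
            })
    return exits


sample = [
    "[08-Jun-2023 14:02:36.221679] NOTICE: pid 613368, fpm_children_make(), "
    "line 437: [pool hidden] child 666067 started",
    "[08-Jun-2023 14:02:36.327103] WARNING: pid 613368, fpm_children_bury(), "
    "line 258: [pool hidden] child 666067 exited with code 70 after 0.105426 "
    "seconds from start",
]

exits = parse_child_exits(sample)
for entry in exits:
    print(entry)
```

Running the same function over the full FPM log would show whether every failing child exits with the same code and after a similarly short lifetime.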
We have already checked the following things:
- Server load: the machine is mostly idle
- Pool configuration: changed it multiple times without any effect
- Known bugs in PHP 8.1 FPM: we could not find anything relevant
Our FPM pool configuration (identical for every pool):
[hidden]
user = hidden
group = hidden
listen = /run/php/php8.1-fpm-hidden.sock
listen.owner = www-data
listen.group = www-data
listen.mode = 0660
pm = ondemand ; we also tried dynamic first
pm.max_children = 124
pm.start_servers = 32
pm.process_idle_timeout = 10s
pm.min_spare_servers = 32
pm.max_spare_servers = 64
pm.max_requests = 5000
; we have changed the pm.* settings many times
php_admin_value[open_basedir] = /home/hidden:/dev/urandom
php_admin_value[sys_temp_dir] = /home/hidden/files/php/tmp
php_admin_value[upload_tmp_dir] = /home/hidden/files/php/tmp
php_admin_value[error_log] = /home/hidden/files/php/php-error.log
php_admin_flag[log_errors] = on
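A side note on the pm.* values above: according to the php-fpm configuration documentation, pm.start_servers, pm.min_spare_servers, and pm.max_spare_servers are only used with pm = dynamic, while pm.process_idle_timeout is only used with pm = ondemand. The sketch below lists which of the configured directives have no effect for a given mode. The helper name unused_directives and the dictionary layout are illustrative assumptions, and this is not expected to explain the crash.

```python
# Which pm.* directives php-fpm uses for each process manager mode,
# following the php-fpm configuration documentation.
USED_BY_MODE = {
    "static": {"pm.max_children", "pm.max_requests"},
    "dynamic": {
        "pm.max_children",
        "pm.start_servers",
        "pm.min_spare_servers",
        "pm.max_spare_servers",
        "pm.max_requests",
    },
    "ondemand": {"pm.max_children", "pm.process_idle_timeout", "pm.max_requests"},
}


def unused_directives(pool):
    """Return the configured pm.* directives that the chosen mode does not use."""
    used = USED_BY_MODE[pool["pm"]]
    return sorted(key for key in pool if key.startswith("pm.") and key not in used)


# The pm settings from the pool configuration above.
pool = {
    "pm": "ondemand",
    "pm.max_children": "124",
    "pm.start_servers": "32",
    "pm.process_idle_timeout": "10s",
    "pm.min_spare_servers": "32",
    "pm.max_spare_servers": "64",
    "pm.max_requests": "5000",
}

print(unused_directives(pool))
print(unused_directives({**pool, "pm": "dynamic"}))
```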
Our php-fpm.conf is at its defaults except for the following options, which we only added temporarily for debugging purposes. The error occurs with and without these options.
log_level = debug
emergency_restart_threshold = 3
emergency_restart_interval = 1m
process_control_timeout = 10s
Our php.ini is standard, except for the memory_limit, which we raised to 4G.
System specs:
- Ubuntu 22.04.2
- 128GB DDR5 ECC RAM
- 16C/32T (AMD Ryzen™ 9 7950X3D)
- 2x 1.92TB NVMe (RAID 1)
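For context on the memory settings, here is a rough worst-case calculation for a single pool, using the memory_limit of 4G and pm.max_children = 124 from above. Note that memory_limit is only a per-request ceiling, not a reservation, so real usage is normally far lower; the function name worst_case_bytes is hypothetical and the 128 GB figure is treated as approximate.

```python
GIB = 1024 ** 3


def worst_case_bytes(memory_limit_bytes, max_children):
    """Upper bound if every child of one pool hit memory_limit at the same time."""
    return memory_limit_bytes * max_children


memory_limit = 4 * GIB   # memory_limit = 4G from php.ini
max_children = 124       # pm.max_children from the pool configuration
ram = 128 * GIB          # installed RAM, treated as approximately 128 GiB

worst_case = worst_case_bytes(memory_limit, max_children)
print(f"worst case per pool: {worst_case // GIB} GiB")
print(f"exceeds installed RAM: {worst_case > ram}")
```

The theoretical ceiling of one pool is therefore well above the installed RAM, although the server being mostly idle suggests this limit is not actually reached in practice.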
If you need more information, please feel free to ask. We appreciate any hints and further approaches. We tried to summarize the information from the past weeks as best as we could.