Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: superglobals-realated crash with custom extensions in worker mode #796

Merged
merged 26 commits into from
Jun 5, 2024

Conversation

dunglas
Copy link
Owner

@dunglas dunglas commented May 16, 2024

Closes #767.
Partial fix for #592 (memory leak when dealing with cookies in worker mode).

@dunglas dunglas force-pushed the fix/767 branch 2 times, most recently from 8b8753c to edee8bb Compare May 16, 2024 16:01
@dunglas
Copy link
Owner Author

dunglas commented May 17, 2024

@withinboredom @TimWolla would you mind reviewing the proposed fix?

I need to do more testing to ensure that the behavior is what is expected, but at least this fixes the segfault. We'll have to double-check if this doesn't introduce a memory leak or some weird issues.

Copy link
Contributor

@TimWolla TimWolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to do more testing to ensure that the behavior is what is expected, but at least this fixes the segfault. We'll have to double-check if this doesn't introduce a memory leak or some weird issues.

It does leak:

[Tue May 21 07:27:50 2024]  Script:  '-'
/usr/local/src/php/Zend/zend_string.h(174) :  Freeing 0x00007fff802020a0 (40 bytes), script=-
Last leak repeated 14 times
[Tue May 21 07:27:50 2024]  Script:  '-'
/usr/local/src/php/Zend/zend_string.h(174) :  Freeing 0x00007fff802020f0 (40 bytes), script=-
Last leak repeated 42 times
[Tue May 21 07:27:50 2024]  Script:  '-'
/usr/local/src/php/Zend/zend_string.h(174) :  Freeing 0x00007fff802025a0 (40 bytes), script=-
Last leak repeated 12 times
[Tue May 21 07:27:50 2024]  Script:  '-'
/usr/local/src/php/Zend/zend_hash.c(291) :  Freeing 0x00007fff8025f000 (56 bytes), script=-
Last leak repeated 4 times
[Tue May 21 07:27:50 2024]  Script:  '-'
/usr/local/src/php/Zend/zend_hash.c(1311) :  Freeing 0x00007fff80260c00 (2560 bytes), script=-
=== Total 77 memory leaks detected ===

For debugging memory leaks, I suggest USE_ZEND_ALLOC=0 + ASAN. Valgrind does not appear to work well here.

I'm also seeing

2024/05/21 07:27:50.165 INFO    restarting      {"worker": "/go/src/app/testdata/worker.php"}

immediately after, but I'm not sure if that is expected.

The following test script also appears to log the incorrect information:

<?php

ignore_user_abort(true);

for($running = true; $running;) {
    file_put_contents(__DIR__ . '/log', "before: " . $_SERVER['REQUEST_URI'] . ' ' . $_SERVER['REMOTE_ADDR'] . "\n", FILE_APPEND);

    $running = \frankenphp_handle_request(static function () {
        file_put_contents(__DIR__ . '/log', "inside: " . $_SERVER['REQUEST_URI'] . ' ' . $_SERVER['REMOTE_ADDR'] . "\n", FILE_APPEND);
    });

    file_put_contents(__DIR__ . '/log', "after : " . $_SERVER['REQUEST_URI'] . ' ' . $_SERVER['REMOTE_ADDR'] . "\n", FILE_APPEND);
}

I would expected before: and after: to look identical at all times, but I'm seeing:

before: worker.php 
before: worker.php 
before: worker.php 
before: worker.php 
inside: /worker.php?some_query_parameters 127.0.0.1
after : /worker.php?some_query_parameters 127.0.0.1
before: /worker.php?some_query_parameters 127.0.0.1
inside: /worker.php?some_query_parameters 127.0.0.1
after : /worker.php?some_query_parameters 127.0.0.1
before: /worker.php?some_query_parameters 127.0.0.1

frankenphp.c Outdated Show resolved Hide resolved
frankenphp.c Outdated Show resolved Hide resolved
frankenphp.c Outdated Show resolved Hide resolved
@TimWolla
Copy link
Contributor

Valgrind does not appear to work well here.

I does point out some conditional jumps that depend on uninitialized in non-go code values, though. But also quite a few invalid reads in go code, which might or might not be expected.

@TimWolla
Copy link
Contributor

The following test script also appears to log the incorrect information:

The cause probably is that you can't naively memcpy the zval, because the reference counting of strings is then incorrect, allowing the code to perform in-place modification instead of doing copy-on-write. I didn't test, because I find the DX of the development container to be pretty unwieldy.

@dunglas
Copy link
Owner Author

dunglas commented May 21, 2024

Thanks for your detailed review @TimWolla. Regarding the Docker container, it is entirely optional. When you have a local installation of PHP, you can just run this command to compile the binary: https://frankenphp.dev/docs/compile/#compile-the-go-app

frankenphp.c Outdated Show resolved Hide resolved
@dunglas
Copy link
Owner Author

dunglas commented May 21, 2024

I pushed a new version that uses ZVAL_COPY as suggested by @TimWolla. This should fix the leak (to be double-checked).

This behaves almost as expected but I identified two issues:

  • $_SERVER['FRANKENPHP_WORKER'] and probably other custom environment variables passed to the worker script are lost inside the closure. We should probably merge the variables from the worker script in the closure to prevent this issue.
  • $_SERVER['FRANKENPHP_WORKER'] is NULL terminated, that's an existing, unrelated bug. I'll fix it. Fixed in $_SERVER['FRANKENPHP_WORKER'] must not be NULL-terminated #809

@dunglas
Copy link
Owner Author

dunglas commented May 21, 2024

The behavior is now the expected one. If you don't mind, cloud you please review again?

@dunglas dunglas marked this pull request as ready for review May 21, 2024 16:49
@dunglas
Copy link
Owner Author

dunglas commented May 21, 2024

Actually, merging the $_SERVER superglobals of the worker and the request wasn't expected, we even have a test (the failing one) for that.

As showcased by the test, it's entirely possible to do that userland if needed.

Should we keep the existing behavior of not merging (after a second thought, I think that it's more flexible to not merge)?

@dunglas dunglas changed the title test: failing test reproducing #767 fix!: better superglobals handling in worker mode May 22, 2024
@TimWolla
Copy link
Contributor

I'm still seeing a leak:

b57e622520ad# ../caddy/frankenphp/frankenphp php-server --worker worker-getopt.php
2024/05/22 13:44:51.696 WARN    admin   admin endpoint disabled
2024/05/22 13:44:51.696 WARN    http.auto_https server is listening only on the HTTP port, so no automatic HTTPS will be applied to this server {"server_name": "php", "http_port": 80}
2024/05/22 13:44:51.696 INFO    tls.cache.maintenance   started background certificate maintenance      {"cache": "0xc000345a80"}
2024/05/22 13:44:51.713 INFO    FrankenPHP started 🐘   {"php_version": "8.3.8-dev"}
2024/05/22 13:44:51.714 INFO    http.log        server running  {"name": "php", "protocols": ["h1", "h2", "h3"]}
2024/05/22 13:44:51.714 INFO    Caddy serving PHP app on :80
2024/05/22 13:44:51.723 WARN    tls     storage cleaning happened too recently; skipping for now        {"storage": "FileStorage:/root/.local/share/caddy", "instance": "00d35d99-1901-4bf0-aa0b-e1f662db2acd", "try_again": "2024/05/23 13:44:51.723", "try_again_in": 86399.999999265}
2024/05/22 13:44:51.723 INFO    tls     finished cleaning storage units
[Wed May 22 13:44:53 2024]  Script:  '-'
/usr/local/src/php/Zend/zend_string.h(174) :  Freeing 0x00007fe08c2021e0 (40 bytes), script=-
Last leak repeated 12 times
[Wed May 22 13:44:53 2024]  Script:  '-'
/usr/local/src/php/Zend/zend_string.h(174) :  Freeing 0x00007fe08c202230 (40 bytes), script=-
Last leak repeated 40 times
[Wed May 22 13:44:53 2024]  Script:  '-'
/usr/local/src/php/Zend/zend_string.h(174) :  Freeing 0x00007fe08c2026e0 (48 bytes), script=-
Last leak repeated 12 times
[Wed May 22 13:44:53 2024]  Script:  '-'
/usr/local/src/php/Zend/zend_hash.c(291) :  Freeing 0x00007fe08c25f000 (56 bytes), script=-
Last leak repeated 4 times
[Wed May 22 13:44:53 2024]  Script:  '-'
/usr/local/src/php/Zend/zend_hash.c(1311) :  Freeing 0x00007fe08c260c00 (2560 bytes), script=-
=== Total 73 memory leaks detected ===
2024/05/22 13:44:53.542 ERROR   unexpected termination, restarting      {"worker": "/go/src/app/testdata/worker-getopt.php", "exit_status": 2}

and then calling curl -v http://localhost/worker-getopt.php?some_query_parameters.

I also might've done something wrong while starting FrankenPHP, because of the exit(2) being triggered.

@TimWolla
Copy link
Contributor

And the test script with the file_put_contents from yesterday’s review still outputs the wrong data for the after:.

@dunglas
Copy link
Owner Author

dunglas commented May 22, 2024

I've an idea for the after:. I'll take a look. The crash and the leak are weird, I'm not sure of why it behaves like this.

@dunglas
Copy link
Owner Author

dunglas commented May 22, 2024

Maybe related: #81

frankenphp.c Outdated Show resolved Hide resolved
@dunglas
Copy link
Owner Author

dunglas commented May 27, 2024

@TimWolla @withinboredom would you mind testing/reviewing this patch one more time? The current implementation looks ok-ish to me.

@TimWolla
Copy link
Contributor

@dunglas For the test script in #796 (review) I'm still seeing the wrong data being written to the log file:

before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
before: worker-getopt.php 
inside: /worker-getopt.php?some_query_parameters 127.0.0.1
after : /worker-getopt.php?some_query_parameters 127.0.0.1
before: /worker-getopt.php?some_query_parameters 127.0.0.1

frankenphp.c Outdated Show resolved Hide resolved
frankenphp.c Outdated Show resolved Hide resolved
@dunglas
Copy link
Owner Author

dunglas commented May 31, 2024

In the last commit, I fixed the crash without changing the existing behavior (the docs change is now irrelevant, I'll revert that). I think it's good enough for now.

@TimWolla
Copy link
Contributor

TimWolla commented Jun 3, 2024

In the last commit, I fixed the crash without changing the existing behavior (the docs change is now irrelevant, I'll revert that). I think it's good enough for now.

I can confirm the crash is fixed, but internal functions see an empty $_SERVER now. That's certainly better than a crash, but would still be inconsistent with userland’s PoV.

return 0;
}

free(ctx->cookie_data);
Copy link
Owner Author

@dunglas dunglas Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this free() here likely fixes a memory leak in worker mode.

@dunglas
Copy link
Owner Author

dunglas commented Jun 4, 2024

I cleaned the code and updated the docs. I think I've found a way to keep the userland and internal structures in sync but I didn't create a test case. @TimWolla would you mind checking if this now works with the Tideways extension?

@TimWolla
Copy link
Contributor

TimWolla commented Jun 5, 2024

I think I've found a way to keep the userland and internal structures in sync but I didn't create a test case.

I can confirm that internal functions called after frankenphp_handle_request() see the same $_SERVER as userland initially saw.

However what is not kept in sync is manually writing to $_SERVER either within the callback or after frankenphp_handle_request() finishes and before calling the internal function.

$_SERVER['dummy'] = 'value';

Dumping var_dump(count($_SERVER)) returns 42 in my test case, but an internal function

ZEND_FUNCTION(native_debug)
{
    zend_is_auto_global(ZSTR_KNOWN(ZEND_STR_AUTOGLOBAL_SERVER));
    HashTable *server = Z_ARRVAL_P(&PG(http_globals)[TRACK_VARS_SERVER]);
    printf("%d\n", zend_hash_num_elements(server));
    zval *request_uri = zend_hash_str_find(server, "REQUEST_URI", strlen("REQUEST_URI"));
    printf("%s\n", Z_STRVAL_P(request_uri));
}

prints 41 (because dummy is missing). Likewise modifying an entry using:

$_SERVER['REQUEST_URI'] = 'dummy';

is not visible to the native function.


I'm not sure if as a user I would prefer a crash or silently incorrect data. Your call.

@dunglas
Copy link
Owner Author

dunglas commented Jun 5, 2024

@TimWolla are the two data structures synced automatically in a normal script? I may be wrong, but I don't think that we do anything special FrankenPHP-side that would cause this desync.

@dunglas
Copy link
Owner Author

dunglas commented Jun 5, 2024

@TimWolla according to my tests, this behaves the same even when not in worker mode, so it looks "expected" to me, at least let's say that it's not related to FrankenPHP.

@TimWolla
Copy link
Contributor

TimWolla commented Jun 5, 2024

@dunglas Can confirm with the cli-server. Sorry for the confusion. This looks good to me then!

@dunglas dunglas changed the title fix!: better superglobals handling in worker mode fix: better superglobals handling in worker mode Jun 5, 2024
@dunglas dunglas changed the title fix: better superglobals handling in worker mode fix: superglobals-realated crash with custom extensions in worker mode Jun 5, 2024
@dunglas dunglas merged commit 3714fdf into main Jun 5, 2024
42 checks passed
@dunglas dunglas deleted the fix/767 branch June 5, 2024 13:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

$_SERVER lifecycle is unsafe in worker mode
3 participants