Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems with reverse proxied server sent events and compression #6293

Open
ueffel opened this issue May 2, 2024 · 3 comments
Open

Problems with reverse proxied server sent events and compression #6293

ueffel opened this issue May 2, 2024 · 3 comments
Labels
bug 🐞 Something isn't working discussion 💬 The right solution needs to be found help wanted 🆘 Extra attention is needed

Comments

@ueffel
Copy link
Contributor

ueffel commented May 2, 2024

Affected version: 2.7.6, 2.8.0-beta.1

If the encode module is enabled and caddy is asked to reverse proxy a server sent event (SSE) endpoint, I noticed 2 problems:

Problem 1:

If the upstream does not send a response body (just headers), which is perfectly valid by the way, caddy does not flush these headers to the client, so the SSE connection is not correctly established. #4247 seems to be very similar, and the reverse proxy module does the right thing, but because of #4314 (#4318) the encode module delays the flushing until the first bytes of response body are written. (oops my bad 😅). That means the SSE connection is only correctly established after the first write of the upstream (first event), which can be a while after the initial "handshake".

Possible solution:

Make an exception for the SSE Content-Type (text/event-stream) and don't wait for the first bytes, but just initialize the encoding ignoring the minimum_length.

diff --git a/modules/caddyhttp/encode/encode.go b/modules/caddyhttp/encode/encode.go
--- a/modules/caddyhttp/encode/encode.go
+++ b/modules/caddyhttp/encode/encode.go
@@ -23,7 +23,8 @@
 	"fmt"
 	"io"
 	"math"
+	"mime"
 	"net/http"
 	"sort"
 	"strconv"
 	"strings"
@@ -247,5 +248,12 @@
 	// see: https://caddy.community/t/disappear-103-early-hints-response-with-encode-enable-caddy-v2-7-6/23081/5
 	if 100 <= status && status <= 199 {
 		rw.ResponseWriter.WriteHeader(status)
 	}
+	ctHeader := rw.Header().Get("Content-Type")
+	ct, _, err := mime.ParseMediaType(ctHeader)
+	if err == nil && ct == "text/event-stream" {
+		rw.init()
+		rw.ResponseWriter.WriteHeader(rw.statusCode)
+		rw.wroteHeader = true
+	}
 }

Problem 2:

If the encode module actually decides to do compression on the SSE connection, the upstream flushing and flushing in the encode module (FlushError) afterward does not actually send the finished event message to the client.

The reason for this is probably the window size of the compression algorithms. The bytes of the upstream are written to the encoder, but the Encoder has not actually written anything to the ResponseWriter (the thing the encoding module is flushing) and even if it has, the event message is likely incomplete because the compression frame was not full.

Possible solution:

Call Flush() on the encoder to make it finish its compression frame and write everything to the ResponseWriter which then can be flushed to the client.

diff --git a/modules/caddyhttp/encode/encode.go b/modules/caddyhttp/encode/encode.go
--- a/modules/caddyhttp/encode/encode.go
+++ b/modules/caddyhttp/encode/encode.go
@@ -266,4 +274,14 @@
 		return nil
 	}
+
+	if rw.w != nil {
+		fl, ok := rw.w.(interface{ Flush() error })
+		if ok {
+			err := fl.Flush()
+			if err != nil {
+				return err
+			}
+		}
+	}
 	//nolint:bodyclose
 	return http.NewResponseController(rw.ResponseWriter).Flush()

But I'm not sure about the above implementation. Flush() is called pretty frequently by caddy (especially with flush_interval -1, which is the default) and this could make the compression less efficient when the compression frames are forced to finish before they actually filled up.

curl

Calling curl (-N for streaming responses), see Debug Help below for info about the setup.

$ curl -v -N --compressed http://localhost:80/
* Host localhost:80 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:80...
* Connected to localhost (::1) port 80
> GET / HTTP/1.1
> Host: localhost
> User-Agent: curl/8.7.1
> Accept: */*
> Accept-Encoding: deflate, gzip, br, zstd
>
* Request completely sent off

No immediate Response because of Problem 1.
When the first event message from the upstream arrives:

< HTTP/1.1 200 OK
< Cache-Control: no-cache
< Content-Encoding: zstd
< Content-Type: text/event-stream
< Date: Thu, 02 May 2024 21:08:52 GMT
< Server: Caddy
< Vary: Accept-Encoding
< Transfer-Encoding: chunked
<

Response headers (note the Content-Encoding) but no response body.

Only if there is quite a lot of data coming from the upstream (commented out code on the small Go app below) a response arrives from caddy, but with the last event message incomplete.

Debugging help

Caddyfile (minimum length to force encoding)

http://:80 {
	encode zstd gzip {
		minimum_length 10
	}
	reverse_proxy 127.0.0.1:8080
}

Simple Go application for server sent events

package main

import (
	"net/http"
	"time"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		rc := http.NewResponseController(w)
		rc.SetWriteDeadline(time.Time{})
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		w.Header().Set("Connection", "keep-alive")
		w.WriteHeader(http.StatusOK)
		rc.Flush()

		ticker := time.NewTicker(3 * time.Second)
		for {
			select {
			case <-ticker.C:
				w.Write([]byte("data: test\n\n"))

				// more data
				// w.Write([]byte("data: "))
				// for range 200 {
				// 	w.Write([]byte("just some random bytes that have no actual meaning and are just here to fill up some space. Some lorem ipsum if you will. Some lorem ipsum if you will."))
				// }
				// w.Write([]byte("\n\n"))

				rc.Flush()
			case <-r.Context().Done():
				return
			}
		}
	})

	http.ListenAndServe(":8080", nil)
}
@francislavoie
Copy link
Member

francislavoie commented May 2, 2024

Does it even make sense to try to compress SSE? Maybe we exclude it from the default content type matcher for encode.

https://github.com/caddyserver/caddy/blob/master/modules%2Fcaddyhttp%2Fencode%2Fencode.go#L121

There's not really any good way to negate here to say "text/* but not text/event-stream" so maybe we need to hard-code that response type to exclude it 🤔

@mholt
Copy link
Member

mholt commented May 2, 2024

Or if there's a way to just not wait for the first bytes when the type is text/event-stream. I feel like the exceptions are very few in general, that we can just hard-code the logic for now without having to make the ResponseMatcher interface more complicated. (But maybe it needs an exemption parameter to be more useful, I dunno.)

@mholt mholt added bug 🐞 Something isn't working discussion 💬 The right solution needs to be found labels May 2, 2024
@ueffel
Copy link
Contributor Author

ueffel commented May 2, 2024

Does it even make sense to try to compress SSE? Maybe we exclude it from the default content type matcher for encode.

Generally it does make sense. You can send arbitrary text data as event message, for example with HTMX part of the website as HTML or some JSON data. Both of which compress pretty well.

@mholt mholt added the help wanted 🆘 Extra attention is needed label May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐞 Something isn't working discussion 💬 The right solution needs to be found help wanted 🆘 Extra attention is needed
Projects
None yet
Development

No branches or pull requests

3 participants