You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Right now the metric http_response_time_seconds has one label - path, and r.URL.Path is used as its value. This can be an issue when a service behind Reproxy uses dynamic urls (for example, /api/v1/users/{user_id}). In this case, the response size of /metrics may become too large to process by Prometheus (especially when resources are limited).
We can't just drop this label because of the backward compatibility promise. However, we can make it optional. It would be enabled by default - again, to comply with the backward compatibility promise.
I also have a few more suggestions (more → less important):
Add the label server to all metrics. It feels wrong that only http_requests_total has this label. For example, it prevents me from creating an alert for 500+ statuses for specific upstreams.
Add metrics http_request_size_bytes{server} and http_response_size_bytes{server} that would allow users to monitor the size of requests and responses.
Add new buckets for http_response_time_seconds - for example, 10 and 30. The current "highest" bucket is 5s, but it can take much longer to process some types of requests. At the same time, I understand that it's hard to find a good and fitting maximum value. Another option it to make the buckets configurable - but I am not sure it would be practical.
The text was updated successfully, but these errors were encountered:
@umputun Let me know which direction you'd like to go in based on the feedback provided by @ShoshinNikita.
My question is mostly around do we drop the field or keep the field with the value of aggregated? The reasons for keeping the field is it wouldn't break any dashboards that rely on this field.
Can you also provide your preferred name for the CLI argument to toggle this feature?
I would only plan to add this specific feature and not the other recommendations from @ShoshinNikita.
Right now the metric
http_response_time_seconds
has one label -path
, andr.URL.Path
is used as its value. This can be an issue when a service behind Reproxy uses dynamic urls (for example,/api/v1/users/{user_id}
). In this case, the response size of/metrics
may become too large to process by Prometheus (especially when resources are limited).We can't just drop this label because of the backward compatibility promise. However, we can make it optional. It would be enabled by default - again, to comply with the backward compatibility promise.
I also have a few more suggestions (more → less important):
server
to all metrics. It feels wrong that onlyhttp_requests_total
has this label. For example, it prevents me from creating an alert for 500+ statuses for specific upstreams.http_request_size_bytes{server}
andhttp_response_size_bytes{server}
that would allow users to monitor the size of requests and responses.http_response_time_seconds
- for example,10
and30
. The current "highest" bucket is5s
, but it can take much longer to process some types of requests. At the same time, I understand that it's hard to find a good and fitting maximum value. Another option it to make the buckets configurable - but I am not sure it would be practical.The text was updated successfully, but these errors were encountered: