Unavailable Splunk endpoint returning 504 breaks all other sinks and the whole Vector deployment #20496

Closed
Alan18081 opened this issue May 14, 2024 · 1 comment
Labels
type: bug A code related bug.

Comments

@Alan18081

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

We encountered a problem with the Splunk HEC sink. When the Splunk endpoint returns 504 Gateway Timeout, the other sinks that send to Datadog destinations also go into a broken state. Eventually this causes the whole Vector Pod to fail and we have to restart it.

Configuration

sink_dd_external_splunk_${customer_account_name}_logs_${splunk}:
  type: splunk_hec_logs
  endpoint: ${splunk-endpoint}
  default_token: ${splunk-token}
  compression: gzip
  encoding:
    codec: json
  inputs:
  - tenants_external_logs.${customer_account_name}
  request:
    retry_attempts: 20
  buffer:
    type: disk
    max_size: 500000000
    when_full: block

Version

0.32.1

Debug Output

No response

Example Data

No response

Additional Context

No response

References

No response

@Alan18081 Alan18081 added the type: bug A code related bug. label May 14, 2024
@jszwedko
Member

Hi @Alan18081 !

It is expected behavior that a failing sink will apply back-pressure to other sinks connected to the same inputs. The way to avoid this is to configure a buffer on the sink whose downtime you want to tolerate, which I see you did. If the buffer fills up, back-pressure will still be applied while you have when_full: block, so you may want to increase the size of your buffer to tolerate a longer period of downtime. Alternatively, you can configure the buffer to drop data rather than block with when_full: drop_newest. The concept of backpressure in Vector is described in more detail here: https://vector.dev/docs/about/concepts/#backpressure
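
For illustration, a minimal sketch of the two adjustments described above, applied to the sink from the report (the 2 GB size is an arbitrary example, and the drop variant of when_full is spelled drop_newest in the Vector buffer options):

sink_dd_external_splunk_${customer_account_name}_logs_${splunk}:
  # ... type, endpoint, default_token, encoding, inputs, request as in the original config ...
  buffer:
    type: disk
    # Option 1: a larger buffer absorbs a longer Splunk outage before
    # back-pressure reaches the shared inputs (2 GB here is illustrative).
    max_size: 2000000000
    # Option 2: drop new events instead of blocking upstream components
    # once the buffer is full.
    when_full: drop_newest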

I'll close this out since it appears to have worked as designed, but let me know if you have any additional questions about this!

@jszwedko jszwedko closed this as not planned May 20, 2024