throttling not working as one would expect? #1353

thisIsLoading · 2024-05-20T02:09:44Z

Hi,

I wanted to use throttling for a job that downloads files from a website. I don't want to limit the number of jobs that can be scheduled, just the number of jobs that are executed within a certain amount of time. I figured one job every 5 seconds would be enough to avoid triggering any 429s on the remote server, so I set this:

  good_job_control_concurrency_with(
    # Maximum number of jobs with the concurrency key to be
    # concurrently performed (excludes enqueued jobs)
    # Can be an Integer or Lambda/Proc that is invoked in the context of the job
    perform_limit: 1,

    # Maximum number of jobs with the concurrency key to be performed within
    # the time period, looking backwards from the current time. Must be an array
    # with two elements: the number of jobs and the time period.
    perform_throttle: [12, 1.minute],

    key: -> { self.class.name }
  )

which I thought would do exactly what I need.

However, when running the jobs I realized that the concurrency control seems to work via an exception: it raises an error about an exceeded throttle and reschedules the job about 3 hours later(?)

There are definitely a lot of moving parts, and my simple world view isn't enough. I clearly don't understand how I need to configure it to perform the way I want.

I thought it would take this array, divide the duration by the number in [0], and then execute a job, wait the calculated amount of time, and execute the next job.
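To make the two readings concrete, here is a minimal plain-Ruby sketch. It is not GoodJob code, and the helper names `even_spacing_interval` and `window_allows?` are made up for illustration. The first helper is the evenly spaced model described above; the second follows the wording of the `perform_throttle` comment (count performs within the period, looking backwards from the current time).

```ruby
# Hypothetical helpers for illustration only; not part of GoodJob.

# Expected model: spread the jobs evenly across the period.
def even_spacing_interval(limit, period_seconds)
  period_seconds.to_f / limit
end

# Model suggested by the option's comment: count the performs that fall
# inside the period, looking backwards from now, and allow another perform
# only while that count is below the limit.
def window_allows?(performed_at, now, limit, period_seconds)
  recent = performed_at.count { |t| t > now - period_seconds }
  recent < limit
end

interval = even_spacing_interval(12, 60) # seconds between jobs

# Twelve jobs that all ran within the first second fill the window.
burst = Array.new(12) { |i| i * 0.05 }
allowed_during_burst = window_allows?(burst, 1.0, 12, 60)

# The window opens again only once older performs age out of the period.
allowed_after_period = window_allows?(burst, 60.5, 12, 60)

puts "even spacing: one job every #{interval} s"
puts "allowed at t=1.0s: #{allowed_during_burst}"
puts "allowed at t=60.5s: #{allowed_after_period}"
```

Under the window model nothing spaces the jobs out: up to 12 can run back to back, and everything after that is rejected until the window has room again.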

Can you help me figure out what I am doing wrong, and how I can get it to keep executing jobs until the queue is empty, without long pauses?

thank you!

bensheldon · 2024-05-20T15:36:24Z

Your understanding of how Throttling actually works is correct. I tried to explain that in this section in the Readme: https://github.com/bensheldon/good_job?tab=readme-ov-file#how-concurrency-controls-work

You're seeing 3 hours because it is using retry_on ... wait: :polynomially_longer. You can add your own retry_on handler to your job with a fixed retry wait, e.g.

retry_on(
  GoodJob::ActiveJobExtensions::Concurrency::ConcurrencyExceededError,
  attempts: Float::INFINITY,
  wait: ->(executions) { 30.seconds + (10 * Kernel.rand) }
)
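As a rough illustration of where the "3 hours" comes from, the sketch below compares the two wait strategies in plain Ruby. It assumes that `:polynomially_longer` waits approximately `executions**4 + 2` seconds (jitter omitted), which is my recollection of Active Job's formula and should be checked against the Rails version in use. The helper names `polynomial_wait` and `fixed_wait` are hypothetical.

```ruby
# Assumption: approximates Active Job's :polynomially_longer wait as
# executions**4 + 2 seconds, with the random jitter left out.
def polynomial_wait(executions)
  (executions**4) + 2
end

# Fixed wait in the spirit of the handler above: 30 s plus up to 10 s of
# random jitter, independent of how often the job has already been retried.
def fixed_wait(rng = Random.new)
  30 + (10 * rng.rand)
end

(1..10).each do |n|
  puts format("attempt %2d: polynomial %5d s, fixed ~%.1f s",
              n, polynomial_wait(n), fixed_wait)
end
```

Under that assumption, a job that has been throttled about ten times waits roughly 10,000 seconds, a bit under 3 hours, while the fixed handler keeps every retry in the 30 to 40 second range.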

The challenge with throttling and concurrency control is that there's a conflict between the goal of a general job queue (run tasks as quickly as possible) and a throttled queue (run tasks at a managed rate). GoodJob's "dequeue, check constraints, retry" pattern is the same one I've seen implemented elsewhere, but I'm open to contributions or outside inspiration.
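The "dequeue, check constraints, retry" pattern can be sketched as a small simulation on a virtual clock. This is an illustrative model, not GoodJob's implementation: `simulate` and its parameters are made-up names, job duration and `perform_limit` are ignored, and the retry wait is fixed with no jitter so that the run is deterministic.

```ruby
# Illustrative model only; none of these names come from GoodJob.
def simulate(job_count, limit: 12, period: 60, retry_wait: 30)
  # Each entry is [run_at, job_id]; every job is due immediately at t = 0.
  queue = (1..job_count).map { |id| [0, id] }
  performed = [] # [time, job_id] pairs for jobs that actually ran

  until queue.empty?
    queue.sort!
    run_at, id = queue.shift # dequeue the job that is due next

    # Check the constraint: performs inside the window ending at run_at.
    recent = performed.count { |time, _| time > run_at - period }

    if recent < limit
      performed << [run_at, id]          # under the limit: perform
    else
      queue << [run_at + retry_wait, id] # throttled: schedule a retry
    end
  end

  performed
end

timeline = simulate(30)
timeline.group_by(&:first).each do |time, jobs|
  puts "t=#{time}s: #{jobs.size} jobs performed"
end
```

With 30 jobs, this model performs 12 at t=0, 12 at t=60, and the last 6 at t=120. The throttle caps the rate but delivers it in bursts rather than one job every 5 seconds, and the gaps between bursts are set by the retry wait.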

thisIsLoading · 2024-05-20T15:45:43Z

I understand. Thank you @bensheldon

As this was just the early stages of my project, I must admit I jumped ship to Sidekiq after I found https://github.com/ixti/sidekiq-throttled, which does exactly what I needed.

I still feel good_job is doing a better job than Sidekiq, just not for this particular use case. Unfortunately I don't feel ready to contribute anything (yet), so I had to take the easy exit.

With that said, thanks a lot for doing all this and providing this gem.
