Feat: Add `OLLAMA_LOAD_TIMEOUT` env variable #4123

dcfidalgo · 2024-05-03T09:47:50Z

For certain hardware setups and models, offloading to the GPU can take a long time, and the user can hit the load timeout. This PR makes the timeout configurable via the `OLLAMA_LOAD_TIMEOUT` env variable, specified in seconds.

@dhiltgen I added a subsection in the FAQ, since I was not sure where to document the env variable. Let me know if this is the right place.
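For readers following along, here is a minimal sketch of how a seconds-valued `OLLAMA_LOAD_TIMEOUT` could be parsed. It is not the code from this PR: the helper name `parseLoadTimeout` and the constant `defaultLoadTimeout` are hypothetical, and the 10-minute default is an assumption based on the "10m" figure mentioned later in this thread. Invalid values are logged and ignored rather than treated as fatal, in the spirit of the "log instead of fail" commit.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"strconv"
	"time"
)

// defaultLoadTimeout is an assumed default (10 minutes), not taken from the PR diff.
const defaultLoadTimeout = 10 * time.Minute

// parseLoadTimeout (hypothetical helper) converts a raw value in seconds
// into a duration. Empty input returns the default silently; non-numeric
// or non-positive input logs a warning and returns the default.
func parseLoadTimeout(raw string) time.Duration {
	if raw == "" {
		return defaultLoadTimeout
	}
	secs, err := strconv.Atoi(raw)
	if err != nil || secs <= 0 {
		log.Printf("invalid OLLAMA_LOAD_TIMEOUT %q, using default %s", raw, defaultLoadTimeout)
		return defaultLoadTimeout
	}
	return time.Duration(secs) * time.Second
}

func main() {
	timeout := parseLoadTimeout(os.Getenv("OLLAMA_LOAD_TIMEOUT"))
	expiresAt := time.Now().Add(timeout)
	fmt.Printf("load timeout %s, expires at %s\n", timeout, expiresAt.Format(time.RFC3339))
}
```

With this sketch, running the program with `OLLAMA_LOAD_TIMEOUT=1800` set in the environment would yield a 30-minute deadline, while an unset or malformed value keeps the assumed default.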

bsdnet · 2024-05-03T22:14:04Z

llm/server.go

+		}
+	}
+	print(timeout)
+	expiresAt := time.Now().Add(time.Duration(timeout) * time.Second) // be generous with timeout, large models can take a while to load
 	ticker := time.NewTicker(50 * time.Millisecond)


Should we print a "loading the model" message on each tick?
Without a message or a spinner, the process appears to be stuck for 10 minutes.

UmutAlihan · 2024-05-19T23:59:19Z

This is a killer feature for users who are trying to build things using only low-end, cheap GPUs. Please review and merge this for the next release, for the sake of ordinary users.

dhiltgen · 2024-05-21T16:39:34Z

Thinking about this one more... the large timeout, and making it configurable is really more of a workaround. What we really should do is detect progress during model load and detect stalling more quickly. As long as we're making forward progress, we should let it proceed, perhaps indefinitely, but if we stall, then we should fail faster than 10m.

I've started to lay some initial foundation in #4547 which I think will help us get to a better user experience overall.
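The progress-based idea in the comment above can be illustrated with a small sketch. This is not the implementation in #4547: the function `waitForLoad`, the `progress` callback returning a value between 0.0 and 1.0, and the `errStalled` error are all hypothetical names chosen for illustration. The point is that the deadline resets whenever progress advances, so a slow but moving load can continue indefinitely, while a load that stops moving fails after `stallTimeout`.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errStalled is a hypothetical sentinel error for a load that stopped advancing.
var errStalled = errors.New("model load stalled")

// waitForLoad polls progress() every interval until it reports 1.0 or more.
// The stall deadline is pushed forward each time progress increases, so only
// a period of stallTimeout with no forward movement produces an error.
func waitForLoad(progress func() float64, interval, stallTimeout time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	last := progress()
	stallDeadline := time.Now().Add(stallTimeout)
	for last < 1.0 {
		<-ticker.C
		cur := progress()
		if cur > last {
			// Forward progress: remember it and reset the stall deadline.
			last = cur
			stallDeadline = time.Now().Add(stallTimeout)
			continue
		}
		if time.Now().After(stallDeadline) {
			return fmt.Errorf("%w at %.0f%%", errStalled, last*100)
		}
	}
	return nil
}

func main() {
	// Simulated loader that advances 25% on every poll.
	p := 0.0
	err := waitForLoad(func() float64 { p += 0.25; return p }, 10*time.Millisecond, 100*time.Millisecond)
	fmt.Println("load result:", err)
}
```

In this sketch the stall duration is a single tunable, which matches the later remark in this thread that the stall duration can be adjusted if the behavior turns out to be flaky.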

dhiltgen · 2024-05-23T21:10:59Z

Thanks for putting this together.

I think we can close this now that I've merged #4547.

Please give it a shot and if you see any flakiness we can adjust the stall duration.

dcfidalgo · 2024-05-24T05:57:13Z

Thanks for the heads-up, will definitely give it a try.

dcfidalgo added 2 commits May 2, 2024 23:32

Add env variable OLLAMA_LOAD_TIMEOUT in seconds

11bcc40

Add entry in FAQ

0905a7b

dhiltgen reviewed May 3, 2024

bsdnet reviewed May 3, 2024

sammcj mentioned this pull request May 5, 2024

Consider Using Standard Config Format #204

Open

log instead of fail

2befddf

dcfidalgo requested a review from dhiltgen May 6, 2024 07:00

dhiltgen mentioned this pull request May 23, 2024

Wire up load progress #4547

Merged

dhiltgen closed this May 23, 2024
