Description
OpenAI and the majority of LLM providers support response streaming.
To get responses sooner, you can 'stream' the completion as it is being generated. This lets you start printing or processing the beginning of the completion before the full completion is finished.
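The difference is easy to demonstrate. The sketch below simulates a streamed completion with a plain Python generator (the function names and chunk size are illustrative stand-ins, not Typesense or OpenAI code): each chunk becomes available to the caller as soon as it is produced, instead of only after the full string is done.

```python
def fake_completion_stream(text, chunk_size=8):
    # Hypothetical stand-in for an LLM API with streaming enabled:
    # yield the completion in small chunks instead of one big response.
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]

def consume_stream(stream):
    # Process each chunk as soon as it arrives; in a real UI each
    # chunk would be rendered immediately rather than buffered.
    received = []
    for chunk in stream:
        received.append(chunk)
    return "".join(received)

answer = consume_stream(fake_completion_stream("Streaming lets clients render partial output."))
```

With a real provider the shape is the same: you pass a streaming flag to the completion call and iterate over the returned chunks instead of reading one final payload.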
Expected Behavior
Allow for conversation API response streaming (as this is a built-in feature of the majority of LLM providers).
Actual Behavior
When you request a conversation response, the entire completion is generated before being sent back in a single response. If you're generating long completions, waiting for the response can take many seconds.
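If a streaming conversation endpoint were added, the most common transport for this kind of feature is Server-Sent Events, which is what OpenAI itself uses. The endpoint and wire format below are assumptions for illustration only (Typesense does not currently expose this); the sketch shows the client-side parsing a consumer would need, per the SSE convention of `data:` lines separated by blank lines:

```python
def parse_sse(raw):
    # Extract the payload of each "data:" line from a raw SSE stream.
    # Per the SSE spec, a single space after the colon is stripped,
    # but any further whitespace is part of the payload.
    events = []
    for line in raw.splitlines():
        if line.startswith("data:"):
            events.append(line[len("data:"):].removeprefix(" "))
    return events

# Hypothetical raw response body from a streaming conversation endpoint.
chunks = parse_sse("data: Hel\n\ndata: lo, \n\ndata: world\n\n")
reply = "".join(chunks)
```

A client would read the HTTP response incrementally, feed each arriving line through this kind of parser, and append the payloads to the visible answer as they come in.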
Metadata
Typesense Version: 26.0
Bajocode changed the title from "Streaming Conversational Responses" to "[Feature Request] Streaming Conversational Responses" on May 18, 2024.
Voting for this issue: for most use cases, showing just a loader while the response is being generated is not sufficient. It makes the conversational bot almost unsuitable for production use.
Also voting for this issue. I presented our Typesense-RAG-based demo last week, and it was a little annoying for customers to wait for the long response. If they see a stream of tokens (which they are used to by now), it will look much better.