fix: close stream after upstream ends to prevent client hang#186
fix: close stream after upstream ends to prevent client hang#186octo-patch wants to merge 1 commit into
Conversation
…qlchat#176) When an LLM provider (e.g. Ollama native API) does not send a [DONE] sentinel at the end of the stream, the ReadableStream in the API handler was never explicitly closed. This left the client's reader.read() loop waiting indefinitely, causing the UI to appear stuck. Two changes: - chat.ts: call controller.close() after exhausting the upstream body, wrapped in try/catch to silently ignore the case where [DONE] already closed it. Also guard the text encoding so undefined delta content (the final finish_reason chunk) is not encoded. - ConversationView: wrap the stream-reading loop in try/catch so that any stream error marks the message as FAILED instead of leaving it stuck in LOADING state. Co-Authored-By: Octopus <liyuan851277048@icloud.com>
There was a problem hiding this comment.
Pull request overview
Fixes a streaming hang where the backend ReadableStream could remain open if an upstream provider ends the response without sending the SSE data: [DONE] sentinel, and improves frontend handling when the stream errors so the UI doesn’t remain stuck in LOADING.
Changes:
- Backend: skip encoding/enqueueing empty delta chunks; ensure the response stream is closed after the upstream body is fully consumed.
- Frontend: wrap the stream-reading loop in
try/catchand mark the assistant message asFAILEDon read errors.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/pages/api/chat.ts |
Closes the server stream after exhausting upstream chunks; avoids encoding undefined/empty deltas. |
src/components/ConversationView/index.tsx |
Adds error handling around reader.read() loop to prevent indefinite LOADING state on stream errors. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Ensure the stream is closed after all upstream chunks are consumed. | ||
| // Some providers (e.g. Ollama native API) may not send a [DONE] sentinel, | ||
| // which would leave the ReadableStream open and the client hanging indefinitely. | ||
| try { | ||
| controller.close(); |
There was a problem hiding this comment.
TextDecoder.decode(chunk) is used without { stream: true } while iterating arbitrary Uint8Array chunks. If a multibyte UTF-8 code point is split across chunks, decoding can introduce replacement characters and potentially break SSE parsing/JSON parsing. Consider decoding in streaming mode (decoder.decode(chunk, { stream: true })) and flushing once at the end (before closing) to handle boundary-split characters correctly.
| const { value, done: readerDone } = await reader.read(); | ||
| if (value) { | ||
| const char = decoder.decode(value); | ||
| if (char) { | ||
| assistantMessage.content = assistantMessage.content + char; |
There was a problem hiding this comment.
TextDecoder.decode(value) is called without { stream: true } inside a loop reading arbitrary byte chunks. This can corrupt multibyte UTF-8 characters that span chunk boundaries. Use streaming decode (decoder.decode(value, { stream: true })) and flush after the loop to ensure correct character reconstruction.
| try { | ||
| while (!done) { | ||
| const { value, done: readerDone } = await reader.read(); | ||
| if (value) { | ||
| const char = decoder.decode(value); | ||
| if (char) { | ||
| assistantMessage.content = assistantMessage.content + char; | ||
| messageStore.updateMessage(assistantMessage.id, { | ||
| content: assistantMessage.content, | ||
| }); | ||
| } | ||
| } | ||
| done = readerDone; | ||
| } | ||
| done = readerDone; | ||
| } catch (error) { | ||
| messageStore.updateMessage(assistantMessage.id, { | ||
| content: assistantMessage.content || "Failed to receive response. Please check your API endpoint configuration.", | ||
| status: "FAILED", | ||
| }); | ||
| return; |
There was a problem hiding this comment.
The reader lock is never released/canceled. In the error path you return inside catch, and in the success path you exit the loop without calling reader.releaseLock() (or reader.cancel() when failing). Wrapping the read loop in try/catch/finally and releasing/canceling the reader in finally will avoid leaving the stream locked and can help ensure the underlying connection is torn down promptly on errors.
Fixes #176
Problem
When an LLM provider (e.g. Ollama) does not send a
data: [DONE]sentinel at the end of the SSE stream, theReadableStreamcreated insrc/pages/api/chat.tswas never explicitly closed. Thecontroller.close()call was only reachable inside the SSE event handler whendata === "[DONE]", so providers that omit this terminator left the stream permanently open. On the client side, thereader.read()loop inConversationViewthen blocked indefinitely, causing the UI to appear frozen with the message stuck inLOADINGstate.Two additional issues accompanied this:
delta(finish_reason: "stop"),delta?.contentisundefined. PassingundefinedtoTextEncoder.encode()can produce unexpected bytes in some runtimes instead of being a no-op.controller.error()), the unhandled promise rejection in the frontend left the message permanently stuck inLOADING.Solution
src/pages/api/chat.tsfor awaitloop that exhausts the upstream response body, callcontroller.close()inside atry/catch. The catch silently ignores the case where[DONE]already closed the controller.encoder.encode(text)/controller.enqueue(queue)calls behind anif (text)check so empty-delta chunks are skipped cleanly.src/components/ConversationView/index.tsxwhile (!done)stream-reading loop in atry/catch. If the stream signals an error, the assistant message is markedFAILEDwith an explanatory message rather than remaining stuck inLOADING.Testing
[DONE]path still closes the stream normally; the post-loopcontroller.close()is a no-op (caught and ignored).[DONE], the stream now closes cleanly once all upstream bytes are consumed.