Skip to content

fix: close stream after upstream ends to prevent client hang#186

Open
octo-patch wants to merge 1 commit into
sqlchat:mainfrom
octo-patch:fix/issue-176-stream-not-closing
Open

fix: close stream after upstream ends to prevent client hang#186
octo-patch wants to merge 1 commit into
sqlchat:mainfrom
octo-patch:fix/issue-176-stream-not-closing

Conversation

@octo-patch
Copy link
Copy Markdown
Contributor

Fixes #176

Problem

When an LLM provider (e.g. Ollama) does not send a data: [DONE] sentinel at the end of the SSE stream, the ReadableStream created in src/pages/api/chat.ts was never explicitly closed. The controller.close() call was only reachable inside the SSE event handler when data === "[DONE]", so providers that omit this terminator left the stream permanently open. On the client side, the reader.read() loop in ConversationView then blocked indefinitely, causing the UI to appear frozen with the message stuck in LOADING state.

Two additional issues accompanied this:

  1. When the last streaming chunk has an empty delta (finish_reason: "stop"), delta?.content is undefined. Passing undefined to TextEncoder.encode() can produce unexpected bytes in some runtimes instead of being a no-op.
  2. If the stream was closed with an error (controller.error()), the unhandled promise rejection in the frontend left the message permanently stuck in LOADING.

Solution

src/pages/api/chat.ts

  • After the for await loop that exhausts the upstream response body, call controller.close() inside a try/catch. The catch silently ignores the case where [DONE] already closed the controller.
  • Guard the encoder.encode(text) / controller.enqueue(queue) calls behind an if (text) check so empty-delta chunks are skipped cleanly.

src/components/ConversationView/index.tsx

  • Wrap the while (!done) stream-reading loop in a try/catch. If the stream signals an error, the assistant message is marked FAILED with an explanatory message rather than remaining stuck in LOADING.

Testing

  • Verified with standard OpenAI streaming: the [DONE] path still closes the stream normally; the post-loop controller.close() is a no-op (caught and ignored).
  • For providers that omit [DONE], the stream now closes cleanly once all upstream bytes are consumed.

…qlchat#176)

When an LLM provider (e.g. Ollama native API) does not send a [DONE]
sentinel at the end of the stream, the ReadableStream in the API handler
was never explicitly closed. This left the client's reader.read() loop
waiting indefinitely, causing the UI to appear stuck.

Two changes:
- chat.ts: call controller.close() after exhausting the upstream body,
  wrapped in try/catch to silently ignore the case where [DONE] already
  closed it. Also guard the text encoding so undefined delta content
  (the final finish_reason chunk) is not encoded.
- ConversationView: wrap the stream-reading loop in try/catch so that
  any stream error marks the message as FAILED instead of leaving it
  stuck in LOADING state.

Co-Authored-By: Octopus <liyuan851277048@icloud.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a streaming hang where the backend ReadableStream could remain open if an upstream provider ends the response without sending the SSE data: [DONE] sentinel, and improves frontend handling when the stream errors so the UI doesn’t remain stuck in LOADING.

Changes:

  • Backend: skip encoding/enqueueing empty delta chunks; ensure the response stream is closed after the upstream body is fully consumed.
  • Frontend: wrap the stream-reading loop in try/catch and mark the assistant message as FAILED on read errors.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/pages/api/chat.ts Closes the server stream after exhausting upstream chunks; avoids encoding undefined/empty deltas.
src/components/ConversationView/index.tsx Adds error handling around reader.read() loop to prevent indefinite LOADING state on stream errors.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/pages/api/chat.ts
Comment on lines +156 to +160
// Ensure the stream is closed after all upstream chunks are consumed.
// Some providers (e.g. Ollama native API) may not send a [DONE] sentinel,
// which would leave the ReadableStream open and the client hanging indefinitely.
try {
controller.close();
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TextDecoder.decode(chunk) is used without { stream: true } while iterating arbitrary Uint8Array chunks. If a multibyte UTF-8 code point is split across chunks, decoding can introduce replacement characters and potentially break SSE parsing/JSON parsing. Consider decoding in streaming mode (decoder.decode(chunk, { stream: true })) and flushing once at the end (before closing) to handle boundary-split characters correctly.

Copilot uses AI. Check for mistakes.
Comment on lines +264 to +268
const { value, done: readerDone } = await reader.read();
if (value) {
const char = decoder.decode(value);
if (char) {
assistantMessage.content = assistantMessage.content + char;
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TextDecoder.decode(value) is called without { stream: true } inside a loop reading arbitrary byte chunks. This can corrupt multibyte UTF-8 characters that span chunk boundaries. Use streaming decode (decoder.decode(value, { stream: true })) and flush after the loop to ensure correct character reconstruction.

Copilot uses AI. Check for mistakes.
Comment on lines +262 to +281
try {
while (!done) {
const { value, done: readerDone } = await reader.read();
if (value) {
const char = decoder.decode(value);
if (char) {
assistantMessage.content = assistantMessage.content + char;
messageStore.updateMessage(assistantMessage.id, {
content: assistantMessage.content,
});
}
}
done = readerDone;
}
done = readerDone;
} catch (error) {
messageStore.updateMessage(assistantMessage.id, {
content: assistantMessage.content || "Failed to receive response. Please check your API endpoint configuration.",
status: "FAILED",
});
return;
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reader lock is never released/canceled. In the error path you return inside catch, and in the success path you exit the loop without calling reader.releaseLock() (or reader.cancel() when failing). Wrapping the read loop in try/catch/finally and releasing/canceling the reader in finally will avoid leaving the stream locked and can help ensure the underlying connection is torn down promptly on errors.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sendMessageToCurrentConversation Fetch Function Not Returning Response

2 participants