Skip to content

feat(session): expose disableSpill and maxTempDirectorySize on SessionContextBuilder#84

Open
LantaoJin wants to merge 1 commit into
apache:mainfrom
LantaoJin:feat/disk-manager-builder
Open

feat(session): expose disableSpill and maxTempDirectorySize on SessionContextBuilder#84
LantaoJin wants to merge 1 commit into
apache:mainfrom
LantaoJin:feat/disk-manager-builder

Conversation

@LantaoJin
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Rationale for this change

PR #28 added tempDirectory(String) as the only Java surface for DataFusion's RuntimeEnvBuilder disk-manager knobs. Two real gaps remain — both reachable on the Rust side, neither reachable from Java today, and neither reachable via setOption(...) because they live on RuntimeEnv construction rather than the datafusion.* config namespace (the native side already explicitly rejects datafusion.runtime.* keys, for the same reason that motivated the existing typed memoryLimit(...) / tempDirectory(...) setters):

  • No way to disable spill entirely. Useful for memory-only execution profiles or environments without writable disk.
  • No way to cap the spill volume size. A runaway sort or hash-aggregate can fill the spill disk — on a multi-tenant node, a co-tenant outage rather than just a query failure. This matches the --disk-limit flag that datafusion-cli ships.

Issue #83 originally included a third setter, tempDirectories(List<String>), but a follow-up survey of the upstream code, CLI, docs, examples, and Python binding showed multi-directory mode is a programmatic-only escape hatch on DiskManagerBuilder: not exposed by RuntimeEnvBuilder itself, not used by datafusion-cli, not in configs.md, not in any example. It can ship later if a real user files an issue with a use case; punting it keeps this PR focused on the two setters with concrete motivation.

What changes are included in this PR?

  • New setters on SessionContextBuilder:
    • disableSpill().
    • maxTempDirectorySize(long)
  • Proto: new DiskManagerOptions message (optional disabled, optional max_temp_directory_size) added to session_options.proto.
  • Native: one new arm in createSessionContextWithOptions that decodes DiskManagerOptions and routes through with_disk_manager_builder(Disabled) / with_max_temp_directory_size.

Are these changes tested?

Yes. 9 new tests in SessionContextBuilderTest

Are there any user-facing changes?

Yes, additive only — no breaking changes.

  • Two new public setters on SessionContextBuilder. Existing callers using only tempDirectory(String) see no behaviour change and produce identical wire bytes.
  • The new mutually-exclusive combination throws IllegalStateException at build() time (not at the setter call), matching the setOption runtime-key pattern.
  • No Javadoc / user-guide updates in this PR; the new setters' Javadoc carries the v1 surface description, consistent with how recent additions (e.g. cacheManager, registerObjectStore) shipped.

@LantaoJin
Copy link
Copy Markdown
Contributor Author

Follow-up (no action needed before merging this PR): the new Javadoc on disableSpill() and maxTempDirectorySize(long) describes the failure case as "a RuntimeException carrying DataFusion's resources-exhausted message" because the typed exception hierarchy from #79 (feat/typed-exceptions) isn't on this branch's base. Once #79 merges, a one-line follow-up can tighten both Javadoc references to {@link ResourcesExhaustedException} for accurate cross-linking. If #79 merges first, this PR's Javadoc can be updated during a rebase before merge instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: expose disableSpill and maxTempDirectorySize on SessionContextBuilder

1 participant