feat(substrait): add SessionContext.fromSubstrait gated behind opt-in Cargo feature#80
Open
LantaoJin wants to merge 1 commit into
Which issue does this PR close?
SessionContext.fromSubstrait #75.

Rationale for this change
`SessionContext.fromProto(byte[])` accepts only DataFusion's own `LogicalPlanNode` proto. Substrait, the cross-engine logical-plan standard that DataFusion already supports through the `datafusion-substrait` crate, has had no Java-side entry point. Embedders that compile plans elsewhere (Calcite via Isthmus, custom planners, federation hubs, integrations with other engines) had to round-trip through SQL to use the Java binding. That round-trip is lossy: source-side optimisations baked into the Substrait plan are discarded, and SQL is not always expressive enough to round-trip cleanly when plans reference extensions or function variants with no surface SQL form.

What changes are included in this PR?
This PR adds a single new entry point that mirrors the existing `fromProto` shape but consumes Substrait `Plan` bytes instead. The implementation is small (~50 LOC of JNI plus ~25 LOC on the Java side); the bulk of the diff is the test that round-trips a hand-built Substrait plan through the JNI bridge.

New public Java API on `SessionContext`: `fromSubstrait(byte[] planBytes)`, returning a `DataFrame`.

`planBytes` is a serialised `substrait.proto.Plan`. The plan is translated against this context's catalog: any tables it references must already be registered. The returned `DataFrame` is lazy and composes with the rest of the API.

The feature is default-off, so `cargo build` (and therefore `make test`, `make`, and everyone who doesn't need Substrait) stays hermetic without any new build prerequisites. Substrait support is opt-in:

| Build command | Extra build prerequisite |
| --- | --- |
| `cargo build` (default) | none |
| `cargo build --features substrait` | `protoc` on PATH |
| `cargo build --features substrait,protoc` | `cmake` on PATH |

The Java surface is unchanged either way: `SessionContext.fromSubstrait(...)` is always present. If the feature was compiled off, calls throw a clear "datafusion-jni was built without the `substrait` Cargo feature; rebuild with `--features substrait`" error from the JVM. `SessionContextSubstraitTest` detects this case and skips itself via JUnit's `Assumptions.assumeFalse(...)`, so `make test` stays green either way.

This is intentionally different from PR #60's Avro handling, which is always-on.
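For callers or tests that want to tell the "feature compiled off" case apart from a genuine plan failure, a minimal sketch of the detection step might look like the following. It is not the code in this PR: the class name `SubstraitFeatureProbe` is hypothetical, and it assumes the error text quoted above is reachable through `getMessage()` on the thrown exception or one of its causes. The exception type is not stated in the description, so the helper accepts any `Throwable`.

```java
/**
 * Hypothetical helper (not part of this PR): decides whether a failure
 * looks like the "built without the substrait Cargo feature" error.
 */
public final class SubstraitFeatureProbe {
    private SubstraitFeatureProbe() {}

    // Assumption: these fragments of the quoted error text survive in the
    // exception message regardless of how the feature name is quoted.
    private static final String FRAGMENT_A = "built without";
    private static final String FRAGMENT_B = "Cargo feature";

    // Bound the walk so a cyclic cause chain cannot loop forever.
    private static final int MAX_DEPTH = 16;

    /** Returns true if t, or any of its causes, carries the feature-off message. */
    public static boolean isFeatureOff(Throwable t) {
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < MAX_DEPTH; cur = cur.getCause()) {
            String msg = cur.getMessage();
            if (msg != null && msg.contains(FRAGMENT_A) && msg.contains(FRAGMENT_B)) {
                return true;
            }
            depth++;
        }
        return false;
    }
}
```

In a JUnit 5 test, the result could be passed to `Assumptions.assumeFalse(SubstraitFeatureProbe.isFeatureOff(e))` inside a `catch` block, so the test is reported as skipped rather than failed on a default build. Matching on message text is brittle; a dedicated exception type or capability query would be sturdier if the binding offered one.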
Are these changes tested?
Yes, 7 new tests in `SessionContextSubstraitTest`.

Are there any user-facing changes?
Yes, purely additive. New public API: `SessionContext.fromSubstrait(byte[]) → DataFrame`.

No API removals, no deprecations, no behavior change for existing callers. The default `cargo build` does not pull in `datafusion-substrait` and adds no new build prerequisites; `SessionContext.fromSubstrait(...)` is present but throws "feature not enabled" at runtime. Users who need Substrait rebuild with `--features substrait` (and either install `protoc` or also enable the `protoc` helper feature). The native binary is unchanged in size unless the feature is opted in.

The new test-scope dependency `io.substrait:core:0.81.0` is added to the parent POM's `dependencyManagement` (with version property `substrait.java.version`) and to `core/pom.xml` in `test` scope only; it does not enter the runtime classpath of the published artifact.