feat(metrics): add SessionContext.memoryUsage and runtimeStats#85
Open
LantaoJin wants to merge 1 commit into
Open
feat(metrics): add SessionContext.memoryUsage and runtimeStats#85LantaoJin wants to merge 1 commit into
LantaoJin wants to merge 1 commit into
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Rationale for this change
Multi-tenant DataFusion deployments need two operational signals that the Java binding currently does not expose:
Per-session memory.
SessionContextBuilder.memoryLimit(...)(PR feat: configure SessionContext and RuntimeEnv via builder #28) caps the global pool, but if a tenant blows past their fair-share allocation there is no way to attribute the bytes back to a session. Without per-session attribution, fair-share scheduling, abuse detection, and OOM root-causing all fall back to runtime restart.Tokio runtime stats. The JNI library drives a single shared multi-threaded Tokio runtime in
lib.rs. Embedders that surface node-level health -- e.g. an OpenSearch_nodes/statsendpoint -- need worker count, busy time, queue depth, etc. Today they have to hand-roll a parallel native bridge.Both share an FFI snapshot pattern: read a small struct of counters across the boundary on demand. They are bundled here so the design conversation only happens once.
What changes are included in this PR?
Two new accessors on
SessionContext, two new immutable POJOs:Per-session memory tracking
native/src/memory.rsintroducesTrackingMemoryPool, a thin wrapper around anyArc<dyn MemoryPool>that interceptsgrow/try_grow/shrinkto maintain twoAtomicU64counters: total bytes currently held and the peak observed since session creation. Pool semantics (limits, eviction, spilling) are unchanged becausetry_growstill defers to the inner pool.The wrapper is layered on automatically by both
createSessionContextandcreateSessionContextWithOptions-- callers don't opt in. IfSessionContextBuilder.memoryLimit(...)configured aGreedyMemoryPoolorTrackConsumersPool, the tracker wraps that. If it didn't, the tracker wraps DataFusion's defaultUnboundedMemoryPool.Java callers can't downcast
Arc<dyn MemoryPool>back to the concrete tracker type (the trait does not requireAny), so a process-wideMutex<HashMap<jlong, Arc<TrackingMemoryPool>>>keyed by the JNI handle gives the snapshot path a way to find the right tracker. Inserted at session create, drained at session close; no extra failure modes.Per-session, not per-DataFrame. A cross-engine survey (pandas / Polars / Spark / DuckDB / DataFusion-Rust + Python) confirmed that no engine ships per-DataFrame in-flight memory accounting. What pandas/Polars expose as
memory_usage/estimated_sizeis data-at-rest sizing of materialised columns -- a different feature. Multi-tenant attribution in DataFusion is conventionally one session per tenant, which matches the OpenSearch prior art (QueryMemoryPoolkeyed offcontext_id). Per-DataFrame attribution would need a side-channel registry hooked into operator-time consumer creation; not blocked by this PR, can land later if requested.Tokio runtime metrics
native/src/runtime_metrics.rsis gated behind a default-offruntime-metricsCargo feature becausetokio-metricsrequires--cfg tokio_unstableat build time.tokio_metrics::RuntimeMonitor::intervals()is a delta iterator -- eachnext()returns metrics covering the period since the previous call -- so the module owns a single process-wideRuntimeAccumulatorthat maintains running totals for documented-monotonic fields. Snapshot (point-in-time) fields (workers_count,live_tasks_count,global_queue_depth) pass through without accumulation.Build matrix:
cargo build(default)RUSTFLAGS="--cfg tokio_unstable" cargo build --features runtime-metrics--cfg tokio_unstableThe Java surface is unchanged either way --
SessionContext.runtimeStats()is always present; calls just throw a clear "datafusion-jni was built without theruntime-metricsCargo feature; rebuild the native crate withRUSTFLAGS=\"--cfg tokio_unstable\" cargo build --features runtime-metrics" error from the JVM if the feature was compiled off.SessionContextRuntimeStatsTestdetects this case and skips itself via JUnit'sAssumptions.assumeFalse(...), somake teststays green either way.A new
make native-runtime-metricstarget makes the opt-in build a one-liner.This is intentionally similar to PR #75's
substraitfeature handling: a heavy / build-prereq-bearing dependency stays out of the default build, the Java surface is unchanged, and a feature-off compile substitutes a stub handler that throws clearly.Are these changes tested?
Yes -- 9 new tests across
SessionContextMemoryUsageTestandSessionContextRuntimeStatsTest.Are there any user-facing changes?
Yes -- purely additive. New public API:
org.apache.datafusion.MemoryUsage(immutable value class)org.apache.datafusion.RuntimeStats(immutable value class)SessionContext.memoryUsage() -> MemoryUsageSessionContext.runtimeStats() -> RuntimeStatsNo API removals, no deprecations, no behavior change for existing callers. The default
cargo builddoes not pull intokio-metricsand adds no new build prerequisites.SessionContext.memoryUsage()is always available;runtimeStats()is present but throws "feature not enabled" at runtime unless rebuilt with the feature.