Skip to content

Silent bootstrapping.#6241

Open
ryzhyk wants to merge 1 commit into
mainfrom
issue6090
Open

Silent bootstrapping.#6241
ryzhyk wants to merge 1 commit into
mainfrom
issue6090

Conversation

@ryzhyk
Copy link
Copy Markdown
Contributor

@ryzhyk ryzhyk commented May 14, 2026

Fix #6090.

Add ?silent_bootstrapping argument to /start and /approve calls, which disables
output connectors during bootstrapping. The primary use case for this is
bootstrapping the pipeline after a Feldera upgrade where the contents of views
is not expected to change.

Compatibility:

  • Pipeline manager: the pipeline.bootstap_policy column is now used to store
    both attributes related to bootstrapping: bootstrap_policy and
    silent_bootstrapping as a JSON object. This appears more future-proof
    than adding new columns to the DB. Backward compatibility is maintained by
    supportint deserialization from the old format.
  • Pipeline: the pipeline executable takes an optional --silent-bootstapping command
    line argument. This argument is only passed when silent bootstrapping is
    requested. If the user tries to start a pipeline compiled with an old runtime
    with silent bootstrapping, the pipeline will fail to start. Otherwise both old
    and new pipelines should work fine.

Describe Manual Test Plan

Checklist

  • Unit tests added/updated
  • Integration tests added/updated
  • Documentation updated
  • Changelog updated

Breaking Changes?

Mark if you think the answer is yes for any of these components:

Describe Incompatible Changes

@ryzhyk ryzhyk force-pushed the transactional_bootstrapping branch from 7148c81 to dffc5b1 Compare May 15, 2026 00:20
Base automatically changed from transactional_bootstrapping to main May 15, 2026 21:20
@ryzhyk ryzhyk marked this pull request as ready for review May 16, 2026 22:18
@ryzhyk ryzhyk requested a review from snkas May 16, 2026 22:19
@ryzhyk
Copy link
Copy Markdown
Contributor Author

ryzhyk commented May 16, 2026

@snkas , could you check changes to the pipeline manager?

@ryzhyk ryzhyk requested a review from gz May 16, 2026 22:20
Copy link
Copy Markdown

@mythical-fred mythical-fred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Draft, so high-level only. Overall shape is sound — BootstrapConfig as a single deployment-scoped config object is the right move once you have a second knob alongside policy, and the legacy-string fallback in parse_string_as_bootstrap_config makes the DB migration cleanly backwards-compatible.

Design things worth thinking through before this comes out of draft:

  1. #[serde(flatten)] bootstrap_config: Option<BootstrapConfig> in ExtendedPipelineDescr / ExtendedPipelineDescrMonitoring is subtle — serde-flatten on an Option interacts awkwardly with utoipa OpenAPI generation and is hard to evolve (adding a third field later cannot be cleanly deprecated because flatten + Option hides which subset of fields was present in the JSON). Prefer either a non-Option BootstrapConfig with a Default sentinel, or accept the nested object explicitly in the JSON shape.

  2. bootstrap_config_to_string does serde_json::to_string(&bootstrap).unwrap(). Fine in practice for a small Copy struct, but please add a one-line comment justifying the unwrap so a future contributor doesn't try to surface the error through call sites that can't return one. Also: the matching DBError::InvalidBootstrap(s) is now overloaded — 'unknown legacy string' and 'malformed JSON config' both reach it, and the operator-facing error message can't tell the user which case they hit. Worth splitting.

  3. post_pipeline_approve used to record request.query_string() for the event tracker; the new code records "silent_bootstrap=true" or empty. Any future /approve query param will silently disappear from audit/event tracking. Reconstructing from the typed ApproveParameters (or keeping the raw string and additionally branching on the parsed value) would be more robust.

  4. BootstrapConfig::from(p).with_silent_bootstrap(args.silent_bootstrap) is a bit awkward as a builder for a two-field Copy struct. A plain constructor BootstrapConfig::new(policy, silent_bootstrap) would let the three call sites in server.rs read top-to-bottom without the builder dance.

  5. Cross-version compatibility: the PR description acknowledges that an old runtime binary started with --silent-bootstrap will fail. Please make sure that failure surfaces a useful error in the runner logs (clap's 'unexpected argument' message is fine if it's captured); otherwise the manager should pre-check the runtime version before passing the flag so users don't get a cryptic exit code.

  6. Per workspace rule on enterprise/OSS divergence: test_bootstrap_enterprise is @enterprise_only and exercises the new additive silent_bootstrap knob — that's the right shape. When you finalize, please double-check no helper paths it touches behave differently between OSS and enterprise on existing endpoints.

Will do a full pass once it's out of draft.

Copy link
Copy Markdown
Contributor

@snkas snkas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the disadvantage of adding bootstrap configuration to the runtime configuration? It seems to be more about the behavior of the pipeline at runtime, rather than how to start it.

How does (silent) bootstrapping interact with fault tolerance? E.g., at-least-once fault tolerance or exactly-once fault tolerance?

Is silent bootstrapping not a feature to have always enabled? When does a user want to have intermittent output?

Comment thread crates/pipeline-manager/src/db/types/pipeline.rs
Comment thread crates/pipeline-manager/src/db/types/pipeline.rs Outdated
Comment thread crates/pipeline-manager/src/db/types/pipeline.rs Outdated
Comment thread crates/pipeline-manager/src/db/types/pipeline.rs Outdated
pub deployment_runtime_desired_status: Option<RuntimeDesiredStatus>,
pub deployment_runtime_desired_status_since: Option<DateTime<Utc>>,
pub bootstrap_policy: Option<BootstrapPolicy>,
#[serde(flatten)]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the other complicated fields are serde_json::Value, because if they are every backward incompatibly changed, the request to GET pipelines will fail and nothing will show in the Web Console. This seems to take a similar route, where BootstrapConfig is allowed to be expand even more in the future -- if it ever fails to be deserialized, pipeline retrieval will break.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, this is why this location needs the flatten attribute, to maintain compatibility with clients, including webconsole.

pub deployment_runtime_desired_status_since: Option<DateTime<Utc>>,
pub bootstrap_policy: Option<BootstrapPolicy>,
#[serde(flatten)]
pub bootstrap_config: Option<BootstrapConfig>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the effect in the future of flattening things will be here. In general, using #[serde(flatten)] is not a good idea and has in the past made debugging a lot harder, e.g., that a BTreeMap<String, String> is being flattened inside the Kafka configuration. Another option would be to keep bootstrap_policy, and add a field bootstrap_config that is also used to populate it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, flattening is a slippery feature, but I believe the way it is used here is the safe and intended way to use it according to serde docs, and based on my experience. I'll double check if any of the instances of flattening in this PR are unnecessary.

@ryzhyk
Copy link
Copy Markdown
Contributor Author

ryzhyk commented May 18, 2026

What's the disadvantage of adding bootstrap configuration to the runtime configuration? It seems to be more about the behavior of the pipeline at runtime, rather than how to start it.

Bootstrapping is purely startup behavior, similar to choosing to start the pipeline in a paused state. It must be set separately every time the pipeline is started. This is especially the case for the new option, which is something people will only want to set when restarting the pipeline after a runtime update.

How does (silent) bootstrapping interact with fault tolerance? E.g., at-least-once fault tolerance or exactly-once fault tolerance?

  • Exactly once FT is mutually exclusive with bootstrapping. A modified pipeline will refuse to replay.
  • At-least-once should work fine I think

Is silent bootstrapping not a feature to have always enabled? When does a user want to have intermittent output?

Right now I only know of one case when silent bootstrapping is useful -- when restarting the pipeline after a feldera upgrade. In other cases it can lead to data loss.

Copy link
Copy Markdown

@mythical-fred mythical-fred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Clean implementation: BootstrapConfig struct with backward-compatible DB parsing, well-documented silent_bootstrap semantics, thorough integration test covering multiple bootstrap cycles. The serde(flatten) on Option in PipelineSelectedInfoInternal is fine for backward compat but note that future BootstrapConfig fields will surface at the API top level.

@ryzhyk ryzhyk added this pull request to the merge queue May 21, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 21, 2026
Fix #6090.

Add `/silent_bootstrapping` argument to `/start` and `/approve` calls, which disables
output connectors during bootstrapping. The primary use case for this is
bootstrapping the pipeline after a Feldera upgrade where the contents of views
is not expected to change.

Compatibility:
- Pipeline manager: the `pipeline.bootstap_policy` column is now used to store
  both attributes related to bootstrapping: `bootstrap_policy` and
  `silent_bootstrapping` as a JSON object. This appears more future-proof
  than adding new columns to the DB. Backward compatibility is maintained by
  supportint deserialization from the old format.
- Pipeline: the pipeline executable takes an optional --silent-bootstapping command
  line argument. This argument is only passed when silent bootstrapping is
  requested. If the user tries to start a pipeline compiled with an old runtime
  with silent bootstrapping, the pipeline will fail to start. Otherwise both old
  and new pipelines should work fine.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
@ryzhyk ryzhyk enabled auto-merge May 21, 2026 16:07
@ryzhyk ryzhyk added this pull request to the merge queue May 21, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 21, 2026
Copy link
Copy Markdown

@mythical-fred mythical-fred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-confirming approval after rebase. Single squashed commit 9c9ef57 carries the same design I already approved at 7aa10b6: BootstrapConfig with backward-compatible DB parsing, serde(flatten) on Option<BootstrapConfig> in PipelineSelectedInfoInternal, and the multi-cycle test_silent_bootstrap_enterprise covering transmitted_records=0 during silent bootstrap, normal resumption, and the non-silent final round. Docs in modifying.md are clear about when (not) to use silent bootstrap. No dbg!/eprintln! regressions; commit message is descriptive.

Prior non-blocking nits still stand (none are gating): bootstrap_config_to_string still uses .unwrap() with no justifying comment; DBError::InvalidBootstrap is now overloaded between legacy-string and malformed-JSON cases; post_pipeline_approve audit string is hard-coded to silent_bootstrap=true/empty rather than reconstructed from the typed query, so future /approve params will need to be added there explicitly. Worth picking up in a follow-up but not blocking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[connectors] Disable output endpoints during bootstrapping.

4 participants