fix: Disable semantic_check for job table subtraction in refresh() #1383

Merged
dimitri-yatsenko merged 4 commits into maint/2.0 from fix/jobs-semantic-check-1379
Feb 5, 2026

@dimitri-yatsenko
Member

Summary

  • Fix populate(reserve_jobs=True) failing when dj.config.jobs.keep_completed=True
  • The - operator doesn't pass semantic_check=False, causing semantic matching to fail

Problem

When keep_completed=True, the refresh() method tries to find completed jobs that need to be re-pended:

success_to_repend = self.completed.restrict(key_source, semantic_check=False) - self._target

The .restrict() correctly disables semantic check, but the - operator calls .restrict(Not(x)) without semantic_check=False. This causes a lineage mismatch error:

DataJointError: Cannot join on attribute `number`: different lineages 
(TEST_.~~numbers_squared.number vs TEST_.#numbers.number)

The lineages differ because:

  • Job table's PK is defined directly in the job table (~~table.attr)
  • Target table's PK comes from a foreign key reference (#parent.attr)

Solution

Replace the - operator with explicit .restrict(Not(...), semantic_check=False):

success_to_repend = self.completed.restrict(key_source, semantic_check=False).restrict(
    Not(self._target), semantic_check=False
)
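
The failure mode described above can be reproduced in a toy model. The sketch below is not DataJoint's implementation: `ToyTable`, `LineageError`, and this simplified `Not` are hypothetical stand-ins written only to show why an operator that delegates to `restrict()` without forwarding `semantic_check` fails, while the explicit call succeeds. The lineage strings are copied from the error message above.

```python
from dataclasses import dataclass


class LineageError(Exception):
    """Stand-in for the DataJointError raised on a lineage mismatch."""


@dataclass
class Not:
    """Wraps a table to mark the restriction as negated (an antijoin)."""
    operand: "ToyTable"


@dataclass
class ToyTable:
    """Hypothetical miniature of a query expression."""
    rows: list      # list of dicts, one per row
    lineage: dict   # attribute name -> lineage tag

    def restrict(self, condition, semantic_check=True):
        negate = isinstance(condition, Not)
        other = condition.operand if negate else condition
        common = [a for a in self.lineage if a in other.lineage]
        if semantic_check:
            # Same-named attributes must share a lineage tag.
            for attr in common:
                if self.lineage[attr] != other.lineage[attr]:
                    raise LineageError(
                        f"Cannot join on attribute `{attr}`: different lineages "
                        f"({self.lineage[attr]} vs {other.lineage[attr]})"
                    )
        # Match rows purely by the values of the shared attributes.
        other_keys = {tuple(row[a] for a in common) for row in other.rows}
        kept = [
            row for row in self.rows
            if (tuple(row[a] for a in common) in other_keys) == (not negate)
        ]
        return ToyTable(kept, self.lineage)

    def __sub__(self, other):
        # The operator has no way to forward semantic_check=False.
        return self.restrict(Not(other))


completed = ToyTable(
    [{"number": 1}, {"number": 2}], {"number": "~~numbers_squared.number"}
)
target = ToyTable([{"number": 1}], {"number": "#numbers.number"})

try:
    completed - target
    operator_failed = False
except LineageError:
    operator_failed = True

# The explicit form skips the lineage comparison and matches by value only.
repend = completed.restrict(Not(target), semantic_check=False)
```

In this model the `-` form raises because the two `number` attributes carry different lineage tags, while the explicit `restrict(Not(...), semantic_check=False)` call returns the one completed key that is missing from the target.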

Fixes #1379

🤖 Generated with Claude Code

dimitri-yatsenko and others added 2 commits February 5, 2026 10:15
The `-` operator calls `.restrict(Not(x))` without passing `semantic_check=False`.
When `keep_completed=True`, the subtraction `- self._target` fails because:
- Job table's PK has lineage `~~table.attr` (defined in job table)
- Target table's PK has lineage `#parent.attr` (from foreign key)

Replace `-` operator with explicit `.restrict(Not(...), semantic_check=False)`.

Also move `Not` import to top-level (was local import).

Fixes #1379

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Regression tests for #1379 - verify that populate(reserve_jobs=True)
works correctly when keep_completed=True and add_job_metadata=True.

Tests:
- test_populate_reserve_jobs_with_keep_completed: Basic populate works
- test_populate_reserve_jobs_keep_completed_repend: Deleted results are re-pended

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
dimitri-yatsenko and others added 2 commits February 5, 2026 10:26
The experiment table is already populated by the fixture, so populate()
had nothing to do. Delete the data first to ensure there's work to be done.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The repend test was failing because experiment table has compound PK
but jobs only track subject_id (due to non-FK attribute warning).
Replace with simpler test that directly tests jobs.refresh() behavior.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@dimitri-yatsenko dimitri-yatsenko merged commit d7f6b55 into maint/2.0 Feb 5, 2026
7 checks passed
@dimitri-yatsenko dimitri-yatsenko deleted the fix/jobs-semantic-check-1379 branch February 5, 2026 16:37
dimitri-yatsenko added a commit that referenced this pull request May 19, 2026
Applies the fix that #1383 made to the Job table's antijoin in
refresh() to AutoPopulate._populate_direct's antijoin and the
progress() fallback path as well. The binary subtraction
`key_source - self` invokes QueryExpression.__sub__, which calls
.restrict(Not(...)) with semantic_check=True by default.

The semantic check does not belong here: this antijoin is a plain
set difference, not a join. It asks "which key_source rows aren't
yet in self?" Whether the same-named PK attribute carries the same
source-table lineage tag on both sides is irrelevant.
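
As a rough illustration of that set-difference view, the sketch below compares rows by primary-key values alone, with no notion of lineage. `antijoin_by_value` and the `spec_id` attribute are hypothetical names chosen for the example; this is plain Python, not DataJoint code.

```python
def antijoin_by_value(key_source, populated, primary_key):
    """Return key_source rows whose primary-key values are absent from populated."""
    done = {tuple(row[attr] for attr in primary_key) for row in populated}
    return [
        row for row in key_source
        if tuple(row[attr] for attr in primary_key) not in done
    ]


# Hypothetical keys: three candidate rows, one already populated.
key_source = [{"spec_id": 1}, {"spec_id": 2}, {"spec_id": 3}]
populated = [{"spec_id": 2}]

pending = antijoin_by_value(key_source, populated, primary_key=["spec_id"])
```

Here `pending` holds the rows for `spec_id` 1 and 3; nothing about where `spec_id` was originally declared enters the comparison.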

Where it bites: dj.Imported / dj.Computed tables whose primary key
is fully inherited from a single FK, with no own-table PK attributes.
On those, self.proj() returns the PK attribute with lineage=None
(or pointing to self rather than the FK parent), while key_source's
matching attribute carries the parent's lineage tag. The
semantic-check fails with:

    Cannot join on attribute 'X': different lineages
    (schema.parent.X vs None). Use .proj() to rename one of the
    attributes.

This pattern is legitimate ("one row downstream per parent row,
no intermediate ID") but rare in typical Elements / SciOps pipelines,
which extend the inherited PK with own-table attributes (trial_id,
experiment_id, etc.) that anchor proj()'s lineage. That's why the
existing #1405 test suite didn't surface it.

Changes:
- src/datajoint/autopopulate.py
  - Import Not from .condition at module top.
  - _populate_direct: replace `(LHS - self.proj())` with
    `LHS.restrict(Not(self.proj()), semantic_check=False)`.
  - progress(): same swap on the no-common-attrs fallback branch.
- tests/integration/test_autopopulate.py
  - New test_populate_antijoin_fk_inherited_pk regression test:
    Spec(Manual) -> Item(Imported with only -> Spec), the minimal
    shape that triggers the bug. Without the fix, Item.populate()
    raises DataJointError; with the fix, it populates correctly,
    progress() reports correct counts, and partial-then-full
    populate works.

Stacked on top of #1452 (the secrets-loading + dead-code fix); rebase
to master after that lands.