fix: bound QuestionHistory per-entry known-answer payload#1755
Conversation
`QuestionHistory._history[question] = (now, known_answers)` stores the incoming known-answers set by reference. The dict's entry count is capped at `_MAX_QUESTION_HISTORY_ENTRIES = 10000`, but each entry's set was unbounded. `QueryHandler.async_response` builds the set from the union of every TC-deferred fragment's answers — up to `_MAX_DEFERRED_PER_ADDR = 16` packets x ~750 records each — so a LAN peer sustaining TC-fragmented queries with large known-answer lists can pin hundreds of MB across the `_MAX_QUESTION_HISTORY_ENTRIES` dimension. The records never enter the DNS cache, so `_MAX_CACHE_RECORDS` does not bound this path. `add_question_at_time` now drops the insert when the known-answers set exceeds `_MAX_KNOWN_ANSWERS_PER_HISTORY_ENTRY = 256` (well above any RFC-realistic single-question known-answer list). Truncating to a subset was considered and rejected: `suppresses()` returns True when the stored set is a subset of the incoming known-answers, so a smaller stored set matches more easily and would over-suppress legitimate follow-up queries — the conservative direction is to forgo suppression for the oversized query, not to retain a partial snapshot. Any pre-existing smaller entry for the same question key is left untouched. The new constant is `cdef unsigned int` in `_history.pxd` so the bound check compiles to a direct C integer compare.
|
@bluetoothbot review |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #1755 +/- ##
=======================================
Coverage 99.77% 99.77%
=======================================
Files 33 33
Lines 3497 3500 +3
Branches 489 490 +1
=======================================
+ Hits 3489 3492 +3
Misses 5 5
Partials 3 3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This pull request closes a memory-retention DoS vector in duplicate-question suppression by adding a per-entry size bound on the stored known-answer set in QuestionHistory. It prevents attacker-supplied, TC-fragment-amplified known-answer payloads from being retained in _history while preserving existing suppression behavior for normal-sized payloads.
Changes:
- Introduces
_MAX_KNOWN_ANSWERS_PER_HISTORY_ENTRY = 256as a hard per-entry bound for retained known-answers. - Updates
QuestionHistory.add_question_at_time()to drop (not truncate) oversized known-answer payloads to avoid over-suppression. - Adds targeted unit tests covering oversized-drop behavior, preservation of existing entries, and the boundary-at-cap case.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
src/zeroconf/const.py |
Adds the per-entry known-answer cap constant with rationale. |
src/zeroconf/_history.py |
Enforces the per-entry cap by ignoring oversized known-answer inserts. |
src/zeroconf/_history.pxd |
Declares the new constant for the Cython-optimized hot path. |
tests/test_history.py |
Adds coverage for oversized known-answer handling and boundary behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Summary
QuestionHistory._history[question] = (now, known_answers)stores the incoming known-answers set by reference. The dict's entry count is capped at_MAX_QUESTION_HISTORY_ENTRIES = 10000(PR #1733), but each entry's set was unbounded.QueryHandler.async_responsebuilds the set from the union of every TC-deferred fragment's answers; up to_MAX_DEFERRED_PER_ADDR = 16packets x ~750 records each, so a LAN peer sustaining TC-fragmented queries with large known-answer lists can pin hundreds of MB across the_MAX_QUESTION_HISTORY_ENTRIESdimension. The records never enter the DNS cache, so_MAX_CACHE_RECORDSdoes not bound this path either.Fixes #1754.
Details
_MAX_KNOWN_ANSWERS_PER_HISTORY_ENTRY = 256(src/zeroconf/const.py); well above any RFC-realistic single-question known-answer list; cdef'd in_history.pxdso the bound check is a direct C integer compare under Cython.QuestionHistory.add_question_at_time(src/zeroconf/_history.py) drops the insert whenlen(known_answers) > _MAX_KNOWN_ANSWERS_PER_HISTORY_ENTRY. Any pre-existing smaller entry for the same question key is left untouched, so legitimate suppression continues to work even while the attacker hammers the same question with oversized payloads.suppresses()returnsnot previous_known_answers - known_answers, i.e., it suppresses when the stored set is a subset of the incoming known-answers. Truncating the stored set to a smaller subset would make the subset relation hold more easily, which means over-suppression — we would drop responses that the legitimate querier was entitled to. The conservative direction under uncertainty is to forgo suppression for the one oversized query (cost: one extra response), not to retain a partial snapshot (cost: silent suppression of legitimate follow-ups).Test plan
tests/test_history.py:test_question_history_oversized_known_answers_dropped: a set of_MAX_KNOWN_ANSWERS_PER_HISTORY_ENTRY + 1known-answers does not appear in_history.test_question_history_oversized_preserves_existing_entry: a pre-existing 2-record entry survives an oversized follow-up call for the same question; the stored set object is unchanged.test_question_history_at_cap_known_answers_is_stored: a set of exactly_MAX_KNOWN_ANSWERS_PER_HISTORY_ENTRYrecords is retained (boundary check).REQUIRE_CYTHON=1 poetry install --only=main,dev && poetry run pytest tests/ --no-cov --timeout=60: 384 passed, 3 skipped.SKIP_CYTHON=1 poetry run pytest tests/ --no-cov --timeout=60: 384 passed, 3 skipped.poetry run pre-commit runon touched files: ruff, ruff format, mypy, codespell, flake8, cython-lint all green.cython -aconfirms the bound compare lands as__pyx_t_8 = (__pyx_t_7 > __pyx_v_8zeroconf_8_history__MAX_KNOWN_ANSWERS_PER_HISTORY_ENTRY)(pure C int compare against the cdef'dunsigned int).