DynamoDB tags may be removed after repeated store.apply() due to asynchronous UntagResource / TagResource operations #6418

@parodin

Expected Behavior

When store.apply() is executed multiple times against a Feast registry configured with DynamoDB as the online store, table tags should remain stable and idempotent.

Repeated executions of:

store.apply([...])

should not modify the final tag state when no configuration changes have occurred.

The DynamoDB tables should consistently retain the configured tags after every execution.

Current Behavior

Feast's DynamoDB online store implementation updates tags by:

  1. Reading existing tags (ListTagsOfResource)
  2. Removing all existing tags (UntagResource)
  3. Recreating all configured tags (TagResource)
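
For readers who want the call sequence spelled out, the three steps above can be paraphrased as the sketch below. It is not Feast's source code: the function name replace_all_tags and the guards on empty lists are assumptions made for illustration, and `client` is assumed to be a boto3 DynamoDB client or any object exposing the same three methods.

```python
from typing import Any, Dict, List


def replace_all_tags(client: Any, table_arn: str, new_tags: List[Dict[str, str]]) -> None:
    """Delete-then-recreate tag update (hypothetical paraphrase of the steps above)."""
    # 1. Read existing tags.
    current = client.list_tags_of_resource(ResourceArn=table_arn).get("Tags", [])

    # 2. Remove all existing tags. The call returns before the removal is visible.
    if current:
        client.untag_resource(
            ResourceArn=table_arn,
            TagKeys=[tag["Key"] for tag in current],
        )

    # 3. Recreate the configured tags immediately, with no wait in between.
    if new_tags:
        client.tag_resource(ResourceArn=table_arn, Tags=new_tags)
```

Steps 2 and 3 are issued back to back, which is the window in which the behavior described next can occur.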

However, DynamoDB documents both TagResource and UntagResource as asynchronous and eventually consistent operations.

Because Feast invokes TagResource immediately after UntagResource without waiting for propagation, repeated executions can leave tables without tags.

Observed behavior:

apply #1 -> tags present
apply #2 -> tags missing
apply #3 -> tags present
apply #4 -> tags missing

Adding a delay between UntagResource and TagResource eliminates the issue.

Steps to reproduce

Feast configuration

Configure DynamoDB as the online store with table tags enabled:

online_store:
  type: dynamodb
  region: eu-west-1

  tags:
    project: test-fstore

Apply definitions repeatedly

store.apply([...])

Observe tag state

Run repeatedly:

aws dynamodb list-tags-of-resource \
  --resource-arn <table-arn>

Tags may disappear after an apply execution even though Feast issued a successful TagResource request.
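
The same check can be scripted from Python, which makes it easier to compare the tag state after each apply. The sketch below is a minimal helper, assuming `client` is a boto3 DynamoDB client; the name read_tags is hypothetical. It follows NextToken so that tables with many tags are read completely.

```python
from typing import Any, Dict


def read_tags(client: Any, table_arn: str) -> Dict[str, str]:
    """Return all tags on a table as a {key: value} dict, following pagination."""
    tags: Dict[str, str] = {}
    kwargs = {"ResourceArn": table_arn}
    while True:
        page = client.list_tags_of_resource(**kwargs)
        for tag in page.get("Tags", []):
            tags[tag["Key"]] = tag["Value"]
        token = page.get("NextToken")
        if not token:
            return tags
        # Request the next page on the following iteration.
        kwargs["NextToken"] = token
```

With boto3 installed, `client` would typically be created with boto3.client("dynamodb", region_name="eu-west-1"). Calling read_tags(client, "<table-arn>") a few seconds after each store.apply([...]) and comparing the result with the configured tags should show whether the tags survived that run.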

Additional debugging

Instrumenting _update_tags() shows:

current_tags = dynamodb_client.list_tags_of_resource(...)

dynamodb_client.untag_resource(...)

# immediately after untag
list_tags_of_resource() -> still returns previous tags

time.sleep(10)

list_tags_of_resource() -> returns []

dynamodb_client.tag_resource(...)

# immediately after tag
list_tags_of_resource() -> still returns []

time.sleep(10)

list_tags_of_resource() -> returns expected tags

This matches DynamoDB documentation stating that tag operations are eventually consistent.

Specifications

  • Version: Feast 0.63.0
  • Online Store: DynamoDB

Possible Solution

The DynamoDB online store implementation should not assume that UntagResource has completed when the API call returns.

Potential fixes:

Option 1

Wait until tag removal is visible before issuing TagResource.

Example:

untag_resource(...)
# Poll until the removal is visible; a real implementation also needs a timeout.
while list_tags_of_resource(...)["Tags"]:
    time.sleep(1)
tag_resource(...)
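
A bounded version of that wait might look like the sketch below. It is an illustration under assumptions, not a proposed patch: the name wait_for_tags and its parameters are hypothetical, `client` is assumed to be a boto3 DynamoDB client, and only the first page of ListTagsOfResource results is inspected.

```python
import time
from typing import Any, Callable, Dict


def wait_for_tags(
    client: Any,
    table_arn: str,
    expected: Dict[str, str],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll ListTagsOfResource until the visible tags equal `expected`.

    Returns True once they match, False if `timeout_s` elapses first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        response = client.list_tags_of_resource(ResourceArn=table_arn)
        visible = {t["Key"]: t["Value"] for t in response.get("Tags", [])}
        if visible == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

The intended use would be wait_for_tags(client, arn, {}) after untag_resource and wait_for_tags(client, arn, desired) after tag_resource. Because reads are themselves eventually consistent, a single matching read is a strong hint rather than a guarantee, and the timeout keeps a slow propagation from blocking apply indefinitely.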

Option 2

Avoid deleting and recreating all tags.

Instead:

  • Compute the tag diff
  • Add/update only changed tags
  • Remove only obsolete tags

This would avoid the race condition entirely and reduce API calls.
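
A minimal sketch of the diff-based approach follows. The names compute_tag_diff and apply_tag_diff are hypothetical, tags are represented as plain {key: value} dicts, and `client` is assumed to be a boto3 DynamoDB client.

```python
from typing import Any, Dict, List, Tuple


def compute_tag_diff(
    current: Dict[str, str], desired: Dict[str, str]
) -> Tuple[List[Dict[str, str]], List[str]]:
    """Return (tags_to_set, keys_to_remove) needed to turn `current` into `desired`."""
    # Tags that are new or whose value changed.
    tags_to_set = [
        {"Key": key, "Value": value}
        for key, value in sorted(desired.items())
        if current.get(key) != value
    ]
    # Tags that exist on the table but are no longer configured.
    keys_to_remove = sorted(key for key in current if key not in desired)
    return tags_to_set, keys_to_remove


def apply_tag_diff(
    client: Any, table_arn: str, current: Dict[str, str], desired: Dict[str, str]
) -> None:
    """Issue only the tag calls that are actually needed."""
    tags_to_set, keys_to_remove = compute_tag_diff(current, desired)
    if keys_to_remove:
        client.untag_resource(ResourceArn=table_arn, TagKeys=keys_to_remove)
    if tags_to_set:
        client.tag_resource(ResourceArn=table_arn, Tags=tags_to_set)
```

When the configuration has not changed, both lists are empty and no tag API call is made, so a repeated apply leaves the tags untouched. When a tag changes, the keys passed to UntagResource and TagResource never overlap, so the two calls do not compete over the same tag.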

Relevant AWS documentation

AWS DynamoDB documentation for UntagResource:

UntagResource is an asynchronous operation.

The application or removal of tags using TagResource and UntagResource APIs is eventually consistent.

ListTagsOfResource API will only reflect the changes after a few seconds.

This behavior appears incompatible with the current implementation that performs UntagResource immediately followed by TagResource without waiting for propagation.
