Skip to content

fix(node): preserve original write-decline error and roll back local cache#3678

Merged
Apollon77 merged 3 commits intomainfrom
fix/transaction-preserve-original-commit2-error
May 1, 2026
Merged

fix(node): preserve original write-decline error and roll back local cache#3678
Apollon77 merged 3 commits intomainfrom
fix/transaction-preserve-original-commit2-error

Conversation

@Apollon77
Copy link
Copy Markdown
Collaborator

@Apollon77 Apollon77 commented May 1, 2026

Summary

When a remote Matter device declines a write, two things were going wrong on the client side:

  1. The original StatusResponseError (carrying the device's status code) was hidden inside a generic FinalizationError produced by transaction commit phase 2, so callers of setStateOf / agent.state.x = … / endpoint.set had no way to detect or react to the device's specific failure.
  2. The local client cache had already accepted the value the user attempted to write. Subscriptions only deliver server-side changes, so the cache stayed divergent from the server until the process restarted.

This PR fixes both — in two commits so the contract change in commit phase 2 is reviewable independently from the cache-rollback work.

Commit 1 — fix(general): surface original commit-phase-2 errors to transaction caller

Tx#executeCommit2 no longer wraps the participant error in a generic FinalizationError. throwIfErrored rethrows the original error when a single participant fails, and aggregates with MatterAggregateError when multiple do. The rollback path's mislabeled "in commit phase 2" message is corrected to "during rollback".

TransactionTest covers the new contract directly (single-error rethrow + multi-error aggregate). End-to-end ClientNodeTest cases cover single decline, partial decline (1 of 2 cross-cluster writes), and full decline. A server-side regression test locks in the existing preCommit-phase-1 path.

Commit 2 — feat(node): roll back local client cache on declined remote writes

  • RemoteWriter accepts an optional onFailure(failures: WriteResult.AttributeStatus[]) callback that fires before the write rejects.
  • RemoteWriteParticipant captures a per-cluster snapshot (previousValues, writtenValues, compensator) at every set() call. On commit2 failure it groups failures by endpoint+cluster and invokes the compensator with the failed attribute IDs.
  • DatasourceCache implements the Compensator interface. compensate() restores the pre-write value via the existing externalSet path so persistence + listeners fire as if a subscription update arrived, but only when:
    • a baseline previous value was captured for the key (skip otherwise), AND
    • the current local value still equals what was just written (skip if a concurrent subscription update changed it).

Three new ClientNodeTest cases cover single decline, partial decline across two clusters, and full decline. The tests fail with expected 1 to equal 0 when compensate() is neutered, confirming they exercise compensation rather than relying on subscription delivery.

🤖 Generated with Claude Code

Apollon77 and others added 2 commits May 1, 2026 09:18
…aller

Tx#executeCommit2 used to wrap any phase-2 participant failure in a
generic FinalizationError carrying only the participant name. The
original error (e.g. a StatusResponseError returned by a remote device
declining a write) was logged but never reached the caller, so
ClientNode setStateOf / agent.state / endpoint.set callers had no way
to detect the device's status code.

throwIfErrored now rethrows the original error when a single participant
fails, and aggregates with MatterAggregateError when multiple do. The
rollback path's mislabeled "in commit phase 2" message is corrected.

Adds direct unit tests for the new contract (TransactionTest) and
end-to-end ClientNode tests covering single rejection, partial multi-
attribute rejection, and full multi-attribute rejection across two
clusters, plus a server-side regression test for the preCommit path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After PR #3676 the original StatusResponseError reaches the caller of
ClientNode setStateOf / agent.state / endpoint.set, but the local
client-side cache still held the value the user attempted to write.
Subscriptions only deliver server-side changes, so the divergence
persisted until restart.

RemoteWriteParticipant now tracks a per-cluster snapshot
(previousValues + writtenValues + compensator) per set() call. On
commit2 failure the new RemoteWriter onFailure callback delivers
per-attribute statuses; the participant groups failures by
endpoint+cluster and invokes the compensator with the failed ids.

DatasourceCache implements the compensator: it restores the pre-write
value via the existing externalSet path (so persistence + listeners
fire as if a subscription update arrived) but only when:
  - a baseline previous value was captured for the key, AND
  - the current local value still equals what was just written
    (skip restoration if a concurrent subscription update changed it).

Adds three end-to-end tests covering single decline, partial decline
across two clusters, and full decline.  The tests fail when the
compensate body is disabled, so they verify compensation rather than
relying on subscription delivery.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 1, 2026 08:07
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves client-side handling of remote write failures in matter.js by (1) preserving and surfacing the original server-provided write error to callers and (2) rolling back optimistic client cache updates when the remote device declines specific attribute writes.

Changes:

  • Update transaction commit phase 2 error propagation to rethrow the original participant error (single) or aggregate with MatterAggregateError (multiple), and fix rollback error wording.
  • Add an optional failure callback to RemoteWriter and implement write-failure compensation via RemoteWriteParticipant + DatasourceCache.
  • Add/extend tests to cover single and multi-attribute declines, partial declines, and cache rollback behavior.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
File Description
packages/node/test/node/ClientNodeTest.ts Adds end-to-end tests verifying declined-write error surfacing and client cache rollback for single/partial/full declines.
packages/node/src/storage/client/RemoteWriter.ts Adds optional onFailure hook invoked with failing attribute statuses before rejecting the write.
packages/node/src/storage/client/RemoteWriteParticipant.ts Captures pre-write snapshots per endpoint+cluster and compensates local state for declined attributes on commit2 failure.
packages/node/src/storage/client/DatasourceCache.ts Implements compensator to restore pre-write values through externalSet when remote declines and local value still matches the attempted write.
packages/general/test/transaction/TransactionTest.ts Adds unit tests covering phase-2 single-error rethrow and multi-error aggregation.
packages/general/src/transaction/Tx.ts Changes commit2/rollback error collection to retain original errors and aggregates via MatterAggregateError; fixes rollback message.

… guard

Datasource keeps user-written values by reference and integrateExternalChange
replaces a reference only when a subscription actually delivers a changed
value (its own no-op deep-equal filter strips identical re-deliveries).  The
guard in DatasourceCache.compensate therefore only needs to check `current[id]
!== writtenValues[id]` instead of running a full deep-equal on every key — the
ref check captures real concurrent updates without the extra walk.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Apollon77 Apollon77 added the automerge Set this label if the PR is ready to automatically merged after approval label May 1, 2026
@Apollon77 Apollon77 merged commit 344ee3e into main May 1, 2026
43 checks passed
@Apollon77 Apollon77 deleted the fix/transaction-preserve-original-commit2-error branch May 1, 2026 09:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automerge Set this label if the PR is ready to automatically merged after approval

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants