-
Notifications
You must be signed in to change notification settings - Fork 397
MSC4222: Adding state_after
to sync v2
#4222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
36f8310
to
87fb070
Compare
87fb070
to
55e34c9
Compare
Clients will not be able to tell when a state change happened within the timeline. This was used by some clients to | ||
render e.g. display names of users at the time they sent the message (rather than their current display name), though | ||
e.g. Element clients have moved away from this UX. This behavior can be replicated in the same way that clients dealt | ||
with messages received via pagination (i.e. calling `/messages`), by walking the timeline backwards and inspecting the | ||
`unsigned.prev_state` field. While this can lead to incorrect results, this is no worse than the previous situation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kind of usage of the timeline section should be not impacted at all.
We still send all state events (even if they might be wrong) in the timeline section so clients using those state event to render things in the timeline should be not impacted at all.
Clients will not be able to tell when a state change happened within the timeline. This was used by some clients to | |
render e.g. display names of users at the time they sent the message (rather than their current display name), though | |
e.g. Element clients have moved away from this UX. This behavior can be replicated in the same way that clients dealt | |
with messages received via pagination (i.e. calling `/messages`), by walking the timeline backwards and inspecting the | |
`unsigned.prev_state` field. While this can lead to incorrect results, this is no worse than the previous situation. | |
As before clients will not be able to tell when a state change happened within the timeline. This was used by some clients to | |
render e.g. display names of users at the time they sent the message (rather than their current display name), though | |
e.g. Element clients have moved away from this UX. | |
This MSC allows to compute the current state correctly. It does not fix the behavior for the state history in the timeline. So clients using such an approach still will face the same issues they had before this MSC. | |
A proper solution for this usecase would be to paginate back state using the `replaces_state` id. With this one can find all actual state event ids. This list of id's can be used in combination with the timeline history to filter for invalid and valid state changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess what I wanted to make clear was that clients will need to change how they calculate the "state at each event", and that that behaviour is really considered best-effort at best.
> Both responses are the same, but the client **MUST NOT** update its state with the event. | ||
|
||
|
||
## Potential issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The format of returned state in
required_state
is a list of events. This does now allow the server to indicate if a "state reset" has happened which removed an entry from the state entirely (rather than it being replaced with another event).
Does this issue from MSC4186: Simplified Sliding Sync apply to this MSC too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm, that is true. I guess there is an open question as to whether we want to try and fix that here or not 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any fix we do to communicate deleted state in SSS can easily enough be backported to v2 should it be needed. I see no reason to block the MSC on this point. On the contrary, this MSC massively reduces desync issues compared to v2 without state_after
because we are now tracking diffs against the current state and not diffs against the last timeline event. For such a small change, this MSC provides immense value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah ok but is this easy or hard to fix? If it isn't easy, then will it be harder to fix when the proposal is spec? I don't mind a compromise as long as we're not hurting ourselves later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends how you classify easy or hard :D
It's technically very easy, as you just need to have some representation of "deleted" events: we've tossed around content: {}
and content: null
and a few others like having a new key in the response JSON. It's socially difficult however as it's prone to bikeshedding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are roughly two proposals:
- Add deleted state into the list as a object with
type
,state_key
andcontent: null
(or whatever) keys, and nothing else. Since the state is deleted we don't have a proper event and so don't have any other keys. - Add a separate field for deleted state, with the same format as above.
Both are trivial server side, the issue is what is cleanest for clients. AIUI the tradeoffs are:
- Mixing deleted state in with the other state means that clients essentially need to handle multiple datatypes in the same list.
- Having a separate field will make it easy for client implementations to miss/forget implementing the extra field during implementation. From my experience, its fairly common for people not to closely read the docs when initially writing clients, and so I think this will happen more often than you might think.
Neither are particularly hard blockers, though I'm very much coming at this from a server side perspective. I have a mild preference for the first option, as it a) semantically feels more correct, and b) reduces the risk a bit of clients forgetting to implement the extra field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you intend to show if a state event in current state has been redacted? How would that work?
What about killing two birds in one stone and repeat the event with empty content. Since then it doesn't matter if a client is naive and unaware of the behavior, if they naively update their model then they'll end up with the right state.
Edit: Ok i forgot protected fields exist smh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Mixing deleted state in with the other state means that clients essentially need to handle multiple datatypes in the same list.
- Having a separate field will make it easy for client implementations to miss/forget implementing the extra field during implementation. From my experience, its fairly common for people not to closely read the docs when initially writing clients, and so I think this will happen more often than you might think.
I mean if you really want to avoid both of these ideally you could use a new type for state deltas with a property for the event and a property for the change type. In Draupnir land we call them: Introduced (for unseen state key type pair), Superseded, Redacted, and I guess you'd also need Removed1.
Not sure if it's practical or possible for you to change that now though.
Footnotes
-
It's actually a bit more complicated because we try to not keep completely redacted state events around as an optimization and try to tell people exactly how they have changed but eh it's not necessary to copy that https://github.com/Gnuxie/matrix-protection-suite/blob/main/src/StateTracking/StateChangeType.ts#L13-L64 ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any fix we do to communicate deleted state in SSS can easily enough be backported to v2 should it be needed. I see no reason to block the MSC on this point. On the contrary, this MSC massively reduces desync issues compared to v2 without
state_after
because we are now tracking diffs against the current state and not diffs against the last timeline event. For such a small change, this MSC provides immense value.
+1; I'm strongly in favour of not blocking this MSC for years until someone has time to wrangle this last nit. I don't think that landing it as-is will make it materially more difficult to fix later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be way down on the lower end of priorities but this isn't a nit. And it shouldn't be taking years to sort issues out like this. If it is, then that's a more urgent issue to solve because then things are entirely dysfunctional.
Co-authored-by: Andrew Morgan <[email protected]>
Co-authored-by: Andrew Morgan <[email protected]>
updated with the contents of `state_after`. | ||
|
||
Clients can tell if the server supports this change by whether it returns a `state` or `state_after` section in the | ||
response. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
state
isn't currently required in sync responses, but I guess state_after
will be required in all joined rooms after this change? Should probably make it more explicit that servers MUST return state_after
with an empty events
array inside
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
response. | |
response. The `state_after` property is **required** by servers that support this change. |
A team member has request SCT review on this MSC for the following points:
|
`org.matrix.msc4222.use_state_after`) query param. | ||
|
||
When enabled, the Homeserver will **omit** the `state` section in the room response sections. This is replaced by | ||
`state_after` (the unstable field name is `org.matrix.msc4222.use_state_after`), which will include all state changes between the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
`state_after` (the unstable field name is `org.matrix.msc4222.use_state_after`), which will include all state changes between the | |
`state_after` (the unstable field name is `org.matrix.msc4222.state_after`), which will include all state changes between the |
I think?
The current sync v2 API does not differentiate between state events in the timeline and updates to state, and so can | ||
cause the client's view of the current state of the room to diverge from the actual state of the room. This is | ||
particularly problematic for use-cases that rely on state being consistent between different clients. | ||
|
||
This behavior stems from the fact that the clients update their view of the current state with state events that appear | ||
in the timeline. To handle gappy syncs, the `state` section includes state events that are from *before* the start of | ||
the timeline, and so are replaced by any matching state events in the timeline. This provides little opportunity for the | ||
server to ensure that the clients come to the correct conclusion about the current state of the room. | ||
|
||
In [MSC4186 - Simplified Sliding Sync](https://github.com/matrix-org/matrix-spec-proposals/pull/4186) this problem is | ||
solved by the equivalent `required_state` section including all state changes between the previous sync and the end of | ||
the current sync, and clients do not update their view of state based on entries in the timeline. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if this gave a bit more explanation about why /v3/sync
's current behaviour is insufficient, or at least linked to some of the issues that come about as a result of the current implementation. It is likely that not all readers will be familiar with the problems.
@@ -0,0 +1,193 @@ | |||
# MSC4222: Adding `state_after` to sync v2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's really not v2, unless you happened to be here in 2015.
# MSC4222: Adding `state_after` to sync v2 | |
# MSC4222: Adding `state_after` to `/sync` |
@@ -0,0 +1,193 @@ | |||
# MSC4222: Adding `state_after` to sync v2 | |||
|
|||
The current sync v2 API does not differentiate between state events in the timeline and updates to state, and so can |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current sync v2 API does not differentiate between state events in the timeline and updates to state, and so can | |
The current [`/sync`](https://spec.matrix.org/v1.14/client-server-api/#get_matrixclientv3sync) API does not differentiate between state events in the timeline and updates to state, and so can |
This is basically the same as how state is returned in [MSC4186 - Simplified Sliding | ||
Sync](https://github.com/matrix-org/matrix-spec-proposals/pull/4186). | ||
|
||
State events that appear in the timeline section **MUST NOT** update the current state. The current state **MUST** only be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
State events that appear in the timeline section **MUST NOT** update the current state. The current state **MUST** only be | |
Clients **MUST NOT** update their view of the current state based on events that appear in the timeline section. | |
The current state view **MUST** only be |
updated with the contents of `state_after`. | ||
|
||
Clients can tell if the server supports this change by whether it returns a `state` or `state_after` section in the | ||
response. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
response. | |
response. The `state_after` property is **required** by servers that support this change. |
Since the current state of the room does not include the new state event, it's excluded from the `state_after` section. | ||
|
||
> [!IMPORTANT] | ||
> Both responses are the same, but the client **MUST NOT** update its state with the event. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
> Both responses are the same, but the client **MUST NOT** update its state with the event. | |
> Both responses are the same, other than the change of `state` to `state_after`, but the client **MUST NOT** update its state with the event. |
There are a number of options for encoding the same information in different ways, for example the response could | ||
include both the `state` and a `state_delta` section, where `state_delta` would be any changes that needed to be applied | ||
to the client calculated state to correct it. However, since | ||
[MSC4186](https://github.com/matrix-org/matrix-spec-proposals/pull/4186) is likely to replace the sync v2 API, we may as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[MSC4186](https://github.com/matrix-org/matrix-spec-proposals/pull/4186) is likely to replace the sync v2 API, we may as | |
[MSC4186](https://github.com/matrix-org/matrix-spec-proposals/pull/4186) is likely to replace the current `/sync` API, we may as |
> Both responses are the same, but the client **MUST NOT** update its state with the event. | ||
|
||
|
||
## Potential issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any fix we do to communicate deleted state in SSS can easily enough be backported to v2 should it be needed. I see no reason to block the MSC on this point. On the contrary, this MSC massively reduces desync issues compared to v2 without
state_after
because we are now tracking diffs against the current state and not diffs against the last timeline event. For such a small change, this MSC provides immense value.
+1; I'm strongly in favour of not blocking this MSC for years until someone has time to wrangle this last nit. I don't think that landing it as-is will make it materially more difficult to fix later.
MSCs proposed for Final Comment Period (FCP) should meet the requirements outlined in the checklist prior to being accepted into the spec. This checklist is a bit long, but aims to reduce the number of follow-on MSCs after a feature lands. SCT members: please check off things you check for, and raise a concern against FCP if the checklist is incomplete. If an item doesn't apply, prefer to check it rather than remove it. Unchecking items is encouraged where applicable. MSC authors, feel free to ask in a thread on your PR or in the #matrix-spec:matrix.org room for clarification of any of these points.
|
There are a couple of open threads here, but I don't think they need to block us from starting to gather FCP ticks. @mscbot fcp merge |
Team member @richvdh has proposed to merge this. The next step is review by the rest of the tagged people: Once at least 75% of reviewers approve (and there are no outstanding concerns), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for information about what commands tagged team members can give me. |
This is to allow clients to opt-in to a change of the sync v2 API that allows them to correctly track the state of the room.
Rendered
Implementations:
state_after
matrix-js-sdk#4487FCP tickyboxes