RTC: fix cursor awareness / presence bug in nested rich text elements#77673
Conversation
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message. To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
|
👋 Thanks for your first Pull Request and for helping build the future of Gutenberg and WordPress, @danluu! In case you missed it, we'd love to have you join us in our Slack community. If you want to learn more about WordPress development in general, check out the Core Handbook full of helpful information. |
|
@danluu I noticed during testing on this feature that the awareness cursor always appears in the first cell, regardless of where the cursor is actually placed in a table: table-awareness-test.movAlso around ~0:16, typing appears to make the cursor travel linearly through multiple cells. Something is off here. I pushed up a test change in 120e6e3 to test a non-first cell, which should fail. This is a valuable feature, so hopefully it just needs a bit more guidance to get working. I'll plan to complete the work here in your branch if it's not too tricky. |
|
Sounds good! I can try more fuzzing against this when the fix is in to see if we can chase out any potential remaining bugs (or, if you want, you can also try the fuzzer in https://github.com/danluu/gutenberg/tree/try/fuzz directly). |
alecgeatches
left a comment
There was a problem hiding this comment.
I think this PR is good to ship after the first-cell fix. Thank you! I'll wait a bit if @danluu has any more feedback, but I think this is solid.
|
The fuzzing setup found something that's suspicious but isn't a confirmed bug. I'm letting it run to see if it can produce what it thinks is a real repro. If you want to merge this, I think it seems fine to merge and then create a new PR if there's still an outstanding cursor issue since this does fix a number of real issues, or we can wait to see if the agentic fuzzing setup comes back with a repro of a real bug. |
|
@danluu Happy to wait for your results in case there's something else made worse here. |
|
Just as an aside, @dmsnell has mentioned wanting to set up some kind of system so the fuzzer can automatically run on PRs. A hardware company I worked for did that decades ago, in the pre-LLM days, and it worked really well. I think it will make more sense to do this in a while when more bugs are fixed since bug triage is moderately token intensive (at least as I have it set up today—I'm sure it could be done much more cheaply and efficiently if optimized), and the fuzzer has a bunch of things disabled in order to avoid producing too many similar bugs. The coverage is pretty severely reduced due to how much has to be disabled to avoid repeatedly surfacing the same bug, and the triage is still fairly token intensive as LLMs sort through possible failures, but I think this would actually be fairly cheap to run and have decent coverage if trunk were fairly clean w.r.t. bugs the fuzzer turns up, which seems like it should be doable. |
|
Here are some other issues the fuzzer "thinks" are bugs with cursors: issue-1-nested-table-cell-cursor-missing-remotely-visible-click.mp4issue-2-nested-attributekey-becomes-stale-after-row-deletion.mp4The fuzzer "thinks" the first video is an issue that's exposed by an existing test in the PR, test/e2e/specs/editor/collaboration/collaboration-nested-awareness-selection.spec.ts. e086a64 and 0e66eb5 add tests for other issues. There's a fix the AI suggested in danluu/gutenberg@danluu/rtc-issue-06-nested-awareness-pr...pr77673-cursor-awareness-fix with a long (but, IMO, not super verbose) explanation of the alleged bugs and proposed fix in https://github.com/danluu/gutenberg/blob/explanation/docs/explanations/pr-77673-cursor-awareness-fix-plan.md. This supposedly fixes 9 cursor bugs, which were allegedly introduced by a combination of #74728, #74878, #75590, #76107, #76597, #76913, #77164, #77136, and this PR. I'm not sure what the right way to handle these AI explanations is. They're quite long even when not overly verbose, but I think any report that tries to explain the history of a set of concurrency bug is going to be fairly long. When I worked at a chip company and people wrote things like this up by hand (well before LLMs existed), they could be longer on a per-bug basis. |
|
@danluu From a quick glance at the new tests, the reported bugs look likely legitimate but scoped to trickier operations like awareness cursor positioning during row deletes. I think the current PR improves cell-level awareness greatly (versus none at all), and the new Codex results focus on new edge-cases. Would you be okay to cherry-pick the newer test commits ( |
0e66eb5 to
cb1473d
Compare
|
@alecgeatches Sure, I can do that (if I'm reading your meaning correctly; I think you want those out of this PR as well?). |
Yes, thank you! We can merge this PR as-is and I'd like for the new fuzzing results from #77673 (comment) to be in a follow-up PR if possible. Thanks! |
What?
This is part of a series of bug reports and PRs from an AI fuzzing project. See #77532 for more details on that. I filing a few more of these after a discussion with @alecgeatches on how the fuzzer found a bug that I hadn't file that ended up taking some developer time to chase down and repro. This seems like one of the lower priority bugs the fuzzer has surfaced.
Repro before fix:
nested-awareness-cell-crop-repro-failing.mp4
Repro after fix:
nested-awareness-cell-crop-fix-passing.mp4
BEGIN AI GENERATED TEXT
awareness conversion bug at the
PostEditorAwareness.convertSelectionStateToAbsolute()layer.The fuzzer selection state is valid for the CRDT representation: Gutenberg stores some rich-text fields inside nested Yjs maps/arrays, for example
core/tablecells underbody[].cells[].content. The receiver should be able to resolve aY.RelativePositioninside any nestedY.Textto the containing block's local client ID.Failure
Source fuzzer:
/Users/danluu/dev/fuzz/gutenberg-try-fuzz/packages/core-data/src/awareness/test/post-editor-awareness.tsRepro branch:
danluu/realistic-browser-reproCommand:
npm run test:unit packages/core-data/src/awareness/test/post-editor-awareness.ts -- --runInBand --testNamePattern="nested rich-text"Observed on the repro branch:
1401-1405,1407, and1408fail.richTextOffsetresolves to the expected shifted offset.localClientIdisnull, expectedlocal-nested-attrs.1406passes because it picks a shallow enough target for the old fixed-depth parent lookup.The same failure reproduced directly in the
try/fuzzworktree with the original fuzzer file.Cause
convertSelectionStateToAbsolute()resolves the relative cursor position, then assumes the Yjs parent chain is:That is true for top-level rich-text attributes such as
attributes.content, but false for nested rich-text attributes such as:The old code uses
absolutePosition.type.parent?.parent, so nested targets resolve the text offset but never find the containing block. Without a local client ID, the collaborator cursor/selection cannot be anchored to a block element.Introduction
The fixed-depth parent lookup was introduced on February 23, 2026 by PR #75590, commit
d5add1a75a1(Real-time collaboration: Remove block client IDs from Awareness, fix "Show Template" view). That PR intentionally stopped carrying block client IDs in awareness state and instead resolved the local block by walking from the resolved Yjs relative position back to a block map. The text-selection path added this assumption:So the regression point for this lookup bug is PR #75590, not the original awareness-selection work.
Relevant related PRs:
blockId,blockStartId,blockEndId) alongside the Yjs relative text position, so this specific parent-walk failure did not exist yet.core/tablecell merging schema-aware on April 2, 2026 and created nested Yjs structures such asbody[].cells[].contentas nestedY.Textvalues. That did not introduce the incorrect lookup, but it made thecore/tablenested-rich-text CRDT shape that exposes the latent PR Real-time collaboration: Remove block client IDs from Awareness, fix "Show Template" view #75590 assumption.Real-vs-False-Positive Checks
Invalid oracle: ruled out. The expected local ID is resolved through the same block index path used by existing root and inner-block awareness tests. The mocked editor store contains the corresponding local block at that path.
Invalid generated shape: ruled out for the conversion layer. Nested
Y.Textvalues are a supported CRDT shape;crdt-blocksexplicitly serializes nested rich-text attributes into nested Yjs structures.Helper misuse: ruled out. The reproducer uses
Y.createRelativePositionFromTypeIndex()andY.createAbsolutePositionFromRelativePosition()on the same document, matching production selection conversion.Environment contamination and race injection: ruled out. The failure is deterministic in a single
Y.Docwith no async sync provider and no injected races.Known-fixed bug masking: ruled out. This reproduces against trunk plus the table-cell selection-path support in this workspace, but without the awareness conversion fix. The older rich-text offset fix is not involved because the offset is correct.
Lower-Level Repros
Standalone Yjs shape repro: repros/nested-awareness-parent-walk-repro.cjs
Command:
This script does not import Gutenberg. It builds both relevant Yjs parent chains and shows that
parent.parentfinds the block for a top-level rich-text attribute but returnsnullfor nested table-cell rich text. The same script also verifies that an ancestor walk reaches the containing block. It exits successfully only when that mismatch is reproduced.Production conversion unit/fuzzer repro: packages/core-data/src/awareness/test/post-editor-awareness.ts
Command:
npm run test:unit packages/core-data/src/awareness/test/post-editor-awareness.ts -- --runInBand --testNamePattern="nested rich-text"On the repro branch, the seeded nested-rich-text cases fail at
convertSelectionStateToAbsolute()withlocalClientId: null. On the fix branch, the same cases pass.Selection-state nested path unit repro: packages/core-data/src/utils/test/crdt-user-selections.ts
Command:
npm run test:unit packages/core-data/src/utils/test/crdt-user-selections.ts -- --runInBand --testNamePattern="nested rich-text attribute path"This verifies the sender-side lower layer: a block-editor selection with
attributeKey: "body.0.cells.0.content"resolves to aY.RelativePositionin the nested table-cellY.Text. That proves the receiver-side conversion bug can be reached without synthetic awareness writes once table cells expose a stable nestedRichTextidentifier.Browser Repro
Test:
test/e2e/specs/editor/collaboration/collaboration-nested-awareness-selection.spec.tsThis is a browser-level reproduction with user actions:
core/tableblock.ArrowRight.body.0.cells.0.content.On the repro branch, the test reaches the nested attribute path and then fails because no
.collaborators-overlay-user-cursorappears for user 2. That isolates the original conversion bug: the sender now produces a nestedY.Textrelative position from a realistic table-cell selection, while the receiver still cannot walk from that nestedY.Textback to the containing block.The browser reachability prerequisite is commit
3e7ae5a6997(Emit nested table cell awareness selections). It gives table-cellRichTextcontrols a stable nestedidentifierand teachesgetCursorPosition()to resolve dot-path attribute keys into nested CRDTY.Textinstances. The conversion fix is intentionally absent from the repro branch.Command used locally from
test/e2e:The alternate
wp-envconfig in.context/wp-env.e2e-8902.jsonuses port8902only because the default local8888/8889ports were already occupied by other workspaces.Failing browser artifact:
Fix Plan
Replace the fixed
parent?.parentassumption with an ancestor walk from the resolvedY.Textto the nearest Yjs block map, then reuse the existing Yjs-path-to-local-client-ID resolver. Keep whole-block selection handling unchanged.END AI GENERATED TEXT
Unlike with some of the earlier PRs, I left the playwright repro in the PR to make it easier to run, but if many of these PRs get merged, it might be a good idea to remove that from the test suite before merging the PR to avoid adding a bunch of small, high-overhead, playwright tests.
Use of AI Tools
The code here is all AI generated. As noted above, the bug finding came from an AI fuzzing project.