Encrypted Group Messaging, a Smarter Compose Box, and a Path to Shareable Conversations
Build cycle: 2026-05-13 · Revisions: r6634 — r6636
Subsystems touched: Messages dApp, MLS (RFC 9420) service layer, test infrastructure, GLink design
TL;DR
This cycle’s headline:
- End-to-end-encrypted group conversations work. Two GRIDNET OS users on different sub-identities can now create an MLS / RFC 9420 group, invite each other, send encrypted messages, and leave — with full Perfect Forward Secrecy after the leave. The integration test passes 14/14.
- Seven silent bugs in our ts-mls integration, all masquerading as “everything fine but nothing works”, are gone. The most interesting one: a missing two-byte protocol-version header that made every KeyPackage, every Welcome, and every encrypted message non-decodable. The wrapper functions silently returned null instead of throwing, so the symptoms looked like “things just don’t deliver” rather than “decryption is failing.”
- New compose flow in the Messages dApp. The old “+ New message” button popped a single-field dialog asking for a 34-character wallet address. The new one transforms the header into a live-search bar that, per keystroke, queries your Contacts list, your online peers, and the on-chain Blockchain Explorer in parallel, highlighting people who are already in your address book.
- A complete design document for GLinks — shareable deep-links that work across Messages (MLS-encrypted group joins, with optional shared history) and eMeeting (ZKP-PSK meeting rooms). Implementation rolling out across the next few cycles.
1. The MLS Story — Or, How Seven Silent Bugs Conspired to Look Like One
Last cycle we shipped the plumbing for end-to-end-encrypted group messaging under RFC 9420 (Messaging Layer Security, the IETF-standardised protocol deployed by Wire, Cisco Webex, and a growing list of others). It uses TreeKEM, a binary-tree ratchet that lets dozens of peers efficiently update group keys whenever someone joins or leaves, with O(log n) cost per operation. Cryptographic state-of-the-art, in other words.
This cycle we shipped the fixes.
The integration used the well-regarded ts-mls library (v1.6.2). When we wrote our wrapper layer around it (lib/MLSService.js), we made seven independent mistakes — each one small, each one swallowed by a catch clause, each one returning null or an empty array rather than throwing. The wrapper looked correct: methods returned, conversations rendered, the sidebar populated. But nothing actually delivered.
Here’s a flavour:
Every time we encoded an MLS message (a KeyPackage, a Welcome, an encrypted application message) we wrote a JSON envelope and asked ts-mls to serialise it. The library expects a version field. We didn’t pass one. So ts-mls dutifully looked up protocolVersions[undefined], got undefined, and encoded that as a 16-bit integer: bytes 00 00. Every message went out on the wire prefixed with “protocol version zero.” Every receiver tried to decode it, hit “version 0 is invalid,” and bailed. Returned null. Caller’s catch swallowed it. UI rendered an empty group state. Nobody saw the error.
Multiply that by seven distinct issues across the encode/decode/createCommit/proposal-type/ratchet-tree/list-members/persist surface, and you get a layer that appeared to work but produced groups that nobody could join, sent messages that nobody could decrypt, and recorded membership lists that were always empty.
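The version-header failure is easy to reproduce in isolation. The following is a minimal sketch, not the ts-mls or MLSService.js code: `PROTOCOL_VERSIONS` and every function name here are hypothetical stand-ins, and the only fact borrowed from RFC 9420 is that `mls10` is encoded as the 16-bit value 1. It shows how a missing `version` field turns into bytes 00 00, and how a wrapper can keep the failure visible instead of returning null.

```javascript
// Hypothetical version table; RFC 9420 assigns mls10 the 16-bit value 1.
const PROTOCOL_VERSIONS = { mls10: 1 };

// Naive encoder: a missing `version` makes the lookup yield undefined, and
// DataView.setUint16 coerces undefined -> NaN -> 0, so the header is 00 00.
function encodeVersionNaive(envelope) {
  const out = new Uint8Array(2);
  new DataView(out.buffer).setUint16(0, PROTOCOL_VERSIONS[envelope.version]);
  return out;
}

// Stricter encoder: refuse to emit a header for a missing or unknown version.
function encodeVersionStrict(envelope) {
  const code = PROTOCOL_VERSIONS[envelope.version];
  if (!Number.isInteger(code) || code <= 0) {
    throw new TypeError(`unknown MLS protocol version: ${String(envelope.version)}`);
  }
  const out = new Uint8Array(2);
  new DataView(out.buffer).setUint16(0, code);
  return out;
}

// The anti-pattern: the catch hides the cause and the caller only sees null.
function encodeSilently(envelope) {
  try {
    return encodeVersionStrict(envelope);
  } catch {
    return null;
  }
}

// One alternative: add context and rethrow, so the failure reaches the log.
function encodeLoudly(envelope) {
  try {
    return encodeVersionStrict(envelope);
  } catch (err) {
    throw new Error(`MLS encode failed: ${err.message}`, { cause: err });
  }
}
```

With these definitions, `encodeVersionNaive({})` yields the two zero bytes described above, while `encodeLoudly({})` throws with the underlying reason attached.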
The forensic write-up is at Tasks/MLS_GROUP_VALIDATION_2026-05-13.md. The fixes ship in r6634. What I want to highlight here, though, isn’t the what but the why-it-mattered:
Group chat is the linchpin of any social messaging product. 1-on-1 DMs are useful, but the moment you can pull three colleagues into a Project X thread, send them a draft, and edit it together over the course of a week, that’s when the product feels like Slack or Signal. RFC 9420 is what makes that experience possible without either trusting a central server with plaintext (the conventional hosted-chat model) or paying a per-member rekeying cost on every membership change (the Sender Keys model). It’s the right primitive. We needed it working.
It works now. The test:
$ node tests/flow-mls-group.mjs
✅ PASS PRE — Tab A bind
✅ PASS PRE — Tab B bind
✅ PASS G7.02 — A createGroupA returns ok with groupID
✅ PASS G7.02 — A has mls_state_snapshots row at epoch 0 + groups row
✅ PASS G7.03 — A addMemberToGroupA(B)
✅ PASS G7.03 — B processWelcome → joins group
✅ PASS G7.03 — B has groups + mls_state_snapshots rows for group
✅ PASS G7.05 — A encryptApplicationA
✅ PASS G7.05 — B decryptApplicationA recovers plaintext
✅ PASS G7.07 — B creates self-Remove proposal
✅ PASS G7.07 — A commits the leave proposal
========================================
SUMMARY — passes: 14 / fails: 0
========================================
Fourteen passes. Two sub-identities under one wallet. A real ratchet, real Welcome packets, real ciphertexts crossing tabs.
2. The Compose Box — Or, “Stop Making People Type 34-Character Strings”
The old “+ New message” flow worked like this:
- User clicks + New message.
- A dialog opens.
- The dialog has one input field: “Recipient wallet address”.
- The user pastes 1BKGXMj441G5iWuQyxS7Bpr4bxr9vAvXWy, if they’ve memorised that or are willing to switch tabs to grab it.
- The dialog closes. The conversation opens.
You can imagine the click-throughs in our metrics. This kind of flow is what makes people give up on a product the third time they try to write a message.
The new flow:
- User clicks + New message.
- The header buttons fade out. An inline search bar slides in where they were. Cursor focused.
- The user starts typing: say, Alice. Or 1B. Or Project Lead.
- Each keystroke fires three parallel searches:
  - Local Contacts cache (instant, no network)
  - PresenceManager: who’s online and authenticated right now?
  - Blockchain Explorer search (searchBlockchainA) for on-chain domain accounts, rate-limited to 1 req/s to be polite to the chain.
- Results render as a clean dropdown in three sections:
  - YOUR CONTACTS: highlighted with a green tint and a Contact pill
  - ONLINE NOW: ● Online pill in cyber-blue
  - ON-CHAIN DOMAINS: Chain pill, in case you only know someone by their friendly_id
- Each row shows an avatar with initials, a presence dot (green = authenticated and online; amber = visible but peer-auth pending; gray = offline), and the friendly name if there is one — falling back to the truncated address.
- ↑ / ↓ navigates. Enter picks. Click works. Esc cancels. Click outside also cancels.
- Click a row → conversation opens. No second dialog. No mode switch.
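The fan-out in the steps above can be sketched as follows. This is an illustration under assumptions, not the code in dApps/Messages.js: the three sources are injected as plain async functions, the row shape `{ address, name }` is invented for the example, and the rule that an address shown in an earlier section is not repeated in a later one is a guess at the intended behaviour. `makeRateLimiter` is likewise a hypothetical helper showing one way to hold the chain search to one request per second.

```javascript
// Query all three sources in parallel and bucket the rows into sections.
// `sources` is an object of async functions: { contacts, online, chain }.
async function searchAll(query, sources) {
  const names = ['contacts', 'online', 'chain'];
  const settled = await Promise.allSettled(names.map((n) => sources[n](query)));
  const sections = { contacts: [], online: [], chain: [] };
  const seen = new Set();
  settled.forEach((result, i) => {
    // A failing source should not blank the whole dropdown.
    if (result.status !== 'fulfilled') return;
    for (const row of result.value) {
      if (seen.has(row.address)) continue; // earlier section wins
      seen.add(row.address);
      sections[names[i]].push(row);
    }
  });
  return sections;
}

// Allow at most one call per `intervalMs`; `now` is injectable for testing.
function makeRateLimiter(intervalMs, now = Date.now) {
  let last = -Infinity;
  return function tryAcquire() {
    const t = now();
    if (t - last < intervalMs) return false;
    last = t;
    return true;
  };
}
```

A caller would check `tryAcquire()` before invoking the chain source and pass a function that resolves to an empty array when the limiter says no, so the local and presence results still render at typing speed.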
This is what the user asked for verbatim, and on reflection it’s what every modern messenger does: Slack, Telegram, Signal, Discord, iMessage. The “type-a-wallet-address-into-a-dialog” pattern was a holdover from when contacts didn’t exist yet and the only way to start a chat was with the raw cryptographic identifier. We have contacts now. We have friendly_ids. We have presence. The UI should leverage all of it, in one place, at typing speed.
The integration with the Contacts dApp is deliberately deep: Messages reads from the live Contacts instance’s _byAddress map first (avoiding any redundant SQL), falls back to its _contacts array, then to a read-only cross-account SQL scan of the Contacts package’s contacts table. The presence layer comes from CVMContext.getPresenceManager.getOnlinePeers(). If you and the recipient are both online and have completed peer-auth, the live path lights up green; if only their address is in presence but auth is pending, it shows amber. No surprises.
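The lookup order and the presence colours described above reduce to two small functions. This is a sketch only: the field names `visible` and `peerAuthed` on a presence record are assumptions, and the lookup tiers are passed in as plain functions rather than touching the real Contacts instance or the SQL API.

```javascript
// Map a presence record to the dot colour used in the result rows.
// Assumed shape: { visible: boolean, peerAuthed: boolean }, or undefined if unknown.
function presenceDot(peer) {
  if (!peer || !peer.visible) return 'gray'; // offline
  return peer.peerAuthed ? 'green' : 'amber'; // authenticated vs. auth pending
}

// Try each lookup tier in order and return the first hit, or null if none match.
function resolveContact(address, tiers) {
  for (const tier of tiers) {
    const hit = tier(address);
    if (hit) return hit;
  }
  return null;
}

// Example wiring with stand-ins for the three tiers named in the text.
const byAddress = new Map([['A1', { address: 'A1', name: 'Alice' }]]);
const contactsArray = [{ address: 'B2', name: 'Bob' }];
const sqlScan = (address) => (address === 'C3' ? { address: 'C3', name: 'Carol' } : null);

const tiers = [
  (a) => byAddress.get(a),
  (a) => contactsArray.find((c) => c.address === a),
  sqlScan,
];
```

Ordering the tiers from cheapest to most expensive means the SQL scan only runs when both in-memory structures miss.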
The full diff is in r6635 (dApps/Messages.js). Live across all 10 themes (Cyber, Retro Terminal, Classic 95, Glass, Light, Void, Abyss, Synthwave, Aurora, Brutalist) via the existing --gn-msg-* CSS variables. Mobile-mode at the 640 px breakpoint collapses the hint label and stretches the search bar full-width.
3. The GLink Design — Or, “Send Someone a Link, Land Them Inside an E2E-Encrypted Group”
We use GLinks today for cross-dApp deep links: https://ui.gridnet.org/?glink=BASE64_JSON_PAYLOAD. Click one and the right dApp pops up, pre-loaded with the right state. Simple deep-linking, nothing more.
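For readers who have not seen the link format, here is a minimal sketch of how a payload could be packed into and read back out of the `glink` query parameter. The function names and the payload fields are hypothetical; only the base URL and the parameter name come from the format shown above, and the real encoder may differ (for example in its choice of Base64 alphabet).

```javascript
const GLINK_BASE = 'https://ui.gridnet.org/';

// Serialise a payload object as UTF-8 JSON, Base64-encode it, and attach it
// as the `glink` query parameter.
function encodeGLink(payload) {
  const bytes = new TextEncoder().encode(JSON.stringify(payload));
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  const url = new URL(GLINK_BASE);
  // searchParams percent-encodes '+', '/' and '=' so the value survives the URL.
  url.searchParams.set('glink', btoa(binary));
  return url.toString();
}

// Reverse of encodeGLink; throws if the parameter is absent or malformed.
function decodeGLink(link) {
  const encoded = new URL(link).searchParams.get('glink');
  if (encoded === null) throw new Error('not a GLink: missing glink parameter');
  const binary = atob(encoded);
  const bytes = Uint8Array.from(binary, (c) => c.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}
```

Going through `TextEncoder` rather than calling `btoa` on the JSON string directly keeps the round trip safe for friendly names that contain non-Latin characters.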
The user’s directive this cycle:
“See if we can integrate GLinks into Messages so that a conversation (even a group conversation) could be shared with others and the required cryptographic constructs allowing others to join would be already embedded into the link. Think about reconciling conversation history between newly joining members so that they can access history as well.”
And, separately:
“Look into integrating GLinks into eMeeting too — notice that eMeeting group meetings rely on different ZKP cryptography with a derived pre-shared key — so make sure GLinks properly integrate into this concept.”
Both directives describe the same family of UX: I give you a link, you click it, you’re in the room. WhatsApp invite links, Discord invite links, Zoom meeting links. The cryptographic substrate underneath each is different — but the user expectation is the same.
Tasks/GLINK_INTEGRATION_DESIGN_2026-05-13.md (r6636) is the full design — about 6000 words across schemas, wire formats, threat model, and a 5-phase rollout. The high points worth sharing:
Three new GLink actions
| Action | dApp | Crypto layer | What the link carries |
|---|---|---|---|
| open-dm | Messages | (none beyond standard) | Sharer’s wallet address |
| join-mls-group | Messages | MLS / RFC 9420 | Group ID, inviter address, claim token, optional shared-history flag |
| join-emeeting-room | eMeeting | ZKP-PSK swarm | Room ID, swarm ID, the PSK itself, host address |
For join-emeeting-room the PSK travels in the link. That’s by design and matches the UX expectation of every existing meeting-link product on the planet: anyone you give the link to can join the call. The ZKP authentication still happens (the PSK lets the joiner prove knowledge, not be looked up), but distributing the PSK is the act of inviting.
For join-mls-group, the link does NOT carry MLS material directly — instead it carries enough for the joiner to claim their seat. Joiner clicks → publishes their own MLS KeyPackage if needed → sends a kind=29 mls-glink-claim envelope to the inviter (sealed via the swarm) → inviter’s dApp processes the claim against a pending-claim token → calls addMemberA → Welcome flows back. The joiner is in the group within a few seconds of clicking, having never typed an address.
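The inviter-side half of that flow, checking a claim against a pending-claim token before adding the member, might look roughly like this. Everything here is an assumption made for illustration: the class and field names are invented, and the single-use and expiry rules are one plausible policy rather than something the design document is known to specify. The `addMember` callback stands in for the real addMemberA call.

```javascript
// Hypothetical inviter-side bookkeeping for outstanding GLink claim tokens.
class PendingClaims {
  constructor(now = Date.now) {
    this.now = now;
    this.claims = new Map(); // token -> { groupId, expiresAt }
  }

  issue(token, groupId, ttlMs) {
    this.claims.set(token, { groupId, expiresAt: this.now() + ttlMs });
  }

  // Assumed policy: a token is bound to one group, expires, and is single-use.
  redeem(token, groupId) {
    const claim = this.claims.get(token);
    if (!claim) return { ok: false, reason: 'unknown-token' };
    if (claim.groupId !== groupId) return { ok: false, reason: 'group-mismatch' };
    this.claims.delete(token); // consumed or expired: either way it is gone
    if (this.now() > claim.expiresAt) return { ok: false, reason: 'expired' };
    return { ok: true };
  }
}

// Process an incoming claim envelope; only call addMember for a valid token.
function handleClaim(envelope, pending, addMember) {
  const verdict = pending.redeem(envelope.token, envelope.groupId);
  if (verdict.ok) addMember(envelope.groupId, envelope.joinerAddress);
  return verdict;
}
```

If links are meant to be shared with several people, the single-use rule would need to become a use counter. The point of the sketch is only that the token, not the link holder's word, gates the call that produces a Welcome.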
The History Archive Key (HAK)
This is the most interesting cryptographic piece, and the part where we diverge from textbook MLS deliberately.
The MLS specification’s whole point is that new joiners cannot read past messages. The TreeKEM ratchet advances at every commit, and the keys for epoch N are derived from secrets the joiner at epoch N+1 has never seen. Forward secrecy is a feature, not a bug. It means that if your phone is compromised tomorrow, the messages from yesterday — already deleted from your phone — cannot be recovered from a chain dump.
But product-side, users expect to scroll back. The 22-year-old who joins the work Slack on Monday wants to see what the team decided last Friday. The contractor pulled into a project channel halfway through Q3 wants to see the spec discussion from Q2. “You can’t read the past” is a cryptographic invariant we’d rather not impose on users who have a legitimate reason to share history.
So we layer a side-channel. Per group, we derive a 32-byte History Archive Key at creation time. Every outbound application message gets TWO ciphertexts stored locally — the MLS-encrypted one that flows on the swarm, AND a ChaCha20-Poly1305-sealed copy under HAK in our local DB’s archive_blob column. The HAK never enters the MLS ratchet. It travels out-of-band, sealed to each new joiner’s on-chain X25519 public key (the same primitive our offline-DM path uses), only when the group’s history_mode is 'shared' AND the joiner explicitly consents.
When a member is removed, the inviter generates a new HAK_v2 and re-seals it to every remaining member. Future archive_blobs use HAK_v2. The removed member can still decrypt messages they previously had access to (HAK_v1 doesn’t get destroyed retroactively — that would be O(N) re-encryption of history), but cannot decrypt any new message archive. This is the only honest trade-off available — and we surface it in the leave-group dialog text so users can make informed decisions.
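The access consequences of rotation are easier to see with the cryptography stripped out. The sketch below models only who holds which HAK version. It performs no encryption, the class and field names are hypothetical, and it assumes each archive blob records the HAK version it was sealed under.

```javascript
// Bookkeeping-only model of HAK versions: which members were sealed which key.
class HakKeyring {
  constructor(members) {
    this.version = 1;
    this.holders = new Map([[1, new Set(members)]]); // version -> members holding it
  }

  // On member removal: new version, sealed only to the remaining members.
  rotate(remainingMembers) {
    this.version += 1;
    this.holders.set(this.version, new Set(remainingMembers));
    return this.version;
  }

  // Tag a new archive entry with the HAK version current at write time.
  archive(messageId) {
    return { messageId, hakVersion: this.version };
  }

  // A member can open a blob only if they were sealed that blob's HAK version.
  canOpen(member, blob) {
    const holders = this.holders.get(blob.hakVersion);
    return holders !== undefined && holders.has(member);
  }
}

// Walk-through: Carol is removed between the two messages.
const ring = new HakKeyring(['alice', 'bob', 'carol']);
const before = ring.archive('m1'); // sealed under HAK v1
ring.rotate(['alice', 'bob']); // Carol removed, HAK v2 issued
const after = ring.archive('m2'); // sealed under HAK v2
```

In this model Carol can still open `before` but not `after`, which is the trade-off described above: old archive entries stay readable to anyone who held the old key, and only new entries are protected by the rotation.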
Sharing history is opt-in: one toggle in the New-group dialog. Default: off, matching privacy-first conventions. Teams who care more about scrollable onboarding than airtight forward secrecy can flip it on.
Phase-1 rollout
Phase 1, this cycle, covers the design, the test-plan additions (§7.11, §7.12, §7.13, §4.8), and the planned schema deltas. Implementation lands across the next two cycles:
- Phase 2: HAK introduction + dApp toggle.
- Phase 3: HAK delivery (kind=30 envelope) + past-message recovery on the joiner side.
- Phase 4: HAK rotation on member-remove + comprehensive negative-path tests.
- Phase 5: eMeeting GLink — separate dApp surface, much smaller code change.
4. Test Coverage Snapshot
For the first time we have a comprehensive test driver for the group surface — tests/flow-mls-group-comprehensive.mjs. Results from this cycle:
| Section | Status | Notes |
|---|---|---|
| G7.01 Bootstrap (KeyPackage gen + persist) | Verified separately; main probe hits a test-driver SQL race | |
| G7.02 Group create + chained invite | Solo create + immediate invite, single user action | |
| G7.03 Asynchronous join (online inviter) | Welcome flows, joiner’s epoch matches inviter | |
| G7.05 In-epoch messaging (text + image) | 1×1 PNG round-trips intact across the MLS encrypt | |
| G7.07.01 Committer removes someone else | Members list shrinks, epoch advances | |
| G7.07.02 Self-leave via proposal-then-commit | Proper RFC 9420 dance (committer can’t self-remove in own commit) | |
| G7.07.04 PFS after leave | Removed peer cannot decrypt N+2 messages (OperationError) | |
| G7.10.* UX surface (header, sidebar, dialogs, compose) | All four assertions pass | |
| G7.04 Offline-Welcome via cross-account SQL | Needs a Core SQL seal — 5-30 min poll cycle | |
| G7.06.* Three-peer churn | Needs a third tab on sub-id 2 | |
| G7.08.* Resilience scenarios | Needs envelope-drop/reorder harness | |
| G7.09.* Post-compromise security via PathUpdate | Needs key-rotation hook | |
| G7.11–§7.13 GLink-related | Lands as Phase 2-4 ships | |
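The G7.07.02 row depends on a rule that trips up many first MLS integrations: under RFC 9420 a commit is invalid if it contains a Remove proposal targeting the committer's own leaf, so a member who wants to leave proposes the removal and another member commits it. A minimal sketch of that check follows, with invented object shapes that are not the ts-mls types.

```javascript
// Reject a commit whose proposals would remove the committer itself.
// Assumed proposal shape: { type: 'remove' | 'add' | 'update', removedLeaf?: number }.
function validateCommit(committerLeaf, proposals) {
  for (const p of proposals) {
    if (p.type === 'remove' && p.removedLeaf === committerLeaf) {
      return { ok: false, reason: 'committer cannot remove itself in its own commit' };
    }
  }
  return { ok: true };
}

// Self-leave therefore takes two actors: the leaver proposes, someone else commits.
function planSelfLeave(leaverLeaf, otherLeaf) {
  return {
    proposal: { type: 'remove', removedLeaf: leaverLeaf, sender: leaverLeaf },
    committer: otherLeaf,
  };
}
```

With B at leaf 1 and A at leaf 0, `planSelfLeave(1, 0)` produces a proposal that passes `validateCommit(0, ...)` when A commits it, while the same proposal fails validation if B tries to commit it.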
S5 (Messages 1:1): offline-on-chain path validated end-to-end. The live (kind=22) path requires the broader WebRTC swarm peer-auth handshake to stabilise; that’s a swarm-layer concern outside the Messages dApp itself and is on the platform-side roadmap.
Three evidence logs are now committed under Tasks/validation-evidence/ for anyone who wants to re-run them locally.
5. What This Adds Up To
Three things, really.
First, GRIDNET OS now has working end-to-end-encrypted group chat under an IETF-standardised protocol. Not “we ran some code that didn’t crash” — working. Two users, two sub-identities, asynchronous Welcome, in-epoch encryption, forward-secrecy after leave, 14/14 on the integration test. This was the major outstanding question mark in the Messages dApp surface, and it’s resolved.
Second, the way you start a conversation in Messages no longer feels like 2014. The compose box does what users in 2026 expect — type a name, see live results, hit Enter. The integration with Contacts and the Blockchain Explorer is what makes it feel native to the platform rather than bolted-on.
Third, we have a concrete, written design for shareable encrypted conversations — across both the MLS surface in Messages and the ZKP-PSK surface in eMeeting. The History Archive Key is a deliberate, documented departure from textbook MLS forward secrecy, made for a deliberate UX reason and surfaced transparently to users. It’s not perfect — no history-sharing scheme can be — but it’s honest, and it’s tunable per group.
6. What’s Next
Immediate:
- Phase 2 of the GLink rollout: HAK schema + toggle in the New-group dialog.
- Three-peer harness so we can light up the G7.06.* tests.
- The peerAuthed-stays-false swarm fix (it’s been blocking the live kind=22 delivery tests for weeks now; it’s a small fix in CPeerAuth once we identify the missed signal).
Medium-term:
- GLink Phases 3–5 (HAK delivery, rotation, eMeeting integration).
- Full §4 (eMeeting calls) test sweep — 1-to-1 + 3-peer + the unknown-caller trust-stratification flow.
- The §9 adversarial tests (replay, spoofed sender, tampered ciphertext): these are the ones that prove the security properties, not just the happy path.
Longer-term:
- Federation across GRIDNET deployments via swarm bridges.
- Mobile parity for the new compose flow (the CSS is already responsive; just needs the M-mode toggle wired).
- A “history archive recovery” UI for users who joined a group post-hoc: “we recovered 142 messages from before you joined” with a per-message provenance badge.
Acknowledgements & Credits
Crypto building blocks: ts-mls v1.6.2 (RFC 9420), @hpke/core, @noble/curves (X25519 + Ed25519 + Curve25519), @noble/ciphers (ChaCha20-Poly1305).
UI primitives: GRIDNET OS Common Dialogs, CWindow, CSubWindow, CPresenceManager, CMagicButton, SQL API (CSQLDatabaseRef), Tabulator, the existing GLink primitive.
Pair-programmed and forensic-debugged across the day’s session — including the cathartic moment of realising that all seven MLS bugs were independent and each one was sitting behind a try/catch that printed nothing.
Filed for the GRIDNET OS community, 2026-05-13. Discussion + questions welcome on the project channels. The full revision diffs are at r6634 (MLS fixes), r6635 (compose UI), r6636 (GLink design).