Native Matrix VoIP with Element Call

Arathorn · on March 5, 2022

From my perspective, the really exciting thing about this is that it works equally well in mobile web browsers and on desktop web - clicking on a link on Mobile Safari should Do The Right Thing without having to install anything.

Moreover, because it's built on Matrix, MSC3401 (https://github.com/matrix-org/matrix-doc/blob/matthew/group-...) means that we'll finally have decentralised cascading video/voice conferences once the SFU (selective forwarding unit) component is added into the mix. So, for instance, users on the same homeserver will get their video feeds relayed locally with minimal latency... and then users on another remote homeserver will also get mixed locally with minimal latency, trunking the two together. If the link dies or one homeserver dies, the conference will keep going - i.e. precisely the same semantics as normal Matrix.
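To make the cascading idea concrete, here is a minimal sketch that groups call participants by the homeserver in their Matrix user ID and lists the trunks that would link the per-homeserver SFUs. The function names, the example user IDs and the "one trunk per pair of homeservers" rule are illustrative assumptions, not something MSC3401 prescribes.

```python
from collections import defaultdict


def group_by_homeserver(user_ids):
    """Group Matrix user IDs (@localpart:server) by their server name."""
    groups = defaultdict(list)
    for user_id in user_ids:
        # The server name is everything after the first colon
        # (it may itself contain a port, e.g. host:8448).
        _, server = user_id.split(":", 1)
        groups[server].append(user_id)
    return dict(groups)


def trunk_links(groups):
    """Hypothetical rule: one trunk between every pair of homeservers."""
    servers = sorted(groups)
    return [(a, b) for i, a in enumerate(servers) for b in servers[i + 1:]]


# Hypothetical participants on two homeservers.
participants = ["@alice:one.example", "@bob:one.example", "@carol:two.example"]
groups = group_by_homeserver(participants)
links = trunk_links(groups)
```

Here alice and bob would be relayed by an SFU near `one.example`, carol by one near `two.example`, with a single trunk between the two.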

jfkimmes · on March 5, 2022

Could you give some insight into how you estimate the effort of making Element Call competitive, performance-wise, with, say, Discord? I heard that Discord threw a lot of time and money at optimizing voice in their product. Can you just jump in and realistically compete?

There are a lot of performance/latency/sound quality comparisons online of Mumble vs TeamSpeak vs Discord and recently Jitsi vs MS Teams vs Zoom, etc. I feel like this is a problem space that can be optimized to an arbitrarily deep extent. Just two examples that come to mind are SFU performance/efficiency and noise reduction, two things where e.g. Jitsi notoriously lags behind Zoom.

Arathorn · on March 5, 2022

The competitive gap with Discord in terms of media quality is probably something like:

* Need a low-latency SFU. This should be very doable; not only are there a lot of good FOSS SFUs to build on top of these days, the history of the Matrix team is actually that we built VoIP stacks fulltime before we shifted focus to Matrix, and we've built MCUs and media servers of all flavours in the past. MSC3401 should also give us a competitive edge given latency will be automagically minimised by using the physically closest decentralised SFU, thus letting anyone bring their SFU to the party.

* Need an SFU with good rate control (and/or FEC). This is probably the single most important thing to get right in terms of quality. Signal wrote up a good overview of why: https://signal.org/blog/how-to-build-encrypted-group-calls/

* Excellent noise cancellation (and background noise elimination, microphone scratch noise elimination etc). Ideally you need something like https://krisp.ai/ or https://workspaceupdates.googleblog.com/2021/06/background-n... in the mix - but doing this in an E2EE-friendly and privacy preserving manner is Hard. However, just like we solved E2EE full text search by doing it clientside and making the indexes gossipable between your clients (https://github.com/matrix-org/seshat), we'll have a go at doing something similar for this problem too.

* Excellent automatic gain control. Normalising/compressing everyone's audio so that they're at equivalent loudness is really important.
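As a rough illustration of the gain-control point, the sketch below scales a block of audio samples so that its RMS level approaches a common target. It is a deliberately simplified, per-block version: a real AGC smooths the gain over time, and the target level and gain cap here are arbitrary assumptions.

```python
import math


def rms(samples):
    """Root-mean-square level of a block of float samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalise(samples, target_rms=0.1, max_gain=20.0):
    """Scale a block towards target_rms, capping the gain and clipping peaks."""
    level = rms(samples)
    if level == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = min(target_rms / level, max_gain)
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


# Two hypothetical speakers: one very quiet, one loud.
quiet = [0.01, -0.01, 0.01, -0.01]
loud = [0.5, -0.5, 0.5, -0.5]
```

After `normalise`, both blocks end up at roughly the same RMS level, which is the "equivalent loudness" goal in its most basic form.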

We're also in the process of adding in spatial audio (unsure if Discord has that) which should help a tonne with distinguishing the different audio feeds.

We can probably also be more bullish about supporting new audio codecs like Lyra.

EDIT: oh, the other obvious thing is echo cancellation - but we're currently at the mercy of the browser's WebRTC stack for that. However, we could ship a tweaked WebRTC in Element Desktop (or other native Matrix clients) in future to do better than plain old WebRTC.

pthatcherg · on March 8, 2022

Hey, I'm the author of the Signal blog post about SFUs. I have a few questions/comments:

1. I don't think there are many good open source SFUs to choose from. I know of 2, maybe 3 (including our new one). There may be many, but few have good rate control. But maybe I just don't know about them? I'd be happy to learn of more good ones.

2. Echo cancellation is certainly a hard problem, but it doesn't conflict with E2EE unless you do it on the server, which isn't necessary. So perhaps it may be a somewhat harder problem because you close off one possible approach (doing it server-side), but many (most?) echo cancellation solutions are done client-side.

3. You may not be completely dependent on WebRTC's echo cancellation any more because of the new MediaStreamTrackProcessor and MediaStreamTrackGenerator APIs. I don't know if it will work for echo cancellation, but it might.

Arathorn · on March 10, 2022

1. So the SFUs we're currently looking at are yours, ion-sfu (and/or galene) and mediasoup. Honestly we haven't finished looking at how they compare for rate control, but the Pion team seems very interested in ensuring they have good rate control.

2. From context I think you're talking about noise cancellation here? I assumed that some of the more exotic ML-based ones ran serverside, which obviously is incompatible with E2EE. It sounds like there are a bunch of options for running WASM-based intelligent noise cancellation clientside though, especially with MediaStreamTrackProcessor and friends. nnnoiseless as a pure Rust->WASM port of rnnoise looks fun, for instance: https://github.com/jneem/nnnoiseless

3. True, although given Google are highly motivated to make AEC work properly in WebRTC, I guess I'm hoping that they'll continue improving it, much as they have been. I certainly never want to have to write or integrate one ever again :D

spockz · on March 5, 2022

Nice elaborate answer. Discord has noise cancellation by krisp. I believe it is offline only. That would be good enough I suppose. But I’m not sure whether it can be loaded in webasm so it works in the browser. (To meet the cross platform needs of Matrix/Element.)

ryukafalz · on March 7, 2022

> We're also in the process of adding in spatial audio (unsure if Discord has that) which should help a tonne with distinguishing the different audio feeds.

Thank you! This is something I've wanted in a videoconferencing app since I discovered it was a thing, and I'm glad it's Element doing it.

throwaway29879 · on March 6, 2022

Can you at least let us know what "stacks" anyone involved has produced? Last I checked, nobody involved has ever worked on real voice platforms (as evidenced by 8 years of "voice first" while producing very little actual voice integration).

Arathorn · on March 6, 2022

Sure. The entirety of the original Matrix team used to be a team called “next gen telephony” inside a startup called MX Telecom, which then got acquired by Amdocs and turned into their Unified Communications division, which I ran. The product was a horizontally scalable SIP softswitch built on resiprocate (B2BUA SBC and stateless router components), and then with media processing done using a C++ media graph framework we created called mxmedia. This was similar to gstreamer, but with more flexible threading and using C++ typing extensively to model media formats and propagate them through the media graph. It ended up with roughly the same featureset as today’s gstreamer, but about 10 years earlier - so with muxers, demuxers, packetizers, codecs, resamplers, colourspace transforms, AEC, AGC, rate control, SRTP, ICE, etc. We used it serverside for media gateways for transcoding calls for ISDN, H.323 and IAX; we then ported it clientside when smartphones came around to use it as a webrtc-style softphone SDK.

Unfortunately the IP was all proprietary and owned by the company, and when WebRTC stole our lunch we shifted focus to messaging and building Matrix, and have slowly returned focus to VoIP - hence Element Call.

throwaway29879 · on March 6, 2022

That's pretty much what I thought, which makes me wonder why at least SIP integration wasn't early on the list. There is the VoIP bridge, but it relied on FreeSWITCH and is obviously unmaintained, so it doesn't work anymore. Is there anything on the roadmap to make this happen?

jfkimmes · on March 5, 2022

Very insightful! I'll have a look at the links. Thanks for answering.

anticensor · on March 5, 2022

Discord voice is actually a server muxing MCU emulating an SFU.

7sidedmarble · on March 6, 2022

Do you have any details on that? From what I've seen they just say it's a normal SFU architecture.

dopa42365 · on March 5, 2022

Something wrong with simply using RNNoise?

Arathorn · on March 5, 2022

mmm, https://github.com/jitsi/rnnoise-wasm looks like it might help. wonder if jitsi actually uses it.

saghul · on March 6, 2022

We are currently using it for the “your microphone is noisy” notification feature, not for actual noise suppression.

With Audio Worklets or insertable streams, however, it should be possible to apply it to the stream in real time, though we haven’t tried it yet.

Cu3PO42 · on March 5, 2022

The repository is under the Jitsi organization, so I'd say chances of them using it or at least planning to use it are quite high.

namibj · on March 6, 2022

You can try in Mumble.

GeckoEidechse · on March 5, 2022

Maybe a bit off-topic but will upcoming Matrix Live also be hosted via Matrix VoIP now? ^^

(At least the smaller ones like interviews where the number of participants is <=8 until compatibility with SFU is done)

Arathorn · on March 5, 2022

sure! :D

youerbt · on March 5, 2022

> And, the big one drum roll, please... we will be integrating this into Element so you can have voice and video rooms, and hold group video calls inside the Element app natively over Matrix.

I'd love to replace my teamspeak server with this. Discord is tempting, but I'd rather host it myself.

jeroenhd · on March 5, 2022

For what it's worth, Matrix / Element already have something like this with the built-in Jitsi integration in (some) clients. The UX isn't great right now in my opinion (which will hopefully change when this project gets merged into Element) but if all you want is rooms, text and voice, then you're already set with the current setup.

You can self-host a Jitsi instance and use it in Matrix rooms as an integration. Depending on your teamspeak setup you can enable guest access to your server, enable registration for server user accounts (or manually register it for specific members), or only let people with existing accounts from other servers join in (their own, matrix.org ones, you name it). You might also need to tweak the ACLs a bit, but other than that you'll have all the core features Teamspeak and friends provide.

I think the biggest difference between the existing Matrix group call system and what's been announced here today is the new UI/UX, and the fact that the native group call API and reference implementation are nearing completion. Both are great news, but don't necessarily add new user-facing features to the Matrix ecosystem.

georgyo · on March 5, 2022

The UX is more than terrible, it is basically unusable.

The jitsi stuff is bolted on, you join a link and your user name does not come with you.

No discoverability. Does this channel already have a jitsi room attached? No way to know. I see a link to a jitsi room; is there anyone inside? No way to know.

Recommending this as an alternative to TeamSpeak, Mumble, or Discord today is a way to get people to try a horrible experience and then say Matrix sucks forever.

Only when this call stuff merges will Matrix start being viable for those communities.

anticensor · on March 5, 2022

TeamSpeak 5 actually uses Matrix under the hood, with a proprietary voice protocol and TeamSpeak-specific room version.

GeckoEidechse · on March 5, 2022

Yes but the current UX for that is terrible. With the proposed MSC linked in the blog post, Matrix will additionally gain Teamspeak/Discord like voice channels [1].

With this we could finally have a proper Teamspeak/Mumble bridge for voice that gets properly represented on the Matrix side which is amazing :D

Also kinda funny that Teamspeak only recently started using Matrix for their global chat feature [2].

Maybe now after the gitter acquisition, Element should consider acquiring a certain voice-focused company ;)

**

[1] https://github.com/matrix-org/matrix-spec-proposals/blob/mat...

[2] https://community.teamspeak.com/t/teamspeak-5-beta-bug-repor...

Taywee · on March 5, 2022

That's the exciting bit for me. I love Matrix, but video and voice being primarily "call" based is painful, when what I really want is a dynamic voice/video room that people can quickly and easily drop in and out of, like in Discord.

Arathorn · on March 5, 2022

MSC3401 specifically provides for this UX, with the `m.intent` field used when you instantiate an `m.call` within a room (https://github.com/matrix-org/matrix-spec-proposals/blob/mat...):

* `m.intent` to describe the intended UX for handling the call. One of:

1. `m.ring` if the call is meant to cause the room participants' devices to ring (e.g. 1:1 call or group call)

2. `m.prompt` if the call should be presented as a conference call which users in the room are prompted to connect to

3. `m.room` if the call should be presented as a voice/video channel in which the user is immediately immersed on selecting the room.

Element Call implements `m.room` effectively (in that you get slung directly into the call once you click on the link). Once we integrate this into Element itself, then the plan is to support all three different intent types: `m.ring` for "group calls", `m.prompt` for "conferences" and `m.room` for discord-style voice/video rooms.
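A minimal sketch of how a client might branch on the three intents is shown below. Only `m.call`, `m.intent` and the three intent values come from the text above; the overall event shape (call ID as the state key), the helper names and the UX descriptions are assumptions for illustration.

```python
# The three intents quoted above, mapped to a short description of the UX.
VALID_INTENTS = {
    "m.ring": "ring the devices of the room participants",
    "m.prompt": "prompt users in the room to join a conference call",
    "m.room": "drop the user straight into a voice/video channel",
}


def make_call_event(call_id, intent):
    """Build a dict shaped like an m.call state event (shape is an assumption)."""
    if intent not in VALID_INTENTS:
        raise ValueError(f"unknown m.intent: {intent!r}")
    return {
        "type": "m.call",
        "state_key": call_id,  # assumption: the call ID is the state key
        "content": {"m.intent": intent},
    }


def describe_ux(event):
    """Return the UX a client would present for this call's intent."""
    return VALID_INTENTS[event["content"]["m.intent"]]


voice_room = make_call_event("hypothetical-call-id", "m.room")
```

A client receiving `voice_room` would look up its intent and render a Discord-style channel rather than ringing anyone.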

panick21_ · on March 5, 2022

So will the room have a setting to make a voice video only room, a mixed room or a chat only room?

I assume m.ring would not be needed if it's something like a Teamspeak room.

Arathorn · on March 5, 2022

It wouldn't be a setting; the intent would be set based on how you initiated the call. If you hit a big 'call Alice and Bob!' button you'd get a call which rings them both; if you hit a 'start a conference' you'd get a Teams/Zoom-style conference; if you hit 'create a voice/video room' button you'd get a Discord-style voice/video room.

(N.B. we might not bother with the group-calling option in Element; haven't decided yet - but Matrix needs to support it for folks who want those semantics :)

Taywee · on March 5, 2022

Thanks for the extra info. I'm really looking forward to the fruition of this, and will happily be trying out this current Element Call beta quite soon.

kitkat_new · on March 5, 2022

you can choose one of the three, and if you choose option 3, the room will become a voice video only room (this is how I understand it).

So m.ring existing isn't a problem.

How it is exposed in the UI, is a completely different thing, although I don't think it would be impossible to make a button that creates a room with a m.room call aka voice/video room

chayleaf · on March 5, 2022

Mumble works great for me in the meantime

Gigachad · on March 5, 2022

Mumble doesn't really work on mobile making it not usable for most users

soupbowl · on March 6, 2022

I've used plumble on android many times on 5g for multi hour drives with minimal issues as recently as a few months ago.

ryukafalz · on March 6, 2022

I use it via Mumla occasionally on Android. Works fine?

panick21_ · on March 5, 2022

This is really cool. I really feel like this ecosystem is hitting its stride.

The list of upcoming features is long and sometimes I don't understand how they can have so many things going on at once.

Features tend to take a few years to mature, but that is not a bad thing.

I will be setting up my own server shortly, something that I have been thinking about for a while and maybe try to develop something.

Congrats to the team.

Question:

The OpenID part, would it not be cool if Matrix would offer something like a federated SSO experience? So that on other webpages you could use your Matrix login instead of Github/Google and co. Hosting that server yourself would be fantastic.

There are of course some issues with attribute retrieval if users can host these servers themselves.

Arathorn · on March 5, 2022

> The OpenID part, would it not be cool if Matrix would offer something like a federated SSO experience?

Yes, hopefully we'll be able to use the OIDC server bundled with your Matrix server to auth users in general if wanted (or alternatively you can of course point at an existing OIDC server).

bb010g · on March 6, 2022

Has support for IndieAuth and/or FedCM (when stable enough) been considered?

https://aaronparecki.com/2018/07/07/7/oauth-for-the-open-web

https://github.com/fedidcg/FedCM

ptman · on March 6, 2022

https://gitlab.com/ptman/matrix-login this could ofc be modified to support oidc

jeroenhd · on March 5, 2022

I wonder if this setup can be leveraged to create a Twitch-like streaming solution based on Matrix. The group call setup is clearly not built for such a use case, but if you only stream to a small audience (say, people paying for a certain perk on Patreon?) you could probably use this quite easily.

The client would need better support for alternative inputs, of course, such as the RTMP suggestion found on the Github issues page, but it can and should work.

What I'm also curious about is how this fits in with the experiments on the peer-to-peer mesh system (running a homeserver on every client). Forwarding video across a web of clients would become bandwidth and CPU intensive real fast and the added latency would be non-negligible.

I'll be watching this play out with great interest, that's for sure.

Arathorn · on March 5, 2022

> I wonder if this setup can be leveraged to create a Twitch-like streaming solution based on Matrix.

Yes, we've built it with this in mind. There are three possible approaches:

* You can broadcast a headless client via HLS or RTMP, similar to how Jibri works for Jitsi, and how we broadcast FOSDEM (https://matrix.org/blog/2021/02/15/how-we-hosted-fosdem-2021...). Basically you run a headless Chrome against a virtual X server and pipe X's framebuffer into ffmpeg. It's a pretty blunt approach and obviously uses a lot of serverside resources, but ensures that the broadcasted stream is completely faithful given that it's literally a recording of what a client would be seeing.

* Alternatively, you could do the same approach, but pipe it into a WebRTC broadcasting platform rather than HLS/RTMP/RTSP/DASH. This could give much lower latency, but more bandwidth on the server given you're effectively setting up a 1:1 VoIP call with each viewer, and you'll be unlikely to be able to use CDNs (unless we end up in some crazy world where freeform multicast on the internet finally comes to pass thanks to CDNs fanning out your RTP packets for you :P)

* Finally, you could composite the streams together serverside using an MSC3401 compliant MCU and broadcast the result via one or other approach. This might not be quite as high fidelity as running a 'real' Element Call instance serverside, but could be way more efficient in terms of resources, and could prove perfectly adequate. The only catch is that we haven't bolted MSC3401 onto any MCUs yet. (Clearly we should get back in touch with our friends at FreeSWITCH / Signalwire :D)

> What I'm also curious about is how this fits in with the experiments on the peer-to-peer mesh system (running a homeserver on every client). Forwarding video across a web of clients would become bandwidth and CPU intensive real fast and the added latency would be non-negligible.

MSC3401 is pretty much orthogonal to P2P Matrix. The way it works is that it'll use whatever conferencing nodes ('foci') are advertised in the room to mix together the calls in a decentralised fashion. If there are none (as per today) then it just goes full mesh. If there is one, then everyone will converge on it. If there are two, then folks will pick the one with the lowest latency. But critically, it doesn't matter whether the clients are talking to a serverside homeserver or are running P2P Matrix. Finally, the foci could run either serverside or clientside (much as skype supernodes were effectively clientside foci), but we need to get serverside foci working first before we add clientside ones in for the P2P crew ;)
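The selection rule described here (no foci means full mesh, one focus means everyone converges on it, several means pick the lowest latency) can be sketched as follows. The function name, the latency dictionary and the tie-break by focus ID are assumptions for illustration, not part of MSC3401.

```python
def choose_focus(latencies_ms):
    """Pick a conferencing focus from measured latencies.

    latencies_ms maps a focus identifier to a measured round trip in ms.
    Returns None when no foci are advertised, meaning "go full mesh".
    """
    if not latencies_ms:
        return None
    # Sorting first makes ties resolve to the alphabetically first focus.
    return min(sorted(latencies_ms), key=latencies_ms.get)


# Hypothetical measurements from one client's point of view.
no_foci = choose_focus({})
single = choose_focus({"sfu.one.example": 80})
nearest = choose_focus({"sfu.one.example": 80, "sfu.two.example": 25})
```

Each client runs this with its own measurements, so two clients in the same room may legitimately end up on different foci.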

anticensor · on March 5, 2022

> If there are two, then folks will pick the one with the lowest latency.

No, it needs to be deterministic. "Just picking the lowest latency node" would result in sync failures, i.e. not everybody hearing the same thing.

Arathorn · on March 6, 2022

Mixed sync +/- 100ms of latency or whatever is probably fine, for most purposes? If you really want everyone to be centralised on a single SFU then you’d set the permissions on the room appropriately.

anticensor · on March 6, 2022

It can be a distributed SFU; the relative timings just need to converge deterministically (possibly by servers agreeing on a virtual apex to artificially delay the streams to make up for topological differences) so that everyone hears the same thing.
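One simple reading of the "virtual apex" idea is to pad every path up to the latency of the slowest one. The sketch below does just that; the function name and the example latencies are hypothetical, and a real system would also have to cope with jitter and changing routes.

```python
def equalising_delays(path_latency_ms):
    """Extra delay to add per listener so all paths match the slowest one."""
    if not path_latency_ms:
        return {}
    apex = max(path_latency_ms.values())  # the slowest path sets the target
    return {listener: apex - latency for listener, latency in path_latency_ms.items()}


# Hypothetical one-way latencies from the speaker to each listener.
delays = equalising_delays({"local_user": 20, "remote_user": 120})
```

The local listener is held back by 100 ms so that both hear the speaker at the same moment, which is the cost of strict synchronisation.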

Arathorn · on March 6, 2022

So if it turns out that the streams need to be synced then we can delay them to do so - but i am a bit unsure on how important that really is. Are you worried about folks talking over each other if they think they successfully interrupted but then a remote user thought the same and they collide and backoff, like csma/cd?

anticensor · on March 6, 2022

> Are you worried about folks talking over each other if they think they successfully interrupted but then a remote user thought the same and they collide and backoff, like csma/cd?

Yes, that is really important for classroom use cases.

jpbernius · on March 5, 2022

There is some cool development going on with Matrix recently! Threaded messaging first and now group calls. I am especially excited for Safari Support as this is my personal pain-point with Jitsi Meet. Very excited to see this evolve into the Element client.

traspler · on March 5, 2022

Do the Homeservers act as Signaling, STUN and TURN servers or are there additional components necessary? Will the SFU part in the future also be part of Synapse or will/are these things split?

Arathorn · on March 5, 2022

The homeservers act as signalling servers. For STUN/TURN you need to run a separate TURN server (typically coturn), as per https://matrix-org.github.io/synapse/latest/turn-howto.html.
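For context on how a homeserver and a TURN server can share trust, here is a sketch of the time-limited credential scheme that coturn supports when configured with a shared secret: the username embeds an expiry timestamp and the password is an HMAC-SHA1 of that username. Whether a given deployment uses this scheme depends on its configuration; the function name, the secret and the lifetime below are placeholders.

```python
import base64
import hashlib
import hmac


def turn_credentials(shared_secret, user_id, now, lifetime_s=3600):
    """Derive time-limited TURN credentials from a shared secret."""
    expiry = int(now) + lifetime_s
    username = f"{expiry}:{user_id}"
    digest = hmac.new(
        shared_secret.encode(), username.encode(), hashlib.sha1
    ).digest()
    password = base64.b64encode(digest).decode("ascii")
    return username, password


# Placeholder secret and a fixed "now" so the example is deterministic.
username, password = turn_credentials("not-a-real-secret", "@alice:example.org", now=1000)
```

The TURN server can recompute the same HMAC from the username it is given, so it never needs to talk to the homeserver or store per-user passwords.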

The future SFU will similarly be split from the homeserver, with the initial implementation based on either Signal-Calling-Service, ion-sfu or mediasoup (we're evaluating all three). Of course, the point of being standards based is that you'll be able to mix & match SFUs and MCUs from other vendors.

You can find out more about the architecture from my talk at Commcon (https://2021.commcon.xyz/talks/extending-matrix-s-e2ee-calls...) - or Robert's talk at FOSDEM: https://fosdem.org/2022/schedule/event/matrix_metaverse/

anticensor · on March 5, 2022

For court admissible conferencing, routing everything through the persistent room state is almost a hard requirement.

traspler · on March 5, 2022

Thanks a lot for answering in detail!

qqcwerqwect · on March 5, 2022

As I understand it, signalling will happen over the Matrix protocol. As Matrix/Element already had 1:1 voice chat, Synapse integrates well with coturn, so you typically run coturn alongside Synapse.

georgyo · on March 5, 2022

Only signalling. Matrix uses coturn for STUN and TURN.

desertraven · on March 6, 2022

Slightly off topic: I really like the idea of Matrix/Element, but I’ve found I can’t rely on it.

In three instances, I’ve had long-running chats stop working completely and simply not load. All history/media inaccessible. Just a loading spinner. This is Element on iOS.

I wish it were more dependable, because things like this would excite me much more.

Arathorn · on March 6, 2022

We have had some fairly nightmarish bugs on iOS thanks to how iOS changed push notifications a while back, meaning that you end up with a tiny client in the push extension trying to independently sync and decrypt e2ee traffic which was at risk of clashing with the main app and causing problems. I think I know the bug you are describing, and it was fixed several months ago - as of the current build the spinner should only show for less than a second and never get stuck on. Sorry that you got bitten by this; I’m also an iOS poweruser and the fallout from the push changes was painful.

Separately, we are also experimenting with a full rewrite of the iOS app on top of matrix-rust-sdk called Element X, which should provide a much better and faster codebase to try to avoid this sort of thing in future.

desertraven · on March 6, 2022

Wow! Seems it was user error, since I just updated it and it’s fixed. Thanks so much.

I wish I could update my parent comment. Although this interaction speaks for itself I think. Thanks for a great product!

encryptluks2 · on March 5, 2022

> We’re also working with the Tauri core team to provide cross-platform lightweight desktop apps as soon as possible!

How about a lightweight cross-platform Matrix client that doesn't rely on Electron?

Arathorn · on March 5, 2022

Tauri + Electron (or Hydrogen) provides a lightweight cross-platform Matrix client that doesn't rely on Electron...

Meanwhile, Nheko, NeoChat, Fractal and others provide very usable lightweight native cross-platform Matrix clients too. Element is also working on its own experiments (TBA) based on matrix-rust-sdk.

caslon · on March 5, 2022

Calling Fractal usable is maybe a bit of a stretch, but there are a lot of really good Matrix clients now!

JCWasmx86 · on March 5, 2022

Fractal-Next is really usable for me at least. (The times I've tried it) The only thing that is hindering it for me is the missing SSO-Support, but as soon as my MR is merged, it has SSO-Support

Arathorn · on March 5, 2022

Fractal-next is looking very usable (thanks to leveraging matrix-rust-sdk as its engine), tbh.

karmanyaahm · on March 5, 2022

i think you meant Tauri + Element

Arathorn · on March 5, 2022

bah, so i did. doh

Jack5500 · on March 5, 2022

Well, it doesn't use Electron, it uses Tauri, which doesn't bundle Chrome but uses the native browser and is therefore much smaller, faster and more lightweight.

crossroadsguy · on March 6, 2022

I had my hopes up after seeing “native” in the title.

But then, I have decisively given up on Matrix ever being anything even remotely a solution for individual-individual kind of personal usage. But I do hope to see it being used at (my) work places soon.

There are multiple reasons for that. It costs money and I doubt the general populace will pay for that and since Matrix/element won’t harvest data and destroy privacy it’s a no go. They don’t have limitless resources like large corps.

Fragmented apps and using different servers (let alone running the server itself) across too many clients isn’t “people scalable”. (If it ever even begins to try, it has to have one single blessed server and a blessed app - which at least the current such app, Element, is a long time from.)

So no, they can’t and won’t invest in making Element mainstream for personal usage and rightly so.

So what is Matrix/Element going to be?

Slack, Google Workspace Chat etc: YES!

WhatsApp, Signal, Telegram: NO!

caslon · on March 5, 2022

There are already like a dozen Matrix clients that don't use Electron. It doesn't take much effort to go to the Clients list and pick one out.

Gigachad · on March 5, 2022

The problem is the group of users who have a problem with Electron also tend to have a problem with a billion other things (the size of the chat bubbles as another user commented here) so they end up never being satisfied with any client and any developer who tries to cater to them will be flooded with a wave of complaints about how the animation speed should be configurable.

caslon · on March 6, 2022

I have a problem with Electron, personally. I think it's a rational thing to complain about. I just don't think that misrepresenting your case is a good thing.

GeckoEidechse · on March 5, 2022

Why not just use FluffyChat?

https://fluffychat.im/

nani8ot · on March 5, 2022

I personally can't stand how big the chat bubbles in FluffyChat 1.0+ are. They were always big but now they are huge. I would have preferred it if they got smaller...

But with Element now shipping chat bubbles as a configurable option, I'll stay with that. For me, the new chat bubble size is perfect.

Aside of that, I'm generally really fond of FluffyChat, they are doing a great job!

DoItToMe81 · on March 5, 2022

I used Fluffychat for a few months. I found its performance was rather lacking. It had slowdown issues and general problems that I didn't find with other clients.

abnercoimbre · on March 5, 2022

I run a hybrid conference [0] with a physical track and an online track. All tracks communicate using Matrix.

Element Call sounds like something I should integrate into the conference experience, but logistics and audience size are a concern. We'll be entering 1k+ territory this year.

[0] https://handmade-seattle.com

Arathorn · on March 5, 2022

please wait until we’re out of beta! :)

mawise · on March 5, 2022

> Element Call is built entirely on Matrix: it doesn’t need any additional servers to get going. You can run it against your existing Matrix homeserver to provide complete self-sovereignty…

> In the near future, we will support using the app with any homeserver

I really hope this isn't an indicator of how Element the company will be focusing more on their paid offerings to the exclusion of supporting private homeservers. I've recently been experimenting with running a Matrix homeserver on a Raspberry Pi in my basement--I'm excited about actual self-sovereignty in messaging but it isn't quite to the point where I can tell my family to rely on it instead.

jfkimmes · on March 5, 2022

I feel like this is a misunderstanding. You can already self-host 'Element Call' and point that to your own homeserver. The passage you're referencing just refers to their hosted Element Call client. I'm pretty sure they're currently trying to move away from credentials-based login on 3rd-party clients to token-based authentication. I would guess they will open up their hosted client once they have finished implementing this new authentication scheme, to avoid the doubled effort and to discourage using your homeserver credentials on "random" clients.

Arathorn · on March 5, 2022

Yup, the grandparent post is a misunderstanding. Basically, we want Element Call to work with a single click to a URL without requiring any account - much as Jitsi and similar do. Therefore it needs to pick a default homeserver. Currently it doesn't play well with existing Matrix accounts (as it will sync in loads of chatrooms which are irrelevant when you just want to do a call), so for expedience we just create accounts on its default homeserver to get going.

However, we're in the process of moving Matrix over to OIDC for auth (as per https://matrix.org/blog/2021/12/22/the-mega-matrix-holiday-s...), at which point apps like Element Call should be able to easily hook into your existing account on your own homeserver using your existing server auth. In other words, the server will auth you, not the client. Combined with Sliding Sync (aka Sync v3) https://matrix.org/blog/2021/12/22/the-mega-matrix-holiday-s... this will then let you securely and efficiently use your existing Matrix account with apps like Element Call without it all getting bogged down with your existing chatrooms (or needlessly giving Element Call access to conversations it shouldn't care about).

Finally, this is beta: the auth/reg stuff is very much placeholder - for instance, it doesn't even expose password reset, given the upcoming shift to OIDC.

danShumway · on March 5, 2022

> using your existing server auth

Complete sidenote, but just to expand on the above, the OIDC work is particularly exciting because (at least my understanding is) it opens the door for letting people use a single Matrix account for a lot of different "stuff" on top of Matrix.

There have been some interesting conversations and demos I've seen about using Matrix rooms to help with P2P sync for apps, and there are already bots on Matrix that use Matrix rooms for notifications so you don't need to send emails. But setting that stuff up is kind of challenging if you don't want to make a dedicated new user for each app or grant access to your account -- and (again, assuming my understanding is correct) I vaguely suspect that we might start to see a lot more 3rd-party apps built to integrate with Matrix once it's easier to point someone at a webapp and say, "just log in with your Matrix account, and it'll only have access to one room that's specific to this app."

So imagine potentially having a collaborative app that's hyper-focused on one task, like a collaborative D&D messaging platform or some crud. And in the backend, everyone using that app uses their regular Matrix account, and under the hood it's just sending messages to a shared room -- and then as a developer you don't need to worry about handling a bunch of encryption and user accounts and all of that stuff, you might not even need a backend at all.
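To make the "it's just sending messages to a shared room" idea concrete, here is a minimal sketch of the HTTP request such an app would make using the existing Matrix client-server send-event endpoint. It does not show the scoped OIDC login discussed above. The access token, the room ID, and the `com.example.dice_roll` event type are hypothetical placeholders, and `buildSendEventRequest` is an illustrative helper, not part of any SDK.

```typescript
// Sketch only: builds (but does not send) a Matrix "send event" request.
// The room ID, event type, and token used below are made-up placeholders.
interface SendEventRequest {
  url: string;
  method: "PUT";
  headers: Record<string, string>;
  body: string;
}

function buildSendEventRequest(
  homeserverUrl: string,
  accessToken: string,
  roomId: string,
  eventType: string,
  content: Record<string, unknown>,
  txnId: string, // client-chosen ID so the server can de-duplicate retries
): SendEventRequest {
  const path = [
    "_matrix/client/v3/rooms",
    encodeURIComponent(roomId),
    "send",
    encodeURIComponent(eventType),
    encodeURIComponent(txnId),
  ].join("/");
  return {
    url: `${homeserverUrl.replace(/\/+$/, "")}/${path}`,
    method: "PUT",
    headers: {
      Authorization: `Bearer ${accessToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(content),
  };
}

const diceRollRequest = buildSendEventRequest(
  "https://matrix.bar.com",
  "placeholder-access-token",
  "!campaign:matrix.bar.com",
  "com.example.dice_roll",
  { dice: "1d20", result: 17 },
  "txn-0001",
);
// To actually send it:
// fetch(diceRollRequest.url, { method: "PUT", headers: diceRollRequest.headers, body: diceRollRequest.body });
```

Everything app-specific lives in the event content; the homeserver handles accounts, delivery and federation.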

dane-pgp · on March 5, 2022

> you might not even need a backend at all.

On the basis that "the best code is no code", this could revolutionise app architecture. Imagine if instead of using blockchains, Web3 apps were backed by Matrix servers.

mawise · on March 5, 2022

I guess I still don't understand it. (And my apologies for coming across like I was trying to pick a fight)

What does it mean that I can currently use it with my existing homeserver? The architectures for some of these federated systems can be difficult to understand, I feel like I didn't really understand what _Matrix_ was until I tried running my own homeserver.

jfkimmes · on March 5, 2022

Federated systems tend to have different client and server implementations that can work in all combinations (as long as both are spec compliant).

The confusing part comes into play when the client implementation is a web app that you can use to log in to any server hosted anywhere. Because of how centralized services work, we are used to a paradigm where the website you visit hosts its own backend - you cannot choose. With these federated systems, on the other hand, you can go to website element.foo.com and log in at matrix.bar.com.

The blog post was saying they limited the option to log into your homeserver at matrix.bar.com (because of the reasons outlined above). This does not, however, stop you from hosting their (FOSS) app at call.bar.com and pointing it to your homeserver at matrix.bar.com

lijogdfljk · on March 5, 2022

A bit off topic, but anyone know if Threads are something Matrix currently supports, or ever wants to support?

My family uses Signal for _a lot_, likely way more than we should, and i feel like we'd benefit from threads immensely.

Maybe i should toy with making my own Matrix client or something, heh.

edit: https://github.com/vector-im/element-web/issues/2349 looks like it might exist in some beta form. Depending on the client heh

jeroenhd · on March 5, 2022

Threads are supported on Matrix as of a week or two ago. They're no longer in beta as far as I can tell. I don't know for sure how well alternative clients support them, but desktop and mobile versions of Element do threads just fine.

Alternatives (Fluffychat, for example) are usually a while behind Element because of manpower. For example, native polls are in Element but not in Fluffychat yet. So, if you want to risk the transition, best to recommend the official clients first.

That said, I don't know how well threads will actually work in a family context. I like them as a concept, but outside Google Chat (where every message in a room is a response to or the start of a thread) I haven't seen them get picked up naturally.

kevincox · on March 5, 2022

Threads are in beta on the Element clients. IIUC the spec isn't quite solidified (they are waiting for more clients to experiment first).

Arathorn · on March 5, 2022

yup, it’s currently in testing on element web/ios/android and you can play with it in beta.

Apotheos · on March 6, 2022

Have always wanted to try and run a Matrix instance but the tutorials all seem just a bit out of my skill level.

Would love if there was a docker container to easily spin up an instance.

ryukafalz · on March 7, 2022

There is: https://hub.docker.com/r/matrixdotorg/synapse/

jwithington · on March 5, 2022

Big fan of matrix. Run a server on my RPi.

giancarlostoro · on March 5, 2022

My only concern is: does this expose your IP to participants? This is one benefit of Discord Voice channels vs Discord Calls (which are end to end but expose your IP to everyone on the call, and theirs to you, so much so that you can just open a console log and see them all...).

Arathorn · on March 5, 2022

Currently yes, but it's trivial to fix and we'll do so on Monday - see https://news.ycombinator.com/item?id=30570436

mattmerr · on March 5, 2022

Sounds like hiding IP is going to be opt-in? I'm not sure what the implications are of TURN but what would be the downside of making IP-hiding the default?

Arathorn · on March 5, 2022

it would increase latency badly (the turn server for the call.element.io instance is in the UK, so all traffic would bounce through it, often unnecessarily), and it would cost us loads on bandwidth as a result. It also means that the turn server sees all the IPs and metadata of who is calling who, which may not be an improvement if you trust your caller more than your server admins!

For instance, two users on the same LAN calling each other would end up bounced via the UK, which is a bit unfortunate if they are in Australia.

The solution is really to switch to using SFUs everywhere, which then solves firewall traversal, scalability and privacy (assuming you’re happy for your SFU to know your IP - but if you’re happy for your TURN to know it, then it’s probably fine).

stevenicr · on March 5, 2022

great post, animation helps a bunch too.

wondering if IP addresses are exposed to the other users like with WebRTC and, if so, could we force connections to go only through coturn to hide the IPs?

Also wonder if there will be a warning for such, especially if encryption is turned on - some may think they are truly anonymous, and where IPs are exposed of course that's not true.

Also wonder about moderation, hopefully this does not become a target for the sickening trolls soon - but moderation needs will be coming, so who gets the ip logs to consider blocking? homeserver runners?

I expect the future will need whitelists/blocklists subscription options for clients at some point.

Arathorn · on March 5, 2022

Currently we don't force TURN, so in practice this means that voice packets go direct between the clients if possible, and so the IP addresses of the clients are necessarily exposed to each other.

However, this is utterly trivial to fix: matrix-js-sdk already exposes https://github.com/matrix-org/matrix-js-sdk/blob/96ba061732b... and we simply haven't exposed it as a setting in Element Call yet. I've filed a bug for it at https://github.com/vector-im/element-call/issues/251 - thanks for bringing it up!
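For readers wondering what "forcing TURN" means at the WebRTC level, here is a minimal sketch using the standard `iceTransportPolicy` setting. It illustrates the general browser mechanism, not necessarily the exact option the truncated link above points at. The TURN URL and credentials are placeholders, and `relayOnlyConfig` is a hypothetical helper name.

```typescript
// Sketch: a peer connection configuration that only allows relayed (TURN) candidates.
// The server URL and credentials below are placeholders, not real Element Call settings.
interface TurnServer {
  urls: string[];
  username: string;
  credential: string;
}

function relayOnlyConfig(turn: TurnServer) {
  return {
    iceServers: [turn],
    // "relay" makes the browser gather only TURN candidates, so the other
    // participant sees the TURN server's address instead of yours.
    iceTransportPolicy: "relay" as const,
  };
}

const relayConfig = relayOnlyConfig({
  urls: ["turn:turn.example.org:3478?transport=udp"],
  username: "placeholder-user",
  credential: "placeholder-secret",
});

// In a browser: const pc = new RTCPeerConnection(relayConfig);
```

The default policy, `"all"`, also permits direct candidates, which is what exposes the clients' IP addresses to each other.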

In terms of moderation: this is no different to moderation in Matrix as a whole, where we're already busy working on shared greylists (MSC2313 and friends) - https://matrix.org/blog/2020/10/19/combating-abuse-in-matrix... has more details at the end.

Godel_unicode · on March 5, 2022

Has anyone looked into their implementation of E2EE for calls? My naive assumption is they are brokering some kind of shared key exchange as opposed to requiring everyone to send an individually encrypted stream to everyone else.

jfkimmes · on March 5, 2022

The post says there is no E2EE at the moment. They're probably talking about signalling, since with the current implementation of full mesh you get E2EE 'for free' in WebRTC.

Also the post mentions: https://2021.commcon.xyz/talks/extending-matrix-s-e2ee-calls...

Arathorn · on March 5, 2022

So in the initial beta we haven't turned on E2EE, purely because it will make it way harder to debug any problems which surface.

However, at this rate, things are looking pretty stable and i'd expect us to enable it in the next few weeks. It uses normal Matrix E2EE, which means a Double Ratchet between the pairs of devices participating in a given conversation, which is then used to secure the signalling which is used to set up the calls between the devices. The call media is transport-layer encrypted via DTLS and SRTP, so as long as the signalling with the DTLS fingerprints is secured by Matrix E2EE, the whole call fabric will be E2EE.
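As an illustration of why securing the signalling matters, here is a small sketch that pulls the DTLS certificate fingerprints out of an SDP offer or answer. These `a=fingerprint` lines are what the browser checks the remote DTLS certificate against, so if the SDP travels over an E2EE channel, nobody in the middle can swap them. The helper name and the sample SDP (including its fingerprint value) are made up for the example.

```typescript
// Sketch: list the DTLS fingerprints advertised in an SDP blob.
interface DtlsFingerprint {
  algorithm: string; // e.g. "sha-256"
  value: string;     // colon-separated hex digest of the certificate
}

function dtlsFingerprints(sdp: string): DtlsFingerprint[] {
  const results: DtlsFingerprint[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^a=fingerprint:(\S+) (\S+)$/.exec(line.trim());
    if (match) {
      results.push({
        algorithm: match[1].toLowerCase(),
        value: match[2].toUpperCase(),
      });
    }
  }
  return results;
}

// Made-up SDP fragment; the fingerprint is a placeholder, not a real certificate hash.
const sampleSdp = [
  "v=0",
  "o=- 0 0 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "a=fingerprint:sha-256 00:11:22:33:44:55:66:77:88:99:AA:BB:CC:DD:EE:FF:00:11:22:33:44:55:66:77:88:99:AA:BB:CC:DD:EE:FF",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
].join("\r\n");

const sampleFingerprints = dtlsFingerprints(sampleSdp);
```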

https://github.com/matrix-org/matrix-js-sdk/pull/2002 is the PR we actually need to merge in order to enable E2EE on it, ftr.

dodgerdan · on March 6, 2022

Will E2EE be on by default?

Arathorn · on March 6, 2022

arianvanp · on March 5, 2022

Mesh doesn't scale for large conference calls though. Usually some SFU needs to be in play. But with E2EE that's only possible if you have Insertable Streams support, which for now only Chrome has.

Arathorn · on March 5, 2022

Safari DP 141 actually released RTCRtpScriptTransform support 2 days ago (thanks to https://trac.webkit.org/changeset/270107/webkit/). But we're waiting eagerly for Firefox to add it in https://bugzilla.mozilla.org/show_bug.cgi?id=1631263.

Hopefully by the time we've sorted out the SFU component, all the browsers will have RTCRtpScriptTransform implemented and so we'll get good cross-platform E2EE for larger calls (rather than it being limited to Chrome + Safari + Desktop)
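A rough sketch of how an app might detect which of these APIs a browser offers, assuming the standard `RTCRtpScriptTransform` constructor and Chrome's older `createEncodedStreams` method on `RTCRtpSender`. The function name and return values are hypothetical; the scope is passed in as a parameter so the logic can also be exercised outside a browser.

```typescript
// Sketch: detect which encoded-media transform API (if any) is available.
type EncodedTransformSupport = "script-transform" | "insertable-streams" | "none";

function detectEncodedTransformSupport(
  scope: Record<string, unknown>,
): EncodedTransformSupport {
  // Standards-track API (the one Safari and Firefox are discussed as adding above).
  if (typeof scope["RTCRtpScriptTransform"] === "function") {
    return "script-transform";
  }
  // Chrome's earlier Insertable Streams API hangs off RTCRtpSender.
  const sender = scope["RTCRtpSender"] as { prototype?: object } | undefined;
  if (sender?.prototype && "createEncodedStreams" in sender.prototype) {
    return "insertable-streams";
  }
  return "none";
}

// In a browser:
// detectEncodedTransformSupport(globalThis as unknown as Record<string, unknown>);
```

An app could use the result to decide whether to offer E2EE for SFU-based calls or fall back to full mesh.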

arianvanp · on March 18, 2022

Oh very nice!

jiffygist · on March 5, 2022

Just wondering, can I have fullband stereo sound to play music?

Arathorn · on March 5, 2022

The media quality that you hear is entirely up to the sender's WebRTC implementation. If you're calling another Element Call user, we currently use the default audio constraints when capturing audio: https://github.com/matrix-org/matrix-js-sdk/blob/96ba061732b... - but if you wanted stereo, you could put { channelCount: 2, sampleRate: 48000 } or whatever into that line to achieve it.
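A minimal sketch of what such constraints could look like when passed to `getUserMedia`. The `channelCount` and `sampleRate` values are the ones suggested above; switching off echo cancellation, noise suppression and auto gain is an extra assumption (speech processing tends to hurt music), and `musicCaptureConstraints` is a hypothetical helper name. Whether stereo is actually sent also depends on the negotiated Opus parameters, so treat this as a starting point only.

```typescript
// Sketch: capture constraints aimed at music rather than speech.
function musicCaptureConstraints() {
  return {
    audio: {
      channelCount: 2,   // ask for stereo capture
      sampleRate: 48000, // fullband sample rate
      // Assumption, not from the thread: disable speech-oriented processing.
      echoCancellation: false,
      noiseSuppression: false,
      autoGainControl: false,
    },
    video: false,
  };
}

const musicConstraints = musicCaptureConstraints();

// In a browser:
// const stream = await navigator.mediaDevices.getUserMedia(musicConstraints);
```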

I'll file a bug to make this configurable.

EDIT: https://github.com/vector-im/element-call/issues/249

puyoxyz · on March 5, 2022

Why is this an Element thing and not a Matrix thing?

Arathorn · on March 5, 2022

The protocol, spec (MSC3401) and underlying client heavy lifting (matrix-js-sdk) are all Matrix; contributed by a mix of Element employees (Robert, me, Dave) and community contribs (Simon B). The app skin itself on top is Element, given most of the work and the original idea is from Element folks - same reason that Element itself is a product from Element-the-company even though 95% of the code is underlying Matrix projects. The Matrix Foundation itself is a non-profit to look after all the Matrix protocol and reference implementations but doesn’t ship user-facing products, just as W3C and Linux Foundation don’t.

throwaway29879 · on March 6, 2022

Still waiting for an android release that doesn't use ~7G ram and crash consistently on desktop (14G ram, then dies, or most recently, instantly segfaults)

As usual, delusions of grandeur and horrific overconfidence in the abilities of a 45 million dollar funded gang of "developers" that have produced basically nothing in 8 years, when unfunded students have produced a Discord clone in about 6 months

robobro · on March 5, 2022

Did anyone else try to install this? It just refreshes non stop for me when I access its domain.

Arathorn · on March 5, 2022

works fine for me. if the app had launched i'd ask you to submit feedback (which ends up in our bugtracker). as it hasn't, can you file a bug on github.com/vector-im/element-call/issues with a copy of the javascript console so we can see what's breaking? you might also want to disable any weird browser extensions to see if that's to blame.

robobro · on March 5, 2022

sure, appreciate the comment! :)

The main instance @ call.element.io also doesn't seem to work for me, but I chalked that up to (assumed) heavy server load.

2Gkashmiri · on March 5, 2022

this does not seem to have changeable room addresses right now like jitsi.

we use jitsi common url to have weekly meetings. meet.jit.si/meeting2. everyone knows the url and time. we just join and talk. this is a nice concept

jeroenhd · on March 5, 2022

That seems to be available, though?

https://call.element.io/meetingwiththegang opens a meeting for a room named "meetingwiththegang".

Ideally, you'd want to use such a system with accounts so you can apply some kind of ACL, but if you want to use randomly generated room names for secret links then everything you need is there already.

Looks like you can't change the room name after starting the call, but you can start a new one at the new address if you want to change the name.

kitkat_new · on March 5, 2022

I think it does in theory (Matrix supports changing room addresses), however what is the benefit of this as opposed to using the same address for all meetings?

jacobmischka · on March 5, 2022

I think that's what they want, I believe by "changeable" they meant "customizable"?

3np · on March 5, 2022

The person you're replying to wants to set it to something custom, to make it memorable and meaningful.

Arathorn · on March 5, 2022

We have this today. https://call.element.io/whateveryouwant does what you'd expect.

ve5eta · on March 5, 2022

Impressive

soupbowl · on March 5, 2022

I am excited about this, I hope it is stable and secure...

encryptluks2 · on March 5, 2022

I feel if XMPP just had some better tools this push for Matrix would be unnecessary.

Gigachad · on March 5, 2022

If IRC had some better tools then the push for XMPP would be unnecessary. In fact, why didn't we just build IM on top of email?

Turns out it's much, much easier to start from scratch with the right ideas than to transform an existing protocol into something usable, convince everyone else, and update all software to match.

"The market" has decided that Matrix works well and XMPP doesn't. If it made sense to fix XMPP then someone would have done it and we would all be using that.

SamWhited · on March 6, 2022

I don't think this is true. Matrix is largely only used by the HN crowd and a handful of open source projects. It's a couple million users max, which in this case is tiny. XMPP is used in Jitsi and Zoom and various Google products (is Firebase still based on XMPP? I'm not actually sure about this one) and on your Nintendo Switch and probably still on Playstation, and in most of Cisco's video stuff which is used by every megacorp, etc.

grey_earthling · on March 6, 2022

Someone did build IM on top of email: https://delta.chat

pigeons · on March 5, 2022

Yeah, I read Matrix's reasoning for not just improving XMPP and instead starting a new protocol, and I gave them a few years, and have to use matrix quite a bit and am consistently convinced it was the wrong choice and XMPP is the way to go.

GeckoEidechse · on March 5, 2022

https://matrix.org/faq/#what-is-the-difference-between-matri...

zaik · on March 5, 2022

If you read https://github.com/matrix-org/matrix-spec-proposals/blob/mat... it becomes apparent that the 'eventually consistent' part of Matrix is more of a hindrance in this case.

Also 'the more federation and interoperability the better' is kind of contradictory to constantly reinventing existing Internet Standards. I suspect VoIP over Matrix is not compatible with SIP or XMPP A/V calls, and looking at the current state of the matrix.org XMPP chat implementation, they probably never will be.

zaik · on March 5, 2022

As of recently, Dino already can do video conferences based on XMPP and other clients are working on it as well. There is also a bidirectional SIP/XMPP bridge, so it integrates with other existing Internet Standards: https://sip.cheogram.com/

kitkat_new · on March 5, 2022

please read the article (and linked MSC [0]) first and check out how this decentralized(!) conferencing proposal works.

I don't think tools that implement this exist in the XMPP ecosystem at all.

[0] https://github.com/matrix-org/matrix-spec-proposals/blob/mat...