Prior Art and Competing Projects
In this blog post, we will compare Polyphony with various existing projects, both open-source and proprietary, to understand their strengths and weaknesses, and to identify areas where Polyphony can either copy features or differentiate itself.
Signal
Signal is a pioneer in modern end-to-end message encryption. The protocol and software are engineered around security and privacy concerns. Signal excels in this regard, accepting some cuts in features or usability compared to other messengers.
While all apps and server software are available under a FOSS license, Signal is not federated and discourages third-party clients; it is instead implemented as a centralized service. This makes it hard for users to self-host or modify the software, which is a practical barrier to truly free software. That said, some practical efforts have been made to base third-party software on Signal, such as Cable IM and Molly.
Moreover, Signal makes a fairly decent argument against federation on their blog: federated protocols tend to get "stuck in the past", because multiple independent implementations do not evolve as quickly as a centralized service. We will have to plan around this trade-off in some way.
Signal requires accounts to be tied to a phone number, which forces users to have a single identity/account and makes it hard to keep parts of one's online life separate.
SimpleX
SimpleX is an end-to-end encrypted messaging app with a strong emphasis on metadata privacy.
The lack of global identifiers is a core design feature. Users invite each other to converse over a particular set of relay servers, whose addresses are included in the invite link. Each link is unidirectional, so that when Alice talks to Bob, the relays used are different from those used when Bob replies to Alice. When multiple users use a single relay server, messages are sent out with randomized timing ("traffic mixing") to prevent external observers from learning who is talking to whom. Because messaging is invite-only, the protocol is inherently spam-resistant.
Relays can be easily self-hosted, and any user is free to choose which relay servers they trust; the extent of this trust is limited to relaying messages effectively. A trade-off exists between larger relays, which can perform traffic mixing more effectively, and smaller relays, which are operated by better-trusted parties.
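To make the unidirectional-link idea concrete, here is a toy in-memory model. It is only a sketch of the description above, not SimpleX's actual protocol or API: the `Relay` and `Invite` classes, the `create_invite` helper, and the relay addresses are all hypothetical, and traffic mixing and encryption are left out entirely.

```python
import secrets
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Relay:
    """Hypothetical stand-in for a relay server: it only stores and forwards."""
    address: str
    queues: dict = field(default_factory=dict)

    def create_queue(self) -> str:
        # A random queue ID replaces any global user identifier.
        queue_id = secrets.token_hex(8)
        self.queues[queue_id] = deque()
        return queue_id

    def send(self, queue_id: str, message: str) -> None:
        self.queues[queue_id].append(message)

    def receive(self, queue_id: str):
        queue = self.queues[queue_id]
        return queue.popleft() if queue else None


@dataclass
class Invite:
    """What the recipient hands out: a relay plus a one-way queue on it."""
    relay: Relay
    queue_id: str


def create_invite(relay: Relay) -> Invite:
    # The recipient picks the relay and creates the queue it will read from.
    return Invite(relay, relay.create_queue())


# Each direction of the conversation uses its own relay and queue.
bob_relay = Relay("relay-one.example")
alice_relay = Relay("relay-two.example")
to_bob = create_invite(bob_relay)      # Bob gives this to Alice
to_alice = create_invite(alice_relay)  # Alice gives this to Bob

to_bob.relay.send(to_bob.queue_id, "Hi Bob")
to_alice.relay.send(to_alice.queue_id, "Hi Alice")

print(bob_relay.receive(to_bob.queue_id))
print(alice_relay.receive(to_alice.queue_id))
```

In this model, neither relay ever sees both directions of the conversation, and nobody can write to a queue without first being handed its invite.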
As a result of this strong focus on security, the design is fairly complex, with security features somewhat dominating over user experience.
The multi-device experience is poor and requires some technical knowledge: it involves remotely controlling clients through TCP ports, which causes issues with firewalls and with availability when the machine running the client is offline.
Discord
Discord is a large commercial chat and communications platform, with an emphasis on building communities ("servers" or "guilds") of varying sizes and with optional public accessibility. Thanks to its size, the freemium business model, ease of use, and voice and video chat offerings, it has evolved into a de-facto standard messaging platform for casual conversations, especially for younger generations.
Discord serves as a significant source of inspiration for Polyphony, especially its focus on communities rather than more individualized messaging.
Discord is a proprietary and centralized service: a single large commercial party, a company based in the USA, processes and stores all communication in clear text, including direct messages. With lax privacy statements, they are in a position to effectively do whatever they want with this data.
Users are at the mercy of a centralized moderation team, which is in a position to ban users for any reason, while banned users are not in much of a position to appeal. For instance, Discord discourages the use of third-party clients, and may ban an account if abnormal behavior from a client is detected. Their policy on this is a bit unclear, alleging they ban third-party clients for appearing like spam bots abusing the client API rather than the dedicated bot API, but it still illustrates being at the whims of their moderation team. Recently, invite links to servers marked as "community" were set to auto-expire after a while; this was done unilaterally, without the consent of these communities. It is fairly common to find expired Discord invite links issued by users who were not aware of this restriction.
Moreover, Discord is frequently misused for hosting files, documentation, or discussion which may have future value for indexing, searching, and archiving. Discord servers are almost always closed off to the wider web, unless third-party archiving solutions are used.
ActivityPub/Fediverse
ActivityPub is a W3C-recommended federated social networking protocol, known for microblogging applications such as Mastodon, Misskey (and its forks), GoToSocial, etc. It enables a network of independent servers to provide a decentralized, federated microblogging service. Users on any server may view posts by, and interact with, users from any other server.
Due to its decentralized and federated nature, the Fediverse is generally resilient against issues that affect individual servers; a single server failure will generally not affect users on other servers.
Culture, rules, policies, and how they are enforced may vary wildly from one server to the next. While this allows a great plurality of ways of working, being, and existing, it also frequently leads to bouts of drama. The notion of "defederation" in particular, whereby one server blocks another, preventing any interactions between users of both servers and severing existing follower/followee relations, frequently leads to conflict.
If a server administrator is abusive, users may open an account on another server and continue; this is a significant strength of the federated model. However, while migrating followers/followees from one account to the other is supported, this process often does not transfer all followers, does not transfer posts from the old account to the new one, and relies on cooperation from the old server, which might not occur in case of abuse or sudden instance shutdown. It is on this aspect that Polyproto (and Polyphony more generally) intends to significantly improve, by empowering individual users to take control of their own identity on the network and by supporting migration as a first-class feature.
Email (SMTP)
SMTP-based Email is a comparatively ancient system, but still in extremely widespread and active use. SMTP is a truly federated system with a wide variety of implementations and providers, and is fairly easy to self-host. Hosting is usually dominated by large providers, but small single-user self-hosted SMTP servers generally work fine, once DNS records are set up correctly.
There is relatively little security built into the protocol itself, with most measures added on afterward and not applied consistently among implementations. PGP and S/MIME are common methods for signing and encrypting messages, though this is mainly done by expert users who know how to use these technologies. Sadly, PGP and S/MIME come with their own suite of vulnerabilities.
GNU Jami (formerly Ring)
Jami/Ring uses a distributed hash table to organize a truly serverless peer-to-peer instant messaging system. Accounts are merely public/private key pairs with some metadata, and can be created without email verification or other such steps.
Unfortunately, GNU Jami also comes with a lot of the downsides associated with P2P networking. For instance, when tried by the author of this document, messages were simply not delivered or were delivered with an extreme delay. P2P does not always play well with local network setups, which often involve a myriad of different firewalls, network address translation (NAT), and local administrative policies; these may lead to mysterious and hard-to-diagnose issues, especially for novice users. Protocols that ensure clients only ever talk to servers in standard ways (such as HTTP(S)) avoid such local issues.
Moreover, in P2P, nobody is in charge of the infrastructure by design; this makes the overall network somewhat uncontrollable.
RetroShare
RetroShare is another mostly serverless peer-to-peer networking application. It integrates messaging, social networking, and even email over its network.
RetroShare also integrates features built around a kind of "social graph", meaning an observable graph of who is a peer of whom, which may serve as a solid basis for establishing webs of trust and social organization.
Similar to Ring/Jami, RetroShare uses a DHT for routing; the result is a similarly decentralized design for peer-to-peer networking. While the client includes a number of tricks for routing, this page's author had trouble actually connecting to a peer, despite both parties being fairly well-versed in technological matters. In our view, this cements the need for more traditional client-server approaches, if only because these play better with local networks.
Mumble
While Mumble does voice chat, not instant messaging, we do feel a need to mention it as a piece of self-hostable free software that simply does one thing well.
While technically not federated, any client can connect to any server. Servers may host small communities, all while there is no lock-in of specific users to specific servers. Communities and groups themselves, however, do not necessarily have an option for migration if they are defined by the server's address.
Zulip
Zulip is a free and open-source (Apache license), non-federated collaborative messaging platform. The project's own collaboration chat is hosted on the official instance, but the software can be self-hosted as well.
We mention Zulip for its notable UX surrounding asynchronous communication organized by topic, where topics are themselves grouped into "channels".
As conversation threads tend to go off-topic after a while, a remarkable feature is the ability for moderators to post-facto move messages to different threads or topics, keeping things organized without enforcing that conversations rigidly remain on-topic all the time.
Polyphony should probably steal some ideas here.
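As a rough illustration of why post-facto moves are cheap when topics are just message metadata, consider the following sketch. It is a hypothetical data model, not Zulip's actual schema or API: the `Message` class and the `move_messages` and `texts_in_topic` helpers are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical message record: the topic is a label, not a container."""
    id: int
    channel: str
    topic: str
    text: str


def move_messages(messages, ids, new_topic, new_channel=None):
    """Relabel the selected messages; returns how many were moved."""
    moved = 0
    for message in messages:
        if message.id in ids:
            message.topic = new_topic
            if new_channel is not None:
                message.channel = new_channel
            moved += 1
    return moved


def texts_in_topic(messages, channel, topic):
    """List message texts in a topic, in their original order."""
    return [m.text for m in messages if m.channel == channel and m.topic == topic]


history = [
    Message(1, "dev", "release plan", "Can we ship on Friday?"),
    Message(2, "dev", "release plan", "Unrelated: the CI runner is down."),
    Message(3, "dev", "release plan", "Friday works for me."),
    Message(4, "dev", "release plan", "CI is back up."),
]

# A moderator moves the off-topic messages into their own topic afterwards.
move_messages(history, {2, 4}, "CI outage")

print(texts_in_topic(history, "dev", "release plan"))
print(texts_in_topic(history, "dev", "CI outage"))
```

Because a move only relabels messages, the original ordering and message IDs are preserved, so both threads still read naturally afterwards.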
Matrix
Matrix operates as a decentralized, federated communication platform. Users may be hosted on any server, and can then communicate freely with users on any other.
Matrix provides end-to-end encrypted chat as well as a number of fairly complex and original mechanisms for resilience and security.
While this work is legitimately impressive, the resulting protocol is complex and relatively difficult to implement, whereas Polyphony aims at simplicity and practical usability above all.
We explicitly wish to avoid Matrix's dreaded "unable to decrypt message" issue, which most new users run into and are frequently confused by.
XMPP
XMPP (Extensible Messaging and Presence Protocol) is a relatively old, well-used federated message exchange protocol with a wide and diverse variety of implementations. The federation is meaningful in that anyone can easily self-host a server and communicate with others freely.
The protocol consists of core instant messaging functionality, as well as a somewhat overwhelmingly large number of protocol extensions (XEPs) enabling a variety of behaviors and applications, including group chat, keeping message histories on servers, copying messages to other clients, end-to-end encryption via OMEMO, as well as several competing standards for message styling and formatting.
While extremely granular and modular, this makes the protocol somewhat difficult to implement and use in practice, which is why Polyphony aims to implement protocol extensions in a far less granular manner.
Conclusion
Compared to existing and competing protocols, Polyphony fills a niche currently not occupied by other applications. Existing applications are either proprietary, prioritize security over usability, are centralized, deal poorly with abuse by infrastructure owners, or lack the kind of community-building that a platform like Discord offers.
By focusing on different priorities, Polyphony aims to provide real-world usability and dependability.