A conformally parameterized discrete version of a catenoid minimal surface. Just like the elliptic curves over finite fields used in elliptic curve cryptography, it can be interpreted as a discrete version of a corresponding Riemann surface. See Bobenko et al. for more information about the mathematical background.

Implementing the Signal E2EE protocol for sopher.io

Published in Becoming sopher, Aug 29, 2018

Sopher.io provides end-to-end encrypted communication for teams. We offer end-to-end encryption for each and every communication that happens between your devices and those of your team members. In this post we explain how we got there and how our implementation differs from the Signal protocol as specified by Open Whisper Systems (OWS).

When we started our implementation in December 2016, OWS had just released descriptions of their algorithms for the handshake protocol X3DH and for the Double Ratchet algorithm into the public domain. That was great since we would not have to dig around in their source code. An earlier attempt by the encrypted messenger Wire to implement the Signal protocol had run into legal problems due to similarities with the source code of OWS's implementation. The description of the multi-device algorithm Sesame was added a bit later, so we came up with our own solution, which is pretty similar. But first things first: what properties do we expect from a messaging protocol that suits our needs?

Properties of an ideal protocol

One key aspect of the technical side of Sopher is the fact that we do not operate an application server. Everything happens in the client and if we really need to store information online we upload it to standard cloud storage such as Dropbox, Google Drive, S3 or the like. This poses a few challenges to the implementation of the messaging protocol. In particular we faced the following tasks:

Only use cryptographic primitives provided by the WebCrypto API
Symmetric asynchronous handshake
Message acknowledgement
Multi-device support

A browser implementation of the Signal protocol using TypeScript

For the browser implementation of the Sopher client we use TypeScript. This enables us to leverage the power of asynchronous programming using the keywords async and await. Not to mention the beauty of working with an editor (mostly Visual Studio Code) that really understands the structure and dependencies of the project. If you are not familiar with TypeScript, check out its documentation, it's worth it.

In the next sections we will walk you through the differences from vanilla Signal as described in the OWS specs.

WebCrypto primitives

At the current stage of development we support browsers that implement the WebCrypto API sufficiently well (Chrome, Firefox). This has several advantages security-wise and implementation-wise. It is more secure since we can leverage the power of non-extractable keys. A CryptoKey can be flagged non-extractable when generated by the API:

let key = await crypto.subtle.generateKey(
  {
    name: 'ECDH',
    namedCurve: 'P-384'
  },
  false, // make the key non-extractable
  ['deriveKey', 'deriveBits']
);

This makes it extra hard for an attacker to steal the keys. Nevertheless those keys can be stored in IndexedDB for subsequent usage.
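The effect of the extractable flag can be checked directly. The following is a minimal sketch, assuming a runtime that exposes the global crypto.subtle (current browsers, recent Node.js). The helper names generateDeviceKey and canExportPrivateKey are made up for this illustration and are not part of the Sopher code base.

```typescript
// Hypothetical helper: generates an ECDH key pair on P-384.
// With extractable = false the private key cannot leave the key store.
async function generateDeviceKey(extractable: boolean): Promise<CryptoKeyPair> {
  return crypto.subtle.generateKey(
    { name: 'ECDH', namedCurve: 'P-384' },
    extractable,
    ['deriveKey', 'deriveBits']
  );
}

// Hypothetical helper: reports whether the private half of a key pair
// can be exported as a JWK.
async function canExportPrivateKey(pair: CryptoKeyPair): Promise<boolean> {
  try {
    await crypto.subtle.exportKey('jwk', pair.privateKey);
    return true;
  } catch {
    // exportKey rejects for private keys generated with extractable = false.
    return false;
  }
}
```

The public half of the pair stays exportable regardless of the flag, which is what allows a device to publish it for other devices.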

Obviously sticking to WebCrypto greatly simplifies the implementation as we do not have to create fallback strategies for browsers without WebCrypto. Such fallbacks would add a great amount of overhead since one would have to use Web Workers for encryption to avoid locking the browser context.

The X3DH protocol recommends using the X25519 or X448 elliptic curves for Diffie-Hellman key agreements. However, neither Chrome nor Firefox supports those curves in its WebCrypto implementation, let alone Safari. Safari did, however, add ECDH support to its Technology Preview in May 2017. We therefore decided to use the standard NIST curve P-384 to perform Diffie-Hellman key agreements and ECDSA signatures.

All other cryptographic primitives recommended by OWS are available in WebCrypto implementations. We use AES-CBC for content encryption and HMAC with SHA-512 for payload signatures as described in the JWA specification. One could consider using direct encryption as described in the spec to save a little space here. On the other hand, we need per-recipient encryption for group messaging anyway. One way to implement this is to encrypt a single content encryption key separately for each recipient.
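Per-recipient encryption of a content encryption key (CEK) can be sketched with the AES key wrap operation of WebCrypto. This is a simplified illustration under stated assumptions: the CEK is a single 256-bit AES-CBC key rather than the composite key of A256CBC-HS512, each recipient is represented by a hypothetical id and an AES-KW key, and the function names are invented for this example.

```typescript
interface Recipient {
  id: string;      // hypothetical recipient identifier
  kek: CryptoKey;  // AES-KW key shared with that recipient
}

// Wraps one CEK separately for every recipient. The content itself is
// encrypted only once with the CEK.
async function wrapForRecipients(
  cek: CryptoKey,
  recipients: Recipient[]
): Promise<Record<string, ArrayBuffer>> {
  const wrapped: Record<string, ArrayBuffer> = {};
  for (const r of recipients) {
    wrapped[r.id] = await crypto.subtle.wrapKey('raw', cek, r.kek, 'AES-KW');
  }
  return wrapped;
}

// Recovers the CEK on the recipient side.
async function unwrapCek(wrapped: ArrayBuffer, kek: CryptoKey): Promise<CryptoKey> {
  return crypto.subtle.unwrapKey(
    'raw', wrapped, kek, 'AES-KW',
    'AES-CBC', true, ['encrypt', 'decrypt']
  );
}
```

The CEK has to be extractable for wrapKey to succeed, and each wrapping key needs the wrapKey and unwrapKey usages.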

Symmetric asynchronous handshake

We basically follow the specification of X3DH when establishing a session between devices. The one major difference is in the way we determine in advance who is Alice and who is Bob to avoid the hassle of maintaining more than one session as described in the Sesame algorithm.

In Sopher a shared file in a cloud storage plays the role of the server. A client publishes its public (pre-)keys for other clients to pick up and start a session. In particular we store the following information in a JWS structure signed with the device key:

{
  DID: "c6e4b0b5b8ffcfeb4e0...",
  name: "Chrome",
  timestamp: 1535538368530,
  key: {crv: "P-384", kty: "EC", ...},
  pipeKey: {crv: "P-384", kty: "EC", ...},
  signedPrekey: {crv: "P-384", kty: "EC", ...},
  onetimePrekeys: [
    {crv: "P-384", kty: "EC", ...},
    {crv: "P-384", kty: "EC", ...},
    {crv: "P-384", kty: "EC", ...},
    ...
  ],
  invitations: [
    {
      salt: "jJTjeLvmeeYyT8...",
      url: "eyJhbGciOiJBMjU2S1ciLCJlbmMiOiJBMjU2Q0JDLUhTNTEyIn0..."
    },
    ...
  ],
  ...
}

We establish a transport encryption by using Diffie-Hellman key agreement with the corresponding pipeKeys and invitation salts of the devices. The resulting symmetric key is used to encrypt both the url of the message pipe (the file that contains all encrypted messages) given in the invitations attribute and the contents of this file. If Alice wants to start sending messages to Bob, she uses the corresponding device keys as IKa and IKb, the signedPrekey of Bob as SPKb and one of the public keys contained in the onetimePrekeys array as OPKb. In addition to that she generates her own ephemeral key pair EKa. Refer to the X3DH protocol for the details on how to arrive at a shared secret using all of these keys.
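The key agreement itself can be sketched with WebCrypto as follows. This is an illustration of the four Diffie-Hellman steps of X3DH on P-384, not the Sopher implementation. The function names and the HKDF info string are placeholders, and the curve-specific prefix that the X3DH spec prepends to the key material for X25519 and X448 is left out because the spec does not define one for P-384.

```typescript
// One Diffie-Hellman step on P-384: returns the 48-byte shared x coordinate.
async function dh(privateKey: CryptoKey, publicKey: CryptoKey): Promise<Uint8Array> {
  const bits = await crypto.subtle.deriveBits(
    { name: 'ECDH', public: publicKey }, privateKey, 384
  );
  return new Uint8Array(bits);
}

// SK = KDF(DH1 || DH2 || DH3 || DH4) using HKDF with SHA-512.
// The all-zero salt follows the X3DH spec; the info string is a placeholder.
async function kdf(dhOutputs: Uint8Array[]): Promise<Uint8Array> {
  const ikm = new Uint8Array(dhOutputs.reduce((n, d) => n + d.length, 0));
  let offset = 0;
  for (const d of dhOutputs) {
    ikm.set(d, offset);
    offset += d.length;
  }
  const key = await crypto.subtle.importKey('raw', ikm, 'HKDF', false, ['deriveBits']);
  const bits = await crypto.subtle.deriveBits(
    { name: 'HKDF', hash: 'SHA-512', salt: new Uint8Array(64),
      info: new TextEncoder().encode('example-x3dh') },
    key, 256
  );
  return new Uint8Array(bits);
}

// Alice holds the private halves of IKa and EKa and Bob's public keys.
async function aliceSecret(
  IKa: CryptoKey, EKa: CryptoKey,
  IKb: CryptoKey, SPKb: CryptoKey, OPKb: CryptoKey
): Promise<Uint8Array> {
  return kdf([
    await dh(IKa, SPKb), await dh(EKa, IKb),
    await dh(EKa, SPKb), await dh(EKa, OPKb)
  ]);
}

// Bob holds the private halves of IKb, SPKb and OPKb and Alice's public keys.
async function bobSecret(
  IKb: CryptoKey, SPKb: CryptoKey, OPKb: CryptoKey,
  IKa: CryptoKey, EKa: CryptoKey
): Promise<Uint8Array> {
  return kdf([
    await dh(SPKb, IKa), await dh(IKb, EKa),
    await dh(SPKb, EKa), await dh(OPKb, EKa)
  ]);
}
```

Both functions arrive at the same 32-byte secret because each Diffie-Hellman step pairs the same two key pairs from opposite sides.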

In the initial messages that are included in the first and second chain of the Double Ratchet Algorithm we include the following information:

export interface InitialMessageData {
  IK: JsonWebKey;
  EK: JsonWebKey;
  SPK_THUMB: string;
  OPK_THUMB: string;
  AD: string;
}

We identify the selected signedPrekey and onetimePrekey by their respective thumbprints as defined by the JWK Thumbprint specification (RFC 7638). AD takes the value literally as defined in the X3DH spec. The public keys are serialized using the JWK specification (RFC 7517).
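For reference, a JWK thumbprint of an EC public key can be computed with a few lines of WebCrypto. The sketch below assumes SHA-256 as the hash function, which is consistent with the 43-character identifiers that appear in the message headers later in this post, but it is an assumption. The function name ecThumbprint is made up for this example.

```typescript
// Computes the RFC 7638 thumbprint of an EC public key, assuming SHA-256.
async function ecThumbprint(jwk: JsonWebKey): Promise<string> {
  // Only the required members of an EC key are hashed, in lexicographic
  // order and without whitespace.
  const canonical = JSON.stringify({ crv: jwk.crv, kty: jwk.kty, x: jwk.x, y: jwk.y });
  const digest = new Uint8Array(
    await crypto.subtle.digest('SHA-256', new TextEncoder().encode(canonical))
  );
  // Encode the digest as base64url without padding.
  let binary = '';
  for (let i = 0; i < digest.length; i++) {
    binary += String.fromCharCode(digest[i]);
  }
  return btoa(binary).replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}
```

Because only crv, kty, x and y enter the hash, optional members such as key_ops or ext do not change the thumbprint.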

The X3DH protocol is asymmetric by nature: a device starts sending messages by downloading a set of (pre-)keys from the server and calculating a shared secret from it. In the next step the recipient receives one of the first messages that includes all necessary information to complete the session handshake.

In our setting the situation is symmetric: any device can start sending messages as soon as it learns about the public device profile of a contact's device. Once the other device receives these messages it should either pick up the remote session or adapt the session it created before. To achieve this we fix the roles of Alice and Bob as described in the OWS spec in advance, using a natural order on the public EC identity keys obtained by comparing their x coordinates. The device with the smaller key is Alice, the one with the greater key plays the role of Bob.

When a device starts sending messages the situation is perfectly symmetric. Each device downloads a set of keys, performs X3DH as specified and starts sending. Only when a device receives the first message does its state have to change. In the case of Bob, the state has to be switched to a new shared secret by performing the corresponding X3DH step. In the case of Alice, just the receiving chain has to be initialized with the shared secret defined by the other device's initial messages:
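The role assignment can be sketched as a pure function on the two public identity keys. The exact comparison used in Sopher is not spelled out here, so this example assumes a byte-wise comparison of the decoded x coordinates, which for coordinates of equal length matches the order of the underlying integers. All names are illustrative.

```typescript
type Role = 'alice' | 'bob';

// Decodes a base64url string (as used in JWK members) into bytes.
function base64UrlToBytes(value: string): Uint8Array {
  const base64 = value.replace(/-/g, '+').replace(/_/g, '/');
  const padded = base64 + '='.repeat((4 - (base64.length % 4)) % 4);
  const binary = atob(padded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Lexicographic comparison of two byte sequences.
function compareBytes(a: Uint8Array, b: Uint8Array): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return a.length - b.length;
}

// The device with the smaller x coordinate is Alice, the other one is Bob.
function determineRole(own: JsonWebKey, remote: JsonWebKey): Role {
  if (!own.x || !remote.x) {
    throw new Error('EC public key without x coordinate');
  }
  const order = compareBytes(base64UrlToBytes(own.x), base64UrlToBytes(remote.x));
  if (order === 0) {
    throw new Error('identity keys have equal x coordinates');
  }
  return order < 0 ? 'alice' : 'bob';
}
```

Comparing the base64url strings directly would not give the same order as comparing the decoded bytes, which is why the sketch decodes first. Both devices evaluate the same comparison with the arguments swapped, so they always agree on who is Alice.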

Algorithm 1: Initialization of the state after the first message.

By doing so we avoid keeping track of more than one session per pair of devices.

Message acknowledgement

We need message acknowledgement in order to decide which messages we have to store online for later retrieval by the recipient's device. (If that device is offline, we write a file containing all pending messages to cloud storage.)

In order to accomplish message acknowledgement we include in each message header the missing messages that are stored in MKSKIPPED as described in the spec. The corresponding header attribute mm is an array that contains chain identifiers (the thumbprint of the corresponding public chain key), each followed by the message indices within that chain. Such a JWE header may look like this:

{
  alg: "A256KW",
  enc: "A256CBC-HS512",
  aad: {
    v: 1,
    dh: {
      crv: "P-384",
      kty: "EC",
      x: "mmEa7QxSNTCJMD80hynUq6GJV1Aq8l...",
      y: "wjgt1XqQeGx3XxLJTHjk3zLAYH3Ncr..."
    },
    n: 0,
    pn: 1,
    mm: [
      "gcNwdlmBw4EZIPbrc4V-uN6t7cesDwSeja-SgdeZxKs",
      0,
      1,
      "rKlsCL-DMQOUY_jmlXlOqeOmVZbilOO9Z_3CxiwMrOI",
      0
    ],
    dm: []
  }
}

Note that we include all additional header information that is not part of the JOSE specs in the aad (Additional Authenticated Data) attribute. This includes the protocol version v, the JWK chain key dh, the index of the message in the current chain n, and the length of the previous chain pn. If this were one of the initial messages, we would include an extra init field in the aad structure.
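The flat layout of the mm array, with each chain identifier followed by its message indices, can be decoded into a map from chain to indices. This is a small sketch of that decoding. The function name and the Map representation are choices made for this example.

```typescript
// Maps a chain identifier (thumbprint) to the missing message indices.
type MissingMessages = Map<string, number[]>;

function parseMissingMessages(mm: Array<string | number>): MissingMessages {
  const result: MissingMessages = new Map();
  let current: number[] | undefined;
  for (const entry of mm) {
    if (typeof entry === 'string') {
      // A string starts the index list of a (possibly new) chain.
      current = result.get(entry) ?? [];
      result.set(entry, current);
    } else {
      if (!current) {
        throw new Error('message index without preceding chain identifier');
      }
      current.push(entry);
    }
  }
  return result;
}
```

Applied to the header above, this yields the indices 0 and 1 for the first chain and the index 0 for the second one.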

This acknowledgement procedure can however only capture what the other side knows it is missing. Thus we must retain all messages of the current and the previous chain in addition to the messages mentioned in the header since the other side might be missing messages at the end of the previous chain or in the current chain.

We implement a method called retainStoredMessages that deletes all stored messages but those mentioned in the mm header. We invoke it before DHRatchet so it effectively leaves the stored messages of the previous and current chain alone, see line 7 of this Gist:
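To illustrate the retention rule, here is a hypothetical in-memory variant of such a method. It is not the code from the Gist: the StoredMessage shape is invented, and the chains that must be left alone are passed in explicitly instead of being protected by the call order relative to DHRatchet.

```typescript
interface StoredMessage {
  chain: string;   // thumbprint of the public chain key
  index: number;   // message index within that chain
  payload: string; // the encrypted message, opaque in this sketch
}

// Keeps every message the peer reported as missing, plus all messages
// of the protected chains (for example the current and previous chain).
function retainStoredMessages(
  stored: StoredMessage[],
  missing: Map<string, number[]>,
  protectedChains: string[]
): StoredMessage[] {
  const keepChains = new Set(protectedChains);
  return stored.filter(m =>
    keepChains.has(m.chain) ||
    (missing.get(m.chain) ?? []).indexOf(m.index) !== -1
  );
}
```

Everything that is neither reported missing nor part of a protected chain has been acknowledged implicitly and can be dropped from storage.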

Algorithm 2: The decryption routine.

Message discarding

If a device A is inactive for more than a month, device B stops encrypting and sending messages for it. Additionally, device B discards all messages that have been stored so far to avoid bloating the contents of the corresponding pipes connecting A and B. When device A becomes active again, B signals the indices of the discarded messages in the dm header field of each of its following messages to A until A, in reaction to the dm header contents, removes those indices from its mm header. See also line 9 of Algorithm 2.

Multi-device support

There is not much to say about multi-device support. As we have only one Double Ratchet session per pair of devices there is not much maintenance going on when sending messages between contacts. We have two kinds of messages at the moment: system messages that directly address individual devices, and chat messages that address all devices of a contact or group of contacts. If a message is sent to all devices of a contact it gets encrypted separately for each of the sessions.

Future work

Direct JWE encryption to save space

Currently we use the A256KW algorithm to encrypt all messages, feeding the Signal message key as input to the AES-KW operation. We believe that this extra step does not add security over using the message key directly to encrypt the content and generate the MAC of the JWE.

Acknowledgements

We would like to thank the team of Open Whisper Systems.