WGPS Confidentiality
Data synchronisation via the WGPS faces a dilemma: on the one hand, we want to opportunistically synchronise data with as many peers as we can; on the other hand, we want to preserve the confidentiality of all data that is guarded by access control. This document presents in detail how we balance those needs.
Data confidentiality goes much further than withholding Payloads. We need to protect NamespaceIds, SubspaceIds, and Paths. To make things more difficult, we want to keep these confidential even when an active eavesdropper listens in on a sync connection. We want to allow for synchronisation with anonymous peers, but we cannot protect against active eavesdropping in those sessions. Hence, we need to be careful about which information we disclose even to a peer who has read access to certain data.
NamespaceIds, SubspaceIds, and Paths do not only occur as part of authenticated Entries, but they also inform which data two peers want to sync in the first place. While two peers discover their common interests, we do not want to leak any of these either. To simplify our presentation, we introduce some definitions around these concepts, starting with the notion of a PrivateInterest.
A PrivateInterest is confidential data that relates to determining the AreasOfInterest that peers might be interested in synchronising.
Let p1 and p2 be PrivateInterests.
We say p1 is more specific than p2 if
- p1.namespace_id == p2.namespace_id, and
- p2.subspace_id == any or p1.subspace_id == p2.subspace_id, and
- p1.path is an extension of p2.path.
We say that p1 is strictly more specific than p2 if p1 is more specific than p2 and they are not equal.
We say that p1 is less specific than p2 if p2 is more specific than p1.
We say that p1 and p2 are comparable if p1 is more specific than p2 or p2 is more specific than p1.
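The specificity relations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PrivateInterest` dataclass, the use of `None` for `any`, the modelling of a Path as a tuple of byte strings, and the helper `is_prefix` are all hypothetical choices made here, and the third condition is taken to compare p1.path with p2.path.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: a Path is modelled as a tuple of byte-string components.
Path = Tuple[bytes, ...]


@dataclass(frozen=True)
class PrivateInterest:
    namespace_id: bytes
    subspace_id: Optional[bytes]  # None stands in for `any`
    path: Path


def is_prefix(prefix: Path, path: Path) -> bool:
    """True if `path` starts with all components of `prefix` (equal paths count)."""
    return path[: len(prefix)] == prefix


def more_specific(p1: PrivateInterest, p2: PrivateInterest) -> bool:
    return (
        p1.namespace_id == p2.namespace_id
        and (p2.subspace_id is None or p1.subspace_id == p2.subspace_id)
        and is_prefix(p2.path, p1.path)  # p1.path is an extension of p2.path
    )


def strictly_more_specific(p1: PrivateInterest, p2: PrivateInterest) -> bool:
    return more_specific(p1, p2) and p1 != p2


def less_specific(p1: PrivateInterest, p2: PrivateInterest) -> bool:
    return more_specific(p2, p1)


def comparable(p1: PrivateInterest, p2: PrivateInterest) -> bool:
    return more_specific(p1, p2) or more_specific(p2, p1)


# Illustrative values only.
broad = PrivateInterest(b"ns", None, (b"blog",))
narrow = PrivateInterest(b"ns", b"alfie", (b"blog", b"drafts"))
other = PrivateInterest(b"ns", b"betty", (b"blog",))
```

Note that every PrivateInterest is more specific than itself, which is why the strict variant is defined separately.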
We say that p1 includes an Entry e if
- p1.namespace_id == e.namespace_id, and
- p1.subspace_id == any or p1.subspace_id == e.subspace_id, and
- p1.path is a path_prefix of e.path.
We say that p1 and p2 are disjoint if there can be no Entry which is included in both p1 and p2.
We say that p1 and p2 are awkward if they are neither comparable nor disjoint. This is the case if and only if one of them has subspace_id any and a path p, and the other has a non-any subspace_id and a path which is a strict path_prefix of p.
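A minimal sketch of inclusion, disjointness, and awkwardness follows. The types and helper names are hypothetical, `None` stands in for `any`, and an Entry is reduced to the three fields the definitions mention. The `disjoint` check is a derived criterion rather than a literal transcription: a shared Entry can exist exactly when the namespaces match, the subspace constraints are compatible, and one path is a prefix of the other.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Path = Tuple[bytes, ...]  # assumption: a Path is a tuple of byte-string components


@dataclass(frozen=True)
class PrivateInterest:
    namespace_id: bytes
    subspace_id: Optional[bytes]  # None stands in for `any`
    path: Path


@dataclass(frozen=True)
class Entry:  # only the fields needed here; all other Entry fields are omitted
    namespace_id: bytes
    subspace_id: bytes
    path: Path


def is_prefix(prefix: Path, path: Path) -> bool:
    return path[: len(prefix)] == prefix


def more_specific(p1: PrivateInterest, p2: PrivateInterest) -> bool:
    return (
        p1.namespace_id == p2.namespace_id
        and (p2.subspace_id is None or p1.subspace_id == p2.subspace_id)
        and is_prefix(p2.path, p1.path)
    )


def comparable(p1: PrivateInterest, p2: PrivateInterest) -> bool:
    return more_specific(p1, p2) or more_specific(p2, p1)


def includes_entry(p: PrivateInterest, e: Entry) -> bool:
    return (
        p.namespace_id == e.namespace_id
        and (p.subspace_id is None or p.subspace_id == e.subspace_id)
        and is_prefix(p.path, e.path)
    )


def disjoint(p1: PrivateInterest, p2: PrivateInterest) -> bool:
    # Derived criterion: no Entry can satisfy both interests unless all three hold.
    same_namespace = p1.namespace_id == p2.namespace_id
    subspaces_compatible = (
        p1.subspace_id is None
        or p2.subspace_id is None
        or p1.subspace_id == p2.subspace_id
    )
    paths_related = is_prefix(p1.path, p2.path) or is_prefix(p2.path, p1.path)
    return not (same_namespace and subspaces_compatible and paths_related)


def awkward(p1: PrivateInterest, p2: PrivateInterest) -> bool:
    return not comparable(p1, p2) and not disjoint(p1, p2)


# Illustrative values: `any` subspace with the longer path versus a concrete
# subspace with a strict prefix of that path.
any_deep = PrivateInterest(b"ns", None, (b"a", b"b"))
sub_shallow = PrivateInterest(b"ns", b"alfie", (b"a",))
shared = Entry(b"ns", b"alfie", (b"a", b"b", b"c"))
```

Here `any_deep` and `sub_shallow` are awkward: neither is more specific than the other, yet `shared` is included in both.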
We say that p1 includes an Area a if
- p1.subspace_id == any or p1.subspace_id == a.subspace_id, and
- p1.path is a path_prefix of a.path.
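Area inclusion can be sketched the same way. The `Area` type below is a hypothetical reduction to the two fields the definition uses (any time range is omitted), and `None` again stands in for `any`. As in the definition, no namespace comparison takes place.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Path = Tuple[bytes, ...]  # assumption: a Path is a tuple of byte-string components


@dataclass(frozen=True)
class PrivateInterest:
    namespace_id: bytes
    subspace_id: Optional[bytes]  # None stands in for `any`
    path: Path


@dataclass(frozen=True)
class Area:  # reduced to the fields used by the definition
    subspace_id: Optional[bytes]  # None stands in for `any`
    path: Path


def is_prefix(prefix: Path, path: Path) -> bool:
    return path[: len(prefix)] == prefix


def includes_area(p: PrivateInterest, a: Area) -> bool:
    return (
        p.subspace_id is None or p.subspace_id == a.subspace_id
    ) and is_prefix(p.path, a.path)


# Illustrative values only.
alfie_blog = PrivateInterest(b"ns", b"alfie", (b"blog",))
everyone_blog = PrivateInterest(b"ns", None, (b"blog",))
```

One consequence worth noting: a PrivateInterest with a concrete subspace_id never includes an Area whose subspace_id is any, because the two subspace values are not equal.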
Security Model
We can now lay out the security model of the WGPS: which data does the WGPS expose in which scenarios? We do not have formal proofs for any of these claims; they are merely our design goals (which we believe to have achieved).
Throughout the following, Alfie and Betty are honest peers, Muriarty is a malicious peer who may deviate arbitrarily from the WGPS, and Epson is an active eavesdropper on the networking layer who can read, modify, drop, or insert arbitrary bytes on a WGPS communication channel.
Threat Model
We consider two primary scenarios:
- Alfie and Muriarty sync, and Muriarty tries to glean as much information from/about Alfie as possible.
- Alfie and Betty sync without knowing any long-term secrets of each other, and Epson attacks the session and tries to glean as much information from/about Alfie and Betty as possible.
Note that Epson can simulate a Muriarty, or they could even be the same person. Epson is only interesting for cases where Alfie syncs with somebody who has more knowledge than Epson, so we do not consider the cases where Muriarty and Epson collaborate.
If Alfie and Betty know each other's long-term public keys, they can exclude an active attacker during the handshake. If only one of them knows a long-term secret of the other, Epson is less powerful than in the setting where neither knows a long-term secret of the other; hence we only analyse the setting in which neither does.
Scope
We now list the information we wish to keep confidential. We group it in four levels, based on which kind of peer or attacker is allowed to glean which information.
| Level | Confidential information |
|---|---|
| L0 | |
| L1 | |
| L2 | |
| L3 | |
We consider fingerprinting, tracking, and deanonymisation based on information from L3 to be out of scope of this document.
Goals
We now describe which kind of information can be learned by which kind of attacker. A rough summary:
- With a valid read_capability, you can access all corresponding information (L0 and below). This is a feature.
- If you know or guess a PrivateInterest without holding a corresponding read_capability, you can access information at L1 and below.
- An active eavesdropper can access everything at L2 and below without holding a corresponding read_capability.
Consequently, NamespaceIds, SubspaceIds, and Paths that cannot be guessed are never leaked by using the WGPS. Conversely, attackers can confirm their guesses about PrivateInterests to some degree. Hence, it is important to keep NamespaceIds, SubspaceIds, and Paths unguessable by introducing sufficient entropy.
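As a minimal sketch of introducing entropy, one could include a randomly generated component in a Path. The helper name `unguessable_component` and the choice of 32 random bytes are assumptions made for this illustration, not values taken from the WGPS; Python's standard `secrets` module provides the cryptographically strong randomness.

```python
import secrets
from typing import Tuple

Path = Tuple[bytes, ...]  # assumption: a Path is a tuple of byte-string components


def unguessable_component(num_bytes: int = 32) -> bytes:
    """Return a random path component drawn from a CSPRNG.

    32 bytes (256 bits) is an assumed size chosen for illustration.
    """
    return secrets.token_bytes(num_bytes)


# A Path whose second component an attacker cannot feasibly guess.
path: Path = (b"notes", unguessable_component())
```

Anyone who should be able to request this Path must receive the random component over some confidential channel, since it cannot be derived from public information.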
Syncing with Muriarty
When Alfie syncs with a malicious peer Muriarty, Muriarty is able to glean the following information:
- If Muriarty has a read_capability for some PrivateInterest p:
    - Muriarty can learn all information from L0 and below that pertains to the intersection of p and any PrivateInterest of Alfie. Muriarty is allowed to access this information; there is nothing malicious about this.
- If Muriarty knows or guesses a PrivateInterest p which he does not have a read_capability for:
    - Muriarty can learn the number of more specific PrivateInterests of Alfie, but nothing else about them.
    - Muriarty can learn all less specific PrivateInterests of Alfie, as well as all information of L1 and below for those PrivateInterests.
    - For every PrivateInterest p_alfie of Alfie such that p and p_alfie are awkward:
        - If p_alfie.subspace_id == any, Muriarty learns Alfie’s subspace_capability for the namespace.
        - Otherwise:
            - If Muriarty has a subspace_capability for the namespace, he learns all information of L1 and below pertaining to p_alfie.
            - If Muriarty does not have a subspace_capability for the namespace, he learns that p_alfie exists, but nothing more about it.
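The case analysis for a guessed PrivateInterest p without a read capability can be sketched as a single function. This is an illustrative reading of the list above, not part of the protocol: the function name `learned_about`, the returned strings, and the `has_subspace_cap` flag are hypothetical, `None` stands in for `any`, and the sketch assumes that an interest equal to p falls under the "less specific" case, since every PrivateInterest is less specific than itself.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Path = Tuple[bytes, ...]  # assumption: a Path is a tuple of byte-string components


@dataclass(frozen=True)
class PrivateInterest:
    namespace_id: bytes
    subspace_id: Optional[bytes]  # None stands in for `any`
    path: Path


def is_prefix(prefix: Path, path: Path) -> bool:
    return path[: len(prefix)] == prefix


def more_specific(p1: PrivateInterest, p2: PrivateInterest) -> bool:
    return (
        p1.namespace_id == p2.namespace_id
        and (p2.subspace_id is None or p1.subspace_id == p2.subspace_id)
        and is_prefix(p2.path, p1.path)
    )


def disjoint(p1: PrivateInterest, p2: PrivateInterest) -> bool:
    # Derived criterion: a shared Entry needs equal namespaces, compatible
    # subspaces, and prefix-related paths.
    subspaces_compatible = (
        p1.subspace_id is None
        or p2.subspace_id is None
        or p1.subspace_id == p2.subspace_id
    )
    paths_related = is_prefix(p1.path, p2.path) or is_prefix(p2.path, p1.path)
    return not (
        p1.namespace_id == p2.namespace_id and subspaces_compatible and paths_related
    )


def learned_about(
    p: PrivateInterest, p_alfie: PrivateInterest, has_subspace_cap: bool
) -> Optional[str]:
    """What a guesser of p (without a read capability) learns about p_alfie."""
    if more_specific(p, p_alfie):  # p_alfie is less specific than p (or equal)
        return "p_alfie itself, plus L1 and below"
    if more_specific(p_alfie, p):  # p_alfie is strictly more specific than p
        return "counted among more specific interests, nothing else"
    if not disjoint(p, p_alfie):  # neither comparable nor disjoint: awkward
        if p_alfie.subspace_id is None:
            return "Alfie's subspace_capability for the namespace"
        if has_subspace_cap:
            return "L1 and below pertaining to p_alfie"
        return "existence of p_alfie only"
    return None  # disjoint: the list above names no leak for this case


guess = PrivateInterest(b"ns", b"alfie", (b"a",))
```

For example, `learned_about(guess, PrivateInterest(b"ns", None, (b"a", b"b")), False)` falls into the awkward case with an any subspace.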
Syncing Attacked By Epson
If Alfie syncs with an honest peer Betty, but an active attacker Epson can read and manipulate the communication channel, that attacker can glean the following information:
- Epson can glean anything that a Muriarty can glean.
- If Betty has a read_capability for some PrivateInterest p, which Epson has no knowledge about:
    - For every intersecting PrivateInterest of Alfie’s, Epson can glean the information at L2 and below.
- If Betty has a read_capability for some PrivateInterest p, and Epson knows p but does not have a read_capability for it:
    - Epson can learn exactly the same information as if Epson were a Muriarty who guessed p but does not have a read_capability for it.
A capability that certifies read access to arbitrary SubspaceIds at some unspecified Path. It consists of:
- The namespace for which this grants access.
- The user to whom this initially grants access (the starting point for any further delegations).
- Authorisation of the user_key by the namespace_key.
- Successive authorisations of new UserPublicKeys.