[OTR-dev] Fwd: Ensuring transcript soundness in a multiparty chat setting

Tue Dec 10 16:13:41 EST 2013

Either I'm missing some common background, or there's an unaddressed
ambiguity in "message ordering" in this thread:

Chat UIs typically present a linear sequence of messages. This presents a
distinct problem if we assume an underlying DAG agreement protocol, because
DAGs are not linear. Consider a simple DAG with arrows pointing to
predecessor messages:

B -> A <- C

There are two ways to "squash" this into a linear order: [A, B, C], or [A,
C, B].  This ambiguity is crucial for human interaction; imagine A is
"Vanilla icecream is the best!", and B is "Are you planning to do something
illegal?" and C is "Yes, without a doubt!"

Suppose we try to ensure every participant sees a *consistent linear
serialization* of the DAG order. We could do this, for example, by sorting
branches such as {B, C} consistently such as by the lexical sorting of the
message hashes. When we consider time, this leads to a concerning UI
behavior: over time, new messages may not only be appended to the
linearization of messages, but new messages may also be inserted.  For
example, if B sorts consistently before C, then a user may at one time see
[A, C], then later [A, B, C].

I believe all of these issues are already present in a two party chat,
regardless of the underlying protocols ordering or reliability guarantees.
 (For example, XMPP would have this problem, given TCP without packet loss
and without malice once there are multiple federated servers interacting, I
believe.)  Suppose Alice sends message A, and Bob receives A.  Now Alice
composes and sends B before receiving C while Bob composes and sends C
before receiving B. The "typical" chat UI will show Alice [A, B] and Bob
[A, C] at various times, but later both clients will show all three
messages in some order.

I'm all on board for decentralized protocols that arrive at agreements
about DAG structures, but let's keep an eye on the UX implications. Perhaps
a better way to express my concern is to imagine a perfect DAG-order
agreement protocol, then imagine an attacker which can influence only how
the DAG is serialized in the UI, and they cannot alter message contents or
the agreed partial ordering, only the ambiguous parts of the ordering. What
attacks can they accomplish?

Maybe a good chat client would show the DAG structure directly, which I
hope users could understand and which would remove the linearization
requirement altogether which seems full of unavoidable trade-offs. Also, a
good client must show messages which are not under agreement yet - at least
the messages the user has entered - in some sort of pending state.

Regards,
Nathan

ps: This isn't just abstract nit-picking. I once had a multi-hour emotional
altercation with a significant other and we later realized it was due to
our two respective SMS clients showing different message orders and that
one of us was responding to one question whereas the other believed the
response was for a different question.

On Thu, Dec 5, 2013 at 12:59 PM, Ximin Luo <infinity0 at gmx.com> wrote:

> On 03/12/13 18:12, George Kadianakis wrote:
> >> On Mon, Dec 2, 2013 at 3:43 AM, Ximin Luo <infinity0 at gmx.com> wrote:
> >>> On 02/12/13 02:53, Trevor Perrin wrote:
> >> [...]
> >>>>
> >>>> But I agree that committing to DAG predecessors would result in
> >>>> smaller messages, so could be advantageous.
> >>>>
> >>>> OTOH, piggybacking all predecessors means the recipient learns
> >>>> all
> >>>> predecessors for a message upon receipt of that message.  That
> >>>> has
> >>>> advantages too:
> >>>>
> >>>>  * In an OldBlue-type protocol, the recipient could issue LostMsg
> >>>> requests for all missing predecessors right away, instead of
> >>>> needing
> >>>> multiple rounds of waiting for lost messages to arrive before
> >>>> discovering more predecessors.
> >>>>
> >>>
> >>> Multiple rounds could be alleviated without resorting to "all
> >>> previous
> >>> messages", if they communicate their own Frontier in the LostMsg
> >>> message. Then,
> >>> people handling this LostMsg can infer that the grandparent(s) are
> >>> also Lost,
> >>> and retransmit those too.
> >>
> >> That makes sense.  I'll admit I'm not super interested in
> >> LostMsg/Retransmit algorithms, as we've got existing chat/messaging
> >> protocols that do an adequate job with reliability (I think?)
> >>
> >
> > I'm also not super interested in LostMsg/Retransmit algrorithms. That
> > said, I don't think that the current major chat protocols provide
> > reliability (when the server is adversarial or when a user suddenly
> > disconnects).
> >
> > For example, I don't think that XMPP provides reliability or
> > in-order-delivery on its own (see XEP-0198 and XEP-0184). However,
> > because it's used over TCP, its streams are both reliable and
> > ordered. Still, communication between Alice and Bob over XMPP actually
> > involves two streams: Alice <-> Server and Server <-> Bob.  While each
> > of those streams is reliable and ordered, nothing tells us that the
> > communication channel Alice <-> Bob is reliable or ordered if we
> > consider the server as an adversary (i.e. the server can reorder or
> > drop Alice's packets before forwarding them to Bob).
> >
> > FWIW, based on https://otr.cypherpunks.ca/Protocol-v3-4.0.0.html, OTR
> > itself "assumes a network model which provides in-order delivery of
> > messages, but that some messages may not get delivered at all (for
> > example, if the user disconnects).". I think OTR provides its own
> > reliability by using NAKs and doing retransmissions (at least
> > according to section 5.1 of the otr-wpes.pdf paper).
> >
> > Slightly off-topic, but lately when I'm thinking of the network model
> > of a secure multiparty chat, I've found it helpful to consider a
> > multiparty chat over a centralized chat protocol (like IRC or XMPP,
> > where clients send messages to the server and the server broadcasts
> > them to the other clients). It seems helpful for two reasons:
> > a) The server, in centralized protocols, is a powerful network
> > adversary that can easily reorder, delay or drop messages (it's like
> > the Global Active Adversary for the distributed case).  Similar
> > attacks can also be performed by network links in a decentralized P2P
> > model, but thinking of the centralized case seems easier.
> > b) In the same protocol-agnostic fashion as OTR, it seems natural to
> > me that the first instances of secure multiparty chats will use the
> > popular chat protocols that are currently available. This probably
> > means XMPP MUCs and IRC, which are both centralized.
> >
> > What I'm trying to say, is that even though providing reliability in a
> > multiparty chat protocol doesn't seem like a huge priority, a robust
> > such protocol should have a way to ensure reliable end-to-end delivery
> > of its messages.
> >
>
> Interesting that OTR assumes in-order delivery. If we assume a
> network-level adversary or untrusted central service, it could definitely
> deliver messages out-of-order. I think the DAG model already protects
> against this, though.
>
> >> You've talked about periodically broadcasting empty messages.  I
> >> think
> >> that's a good idea, particularly if parties can expect that of each
> >> other, as it allows checking for timeliness and liveness of the
> >> whole
> >> group.  Tossing that in, here's my latest:
> >>
> >> --
> >>
> >> Each Message contains:
> >>  - PREDECESSOR list containing the most-recently received message
> >> number from each other participant (equivalently: the number of
> >> messages received from each other Participant)
> >>  - HASH over the predecessor messages (including the sender's
> >>  previous message)
> >>
>
> I still think hashing over all participants is redundant. You only need
> the direct predecessors, like in the git history DAG. From this, anyone can
> transitively infer your version of the PREDECESSOR list. A less dense DAG
> is also easier to analyse. We can run with your version for now, though.
>
> >> A Message's HASH can be verified once all predecessors have arrived.
> >>
> >> Each Participant maintains:
> >>  - An UNVERIFIED pool of received messages whose hashes have not yet
> >> been verified, and an expiration time for each.
> >>  - Lists of RECEIVED messages from each Participant.  The HEAD
> >>  message
> >> in each list is the latest message from that Participant, and has an
> >> expiration.  The tail message in each list is the oldest message
> >>  that
> >> is referred to by an UNVERIFIED message (older messages don't need
> >>  to
> >> be kept around).
>
> I'm not sure this possible in general. If the unverified message has
> multiple chains of lost messages preceding it, you don't know which "oldest
> message" is referred to. Using the example from my last email:
>
> 1<--2---3---4---5---9
>              \
>               +-6---7---8
>
> (This would occur when, e.g. whoever sent 6,7,8 did not yet receive 5,9.
> This might be multiple people. Note that this is using the sparse DAG; you
> can fill in extra pointers for a dense DAG.)
>
> Say Alice sees all messages except 6. She knows about 6 itself, because
> she has 7 which points to it. However, she does not know 6's own
> precedessors, so she must keep all messages *in case* 6 points to 1.
>
> With the expiry mechanism, you can toss all messages that *would be
> expired* if it was the HEAD, though. That might be a better approach rather
> than discarding things via DAG-based metrics.
>
> >>
> >> The expirations associated with UNVERIFIED and HEAD messages are for
> >> timeliness checks:
> >>  * If there's ever an expired UNVERIFIED message, that means you
> >> haven't received some predecessor within a reasonable amount of
> >> time.
> >>  * If there's ever an expired HEAD message, that means you haven't
> >> heard anything from some party within a reasonable amount of time.
> >>
> >> On receiving a message, do:
> >>  - Check that the message has newer predecessors than the previous
> >> message from this sender
>
> This is nice. In the sparse form of the DAG, this check would be
> equivalent to checking that a new message from a sender, say Bob, has their
> previous message as an ancestor through the DAG. (The same property for
> other participants follows transitively from this property, so you only
> need to check the sender. In the dense case, participants can lie about the
> extra redundant pointers, so we do need to check them.)
>
> However, this check is only possible if we have already verified all the
> predecessors, so I think it would be better to do this as part of
> "verifying". An example:
>
> 1<--2   3<--4
>
> To simplify things, assume that Bob sends all messages. We receive 2 from
> Bob, then 4 from Bob, but we haven't received 3 yet. Doing this check here
> would fail, since we don't know what 3's parents are. Bob could be
> attacking us with a bogus message, or everything could work out when we
> later see that 3 has 2 as a parent.
>
> >>  - Add the message to the UNVERIFIED pool and the appropriate
> >>  RECEIVED
> >> list, and set expiration times
> >>  - If any message in UNVERIFIED can now be verified, do so, and
> >>  either
> >> raise a security fault or delete it from UNVERIFIED
> >>
> >> If you haven't sent a message in X time, send an empty message.
> >>
> >> Thoughts?
> >>
>
> Basic gist of it looks good, next we need more detail. I'll think about it
> as well. :)
>
> One comment (which might be redundant as I'm not up-to-date with all the
> contextual literature) is that this reduces scalability. This may or may
> not be a concern, but this would make e.g. a conversation with 30 people
> more costly. (But perhaps other parts of the protocol, like the handshake,
> already make this costly, so we will just restrict ourselves to a "small
> group".)
>
> >
> > I like it. It seems like it would work.
> >
> > (Would be even better if we could sketch a quick proof for its
> > soundness too.)
> >
> > Also, IMHO, messages should not be forwarded to the chat client while
> > they are in the UNVERIFIED pool. Assuming normal (non-adversarial)
> > network operation, receiving message predecessors should be quick and
> > users shouldn't even notice the buffering.
> >
> > We can even make some "Large amount of unverified messages" or
> > "Impossible to verify messages from participant X for a while"
> > warnings that we can forward to the chat application instead.
> >
> > (I'm saying this, because forwarding unverified messages to the client
> > with a big fat "UNVERIFIED MESSAGE" warning seems like a recipe for
> > disaster. Similar to CA-SSL's "The site's security certificate is not
> > trusted" usability/security nightmare.)
> >
>
> I'm cool either way. It would also be easier to implement in the way that
> you say. My main motivation for suggesting the other approach was
> performance - to show the user *something* rather than having things block
> on one unreceived message. But maybe this is not important enough to
> outweigh the other security concerns.
>
> --
> GPG: 4096R/1318EFAC5FBBDBCE
> git://github.com/infinity0/pubkeys.git
>
>
> _______________________________________________
> OTR-dev mailing list
> OTR-dev at lists.cypherpunks.ca
> http://lists.cypherpunks.ca/mailman/listinfo/otr-dev
>
>

-- 
Nathan Wilcox
Least Authoritarian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.cypherpunks.ca/pipermail/otr-dev/attachments/20131210/664439fd/attachment.html>