[OTR-dev] Fwd: Ensuring transcript soundness in a multiparty chat setting

Thu Dec 5 15:59:02 EST 2013

On 03/12/13 18:12, George Kadianakis wrote:
>> On Mon, Dec 2, 2013 at 3:43 AM, Ximin Luo <infinity0 at gmx.com> wrote:
>>> On 02/12/13 02:53, Trevor Perrin wrote:
>> [...]
>>>>
>>>> But I agree that committing to DAG predecessors would result in
>>>> smaller messages, so could be advantageous.
>>>>
>>>> OTOH, piggybacking all predecessors means the recipient learns
>>>> all
>>>> predecessors for a message upon receipt of that message.  That
>>>> has
>>>> advantages too:
>>>>
>>>>  * In an OldBlue-type protocol, the recipient could issue LostMsg
>>>> requests for all missing predecessors right away, instead of
>>>> needing
>>>> multiple rounds of waiting for lost messages to arrive before
>>>> discovering more predecessors.
>>>>
>>>
>>> Multiple rounds could be alleviated without resorting to "all
>>> previous
>>> messages", if they communicate their own Frontier in the LostMsg
>>> message. Then,
>>> people handling this LostMsg can infer that the grandparent(s) are
>>> also Lost,
>>> and retransmit those too.
>>
>> That makes sense.  I'll admit I'm not super interested in
>> LostMsg/Retransmit algorithms, as we've got existing chat/messaging
>> protocols that do an adequate job with reliability (I think?)
>>
> 
> I'm also not super interested in LostMsg/Retransmit algrorithms. That
> said, I don't think that the current major chat protocols provide
> reliability (when the server is adversarial or when a user suddenly
> disconnects).
> 
> For example, I don't think that XMPP provides reliability or
> in-order-delivery on its own (see XEP-0198 and XEP-0184). However,
> because it's used over TCP, its streams are both reliable and
> ordered. Still, communication between Alice and Bob over XMPP actually
> involves two streams: Alice <-> Server and Server <-> Bob.  While each
> of those streams is reliable and ordered, nothing tells us that the
> communication channel Alice <-> Bob is reliable or ordered if we
> consider the server as an adversary (i.e. the server can reorder or
> drop Alice's packets before forwarding them to Bob).
> 
> FWIW, based on https://otr.cypherpunks.ca/Protocol-v3-4.0.0.html, OTR
> itself "assumes a network model which provides in-order delivery of
> messages, but that some messages may not get delivered at all (for
> example, if the user disconnects).". I think OTR provides its own
> reliability by using NAKs and doing retransmissions (at least
> according to section 5.1 of the otr-wpes.pdf paper).
> 
> Slightly off-topic, but lately when I'm thinking of the network model
> of a secure multiparty chat, I've found it helpful to consider a
> multiparty chat over a centralized chat protocol (like IRC or XMPP,
> where clients send messages to the server and the server broadcasts
> them to the other clients). It seems helpful for two reasons: 
> a) The server, in centralized protocols, is a powerful network
> adversary that can easily reorder, delay or drop messages (it's like
> the Global Active Adversary for the distributed case).  Similar
> attacks can also be performed by network links in a decentralized P2P
> model, but thinking of the centralized case seems easier.
> b) In the same protocol-agnostic fashion as OTR, it seems natural to
> me that the first instances of secure multiparty chats will use the
> popular chat protocols that are currently available. This probably
> means XMPP MUCs and IRC, which are both centralized.
> 
> What I'm trying to say, is that even though providing reliability in a
> multiparty chat protocol doesn't seem like a huge priority, a robust
> such protocol should have a way to ensure reliable end-to-end delivery
> of its messages.
> 

Interesting that OTR assumes in-order delivery. If we assume a network-level adversary or untrusted central service, it could definitely deliver messages out-of-order. I think the DAG model already protects against this, though.

>> You've talked about periodically broadcasting empty messages.  I
>> think
>> that's a good idea, particularly if parties can expect that of each
>> other, as it allows checking for timeliness and liveness of the
>> whole
>> group.  Tossing that in, here's my latest:
>>
>> --
>>
>> Each Message contains:
>>  - PREDECESSOR list containing the most-recently received message
>> number from each other participant (equivalently: the number of
>> messages received from each other Participant)
>>  - HASH over the predecessor messages (including the sender's
>>  previous message)
>>

I still think hashing over all participants is redundant. You only need the direct predecessors, like in the git history DAG. From this, anyone can transitively infer your version of the PREDECESSOR list. A less dense DAG is also easier to analyse. We can run with your version for now, though.

>> A Message's HASH can be verified once all predecessors have arrived.
>>
>> Each Participant maintains:
>>  - An UNVERIFIED pool of received messages whose hashes have not yet
>> been verified, and an expiration time for each.
>>  - Lists of RECEIVED messages from each Participant.  The HEAD
>>  message
>> in each list is the latest message from that Participant, and has an
>> expiration.  The tail message in each list is the oldest message
>>  that
>> is referred to by an UNVERIFIED message (older messages don't need
>>  to
>> be kept around).

I'm not sure this possible in general. If the unverified message has multiple chains of lost messages preceding it, you don't know which "oldest message" is referred to. Using the example from my last email:

1<--2---3---4---5---9
             \
              +-6---7---8

(This would occur when, e.g. whoever sent 6,7,8 did not yet receive 5,9. This might be multiple people. Note that this is using the sparse DAG; you can fill in extra pointers for a dense DAG.)

Say Alice sees all messages except 6. She knows about 6 itself, because she has 7 which points to it. However, she does not know 6's own precedessors, so she must keep all messages *in case* 6 points to 1.

With the expiry mechanism, you can toss all messages that *would be expired* if it was the HEAD, though. That might be a better approach rather than discarding things via DAG-based metrics.

>>
>> The expirations associated with UNVERIFIED and HEAD messages are for
>> timeliness checks:
>>  * If there's ever an expired UNVERIFIED message, that means you
>> haven't received some predecessor within a reasonable amount of
>> time.
>>  * If there's ever an expired HEAD message, that means you haven't
>> heard anything from some party within a reasonable amount of time.
>>
>> On receiving a message, do:
>>  - Check that the message has newer predecessors than the previous
>> message from this sender

This is nice. In the sparse form of the DAG, this check would be equivalent to checking that a new message from a sender, say Bob, has their previous message as an ancestor through the DAG. (The same property for other participants follows transitively from this property, so you only need to check the sender. In the dense case, participants can lie about the extra redundant pointers, so we do need to check them.)

However, this check is only possible if we have already verified all the predecessors, so I think it would be better to do this as part of "verifying". An example:

1<--2   3<--4

To simplify things, assume that Bob sends all messages. We receive 2 from Bob, then 4 from Bob, but we haven't received 3 yet. Doing this check here would fail, since we don't know what 3's parents are. Bob could be attacking us with a bogus message, or everything could work out when we later see that 3 has 2 as a parent.

>>  - Add the message to the UNVERIFIED pool and the appropriate
>>  RECEIVED
>> list, and set expiration times
>>  - If any message in UNVERIFIED can now be verified, do so, and
>>  either
>> raise a security fault or delete it from UNVERIFIED
>>
>> If you haven't sent a message in X time, send an empty message.
>>
>> Thoughts?
>>

Basic gist of it looks good, next we need more detail. I'll think about it as well. :)

One comment (which might be redundant as I'm not up-to-date with all the contextual literature) is that this reduces scalability. This may or may not be a concern, but this would make e.g. a conversation with 30 people more costly. (But perhaps other parts of the protocol, like the handshake, already make this costly, so we will just restrict ourselves to a "small group".)

> 
> I like it. It seems like it would work.
> 
> (Would be even better if we could sketch a quick proof for its
> soundness too.)
> 
> Also, IMHO, messages should not be forwarded to the chat client while
> they are in the UNVERIFIED pool. Assuming normal (non-adversarial)
> network operation, receiving message predecessors should be quick and
> users shouldn't even notice the buffering.
> 
> We can even make some "Large amount of unverified messages" or
> "Impossible to verify messages from participant X for a while"
> warnings that we can forward to the chat application instead.
> 
> (I'm saying this, because forwarding unverified messages to the client
> with a big fat "UNVERIFIED MESSAGE" warning seems like a recipe for
> disaster. Similar to CA-SSL's "The site's security certificate is not
> trusted" usability/security nightmare.)
> 

I'm cool either way. It would also be easier to implement in the way that you say. My main motivation for suggesting the other approach was performance - to show the user *something* rather than having things block on one unreceived message. But maybe this is not important enough to outweigh the other security concerns.

-- 
GPG: 4096R/1318EFAC5FBBDBCE
git://github.com/infinity0/pubkeys.git

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 897 bytes
Desc: OpenPGP digital signature
URL: <http://lists.cypherpunks.ca/pipermail/otr-dev/attachments/20131205/33f978da/attachment.pgp>