[OTR-dev] Re: Pidgin plugin sends and parses HTML

Tue May 13 09:22:19 EDT 2008

There are a number of issues raised in this thread.  Let me know if I
miss any.  I'll start with the simple ones.

Scott> The other serious problem in my mind is the need for a 'i'm going
Scott> offline now' message

It's not mandatory.  It didn't exist in the first version of the
protocol.  It's just a hint to the other side that you've gone away, and
they're free to deallocate a few resources (and forget the remaining one
or two keys).  Or are you referring to something else?

Jonathan> I also noticed that libotr returns HTML error messages, which
Jonathan> we think is bad, they are not translatable and we have to
Jonathan> strip HTML from them.

We also think this is bad, bad, bad.  libotr is divided into three
parts:

- the low-level packet format, encryption, keys, state, etc.
- user authentication support (Socialist Millionaires' protocol in the
  current version)
- the messaging support (otrl_message_sending, fragmentation, etc.)

The messaging support is completely crufty.  It contains hardcoded
English HTML strings, among other nonsense.

We're starting a project to revamp the libotr API now, particularly the
messaging part.

For reference, here are the OTR projects underway or planned for the
next 3-4 months:

- New UI for user authentication, based on results of user study (which
  will be published in the Symposium On Usable Privacy and Security:
  SOUPS 2008).  The code is pretty much done for this; it needs a
  configuration checkbox and some revamped documentation.

- Implementation of libotr in Java, hopefully leading to some
  OTR-on-smartphone action.

- Addressing the user-logged-in-multiple-times problem.

- libotr API cleanup, as above.

If you want to volunteer to help, particularly with the new API,
now's the time.  One of the big goals is exactly to make libotr have
fewer pidgin idioms in it.  To do this, we need people with experience
with the internals of other IM clients to contribute their domain
knowledge.

Rüdiger> According to the OTR spec, the library is supposed to do
Rüdiger> nothing more than replace the plain text with the encrypted
Rüdiger> text. As such, the place for text/plain is supposed to contain
Rüdiger> encrypted text/plain, while the place for text/html is supposed
Rüdiger> to contain encrypted text/html.

OK, this is the big one.  As was noted, it comes up from time to time.
In this post, I'll try to clarify my thoughts on the matter.

That's not quite what the spec says, as Scott (I think it was) points
out.  But let's put that aside.  Let's talk about what the spec *should*
say, not what it says now.

The plaintext of an OTR message is a sequence of bytes.  How do you
interpret those bytes?  In addition to the content-type issue (is it
text/plain, text/html, text/xhtml, something else?), there's also the
encoding issue, which no one seems to have raised.  These two things
actually go hand-in-hand.  The AIM protocol, for example, specifies
three choices for the encoding: 7-bit ASCII, UCS-2BE, and ISO-8859-1.
Note that none of these is UTF-8 (though some clients may exploit the
fact that 7-bit ASCII is a subset of UTF-8 and hope the other end
doesn't barf on it).
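
As a rough illustration of why the encoding matters, the Python sketch
below reads the same bytes under different encodings.  The sample string
is made up for the example.  Python has no codec named UCS-2BE, so
"utf-16-be" stands in for it here; the two agree for characters in the
Basic Multilingual Plane.

```python
# Same bytes, different text, depending on the encoding the reader assumes.
raw = "café".encode("utf-8")           # the bytes the sender produced

as_utf8 = raw.decode("utf-8")          # what the sender meant
as_latin1 = raw.decode("iso-8859-1")   # what an ISO-8859-1 reader displays

# 7-bit ASCII is a subset of both UTF-8 and ISO-8859-1, so pure-ASCII
# text survives whichever of the two the other end assumes.
ascii_only = "hello".encode("ascii")
same_either_way = ascii_only.decode("utf-8") == ascii_only.decode("iso-8859-1")

# A two-byte-per-character encoding produces entirely different bytes.
as_ucs2be = "café".encode("utf-16-be")  # stand-in for UCS-2BE (assumption)
```

Only the pure-ASCII case is safe to send without agreeing on an encoding
first, which is the loophole mentioned above.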

So who specifies the encoding and content-type for the plaintext
sequence of bytes inside the OTR packet?  There are three choices I see:

1. It's specified explicitly in the protocol.  Everyone uses the same
   choice, and you have to convert between whatever you use natively and
   this choice before encrypting / after decrypting.  Note that it's
   perfectly reasonable for libotr to provide some commonly used
   conversion routines to help you with this.

2. It's specified explicitly in the OTR Data packet (within the
   authenticated section).  Different clients may make different
   choices, or make different choices for different messages (such as
   the text/plain and text/xhtml parts of a Jabber message).  Clients
   will need to convert between the choice received and their native
   choice.  libotr could again help here by providing common
   conversions, but now you potentially have an n^2 problem as all
   formats need to be converted to all other formats.

3. It's specified implicitly by context, probably by "inheriting"
   whatever encoding and content-type are being used to format the
   OTR ciphertext.  This has the advantage of clients not having to
   convert anything, since they can just do "in-place"
   encryption/decryption, most of the time.
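
To make "within the authenticated section" in choice 2 concrete, here is
a minimal Python sketch in which a format tag is covered by the same MAC
as the payload.  It does not reproduce the actual OTR Data packet layout
or its MAC construction; the function names, the tag strings, the
length-prefix framing, and the use of HMAC-SHA256 are all assumptions
made for the example.

```python
import hashlib
import hmac

def mac_with_format(mac_key: bytes, fmt: bytes, payload: bytes) -> bytes:
    """MAC over a length-prefixed format tag followed by the payload."""
    # The length prefix keeps the tag/payload boundary unambiguous.
    header = len(fmt).to_bytes(2, "big") + fmt
    return hmac.new(mac_key, header + payload, hashlib.sha256).digest()

def check(mac_key: bytes, fmt: bytes, payload: bytes, mac: bytes) -> bool:
    """Recompute the MAC and compare in constant time."""
    return hmac.compare_digest(mac_with_format(mac_key, fmt, payload), mac)

key = bytes(32)  # placeholder key, for illustration only
payload = b"to put something in bold, wrap <b></b> around it"
mac = mac_with_format(key, b"text/plain; charset=utf-8", payload)

# The receiver accepts the message with the tag the sender chose...
accepted = check(key, b"text/plain; charset=utf-8", payload, mac)
# ...and rejects the same payload relabelled as a different format.
relabelled = check(key, b"text/xhtml; charset=utf-8", payload, mac)
```

Because the tag is authenticated, someone who cannot forge the MAC
cannot change how the receiver is told to interpret the bytes.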

Rüdiger is advocating 3; someone else on the list suggested 2.  I'm
going to suggest 1.  Here are my reasons.

My main issue with 3 is the potential security hole it opens up.  Unlike
1 and 2, the method of interpretation of the plaintext bytes is not
securely specified, and is in fact under the control of an adversary.

For example, suppose there's a Jabber message sent (in model 3), where
the body section contains an encryption of "to put something in bold,
wrap <b></b> around it" and the xhtml section contains an encryption of
"to put something in bold, wrap &lt;b&gt;&lt;/b&gt; around it".  Then an
adversary replaces the OTR block in the xhtml section with the one from
the body section.  An end user with an xhtml-aware client will see "to
put something in bold, wrap  around it", and the attacker has
successfully modified the message.
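
The effect of that swap can be sketched in a few lines of Python.  The
"renderer" below is a crude stand-in (it drops anything that looks like
a tag and decodes entities), not a real XHTML engine, and the function
names are hypothetical.  The sketch assumes the xhtml part carries the
entity-escaped form of the sentence, as a well-formed xhtml part would.

```python
import html
import re

def show_as_plain(plaintext: str) -> str:
    """A text/plain slot displays the characters literally."""
    return plaintext

def show_as_xhtml(plaintext: str) -> str:
    """Crude stand-in for an XHTML renderer: drop tags, decode entities."""
    without_tags = re.sub(r"<[^>]*>", "", plaintext)
    return html.unescape(without_tags)

# Decrypted contents of the two OTR blocks in the example.
body_plaintext = "to put something in bold, wrap <b></b> around it"
xhtml_plaintext = "to put something in bold, wrap &lt;b&gt;&lt;/b&gt; around it"

# Untampered message: both slots show the user the same sentence.
honest_body = show_as_plain(body_plaintext)
honest_xhtml = show_as_xhtml(xhtml_plaintext)

# Tampered message: the body block has been copied into the xhtml slot,
# so the client interprets plain-text bytes as markup.
swapped_xhtml = show_as_xhtml(body_plaintext)
```

Here `swapped_xhtml` comes out as "to put something in bold, wrap  around
it": the bytes are unchanged, but the slot they arrive in changes what
the user sees.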

The advantage of 1 over 2 is just that there's no chance someone will
pick a bizarre format you don't know about; you just need to convert
between whatever native format you use for your IM client/protocol and
the fixed OTR format.

So here's my proposal, which is of course open for debate.

- Specify that OTR plaintext is UTF-8 text/xhtml.

- Clarify in the section on encrypting/decrypting that you need to
  convert your plaintext to/from that format.

- Have at least one version of the encrypt/decrypt API call take as a
  parameter the format (type and encoding) your plaintext should be in,
  and libotr will convert it for you (assuming it knows that format).

So if this were implemented, Jabber clients would call
otrl_message_sending/receiving with a parameter of UTF8_TEXT_PLAIN to
encrypt/decrypt the body part, and UTF8_TEXT_XHTML to encrypt/decrypt
the xhtml part.  The fact that what you get when you apply AES-CTR to
the OTR message body is in fact UTF-8 text/xhtml in either case should
be totally transparent.  Ideally, I could decide to standardize on
"UCS-2BE text/rtf rot13" and you wouldn't even notice.  [But of course,
everyone would have to be using an OTR library that supported this.]
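
A minimal Python sketch of what such a format parameter could look like
follows.  It is not libotr code: the enum, the two conversion functions,
and the conversion rules (escape on the way in, strip tags and decode
entities on the way out) are assumptions for illustration, and the
xhtml-to-plain direction is deliberately crude and lossy.

```python
import html
import re
from enum import Enum

class Format(Enum):
    """Hypothetical format constants, named after the ones in the text."""
    UTF8_TEXT_PLAIN = "utf8-text-plain"
    UTF8_TEXT_XHTML = "utf8-text-xhtml"

def to_canonical(native: bytes, fmt: Format) -> bytes:
    """Convert the caller's bytes to UTF-8 text/xhtml before encrypting."""
    if fmt is Format.UTF8_TEXT_XHTML:
        return native  # already in the canonical format
    if fmt is Format.UTF8_TEXT_PLAIN:
        text = native.decode("utf-8")
        return html.escape(text, quote=False).encode("utf-8")
    raise ValueError(f"unsupported format: {fmt}")

def from_canonical(canonical: bytes, fmt: Format) -> bytes:
    """Convert decrypted UTF-8 text/xhtml to the format the caller wants."""
    if fmt is Format.UTF8_TEXT_XHTML:
        return canonical
    if fmt is Format.UTF8_TEXT_PLAIN:
        text = re.sub(r"<[^>]*>", "", canonical.decode("utf-8"))
        return html.unescape(text).encode("utf-8")
    raise ValueError(f"unsupported format: {fmt}")

# A body part goes in as plain text and comes back out unchanged.
body = b"to put something in bold, wrap <b></b> around it"
on_the_wire = to_canonical(body, Format.UTF8_TEXT_PLAIN)
round_trip = from_canonical(on_the_wire, Format.UTF8_TEXT_PLAIN)

# An xhtml part read by a plain-text-only client loses its markup.
downgraded = from_canonical(b"<b>hi</b> there", Format.UTF8_TEXT_PLAIN)
```

The caller never sees `on_the_wire`, which is the sense in which the
canonical format is transparent.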

There's a slightly related problem of what to do if the encoded OTR
string (the "?OTR:AAED...") isn't appropriate for your IM protocol.  For
example, AIM using UCS-2BE.  I suppose libotr could translate that for
you as well (to "\x00?\x00O\x00T\x00R\x00:\x00A\x00A\x00E\x00D..."), but
that's a really rare case.
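
For what it's worth, that translation is a one-liner in most languages.
A Python sketch, using only the prefix of the string shown above and
"utf-16-be" as a stand-in for UCS-2BE (the two are identical for ASCII
characters):

```python
# Only the visible prefix of the example string is used here.
otr_prefix = "?OTR:AAED"

# Each ASCII character becomes a zero byte followed by the character.
wire = otr_prefix.encode("utf-16-be")

# The receiving side reverses the step before handing the string to OTR.
recovered = wire.decode("utf-16-be")
```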

I'll note that the outcome of the "fix the multiple login problem" work
above will certainly lead to a rev of the OTR wire protocol.  So that
would be a great time to clarify this as well.

I think that's long enough for now.  ["I made this letter longer than
usual because I lack the time to make it shorter." -- Blaise Pascal]

Comments?

   - Ian