[OTR-dev] html in otr messages

Ian Goldberg ian at cypherpunks.ca
Thu Mar 22 14:17:00 EDT 2007


On Fri, Mar 23, 2007 at 02:55:23AM +1100, Scott Ellis wrote:
> From Wintermute on the Miranda forum:
> 
> "Hm .. ok .. the friend made a statement and asked me to post it for him (as
> he isn't registered here and too lazy to do so):
> 
> "Quote:
> I had a lengthy email conversation with Ian Goldberg, one of the authors of
> OTR, libotr and the gaim-otr plugin. If you read the OTR spec carefully, you
> will see that it specifies optional HTML formatting in the plaintext. As
> OTR-messages are merely encapsulated using Jabber (or other) protocols, Ian
> thinks (as do I after thinking about it) that the OTR specs supersede the
> standards of "lower level" protocols such as XMPP. It is the job of an OTR
> plugin to process the HTML tags in the plaintext; if they are not used by
> the client, it's the plugin's job to strip them.
> If you don't see it that way, I suggest you contact the otr-dev
> mailing list."
> 
> I think of it quite the opposite way - OTR simply encapsulates protocol
> messages.
> 
> OTR would have quite a job to do trying to detect whether the client
> supports HTML in Miranda - there are so many messaging modules etc. And
> they're likely to change at any time - it would make maintenance difficult.
> I don't think tracking the client's capabilities is the plugin's job.

You're right; I think that whatever calls the OTR plugin should be able
to understand what the plugin outputs.  In gaim, there's no problem,
since gaim understands the format of OTR plaintext messages already.
Miranda hasn't (yet?) implemented tag-handling in most of its protocol
plugins, fine.  So the Miranda AIM plugin, for example, has to
explicitly strip the tags, instead of parsing them.  You could have the
Miranda OTR plugin optionally do that instead; the protocol plugin could
pass a parameter that says "strip the HTML tags from the plaintext
before giving it back to me".  Or just have a function in the OTR plugin
to do that (steal it from the AIM plugin) that the protocol plugins can
call.
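
As a rough illustration of the second option, here is a minimal Python
sketch of a shared tag-stripping helper.  The names strip_html and
deliver, and the strip_tags parameter, are hypothetical; they are not
part of libotr, gaim, or Miranda.  The sketch assumes the plaintext is
HTML-encoded, so it decodes entities such as "&lt;" while dropping tags.

```python
from html.parser import HTMLParser


class _TagStripper(HTMLParser):
    """Collects text content and discards markup (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True decodes entities like &lt; inside text.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Assumption: a <br> tag should survive as a newline.
        if tag == "br":
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(plaintext):
    """Return the message text with HTML tags removed."""
    stripper = _TagStripper()
    stripper.feed(plaintext)
    stripper.close()
    return "".join(stripper.parts)


def deliver(plaintext, strip_tags=False):
    """Hand decrypted plaintext back to a protocol plugin.

    strip_tags stands in for the "strip the HTML tags" parameter
    described above; a plugin that parses HTML itself leaves it False.
    """
    return strip_html(plaintext) if strip_tags else plaintext
```

A protocol plugin that cannot render markup would call
deliver(msg, strip_tags=True); one that can would take the message
unchanged.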

> And I think the client should output the same *intended message* whether or
> not OTR is installed - after decryption, OTR messages from gaim OTR contain
> formatting html, whereas without OTR gaim outputs no formatting html.

Not true: see my otr-users message.  gaim sends *both* the formatted and
non-formatted versions in Jabber.

> This
> means transmitting more information than usual - which, although very
> unlikely, may not be something that the user wants.

There's no more information in the OTR version than in the original
Jabber message.  Miranda was just ignoring the more information-rich
part of the Jabber message.

> Removing the tags means I would have to reimplement something already
> implemented by at least the AIM protocol plugin. Processing the codes
> 'properly' for the client could involve conversion from e.g. html to
> bbcodes, if the client supports those - which would mean differences in the
> nature of OTR plugins for different clients. Or further, if the OTR spec
> specifies handling of HTML, then the otr library should be able to handle it
> - but again that's a reimplementation of stuff that the protocol can already
> do.

The OTR library gives you a plaintext message, in utf-8 encoding, that
is allowed to have HTML tags in it.  If you need something different for
your application, you'll need to convert it.  For example, if your
application decides it needs to convert the HTML tags to bbcodes, either
it, or its OTR plugin, will need to do that; libotr won't do it by
default.  [Of course, if there's a really common conversion that lots of
different clients need, we could consider just bundling it with the
library.]
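
To give a sense of what such a conversion might involve, here is a
minimal Python sketch that maps a few HTML tags to bbcodes.  The tag
table and the name html_to_bbcode are assumptions made for
illustration; nothing like this ships with libotr, and a real client
would pick its own set of tags.

```python
from html.parser import HTMLParser

# Hypothetical mapping from HTML tag names to bbcode names.
BBCODE_TAGS = {"b": "b", "i": "i", "u": "u"}


class _BBCodeConverter(HTMLParser):
    """Rewrites known tags as bbcodes and drops all other tags."""

    def __init__(self):
        # Entities in text (e.g. &lt;) are decoded to literal characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BBCODE_TAGS:
            self.parts.append("[%s]" % BBCODE_TAGS[tag])

    def handle_endtag(self, tag):
        if tag in BBCODE_TAGS:
            self.parts.append("[/%s]" % BBCODE_TAGS[tag])

    def handle_data(self, data):
        self.parts.append(data)


def html_to_bbcode(plaintext):
    """Convert HTML-encoded plaintext to bbcode-style text."""
    converter = _BBCodeConverter()
    converter.feed(plaintext)
    converter.close()
    return "".join(converter.parts)
```

Note that this sketch does not escape literal "[" characters in the
message text, which a client that interprets bbcodes would probably
need to handle as well.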

> Also - and I really don't mean to criticize - but the protocol specs for
> things like jabber and other open protocols, with their RFCs and committees
> etc etc, tend to be thought out pretty well. The reasons for disallowing
> things like 'mixed content' (from the jabber RFC) are usually pretty good.
> For example, if I want to send the text <font>blue</font> to a friend, I can
> do so over jabber (because it is encoded and decoded by the protocol so as
> to not create 'mixed content') and expect him to get the message as I typed
> it, if the client performs to the spec. With OTR, if it's removing tags on
> clients that do not support markup, how does it tell the difference between
> formatting tags and what the user has typed? At a minimum it would somehow
> have to encode tags in message text and formatting tags differently - or you
> have restricted what messages users can and can't send.

OTR expects your plaintext input to be HTML-encoded.  Which means if
there's a literal "<" in your plaintext message, you should convert it
to "&lt;" before giving it to OTR.  That's certainly what Jabber does:
even without OTR, "<"s in messages get converted to "&lt;" on the wire.
So if you send "<font>blue</font>" (a message intending to start
with a literal "<") to your buddy, your end will convert it to
"&lt;font&gt;blue&lt;/font&gt;" (whether or not you're using OTR), then
if you're using OTR, you'll pass *that* to the OTR plugin, which will
encrypt it.  Your buddy's client will decrypt it back to
"&lt;font&gt;blue&lt;/font&gt;" and display it to him as
"<font>blue</font>".
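
The round trip described above can be sketched in Python using the
standard html module.  The function names encode_for_otr and
decode_for_display are hypothetical, and the encryption step is left
out entirely; the point is only that escaping on the sending side keeps
typed text distinct from formatting tags.

```python
import html


def encode_for_otr(typed_text):
    """HTML-encode what the user typed before handing it to OTR."""
    # quote=False: only &, < and > are escaped, quotes are left alone.
    return html.escape(typed_text, quote=False)


def decode_for_display(encoded_text):
    """Turn entities back into literal characters for display."""
    return html.unescape(encoded_text)


typed = "<font>blue</font>"
encoded = encode_for_otr(typed)          # "&lt;font&gt;blue&lt;/font&gt;"
shown = decode_for_display(encoded)      # "<font>blue</font>"

# A formatting tag added by the client stays a real tag, while the
# user's typed text inside it stays escaped.
formatted = "<b>" + encoded + "</b>"
```

A receiver that strips tags from the formatted message would remove
only the "<b>" pair and still show the typed "<font>blue</font>"
intact.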

Out of curiosity, why doesn't Miranda just parse the HTML tags for all
protocols that support them?

   - Ian
