[OTR-users] Communication with Pidgin to Miranda (or others) has encoding problems

Paul Aurich paul at aurich.com
Mon Feb 23 14:57:36 EST 2009


And Rainer spake on 02/23/2009 02:42 AM, saying:
> I've experienced a problem during OTR encrypted communication between Pidgin
> and Miranda (or other clients). In short: several characters are encoded as
> HTML entities and so transmitted. E.g. an ampersand is sent as &amp; or a
> newline is not the \n character (0x0A) but <br> instead. On the other way
> round, messages from Miranda to Pidgin are displayed correctly, but written
> HTML entities or tags like <br> will be extra parsed (so a newline is
> displayed instead of "<br>").

Your assumption that libpurple is entity-encoding the HTML entities and
transmitting them as such is fallacious.  They are *displayed* that way in
Miranda, but that is not evidence that they are transmitted as such.

From gdb:
Breakpoint 1, process_sending_im (account=0x2c240a0, who=0x2cd1b70 "<redacted>",
    message=0x7ffffbf57e10, m=0x0) at otr-plugin.c:325
325	{
(gdb) p *message
$1 = 0x2cb4c60 "foobarzle<br><br>foobarzle"

As you can see, the message that OTR encrypts contains markup, not
entity-encoded stuff.

Now, as Jonathan pointed out, from a client writer's perspective this is
the wrong thing to do in many situations, since the receiver is required
to grok whatever arbitrary markup the sender's message contains.

> The communication without OTR is fine, it's plain text transmitted (checked
> the packets with Wireshark).

Yes, Miranda's XMPP plugin (mentioned further along in your message) does
not support XHTML-IM (based on a brief grep'ing of their code) and
libpurple strips all formatting from ICQ<>ICQ messages, so this is what you
would expect.

The real problem, as I see it, is that the libpurple sending-im signal is
triggered from libpurple's core and is sent before the protocols have
gotten a chance to munge the message into the format they want (plaintext
for ICQ, plaintext (+ optionally XHTML-IM) for XMPP, whole-buffer
formatting for MSN, etc). This means that the message contains markup when
the other side may not really want it.
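
To make the ordering issue concrete, here is a toy model in C. Everything in
it is hypothetical: toy_encrypt is a hex encoder standing in for OTR (it is
not real crypto), toy_protocol_format stands in for a plaintext protocol's
munging, and toy_send stands in for the send path. None of these are
libpurple or libotr functions. The only point is that once the hook has made
the payload opaque, the later formatting step has nothing left to convert.

```c
#include <stdio.h>
#include <string.h>

#define MSG_MAX 256

/* Toy stand-in for OTR: hex-encodes msg in place so the payload becomes
 * opaque to later steps. NOT real crypto. msg must point to a buffer of
 * at least MSG_MAX bytes holding fewer than MSG_MAX / 2 characters. */
static void toy_encrypt(char *msg)
{
    char tmp[MSG_MAX];
    size_t i, len = strlen(msg);

    if (len >= MSG_MAX / 2)
        return;                 /* would not fit; leave msg untouched */
    for (i = 0; i < len; i++)
        sprintf(tmp + 2 * i, "%02x", (unsigned char)msg[i]);
    tmp[2 * len] = '\0';
    strcpy(msg, tmp);
}

/* Toy stand-in for a plaintext protocol's formatting step:
 * replaces each literal "<br>" with a newline, in place. */
static void toy_protocol_format(char *msg)
{
    char *hit;

    while ((hit = strstr(msg, "<br>")) != NULL) {
        *hit = '\n';
        memmove(hit + 1, hit + 4, strlen(hit + 4) + 1);
    }
}

/* hook_first != 0 models the ordering described above (signal handlers
 * run before the protocol formats the message); 0 models the reverse. */
void toy_send(const char *input, int hook_first, char *out, size_t outlen)
{
    char buf[MSG_MAX];

    snprintf(buf, MSG_MAX / 2, "%s", input);   /* truncates long input */
    if (hook_first) {
        toy_encrypt(buf);
        toy_protocol_format(buf);   /* sees only hex, finds no <br> */
    } else {
        toy_protocol_format(buf);   /* <br> becomes 0x0A first */
        toy_encrypt(buf);
    }
    snprintf(out, outlen, "%s", buf);
}
```

With the input "a<br>b", the hook-first ordering yields an encoding of the
markup itself (the bytes 3c 62 72 3e survive), while the protocol-first
ordering yields an encoding of "a", 0x0A, "b".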

Evan Schoenberg suggested one mechanism for fixing this (well, he was after
fixing AIM Direct Connect messages, but it could fix this, too) by having
the protocol plugins handle the signals [1], though I don't believe
anything was decided in that discussion. In addition (caveat: I'm not very
familiar with the crypto of OTR), I suspect a proper fix for XMPP to send
encrypted markup and encrypted plaintext in their appropriate elements
would require modifications to OTR, since only one of the two messages
would be handled by the receiver and I believe OTR doesn't handle dropped
messages well (someone correct me on this?).

In any event, the OTR specification apparently says the decrypted text can
contain HTML markup [2], so, even though I tend to agree with Jonathan on
this topic, the specification points toward this being a problem with
Miranda's implementation.
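
For a plaintext-only receiver, one way to cope with HTML in the decrypted
text is to flatten the markup after decryption. A minimal sketch in C,
assuming that only lowercase <br> variants and the four common entities
need handling; markup_to_plain is a hypothetical helper, not code from
Miranda, libpurple, or libotr:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated plain-text copy of
 * `html`. <br>, <br/> and <br /> become '\n', every other tag is
 * dropped, and &amp; &lt; &gt; &quot; are decoded. The caller frees the
 * result. Returns NULL on allocation failure. */
char *markup_to_plain(const char *html)
{
    static const struct { const char *entity; char ch; } entities[] = {
        { "&amp;", '&' }, { "&lt;", '<' }, { "&gt;", '>' }, { "&quot;", '"' },
    };
    const size_t n_entities = sizeof entities / sizeof entities[0];
    size_t len = strlen(html);
    char *out = malloc(len + 1);    /* the output never grows */
    char *p = out;
    size_t i = 0;

    if (out == NULL)
        return NULL;

    while (i < len) {
        if (html[i] == '<') {
            const char *end = strchr(html + i, '>');

            if (end == NULL) {      /* unterminated tag: keep the '<' */
                *p++ = html[i++];
                continue;
            }
            if (strncmp(html + i, "<br", 3) == 0 &&
                (html[i + 3] == '>' || html[i + 3] == '/' ||
                 html[i + 3] == ' '))
                *p++ = '\n';
            i = (size_t)(end - html) + 1;   /* skip past the tag */
        } else if (html[i] == '&') {
            size_t k;

            for (k = 0; k < n_entities; k++) {
                size_t elen = strlen(entities[k].entity);

                if (strncmp(html + i, entities[k].entity, elen) == 0) {
                    *p++ = entities[k].ch;
                    i += elen;
                    break;
                }
            }
            if (k == n_entities)    /* unknown entity: keep the '&' */
                *p++ = html[i++];
        } else {
            *p++ = html[i++];
        }
    }
    *p = '\0';
    return out;
}
```

Because entities are decoded in the same single pass that handles tags, an
escaped &lt;br&gt; comes out as the literal text <br> and is not parsed a
second time as a line break.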

I'm unsure what Miranda should be doing for sending messages with markup,
though if the user literally types <br> in a message, my feeling is that
Miranda should be escaping that and sending &lt;br&gt;.
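
The escaping suggested here (turning a literally typed <br> into
&lt;br&gt; before the message is handed to OTR) could look roughly like
the following. This is only a sketch: escape_markup is a hypothetical name,
not an existing Miranda or libpurple function, and it covers only &, < and >.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of `text` with
 * '&', '<' and '>' replaced by their HTML entities. The caller frees
 * the result. Returns NULL on allocation failure. */
char *escape_markup(const char *text)
{
    /* Worst case: every byte expands to "&amp;" (5 bytes). */
    size_t len = strlen(text);
    char *out = malloc(len * 5 + 1);
    char *p = out;

    if (out == NULL)
        return NULL;

    for (; *text != '\0'; text++) {
        switch (*text) {
        case '&': memcpy(p, "&amp;", 5); p += 5; break;
        case '<': memcpy(p, "&lt;", 4);  p += 4; break;
        case '>': memcpy(p, "&gt;", 4);  p += 4; break;
        default:  *p++ = *text;          break;
        }
    }
    *p = '\0';
    return out;
}
```

The ampersand has to be escaped along with the angle brackets; otherwise a
user who literally types &lt; would have it decoded to < on the other end.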

~Paul

[1] http://pidgin.im/pipermail/devel/2008-December/007208.html
[2] http://lists.cypherpunks.ca/pipermail/otr-dev/2008-February/000744.html


