[OTR-dev] Happy Birthday to me. ;-)

Ian Goldberg ian at cypherpunks.ca
Thu Mar 31 21:52:21 EST 2005


We just passed 4000 unique IP downloads from the main site.

I finally tracked down the problem that was intermittently preventing
some people using otrproxy with Trillian from connecting to the AIM
network.  [For some it was occasional; for some it was always; for
some it was never.]  We'll put out 0.3.0 of the proxy with this fix (and
additional UI) next week.

As I mentioned, the bug was only occasional.  Bugs like that are always
hard to figure out.  Top that off with the fact that I don't run
Trillian, or indeed Windows (so all the actual test runs needed to be
done by someone else).

It turns out the bug is in Trillian, and is this:

- When Trillian needs to make a connection to the AIM server via a
  SOCKS5 proxy, it:
  - connects to the proxy
  - sends some SOCKS5 messages back and forth, ending with a message
    from the proxy to Trillian
  - sends AIM protocol messages back and forth, starting with a message
    from the AIM server (via the proxy) to Trillian.
  (This is exactly as it should be.)

- We have verified that if Trillian receives the first AIM message
  very close in time to the last SOCKS5 message (both are coming from
  the proxy to Trillian), Trillian never sees the AIM message, and never
  starts its side of the AIM protocol.

I surmise this is the bug (but of course I've never actually touched a
running Trillian, so who knows):

1. Trillian's SOCKS5 handler is doing something like:

   recv(proxysocket, bigbuf, sizeof(bigbuf), 0);

2. It handles the SOCKS5 message received into bigbuf.

3. When it sees the final SOCKS5 message, it starts the AIM protocol.

4. The AIM protocol also does something like:

   recv(proxysocket, bigbuf, sizeof(bigbuf), 0);
  
So what's the problem?  If the first AIM message has *also* arrived
by the time Trillian does the recv() in step 1 that picks up the final
SOCKS5 message, bigbuf will contain *both* the final SOCKS5 message,
_and_ the first AIM message.  But Trillian apparently ignores the extra
bytes, and eventually times out in step 4 waiting for an AIM message it
has already read.  [Or, more likely, in the select() guarding step 4.]
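
To illustrate the underlying behaviour (this is only a sketch, not
anything taken from Trillian or otrproxy): a stream socket does not
preserve message boundaries, so two separate send() calls can be
delivered by one recv().  The function name coalesced_recv_demo and the
message bytes below are made up for the example, and it uses a local
POSIX socketpair() rather than a real proxy connection.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send two separate messages into one end of a stream socket pair,
 * then do a single large recv() on the other end.  Returns the number
 * of bytes that one recv() delivered, or -1 on error. */
static long coalesced_recv_demo(void)
{
    /* Stand-in for the final SOCKS5 reply (10 bytes). */
    static const unsigned char socks_reply[10] =
        { 0x05, 0x00, 0x00, 0x01, 127, 0, 0, 1, 0x14, 0x46 };
    /* Stand-in for the first message of the next protocol (4 bytes). */
    static const unsigned char next_msg[4] = { 'A', 'I', 'M', '!' };
    unsigned char bigbuf[4096];
    int sv[2];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* Both messages are queued before the reader looks at the socket. */
    if (send(sv[0], socks_reply, sizeof socks_reply, 0)
            != (ssize_t)sizeof socks_reply ||
        send(sv[0], next_msg, sizeof next_msg, 0)
            != (ssize_t)sizeof next_msg) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* One recv() with a big buffer: it is free to return both messages. */
    n = recv(sv[1], bigbuf, sizeof bigbuf, 0);

    close(sv[0]);
    close(sv[1]);
    return (long)n;
}
```

When both messages are already queued, the single recv() would normally
hand back all of the bytes at once, which is the situation described
above: the caller has to work out for itself where one message ends and
the next begins.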

Fix: after processing the final SOCKS5 message in bigbuf in step 2, see
if there's anything left over, and if so, treat it as a message received
in step 4.
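
A minimal sketch of that fix, under some assumptions: the final SOCKS5
message is a reply laid out as in RFC 1928 (VER, REP, RSV, ATYP,
BND.ADDR, BND.PORT), and the helper names socks5_reply_len and
socks5_leftover are hypothetical, not anything from Trillian.  The idea
is just to compute how long the SOCKS5 reply really is and hand any
bytes after it to the next protocol instead of dropping them.

```c
#include <stddef.h>

/* Length in bytes of the SOCKS5 reply at the start of buf, 0 if the
 * reply is not complete yet, or (size_t)-1 if it is malformed. */
static size_t socks5_reply_len(const unsigned char *buf, size_t n)
{
    size_t need;

    if (n < 5)
        return 0;                       /* need VER..ATYP plus one byte */
    if (buf[0] != 0x05)
        return (size_t)-1;              /* not SOCKS version 5 */

    switch (buf[3]) {                   /* ATYP selects the address size */
    case 0x01: need = 4 + 4 + 2; break;               /* IPv4 */
    case 0x03: need = 4 + 1 + (size_t)buf[4] + 2; break; /* domain name */
    case 0x04: need = 4 + 16 + 2; break;              /* IPv6 */
    default:   return (size_t)-1;
    }
    return n >= need ? need : 0;
}

/* After one recv() put n bytes into buf: returns a pointer to whatever
 * follows the SOCKS5 reply and stores its length in *leftover_len.
 * Returns NULL (and a length of 0) if the reply is incomplete or bad. */
static const unsigned char *socks5_leftover(const unsigned char *buf,
                                            size_t n, size_t *leftover_len)
{
    size_t used = socks5_reply_len(buf, n);

    if (used == 0 || used == (size_t)-1) {
        *leftover_len = 0;
        return NULL;
    }
    *leftover_len = n - used;           /* may be 0: nothing extra arrived */
    return buf + used;
}
```

If *leftover_len comes back nonzero, those bytes would be passed to the
step 4 handler as though its own recv() had just returned them, and only
after that would the client go back to waiting on the socket.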

So we'll submit this bug report to Trillian tomorrow.  But what can we
do in the meantime?  For now, we just hardcode a 1-second delay before
sending the first AIM message from the proxy to the client on Win32.
[This fix will be in 0.3.0.]

Many thanks to Michael Wright, Shar van Egmond, and Ian Neal for their
invaluable debugging assistance!

Birthday Bughunting: what a Blast!  ;-)

   - Ian


