[OTR-users] Stylometry?

Sun Jan 2 14:14:30 EST 2005

On Sun, 2 Jan 2005 16:14:15 +0100 (MET), Paul Wouters
<paul at cypherpunks.ca> wrote:
> If you don't trust your conversation partner, you should not be telling
> them anything, even over OTR.

Well, that's true, but at the same time the stated goal of OTR is to
give online conversations the same properties as private offline
conversations.

OTR gets fairly close to that goal, but there are still some places
where it's not perfect.

For example, the person you are talking to could defect, and decide to
help the dissident-smashing mafia. They provide him with a computer
with a copy of OTR they have verified and he sits down in front of
them and converses with you. After seizing your computer and seeing
your private key, they are confident you are who you say you are.
This same event can happen in face-to-face communication; OTR doesn't
do any worse.

But let's consider my concern...  The party you've been talking to for
ages logs your conversations and then is either captured or defects.
Beyond the obvious information leak, it's possible that a third party
could be convinced you were the author of the text.

Beyond style masking, I had another idea... what if, in addition to the
RSA private key, each party maintained a separate long-term DH key?
At the beginning of every conversation, a DH key negotiation takes
place (inside the secure channel) and a key is generated to encrypt
their log file of the conversation.  When the OTR session is torn
down they toss the key.  A user, in isolation, would be unable to
reveal the logs even under duress.
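
To make the idea concrete, here is a rough sketch of one way to read
it.  Everything in it is my assumption rather than anything OTR does:
the toy group (P, G), the function names, and the rule that each side
keeps only its own long-term secret on disk while the peer's public
value and the derived key live in memory for the session only.

```python
import hashlib
import secrets

# Toy group for illustration only (an assumption, not an OTR parameter):
# 2**127 - 1 is prime, but far too small for real use.
P = 2**127 - 1
G = 3


def new_long_term_secret() -> int:
    """Pick a private exponent in [2, P - 2]; the only value kept on disk."""
    return secrets.randbelow(P - 3) + 2


def public_value(secret: int) -> int:
    """Value sent to the peer inside the secure channel at session start."""
    return pow(G, secret, P)


def session_log_key(my_secret: int, peer_public: int) -> bytes:
    """Derive a 32-byte log key; discard it and peer_public at teardown."""
    shared = pow(peer_public, my_secret, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


# Both parties derive the same key while the session is up.
alice_secret = new_long_term_secret()
bob_secret = new_long_term_secret()

alice_key = session_log_key(alice_secret, public_value(bob_secret))
bob_key = session_log_key(bob_secret, public_value(alice_secret))
assert alice_key == bob_key
```

The "unable in isolation" property only holds in this sketch if
neither side stores the other's public value or the derived key; a
real design would also need a vetted group, a proper KDF, and an
authenticated cipher for the log itself.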

For some users, this would be a desirable way to store the logs. For
others it wouldn't (or they would simply prefer a simple
password-driven system), but this method could only be used if OTR
facilitated an additional DH exchange.  (Otherwise, I assume log
storage is outside the scope of OTR.)

> I consider OTR a protection against external listening ears, and still
> assume all my conversations end up being logged on the remote disk.

Sure.

> As for obfuscating writing styles for a judge, I've learned that judges
> find it hard enough to understand normal technology, and they will strongly
> dislike you if you start playing technological games in court.

Now that's true... But perhaps style masking would be better used on IM
traffic that is supposed to go in the comparison set, to prevent the
analysis.

> > order against AOL produces a log of your last 10 years of IM traffic
> > to use as a basis for analysis)
> Those logs will be encrypted with OTR, and unreadable to everyone including
> the sender and receiver.

Not yet, sadly.  I have hundreds of thousands of IMs that could well be
stored outside of my control *already*, and I continue to generate
them since not everyone I talk to is able to use OTR (I fully intend
to refuse to IM with people who are *unwilling*, but until there is a
Windows gaim port, there are quite a few who are unable. Changing
platforms just to talk to me isn't really reasonable).

> That is not entirely true. It is really an American misconception that corporations
> should have as much info and power through legal clauses in contracts to protect themselves
> against third party claims or governments. I've argued on occasions (including
> my upstream ISP) that not having such data or power is a much better defendable
> position in court, and that having less power over a customer actually removes you
> from conflicts of your customers with third parties much more effectively.

Many years ago I worked at a small (dialup) ISP (before the birth of
monopoly-controlled broadband made such businesses near impossible).
Our policy was to log as little as possible and retain what we did log
for only as long as we had to (for troubleshooting and such).  Past
that, all logs were fed to summarization scripts.

This wasn't because we were privacy mavens; it was just a matter of
reducing our storage costs and, potentially, the costs of retrieving
the data.

This was also because we believed that it would be unconscionable to
otherwise profit from that data... so we really had no reason to keep
it.

> Unfortunately, in the US, lawyers will just not even think of removing the
> 'we can do whatever we want for whatever non-reason if we feel like it' clause.
> And as a result, third parties will try to invoke that power (eg Scientology
> to name just one). And I guess give those same lawyers work to do.....

If I were designing an IM system, I would find it idiotic to make all
client-to-client traffic go through some servers I ran, both from a
traffic perspective and because of the liability of being asked to
monitor the traffic.... But that position is because I wouldn't decide
to otherwise profit from the IM traffic.

However, that's not the position of some of today's corporate operators:
The world's IM traffic is insanely valuable data, and even if their
actions are limited by wiretap laws (which, right now, it doesn't look
like they are...) they still can likely collect statistical
aggregations, which can be as commercially valuable as the raw IM
traffic.  Whatever the cost of being forced to disclose information,
or the risk of employees using the information unlawfully, it's
obviously small compared to the potential (although immoral, in my
view) gains of having access to the information.

> You can turn your argument around. Leaking a lot of my writing style gives
> me the perfect excuse to write 'U 4r3 31337, k177 h1m' and claim that wasn't
> you who wrote it. These types of games will be thrown out by any court or agency
> as wild speculation at best.

An excellent point...  It might be useful to amend the OTR
documentation to discuss the reality of logging (that it happens, that
you can't prevent the remote party from doing it, and that you
shouldn't try, because they can always bypass it and then you'll have
a false sense of security).

Thanks for your reply!