Message-ID: <87wo2z4tyj.fsf@d>
Date: Sun, 19 Jul 2020 14:36:52 +0300
From: Aleksey Cherepanov <aleksey.4erepanov@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: program to identify hashes in a text file

Johny Krekan <krekan@...nykrekan.com> writes:
> sVid
>
> I13
> sVlicensor
>
> I0
> sVsessionKey
[...]
> F1594119243.6744404

I don't see any hashes, but I noticed something else here.

All these lines look like data serialized by the "pickle" module of Python
2, using its text protocol 0. Python 3 produces binary data by default
because its default pickle protocol is a binary one with different
opcodes. I checked Java, .NET, Ruby and PHP: all of them use very
different formats.
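
The difference between the text and binary formats can be seen with a
minimal sketch, assuming Python 3 is at hand. The `sample` dictionary is
made up for illustration; `protocol=0` requests the old text format.

```python
import pickle

# Hypothetical sample data, modelled on two keys seen in the quoted dump.
sample = {"id": 13, "licensor": 0}

# Protocol 0 is the original text format: printable ASCII, with each
# opcode argument terminated by a newline.
text_pickle = pickle.dumps(sample, protocol=0)

# Without an explicit protocol, Python 3 uses a binary protocol; such a
# pickle starts with the PROTO opcode, byte 0x80.
binary_pickle = pickle.dumps(sample)

print(text_pickle.decode("ascii"))
print(binary_pickle[:2])
```

Python 3 can still read and write protocol 0, so data like the quoted
lines does not by itself prove that Python 2 produced it.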

The reference:
https://github.com/python/cpython/blob/2.7/Lib/pickle.py

Excerpts:
FLOAT           = 'F'   # push float object; decimal string argument
INT             = 'I'   # push integer or bool; decimal string argument
UNICODE         = 'V'   # push Unicode string; raw-unicode-escaped'd argument
PUT             = 'p'   # store stack top in memo; index is string arg
SETITEM         = 's'   # add key+value pair to dict

Opcode 's' pops two values (a key and a value) from the stack and uses
them to set an item in the dictionary below them, so it comes after the
values in the serialized data. Opcode 'V' pushes a unicode string
terminated by a regular end of line, while embedded ends of line are
encoded as the unicode escape sequence \u000a.
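
The stack behaviour described above can be sketched as a toy interpreter.
This is not the real pickle implementation: `run_subset` is a hypothetical
helper that handles only the opcodes ( d V I F p s . and the sample input
is made up.

```python
MARK = object()  # sentinel pushed by the '(' opcode


def run_subset(data):
    """Interpret a small subset of the protocol-0 opcodes (toy sketch)."""
    stack, memo = [], {}
    pos = 0

    def read_line():
        # Opcode arguments run up to the next newline.
        nonlocal pos
        end = data.index("\n", pos)
        arg = data[pos:end]
        pos = end + 1
        return arg

    while True:
        op = data[pos]
        pos += 1
        if op == "(":        # MARK
            stack.append(MARK)
        elif op == "d":      # DICT: build a dict from items above the mark
            items = []
            while stack[-1] is not MARK:
                items.append(stack.pop())
            stack.pop()      # drop the mark itself
            items.reverse()
            stack.append(dict(zip(items[0::2], items[1::2])))
        elif op == "V":      # UNICODE: raw-unicode-escaped text up to newline
            raw = read_line().encode("latin-1")
            stack.append(raw.decode("raw_unicode_escape"))
        elif op == "I":      # INT: decimal string; 00 and 01 mean False/True
            arg = read_line()
            stack.append(arg == "01" if arg in ("00", "01") else int(arg))
        elif op == "F":      # FLOAT: decimal string
            stack.append(float(read_line()))
        elif op == "p":      # PUT: remember stack top, leave the stack alone
            memo[read_line()] = stack[-1]
        elif op == "s":      # SETITEM: pop value, then key, set on the dict
            value = stack.pop()
            key = stack.pop()
            stack[-1][key] = value
        elif op == ".":      # STOP: the result is on top of the stack
            return stack.pop()
        else:
            raise ValueError("opcode not handled by this sketch: %r" % op)


# Made-up input; \\u000a is how an embedded newline is stored by 'V'.
sample = "(dp0\nVid\np1\nI13\nsVnote\np2\nVline one\\u000aline two\np3\ns."
result = run_subset(sample)
print(result)
```

Note that 's' directly follows the argument line of the previous opcode,
which is why lines such as "sVid" appear in the data.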

I added opcode 'p' to the list because Python 2.7 emits one after every
object it memoizes (the dictionary and each string in the example below).
I don't have any other version of Python, so I cannot tell whether other
versions skip them. Still, the output looks quite similar.
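
One way a pickle can end up without 'p' lines is post-processing. The
sketch below assumes Python 3 and uses made-up data; it is not a claim
about how the quoted data was produced. `pickletools.optimize` drops PUT
opcodes that no later GET refers to.

```python
import pickle
import pickletools

# Made-up sample, serialized in the text protocol 0.
with_puts = pickle.dumps({"id": 13, "licensor": 0}, protocol=0)

# optimize() removes PUT opcodes whose memo slot is never read back.
without_puts = pickletools.optimize(with_puts)

print(with_puts.decode("ascii"))
print(without_puts.decode("ascii"))
```

Both byte strings load to the same dictionary; only the memo bookkeeping
differs.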

Example in python:
>>> import pickle
>>> print pickle.dumps({ u'id' : 13, u'licensor' : 0 })
(dp0
Vlicensor
p1
I0
sVid
p2
I13
s.
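
The dump above can also be inspected without Python 2. A minimal sketch,
assuming Python 3: the bytes literal is the output above retyped by hand,
and `pickletools.genops` lists the opcodes without building any objects.

```python
import pickle
import pickletools

# The protocol-0 text from the session above, retyped as bytes.
dump = b"(dp0\nVlicensor\np1\nI0\nsVid\np2\nI13\ns."

# genops() only parses the stream; nothing is constructed or executed.
ops = [(op.name, arg) for op, arg, pos in pickletools.genops(dump)]
for name, arg in ops:
    print(name, arg)

# Python 3 still reads protocol 0, so the dictionary can be restored.
restored = pickle.loads(dump)
print(restored)
```

For data of unknown origin it is safer to stay with `pickletools`, because
`pickle.loads` can run arbitrary code from a crafted pickle.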

So in the real data there is a dictionary with the keys "id", "licensor"
and "sessionKey". I guess there should be more keys, because there should
be a key before the first base64-encoded string. There is also a timestamp
stored as a 'float': F1594119243.6744404 (Tue Jul  7 10:54:03 UTC 2020). I
don't see the corresponding key.
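
The float can be checked as a Unix timestamp with a short sketch, assuming
Python 3 and that the value counts seconds since 1970-01-01 UTC:

```python
from datetime import datetime, timezone

# Argument of the 'F' opcode from the quoted data.
stamp = 1594119243.6744404

# Interpret it as seconds since the Unix epoch, in UTC.
when = datetime.fromtimestamp(stamp, tz=timezone.utc)
print(when.strftime("%a %b %d %H:%M:%S %Z %Y"))
```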

The second base64-encoded string starts with 'V', which should be
pickle's opcode. The decoded string then has a nicely round length of 256.

Googling the dictionary's keys might point to the framework or software
that produced the data, which would make it possible to read about the
format, but I did not find anything useful.

Thanks!

--
Regards,
Aleksey Cherepanov
