Ticket #6308 (assigned defect)

Opened 5 months ago

Last modified 7 weeks ago

MD5 encoding is not done on strings using UTF8

Reported by: guest
Owned by: ttrenka
Priority: normal
Milestone: future
Component: Dojox
Version: 1.0
Severity: normal
Keywords: MD5 UTF8 UTF16
Cc:

Description (last modified by ttrenka)

The MD5 digest is not computed over a UTF-8 encoding of the string. This results in a different MD5 than is expected for UTF-8 encoded pages.

The library that the Dojo MD5 functionality was originally based on has an updated version (possibly alpha) which always converts the string from UTF-16 to UTF-8 before hashing.

http://pajhome.org.uk/crypt/md5/
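
For readers unfamiliar with the conversion being requested, here is a minimal sketch of turning a JavaScript (UTF-16) string into UTF-8 bytes before hashing. The function name `utf16ToUtf8Bytes` is hypothetical; it is not part of Dojo or of the library linked above, and the real implementation may differ.

```javascript
// Hypothetical helper (not a Dojo API): encode a JavaScript string, which is
// stored as UTF-16 code units, into an array of UTF-8 byte values.
function utf16ToUtf8Bytes(str) {
  var bytes = [];
  for (var i = 0; i < str.length; i++) {
    var cp = str.charCodeAt(i);

    // Combine a valid surrogate pair into a single code point.
    // A lone surrogate is left as-is and encoded as three bytes.
    if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < str.length) {
      var low = str.charCodeAt(i + 1);
      if (low >= 0xDC00 && low <= 0xDFFF) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
        i++;
      }
    }

    if (cp < 0x80) {
      // 1 byte: plain ASCII
      bytes.push(cp);
    } else if (cp < 0x800) {
      // 2 bytes: 110xxxxx 10xxxxxx
      bytes.push(0xC0 | (cp >> 6), 0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      bytes.push(0xE0 | (cp >> 12),
                 0x80 | ((cp >> 6) & 0x3F),
                 0x80 | (cp & 0x3F));
    } else {
      // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
      bytes.push(0xF0 | (cp >> 18),
                 0x80 | ((cp >> 12) & 0x3F),
                 0x80 | ((cp >> 6) & 0x3F),
                 0x80 | (cp & 0x3F));
    }
  }
  return bytes;
}
```

An MD5 routine that consumes bytes could then be fed the result of `utf16ToUtf8Bytes(input)` instead of the raw character codes, which should make the digest match one computed server-side over UTF-8 data.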

Change History

  Changed 5 months ago by bill

  • owner changed from anonymous to ttrenka
  • component changed from General to Dojox

This is presumably for DojoX crypto... Tom, is that yours?

follow-up: ↓ 3   Changed 5 months ago by ttrenka

  • status changed from new to assigned
  • milestone set to 1.3

Yes, it's mine. The version of MD5 in the repo does not support Unicode, but I'll see if an update is warranted.

The main issue is that if one gets updated to use double-byte character sets, all must be updated and I'm not entirely sure I want to support that yet.

in reply to: ↑ 2 ; follow-up: ↓ 4   Changed 5 months ago by guest

Replying to ttrenka:

The main issue is that if one gets updated to use double-byte character sets, all must be updated and I'm not entirely sure I want to support that yet.

The issue we encountered was with high-order characters (code 128 and above, often loosely called "high ASCII"). Our backend is all UTF-8 (MySQL / PHP) and the forms are all UTF-8, but when the JS layer hashes these characters it works from their UCS-2 values, which do not match the UTF-8 bytes.

For example, "§": Latin-1 ("high ASCII"): 0xA7 (167); UCS-2: 00A7; UTF-8: C2A7

Each of these results in a different MD5 being generated and makes digest comparison quite difficult.
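
The mismatch described above can be reproduced with a short sketch. It only prints the byte sequences for "§" under each encoding; it does not call any Dojo code. `TextEncoder` is a current standard API used here for brevity and was not available when this ticket was filed.

```javascript
// Illustration only: the bytes a hash function would see for "§" (U+00A7)
// under three different encodings.
var ch = "\u00A7";
var code = ch.charCodeAt(0); // the UTF-16 / UCS-2 code unit, 0xA7

// Format an array of byte values as upper-case hex pairs.
function toHex(bytes) {
  return bytes.map(function (b) {
    return (b < 16 ? "0" : "") + b.toString(16).toUpperCase();
  }).join(" ");
}

var latin1 = [code];                                  // one byte
var ucs2 = [code >> 8, code & 0xFF];                  // two bytes, big-endian
var utf8 = Array.from(new TextEncoder().encode(ch));  // two bytes

console.log(toHex(latin1)); // A7
console.log(toHex(ucs2));   // 00 A7
console.log(toHex(utf8));   // C2 A7
```

Because MD5 operates on bytes, three different byte sequences give three different digests for what is visually the same character.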

in reply to: ↑ 3   Changed 5 months ago by ttrenka

  • description modified

Understood; I will see what I can do.

BTW, you do realize that MD5 has been broken (practical collision attacks exist), right?

  Changed 3 months ago by ttrenka

  • milestone changed from 1.3 to 1.4

  Changed 7 weeks ago by ttrenka

  • milestone changed from 1.4 to future