Dev.SN ♥ developers

posted by martyb on Friday May 22 2015, @11:06AM
from the more-fun-with-UTF-8 dept.

Sig and Bio user preferences test. The user preferences page (https://dev.soylentnews.org/users.pl) describes these fields as:

Sig: Appended to the end of comments you post. 120 chars.

Bio: This information is publicly displayed on your user page. 255 chars.

These tests ensure that we support that number of characters and not just that number of bytes. Specifically, every code point above U+007F needs two or more octets (bytes) when encoded as UTF-8, as follows (taken from https://tools.ietf.org/html/rfc3629):

Char. number range  |        UTF-8 octet sequence
   (hexadecimal)    |              (binary)
--------------------+---------------------------------------------
0000 0000-0000 007F | 0xxxxxxx
0000 0080-0000 07FF | 110xxxxx 10xxxxxx
0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
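
The ranges in the table can be turned into a small lookup. The following is only a sketch in Python, not code from the site; the function name `utf8_octet_count` is made up for illustration, and the result is cross-checked against Python's built-in UTF-8 encoder.

```python
def utf8_octet_count(code_point: int) -> int:
    """Return how many octets UTF-8 (RFC 3629) uses for one code point.

    Hypothetical helper for illustration; it mirrors the four table rows.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        # RFC 3629 forbids encoding surrogate code points.
        raise ValueError("surrogates cannot be encoded as UTF-8")
    if code_point <= 0x7F:
        return 1  # 0xxxxxxx
    if code_point <= 0x7FF:
        return 2  # 110xxxxx 10xxxxxx
    if code_point <= 0xFFFF:
        return 3  # 1110xxxx 10xxxxxx 10xxxxxx
    return 4      # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx


# Cross-check a few sample code points against Python's own encoder.
for cp in (0x41, 0xE9, 0x905, 0x1F600):
    assert utf8_octet_count(cp) == len(chr(cp).encode("utf-8"))
```

Any field limit that is enforced on the encoded bytes rather than on the decoded string will therefore cut non-ASCII text short by a factor of up to four.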

  • (Score: 2) by martyb (76) on Friday May 22 2015, @11:40AM (#28454)

    The Sig on the parent comment contained all 120 of the 3-octet chars that had been saved in the Sig, as follows:

    ऀँंःऄअआइईउऊऋऌऍऎएऐऑऒओऔकखगघङचछजझञटठडढणतथदधनऩपफबभमयरऱलळऴवशषसहऺऻ़ऽािीॉॊोौ्ॎॏॐक़ख़ग़ज़ड़ढ़फ़य़ॠॡॢॣ।॥०१२३४५६७८९॰ॱॲॳॴॵॶॷ

    Success.
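
The difference between a character limit and a byte limit can be shown with a short check. This is a sketch under assumptions, not the site's code: `fits_sig` and `SIG_MAX_CHARS` are hypothetical names, "characters" is taken to mean Unicode code points, and the test string is assumed to be 120 consecutive code points starting at U+0900, all of which fall in the 3-octet range.

```python
SIG_MAX_CHARS = 120  # limit stated on the preferences page


def fits_sig(sig: str, max_chars: int = SIG_MAX_CHARS) -> bool:
    """Hypothetical check: count code points, not encoded bytes."""
    return len(sig) <= max_chars


# Assumption: 120 consecutive code points beginning at U+0900.
sig = "".join(chr(cp) for cp in range(0x0900, 0x0900 + 120))

char_count = len(sig)                  # code points in the string
byte_count = len(sig.encode("utf-8"))  # octets after UTF-8 encoding

print(char_count, byte_count, fits_sig(sig))
```

A check written as `len(sig.encode("utf-8")) <= 120` would reject this same Sig, which is the failure the test is meant to rule out.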
