Title | Test e-mailing of stories that contain multi-octet UTF-8 chars
Date | Sunday May 24 2015, @10:53PM
Author | martyb
This test story contains a variety of 1-, 2-, and 3-octet UTF-8 characters. Its purpose is to see how well the e-mailing of stories handles them. The characters were entered directly (by cut-and-paste) rather than as decimal, hex, or named character entities.
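As a rough illustration of what "handling these characters" means for e-mail, here is a minimal sketch using Python's standard email package. It is not the site's actual mailer: the function names, the addresses, and the choice of base64 as the transfer encoding are all assumptions made for the example. It builds a message with a UTF-8 body, serializes it to bytes, parses it back, and returns the decoded body so the two can be compared.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Sample text with 1-, 2-, and 3-octet characters (escapes keep the source ASCII).
SAMPLE = "@AB \u0140\u0141\u0142 \u0710\u0711\u0712 \u0800\u0801\u0802"


def build_story_mail(subject: str, body: str) -> EmailMessage:
    """Build a text/plain message whose body is UTF-8 (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = "stories@example.invalid"   # placeholder address
    msg["To"] = "reader@example.invalid"      # placeholder address
    msg["Subject"] = subject
    # base64 keeps the wire form 7-bit clean; this is an assumed design choice.
    msg.set_content(body, charset="utf-8", cte="base64")
    return msg


def round_trip(msg: EmailMessage) -> str:
    """Serialize the message to bytes, parse it back, and decode the body."""
    parsed = message_from_bytes(msg.as_bytes(), policy=policy.default)
    return parsed.get_content()
```

If round_trip(build_story_mail("test", SAMPLE)) differs from SAMPLE by more than a trailing newline, some step in the chain has mangled the multi-octet characters.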
The following is taken from section 3, "UTF-8 definition", of RFC 3629: https://tools.ietf.org/html/rfc3629
Char. number range | UTF-8 octet sequence
(hexadecimal) | (binary)
--------------------+---------------------------------------------
0000 0000-0000 007F | 0xxxxxxx
0000 0080-0000 07FF | 110xxxxx 10xxxxxx
0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
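The table above can be turned into code fairly directly. The following is a minimal sketch, not taken from the RFC or from this story; the function name encode_utf8 is made up for the example. Each branch corresponds to one row of the table, and the x bits are filled from the code point, most significant bits first.

```python
def encode_utf8(cp: int) -> bytes:
    """Encode one code point using the bit layout in the RFC 3629 table."""
    # RFC 3629 limits UTF-8 to U+0000..U+10FFFF and excludes the surrogates.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp <= 0x7F:        # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:       # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:      # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

For the ranges used in this story, 0x40-0x7f falls in the first row (1 octet), 0x0140-0x017f and 0x0700-0x073f in the second (2 octets), and 0x0800-0x083f in the third (3 octets).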
peugen 0x40 0x7f 0x0140 0x017f 0x0700 0x073f 0x0800 0x083f | peu2utf8 > bleh.txt
cat bleh.txt
BEGIN
@ABCDEFGHIJKLMNO
PQRSTUVWXYZ[\]^_
`abcdefghijklmno
pqrstuvwxyz{|}~�
ŀŁłŃńŅņŇňŉŊŋŌōŎŏ
ŐőŒœŔŕŖŗŘřŚśŜŝŞş
ŠšŢţŤťŦŧŨũŪūŬŭŮů
ŰűŲųŴŵŶŷŸŹźŻżŽžſ
܀܁܂܃܄܅܆܇܈܉܊܋܌܍
ܐܑܒܓܔܕܖܗܘܙܚܛܜܝܞܟ
ܠܡܢܣܤܥܦܧܨܩܪܫܬܭܮܯ
ࠀࠁࠂࠃࠄࠅࠆࠇࠈࠉࠊࠋࠌࠍࠎࠏ
ࠐࠑࠒࠓࠔࠕࠖࠗ࠘࠙ࠚ
ࠠࠡࠢࠣࠤࠥࠦࠧࠨ
࠰࠱࠲࠳࠴࠵࠶࠷࠸࠹࠺࠻࠼࠽࠾
END.
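peugen and peu2utf8 are not described in this story, so the following is only a guess at what the pipeline does, inferred from the command line and the output above: the arguments are read as inclusive start/end pairs of code points, the characters are emitted 16 per line between BEGIN and END. markers, and the result is written as UTF-8. The names rows, render, and PAIRS are invented for this sketch.

```python
import sys

# Inclusive (start, end) pairs, as read from the peugen command line above.
PAIRS = [(0x40, 0x7F), (0x0140, 0x017F), (0x0700, 0x073F), (0x0800, 0x083F)]


def rows(pairs, width=16):
    """Yield one string of up to `width` consecutive characters per row."""
    for start, end in pairs:
        for base in range(start, end + 1, width):
            stop = min(base + width, end + 1)
            yield "".join(chr(cp) for cp in range(base, stop))


def render(pairs=PAIRS):
    """Return the whole listing, framed by the BEGIN and END. markers."""
    return "\n".join(["BEGIN", *rows(pairs), "END."]) + "\n"


if __name__ == "__main__":
    # Write bytes directly so the terminal's locale cannot change the encoding.
    sys.stdout.buffer.write(render().encode("utf-8"))
```

Unlike the listing above, this sketch emits every code point in each range, including unassigned ones and the DEL control character at 0x7f, so some rows may display differently depending on the font and terminal.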
printed from Dev.SN, Test e-mailing of stories that contain multi-octet UTF-8 chars on 2024-05-16 16:38:50