Babblings of an aging geek in love with the Absurd, his family, and his own hubris.... oh, and Lisp.

Unicode Characters

The Story of Chess

Once Upon a Time

Once upon a time, when Sessa invented chess, he showed this to the king.

The king was impressed and offered to pay Sessa.

Sessa asked for wheat: 1 grain on the first square, 2 on the next, doubling each time…

How Much?

The king’s treasurer told the king that would be too much.

18,446,744,073,709,551,615 grains

18 quintillion (that is, 18 million trillion)
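
The treasurer's figure can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the usual telling of the story in which the grains double on each of the 64 squares; the variable names are made up for illustration.

```python
# Grains on each of the 64 squares: 1, 2, 4, ... doubling every square.
grains_per_square = [2 ** square for square in range(64)]
total_grains = sum(grains_per_square)

# The series 1 + 2 + 4 + ... + 2**63 collapses to 2**64 - 1.
assert total_grains == 2 ** 64 - 1

print(f"{total_grains:,}")  # 18,446,744,073,709,551,615
```

That total is also the largest value an unsigned 64-bit integer can hold, which is why the number looks familiar to programmers.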

Once Upon a Time

My first computer could only display 128 characters.

• A byte is 8 bits
• 128 uses 7 bits
• Extra bit was a parity check

Parity Check?

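| 7 bits of data | Count of 1s | 8 bits with parity |
|----------------|-------------|--------------------|
| 0000000        | 0           | 0000000 0          |
| 1010001        | 3           | 1010001 1          |
| 1101001        | 4           | 1101001 0          |
| 1111111        | 7           | 1111111 1          |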
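
The rows above are consistent with even parity: the extra bit is chosen so that the total number of 1s comes out even. A minimal sketch of that rule, with hypothetical helper names `add_even_parity` and `passes_parity_check` (they are not from any particular machine or library):

```python
def add_even_parity(data_bits: str) -> str:
    """Append a parity bit so the total count of 1s is even."""
    parity_bit = data_bits.count("1") % 2
    return f"{data_bits}{parity_bit}"


def passes_parity_check(byte_bits: str) -> bool:
    """An 8-bit value passes if it contains an even number of 1s."""
    return byte_bits.count("1") % 2 == 0


def flip(bits: str, index: int) -> str:
    """Simulate a transmission error by flipping one bit."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]


for data in ["0000000", "1010001", "1101001", "1111111"]:
    print(data, data.count("1"), add_even_parity(data))

sent = add_even_parity("1010001")
one_error = flip(sent, 2)
two_errors = flip(one_error, 5)
print(passes_parity_check(sent))        # True
print(passes_parity_check(one_error))   # False: a single flipped bit is caught
print(passes_parity_check(two_errors))  # True: two flipped bits cancel out
```

The last line shows the weakness of the scheme: any even number of flipped bits slips through unnoticed.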

It got better…

• Parity not too effective.
• Switched to 8 bits
• Now we get 256 characters.
• 128 MORE characters…

What should we display?

• European characters
• ¿Cómo golpear la piñata?
• Góðan daginn⁈
• Or Greek? καλημέρα
• Or Graphic symbols and line drawings:
```
┏━━━┱───┐
┃ ☻ ┃ ☺ │
┣━━━╉───┤
┃ ⚑ ┃ ⚐ │
┗━━━┹───┘
```

• Chinese: 早安
• Japanese: おはよう
• Thai: อรุณสวัสดิ์
• Korean: 안녕하세요
• Bengali: সুপ্রভাত
• Nordic Runes: ᚠᚢᚦᚨᚱᚲ
• Math symbols: ∞ ÷ √
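
To see why 256 slots cannot hold all of this, here is a rough sketch that looks up the number modern Python assigns to each character (its Unicode code point, via the built-in `ord`) and checks whether it would fit in 8 bits. The `greetings` dictionary is just an illustration built from the examples above.

```python
greetings = {
    "Chinese": "早安",
    "Japanese": "おはよう",
    "Thai": "อรุณสวัสดิ์",
    "Korean": "안녕하세요",
    "Bengali": "সুপ্রভাত",
    "Runes": "ᚠᚢᚦᚨᚱᚲ",
}

for language, text in greetings.items():
    # ord() gives the numeric code point of a single character.
    highest = max(ord(ch) for ch in text)
    fits = highest < 256
    print(f"{language:9} highest code point {highest:>6}  fits in 8 bits: {fits}")
```

Every one of these needs a number larger than 255, so none of them can share a single 8-bit table with the others.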

Gets Worse

Some languages write from right-to-left.

• Hebrew: בֹּקֶר טוֹב
• Arabic: صباح الخير

But…

• 8 bits was a byte … a “character”
• Everything would have to change:
• Computer displays
• Programming languages
• All programs
• All files and databases
• English email would suddenly be larger!
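
The size complaint is easy to demonstrate. The sketch below uses Python's `utf-16-le` codec as a stand-in for "two bytes per character"; that choice is an assumption for illustration, not something this slide names.

```python
message = "Good morning"

one_byte_each = message.encode("ascii")       # one byte per character
two_bytes_each = message.encode("utf-16-le")  # two bytes per character here

print(len(message))         # 12 characters
print(len(one_byte_each))   # 12 bytes
print(len(two_bytes_each))  # 24 bytes
```

The text is identical, yet the second encoding takes twice the space, with every other byte being zero.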

Could we do both?

A book report is just a series of ones and zeros, so how should we interpret this?
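
The same bytes really can mean different things. As a sketch, the snippet below stores a word using one 8-bit table (Latin-1, for Western European text) and then reads it back with another (ISO 8859-7, for Greek). These two code pages are chosen purely for illustration; Python's codec names for them are `latin-1` and `iso8859_7`.

```python
stored = "piñata".encode("latin-1")

print(stored)                      # b'pi\xf1ata'
print(stored.decode("latin-1"))    # piñata
print(stored.decode("iso8859_7"))  # piρata
```

Nothing in the bytes says which table was intended, so the reader has to know, or guess.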

How Big is Big?

• Use two bytes: 65,536 characters
• Would almost work
• Japanese: 3,000+