Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.3 alpha 5/22/85; site cbosgd.UUCP
Path: utzoo!watmath!clyde!cbosgd!mark
From: mark@cbosgd.UUCP (Mark Horton)
Newsgroups: net.internat
Subject: Re: What do we REALLY want?
Message-ID: <1594@cbosgd.UUCP>
Date: Sun, 10-Nov-85 20:01:25 EST
Article-I.D.: cbosgd.1594
Posted: Sun Nov 10 20:01:25 1985
Date-Received: Mon, 11-Nov-85 06:14:29 EST
References: <723@inset.UUCP> <960@erix.UUCP> <1569@hammer.UUCP>
Organization: AT&T Bell Laboratories, Columbus, Oh
Lines: 44

The Japanese Kanji character set can be input in the same phonetic way as was described for Chinese. You type in 2 or 3 Roman letters which phonetically sound like the syllable you want, and it turns into the (unique) Katakana glyph for that syllable. You do this for every syllable in the word and then press a special key, and something consults a (big) table and finds all the glyphs that sound like that. It puts up a menu, which often has 2-6 choices, on an extra line at the bottom of the terminal. You pick one and it goes up on the screen.

I'm told there are about 60000 Kanji characters, and a few tens of thousands more Chinese characters. (I can't remember the exact numbers.) However, a subset that fits in 14 bits is in common use, and they are willing to restrict themselves to this subset.

There are apparently already official standards for encoding Kanji in 16 bits, intermixed with ASCII. It seems that you take the 14 bits and put them in two bytes, each byte with the 8th bit on. Having two consecutive bytes with the parity bit on means it's a Kanji character. A single byte with the parity bit on might have a different international meaning. This doesn't break tail or grep. I don't know what they do if there are two European characters in a row, but I gather there is some standard way of dealing with this.

The only mode needed is attached to the keyboard, so it can tell if you're typing in Roman or Katakana.

By the way, I've seen several references to a function "printw" with an assumption that this would be a 16-bit printf. I'd like to point out that the name "printw" has already been taken by curses, which is present in both 4BSD and System V. (printw means "print window.") I'm not even convinced that such a function is needed, since the existing standards seem oriented toward streams of 8-bit bytes. I don't think stdio cares whether a character is Kanji or Roman; that's between the application and the terminal. Regular old printf works fine.

	Mark Horton

P.S. Everybody agrees that this group should exist and should be distributed worldwide, but the name "net.internat" is terrible. Let's settle the issue of whether it's to be moderated (I understand we have a volunteer to be the moderator) and then call it either net.international or mod.international.