Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site mecc.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxt!houxm!ihnp4!stolaf!umn-cs!mmm!dicomed!mecc!sewilco
From: sewilco@mecc.UUCP (Scot E. Wilcoxon)
Newsgroups: net.internat,net.unix
Subject: intnl: Character sets
Message-ID: <375@mecc.UUCP>
Date: Sun, 3-Nov-85 00:59:50 EST
Article-I.D.: mecc.375
Posted: Sun Nov 3 00:59:50 1985
Date-Received: Sun, 10-Nov-85 07:48:43 EST
References: <723@inset.UUCP> <960@erix.UUCP> <1569@hammer.UUCP> <6066@utzoo.UUCP> <224@l5.uucp>
Reply-To: sewilco@mecc.UUCP (Scot E. Wilcoxon)
Organization: MN Ed Comp Corp, St. Paul, MN
Lines: 39
Keywords: NAPLPS character set standard
Xref: watmath net.internat:32 net.unix:6198
Summary: start with NAPLPS character set

I think John's on the right track.  We don't need to just decide on a
character set representation.  We also need to decide what needs to be
changed in UNIX.  You'll see two similar articles here: character sets
and UNIX commands.

Partly for the sake of discussion, I'm suggesting the NAPLPS protocol as
the character set standard.  It allows 7 or 8 bit data streams, and uses
escape codes for character set extension.  Graphics are part of the
standard and their device-independent resolution exceeds that of a
phototypesetting machine.

Comments?

In article <224@l5.uucp> gnu@l5.uucp (John Gilmore) writes:
>In article <6066@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) writes:
>> > As far as character sets go, it would seem that 16 bits (65536
>> > possible characters) should be more than enough...
>> The trouble with this (and the other similar proposals) is it asks the
>> Western world to pay a factor of 2 in storage overhead for the sake of
>> the Asian character sets.
>I think the proposals are that a coding scheme for text be defined which
>allows 16-bit characters to be escape-coded into an 8-bit text stream.
>The arguments mostly center on what kind of coding scheme would fit both
>the needs of few-16-bit-char folks and few-8-bit-char folks without wasting
>too much storage for either.
>
>Internally to an international program, characters would be 16 bits,
>but stdio routines (printw, fprintw, sscanw, etc) would encode to a
>bytestream on the way in and out.  ("w" for "world" or "wide").
>
>(Hmm, the non-Unix-opsys people have been looking for a way to tell when
>we Unixoids are reading or writing a text file versus a binary file...now
>that we propose encoding our own text files, they will have the clue.)

(I'm biased towards NAPLPS because I have Viewtron's free Commodore-64
NAPLPS decoder program.  Not very biased..if the net comes up with a new
standard there'll be plenty of software for it.  Kick my straw man.)
-- 
Scot E. Wilcoxon    Minn. Ed. Comp. Corp.    circadia!mecc!sewilco
45 03 N / 93 15 W   (612)481-3507   {ihnp4,mgnetp}!dicomed!mecc!sewilco
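
For anyone who wants something concrete to kick: below is a rough sketch of
the kind of escape-coding John describes, where characters are 16 bits
inside the program and get packed into an 8-bit byte stream on the way out.
Everything in it is an assumption made for illustration. It is not NAPLPS
and not any agreed standard; the escape byte (0x1B), the three-byte layout
for wide characters, and the names encode_wide and decode_wide are all
hypothetical.

```python
ESC = 0x1B  # assumed escape byte; nothing in this thread fixes a value


def encode_wide(chars):
    """Pack 16-bit character codes into an 8-bit byte stream (hypothetical scheme)."""
    out = bytearray()
    for code in chars:
        if not 0 <= code <= 0xFFFF:
            raise ValueError("code out of 16-bit range: %r" % (code,))
        if code < 0x100 and code != ESC:
            # Fits in 8 bits: costs one byte, same as today.
            out.append(code)
        else:
            # Wide character (or a literal ESC): ESC, high byte, low byte.
            out.append(ESC)
            out.append(code >> 8)
            out.append(code & 0xFF)
    return bytes(out)


def decode_wide(data):
    """Unpack a byte stream produced by encode_wide into 16-bit codes."""
    chars = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == ESC:
            # An escape must be followed by exactly two more bytes.
            if i + 2 >= len(data):
                raise ValueError("truncated escape sequence")
            chars.append((data[i + 1] << 8) | data[i + 2])
            i += 3
        else:
            chars.append(byte)
            i += 1
    return chars
```

Under these assumptions, text made of 8-bit characters pays no storage
overhead (apart from a literal ESC), while each 16-bit character costs three
bytes instead of two. That is one point on the trade-off being argued about,
not a recommendation.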