Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site fortune.UUCP
Path: utzoo!watmath!clyde!cbosgd!ihnp4!fortune!mats
From: mats@fortune.UUCP (Mats Wichmann)
Newsgroups: net.internat
Subject: Re: What do we REALLY want?
Message-ID: <5762@fortune.UUCP>
Date: Fri, 8-Nov-85 13:22:17 EST
Article-I.D.: fortune.5762
Posted: Fri Nov 8 13:22:17 1985
Date-Received: Sun, 10-Nov-85 09:31:01 EST
References:
Reply-To: mats@fortune.UUCP (Mats Wichmann)
Organization: Fortune Systems, Redwood City, CA
Lines: 43

Okay, I don't know if anyone has posted this, we seem to be getting things very sporadically here so I may have missed it. However.

There is an ISO standard for "code extension techniques" (ISO 2022) which is supposed to address these wonderful issues. It starts from 7-bit ASCII (very important, because they use the 8th bit...). There are two ways to shift character sets: "Single-shift" and "Locking-Shift". Single shift is like you pressing the SHIFT or CONTROL key on your terminal - it has to be done for each character. Locking Shift puts you into a different mode until an unlock sequence comes along.

The AT&T internationalization proposal is based on this idea, but uses only single-shift, and basically follows these two rules:

1. If the high-order bit of an 8-bit byte is turned off, the 8-bit sequence comes from an ASCII character set.

2. If the high-order bit is turned on, the 8-bit sequence is non-ASCII and should be interpreted as belonging to one of the three local character sets. The exact character set it belongs to depends on the internal coding method and whether it was preceded by a single-shift character.

There will be special "single-shift characters" which signify one or two byte following sequences (the two magic cookies which select this would be "SS2" = 0x8e and "SS3" = 0x8f).

The above is a major condensation, and only represents the proposal as I understand it.
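
As a rough illustration of the two rules, here is a minimal Python sketch of how a byte stream might be split into characters under such a scheme. Only the SS2 and SS3 values come from the description above. The function name `tokenize`, the set labels ("set1", "set2", "set3"), the mapping of "no shift", SS2 and SS3 onto those labels, and the byte widths in `WIDTHS` are all assumptions made for the example, not part of the proposal as stated.

```python
SS2 = 0x8E  # single-shift value quoted in the post
SS3 = 0x8F  # single-shift value quoted in the post

# Assumed byte widths for each local set; the post does not specify them.
WIDTHS = {"set1": 2, "set2": 1, "set3": 2}


def tokenize(data: bytes, widths=WIDTHS):
    """Split data into (charset, payload) pairs using the two rules."""
    tokens = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Rule 1: high-order bit off means ASCII.
            tokens.append(("ascii", data[i:i + 1]))
            i += 1
            continue
        # Rule 2: high-order bit on. A single-shift byte selects the set
        # for the one character that follows it (assumed mapping).
        if b == SS2:
            name, i = "set2", i + 1
        elif b == SS3:
            name, i = "set3", i + 1
        else:
            name = "set1"
        n = widths[name]
        payload = data[i:i + n]
        # Every payload byte must also have its high-order bit on.
        if len(payload) != n or any(c < 0x80 for c in payload):
            raise ValueError(f"malformed {name} character at offset {i}")
        tokens.append((name, payload))
        i += n
    return tokens


if __name__ == "__main__":
    for charset, payload in tokenize(b"A\xb0\xa1\x8e\xb1B"):
        print(charset, payload.hex())
```

Because the shift applies to a single character only, the decoder carries no mode between characters: looking at one byte (plus an optional shift byte before it) is enough to know which set it belongs to. A locking-shift scheme would instead need a state variable that persists until the unlock sequence arrives.
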

The reference document is: "Information Processing - ISO 7-bit and 8-bit Coded Character Sets - Code Extension Techniques", ISO 2022-1982(E).

I am relatively new to this game, so if anyone has sensible objections to this scheme, I would love to be educated. This sort of suggestion does of course not tackle issues like sorting at all; it merely suggests how to represent the data, not what you can do with it.

Mats Wichmann
Fortune Systems
{ihnp4,hplabs,dual}!fortune!mats