Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/5/84; site wanginst.UUCP
Path: utzoo!linus!wanginst!perlman
From: perlman@wanginst.UUCP (Gary Perlman)
Newsgroups: net.math.stat
Subject: Re: Some topics I wouldn't mind discussing
Message-ID: <1145@wanginst.UUCP>
Date: Tue, 24-Sep-85 21:30:22 EDT
Article-I.D.: wanginst.1145
Posted: Tue Sep 24 21:30:22 1985
Date-Received: Fri, 27-Sep-85 06:37:46 EDT
References: <277@nrcvax.UUCP>
Reply-To: perlman@wanginst.UUCP (Gary Perlman)
Organization: Wang Institute, Tyngsboro, MA 01879 USA
Lines: 173
Summary:

I am compelled by unknown forces to do this every year, I guess
because people thank me for it.

Since 1980, I have been distributing a small statistics package called
UNIX|STAT, so called because it was developed on UNIX and uses
pipelines a lot; it is a very UNIX style package. Thanks to a lot of
grungy work by Fred Horan at Cornell, the Lattice C compiler, and
continuing education in portability, most of the programs have been
ported to MSDOS on the IBM PC. I am not yet ready to distribute the
programs on floppies for MSDOS, but more than one site has been able
to take the sources I distribute and compile them for MSDOS with other
C compilers. Over the next few months, I will be doing V&V work on the
MSDOS versions and finding some floppy-copy house to make copies.

So, what is UNIX|STAT? Well, it's not comprehensive, but there are a
lot of good programs in it. They are described below. More programs
are likely in the next year. Some people have sent me code (that I have
not yet had time to incorporate) for nonparametrics, and I am working
on a multi-factor crosstabs/chi-square.

People seem to like UNIX|STAT because it integrates with UNIX
naturally, reading the standard input and writing the standard output.
It even has documentation: tutorials, manual entries, and I have even
made a video tape introduction (although the tape has not been
distributed with the package).
It is also cheap: $20 gets you a mag tape, or you can send me a
600 foot mag tape and prepaid return mailer and get it free. This,
obviously, is public domain software. If you send me your postal
address, I can send you more documentation.

Now for details. Note: if you are using UNIX|STAT 5.0, there is
nothing new here.

                         UNIX|STAT 5.0
                 COMPACT DATA ANALYSIS PROGRAMS

UNIX|STAT is a set of UNIX System data manipulation and analysis
programs developed at the University of California, San Diego by Gary
Perlman (now teaching at the Wang Institute of Graduate Studies). The
programs are designed with the UNIX System philosophy that individual
programs should be designed as tools that do one task well and produce
output suitable for input via pipes to other programs. Interactive use
is supported in the UNIX System shell which also provides a programming
language for complex analyses. Typical usage involves a pipeline of
transformations of data followed by input to an analysis program,
summarized schematically by:

    INPUT DATA | TRANSFORM | ANALYSIS | OUTPUT RESULTS

Functionality often built into statistical packages (e.g., graphics,
sorting and other data manipulation) is not re-invented in UNIX|STAT
which delegates such responsibility to standard UNIX System tools.

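The schema INPUT DATA | TRANSFORM | ANALYSIS | OUTPUT RESULTS can be
illustrated with a small sketch. This is not code from the package
(which is written in C and connects separate programs with shell
pipes); it is a Python stand-in, and the names parse_rows,
extract_column, describe, and run are made up for the illustration.
It assumes free-format input: numeric fields separated by white space,
one case per line.

```python
import math


def parse_rows(lines):
    """INPUT DATA: split free-format lines into rows of floats, skipping blank lines."""
    for line in lines:
        fields = line.split()
        if fields:
            yield [float(field) for field in fields]


def extract_column(rows, index):
    """TRANSFORM: keep a single column (0-based index) from each row."""
    for row in rows:
        yield row[index]


def describe(values):
    """ANALYSIS: count, mean, and sample standard deviation (n - 1 denominator)."""
    data = list(values)
    n = len(data)
    if n == 0:
        raise ValueError("no data")
    mean = sum(data) / n
    if n > 1:
        sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    else:
        sd = 0.0
    return {"n": n, "mean": mean, "sd": sd}


def run(lines, column):
    """OUTPUT RESULTS: chain the three stages and format one summary line."""
    stats = describe(extract_column(parse_rows(lines), column))
    return "n = %d  mean = %.4f  sd = %.4f" % (stats["n"], stats["mean"], stats["sd"])


# Hypothetical sample data: two fields per line, one blank line to skip.
sample = ["1 10", "2 20", "", "3 30"]
print(run(sample, 1))  # prints: n = 3  mean = 20.0000  sd = 10.0000
```

Each stage consumes the output of the one before it and knows nothing
else about it, which is the property the pipeline design relies on.
Passing sys.stdin as the lines argument would make the same chain read
the standard input.
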
FEATURES
    easy to use (negligible training period)
    simple input formats (free format field oriented)
    used in pipelines with other UNIX System utilities (sort, vi)
    flexible data manipulation
    data validation provided (range and type checking)
    full documentation support (manual entries, tutorials)
    extensible (many modular C functions)
    faster than most packages (usually less than a second per analysis)
    small enough for micros (10-25K byte programs)
    runs on any UNIX System (V6, V7, 2.8BSD, 4BSD, III.0, System V, others)
    public domain software (can't be distributed for gain)
    in use at more than 300 UNIX System sites for five years

CHANGES FOR RELEASE 5.0 (March 5, 1985)
    reworked to increase portability, reliability, and usability
    all commands now use a standard option parser (getopt)
    all calculations are now done in double precision
    diagnostic error messages have been improved
    regress now does a partial correlation analysis
    colex and trans were added as alternatives for dm
    F ratio probabilities are now better approximated
    some inefficient input was optimized
    some non-portable features of C were replaced so that the
        programs now run under MSDOS on the IBM PC
    the random number seeding has been improved
    all programs now use a zero exit status on success
    version control was added--we are now at release 5.0

UNIX|STAT is Public Domain

The programs have been released to the public and are distributed to
anyone who wants them. Persons wanting to get a copy of the package
should contact me directly. You can get the package for free if you
send me a tape and a self-addressed prepaid return mailer. Or you can
send me personally $20 US to cover the costs of a tape and mailing.
The distribution includes:

    The C source files for all the programs.
    The documentation source files.
    A collection of test examples.

Contact:
    Gary Perlman
    Wang Institute of Graduate Studies
    Tyng Road
    Tyngsboro, MA 01879 USA
    (617) 649-9731

    uucp:  decvax!wanginst!perlman
           sdcsvax!sdcsla!perlman
    csnet: perlman@wanginst
    arpa:  sdcsla!perlman@nprdc

NOTES:
    UNIX|STAT is unsupported, though known bugs have been removed.
    UNIX|STAT may not be distributed for profit.
    UNIX|STAT is NOT a product of any company or organization.
    UNIX|STAT is distributed on a ``use-at-your-own-risk basis.''

UNIX|STAT(1)              UNIX User's Manual              UNIX|STAT(1)

NAME
    UNIX | STAT - compact data analysis programs

DESCRIPTION
    UNIX | STAT is a set of data manipulation and analysis programs
    developed at the University of California, San Diego. The programs
    are designed with the UNIX System philosophy that individual
    programs should be designed as tools that do one task well and
    produce output suitable for input via pipes to other programs.
    Interactive use is supported in the UNIX System shell which also
    provides a programming language for complex analyses.
    Functionality often built into statistical packages (e.g.,
    graphics, sorting and other data manipulation) is not re-invented
    in UNIX | STAT which delegates such responsibility to standard
    UNIX System tools.

DATA TRANSFORMATION PROGRAMS
    abut         join data files
    colex        column extraction
    dm           column oriented data manipulator
    io           control and monitor input and output
    maketrix     create matrix type file from free-form file
    perm         randomly permute lines in a file
    repeat       repeat a pattern or file
    reverse      reverse lines and characters
    series       print a series of numbers
    transpose    transpose matrix type file

ANALYSIS PROGRAMS
    anova        multi-factor anova with repeated measures
    calc         interactive algebraic modeling calculator
    critf/pof    F-ratio/probability conversion functions
    dataplot     flexible data plotting
    desc         descriptions, histograms, frequency tables
    dprime       signal detection d' and beta calculations
    oneway       one-way anova and t-test
    pair         paired data statistics, regression, plots
    regress      multivariate linear regression
    ts           time series analysis and plots
    validata     verify data file consistency
    vincent      time-series comparison

AUTHOR
    Gary Perlman (with the help of several others)

SEE ALSO
    sh(1), sort(1), uniq(1), sed(1), awk(1), grep(1), rm(1), cp(1),
    pr(1), ls(1), mv(1)

--
Gary Perlman  Wang Institute  Tyngsboro, MA 01879  (617) 649-9731
UUCP: decvax!wanginst!perlman  CSNET: perlman@wanginst