Path: utzoo!telly!attcan!utgpu!watmath!uunet!mcsun!sunic!tut!tukki!tarvaine
From: tarvaine@tukki.jyu.fi (Tapani Tarvainen)
Newsgroups: gnu.utils.bug
Subject: Re: Bugs in Gnu SED 1.06
Summary: a few more
Message-ID: <1400@tukki.jyu.fi>
Date: 29 Sep 89 07:45:04 GMT
References: <1397@tukki.jyu.fi>
Reply-To: tarvaine@tukki.jyu.fi (Tapani Tarvainen)
Distribution: gnu
Organization: University of Jyvaskyla, Finland
Lines: 198

In article <1397@tukki.jyu.fi> tarvaine@tukki.jyu.fi I wrote:
>This is a report of problems I encountered in porting Gnu SED (1.06)
>to MS-DOS.
...
>That was all: as far as I can tell it now works

Well, I was a bit hasty:  I've found some more bugs:

* Comments at end of file were not treated properly:  The vector->v_length
is incremented before comments are detected.  After junking the comment
a jump is made to a point after it has been incremented (where addresses
are processed), so there's no problem if a real command follows, but if
it's at EOF, v_length if left one too big.  I fixed this by decrementing
v_length after skipping the comment text and jumping to an earlier point
(where EOF and blank lines are skipped before incrementing v_length).

* 'r' reallocation of append buffer looked like this:

case 'r':
    {
        int n;
        char tmp_buf[1024];

        rewind(cur_cmd->x.io_file);
        while((n=fread(append.text+append.length,sizeof(char),
                append.alloc-append.length,cur_cmd->x.io_file))>0) {
            append.length += n;
            if(append.length==append.alloc) {
                append.text = ck_realloc(append.text, 
                    append.alloc + cur_cmd->x.cmd_txt.text_len);
                append.alloc += cur_cmd->x.cmd_txt.text_len;
            }
        }

There are two problems with this:
First, it is possible that append.length==append.alloc initially
('a' won't increase append if it gets exactly full).
Second, cur_cmd->x.cmd_txt.text_len doesn't contain anything useful
here (maybe the code was copied from 'a'?).

The first can be fixed by making it a do-loop.

The second I fixed by doubling the size of the buffer when it gets full.
After looking at how 'a' works I changed it do the same:  it used to
increase the buffer size just enough to hold the new line, which works
but is inefficient.  (This technique is used also in regex &c.  It is
quite efficient in general, but in tight memory situations can result
in unnecessary out-of-memory error.  Would it be a good idea to write
an allocation routine which is given minimum and maximum amount of
memory required, instead of an exact amount?)

Finally, tmp_buf isn't used for anything at all, so I removed it.
I also removed another unused variable ('file' in compile_file()).
(Yes, I know, a good compiler would not allocate memory for unused
variables - but I'm not going to port gcc to MS-DOS!)

To save space I'll put here diff from my previous version to current.
If you want diffs from 1.06 to this just let me know (preferably by
mail: news arrive here about one week old for the time being).

*** sed.old	Thu Sep 28 08:34:47 1989
--- sed.c	Fri Sep 29 08:47:29 1989
***************
*** 1,7 ****
  
  /*  GNU SED, a batch stream editor.
      Copyright (C) 1989, Free Software Foundation, Inc.
! 	Last changed by Tapani Tarvainen 28 September 1989
  
      This program is free software; you can redistribute it and/or modify
      it under the terms of the GNU General Public License as published by
--- 1,7 ----
  
  /*  GNU SED, a batch stream editor.
      Copyright (C) 1989, Free Software Foundation, Inc.
! 	Last changed by Tapani Tarvainen 29 September 1989
  
      This program is free software; you can redistribute it and/or modify
      it under the terms of the GNU General Public License as published by
***************
*** 49,54 ****
--- 49,58 ----
  	- fixed a bug in append_pattern_space
  	  (moved n==0 test to beginning of loop)
  	- fixed 's/any//' so it won't do ck_malloc(0)
+ 	- fixed 'r' reallocation of append buffer
+ 	- changed 'a' reallocation: double append buffer when it overflows
+ 	  instead of adding just enough for the new line
+ 	- fixed handling of comments at end of file
  	- changed error and usage messages slightly (print usage
  	  if no arguments, remove path from name in messages)
  	- moved some ANSI-routine replacements inside #ifndef __STDC__
***************
*** 457,463 ****
  compile_file(str)
  char *str;
  {
! 	FILE *file;
  	int ch;
  
  	prog_start=prog_cur=prog_end=0;
--- 461,467 ----
  compile_file(str)
  char *str;
  {
! /*	FILE *file;	*/
  	int ch;
  
  	prog_start=prog_cur=prog_end=0;
***************
*** 505,510 ****
--- 509,515 ----
  		vector->next_one = 0;
  	}
  	for(;;) {
+ 	skip_comment:
  		do ch=inchar();
  		while(ch!=EOF && (isspace(ch) || ch=='\n' || ch==';'));
  		if(ch==EOF)
***************
*** 523,529 ****
  		cur_cmd->aflags=0;
  		cur_cmd->cmd=0;
  
- 	skip_comment:
  		if(compile_address(&(cur_cmd->a1))) {
  			ch=inchar();
  			if(ch==',') {
--- 528,533 ----
***************
*** 547,552 ****
--- 551,557 ----
  				bad_prog(NO_ADDR);
  			do ch=inchar();
  			while(ch!=EOF && ch!='\n');
+ 			vector->v_length--;
  			goto skip_comment;
  		case '!':
  			if(cur_cmd->aflags & ADDR_BANG_BIT)
***************
*** 1085,1093 ****
  			break;
  
  		case 'a':
! 			if(append.alloc-append.lengthx.cmd_txt.text_len) {
! 				append.text=ck_realloc(append.text,append.alloc+cur_cmd->x.cmd_txt.text_len);
! 				append.alloc+=cur_cmd->x.cmd_txt.text_len;
  			}
  			bcopy(cur_cmd->x.cmd_txt.text,append.text+append.length,cur_cmd->x.cmd_txt.text_len);
  			append.length+=cur_cmd->x.cmd_txt.text_len;
--- 1090,1098 ----
  			break;
  
  		case 'a':
! 			while(append.alloc-append.lengthx.cmd_txt.text_len) {
! 				append.alloc *= 2;
! 				append.text=ck_realloc(append.text,append.alloc);
  			}
  			bcopy(cur_cmd->x.cmd_txt.text,append.text+append.length,cur_cmd->x.cmd_txt.text_len);
  			append.length+=cur_cmd->x.cmd_txt.text_len;
***************
*** 1243,1259 ****
  
  		case 'r':
  			{
! 				int n;
! 				/*char tmp_buf[1024];*/
  
  				rewind(cur_cmd->x.io_file);
! 				while((n=fread(append.text+append.length,sizeof(char),append.alloc-append.length,cur_cmd->x.io_file))>0) {
  					append.length += n;
  					if(append.length==append.alloc) {
! 						append.text = ck_realloc(append.text, append.alloc + cur_cmd->x.cmd_txt.text_len);
! 						append.alloc += cur_cmd->x.cmd_txt.text_len;
  					}
! 				}
  				if(ferror(cur_cmd->x.io_file))
  					panic("Read error on input file to 'r' command\n");
  			}
--- 1248,1263 ----
  
  		case 'r':
  			{
! 				int n=0;
  
  				rewind(cur_cmd->x.io_file);
! 				do {
  					append.length += n;
  					if(append.length==append.alloc) {
! 						append.alloc *= 2;
! 						append.text = ck_realloc(append.text, append.alloc);
  					}
! 				} while((n=fread(append.text+append.length,sizeof(char),append.alloc-append.length,cur_cmd->x.io_file))>0);
  				if(ferror(cur_cmd->x.io_file))
  					panic("Read error on input file to 'r' command\n");
  			}
-- 
Tapani Tarvainen    (tarvaine@tukki.jyu.fi, tarvainen@finjyu.bitnet)