Path: utzoo!attcan!uunet!mcvax!hp4nl!philmds!leo
From: leo@philmds.UUCP (Leo de Wit)
Newsgroups: comp.lang.c
Subject: Re: string manipulation
Keywords: squeeze, string
Message-ID: <820@philmds.UUCP>
Date: 26 Sep 88 17:47:50 GMT
References: <1061@lakesys.UUCP>
Reply-To: leo@philmds.UUCP (Leo de Wit)
Organization: Philips I&E DTS Eindhoven
Lines: 93

In article <1061@lakesys.UUCP> chad@lakesys.UUCP (Evil Iggy's illegitimate twin brother) writes:
>
>    For several applications it would become necessary to remove a small
>string from a larger one.  There appears to be an easy way to do this, and here
>is a sample piece of code I was testing :
>
>    char *squeeze(cs, ct)
>    const char cs[];
>    const char ct[];
>    {
>        register int i;
>        int j,x;
>
>        j = strcspn(cs, ct);
>        for (i = j; cs[i] != '\0'; i++)
>            cs[i] = cs[i + x];
>        return (char *)s;
>    }

Since you asked to post:

strcspn() does not do what you want it to. It returns the index of the
first character in cs[] that is equal to one of the characters in ct[].
Unless ct[] is exactly one character long, this is not what you want.

ANSI seems to support a function called strstr(), which might just return
a pointer to a substring within a string. I say 'might' because I have no
documentation on strstr() available, but the analogy with strchr() makes
me think so (can someone confirm this?). If you don't have strstr(), you
can roll your own; note that you generally don't need to examine all
characters of cs[] (ct[]) to recognize a match; there are very fast
algorithms, so check your literature (keywords: Boyer-Moore).

As for your function parameters, I think the declaration

    const char cs[];

should really be

    char cs[];

since the contents of cs[] (the characters of the array) get changed.
If I understand correctly, the substitution is meant to be in place
(I don't know where s is coming from, though...).
If not, you can of course malloc room for a copy within the function, or
supply a pointer to a buffer as a parameter to your function, which the
function then fills in.

Supposing the substitution is in place:

    #include <string.h>

    char *squeeze(cs, ct)
    char cs[];
    const char ct[];
    {
        char *s, *t;

        if ((t = strstr(cs,ct)) == (char *)0) {
            return (char *)0;
        }
        s = t + strlen(ct);
        if (t == cs) {
            return s;
        }
        memmove(t,s,strlen(s) + 1);  /* regions overlap, so not strcpy() */
        return cs;
    }

a) Note this version avoids doing a copy if ct[] is a prefix of cs[].
b) If we name the parts of cs[] A, B and C, where B matches ct[], this
   function copies C over B (unless A is empty, in which case a pointer
   to C is returned).

If C is longer than A, it is worthwhile to consider copying A over B
instead (in reverse, aligned on the end). It is questionable whether
determining the sizes involves too much overhead, so I'll give a version
for this case without testing which part is longer:

    #include <string.h>

    char *squeeze(cs, ct)
    char cs[];
    const char ct[];
    {
        char *s, *t;

        if ((t = strstr(cs,ct)) == (char *)0) {
            return (char *)0;
        }
        s = cs + strlen(ct);     /* new start: A moved up by the length of B */
        memmove(s,cs,t - cs);    /* memmove(dest,src,n): A lands just before C */
        return s;
    }

This is assuming memmove() knows how to deal with overlapping pieces of
memory (now where is that copy of the ANSI Draft...)

    Leo.
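
For the non-in-place alternative mentioned above (the caller supplies a buffer that the function fills), here is a minimal sketch. It is not from the original post: the name squeeze_copy, the outsize parameter, and the choice to return a null pointer when the buffer is too small are all assumptions made for illustration. It uses the same A/B/C naming as the text and leaves cs[] untouched.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the post): copy cs into out with the
 * first occurrence of ct removed.  Returns out, or NULL if ct does not
 * occur in cs or if out is too small.  cs is never modified. */
char *squeeze_copy(char *out, size_t outsize, const char *cs, const char *ct)
{
    const char *t = strstr(cs, ct);   /* start of the match, part B */
    size_t head, skip, tail;

    if (t == NULL)
        return NULL;

    skip = strlen(ct);                /* length of B */
    head = (size_t)(t - cs);          /* length of A, the text before B */
    tail = strlen(t + skip);          /* length of C, the text after B */

    if (head + tail + 1 > outsize)    /* A + C + terminating '\0' must fit */
        return NULL;

    memcpy(out, cs, head);                    /* copy A */
    memcpy(out + head, t + skip, tail + 1);   /* copy C and its '\0' */
    return out;
}
```

With cs = "hello world" and ct = "lo", strcspn(cs, ct) gives 2 (the first 'l'), while strstr(cs, ct) points at offset 3, where "lo" actually starts; squeeze_copy then produces "hel world". Since source and destination never overlap here, plain memcpy() is enough.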