From nobody Wed Nov 30 23:49:26 2005
From: Paul Eggert
Subject: Re: gawk: length return incorrect value when MB_CUR_MAX > 1
To: Hirofumi Saito
Cc: bug-gawk@gnu.org, KIMURA Koichi
Date: Wed, 30 Nov 2005 13:39:56 -0800

Hirofumi Saito writes:

> And then, I tried to use gawk 3.1.5 which I build with sarge.
>
> $ LANG=ja_JP.utf8 gawk 'BEGIN {print length("abc\0def")}'
> 7
> $ LANG=ja_JP.eucJP gawk 'BEGIN {print length("abc\0def")}'
> 3

Very strange.  I don't get this result with Debian sarge x86; instead,
I get 3 in both cases.  And that is what I would expect to get, given
the source code.  Perhaps your locales weren't all built?  (Also, I
set LC_ALL rather than LANG; that's safer.)

> By the way, I patched Kimura's patch, then:

Yes, his patch should work.  Here's a slightly more-efficient patch:

--- gawk-3.1.5.orig/node.c-bak	2005-07-26 11:07:43 -0700
+++ gawk-3.1.5/node.c	2005-11-30 13:33:44 -0800
@@ -749,9 +749,10 @@ str2wstr(NODE *n, size_t **ptr)
 		switch (count) {
 		case (size_t) -2:
 		case (size_t) -1:
-		case 0:
 			goto done;
+		case 0:
+			count = 1;
 		default:
 			*wsp++ = wc;
 			src_count -= count;
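
For anyone who wants to see the effect of the new "case 0" arm outside
gawk, here is a minimal standalone sketch.  It is not gawk code:
count_chars is a hypothetical helper written only for this
illustration, and it assumes a byte buffer with an explicit length.
It relies on the documented behavior of mbrtowc, which returns 0 when
the character it converts is the null character, so the caller has to
advance by one byte itself.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of gawk: count the characters in the
   first LEN bytes of S, continuing past embedded NUL bytes.  */
size_t
count_chars (const char *s, size_t len)
{
  mbstate_t mbs;
  size_t n = 0;

  memset (&mbs, 0, sizeof mbs);	/* start in the initial shift state */
  while (len > 0)
    {
      wchar_t wc;
      size_t count = mbrtowc (&wc, s, len, &mbs);

      switch (count)
        {
        case (size_t) -2:	/* incomplete multibyte sequence */
        case (size_t) -1:	/* invalid multibyte sequence */
          return n;
        case 0:
          count = 1;		/* a NUL was converted; it is one byte long */
          /* fall through */
        default:
          n++;
          s += count;
          len -= count;
        }
    }
  return n;
}
```

In the default "C" locale, count_chars ("abc\0def", 7) returns 7.  If
the "case 0" arm is changed to return immediately, as the unpatched
switch does with its "goto done", the same call returns 3.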