You would think that -[NSString length] returns the number of characters in the NSString, but it doesn’t. Instead, it returns the number of unichars, which are 16-bit (2 byte) UTF-16 code units. So as far as -[NSString length] is concerned, every Unicode character that takes 4 bytes to encode (the musical double flat, U+1D12B, for instance) counts as 2 characters, even though it’ll only be displayed as one character on screen.
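To make the arithmetic concrete, here is a minimal sketch in plain C (which also compiles inside an Objective-C file) of how UTF-16 represents a code point. The function names `utf16_units_for_code_point` and `utf16_surrogates` are made up for illustration and are not part of Foundation; the sketch assumes its input is a valid Unicode code point.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 16-bit UTF-16 code units (unichars) needed for one code point.
   Hypothetical helper: anything above U+FFFF needs a surrogate pair. */
size_t utf16_units_for_code_point(uint32_t cp) {
    return cp >= 0x10000 ? 2 : 1;
}

/* Split a code point above U+FFFF into its high and low surrogates.
   Assumes 0x10000 <= cp <= 0x10FFFF. */
void utf16_surrogates(uint32_t cp, uint16_t *high, uint16_t *low) {
    uint32_t v = cp - 0x10000;               /* 20 bits remain */
    *high = (uint16_t)(0xD800 + (v >> 10));  /* top 10 bits */
    *low  = (uint16_t)(0xDC00 + (v & 0x3FF)); /* bottom 10 bits */
}
```

For the double flat, `utf16_units_for_code_point(0x1D12B)` gives 2 and `utf16_surrogates` yields the pair 0xD834, 0xDD2B, which is why a string holding only that symbol reports a length of 2.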
This behavior isn’t a bug per se in -[NSString length], but it’s not what I was expecting. To count the number of Unicode characters (code points) in an NSString, use the following function adapted from AGRegex:
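#import <Foundation/Foundation.h>
#include <string.h>

// Counts Unicode code points by walking the string's UTF-8 bytes.
int utf8charcount(NSString *string) {
    const char *str = [string UTF8String];
    if (str == NULL) return 0;      // nil string or failed conversion
    size_t pos, len = strlen(str);  // str is a C string, so use strlen, not -length
    int chars = 0;
    unsigned char c;
    for (pos = 0; pos < len; pos++) {
        c = (unsigned char)str[pos];
        // Count single-byte characters and lead bytes; skip continuation bytes (0x80-0xbf).
        if (c <= 0x7f || (0xc0 <= c && c <= 0xfd))
            chars++;
    }
    return chars;
}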
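The same counting idea can be tried without Foundation. The sketch below is a plain C variant that takes a NUL-terminated UTF-8 string directly; the name `utf8_code_points` is hypothetical, and the code assumes the input is well-formed UTF-8. It tests for continuation bytes with a bit mask rather than the range comparison above. Note that it counts code points, so a letter followed by a combining accent still counts as two.

```c
#include <stddef.h>

/* Count Unicode code points in a NUL-terminated, well-formed UTF-8 string.
   Hypothetical helper: every byte that is not a continuation byte
   (bit pattern 10xxxxxx) starts a new code point. */
size_t utf8_code_points(const char *str) {
    size_t count = 0;
    if (str == NULL)
        return 0;
    for (; *str != '\0'; str++) {
        unsigned char c = (unsigned char)*str;
        if ((c & 0xC0) != 0x80)  /* not a continuation byte */
            count++;
    }
    return count;
}
```

Passing the UTF-8 bytes of the double flat, `"\xF0\x9D\x84\xAB"`, returns 1, whereas -[NSString length] on the same text reports 2.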