I'm using an old Objective-C routine (let's call it oldObjectiveCFunction) that parses a String by analyzing each char. After analyzing the chars, it splits that String into substrings and returns them in an array called functions. This is a heavily reduced sample of how that old function parses the String:

NSMutableArray *functions = [NSMutableArray new];
NSMutableArray *components = [NSMutableArray new];
NSMutableString *sb = [NSMutableString new];
char c;
int sourceLen = source.length;
int index = 0;

while (index < sourceLen) {
    c = [source characterAtIndex:index];
    // here do some random work analyzing the char
    [sb appendString:[NSString stringWithFormat:@"%c",c]];
    if (some condition){
        [components addObject:(NSString *)sb];
        sb = [NSMutableString new];
        [functions addObject:[components copy]];
    }
    index++; // advance to the next character so the loop terminates
}

Later, I read each String in functions with this Swift code:

let functions = oldObjectiveCFunction(string) as? [[String]]
functions?.forEach({ (function) in
    var functionCopy = function.map { $0 }
    for index in 0..<functionCopy.count {
       let string = functionCopy[index]
    }
})

The problem is that it works perfectly with normal strings, but not if the String contains Russian text, like this:

РАЦИОН

The output (the content of my let string variable) is this:

 \u{10}&\u{18}\u{1e}\u{1d}

How can I get the same Russian string instead of that?

I tried doing this:

let string2 = String(describing: string?.cString(using: String.Encoding.utf8))

but it returns an even stranger result:

"Optional([32, 16, 38, 24, 30, 29, 0])" 
  • Do the strings look ok if you try to consume them in Objective-C instead of Swift?
    – Cristik
    Commented Feb 12, 2021 at 17:19
  • I'd guess that %c means an 8-bit unsigned character (unsigned char); try the %C specifier for a 16-bit UTF-16 code unit (unichar).
    – JosefZ
    Commented Feb 12, 2021 at 17:41
  • @JosefZ it looks like you're correct about the %c vs. %C. You should post that as an answer, since an 8-bit unsigned character can't possibly hold Cyrillic characters.
    – Duncan C
    Commented Feb 12, 2021 at 17:49
  • Does this answer your question? What are the supported Swift String format specifiers?
    – JosefZ
    Commented Feb 12, 2021 at 17:52
  • Declare unichar c; instead of char c; at the 4th line (sorry, I don't speak Swift or Objective-C, so I'm not sure about the correct syntax).
    – JosefZ
    Commented Feb 13, 2021 at 9:19

2 Answers

Analysis. Sorry, I don't speak Swift or Objective-C, so the following example is given in Python; however, the 4th and 5th columns (code point reduced to 8 bits) match the weird numbers in your question.

for ch in 'РАЦИОН':
    print(ch,                             # character itself
          ord(ch),                        # code point in decimal
          '{:04x}'.format(ord(ch)),       # code point in hexadecimal
          (ord(ch)&0xFF),                 # code point reduced to 8 bits, decimal
          '{:02x}'.format(ord(ch)&0xFF))  # code point reduced to 8 bits, hexadecimal

Output:

Р 1056 0420 32 20
А 1040 0410 16 10
Ц 1062 0426 38 26
И 1048 0418 24 18
О 1054 041e 30 1e
Н 1053 041d 29 1d

Solution. Hence, you need to fix every place in your code that reduces a 16-bit value to 8 bits:

  • first, declare unichar c; instead of char c; at the 4th line,
  • then use [sb appendString:[NSString stringWithFormat:@"%C",c]]; at the 11th line.

Note the Latin Capital Letter C in the %C specifier (16-bit UTF-16 code unit, unichar) instead of the Latin Small Letter C in the %c specifier (8-bit unsigned character, unsigned char).
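
To see the effect of the two changes side by side, here is a minimal Python sketch. It only simulates what the Objective-C loop does and is not the real NSString API: the helper names (utf16_code_units, append_as_char, append_as_unichar) are made up for illustration, and it assumes that storing a unichar in a char keeps just the low 8 bits, which holds for this sample because every low byte is below 0x80.

```python
import struct

def utf16_code_units(text):
    """Return the 16-bit UTF-16 code units of text, similar to what
    characterAtIndex: hands back one at a time."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

def append_as_char(units):
    """Simulate `char c` plus %c: only the low 8 bits of each unit survive."""
    return "".join(chr(u & 0xFF) for u in units)

def append_as_unichar(units):
    """Simulate `unichar c` plus %C: each 16-bit unit is kept intact."""
    data = struct.pack("<%dH" % len(units), *units)
    return data.decode("utf-16-le")

units = utf16_code_units("РАЦИОН")
broken = append_as_char(units)     # the garbled string from the question
fixed = append_as_unichar(units)   # the original Russian string

print(repr(broken))
print(fixed)
```

The truncated variant reproduces the space, \u{10}, &, \u{18}, \u{1e}, \u{1d} sequence from the question, while the 16-bit variant round-trips the original text.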

Resources. My answer is based on answers to the following questions at SO:

Your last result is not strange. The optional comes from the string?, and the cString() function returns an array of CChar ( Int8 ).
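
As a rough check of that claim, the Python sketch below (an illustration only, not the Swift API) takes the already garbled string from the question, encodes it as UTF-8, and appends the terminating 0 that a C string carries. The variable names are assumptions chosen for this example.

```python
# The garbled string as it arrives in Swift: space, 0x10, '&', 0x18, 0x1e, 0x1d
garbled = " \x10&\x18\x1e\x1d"

# UTF-8 bytes plus the NUL terminator of a C string
c_string = list(garbled.encode("utf-8")) + [0]

print(c_string)  # [32, 16, 38, 24, 30, 29, 0]
```

The numbers match the array in the question, so the conversion to a C string is faithful; the data was already damaged before it reached Swift.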

I think the problem comes from here - but I'm not sure because the whole thing looks confusing:

[sb appendString:[NSString stringWithFormat:@"%c",c]];

Have you tried:

[sb appendString: [NSString stringWithCString:c encoding:NSUTF8StringEncoding]];

instead of stringWithFormat?

(Using %C instead of %c, as proposed by the commenters, looks like a good idea too.) Oops, I just saw that you have tried it without success.

  • can you share how the line should be? I don't know how to deal with Objective C @Moose Commented Feb 12, 2021 at 18:29
  • your code gives me two errors: Use of undeclared identifier 'cString' and Unexpected interface name 'NSString': expected expression Commented Feb 12, 2021 at 18:47
  • :D sorry, I have mixed with swift ! I have get so used to it! I fix
    – Moose
    Commented Feb 12, 2021 at 18:49
  • did you fixed? it gives me exactly the same two errors, infact, it's the same line of code, you don't edited it? Commented Feb 12, 2021 at 18:51
  • oh @Moose now it's edited, but now gives me this error: Use of undeclared identifier 'encoding' Commented Feb 12, 2021 at 18:54
