International characters in printf()-format strings

Is there anything I can do about this problem:


#include <stdio.h>


int main ( int argc, char* argv[] )
{
    printf("ÄÖÜ %d\n", 123);

    printf("\xC4\xD6\xDC %d\n", 123);

    return 0;
}

prints

ÄÖÜ
ÄÖÜ

but no number.

I checked this little sample with a few gcc/glib versions on other
OSes, and it worked on every platform, including QNX4.

TIA – and desperately looking for a solution,

Karsten.


Karsten Hoffmann                    MBS-GmbH
Karsten.Hoffmann@mbs-software.de    Roemerstrasse 15
Phone : +49-2151-7294-38            D-47809 Krefeld
Fax   : +49-2151-7294-50
Mobile: +49-172-3812373

Whatever characters you're trying to display are being interpreted in
some goofy way. (In fact, as I’m writing this followup, ped won’t display
them at all.) I wonder if the compiler is interpreting them as trigraphs.
I know that no one uses them any more, but I think the compiler still
supports them.

Just curious. Try this:
printf( "%c%c%c %d\n", 0xC4, 0xD6, 0xDC, 123 );


Karsten.Hoffmann@mbs-software.de wrote:
: Is there anything I can do about this problem:


Here’s a note that we added to the docs for printf():


If the format string contains invalid multibyte characters,
processing stops, and the rest of the format string, including the
% characters, is printed. This can happen, for example, if you
specify accents and diacritical marks using ISO 8859-1
instead of UTF-8. If you call:

setlocale( LC_CTYPE, "C-TRADITIONAL" );

before calling printf(), the locale switches multibyte processing
from UTF-8 to 1-to-1, and printf() safely transfers the
misformed multibyte characters.


Steve Reid stever@qnx.com
TechPubs (Technical Publications)
QNX Software Systems

Steve Reid <stever@qnx.com> wrote:
: setlocale( LC_CTYPE, "C-TRADITIONAL" );

That did the trick!

I searched for the wrong keywords (‘8-bit’, ‘international
characters’), so I didn’t find it.


The trap was that the default behaviour changed.

I wonder why the C libraries in, e.g., a Linux environment do the
right conversion when the C locale is set?

I couldn’t even find ‘C-TRADITIONAL’ in my Linux man pages.
Is it a QNX extension?


Karsten.Hoffmann@mbs-software.de wrote:
: That did the trick!

Great!

: I searched for the wrong keywords (‘8-bit’, ‘international
: characters’), so I didn’t find it.

Maybe we should add “international characters” to the note. I guess
“diacritical” isn’t quite everyday language. :-)

: The trap was that the default behaviour changed.

: I wonder why the C libraries in, e.g., a Linux environment do the
: right conversion when the C locale is set?

: I couldn’t even find ‘C-TRADITIONAL’ in my Linux man pages.
: Is it a QNX extension?

I think (but others might know better) that it’s one of the changes
that came with the Dinkum libraries. I’ve heard that the behaviour will
change in a future version so that these characters will work again
without your having to use C-TRADITIONAL.


Steve Reid stever@qnx.com
TechPubs (Technical Publications)
QNX Software Systems