UTF-8 vs UCS-2, UTF-1, etc.

Hi

I need some clarification on which Photon functions expect which string encoding.

Pt* functions always use UTF-8.
Pf* functions seem to want UCS-2 strings.
Pg* functions are very confusing. Quoting the docs for PgDrawText:


By default, the function assumes that all strings consist of multibyte
characters that conform to the ISO/IEC 10646-1 UTF-1 multibyte format.
However, if Pg_TEXT_WIDECHAR is set, the function assumes each character
is represented by 2 bytes that conform to the ISO/IEC 10646-1 UCS-2
double-byte format.

UTF-1 ???

What is UTF-1? I could barely find any information about it.
It seems to be very old and deprecated a long time ago. Is this a typo for
UTF-16?
I’d love to have this explained.
Thanks

/Johan

phearbear <phearbear@home.se> wrote:

Hi

I need some clarification on which Photon functions expect which string encoding.

Pt* functions always use UTF-8.
Pf* functions seem to want UCS-2 strings

No, we prefer you use the routines which use UTF-8, e.g.
PfExtentText. The UTF-2, and UTF-4 (6.2 and post) are there
for convenience only. If at all possible, use routines that
want UTF-8. This will result in way less headaches, since
all the widget API expects UTF-8 multi-byte.

Pg* functions are very confusing. Quoting the docs for PgDrawText:

Use a Pf equivalent if one exists: instead of PgExtentText, use
PfExtentText. Most of the Pg string manipulation routines are there
for backwards API compatibility.


By default, the function assumes that all strings consist of multibyte
characters that conform to the ISO/IEC 10646-1 UTF-1 multibyte format.
However, if Pg_TEXT_WIDECHAR is set, the function assumes each character
is represented by 2 bytes that conform to the ISO/IEC 10646-1 UCS-2
double-byte format.

UTF-1 ???

UTF-1 is an old multi-byte encoding of ISO/IEC 10646 that predates UTF-8
and has long been obsolete, if I am correct. I just took a quick read.
The routine will actually take UTF-8 multi-byte strings, so in reality it is
just a holdover in the documentation. I will inform the appropriate people
to change it to UTF-8, after I get a chance to verify the code.
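
To make "multi-byte" concrete, here is a minimal sketch in plain C (no Photon calls). The helper name utf8_char_count is hypothetical, and it assumes well-formed UTF-8 input. It shows that the byte length of a UTF-8 string and its character count can differ, which matters whenever a routine takes a length argument:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Photon.
 * Counts the characters (code points) in a NUL-terminated UTF-8 string.
 * Continuation bytes have the bit pattern 10xxxxxx, so every byte that
 * does NOT match that pattern starts a new character.
 * Assumes well-formed UTF-8; no validation is performed. */
size_t utf8_char_count(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For example, the three bytes 0xE2 0x82 0xAC (the euro sign) count as one character, while strlen() reports three bytes for the same string.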

What is UTF-1? I could barely find any information about it.
It seems to be very old and deprecated a long time ago. Is this a typo for
UTF-16?
I’d love to have this explained.
Thanks

/Johan

dleach@qnx.com wrote:

phearbear <phearbear@home.se> wrote:
Hi

I need some clarification on which Photon functions expect which string encoding.

Pt* functions always use UTF-8.
Pf* functions seem to want UCS-2 strings

No, we prefer you use the routines which use UTF-8, e.g.
PfExtentText. The UTF-2, and UTF-4 (6.2 and post) are there

This should be UTF-16 and UTF-32, sorry.
I got confused with UCS-2 and UCS-4.

for convenience only. If at all possible, use routines that
want UTF-8. This will result in way less headaches, since
all the widget API expects UTF-8 multi-byte.


phearbear <phearbear@home.se> wrote:

dleach@qnx.com wrote:
phearbear <phearbear@home.se> wrote:

[snip]
/Johan



Thanks a lot.

As mentioned before, I should have said UCS-2 (or UTF-16) above, not UTF-2,
and UCS-4 (or UTF-32), not UTF-4.


phearbear <phearbear@home.se> wrote:

dleach@qnx.com wrote:
phearbear <phearbear@home.se> wrote:

dleach@qnx.com wrote:

phearbear <phearbear@home.se> wrote:
[snip]
As mentioned before, I should have said UCS-2 (or UTF-16) above, not UTF-2,
and UCS-4 (or UTF-32), not UTF-4.


Heh, I read it as UCS-4 and UCS-2, respectively, without even noticing the error :)

By the way, I can’t seem to find a Pf function to actually draw a string. Or
am I supposed to use the Pg function for that?

You would use PgDrawText to actually draw a string.

The PfRenderText function seems a bit more complex than what I actually
want…

Yes, way too complex for what you want. PfRenderText is mainly
used by drivers and low-level libraries.


dleach@qnx.com wrote:


You would use PgDrawText to actually draw a string.

This one only accepts UTF-1, or UCS-2 if a flag is set. I looked in the
Pg.h file, and there is a Pg_TEXT_VWIDECHAR define, which claims to be
UTF-4 (hmm, that really should be UCS-4; I think someone made a boo-boo :P).
The same goes for Pg_TEXT_WIDECHAR: the docs say this means UCS-2, but the
header says UTF-2 ;). Unfortunately, Pg_TEXT_VWIDECHAR is commented out.
Is it safe to use it (latest beta)? Will it be there in the final 6.2?
[snip]

OK, for PgDrawText, here is what it is:

no flags - UTF-8 multi-byte
Pg_TEXT_WIDECHAR - UTF-16 (UCS-2) 16-bit wide characters

Currently there is no roadmap plan for UTF-32 (UCS-4) 32-bit wide characters
in the PgXX routines, though I have heard rumblings. It is not
safe to use the Pg_TEXT_VWIDECHAR flag. It is best to stick
with multi-byte if at all possible, since the widget library
does not use wide characters of any type.

In 6.2, there are routines PfExtent/PfExtentCx which
can process all three types (multi-bytes, 16-bit wide characters,
and 32-bit wide characters).
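
If you already hold 16-bit wide strings and want to stay on the multi-byte path recommended above, one option is to convert them to UTF-8 before handing them to the library. The following is only a sketch in plain C: ucs2_to_utf8 is a hypothetical helper, not a Photon routine, and it treats the input strictly as UCS-2 (surrogate values are rejected rather than combined).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Photon.
 * Converts a 0-terminated UCS-2 string to NUL-terminated UTF-8.
 * Returns the number of bytes written (excluding the NUL), or
 * (size_t)-1 if the buffer is too small or a surrogate is found. */
size_t ucs2_to_utf8(const uint16_t *in, char *out, size_t out_size)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    for (; *in != 0; ++in) {
        uint16_t c = *in;
        size_t need;

        if (c >= 0xD800 && c <= 0xDFFF)
            return (size_t)-1;          /* surrogates are not UCS-2 characters */

        need = (c < 0x80) ? 1 : (c < 0x800) ? 2 : 3;
        if (n + need >= out_size)       /* always keep room for the NUL */
            return (size_t)-1;

        if (need == 1) {
            out[n++] = (char)c;
        } else if (need == 2) {
            out[n++] = (char)(0xC0 | (c >> 6));
            out[n++] = (char)(0x80 | (c & 0x3F));
        } else {
            out[n++] = (char)(0xE0 | (c >> 12));
            out[n++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (c & 0x3F));
        }
    }
    if (n >= out_size)
        return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

A UCS-2 character needs at most three UTF-8 bytes, so a buffer of 3 * length + 1 bytes is always large enough for this conversion.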

For 6.3, I have changed the header file to be:

#define Pg_TEXT_UTF16CHAR (0x04 << 8)

/*
 * Fix a misunderstanding.
 *
 * It is either UTF-8, UTF-16 (UCS2), or UTF-32 (UCS2)
 */

#define Pg_TEXT_UTF2CHAR Pg_TEXT_UTF16CHAR /* This is wrong, but history prevails. */
/*
#define Pg_TEXT_UTF32CHAR (0x02 << 8)*/

#define Pg_TEXT_WIDECHAR (Pg_TEXT_UTF16CHAR)
/*#define Pg_TEXT_VWIDECHAR (Pg_TEXT_UTF32CHAR) */
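
As a rough illustration of how those defines select an encoding, here is a small stand-alone sketch. It copies the two active defines from the listing above, assuming the shift count is 8, and adds a hypothetical helper (encoding_for_flags is not a Photon function) that mirrors the "no flags means UTF-8, Pg_TEXT_WIDECHAR means 16-bit" rule described earlier:

```c
/* Values copied from the header excerpt in this thread.
 * The shift count of 8 is an assumption. */
#define Pg_TEXT_UTF16CHAR (0x04 << 8)
#define Pg_TEXT_WIDECHAR  (Pg_TEXT_UTF16CHAR)

/* Hypothetical names for illustration only; not part of Photon. */
enum text_encoding { ENC_UTF8, ENC_UTF16 };

/* No wide-character flag set: UTF-8 multi-byte.
 * Pg_TEXT_WIDECHAR set: 16-bit wide characters. */
enum text_encoding encoding_for_flags(int flags)
{
    return (flags & Pg_TEXT_WIDECHAR) ? ENC_UTF16 : ENC_UTF8;
}
```

Because Pg_TEXT_UTF2CHAR and Pg_TEXT_WIDECHAR are both aliases of Pg_TEXT_UTF16CHAR, code using either of the old names would select the same bit.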

Thank You.