C - Sending Unicode over TCP sockets: what about endianness?


I read a string of Unicode symbols (UTF-8) in C. Some of the characters I read are stored in 3 bytes. Since these characters can't be stored in a single byte, I'm worried about their endianness when they are sent over a TCP socket using the functions write and read. Do I need to do anything in particular with them to make sure the machine that reads the stream interprets these Unicode characters correctly?

Send it as a byte array. Endianness should not be an issue for UTF-8 encoded strings, since UTF-8 is byte oriented. Endianness matters, for example, when you have 2 bytes and need to interpret them as a single value. If you only have to interpret these 2 bytes individually, endianness is not an issue.
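To make the distinction concrete, here is a minimal sketch in C. The function names (pack_utf8, pack_u16_be, unpack_u16_be) are made up for illustration and are not from the original answer. The UTF-8 bytes are copied as they are, while a 16-bit integer is written in an explicitly chosen byte order (big-endian here) so that both ends agree on it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy UTF-8 text into a send buffer unchanged.
 * Each byte is its own code unit, so there is nothing to swap.
 * The caller must make sure `out` has room for strlen(utf8) bytes.
 * Returns the number of bytes written. */
size_t pack_utf8(unsigned char *out, const char *utf8)
{
    size_t n = strlen(utf8);
    memcpy(out, utf8, n);
    return n;
}

/* Hypothetical helper: a 16-bit value is two bytes interpreted as one
 * number, so the byte order has to be fixed. Most significant byte first. */
void pack_u16_be(unsigned char *out, uint16_t v)
{
    out[0] = (unsigned char)(v >> 8);
    out[1] = (unsigned char)(v & 0xFF);
}

/* Hypothetical helper: rebuild the 16-bit value on the receiving side,
 * independent of the host's native byte order. */
uint16_t unpack_u16_be(const unsigned char *in)
{
    return (uint16_t)(((unsigned)in[0] << 8) | (unsigned)in[1]);
}
```

For example, the euro sign is the three bytes E2 82 AC in UTF-8, and those same three bytes in that same order are what you would pass to write and get back from read on any machine. If you prefix the string with a 16-bit length, that length is the part that needs a fixed byte order; on POSIX systems htons and ntohs from <arpa/inet.h> are the usual alternative to the manual shifts shown here.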

More info: http://unicode.org/faq/utf_bom.html

Q: Is the UTF-8 encoding scheme the same irrespective of whether the underlying processor is little endian or big endian?

A: Yes. Since UTF-8 is interpreted as a sequence of bytes, there is no endian problem as there is for encoding forms that use 16-bit or 32-bit code units. Where a BOM is used with UTF-8, it is only used as an encoding signature to distinguish UTF-8 from other encodings; it has nothing to do with byte order. [AF]
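If the sender might prepend that signature, the receiver can check for it and skip it. The sketch below assumes the data has already been read into a buffer; the function name utf8_bom_length is hypothetical. The UTF-8 signature is the fixed byte sequence EF BB BF.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 3 if buf starts with the UTF-8 signature
 * (EF BB BF), otherwise 0. The result is the number of leading bytes
 * the caller can skip before treating the rest as text. */
size_t utf8_bom_length(const unsigned char *buf, size_t len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };

    if (len >= 3 && memcmp(buf, bom, sizeof bom) == 0)
        return 3;
    return 0;
}
```

The check is a plain byte comparison, so it behaves the same on little-endian and big-endian hosts.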

