html - Is UTF-8 a character set or an encoding?
From what I understand, Unicode is a character set containing all possible characters in all languages. UTF-8 is a way to represent each of those characters in memory. If that's the case, why do we put:
<meta charset="utf-8">
and not
<meta encoding="utf-8">
in an HTML document to indicate that it uses the UTF-8 encoding?
<meta charset="foo"> is a mostly-compatible-by-luck abbreviation of the original HTML 2.0 <meta http-equiv="content-type" content="text/html; charset=foo"> construct. meta http-equiv is used (in a limited way) to smuggle HTTP headers inside an HTML document, so this construct is equivalent to setting charset=foo on the Content-Type header of the enclosing HTTP response.
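To make the header side of this concrete, here is a minimal sketch in Python of pulling the charset parameter out of a Content-Type value. It assumes the standard library's email package is an acceptable stand-in for a real HTTP client's header parsing; the helper name charset_from_content_type is hypothetical, not part of any library.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, or None.

    Hypothetical helper for illustration: it reuses the MIME header parser
    from the standard library's email package.
    """
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() reads the "charset" parameter and lowercases it;
    # it returns None when the parameter is absent.
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

The same value string is what goes in the content attribute of the meta http-equiv form, so one parser covers both places.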
The Content-Type HTTP header is taken from the MIME standard used for e-mail (RFC 2045, RFC 1341). That standard called the parameter charset because it predates Unicode. In those days, ISO-8859-1, cp1251 et al. were considered separate character sets. When Unicode came along, it reformulated them as encoded subsets of the One True Character Set.

Now that the web has standardised on Unicode (actually UTF-16 code units, more's the pity) as its character model, it is indeed more accurate to describe UTF-8 as an encoding. The name charset has stuck because there has been no pressing need to fix it.
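The distinction between the character set and its encodings can be seen directly in code. The following is a small Python sketch, not anything from the HTML or MIME specs: the helper name describe is made up for illustration, and the two sample characters are arbitrary choices.

```python
def describe(ch: str) -> dict:
    """Show one character's Unicode code point and how encodings serialise it.

    Hypothetical helper for illustration only.
    """
    return {
        # The code point is the character's identity in the character set.
        "code_point": f"U+{ord(ch):04X}",
        # An encoding turns that code point into bytes.
        "utf-8": ch.encode("utf-8"),
        # UTF-16 code units: each unit is two bytes in utf-16-le.
        "utf-16 code units": len(ch.encode("utf-16-le")) // 2,
    }


e_acute = describe("\u00e9")        # LATIN SMALL LETTER E WITH ACUTE
grinning = describe("\U0001F600")   # GRINNING FACE, outside the 16-bit range

print(e_acute)
print(grinning)

# The same character has different bytes under a legacy "charset":
print("\u00e9".encode("iso-8859-1"))  # one byte
print("\u00e9".encode("utf-8"))       # two bytes
```

One code point, several byte sequences: that is the sense in which UTF-8 and ISO-8859-1 are encodings rather than character sets. The second sample needs two UTF-16 code units, which is why a string length counted in code units can differ from the number of characters.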