UTF-8 stands for Unicode Transformation Format-8. It is a lossless, octet-oriented (8-bit) encoding of Unicode characters.
UTF-8 encodes each Unicode character as a sequence of 1 to 4 octets, where the number of octets depends on the code point (the integer value) assigned to the character. It is an efficient encoding for Unicode documents that consist mostly of US-ASCII characters, because it represents each character in the range U+0000 through U+007F as a single octet. UTF-8 is the default encoding for XML.
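The variable-length behavior can be checked with a short Python sketch. The sample characters and the helper name utf8_length are chosen here for illustration only; each sample falls in a different length class.

```python
# Sample characters picked for illustration, one per UTF-8 length class.
samples = ["A", "é", "€", "😀"]


def utf8_length(ch: str) -> int:
    """Return the number of octets UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    # Prints the code point and the octet count: 1, 2, 3, and 4 respectively.
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} octet(s)")

# The encoding is lossless: decoding the octets gives back the original text.
text = "naïve café"
round_tripped = text.encode("utf-8").decode("utf-8")
print(round_tripped == text)
```

The boundary between one and two octets sits at U+007F, which is why mostly-ASCII documents stay compact.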
The MIME character set attribute for UTF-8 is UTF-8. Character set names are case-insensitive, so utf-8 is equally valid [IANA Character Sets].
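As a rough illustration of case-insensitive matching, Python's own codec registry treats the different spellings of the name as the same codec. This shows Python's behavior, not the MIME rules themselves; the list of spellings is just an example.

```python
import codecs

# Different spellings of the same character set name (illustrative).
labels = ["UTF-8", "utf-8", "Utf-8"]

# codecs.lookup() normalizes the name, so every spelling resolves to one codec.
canonical = {codecs.lookup(label).name for label in labels}
print(canonical)

# Decoding the same octets gives the same text under any spelling.
octets = "€".encode("utf-8")
decoded = [octets.decode(label) for label in labels]
print(decoded)
```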
In an HTML file, place this tag inside <head> ... </head>:
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
In an XML prolog, the encoding is typically specified as a pseudo-attribute of the XML declaration:
<?xml version="1.0" encoding="UTF-8" ?>
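A minimal sketch of how a parser consumes such a declaration, using Python's standard xml.etree.ElementTree module. The greeting element and its text are made up for the example.

```python
import xml.etree.ElementTree as ET

# A tiny document (hypothetical content) that starts with the declaration above.
document = '<?xml version="1.0" encoding="UTF-8" ?><greeting>café €</greeting>'

# Serialize to octets the way the file would be stored on disk.
xml_bytes = document.encode("utf-8")

# The parser reads the declared encoding and decodes the octets accordingly.
root = ET.fromstring(xml_bytes)
print(root.tag, root.text)
```

The declared encoding has to match the octets actually written; a mismatch typically surfaces as a parse error or as garbled text.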
In the Apache server configuration or an .htaccess file, this directive adds the charset parameter to the Content-Type HTTP header for text/html and text/plain content:
AddDefaultCharset UTF-8
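On the receiving side, a client reads the charset parameter from the Content-Type header to decide how to decode the body. The sketch below parses such a header with Python's standard email.message module; the header value and body are assumed examples of what a server configured this way might send, not captured output.

```python
from email.message import Message

# Assumed header value for an HTML response served with the directive above.
header_value = "text/html; charset=UTF-8"

msg = Message()
msg["Content-Type"] = header_value

# get_content_charset() returns the charset parameter, lowercased.
media_type = msg.get_content_type()
charset = msg.get_content_charset()

# Decode a sample body (hypothetical) with the advertised charset.
body = "café".encode("utf-8")
print(media_type, charset, body.decode(charset))
```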