Turn HTML character references into their plain-text Unicode equivalents.
Handles the complete character set defined in the HTML 4.01 recommendation
and all reference types (decimal, hex, and entity).
Correctly converts the following formats:
&Entity; - (Example: &amp; becomes &) case sensitive
&#Decimal; - (Example: &#68; becomes D)
&#xHex; - (Example: &#xE5; becomes å) case insensitive
Handles malformed character references gracefully: when one is
encountered, its original characters are copied to the output unchanged.
Reference:
http://www.w3.org/TR/html4/sgml/entities.html
Parameters: input - the (escaped) input string
Returns: the unescaped string
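
Example (illustrative only): the behaviour described above could be sketched
in Python roughly as follows. The function name unescape_html and the regular
expression are assumptions made for this sketch, not the documented
implementation; the entity table comes from Python's standard
html.entities.name2codepoint, which covers the HTML 4.01 named entities.

```python
import re
from html.entities import name2codepoint  # HTML 4.01 named entities

# Matches &#xHex; (x is case insensitive), &#Decimal; and &Entity;.
# Anything that does not fit this shape is never touched.
_REFERENCE = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def unescape_html(text: str) -> str:
    """Hypothetical helper: replace character references with Unicode text."""

    def replace(match: re.Match) -> str:
        body = match.group(1)
        try:
            if body[0] == "#":
                if body[1] in "xX":
                    code = int(body[2:], 16)  # hex digits, either case
                else:
                    code = int(body[1:], 10)  # decimal digits
            else:
                code = name2codepoint[body]   # entity names are case sensitive
            return chr(code)
        except (KeyError, ValueError, OverflowError):
            # Unknown entity or out-of-range code point: copy the input as is.
            return match.group(0)

    return _REFERENCE.sub(replace, text)
```

With this sketch, unescape_html("&amp; &#68; &#xE5;") yields "& D å", while
inputs such as "&bogus;" or "&AMP;" are returned unchanged.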