Unicode System in Java

Unicode System :

Unicode is a universal, international character encoding standard capable of representing the characters of most of the world's written languages.

Why does Java use the Unicode System?
Before Unicode, there were many character encoding standards, each tied to particular languages:

  • ASCII : American Standard Code for Information Interchange, used in the United States.
  • ISO 8859-1 : Used for Western European languages.
  • KOI-8 : Used for Russian.
  • GB18030 and BIG-5 : Used for Chinese, and so on.

These language-specific standards gave rise to two problems:

  • A particular code value corresponds to different letters in different standards.
  • The encodings for languages with large character sets have variable length: some common characters are encoded as a single byte, while others require two or more bytes.
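
Both problems can be observed from Java itself. The following is a minimal sketch, not part of the original text: the class name EncodingProblems and its two helper methods are made up for illustration, and it assumes the JDK in use ships the KOI8-R and GB18030 charsets (standard JDK builds normally do, but only a few charsets such as ISO-8859-1 are guaranteed on every Java platform).

```java
import java.nio.charset.Charset;

class EncodingProblems {

    // Decode one byte value under the named charset and return the resulting character.
    static char decodeByte(int value, String charsetName) {
        byte[] bytes = { (byte) value };
        return new String(bytes, Charset.forName(charsetName)).charAt(0);
    }

    // Count how many bytes the named charset needs to encode the given text.
    static int encodedLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        // Problem 1: the same code value (0xC1) is a different letter in each standard.
        System.out.printf("ISO-8859-1: U+%04X%n", (int) decodeByte(0xC1, "ISO-8859-1")); // U+00C1, Latin capital A with acute
        System.out.printf("KOI8-R:     U+%04X%n", (int) decodeByte(0xC1, "KOI8-R"));     // U+0430, Cyrillic small a

        // Problem 2: GB18030 is variable length.
        System.out.println(encodedLength("A", "GB18030"));      // 1 byte for a Latin letter
        System.out.println(encodedLength("\u4E2D", "GB18030")); // 2 bytes for a common Chinese character
    }
}
```

A program that receives the byte 0xC1 therefore cannot tell which letter was meant unless it also knows which standard produced it.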

To solve these problems, a new standard was developed: the Unicode System.

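Unicode was originally designed as a fixed-width 16-bit encoding, so Java uses 2 bytes for its char type. Unicode has since grown beyond 16 bits (code points now run up to U+10FFFF), and Java represents those additional characters as a pair of char values. A single char has the following range:

  • lowest value: \u0000
  • highest value: \uFFFF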
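
The size and range of char can be checked with constants from the standard java.lang.Character class. This is a small sketch for illustration: the class name CharRange and the describe helper are hypothetical, and Character.BYTES assumes Java 8 or later.

```java
class CharRange {

    // Format a char as its Unicode code value, e.g. 'A' becomes "U+0041".
    static String describe(char c) {
        return String.format("U+%04X", (int) c);
    }

    public static void main(String[] args) {
        System.out.println(Character.BYTES);               // 2 (bytes per char)
        System.out.println(Character.SIZE);                // 16 (bits per char)
        System.out.println(describe(Character.MIN_VALUE)); // U+0000
        System.out.println(describe(Character.MAX_VALUE)); // U+FFFF

        // A Unicode escape in source code names a char by its code value.
        char letter = '\u0041';
        System.out.println(letter);                        // A

        // A character above U+FFFF (here U+1F600) occupies two char values.
        String emoji = "\uD83D\uDE00";
        System.out.println(emoji.length());                           // 2
        System.out.println(emoji.codePointCount(0, emoji.length()));  // 1
    }
}
```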