Data Encoding Scheme: Binary Coding Schemes - Unicode, ASCII, EBCDIC

Rajiv Shah

Binary Coding Schemes

Alphabetic data, numeric data, alphanumeric data, symbols, sound data, and video data are represented in the computer as combinations of bits. The bits are grouped into units of a fixed size, such as 8 bits, 6 bits, or 4 bits, and a code is formed by combining a definite number of bits. Binary coding schemes represent data such as letters, the digits 0-9, and symbols in a standard code, in which each combination of bits represents a unique symbol. Because the code is standard, any programmer can use the same combination of bits to represent a given symbol.
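As a quick check of how the group size limits the number of symbols, the sketch below counts the distinct bit combinations for the group sizes mentioned above. The helper name `code_count` is made up for this illustration.

```python
# A group of n bits can take 2**n distinct values, so it can
# distinguish at most 2**n symbols.
def code_count(bits):
    """Return the number of distinct codes an n-bit group can hold."""
    return 2 ** bits

for bits in (4, 6, 8):
    print(f"{bits} bits -> {code_count(bits)} combinations")
```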

The most commonly used binary coding schemes are:

  1. Extended Binary Coded Decimal Interchange Code (EBCDIC)
  2. American Standard Code for Information Interchange (ASCII)
  3. Unicode

1. EBCDIC


  • The Extended Binary Coded Decimal Interchange Code (EBCDIC) uses 8 bits (4 bits for zone, 4 bits for digit) to represent a symbol in the data.
  • EBCDIC allows 2^8 = 256 combinations of bits.
  • Up to 256 unique symbols can be represented using EBCDIC code. It represents decimal digits (0-9), lowercase letters (a-z), uppercase letters (A-Z), special characters, and control characters (printable and non-printable, e.g., for cursor movement, printer vertical spacing, etc.).
  • EBCDIC codes are mainly used on mainframe computers.
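The zone/digit split described above can be sketched in Python. This is only an illustration: it assumes code page 037 (Python's built-in `cp037` codec) as a representative EBCDIC variant, and the helper name `ebcdic_zone_digit` is made up for the example. Other EBCDIC code pages assign some symbols differently.

```python
# Assumption: code page 037 (codec "cp037") stands in for "EBCDIC" here.
def ebcdic_zone_digit(ch):
    """Return (zone, digit): the high and low 4 bits of a character's EBCDIC byte."""
    byte = ch.encode("cp037")[0]
    zone = byte >> 4      # upper 4 bits
    digit = byte & 0x0F   # lower 4 bits
    return zone, digit

for ch in ("A", "a", "5"):
    zone, digit = ebcdic_zone_digit(ch)
    print(f"{ch!r}: zone={zone:04b} digit={digit:04b}")
```

In this code page, uppercase letters, lowercase letters, and digits fall into different zones, which is why the two halves of the byte are often read separately.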

2. ASCII


  • The American Standard Code for Information Interchange (ASCII) is widely used in computers of all types.
  • ASCII codes are of two types: ASCII-7 and ASCII-8.
  • ASCII-7 is the 7-bit standard ASCII code. In ASCII-7, the first 3 bits are the zone bits and the next 4 bits are for the digit. ASCII-7 allows 2^7 = 128 combinations, so it represents 128 unique symbols. ASCII-7 has been modified by IBM to ASCII-8.
  • ASCII-8 is an extended version of ASCII-7. It is an 8-bit code with 4 bits for the zone and 4 bits for the digit. ASCII-8 allows 2^8 = 256 combinations, so it represents 256 unique symbols.
  • Within the ASCII-8 code, the following ranges are notable:
    • Codes 48 to 57 stand for the digits 0-9.
    • Codes 65 to 90 stand for uppercase letters A-Z.
    • Codes 97 to 122 stand for lowercase letters a-z.
    • Codes 128 to 255 are the extended ASCII codes.
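The code ranges listed above can be checked with Python's built-in `ord()` and `chr()`. The function `classify` below is a hypothetical helper written for this illustration; the label strings are likewise just for the example.

```python
def classify(code):
    """Name the range (from the list above) that an 8-bit code falls into."""
    if not 0 <= code <= 255:
        raise ValueError("not an 8-bit code")
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase letter"
    if 97 <= code <= 122:
        return "lowercase letter"
    if code >= 128:
        return "extended"
    return "other 7-bit code"

for ch in ("7", "Q", "q", "+"):
    # ord() gives the numeric code; chr() would convert it back.
    print(f"{ch!r} -> code {ord(ch)} ({classify(ord(ch))})")
```

Codes 0 to 127 fit in 7 bits, so every symbol classified as anything other than "extended" is also valid ASCII-7.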

3. Unicode


  • Unicode is a universal character encoding standard for the representation of text, including letters, numbers, and symbols, in multilingual environments. The Unicode Consortium, based in California, develops and maintains the Unicode standard.
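  • Unicode assigns each symbol a numeric code point. Its fixed-width encoding, UTF-32, uses 32 bits to represent a symbol in the data.
  • A 32-bit unit allows 2^32 = 4,294,967,296 (~ 4.3 billion) combinations, although the Unicode standard restricts code points to the range U+0000 to U+10FFFF (1,114,112 values).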
  • Unicode can uniquely represent the characters and symbols of the world's writing systems, such as Chinese and Japanese. In addition to letters, mathematical and scientific symbols are also represented in Unicode.
  • An advantage of Unicode is that it is compatible with ASCII. The first 128 codes in Unicode are identical to the ASCII-7 codes, and the first 256 match the 8-bit ISO 8859-1 (Latin-1) character set.
  • Unicode is implemented by different character encodings. UTF-8 is the most commonly used encoding scheme. UTF stands for Unicode Transformation Format. UTF-8 uses one to four 8-bit bytes (8 bits to 32 bits) per code point.
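The variable length of UTF-8 can be seen by encoding a few characters in Python. This is a minimal sketch; the helper name `utf8_length` and the sample characters are chosen only for illustration.

```python
def utf8_length(ch):
    """Return how many bytes UTF-8 needs for a single character."""
    return len(ch.encode("utf-8"))

# One sample each from the 1-, 2-, 3-, and 4-byte ranges of UTF-8.
for ch in ("A", "é", "€", "😀"):
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s) -> {encoded.hex()}")

# ASCII text is unchanged under UTF-8: the byte sequences are identical.
print("ABC".encode("utf-8") == "ABC".encode("ascii"))
```

Characters in the ASCII range take a single byte, which is what makes existing ASCII text valid UTF-8 without conversion.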
