## Character encoding


### Syllabus content

Understand what a character set is and be able to describe the following character encoding methods:

• 7-bit ASCII
• Unicode.

Students should be able to use a given character encoding table to:

• convert characters to character codes
• convert character codes to characters.

Understand that character codes are commonly grouped and run in sequence within encoding tables. Students should know that character codes are grouped and that they run in sequence. For example, in ASCII ‘A’ is coded as 65, ‘B’ as 66, and so on, meaning that the codes for the other capital letters can be calculated once the code for ‘A’ is known. This pattern also applies to other groupings such as lower case letters and digits.
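Here is a small Python sketch of that "count on from ‘A’" idea. The function name `capital_code` is made up for this example; the only fact it relies on is that ‘A’ is 65 and the capital letters follow in order.

```python
# Work out the code for any capital letter, knowing only the code for 'A'.
CODE_FOR_A = 65  # taken from the ASCII table

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def capital_code(letter):
    """Return the ASCII code of a capital letter by counting on from 'A'."""
    position = ALPHABET.index(letter)  # 'A' is 0, 'B' is 1, 'C' is 2, ...
    return CODE_FOR_A + position


print(capital_code("B"))              # 66
print(capital_code("Z"))              # 90
print(capital_code("Z") == ord("Z"))  # True: matches Python's built-in ord()
```

The same trick should work for lower case letters (starting from ‘a’ = 97) and for digits (starting from ‘0’ = 48).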

Describe the purpose of Unicode and the advantages of Unicode over ASCII. Know that Unicode uses the same codes as ASCII up to 127. Students should be able to explain the need for data representation of different alphabets and of special symbols allowing a far greater range of characters. It is not necessary to be familiar with UTF-8, UTF-16 or other different versions of Unicode.
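Python's built-in `ord()` gives the Unicode code of a character, so it can be used to see both points at once: the codes up to 127 match ASCII, and characters outside ASCII simply get bigger numbers. The helper name `in_ascii_range` is just an illustration.

```python
def in_ascii_range(character):
    """Return True if the character's Unicode code is also a 7-bit ASCII code."""
    return ord(character) <= 127


print(ord("A"), in_ascii_range("A"))  # 65 True    (same code as in ASCII)
print(ord("é"), in_ascii_range("é"))  # 233 False  (not in 7-bit ASCII)
print(ord("€"), in_ascii_range("€"))  # 8364 False (not in 7-bit ASCII)
```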

## Starter

Convert 251 and 0.5 to binary.

Convert the hexadecimal number AF5 to decimal.

Convert 1101 0110 0111 1101 to hexadecimal.

## Explanation

### The ASCII character set

Webopedia says that a character set is "A defined list of characters recognized by the computer hardware and software. Each character is represented by a number. The ASCII character set, for example, uses the numbers 0 through 127 to represent all English characters as well as special control characters."

ASCII stands for American Standard Code for Information Interchange.

The codes stop at 127 because the largest number in 7-bit binary is 127 (work it out). When you include 0 as a number there are actually 128 numbers, so it is a 128-character character set, not a 127-character one as you will sometimes read on the web. The extra (eighth) bit in each byte used to be used for parity (more later).
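If you want to check the "work it out" part, the arithmetic can be sketched in a few lines of Python (the variable names are just for this example):

```python
BITS = 7

number_of_codes = 2 ** BITS         # each extra bit doubles the number of patterns
largest_code = number_of_codes - 1  # counting starts at 0, so the top code is one less

print(number_of_codes)            # 128
print(largest_code)               # 127
print(format(largest_code, "b"))  # 1111111 (seven 1s)
```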

The important thing to note is that each character is represented by a number (the computer will see a binary number and we will see a decimal number, as we don't yet think in binary). That number is fixed, so "A" will always be 65 (01000001).
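A quick way to see both views of the same character is with Python's built-in `ord()`, `chr()` and `format()`. The two helper functions below are made-up names for this sketch, and showing the code as 8 bits (with a leading 0) is a choice made here to match the 01000001 above.

```python
def char_to_binary(character):
    """Return the character's code as an 8-bit binary string."""
    return format(ord(character), "08b")


def binary_to_char(bits):
    """Return the character whose code is the given binary string."""
    return chr(int(bits, 2))


print(ord("A"))                    # 65
print(char_to_binary("A"))         # 01000001
print(binary_to_char("01000001"))  # A
```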

All computer manufacturers use the same system, which is why you can plug any keyboard into any computer.

ASCII grew from the original teleprinter technology.

Clearly this is out of date. Can you see why?

Now there is an 8-bit Extended ASCII character set which has 256 characters (part of which is shown above).

Each character has a number. The binary value of that number is the value stored in the computer. It is the value sent by the keyboard to the CPU (more later) when you press a key.

## Exercise

### Binary to decimal to text

Here is a message in binary. Can you use Excel to turn it into text?

 1010101 1101110 1101001 1100011 1101111 1100100 1100101 1101000 1100001 1110011 1100001 1101100 1100001 1110010 1100111 1100101 1110010 1110010 1100001 1101110 1100111 1100101
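If you would rather try this in Python than Excel, here is one possible sketch. The function name `decode_binary_message` is invented for this example, and it is shown on a different, shorter message so that the exercise is still yours to solve.

```python
def decode_binary_message(message):
    """Turn space-separated 7-bit binary codes into text."""
    characters = []
    for bits in message.split():      # one group of bits per character
        code = int(bits, 2)           # binary string -> decimal number
        characters.append(chr(code))  # decimal number -> character
    return "".join(characters)


print(decode_binary_message("1001000 1101001"))  # Hi
```

To tackle the exercise, pass the message above to the function in place of the short example.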

If I turn each character into its ASCII number, add a number (say 4) to it, and then turn the result back into a character, I can write in a simple code. All the reader needs to know is the number that I have added. See if you can work it out and translate this coded message.

 Sah_kia pk _kilqpejc
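The shifting idea can be sketched in Python as below. The function name `shift_message` is made up for this example, and leaving spaces unchanged is an assumption based on how the coded message above looks. The demonstration uses a different word so the answer is not given away.

```python
def shift_message(message, shift):
    """Move every character's code by `shift`, leaving spaces as they are."""
    result = []
    for character in message:
        if character == " ":
            result.append(character)  # assumption: spaces are not coded
        else:
            result.append(chr(ord(character) + shift))
    return "".join(result)


coded = shift_message("Cat", 4)
print(coded)                     # Gex
print(shift_message(coded, -4))  # Cat
```

To crack the exercise you could try different values of `shift`, positive and negative, until the message reads as English.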