Floating-Point Representation (Cont.)
The previous formats are part of the IEEE 754 floating-point standard.
For a normalized floating-point number (S, E, F):
The significand is (1.F)_2 = (1.f1f2f3f4...)_2
because
- IEEE 754 assumes a hidden leading 1 (not stored) for normalized numbers.
- The significand is therefore 1 bit longer than the fraction.
The value of a normalized floating-point number is
(-1)^S × (1.F)_2 × 2^{val(E)}
= (-1)^S × (1.f1f2f3f4...)_2 × 2^{val(E)}
= (-1)^S × (1 + f1×2^{-1} + f2×2^{-2} + f3×2^{-3} + f4×2^{-4} + ...) × 2^{val(E)}
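As a minimal sketch of the expansion above, the following Python snippet sums the fraction bits with their weights 2^{-1}, 2^{-2}, ... and adds the hidden 1. The function names `significand` and `normalized_value` are illustrative only, and the exponent value val(E) is passed in directly as an integer.

```python
def significand(fraction_bits: str) -> float:
    """Return (1.F)_2 as a number; fraction_bits is a string of '0'/'1' (f1 first)."""
    value = 1.0  # the hidden leading 1, which is not stored
    for i, bit in enumerate(fraction_bits, start=1):
        if bit == "1":
            value += 2.0 ** -i  # bit f_i has weight 2^{-i}
    return value


def normalized_value(sign: int, fraction_bits: str, exponent_value: int) -> float:
    """Return (-1)^S x (1.F)_2 x 2^{val(E)} for the given fields."""
    return (-1) ** sign * significand(fraction_bits) * 2.0 ** exponent_value


print(significand("101"))             # 1.625  (1 + 1/2 + 1/8)
print(normalized_value(1, "101", 2))  # -6.5   (-1.625 x 4)
```

For instance, the fraction bits 101 give the significand (1.101)_2 = 1.625, so with S = 1 and val(E) = 2 the value is -6.5.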
To simplify sorting, IEEE 754 uses a biased representation for the exponent, that is,
Value of exponent = val(E) = E - Bias
Recall that the exponent field is 8 bits for single precision, so E can be in the range
[0 = (00000000)_2, 255 = (11111111)_2 = 2^8 - 1]
E = 0 and E = 255 are reserved for special use, and E = 1 to 254 are used for normalized floating-point numbers.
So Bias = 127 (= 254 ÷ 2) and val(E) = E - 127.
For example,
val(E = 126 = (01111110)_2) = 126 - 127 = -1
val(E = 128 = (10000000)_2) = 128 - 127 = 1
val(E = 254 = (11111110)_2) = 254 - 127 = 127
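These examples can be checked on real numbers. The sketch below assumes the single-precision layout of 1 sign bit, 8 exponent bits, and 23 fraction bits, and uses Python's standard `struct` module to get the bit pattern. The helper name `exponent_field` is made up for illustration.

```python
import struct

BIAS_SINGLE = 127  # bias for the 8-bit single-precision exponent field


def exponent_field(x: float) -> int:
    """Return the stored 8-bit exponent field E of x encoded in single precision."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]  # 32-bit pattern as an integer
    return (bits >> 23) & 0xFF  # skip the 23 fraction bits, keep 8 exponent bits


for x in (0.5, 2.0, 1.0):
    e = exponent_field(x)
    # Columns: number, E in decimal, E in binary, val(E) = E - 127
    print(x, e, format(e, "08b"), e - BIAS_SINGLE)
# 0.5 126 01111110 -1
# 2.0 128 10000000 1
# 1.0 127 01111111 0
```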
For a similar reason, the exponent bias for double precision is 1023 (= 2046 ÷ 2): its 11-bit exponent field has the range [0, 2047], with E = 0 and E = 2047 reserved for special use.
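The same check can be sketched for double precision, assuming its layout of 1 sign bit, 11 exponent bits, and 52 fraction bits. The helper name `exponent_field_double` is hypothetical.

```python
import struct

BIAS_DOUBLE = 1023  # bias for the 11-bit double-precision exponent field


def exponent_field_double(x: float) -> int:
    """Return the stored 11-bit exponent field E of x encoded in double precision."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]  # 64-bit pattern as an integer
    return (bits >> 52) & 0x7FF  # skip the 52 fraction bits, keep 11 exponent bits


for x in (0.5, 1.0, 2.0):
    e = exponent_field_double(x)
    print(x, e, e - BIAS_DOUBLE)  # number, E, val(E) = E - 1023
# 0.5 1022 -1
# 1.0 1023 0
# 2.0 1024 1
```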
The value of a normalized floating-point number is therefore refined as
(-1)^S × (1.F)_2 × 2^{E-Bias}
= (-1)^S × (1.f1f2f3f4...)_2 × 2^{E-Bias}
= (-1)^S × (1 + f1×2^{-1} + f2×2^{-2} + f3×2^{-3} + f4×2^{-4} + ...) × 2^{E-Bias}
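To tie the pieces together, here is a minimal sketch that splits a single-precision number into (S, E, F) and rebuilds its value from the refined formula. It assumes the 1/8/23-bit single-precision layout and handles only normalized numbers (E from 1 to 254). The function names `decode_single` and `value_from_fields` are illustrative.

```python
import struct


def decode_single(x: float):
    """Split x, encoded in single precision, into the integer fields (S, E, F)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    s = bits >> 31            # 1 sign bit
    e = (bits >> 23) & 0xFF   # 8 exponent bits
    f = bits & 0x7FFFFF       # 23 fraction bits
    return s, e, f


def value_from_fields(s: int, e: int, f: int) -> float:
    """Evaluate (-1)^S x (1.F)_2 x 2^{E-127} for a normalized single-precision number."""
    if not 1 <= e <= 254:
        raise ValueError("E = 0 and E = 255 are reserved, not normalized")
    significand = 1 + f / 2 ** 23  # hidden 1 plus the fraction scaled by 2^{-23}
    return (-1) ** s * significand * 2.0 ** (e - 127)


s, e, f = decode_single(-6.5)
print(s, e, format(f, "023b"))     # 1 129 10100000000000000000000
print(value_from_fields(s, e, f))  # -6.5
```

Here -6.5 = -(1.101)_2 × 2^2, so S = 1, E = 2 + 127 = 129, and the fraction starts with the bits 101.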