Floating-Point Representation (Cont.)


The previous formats are part of the IEEE 754 floating-point standard. For a normalized floating-point number (S, E, F):

S | E | F = $f_1 f_2 f_3 f_4 \ldots$

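The significand is $(1.F)_2 = (1.f_1 f_2 f_3 f_4 \ldots)_2$, so the value of a normalized floating-point number is
   $(-1)^S \times (1.F)_2 \times 2^{\mathrm{val}(E)}$

   $= (-1)^S \times (1.f_1 f_2 f_3 f_4 \ldots)_2 \times 2^{\mathrm{val}(E)}$
   $= (-1)^S \times (1 + f_1 \times 2^{-1} + f_2 \times 2^{-2} + f_3 \times 2^{-3} + f_4 \times 2^{-4} + \cdots) \times 2^{\mathrm{val}(E)}$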
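The expansion of the significand can be checked with a short Python sketch. The function name `significand` and the choice to pass the fraction bits as a string are assumptions made for illustration only; they are not part of the standard.

```python
def significand(fraction_bits: str) -> float:
    """Return the value of (1.F)_2 for a string of fraction bits such as '101'."""
    value = 1.0  # implicit leading 1 of a normalized number
    for i, bit in enumerate(fraction_bits, start=1):
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        value += int(bit) * 2.0 ** -i  # bit f_i contributes f_i * 2^(-i)
    return value


print(significand("1"))    # (1.1)_2   = 1 + 1/2       = 1.5
print(significand("101"))  # (1.101)_2 = 1 + 1/2 + 1/8 = 1.625
```

Each fraction bit adds a negative power of two, so the significand of a normalized number is always at least 1 and less than 2.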
To simplify sorting, IEEE 754 uses a biased representation for the exponent, that is,
   Value of exponent = val(E) = E – Bias
Recall that the exponent field is 8 bits for single precision, so E can be in the range
   $[0 = 00000000_2,\ 255 = 11111111_2 = 2^8 - 1]$
E = 0 and E = 255 are reserved for special use, and E = 1 to 254 are used for normalized floating-point numbers. So Bias = 127 (= 254 ÷ 2) and val(E) = E - 127. For example,
   $\mathrm{val}(E = 126 = 01111110_2) = 126 - 127 = -1$
   $\mathrm{val}(E = 128 = 10000000_2) = 128 - 127 = 1$
   $\mathrm{val}(E = 254 = 11111110_2) = 254 - 127 = 127$
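The three examples above can be reproduced with a minimal Python sketch. The names `SINGLE_BIAS` and `exponent_value` are hypothetical, and the function handles only the single-precision case described here.

```python
SINGLE_BIAS = 127  # bias for the 8-bit single-precision exponent field


def exponent_value(e_field: int) -> int:
    """Return val(E) = E - 127 for a normalized single-precision number."""
    if not 1 <= e_field <= 254:
        # E = 0 and E = 255 are reserved for special use.
        raise ValueError(f"E = {e_field} is not a normalized exponent")
    return e_field - SINGLE_BIAS


for e in (126, 128, 254):
    print(f"E = {e} = {e:08b} -> val(E) = {exponent_value(e)}")
```

The smallest normalized field value, E = 1, gives val(E) = -126, so the usable exponent range is -126 to 127.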
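For a similar reason, the exponent bias for double precision is 1023: its 11-bit exponent has the range [0, 2047], E = 0 and E = 2047 are reserved, and E = 1 to 2046 are used for normalized numbers, so Bias = 1023 (= 2046 ÷ 2). The value of a normalized floating-point number is therefore refined as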
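   $(-1)^S \times (1.F)_2 \times 2^{E - \mathrm{Bias}}$

   $= (-1)^S \times (1.f_1 f_2 f_3 f_4 \ldots)_2 \times 2^{E - \mathrm{Bias}}$
   $= (-1)^S \times (1 + f_1 \times 2^{-1} + f_2 \times 2^{-2} + f_3 \times 2^{-3} + f_4 \times 2^{-4} + \cdots) \times 2^{E - \mathrm{Bias}}$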
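As a check on the refined formula, the following Python sketch decodes a 32-bit single-precision pattern by hand and compares the result with the standard library's `struct` module. The function names `decode_single` and `reinterpret_single` are hypothetical. The sketch assumes the usual single-precision layout of 1 sign bit, 8 exponent bits, and 23 fraction bits, and it handles normalized numbers only.

```python
import struct


def decode_single(bits: int) -> float:
    """Apply (-1)^S * (1.F)_2 * 2^(E - 127) to a 32-bit pattern."""
    sign = (bits >> 31) & 0x1      # S: 1 bit
    e_field = (bits >> 23) & 0xFF  # E: 8 bits
    fraction = bits & 0x7FFFFF     # F: 23 bits
    if e_field in (0, 255):
        raise ValueError("E = 0 and E = 255 are reserved for special use")
    significand = 1 + fraction / 2**23  # (1.F)_2
    return (-1) ** sign * significand * 2.0 ** (e_field - 127)


def reinterpret_single(bits: int) -> float:
    """Let struct reinterpret the same 32 bits as a single-precision float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


for pattern in (0x3F800000, 0x3F000000, 0xC0400000):
    value = decode_single(pattern)
    assert value == reinterpret_single(pattern)
    print(f"{pattern:#010x} -> {value}")
```

For example, 0xC0400000 has S = 1, E = 128, and fraction bits 100...0, so its value is -1 × 1.5 × 2^1 = -3.0.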