Floating-Point Representation


The size of a floating-point number is usually a multiple of the word size. The representation of a MIPS floating-point number is shown below, where a 1 in the sign bit means negative, exponent is the value of the 8-bit exponent field (including the sign of the exponent), and fraction is the 23-bit number in the fraction field. The bit string represents the following number, which will be explained later:
   $0.15625 = (1.0 + 2^{-2}) \times 2^{124-127} = 1.25 \times 2^{-3}$
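
The decomposition above can be checked with a short sketch. It assumes the MIPS single-precision format matches the IEEE 754 layout (1 sign bit, 8 exponent bits, 23 fraction bits) and that the stored exponent is offset by 127, as the 124-127 term suggests. The helper names `decode_single` and `value_from_fields` are illustrative and not part of any library.

```python
import struct

def decode_single(x):
    """Split x into (sign, exponent field, fraction field) of its 32-bit encoding."""
    # Pack as a big-endian single-precision float, then reread the same bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with an offset of 127
    fraction = bits & 0x7FFFFF       # 23 bits
    return sign, exponent, fraction

def value_from_fields(sign, exponent, fraction):
    """Rebuild the value of a normalized number from its three fields."""
    # Assumes a normalized number: an implicit leading 1 is added to the fraction.
    return (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)

sign, exponent, fraction = decode_single(0.15625)
print(sign, exponent, format(fraction, "023b"))
# 0 124 01000000000000000000000
print(value_from_fields(sign, exponent, fraction))
# 0.15625
```

The only fraction bit that is set has weight 2^-2, which gives the 1.25 in the equation, and the exponent field 124 gives the factor 2^-3.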

In general, floating-point numbers are of the form $(-1)^S \times F \times 2^E$, where F involves the value in the fraction field and E involves the value in the exponent field. Two cases may occur in floating-point arithmetic: overflow, where the exponent is too large to be represented in the exponent field, and underflow, where the negative exponent is too large to be represented in the exponent field.

One way to reduce the chances of underflow or overflow is to offer another format that has a larger exponent field, called double-precision floating point; the format above is called single-precision floating point. The representation of a double-precision floating-point number takes two MIPS words, as shown below, where exponent is the value of the 11-bit exponent field and fraction is the 52-bit number in the fraction field.
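
As a rough illustration of why the wider exponent field helps, the sketch below splits a double into its fields and tests whether a value survives conversion to single precision. It assumes the IEEE 754 double layout (1 sign bit, 11 exponent bits, 52 fraction bits) with an exponent offset of 1023, which the text above does not state. The helpers `decode_double`, `fits_in_single`, and `to_single` are hypothetical names, and the overflow check relies on CPython's `struct.pack` raising `OverflowError` for values too large for the `f` format.

```python
import struct

def decode_double(x):
    """Split x into (sign, exponent field, fraction field) of its 64-bit encoding."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                   # 1 bit
    exponent = (bits >> 52) & 0x7FF     # 11 bits, assumed offset of 1023
    fraction = bits & ((1 << 52) - 1)   # 52 bits
    return sign, exponent, fraction

def fits_in_single(x):
    """Return True if x can be stored in single precision without overflow."""
    try:
        struct.pack(">f", x)
        return True
    except OverflowError:
        return False

def to_single(x):
    """Round x to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

print(decode_double(0.15625))   # (0, 1020, 1125899906842624), i.e. 1020 - 1023 = -3
print(fits_in_single(1e38))     # True
print(fits_in_single(1e39))     # False: overflows the 8-bit exponent field
print(to_single(1e-50))         # 0.0: underflows in single precision
print(1e39, 1e-50)              # both are representable as doubles
```

Values such as 1e39 and 1e-50 fall outside the single-precision range but fit comfortably in double precision, which is the benefit the larger exponent field is meant to provide.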




