Converting a real number to a floating point number.


Given the number : 87.461 

Given a floating point storage format arranged as :

 1 bit sign
 4 bit exponent
11 bit significand

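1. Use divide by 2 to convert the integer portion of the number to binary. 
Each row shows the quotient and, under r, the remainder from dividing the row
above by 2. Remember the remainder column is upside down: read it from the
bottom up to get the bits. The Check column rebuilds the number from those
bits by doubling the running total and adding the next bit.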
 
Conversion                    Check
       r                      1    0    1    0    1    1    1
  87                          1
  43   1                           2
  21   1                                4+1
  10   1                                    10
   5   0                                        20+1
   2   1                                             42+1
   1   0                                                  86+1
   0   1                                                   87
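
The divide by 2 procedure above can be sketched in Python as follows. This is
only an illustration: the function name int_to_binary is made up for this
example, and it assumes a non-negative integer.

```python
def int_to_binary(n):
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)        # quotient carries on, remainder is the next bit
        remainders.append(str(r))
    # The remainders come out least significant bit first ("upside down"),
    # so reverse them to read the number most significant bit first.
    return "".join(reversed(remainders))


print(int_to_binary(87))  # 1010111
```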


2. Use multiply by 2 to convert the fractional portion. You do need to look
at your floating point storage at this point to decide how many places to
calculate to. You should always go 2 places more than your storage.

In our case, we have an 11 bit significand, so we need to go 11 places after
the 1st 1 bit.

However, in this case, we know we already have a 7 bit integer and will use 
6 bits to store the integer portion of the number. (Remember we don't need 
to store the most significant bit.)  So, we need 5 (6 if rounding occurs)
bits for the fractional portion. Going 2 places more, we calculate 7 places.

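  Conversion      bit

   .461
  0.922            0
  1.844            1
  1.688            1
  1.376            1
  0.752            0
  1.504            1
  1.008            1

Each line is the fractional part of the line above multiplied by 2. The digit
before the point is the next bit, read from the top down: .0111011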
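
The multiply by 2 procedure can be sketched in Python as follows. This is a
minimal illustration, not a definitive implementation: the function name
fraction_to_bits is made up for this example, and it takes the fraction as
text so that Fraction can hold the decimal value exactly.

```python
from fractions import Fraction


def fraction_to_bits(frac_text, places):
    """Return the first `places` binary digits of a decimal fraction such as "0.461"."""
    frac = Fraction(frac_text)     # exact decimal value, no float rounding
    bits = []
    for _ in range(places):
        frac *= 2
        bit = int(frac)            # digit before the point: 0 or 1
        bits.append(str(bit))
        frac -= bit                # keep only the fractional part
    return "".join(bits)


print(fraction_to_bits("0.461", 7))  # 0111011
```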
   

3. Combine the parts. This gives a binary approximation for the number as :

  1010111.0111011
   

4. Re-write value in scientific notation. (The b at the end indicates it is
   base 2).

  1.0101110111011b  * 2^6

5. Select the significand from the mantissa portion. We can only store 11
bits, and the most significant bit (the leading 1) does not need to be
stored, so we keep the first 11 bits after the binary point. In a real
system, conversion may include rounding but for our purposes, we will
truncate.

   mantissa            significand

  1.0101110111011b     01011101110 
    -----------
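
Steps 4 and 5 can be sketched in Python as follows. This is an illustration
under stated assumptions: the function name normalize is made up for this
example, the bits are passed as text, and the integer portion is assumed to
start with a 1 (so the value is at least 1). It truncates, as in the notes.

```python
def normalize(int_bits, frac_bits, significand_width):
    """Return (exponent, significand) for a binary value >= 1, truncating extra bits."""
    # Assumption: int_bits begins with "1", e.g. "1010111".
    exponent = len(int_bits) - 1                  # places the binary point moves left
    mantissa = int_bits[1:] + frac_bits           # everything after the leading 1
    significand = mantissa[:significand_width]    # truncate, no rounding
    # Pad on the right with 0s if fewer bits were calculated than can be stored.
    return exponent, significand.ljust(significand_width, "0")


print(normalize("1010111", "0111011", 11))  # (6, '01011101110')
```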

6. Calculate the bias. We are given a 4 bit exponent storage. 

Use 2^(n-1)-1 where n=4 in this case.

    2^(4-1)-1  = 2^3 - 1 = 8 - 1 = 7


7. Add bias to the exponent determined in step 4.

    7 + 6 = 13
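
The bias calculation and step 7 can be sketched in Python as follows. The
function names are made up for this example. The range check is an addition
that is not in the notes; it only confirms that the biased exponent fits in
the exponent storage.

```python
def exponent_bias(exponent_width):
    """Bias for an exponent field of the given width: 2^(n-1) - 1."""
    return 2 ** (exponent_width - 1) - 1


def bias_exponent(exponent, exponent_width):
    """Add the bias to an exponent and check that the result fits in the field."""
    biased = exponent + exponent_bias(exponent_width)
    if not 0 <= biased < 2 ** exponent_width:
        raise ValueError("exponent does not fit in the exponent storage")
    return biased


print(exponent_bias(4))       # 7
print(bias_exponent(6, 4))    # 13
```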

8. Convert the biased exponent to binary.

  Conversion

  13        The result is 4 bits long, which exactly fills the 4 bit
   6  1     exponent storage, so no padding on the left with 0s is needed.
   3  0
   1  1          1101
   0  1

9. Put it together.

Sign positive = 0  
biased exponent (4 bit)  1101 
Significand (11 bit) 01011101110 

0  1101  01011101110 
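
Putting the three fields together can be sketched in Python as follows. The
function name pack_fields is made up for this example. It pads the exponent
on the left with 0s to the width of the exponent storage, then joins the
fields into one 16 bit word.

```python
def pack_fields(sign, biased_exponent, significand_bits, exponent_width=4):
    """Return the stored word as a bit string: sign, exponent, significand."""
    exponent_bits = format(biased_exponent, "0{}b".format(exponent_width))
    return str(sign) + exponent_bits + significand_bits


word = pack_fields(0, 13, "01011101110")
print(word)                          # 0110101011101110
print(format(int(word, 2), "04X"))   # 6AEE
```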




Converting back.


1. Start with the significand.

     01011101110

2. Re-establish the most significant bit.

    1.01011101110

3. Unbias the exponent.  

Bias 2^(4-1)-1 = 7 
Biased exponent 13

  13 - 7 = 6

4. Use unbiased exponent to convert back to an ordinary number.

   1.01011101110b * 2^6 = 1010111.01110b

We've converted the integer portion above as a check, so now we do the 
fractional portion.

  .01110 

 place   bit
 .5       0
 .25      1    .25
 .125     1    .125
 .0625    1    .0625
 .03125   0
               -----
               .4375
 

So we were able to store the value

    87.4375
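
The conversion back can be sketched in Python as follows. This is a minimal
illustration: the function name decode is made up for this example, the
fields are passed as text, and it assumes an ordinary value with a hidden
leading 1 (no special cases such as zero are handled).

```python
def decode(sign, exponent_bits, significand_bits):
    """Convert stored sign, exponent and significand fields back to a number."""
    bias = 2 ** (len(exponent_bits) - 1) - 1
    exponent = int(exponent_bits, 2) - bias        # unbias the exponent
    # Re-establish the most significant bit, then add each stored bit
    # weighted by its place value (.5, .25, .125, ...).
    mantissa = 1.0
    for position, bit in enumerate(significand_bits, start=1):
        mantissa += int(bit) * 2.0 ** -position
    value = mantissa * 2.0 ** exponent
    return -value if sign else value


print(decode(0, "1101", "01011101110"))  # 87.4375
```

The result differs from the original 87.461 because the bits beyond the 11
bit significand were truncated.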