Numerical Precision

Floating Point Numbers

With the proliferation of increasingly complex software and hardware performing tasks as varied as financial calculations and scientific experiments, the arithmetic involved has grown correspondingly demanding. Simple arithmetic is no longer sufficient for computing quantities such as telephone call rates, which are billed by the second and require six or more fractional digits, or the Gross National Product of countries, which may require fifteen digits to the left of the decimal point (Cowlishaw, 2003, p. 3). Computing across such a wide range of magnitudes requires computers and floating point numbers.

Computers cannot, in general, store real numbers exactly, because a real number may require infinitely many digits while memory is finite. The floating point number is designed to address this problem by approximating real numbers with a fixed number of bits. Floating point numbers can be either single precision or double precision. Single precision numbers have about eight significant decimal digits, while double precision numbers have about seventeen (Mak, 2003, p. 9). These numbers are represented in base 2 instead of base 10 to conform to the internal binary form that computers require (Mak, 2003, p. 9).
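
The difference between the two precisions can be seen with a short Python sketch. It assumes that Python's built-in float is a double precision number and uses the standard struct module to round a value to single precision; the helper name to_single is hypothetical and chosen only for illustration.

```python
import struct

def to_single(x: float) -> float:
    """Round a double precision value to the nearest single precision value.

    Hypothetical helper: packs x into 4 bytes ("f") and unpacks it again.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]

value = 0.1                    # stored as double precision by Python
as_single = to_single(value)   # the same value after rounding to 32 bits

# Printing 17 significant digits exposes the value that is actually stored.
print(f"double: {value:.17g}")      # double: 0.10000000000000001
print(f"single: {as_single:.17g}")  # single: 0.10000000149011612
```

Neither format stores 0.1 exactly, because 0.1 has no finite representation in base 2, but the single precision value departs from 0.1 after about eight digits while the double precision value holds for about seventeen.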

Both formats are stored in the computer's memory in the same way, with single precision requiring a total of 32 bits and double precision requiring 64 (Mak, 2003, p. 34). In both formats, one bit stores the sign of the number, which is either "0" for positive or "1" for negative (Mak, 2003, p. 34). The next field is the exponent, which occupies eight bits in single precision and eleven bits in double (Mak, 2003, p. 34). Finally, the fractional part of the number is stored in 23 bits for single precision and 52 bits for double (Mak, 2003, p. 34). Because floating point numbers occupy a fixed, limited amount of memory, calculations with them are efficient and fast. However, that same fixed size means that most values must be rounded to fit, which can lead to errors in both rounding and computation.
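
The sign, exponent, and fraction fields described above can be inspected directly. The following is a minimal sketch, assuming the layout in question is the common IEEE 754 binary format that Python's struct module produces; the function names decode_single and decode_double are hypothetical. Note that under this assumption the exponent field is stored with a bias (127 for single precision, 1023 for double), a detail the text above does not cover.

```python
import struct

def decode_single(x: float):
    """Split a 32-bit single precision value into (sign, exponent, fraction)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                  # 1 bit
    exponent = (bits >> 23) & 0xFF     # 8 bits, stored with a bias of 127
    fraction = bits & ((1 << 23) - 1)  # 23 bits
    return sign, exponent, fraction

def decode_double(x: float):
    """Split a 64-bit double precision value into (sign, exponent, fraction)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)  # 52 bits
    return sign, exponent, fraction

print(decode_single(-0.75))  # (1, 126, 4194304)
print(decode_double(1.0))    # (0, 1023, 0)

# A rounding error caused by the fixed field widths:
# 0.1, 0.2 and 0.3 are each rounded when stored, so the sum does not match.
print(0.1 + 0.2 == 0.3)      # False
```

In the first result, -0.75 is -1.5 × 2^-1, so the sign bit is 1, the stored exponent is -1 + 127 = 126, and the fraction field holds the .5 as its highest bit.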

Binary Coded Decimal

Binary coded decimal (BCD) representations incorporate many of the advantages of floating point numbers while adding the accuracy of decimal encodings (Sanchez & Canton, 2007, p. 52). Because such representations do not have formally established standards, each machine or software package implements them in its own, often incompatible, way (Sanchez & Canton, 2007, p. 52). Where BCD encoding is used, these formats can be useful for both input-output operations and arithmetic calculations.
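
To illustrate why a decimal encoding preserves decimal digits exactly, the sketch below shows generic packed BCD, in which each decimal digit occupies four bits and two digits share one byte. This is an assumption made for illustration only: it is not the layout of any particular format from Sanchez and Canton, it carries no sign or exponent, and the function names pack_bcd and unpack_bcd are hypothetical.

```python
def pack_bcd(digits: str) -> bytes:
    """Encode a string of decimal digits as packed BCD, two digits per byte."""
    if not digits or any(c not in "0123456789" for c in digits):
        raise ValueError("expected a non-empty string of decimal digits")
    if len(digits) % 2:
        digits = "0" + digits  # pad to an even number of digits
    return bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])  # high nibble, low nibble
        for i in range(0, len(digits), 2)
    )

def unpack_bcd(data: bytes) -> str:
    """Decode packed BCD bytes back into a string of decimal digits."""
    return "".join(f"{b >> 4}{b & 0x0F}" for b in data)

encoded = pack_bcd("1234")
print(encoded.hex())        # 1234
print(unpack_bcd(encoded))  # 1234
```

Because every decimal digit is stored as itself, a value such as 0.1 can be held as the digit 1 plus a scale, with none of the base 2 rounding that a binary floating point format introduces.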

One commonly used BCD encoding is BCD12, so named because it requires twelve bytes of memory storage, or 96 bits in total (Sanchez & Canton, 2007, p. 52). The first four bits represent the sign of the number, either 0000B for a positive number or 0001B for a negative one (Sanchez &…
