Floating-point arithmetic - Factbites
 Factbites
 Where results make sense

Topic: Floating-point arithmetic


    Note: these results are not from the primary (high quality) database.


In the News (Sat 4 Feb 12)

  
 Doing It Fast, fixed point arithmetic techniques and fast 3d transforms
Floating point arithmetic is easy to use and it comes closer to working the way we expect arithmetic to work than any other kind of arithmetic we can use in programs.
(Floating point on the other hand lets the decimal point float around.) You also have a fixed number of digits to work with.
The integer square root of a fixed point number divided by the square root of 2 to the F power where F is the number of bits of fraction in the fixed point number is the square root of the original fixed point number.
www.gameprogrammer.com /4-fixed.html
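
The square-root identity in the last snippet can be checked with a short Python sketch. The 16-bit fraction width and the helper names `to_fixed` and `fixed_sqrt` are assumptions made for illustration; they do not come from the linked article.

```python
import math

F = 16  # assumed number of fraction bits (a 16.16 layout), chosen for illustration


def to_fixed(x: float) -> int:
    """Convert a non-negative real value to its raw fixed-point integer."""
    return round(x * (1 << F))


def fixed_sqrt(raw: int) -> float:
    """Square root of the value stored in raw, using an integer square root."""
    # isqrt(raw) approximates sqrt(value * 2**F) = sqrt(value) * 2**(F/2),
    # so dividing by sqrt(2**F) recovers sqrt(value).
    return math.isqrt(raw) / math.sqrt(1 << F)


print(fixed_sqrt(to_fixed(2.25)))  # 1.5
```

The integer square root discards fractional bits, so results for values that are not perfect squares in raw form are only approximate.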

  
 5.6 decimal -- Decimal floating point arithmetic
The decimal module provides support for decimal floating point arithmetic.
Both binary and decimal floating point are implemented in terms of published standards.
Unlike hardware based binary floating point, the decimal module has a user settable precision (defaulting to 28 places) which can be as large as needed for a given problem:
www.python.org /doc/lib/module-decimal.html
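
A minimal sketch of the user-settable precision described above, using Python's `decimal` module. The helper name `one_seventh` is hypothetical; a local context is used so the global default stays untouched.

```python
from decimal import Decimal, localcontext


def one_seventh(prec: int) -> Decimal:
    """Compute 1/7 with the given number of significant digits."""
    with localcontext() as ctx:
        ctx.prec = prec  # working precision applies only inside this block
        return Decimal(1) / Decimal(7)


print(one_seventh(28))  # 28 significant digits, the module default
print(one_seventh(50))  # 50 significant digits
```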

  
 Fabulous Adventures In Coding : Floating Point Arithmetic, Part One
Spec'ing and implementing a good (and that's still in relative terms) floating point arithmetic is hard.
We wrote a floating point extension to the BASIC on the Acorn Atom in assembly as a "class project".
Since the whole point of denormals is to degrade into an underflow situation gracefully, it would be bad if there were such a gap.
blogs.msdn.com /ericlippert/archive/2005/01/10/350108.aspx

  
 The IEEE standard for floating point arithmetic
This standard specifies how single precision (32 bit) and double precision (64 bit) floating point numbers are to be represented, as well as how arithmetic should be carried out on them.
The IEEE single precision floating point standard representation requires a 32 bit word, which may be represented as numbered from 0 to 31, left to right.
The IEEE double precision floating point standard representation requires a 64 bit word, which may be represented as numbered from 0 to 63, left to right.
www.psc.edu /general/software/packages/ieee/ieee.html
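
The 32-bit layout can be inspected from Python as a rough illustration. The sketch assumes the usual IEEE 754 single-precision split of 1 sign bit, 8 exponent bits (bias 127) and 23 fraction bits; the function name `single_fields` is made up for this example.

```python
import struct


def single_fields(x: float):
    """Return (sign, biased exponent, fraction) of x rounded to IEEE single."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # leftmost bit
    exponent = (bits >> 23) & 0xFF   # next 8 bits, biased by 127
    fraction = bits & 0x7FFFFF       # remaining 23 bits
    return sign, exponent, fraction


print(single_fields(1.0))   # (0, 127, 0)
print(single_fields(-2.5))  # (1, 128, 2097152)
```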

  
 Arithmetic and Data Types - Display of floating point numbers
Constants written using the "e" notation are normally of type double; they may be forced to other floating point types by appending a suitable letter as described earlier.
All the floating point values we have seen so far have been written using the conventional integral part/decimal point/fractional part notation.
For input via scanf() any of the floating point conversion styles can be specified in a scanf() input specification string and any style of floating point number will be accepted and converted.
www.scit.wlv.ac.uk /cbook/chap4.display.fp.html

  
 Merlyn - Pascal Floating-Point Page - J R Stockton
If the arithmetic required is exact in pence, then do the calculation in pence not in pounds; that should be exact, even with float variables, up to the limit for the type.
For conversions by arithmetic to a numerical value, which could be translated to another language, see in pas-type.htm.
General Decimal Arithmetic - page set, with links, from IBM Hursley.
www.merlyn.demon.co.uk /pas-real.htm

  
 Model of Floating Point Arithmetic
This behavior is presented in terms of a model of floating point arithmetic that builds on the concept of the canonical form (see A.5.3).
For any predefined arithmetic operation that yields a result of a floating point type T, the required bounds on the result are given by a model interval of T (called the result interval) defined in terms of the operand values as follows:
The model numbers of a floating point type T are zero and all the values expressible in the canonical form (for the type T), in which mantissa has T'Model_Mantissa digits and exponent has a value greater than or equal to T'Model_Emin.
www.adaic.org /standards/95lrm/html/RM-G-2-1.html

  
 Floataway Flotation Tanks - Welcome
Floating in a float tank with warm Epsom Salts was invented by John Lilly in the sixties but of course floating and swimming are as old as our species.
Floating to relaxation is a quicker and much simpler technique.
We believe that the development of the float tank has the potential to be a major advance in the progress of human consciousness and has already proved its worth.
www.floataway.co.uk

  
 Floating Point Arithmetic
In order to illustrate some of the details of floating point arithmetic, we will consider an imaginary floating point format in which the exponent is stored in 5 bits, the significand is stored in 10 bits, and 1 bit is used to store the sign of the number.
Floating point arithmetic derives its name from something that happens when you use exponential notation.
The step which actually performs the operation can result in another kind of error: overflows can occur in floating point arithmetic as well as in fixed, but they are detected in the exponent rather than the significand.
www.rwc.uc.edu /koehler/comath/14.html
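
The imaginary format above has the same field widths as IEEE 754 half precision (1 sign bit, 5 exponent bits, 10 significand bits). Assuming the two behave alike, which the linked page does not state, Python's `struct` format code `e` gives a quick way to see the fields and to see overflow surface in the exponent. The helper names are hypothetical.

```python
import struct


def half_fields(x: float):
    """Return (sign, biased exponent, significand bits) of x as a 16-bit float."""
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    return bits >> 15, (bits >> 10) & 0x1F, bits & 0x3FF


def fits_in_half(x: float) -> bool:
    """True if x can be packed as a finite 16-bit float."""
    try:
        struct.pack("<e", x)
        return True
    except (OverflowError, struct.error):
        # The 5-bit exponent cannot represent a magnitude this large.
        return False


print(half_fields(1.0))       # (0, 15, 0)
print(fits_in_half(65504.0))  # True: largest finite half-precision value
print(fits_in_half(70000.0))  # False
```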

  
 Floating-point Arithmetic Precision
However, the C language requires floating point values to be truncated when a conversion to an integer is involved.
The -pcn option causes the compiler to change the floating point precision control when the main() function is compiled.
Unfortunately, the overflow point can be far away from the point of the actual bug.
wwwmath.uni-muenster.de /~wuebbel/docs/c_ug/lin1065.htm
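
The first snippet concerns C, but the same truncating conversion is easy to observe in Python, whose `int()` also discards the fractional part toward zero. The sketch below contrasts it with flooring and rounding; `conversions` is an illustrative helper, not part of any compiler documentation.

```python
import math


def conversions(x: float):
    """Return (truncate, floor, round-to-nearest) integer conversions of x."""
    return int(x), math.floor(x), round(x)


print(conversions(2.7))   # (2, 2, 3)
print(conversions(-2.7))  # (-2, -3, -3)
```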

  
 .NET Undocumented: Floating Point Arithmetic, II
When comparing two floating point values, even if they are bitwise equal, it is important to check that the difference is smaller than some small magnitude.
Arithmetic rounding, on the other hand, will result in a growing divergence between the addition of pre-rounded numbers and that of unrounded numbers as more values are added.
In addition, certain special floating values such as infinities have no special counterpart in the int or long data type; a value of type double is the only satisfactory return value that will not cause an error.
wesnerm.blogs.com /net_undocumented/2005/01/floating_point__1.html
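
The divergence described in the second snippet can be sketched with Python's `decimal` module. The test data (values ending in .5) and the helper name `drift` are assumptions chosen to make the effect visible; "arithmetic rounding" is taken here to mean round-half-up.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN


def drift(n: int, mode: str) -> Decimal:
    """Sum of pre-rounded values minus sum of unrounded values."""
    values = [Decimal(k) + Decimal("0.5") for k in range(n)]  # 0.5, 1.5, 2.5, ...
    exact = sum(values)
    rounded = sum(v.quantize(Decimal(1), rounding=mode) for v in values)
    return rounded - exact


print(drift(10, ROUND_HALF_UP))     # 5.0
print(drift(100, ROUND_HALF_UP))    # 50.0
print(drift(100, ROUND_HALF_EVEN))  # 0.0
```

With round-half-up the error grows with the number of values added, while round-half-even cancels on this data.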

  
 Citations: A proposed radix- and word-length-independent standard for floating-point arithmetic - Cody, Coonen, Gay, Hanson (ResearchIndex)
For example, rather than insist that floating point addition obey $x \oplus y = fl[x + y] = (x + y)(1 + \delta)$, where $\delta$ is some deterministically defined value (such as the relative error incurred by rounding to the nearest floating point number, as in IEEE floating point arithmetic
The IEEE 754 standard is for binary floating point numbers and specifies bit level layout while IEEE 854 covers both binary and decimal floating point numbers but does not specify encodings for....
It is not the purpose of this paper to argue that the IEEE standard is the best possible floating point standard but rather to accept the standard as given and provide an introduction to its use.
citeseer.ist.psu.edu /context/226152/0

  
 Floating-Point Arithmetic
A normalized radix 10 floating-point number has its decimal point just to the left of the first non-zero digit in the mantissa.
For each bytecode that performs arithmetic on floats, there is a corresponding bytecode that performs the same operation on doubles.
A normalized mantissa has its binary point (the base-two equivalent of a decimal point) just to the left of the most significant non-zero digit.
www.artima.com /underthehood/floating.html
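
The normalization described in the last snippet matches what Python's `math.frexp` returns: a mantissa in [0.5, 1) and a power-of-two exponent, so the binary point sits just to the left of the most significant non-zero bit. This is only an illustration in Python, not the JVM's internal representation.

```python
import math

for x in (6.0, 0.15625):
    mantissa, exponent = math.frexp(x)
    # x == mantissa * 2**exponent, with 0.5 <= mantissa < 1
    print(x, mantissa, exponent)

print(math.frexp(6.0))  # (0.75, 3), and 0.75 is 0.11 in binary
```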

  
 Floating point - Wikipedia, the free encyclopedia
Floating point representation is more likely to be appropriate when proportional accuracy over a range of scales is needed.
Rounding errors: unlike the fixed-point counterpart, the application of dither in a floating point environment is nearly impossible.
This website can be used to calculate the floating point representation of a decimal number.
www.wikipedia.org /wiki/Floating_point

  
 Fabulous Adventures In Coding : Fun With Floating Point Arithmetic, Part Three
Of course you pointed us to a standard which allows the rounding to be wrong, and efforts to provide nice "round trip" conversions do require rounding to be wrong in some cases, but anyone who doesn't understand floating point well enough will surely not understand these points, and we might expect them to complain.
If floating point was obvious and easy to understand then there is no way Eric could milk it for several long blog entries.
After my boss told me that the original programmer was not a programmer and did not understand floating point, I understood to take the original code as a rough model instead of a spec.
blogs.msdn.com /ericlippert/archive/2005/01/17/354658.aspx

  
 IEEE Arithmetic
Assume you convert a decimal floating point number a to its IEEE single format binary representation b, and then translate b back to a decimal number c; how many orders of magnitude are between a and a - c?
Remember that this set of numerical values is described as a set of binary floating-point numbers.
For the single format, the difference between a normal number and a subnormal number is that the leading bit of the significand (the bit to left of the binary point) of a normal number is 1, whereas the leading bit of the significand of a subnormal number is 0.
docs.sun.com /source/806-3568/ncg_math.html
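
The round-trip question in the first snippet can be explored numerically. This sketch starts from a Python float (already a double, so `a` is itself only an approximation of the decimal 0.1) and uses `struct` to round it to single precision; `to_single` is a hypothetical helper name.

```python
import math
import struct


def to_single(x: float) -> float:
    """Round x to the nearest IEEE single value and return it as a Python float."""
    return struct.unpack("f", struct.pack("f", x))[0]


a = 0.1
c = to_single(a)
orders = math.log10(abs(a / (a - c)))  # orders of magnitude between a and a - c

print(c)  # 0.10000000149011612
print(orders)
```

For this input the difference is between seven and eight orders of magnitude below `a`, in line with the roughly seven decimal digits single precision is usually said to carry.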

  
 General Decimal Arithmetic
The arithmetic permits a single representation of decimal numbers, whether they be integers, fixed-point (scaled), or floating-point; this minimizes conversion overheads.
The arithmetic was designed as a decimal extended floating-point arithmetic, directly implementing the rules that people are taught at school.
The working precision of the arithmetic is not determined by the representation, but is freely selectable within the limits of the representation as required for the problem being solved.
www2.hursley.ibm.com /decimal

  
 Floating Point Arithmetic
Since floating point numbers may be 32 bits or 64 bits, we need a way to manipulate them on the stack without knowing their size.
Floating point numbers will always be converted using base 10 (decimal) regardless of the value of BASE.
This version of SWAP assumes that the stack has an integer, N, and a float, F. It is defined differently for different width floating point numbers.
web.tampabay.rr.com /jforth/C09_Floating.html

  
 Fast Robust Predicates for Computational Geometry
Unfortunately, traditional libraries for arbitrary precision floating-point arithmetic are quite slow, and can reduce the speed of an application by one or two orders of magnitude.
If these coordinates are expressed as single or double precision floating-point numbers, roundoff error may lead to an incorrect result when the true determinant is near zero.
The orientation test determines whether a point lies to the left of, to the right of, or on a line or plane defined by other points.
www.cs.cmu.edu /~quake/robust.html
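
A minimal sketch of the 2D orientation test, written as the sign of a determinant. The exact version below uses Python's `fractions.Fraction`, which is a simple stand-in and not the adaptive-precision technique of the linked work; the function names are assumptions.

```python
from fractions import Fraction


def orient2d_float(a, b, c) -> float:
    """Determinant in ordinary floating point; its sign may be wrong near zero."""
    return (a[0] - c[0]) * (b[1] - c[1]) - (a[1] - c[1]) * (b[0] - c[0])


def orient2d_exact(a, b, c) -> int:
    """Return +1 if c lies left of the line a->b, -1 if right, 0 if on it."""
    # Fraction(float) converts each coordinate exactly, so no roundoff occurs.
    ax, ay, bx, by, cx, cy = map(Fraction, (*a, *b, *c))
    det = (ax - cx) * (by - cy) - (ay - cy) * (bx - cx)
    return (det > 0) - (det < 0)


print(orient2d_exact((0, 0), (1, 0), (0, 1)))   # 1  (left of the line)
print(orient2d_exact((0, 0), (1, 0), (0, -1)))  # -1 (right of the line)
print(orient2d_exact((0, 0), (1, 1), (2, 2)))   # 0  (on the line)
```

Exact rational arithmetic is slow, which is the cost the first snippet refers to.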

  
 754story.html
Anarchy, among floating-point arithmetics implemented in software on microprocessors, impelled Dr. Robert Stewart to convene meetings in an attempt to reach a consensus under the aegis of the IEEE.
Palmer, now the manager of Intel's floating-point effort, recruited Kahan as a consultant to help design the arithmetic for the i432 (which died later) and for the i8086/8's upcoming i8087 coprocessor.
In 1976 Intel began to design a floating-point co-processor for its i8086/8 and i432 microprocessors.
www.cs.berkeley.edu /~wkahan/ieee754status/754story.html

  
 2005 GCC Developers' Summit, June 21 ~ 24, 2005, Ottawa, Canada
To examine the emerging C extension and decimal floating point arithmetic (and in absence of real hardware), the decNumber package from Mike Cowlishaw has been integrated into GCC to provide decimal arithmetic operations.
This paper examines the IEEE 754r proposed encoding and operations, the C language extension, as well as, the experiences encountered in implementing the C extension via integration of decNumber to implement the actual decimal floating point arithmetic.
Historically, floating point hardware has tended towards binary floating-point arithmetic.
www.gccsummit.org /2005/view_abstract.php?content_key=31

  
 fpbiblio.txt
There is also some material on integer arithmetic in this list as some of the methods used with integer arithmetic contain interesting ideas that may be useful in the realization of a floating point arithmetic package.
My main interest was in software based, binary floating point arithmetic on a microprocessor, so you won't find much material about the hardware used in floating point arithmetic (e.g.
FPBIBLIO.TXT version 1.8, 18-August-1994 This is a bibliography of material on floating point arithmetic that I came up with while doing research on a floating-point package of my own.
www.textfiles.com /programming/fpbiblio.txt

  
 B. Floating Point Arithmetic: Issues and Limitations
While pathological cases do exist, for most casual use of floating-point arithmetic you'll see the result you expect in the end if you simply round the display of your final results to the number of decimal digits you expect.
The errors in Python float operations are inherited from the floating-point hardware, and on most machines are on the order of no more than 1 part in 2**53 per operation.
Note that this is in the very nature of binary floating-point: this is not a bug in Python, it is not a bug in your code either.
docs.python.org /tut/node16.html
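
The behaviour these snippets describe is easy to reproduce. The sketch assumes IEEE 754 doubles, which is what Python floats are on most machines.

```python
from decimal import Decimal

total = 0.1 + 0.2

print(total == 0.3)     # False
print(total)            # 0.30000000000000004
print(f"{total:.2f}")   # 0.30, rounding the display hides the error
print(Decimal(0.1))     # the exact binary value stored for 0.1

# The relative error of the sum stays within a couple of units of 2**-53.
print(abs(total - 0.3) / 0.3 < 2 ** -52)  # True
```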

  
 IEEE floating-point representations of real numbers
Only the part of the mantissa that comes after the binary point is actually stored, since the bit to the left of the binary point is completely predictable (it's always 1, since the mantissa is always greater than or equal to one and less than two).
that precedes the binary point is once again a ``hidden bit.'' As in single-precision representations, the all-zero exponent is used for unnormalized numbers and (with an all-zero mantissa) for 0, and the all-one exponent is used for the pseudo-numbers positive infinity, negative infinity, and NaN.
(The ``decimal point'' in this case is actually a ``binary point,'' separating the digit in the units place from the digits representing multiples of negative powers of two.)
www.math.grin.edu /~stone/courses/fundamentals/IEEE-reals.html
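
The special exponent patterns mentioned above can be checked for double precision with a small classifier. It assumes the standard 1/11/52 bit split of an IEEE double; `classify` is a hypothetical helper, and "subnormal" is used for what the snippet calls unnormalized numbers.

```python
import struct


def classify(x: float) -> str:
    """Classify a double by its exponent and fraction fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    exponent = (bits >> 52) & 0x7FF      # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)    # 52 stored bits; the leading 1 is hidden
    if exponent == 0:
        return "zero" if fraction == 0 else "subnormal"
    if exponent == 0x7FF:
        return "infinity" if fraction == 0 else "NaN"
    return "normal"


for value in (0.0, 5e-324, 1.0, float("inf"), float("nan")):
    print(value, classify(value))
```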

  
 Anand Jain
To reduce the number of arithmetic operations in the coder, we can use string manipulation techniques to replace the last digits of the floating point number.
In the new method developed by me floating point arithmetic is used.
Arithmetic coding on the other hand gives the result in the form of a very long continuous number.
www.anand-jain.8m.com

  
 IEEE floating-point standard - Wikipedia, the free encyclopedia
The IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754) is the most widely-used standard for floating-point computation, and is followed by many CPU and FPU implementations.
IEEE 754 specifies four formats for representing floating-point values: single-precision (32-bit), double-precision (64-bit), single-extended precision (≥ 43-bit, not commonly used) and double-extended precision (≥ 79-bit, usually implemented with 80 bits).
In other words, two positive floating-point numbers (known not to be NaNs) can be compared with an unsigned binary integer comparison using the same bits, providing the floating-point numbers use the same byte order (this ordering, therefore, cannot be used in portable code through a union in the C programming language).
en.wikipedia.org /wiki/IEEE_floating-point_standard
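
The ordering property in the last snippet can be demonstrated from Python. Packing with an explicit big-endian format avoids the byte-order issue the snippet mentions; `bits_of` and the sample values are assumptions for illustration.

```python
import struct


def bits_of(x: float) -> int:
    """Reinterpret the 64 bits of a double as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


values = [3.5, 0.0, 1e-300, 2.0, float("inf"), 1e300]  # non-negative, no NaNs

# Sorting by bit pattern gives the same order as sorting by value.
print(sorted(values, key=bits_of) == sorted(values))  # True

# The property does not extend to negative numbers: the sign bit is the top bit.
print(bits_of(-1.0) > bits_of(1.0))  # True
```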

  
 Floating point unit
Typical operations are floating point arithmetic (such as addition and multiplication), but some systems may be capable of performing exponential or trigonometric calculations as well (such as square roots or cosines).
Floating point addition and multiplication operations are typically pipelined, but more complicated operations, like division, may not be, and some systems may even have a dedicated floating point divider circuit.
In some computer architectures, floating point operations are handled completely separate from integer operations, with dedicated floating point registers and independent clocking schemes.
www.therfcc.org /floating-point-unit-68135.html

Copyright © 2005-2007 www.factbites.com Usage implies agreement with terms.