Floating-point Functions

This chapter describes the GMP functions for performing floating point arithmetic. These functions start with the prefix `mpf_`.

GMP floating point numbers are stored in objects of type `mpf_t`.

The GMP floating-point functions have an interface that is similar to the GMP integer functions.

There is one significant characteristic of floating-point numbers that has motivated a difference between this function class and other GMP function classes: the inherent inexactness of floating point arithmetic. The user has to specify the precision of each variable. A computation that assigns a variable will take place with the precision of the assigned variable; the precision of variables used as input is ignored.

The precision of a calculation is defined as follows: Compute the requested operation exactly (with "infinite precision"), and truncate the result to the destination variable precision. Even if the user has asked for a very high precision, GMP will not calculate with superfluous digits. For example, if two low-precision numbers of nearly equal magnitude are added, the precision of the result will be limited to what is required to represent the result accurately.

The GMP floating-point functions are not intended as a smooth extension to the IEEE P754 arithmetic. Specifically, the results obtained on one computer often differ from the results obtained on a computer with a different word size.

Initialization Functions

Function: void mpf_set_default_prec (unsigned long int prec)
Set the default precision to be at least prec bits. All subsequent calls to `mpf_init` will use this precision, but previously initialized variables are unaffected.

An `mpf_t` object must be initialized before storing the first value in it. The functions `mpf_init` and `mpf_init2` are used for that purpose.

Function: void mpf_init (mpf_t x)
Initialize x to 0. Normally, a variable should be initialized once only or at least be cleared, using `mpf_clear`, between initializations. The precision of x is undefined unless a default precision has already been established by a call to `mpf_set_default_prec`.

Function: void mpf_init2 (mpf_t x, unsigned long int prec)
Initialize x to 0 and set its precision to be at least prec bits. Normally, a variable should be initialized once only or at least be cleared, using `mpf_clear`, between initializations.

Function: void mpf_clear (mpf_t x)
Free the space occupied by x. Make sure to call this function for all `mpf_t` variables when you are done with them.

Here is an example of how to initialize floating-point variables:

```
{
mpf_t x, y;
mpf_init (x);			/* use default precision */
mpf_init2 (y, 256);		/* precision at least 256 bits */
...
/* Unless the program is about to exit, do ... */
mpf_clear (x);
mpf_clear (y);
}
```

The following three functions are useful for changing the precision during a calculation. A typical use would be for adjusting the precision gradually in iterative algorithms like Newton-Raphson, making the computation precision closely match the actual accurate part of the numbers.

Function: void mpf_set_prec (mpf_t rop, unsigned long int prec)
Set the precision of rop to be at least prec bits. Since changing the precision involves calls to `realloc`, this routine should not be called in a tight loop.

Function: unsigned long int mpf_get_prec (mpf_t op)
Return the precision actually used for assignments of op.

Function: void mpf_set_prec_raw (mpf_t rop, unsigned long int prec)
Set the precision of rop to be at least prec bits. This is a low-level function that does not change the allocation. The prec argument must not be larger than the precision previously returned by `mpf_get_prec`. It is crucial that the precision of rop is ultimately reset to exactly the value returned by `mpf_get_prec` before the first call to `mpf_set_prec_raw`.

Assignment Functions

These functions assign new values to already initialized floats (see section Initialization Functions).

Function: void mpf_set (mpf_t rop, mpf_t op)
Function: void mpf_set_ui (mpf_t rop, unsigned long int op)
Function: void mpf_set_si (mpf_t rop, signed long int op)
Function: void mpf_set_d (mpf_t rop, double op)
Function: void mpf_set_z (mpf_t rop, mpz_t op)
Function: void mpf_set_q (mpf_t rop, mpq_t op)
Set the value of rop from op.

Function: int mpf_set_str (mpf_t rop, char *str, int base)
Set the value of rop from the string in str. The string is of the form `M@N' or, if the base is 10 or less, alternatively `MeN'. `M' is the mantissa and `N' is the exponent. The mantissa is always in the specified base. The exponent is either in the specified base or, if base is negative, in decimal.

The argument base may be in the ranges 2 to 36, or -36 to -2. Negative values are used to specify that the exponent is in decimal.

Unlike the corresponding `mpz` function, the base will not be determined from the leading characters of the string if base is 0. This is so that numbers like `0.23' are not interpreted as octal.

White space is ignored at the beginning of the string and within the mantissa, but not in other places, such as after a minus sign or in the exponent. For example, "3 14" is currently accepted as meaning 314. [We are considering changing the definition of this function, making it fail when there is any white space in the input. Please tell us your opinion about this change.]

This function returns 0 if the entire string up to the '\0' is a valid number in base base. Otherwise it returns -1.

Function: void mpf_swap (mpf_t rop1, mpf_t rop2)
Swap the values rop1 and rop2 efficiently.

Combined Initialization and Assignment Functions

For convenience, GMP provides a parallel series of initialize-and-set functions which initialize the output and then store the value there. These functions' names have the form `mpf_init_set...`

Once the float has been initialized by any of the `mpf_init_set...` functions, it can be used as the source or destination operand for the ordinary float functions. Don't use an initialize-and-set function on a variable already initialized!

Function: void mpf_init_set (mpf_t rop, mpf_t op)
Function: void mpf_init_set_ui (mpf_t rop, unsigned long int op)
Function: void mpf_init_set_si (mpf_t rop, signed long int op)
Function: void mpf_init_set_d (mpf_t rop, double op)
Initialize rop and set its value from op.

The precision of rop will be taken from the active default precision, as set by `mpf_set_default_prec`.

Function: int mpf_init_set_str (mpf_t rop, char *str, int base)
Initialize rop and set its value from the string in str. See `mpf_set_str` above for details on the assignment operation.

Note that rop is initialized even if an error occurs. (I.e., you have to call `mpf_clear` for it.)

The precision of rop will be taken from the active default precision, as set by `mpf_set_default_prec`.

Conversion Functions

Function: double mpf_get_d (mpf_t op)
Convert op to a double.

Function: char * mpf_get_str (char *str, mp_exp_t *expptr, int base, size_t n_digits, mpf_t op)
Convert op to a string of digits in base base. The base may vary from 2 to 36. Generate at most n_digits significant digits, or if n_digits is 0, the maximum number of digits accurately representable by op.

If str is `NULL`, space for the mantissa is allocated using the default allocation function.

If str is not `NULL`, it should point to a block of storage large enough for the mantissa, i.e., n_digits + 2. The two extra bytes are for a possible minus sign, and for the terminating null character.

The exponent is written through the pointer expptr.

If n_digits is 0, the maximum number of digits meaningfully achievable from the precision of op will be generated. Note that the space requirements for str in this case will be impossible for the user to predetermine. Therefore, you need to pass `NULL` for the string argument whenever n_digits is 0.

The generated string is a fraction, with an implicit radix point immediately to the left of the first digit. For example, the number 3.1416 would be returned as "31416" in the string and 1 written at expptr.

A pointer to the result string is returned. This pointer will either equal str, or if that is `NULL`, will point to the allocated storage.

Arithmetic Functions

Function: void mpf_add (mpf_t rop, mpf_t op1, mpf_t op2)
Function: void mpf_add_ui (mpf_t rop, mpf_t op1, unsigned long int op2)
Set rop to op1 + op2.

Function: void mpf_sub (mpf_t rop, mpf_t op1, mpf_t op2)
Function: void mpf_ui_sub (mpf_t rop, unsigned long int op1, mpf_t op2)
Function: void mpf_sub_ui (mpf_t rop, mpf_t op1, unsigned long int op2)
Set rop to op1 - op2.

Function: void mpf_mul (mpf_t rop, mpf_t op1, mpf_t op2)
Function: void mpf_mul_ui (mpf_t rop, mpf_t op1, unsigned long int op2)
Set rop to op1 times op2.

Division is undefined if the divisor is zero, and passing a zero divisor to the divide functions will make these functions intentionally divide by zero. This lets the user handle arithmetic exceptions in these functions in the same manner as other arithmetic exceptions.

Function: void mpf_div (mpf_t rop, mpf_t op1, mpf_t op2)
Function: void mpf_ui_div (mpf_t rop, unsigned long int op1, mpf_t op2)
Function: void mpf_div_ui (mpf_t rop, mpf_t op1, unsigned long int op2)
Set rop to op1/op2.

Function: void mpf_sqrt (mpf_t rop, mpf_t op)
Function: void mpf_sqrt_ui (mpf_t rop, unsigned long int op)
Set rop to the square root of op.

Function: void mpf_pow_ui (mpf_t rop, mpf_t op1, unsigned long int op2)
Set rop to op1 raised to the power op2.

Function: void mpf_neg (mpf_t rop, mpf_t op)
Set rop to -op.

Function: void mpf_abs (mpf_t rop, mpf_t op)
Set rop to the absolute value of op.

Function: void mpf_mul_2exp (mpf_t rop, mpf_t op1, unsigned long int op2)
Set rop to op1 times 2 raised to op2.

Function: void mpf_div_2exp (mpf_t rop, mpf_t op1, unsigned long int op2)
Set rop to op1 divided by 2 raised to op2.

Comparison Functions

Function: int mpf_cmp (mpf_t op1, mpf_t op2)
Function: int mpf_cmp_ui (mpf_t op1, unsigned long int op2)
Function: int mpf_cmp_si (mpf_t op1, signed long int op2)
Compare op1 and op2. Return a positive value if op1 > op2, zero if op1 = op2, and a negative value if op1 < op2.

Function: int mpf_eq (mpf_t op1, mpf_t op2, unsigned long int op3)
Return non-zero if the first op3 bits of op1 and op2 are equal, zero otherwise. I.e., test if op1 and op2 are approximately equal.

Function: void mpf_reldiff (mpf_t rop, mpf_t op1, mpf_t op2)
Compute the relative difference between op1 and op2 and store the result in rop.

Macro: int mpf_sgn (mpf_t op)
Return +1 if op > 0, 0 if op = 0, and -1 if op < 0.

This function is actually implemented as a macro. It evaluates its argument multiple times, so the argument should not have side effects.

Input and Output Functions

The functions in this section perform input from and output to stdio streams. Passing a `NULL` pointer for a stream argument to any of these functions will make them read from `stdin` and write to `stdout`, respectively.

When using any of these functions, it is a good idea to include `stdio.h' before `gmp.h', since that will allow `gmp.h' to define prototypes for these functions.

Function: size_t mpf_out_str (FILE *stream, int base, size_t n_digits, mpf_t op)
Output op on stdio stream stream, as a string of digits in base base. The base may vary from 2 to 36. Print at most n_digits significant digits, or if n_digits is 0, the maximum number of digits accurately representable by op.

In addition to the significant digits, a leading `0.' and a trailing exponent, in the form `eNNN', are printed. If base is greater than 10, `@' will be used instead of `e' as exponent delimiter.

Return the number of bytes written, or if an error occurred, return 0.

Function: size_t mpf_inp_str (mpf_t rop, FILE *stream, int base)
Input a string in base base from stdio stream stream, and put the read float in rop. The string is of the form `M@N' or, if the base is 10 or less, alternatively `MeN'. `M' is the mantissa and `N' is the exponent. The mantissa is always in the specified base. The exponent is either in the specified base or, if base is negative, in decimal.

The argument base may be in the ranges 2 to 36, or -36 to -2. Negative values are used to specify that the exponent is in decimal.

Unlike the corresponding `mpz` function, the base will not be determined from the leading characters of the string if base is 0. This is so that numbers like `0.23' are not interpreted as octal.

Return the number of bytes read, or if an error occurred, return 0.

Miscellaneous Functions

Function: void mpf_ceil (mpf_t rop, mpf_t op)
Function: void mpf_floor (mpf_t rop, mpf_t op)
Function: void mpf_trunc (mpf_t rop, mpf_t op)
Set rop to op rounded to an integer. `mpf_ceil` rounds to the next higher integer, `mpf_floor` to the next lower, and `mpf_trunc` to the integer towards zero.

Function: void mpf_urandomb (mpf_t rop, gmp_randstate_t state, unsigned long int nbits)
Generate a uniformly distributed random float in rop, such that 0 <= rop < 1, with nbits significant bits in the mantissa.

The variable state must be initialized by calling one of the `gmp_randinit` functions (section Random State Initialization) before invoking this function.

Function: void mpf_random2 (mpf_t rop, mp_size_t max_size, mp_exp_t max_exp)
Generate a random float of at most max_size limbs, with long strings of zeros and ones in the binary representation. The exponent of the number is in the interval -max_exp to max_exp. This function is useful for testing functions and algorithms, since this kind of random number has proven to be more likely to trigger corner-case bugs. Negative random numbers are generated when max_size is negative.