Real Number Data Type in Java

In Java, real numbers (numbers with a fractional part) are represented using floating-point data types. Java provides two primitive floating-point types: float and double.

Two types exist because applications need different levels of precision. Depending on the required accuracy, programmers can choose either float (less precision, 4 bytes) or double (higher precision, 8 bytes).

General Formula for IEEE-754 Floating Point Representation:

Value = (-1)^sign × (1 + mantissa) × 2^(exponent − bias)

sign → 0 (positive), 1 (negative)
mantissa → fractional binary part (without the hidden leading 1, since it’s assumed)
exponent → stored exponent value (in binary)
bias → 127 for float (32-bit), 1023 for double (64-bit)

1. float

Syntax:

float a;

Size: 4 bytes = 32 bits
Format: IEEE-754 Single Precision Floating Point

Bit distribution:

1 bit → Sign (0 = positive, 1 = negative)
8 bits → Exponent (stored in Excess-127 form, i.e., actual exponent = stored value − 127)
23 bits → Mantissa (fractional part, with a hidden leading 1)

Formula:

Value = (-1)^sign × (1 + mantissa_23bits) × 2^(exponent − 127)
Range: ~ 1.4 × 10⁻⁴⁵ (smallest positive value) to 3.4 × 10³⁸ (largest finite value)
Precision: 6–7 significant decimal digits
Default behavior: Decimal literals are treated as double by default in Java.
To assign one to a float, you must either use the suffix f/F or cast explicitly.
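
The properties above can be checked with a short sketch. The class name FloatBasics and its helper methods are illustrative names, not part of any library; only Float.MAX_VALUE and Float.MIN_VALUE come from the JDK.

```java
// Illustrative sketch: FloatBasics and its helpers are hypothetical names.
public class FloatBasics {
    // Largest finite float value.
    static float largest() { return Float.MAX_VALUE; }

    // Smallest positive float value.
    static float smallest() { return Float.MIN_VALUE; }

    // A float keeps 24 significant bits, so 2^24 + 1 cannot be stored
    // exactly and rounds to 2^24.
    static boolean losesOddInteger() {
        float a = 16_777_216f; // 2^24
        float b = 16_777_217f; // 2^24 + 1, rounded when stored
        return a == b;
    }

    public static void main(String[] args) {
        // float bad = 1.5;            // does not compile: 1.5 is a double literal
        float withSuffix = 1.5f;       // f suffix makes the literal a float
        float withCast = (float) 1.5;  // explicit cast from double
        System.out.println(withSuffix == withCast); // true
        System.out.println(largest());              // 3.4028235E38
        System.out.println(smallest());             // 1.4E-45
        System.out.println(losesOddInteger());      // true
    }
}
```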

Manual Conversion Example (float 5.75f)

Convert 5.75 into binary:

5.75 = 101.11 (binary)
     = 1.0111 × 2²
Sign = 0 (positive)
Exponent = 129 (stored) → 129 − 127 = 2 (actual)
Mantissa = .0111 (stored as 0111 followed by 19 zeros)

Value = (-1)⁰ × (1.0111 in binary) × 2² = 1.4375 × 4 = 5.75
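
The manual conversion can be cross-checked in code. The following is a minimal sketch that splits a float into its three fields using Float.floatToIntBits and then rebuilds the value with the formula. The class FloatFields and its methods are hypothetical names, and rebuild assumes a normalized value (not zero, subnormal, infinity, or NaN).

```java
// Illustrative sketch: FloatFields and its methods are hypothetical names.
public class FloatFields {
    // Top bit: 0 = positive, 1 = negative.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Next 8 bits: the exponent as stored (actual exponent + 127).
    static int storedExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Low 23 bits: the fractional part without the hidden leading 1.
    static int mantissaBits(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    // Applies (-1)^sign × (1 + mantissa) × 2^(exponent − 127).
    // Assumes a normalized input value.
    static double rebuild(float f) {
        double fraction = mantissaBits(f) / (double) (1 << 23);
        double magnitude = (1.0 + fraction) * Math.pow(2, storedExponent(f) - 127);
        return sign(f) == 0 ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        float x = 5.75f;
        System.out.println(Integer.toHexString(Float.floatToIntBits(x))); // 40b80000
        System.out.println(sign(x));           // 0
        System.out.println(storedExponent(x)); // 129
        System.out.println(rebuild(x));        // 5.75
    }
}
```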

Java Example:

float pi = 3.1415926f;
System.out.println(pi); // prints 3.1415925 (approximation)

// Or using a typecast (a different variable name is needed, since pi is already declared above):
float piCast = (float) 3.1415926;
System.out.println(piCast); // prints 3.1415925 (approximation)

2. double

Syntax:

double a;

Size: 8 bytes = 64 bits
Format: IEEE-754 Double Precision Floating Point

Bit distribution:

1 bit → Sign
11 bits → Exponent (stored in Excess-1023 form, i.e., actual exponent = stored value − 1023)
52 bits → Mantissa (fractional part, with a hidden leading 1)

Formula:

Value = (-1)^sign × (1 + mantissa_52bits) × 2^(exponent − 1023)
Range: ~ 4.9 × 10⁻³²⁴ to 1.7 × 10³⁰⁸
Precision: 15–16 significant decimal digits
Default behavior: Decimal literals are treated as double by default, so suffix d/D is optional.
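
To make the precision difference concrete, the sketch below widens a float to a double, which exposes the approximation the float actually stores. The class name FloatVsDouble and the helper widen are illustrative only.

```java
// Illustrative sketch: FloatVsDouble and widen are hypothetical names.
public class FloatVsDouble {
    // Widening float to double is exact, so it reveals the stored float value.
    static double widen(float f) {
        return f;
    }

    public static void main(String[] args) {
        float f = 0.1f;
        double d = 0.1;      // no suffix needed; 0.1d would mean the same
        System.out.println(widen(f));         // 0.10000000149011612
        System.out.println(d);                // 0.1
        System.out.println(widen(f) == d);    // false
        System.out.println(Double.MAX_VALUE); // 1.7976931348623157E308
        System.out.println(Double.MIN_VALUE); // 4.9E-324
    }
}
```

Both variables hold approximations of 0.1, but the double approximation is much closer, which is why the two values compare as unequal.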

Manual Conversion Example (double 5.75)

5.75 = 101.11 (binary)
     = 1.0111 × 2²
Sign = 0 (positive)
Exponent = 1025 (stored) → 1025 − 1023 = 2 (actual)
Mantissa = .0111 (stored as 0111 followed by 48 zeros)

Value = (-1)⁰ × (1.0111 in binary) × 2² = 1.4375 × 4 = 5.75
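
The same fields can be read back from a double with Double.doubleToLongBits. This is a minimal sketch: the class DoubleFields and its methods are hypothetical names, while Math.getExponent is a JDK method that returns the actual (unbiased) exponent.

```java
// Illustrative sketch: DoubleFields and its methods are hypothetical names.
public class DoubleFields {
    // Top bit: 0 = positive, 1 = negative.
    static long sign(double d) {
        return Double.doubleToLongBits(d) >>> 63;
    }

    // Next 11 bits: the exponent as stored (actual exponent + 1023).
    static long storedExponent(double d) {
        return (Double.doubleToLongBits(d) >>> 52) & 0x7FFL;
    }

    // Low 52 bits: the fractional part without the hidden leading 1.
    static long mantissaBits(double d) {
        return Double.doubleToLongBits(d) & 0xFFFFFFFFFFFFFL;
    }

    public static void main(String[] args) {
        double x = 5.75;
        System.out.println(Long.toHexString(Double.doubleToLongBits(x))); // 4017000000000000
        System.out.println(sign(x));             // 0
        System.out.println(storedExponent(x));   // 1025
        System.out.println(Math.getExponent(x)); // 2
    }
}
```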

Java Example:

double pi = 3.141592653589793; // no suffix needed
System.out.println(pi); // prints 3.141592653589793 (more accurate)

Precision Limitation of float & double

Most decimal fractions cannot be stored exactly in a float or double; each is stored as the nearest binary fraction. As a result, arithmetic on floats/doubles often introduces small round-off errors.

Example of error:

System.out.println(0.1 + 0.2);

Output:

0.30000000000000004

Explanation:
This happens because 0.1 and 0.2 cannot be exactly represented in binary, leading to small inaccuracies.
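
A common way to cope with this is to compare floating-point results within a tolerance instead of using ==. The sketch below also prints the exact value a double stores for 0.1, using the BigDecimal(double) constructor. The class RoundOffDemo, the helper names, and the tolerance 1e-9 are illustrative choices, not fixed rules.

```java
import java.math.BigDecimal;

// Illustrative sketch: RoundOffDemo and its helpers are hypothetical names.
public class RoundOffDemo {
    // True when a and b differ by no more than the given tolerance.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    // Exact decimal expansion of the binary value stored in d.
    static String exactValue(double d) {
        return new BigDecimal(d).toString();
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);                  // false
        System.out.println(nearlyEqual(sum, 0.3, 1e-9)); // true
        // Prints 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(exactValue(0.1));
    }
}
```

A suitable tolerance depends on the magnitude of the values involved, so 1e-9 here is only an assumption for small numbers like these.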

Bonus Point: BigDecimal (Class)

When exact decimal precision is required (e.g., financial applications or precision-sensitive scientific calculations), Java provides the BigDecimal class (available since Java 1.1).

Package: java.math (fully qualified name: java.math.BigDecimal)
Size: No fixed limit (depends on memory)
Usage: For calculations where accuracy matters more than performance

Example:

import java.math.BigDecimal;

public class BigDecimalExample {
    public static void main(String[] args) {
        BigDecimal num1 = new BigDecimal("0.1");
        BigDecimal num2 = new BigDecimal("0.2");
        BigDecimal sum = num1.add(num2);

        System.out.println("Accurate Sum: " + sum);
    }
}

Output:

Accurate Sum: 0.3
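
A few details are worth knowing when using BigDecimal in practice. The sketch below shows three of them: building values from strings (or BigDecimal.valueOf) instead of the double constructor, supplying a scale and rounding mode for divisions that do not terminate, and comparing with compareTo. The class name BigDecimalPitfalls, the helper third, and the scale of 4 are illustrative choices.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative sketch: BigDecimalPitfalls and third are hypothetical names.
public class BigDecimalPitfalls {
    // 1/3 has no finite decimal expansion, so divide needs a scale and a
    // rounding mode; without them it throws ArithmeticException.
    static BigDecimal third() {
        return BigDecimal.ONE.divide(new BigDecimal("3"), 4, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // The String constructor and valueOf both give exactly 0.1;
        // new BigDecimal(0.1) would copy the double's binary approximation.
        System.out.println(new BigDecimal("0.1")); // 0.1
        System.out.println(BigDecimal.valueOf(0.1)); // 0.1

        System.out.println(third()); // 0.3333

        // equals also compares scale; compareTo compares numeric value only.
        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");
        System.out.println(a.equals(b));         // false
        System.out.println(a.compareTo(b) == 0); // true
    }
}
```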

Summary:

float → 4 bytes, 6–7 digit precision, decimal literals require the f/F suffix or a typecast.
double → 8 bytes, 15–16 digit precision, default for decimal literals (d/D suffix optional).
BigDecimal → Arbitrary precision, slower, but exact (used in financial/scientific apps).

                
