Home » Java » Core Java » Using a memory mapped file for a huge matrix

About Peter Lawrey

Using a memory mapped file for a huge matrix


Matrices can be really large, sometimes larger than you can hold in one array. You can extend the maximum size by having multiple arrays however this can make your heap size really large and inefficient. An alternative is to use a wrapper over a memory mapped file. The advantage of memory mapped files is that they have very little impact on the heap and can be swapped in and out by the OS fairly transparently.

A huge matrix

This code supports large matrices of double. It partitions the file into 1 GB mappings. (As Java doesn’t support mappings of 2 GB or more at a time, a pet hate of mine ;)

import sun.misc.Cleaner;
import sun.nio.ch.DirectBuffer;

import java.io.Closeable;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;

public class LargeDoubleMatrix implements Closeable {
    private static final int MAPPING_SIZE = 1 << 30;
    private final RandomAccessFile raf;
    private final int width;
    private final int height;
    private final List mappings = new ArrayList();

    public LargeDoubleMatrix(String filename, int width, int height) throws IOException {
        this.raf = new RandomAccessFile(filename, "rw");
        try {
            this.width = width;
            this.height = height;
            long size = 8L * width * height;
            for (long offset = 0; offset < size; offset += MAPPING_SIZE) {
                long size2 = Math.min(size - offset, MAPPING_SIZE);
                mappings.add(raf.getChannel().map(FileChannel.MapMode.READ_WRITE, offset, size2));
        } catch (IOException e) {
            throw e;

    protected long position(int x, int y) {
        return (long) y * width + x;

    public int width() {
        return width;

    public int height() {
        return height;

    public double get(int x, int y) {
        assert x >= 0 && x < width;
        assert y >= 0 && y < height;
        long p = position(x, y) * 8;
        int mapN = (int) (p / MAPPING_SIZE);
        int offN = (int) (p % MAPPING_SIZE);
        return mappings.get(mapN).getDouble(offN);

    public void set(int x, int y, double d) {
        assert x >= 0 && x < width;
        assert y >= 0 && y < height;
        long p = position(x, y) * 8;
        int mapN = (int) (p / MAPPING_SIZE);
        int offN = (int) (p % MAPPING_SIZE);
        mappings.get(mapN).putDouble(offN, d);

    public void close() throws IOException {
        for (MappedByteBuffer mapping : mappings)

    private void clean(MappedByteBuffer mapping) {
        if (mapping == null) return;
        Cleaner cleaner = ((DirectBuffer) mapping).cleaner();
        if (cleaner != null) cleaner.clean();

public class LargeDoubleMatrixTest {
    public void getSetMatrix() throws IOException {
        long start = System.nanoTime();
        final long used0 = usedMemory();
        LargeDoubleMatrix matrix = new LargeDoubleMatrix("ldm.test", 1000 * 1000, 1000 * 1000);
        for (int i = 0; i < matrix.width(); i++)
            matrix.set(i, i, i);
        for (int i = 0; i < matrix.width(); i++)
            assertEquals(i, matrix.get(i, i), 0.0);
        long time = System.nanoTime() - start;
        final long used = usedMemory() - used0;
        if (used == 0)
            System.err.println("You need to use -XX:-UseTLAB to see small changes in memory usage.");
        System.out.printf("Setting the diagonal took %,d ms, Heap used is %,d KB%n", time / 1000 / 1000, used / 1024);

    private long usedMemory() {
        return Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();

With the following test, which writes to each of the diagonal values of a one million * one million matrix. This is too large to hope to create on the heap.

Setting the diagonal took 314,819 ms, Heap used is 2,025 KB

$ ls -l ldm.test
-rw-rw-r-- 1 peter peter 8000000000000 2011-12-30 12:42 ldm.test
$ du -s ldm.test 
4010600 ldm.test

That’s 8,000,000,000,000 bytes or ~7.3 TB in virtual memory, in a Java process! This works because it only allocates or pages in the pages which you use. So while the file size is almost 8 TB, the actual disk space and memory used is 4 GB.
With a more modest file size of 100K * 100K matrix you see something like the following. Its still an 80 GB matrix, which uses trivial heap space. ;)

Setting the diagonal took 110 ms, Heap used is 71 KB

$ ls -l ldm.test
-rw-rw-r-- 1 peter peter 80000000000 2011-12-30 12:49 ldm.test
$ du -s ldm.test 
400000 ldm.test

Reference: Using a memory mapped file for a huge matrix   from our JCG partner Peter Lawrey  at the Vanilla Java blog

Related Articles :

Do you want to know how to develop your skillset to become a Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you our best selling eBooks for FREE!


1. JPA Mini Book

2. JVM Troubleshooting Guide

3. JUnit Tutorial for Unit Testing

4. Java Annotations Tutorial

5. Java Interview Questions

6. Spring Interview Questions

7. Android UI Design


and many more ....



Leave a Reply

Your email address will not be published. Required fields are marked *


Want to take your Java skills to the next level?

Grab our programming books for FREE!

Here are some of the eBooks you will get:

  • Advanced Java Guide
  • Java Design Patterns
  • JMeter Tutorial
  • Java 8 Features Tutorial
  • JUnit Tutorial
  • JSF Programming Cookbook
  • Java Concurrency Essentials