Heap Dump Fundamentals

A heap dump is a snapshot of every object alive in a Java Virtual Machine at a single point in time. It captures what objects exist, how much memory each uses, and how they reference each other. HeapLens parses this snapshot and builds a model you can explore interactively.
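For context, here is a minimal sketch of one way to produce such a snapshot from inside a running application. It assumes a HotSpot-based JVM, where the `HotSpotDiagnosticMXBean` platform bean is available. The class name `HeapDumpCapture`, the temp directory, and the file name are illustrative only and are not part of HeapLens.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.InputStream;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

final class HeapDumpCapture {

    /** Writes an HPROF dump of the current JVM. With liveOnly=true, only reachable objects are dumped. */
    static void dump(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // The target file must not exist yet and should end in ".hprof".
        bean.dumpHeap(target.toString(), liveOnly);
    }

    /** Reads the null-terminated format string at the very start of the file. */
    static String readFormatString(Path dump) throws IOException {
        try (InputStream in = Files.newInputStream(dump)) {
            StringBuilder text = new StringBuilder();
            int b;
            while ((b = in.read()) > 0) { // stop at the 0 terminator or at end of stream
                text.append((char) b);
            }
            return text.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempDirectory("heapdump").resolve("snapshot.hprof");
        dump(file, true);
        System.out.println(readFormatString(file) + ", " + Files.size(file) + " bytes");
    }
}
```

Passing `true` for the second argument asks the JVM to dump only live objects, which usually triggers a full garbage collection first.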

What's Inside a Heap Dump

When the JVM writes an HPROF file, it records four categories of data:

1. Objects (Instances)

Every Java object on the heap is dumped with:

  • Its object ID (a unique identifier, typically the object's address at dump time)
  • Its class (which class this is an instance of)
  • Its field values (primitives stored inline, references stored as IDs pointing to other objects)

For example, a HashMap instance stores its size field (an int), its table field (a reference to a Node[] array), and its loadFactor (a float).
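To make the primitive-versus-reference split concrete, the sketch below uses reflection to list the instance fields that `java.util.HashMap` declares and labels each one. It only illustrates which fields a dump would store inline and which it would store as object IDs. The class name `InstanceFieldLister` is made up for this example, and this is not how HeapLens reads a dump.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class InstanceFieldLister {

    /** Maps each non-static field declared by the class to "primitive <type>" or "reference <type>". */
    static Map<String, String> describe(Class<?> type) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Field field : type.getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue; // static fields belong to the class, not to each instance
            }
            String kind = field.getType().isPrimitive() ? "primitive" : "reference";
            result.put(field.getName(), kind + " " + field.getType().getSimpleName());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints one line per field, for example "size: primitive int".
        describe(HashMap.class).forEach((name, kind) -> System.out.println(name + ": " + kind));
    }
}
```

Only fields declared directly on `HashMap` are listed. Fields inherited from `AbstractMap` would need a walk up the superclass chain.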

2. Arrays

Both object arrays (String[], Node[]) and primitive arrays (byte[], int[]) are recorded. Object arrays store a list of references to other objects. Primitive arrays store raw values.

3. Classes

Each loaded class is dumped with its:

  • Class name and superclass
  • Static fields (these behave like roots: as long as the class stays loaded, everything they reference stays alive)
  • Instance field descriptors (names and types, used to correctly parse instance data)
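The role of the instance field descriptors can be sketched as follows: an instance's data is just a run of bytes, and the ordered list of descriptors says how to slice it. This is a simplified illustration, not the HeapLens parser. It assumes 8-byte object IDs and big-endian values, and it handles only three field types. The names `InstanceDecoder`, `FieldDescriptor`, and `FieldType` are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class InstanceDecoder {

    enum FieldType { INT, FLOAT, OBJECT }

    static final class FieldDescriptor {
        final String name;
        final FieldType type;

        FieldDescriptor(String name, FieldType type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Decodes raw instance bytes by consuming one value per descriptor, in order. */
    static Map<String, Object> decode(List<FieldDescriptor> fields, ByteBuffer data) {
        Map<String, Object> values = new LinkedHashMap<>();
        for (FieldDescriptor field : fields) {
            switch (field.type) {
                case INT:
                    values.put(field.name, data.getInt());
                    break;
                case FLOAT:
                    values.put(field.name, data.getFloat());
                    break;
                case OBJECT:
                    values.put(field.name, data.getLong()); // assumed 8-byte object ID
                    break;
            }
        }
        return values;
    }

    public static void main(String[] args) {
        List<FieldDescriptor> hashMapLike = Arrays.asList(
                new FieldDescriptor("size", FieldType.INT),
                new FieldDescriptor("table", FieldType.OBJECT),
                new FieldDescriptor("loadFactor", FieldType.FLOAT));

        // ByteBuffer is big-endian by default. The values here are made up for the example.
        ByteBuffer raw = ByteBuffer.allocate(16);
        raw.putInt(2).putLong(0x7f001000L).putFloat(0.75f).flip();

        System.out.println(decode(hashMapLike, raw));
    }
}
```

Without the descriptors, the same 16 bytes could be read as four ints or two longs, which is why a class record has to be available before its instances can be parsed.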

4. GC Roots

GC roots are the starting points for reachability analysis. An object is "alive" if and only if there is a reference chain from at least one GC root to that object. Common GC root types:

Root Type        | Source
System Class     | Bootstrap classloader classes (java.lang.String, java.util.HashMap, etc.)
JNI Global       | Native code holding a global reference
Thread           | Active thread objects and their stack frames
Java Stack Frame | Local variables on active method stacks
Monitor          | Objects currently used as synchronization locks

The Object Graph

All these references form a directed graph:

GC Root (Thread)
 └─→ Controller
      ├─→ UserService
      │    └─→ HashMap (user cache)
      │         ├─→ Node[] (table)
      │         │    ├─→ Node → "alice" → User(...)
      │         │    └─→ Node → "bob" → User(...)
      │         └─→ ...
      └─→ Logger
           └─→ ArrayList (handlers)

Every arrow is a reference: one object holding a pointer to another. If you cut the arrow from Controller to UserService, everything below it (the HashMap, its entries, the User objects) becomes unreachable and eligible for garbage collection, provided nothing else references those objects.
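The effect of cutting a reference can be sketched with a small reachability walk over a simplified version of the graph above. Node labels are plain strings, the key strings are left out, and the class name `ReachabilityDemo` is invented for this example. It illustrates the idea and is not the HeapLens implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ReachabilityDemo {
    private final Map<String, List<String>> edges = new HashMap<>();

    void addReference(String from, String to) {
        edges.computeIfAbsent(from, key -> new ArrayList<>()).add(to);
    }

    void removeReference(String from, String to) {
        List<String> targets = edges.get(from);
        if (targets != null) {
            targets.remove(to);
        }
    }

    /** Breadth-first walk from the roots. Anything not visited is unreachable. */
    Set<String> reachableFrom(String... roots) {
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        for (String root : roots) {
            if (seen.add(root)) {
                queue.add(root);
            }
        }
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String next : edges.getOrDefault(current, Collections.emptyList())) {
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return seen;
    }

    /** Builds a simplified copy of the example graph. */
    static ReachabilityDemo sampleGraph() {
        ReachabilityDemo graph = new ReachabilityDemo();
        graph.addReference("Thread", "Controller");
        graph.addReference("Controller", "UserService");
        graph.addReference("Controller", "Logger");
        graph.addReference("UserService", "HashMap");
        graph.addReference("HashMap", "Node[]");
        graph.addReference("Node[]", "Node(alice)");
        graph.addReference("Node[]", "Node(bob)");
        graph.addReference("Node(alice)", "User(alice)");
        graph.addReference("Node(bob)", "User(bob)");
        graph.addReference("Logger", "ArrayList");
        return graph;
    }

    public static void main(String[] args) {
        ReachabilityDemo graph = sampleGraph();
        System.out.println(graph.reachableFrom("Thread").size()); // prints 11

        graph.removeReference("Controller", "UserService");
        System.out.println(graph.reachableFrom("Thread").size()); // prints 4
    }
}
```

After the cut, only Thread, Controller, Logger, and ArrayList remain reachable. The seven objects under UserService are no longer visited, which is what "eligible for garbage collection" means here.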

This graph is the foundation of all HeapLens analysis. The Dominator Tree simplifies this graph into a hierarchy of ownership. The Retained vs. Shallow Size concepts tell you how much memory each node in that hierarchy controls.

Why Heap Dumps Matter

Heap dumps answer questions that profilers, metrics, and logs cannot:

  • "Why is my service using 4 GB when it should use 500 MB?" — The heap dump shows exactly which objects exist and what holds them alive.
  • "Where is the memory leak?" — Leak detection algorithms find objects retaining disproportionate memory.
  • "What's wasting memory?" — Waste analysis identifies duplicate strings, empty collections, and oversized buffers, which can often be reduced with only small code changes.

A heap dump is a snapshot, not a timeline: it tells you what is, not how it got there. For that reason, heap dumps pair well with Java Flight Recorder (JFR) recordings or allocation profilers that show where objects were created.