Architecture — archetype + chunk storage¶
What you'll learn: how japes lays out components in memory, why
it picked an archetype model over sparse sets or a single table-of-
structs, how an entity's physical address is encoded in
EntityLocation, and why moving an entity between archetypes is just
a pair of swap-removes on parallel arrays. The goal is that by the
end, you can read any hot path in ecs-core and know which data
structure every field load is walking.
The three candidates¶
Before getting to what japes does, the shape of the decision itself is worth spelling out. Every ECS storage layer picks one of roughly three ways to arrange components in memory:
| Model | How it stores things | Iteration cost | Structural-change cost |
|---|---|---|---|
| Sparse set (e.g. EnTT) | One dense `T[]` per component type, plus an entity→slot redirect table | Cheapest per-component iteration; can pack to N slots where N is the number of entities with that component | Cheap add/remove — just append/swap-remove in the dense array |
| Table-of-structs | One big table, every entity has every component, unused fields default | Tightest cache layout when most entities have most components | Every add of a new component type reshapes the whole table |
| Archetype + chunk (japes, Bevy, Flecs) | Group entities by component set; each group owns a list of fixed-capacity chunks; each chunk holds one parallel column per component | Close to sparse-set per-component; still competitive on multi-component queries | Structural change = move between archetypes; pair of swap-removes |
japes picks the third. The decision is driven by three specific requirements:
- Multi-component queries must be fast. A `@System` that reads `Position` and `Velocity` together needs the two component arrays to be pointer-equal-indexable — slot `i` in the `Position[]` and slot `i` in the `Velocity[]` must describe the same entity. Sparse sets require an indirection through the entity→slot table on every second and subsequent component; archetype chunks make it a flat indexed load.
- Change detection has to be free when nobody is watching. Every chunk carries one `ChangeTracker` per component; trackers for components with no observer are flipped to `fullyUntracked = true` and become no-ops. This lines up naturally with chunk-per-archetype, because the set of observers for a component is statically resolved at plan build.
- Entity handle stability under structural change. Users hold `Entity` (a packed `long` — see `ecs-core/.../entity/Entity.java`) across ticks. The archetype model makes the handle stable through component add/remove: the entity moves between archetypes, but its `Entity` value (`index << 32 | generation`) is unchanged, and a single table update in the entity allocator re-points it.
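The packed-handle scheme above can be sketched in a few lines. This is an illustrative helper, not japes' actual `Entity` class — only the bit layout (`index << 32 | generation`) comes from the text:

```java
// Hypothetical sketch of a packed entity handle: high 32 bits are the
// slot index into the allocator's location table, low 32 bits are the
// generation that guards against stale handles.
final class PackedEntity {
    static long pack(int index, int generation) {
        return ((long) index << 32) | (generation & 0xFFFFFFFFL);
    }

    static int index(long entity) {
        return (int) (entity >>> 32); // unsigned shift recovers the index
    }

    static int generation(long entity) {
        return (int) entity;          // low 32 bits
    }
}
```

Because structural changes only rewrite the location-table entry at `index(entity)`, the `long` the user holds never changes.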
The shape of an archetype¶
An Archetype (ecs-core/.../archetype/Archetype.java) is the
container for every entity that carries exactly the same set of
component types. Its fields:
```java
final ArchetypeId id;                              // sorted set of ComponentId
final Map<ComponentId, Class<? extends Record>> componentTypes;
final int chunkCapacity;                           // how many entities per chunk
final List<Chunk> chunks = new ArrayList<>();
final Set<ComponentId> dirtyTrackedComponents;     // shared with ArchetypeGraph
final Set<ComponentId> fullyUntrackedComponents;   // shared with ArchetypeGraph
int openChunkIndex = -1;                           // lazy open-slot cache
```
Every time a new archetype is created (because some entity gained or
lost a component), the ArchetypeGraph memoises the transition in its
addEdges / removeEdges map
(ecs-core/.../archetype/ArchetypeGraph.java, ~lines 116–134) so the
next entity making the same move skips the ArchetypeId.with(...)
computation entirely. That's the "graph" in ArchetypeGraph: a lazy,
pay-as-you-go index of "add component X from this archetype goes to
that archetype." New nodes are materialised on first visit.
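The memoisation can be sketched with a tiny stand-in where an archetype id is just a sorted set of component ids. All names here are illustrative, not japes' `ArchetypeGraph` API — only the shape (compute the target id once, serve it from a map afterwards) is taken from the text:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch of the add-edge cache: the first entity making a
// given (source archetype, added component) transition pays for the
// target-id computation; every later entity hits the map.
final class EdgeCache {
    private final Map<String, TreeSet<Integer>> addEdges = new HashMap<>();

    TreeSet<Integer> addEdge(TreeSet<Integer> sourceId, int componentId) {
        String key = sourceId + "+" + componentId;
        return addEdges.computeIfAbsent(key, k -> {
            // Stand-in for ArchetypeId.with(...): copy the sorted set
            // and insert the new component id.
            TreeSet<Integer> targetId = new TreeSet<>(sourceId);
            targetId.add(componentId);
            return targetId;
        });
    }
}
```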
**Cached open chunk**
openChunkIndex used to be a linear scan on every add — O(chunks).
Profiling ParticleScenarioBenchmark (which respawns ~100 entities
per tick) showed it sitting at a measurable fraction of tick time.
It now starts at -1, is bumped on add when the selected chunk is
still non-full, and is reset to the last-modified chunk on remove.
Every live chunk is either the open one or full, so the "find the
next open slot" operation is one integer comparison. See
Archetype.findOrCreateChunkIndex at line ~106.
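A minimal sketch of the add path of this cache, with a count array standing in for real chunks (names and capacity are illustrative, and the remove-path reset to the last-modified chunk is omitted):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the open-chunk cache: every live chunk is
// either full or the cached open one, so finding a free slot is one
// integer comparison instead of an O(chunks) scan.
final class OpenChunkCache {
    static final int CAPACITY = 4;              // stand-in for chunkCapacity
    final List<int[]> chunks = new ArrayList<>(); // int[0] = entity count
    int openChunkIndex = -1;                    // starts unset

    int findOrCreateChunkIndex() {
        if (openChunkIndex >= 0 && chunks.get(openChunkIndex)[0] < CAPACITY) {
            return openChunkIndex;              // cache hit: one comparison
        }
        chunks.add(new int[] {0});              // no open chunk: create one
        openChunkIndex = chunks.size() - 1;
        return openChunkIndex;
    }

    void add() {
        chunks.get(findOrCreateChunkIndex())[0]++; // bump count on add
    }
}
```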
A chunk is parallel Object[] columns¶
Chunk (ecs-core/.../storage/Chunk.java) holds chunkCapacity
entity slots. For each component type in the archetype, the chunk
allocates one ComponentStorage — in the default configuration
(DefaultComponentStorage), this is a single reference array sized to
capacity. So one chunk with 1024 slots carrying
{Position, Velocity, Health} physically owns:
- `Entity[1024] entities` — the handle list
- `Position[1024]` behind a `ComponentStorage<Position>`
- `Velocity[1024]` behind a `ComponentStorage<Velocity>`
- `Health[1024]` behind a `ComponentStorage<Health>`
- One `ChangeTracker` per component, also sized to 1024
Slot i in every array describes the same logical entity. A tight
iteration loop over Position + Velocity is a single
for (int slot = 0; slot < count; slot++) walking two parallel
indexed loads — exactly what the tier-1 generator emits (see
tier-1 generation).
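The loop shape described above can be sketched directly. The component records here are stand-ins, not japes' generated code — the point is the two parallel indexed loads per slot with no indirection:

```java
// Stand-in components; japes components are plain records too.
record Position(double x, double y) {}
record Velocity(double dx, double dy) {}

final class ChunkSweep {
    // A minimal sketch of a tier-1 style sweep over one chunk: slot i
    // in both columns describes the same entity.
    static void integrate(Position[] positions, Velocity[] velocities, int count) {
        for (int slot = 0; slot < count; slot++) {
            Position p = positions[slot];  // flat indexed load, column 1
            Velocity v = velocities[slot]; // flat indexed load, column 2
            positions[slot] = new Position(p.x() + v.dx(), p.y() + v.dy());
        }
    }
}
```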
**Why one storage per component per chunk — not one per archetype**
An earlier prototype kept one ComponentStorage per archetype and
segmented it into logical ranges. That was trivially shown to
regress both iteration (the loop had to bounds-check chunk edges)
and structural change (moving an entity out of a chunk left a hole
that could not be swap-filled from anywhere except the same chunk).
One storage per chunk gives the swap-remove trick a clean domain:
every component column knows its own chunk-local count, and its
last-slot is always the tail of this chunk.
Flat ComponentStorage<?>[] lookup by id¶
Component lookups (`Chunk.componentStorage(ComponentId)`,
`Chunk.changeTracker(ComponentId)`) used to be `HashMap` gets. They're
now flat arrays indexed by `ComponentId.id()`
(Chunk.java lines ~19–27). The array is sized to
`maxGlobalComponentId + 1`, with `null` in every slot the
chunk doesn't own. Two hot paths — `World.setComponent` and every
tier-1 generator's "load storages at chunk entry" preamble — now do
`storagesById[id]` instead of `storages.get(compIdObject)`. That is
one `aaload` versus a hash, an `equals` call, and a node walk. On the
sparse-delta benchmark this single change was worth several µs.
A second pair of flat arrays — storageList and trackerList — holds
the same storages and trackers indexed densely, so whole-chunk
sweeps (remove, markAdded) can iterate without touching null
slots.
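A condensed sketch of the id-indexed layout, with plain `Object[]` standing in for `ComponentStorage` (all names illustrative): owned ids resolve in one array load, unowned ids are simply `null`.

```java
// Hypothetical sketch of the flat storagesById lookup that replaced
// the HashMap get.
final class FlatLookup {
    final Object[] storagesById;

    FlatLookup(int maxGlobalComponentId, int[] ownedIds) {
        // Sized to the global id space; most slots stay null because a
        // chunk only owns its archetype's components.
        storagesById = new Object[maxGlobalComponentId + 1];
        for (int id : ownedIds) {
            storagesById[id] = new Object[16]; // one column per owned component
        }
    }

    Object componentStorage(int componentId) {
        return storagesById[componentId]; // one aaload, no hashing
    }
}
```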
EntityLocation: the entity-to-slot table¶
EntityAllocator (ecs-core/.../entity/EntityAllocator.java) hands
out Entity handles and owns the Entity → EntityLocation table.
EntityLocation (ecs-core/.../archetype/EntityLocation.java) is a
record:
```java
public record EntityLocation(Archetype archetype, int chunkIndex, int slotIndex) {
    public ArchetypeId archetypeId() { return archetype.id(); }
}
```
Two things are worth noticing.
It holds a direct Archetype reference, not an ArchetypeId.
The comment at the top of the file explains why:
World.setComponent can now skip the archetypeGraph.get(archetypeId)
map lookup entirely — the archetype is reachable in a single field
load from the location. archetypeId() still exists but just
forwards.
It does NOT hold a direct Chunk reference. A previous revision
tried that and it regressed setComponent-heavy workloads by ~9 %.
The JIT was able to hoist archetype.chunks().get(chunkIndex) out of
the loop when the archetype was stable, whereas location.chunk()
varies per entity and prevents the hoist. The comment on
EntityLocation.java documents this exactly. The indirection is
cheaper than the cache pollution.
Component registry keys are Class<?>¶
ComponentRegistry keys every component type by the Class object
itself. No string interning, no annotation scanning, no String.hashCode
on the hot path. The resulting ComponentId is a densely-packed
integer — ComponentId.id() — allocated in registration order.
Two consequences:
- Flat `ComponentStorage<?>[]` arrays indexed by `id()` work because ids are dense.
- The per-system metadata (`SystemDescriptor`) can resolve every component reference to its `ComponentId` once at plan build time, so the hot path never touches a `Class` object again.
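A minimal sketch of a `Class`-keyed registry handing out dense ids in registration order (hypothetical names, not japes' `ComponentRegistry` API):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry: the Class object itself is the key, and the
// dense id is simply the registration order. Class identity hashing
// is cheap; no strings are touched.
final class Registry {
    private final Map<Class<?>, Integer> ids = new HashMap<>();

    int register(Class<?> componentType) {
        // ids.size() before insertion = next dense id.
        return ids.computeIfAbsent(componentType, c -> ids.size());
    }
}
```

Because ids are dense, they index the flat `storagesById`-style arrays directly.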
Moves between archetypes are swap-removes¶
Suppose entity E lives in archetype A = {Position, Velocity}, and
a system adds a Health component. E must move to
B = {Position, Velocity, Health}. The sequence is:
- Look up `B = archetypeGraph.addEdge(A.id(), HealthId)` — memoised, O(1) after the first hit.
- Allocate a new slot in `B`'s open chunk.
- Copy every shared component (`Position`, `Velocity`) from `E`'s old slot to the new slot.
- Write the newly-added `Health` into the new slot.
- Swap-remove `E`'s old slot in `A`: overwrite it with the last entity in the same chunk, decrement `count`, and update the swapped entity's `EntityLocation` to point at `E`'s old index. This is `Archetype.remove` + `Chunk.remove` — a per-storage, per-tracker swap-remove loop.
- Update `entityLocations[E.index()] = new EntityLocation(B, newChunkIdx, newSlot)`.
Every column in the chunk does its swap-remove the same way — the
code is in Chunk.remove:
```java
public void remove(int slot) {
    int lastIndex = count - 1;
    if (slot < lastIndex) {
        entities[slot] = entities[lastIndex]; // move the tail entity into the hole
    }
    for (var storage : storageList) {
        storage.swapRemove(slot, count);      // every column does the same move
    }
    for (var tracker : trackerList) {
        tracker.swapRemove(slot, count);      // trackers must propagate dirty bits
    }
    entities[lastIndex] = null;               // release the tail reference
    count--;
}
```
The per-component swapRemove is one field load, one field store,
one null-write — constant-time per column, linear in the number of
components the archetype carries (typically 2–6).
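A per-column `swapRemove` in the shape described above can be sketched like this (an illustrative stand-in for `DefaultComponentStorage`, not its actual code):

```java
// Hypothetical reference-array column: swap-remove moves the tail
// element into the vacated slot and nulls the tail so the object can
// be collected.
final class Column {
    final Object[] data;

    Column(int capacity) {
        data = new Object[capacity];
    }

    void swapRemove(int slot, int count) {
        int last = count - 1;
        if (slot < last) {
            data[slot] = data[last]; // one field load, one field store
        }
        data[last] = null;           // one null-write
    }
}
```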
**The dirty-bit propagation bug**
ChangeTracker.swapRemove has to propagate the dirty bit of the
moved entity, because the slot index that slot held is gone.
Otherwise entities that were dirty at the moment another entity
was swap-removed become invisible to every @Filter(Changed)
observer. This was a real silent-correctness bug fixed in the
PR referenced in DEEP_DIVE.md. The fix is in
ChangeTracker.swapRemove lines ~177–204 — see the
change tracking page for the detail.
Archetype generation counter¶
ArchetypeGraph bumps a long generation every time a new archetype
is materialised (getOrCreate, line ~54). The generation is load-
bearing for one very specific optimisation:
SystemExecutionPlan.cachedMatchingArchetypes.
Query systems re-resolve "which archetypes do I iterate" at tick
start. Without a cache, this walks findMatchingCache — itself a
ConcurrentHashMap<Set<ComponentId>, List<Archetype>> — and
AbstractSet.hashCode walks every element of the required set on
every call. Profiling showed it was ~18 % of a RealisticTick tick.
The fix is two-level:
- `findMatchingCache` (at the graph level) memoises the result by required-set identity. Rebuilt only when a new archetype is created.
- `SystemExecutionPlan.cachedMatchingArchetypes(graphGeneration)` (at the plan level) memoises the lookup by `long` generation. Tick start compares two longs; if they match, the cached list is still valid.
One long comparison per system per tick instead of one set hash. The generation counter is the whole reason it works.
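The plan-level half of the scheme can be sketched in a few lines (all names illustrative; the graph-level resolution is faked out):

```java
import java.util.List;

// Hypothetical plan-level cache: the matching-archetype list is reused
// as long as the graph generation has not moved.
final class PlanCache {
    long cachedGeneration = -1;
    List<String> cachedMatching = List.of();
    int rebuilds = 0; // instrumentation for the sketch only

    List<String> matching(long graphGeneration) {
        if (graphGeneration != cachedGeneration) { // one long comparison
            cachedMatching = resolveFromGraph();   // the expensive path
            cachedGeneration = graphGeneration;
            rebuilds++;
        }
        return cachedMatching;
    }

    private List<String> resolveFromGraph() {
        return List.of("A", "B"); // stand-in for findMatchingCache
    }
}
```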
What happens on World.setComponent¶
This is the hottest write path in the library. The full sequence:
- `entityAllocator.locationOf(entity)` returns the cached `EntityLocation` — one array load.
- `location.archetype()` is the direct `Archetype` reference — one field read.
- `archetype.chunks().get(location.chunkIndex())` — one `ArrayList` get.
- `chunk.componentStorage(compId)` — one `storagesById[id]` aaload.
- `storage.set(slot, value)` — one aastore.
- `chunk.changeTracker(compId).markChanged(slot, currentTick)` — one aaload, one `addedTicks[slot] = tick` store, one dirty-list append (if observed).
Five loads, two stores, one conditional append. No hash lookups,
no reflection, no boxing (the value is already a record reference
held by the user). Every stretch where this path gets slower is
visible in the benchmark numbers; every optimisation round in the
optimisation journey is a line-level change
to this sequence.
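The whole write path can be condensed into a sketch with plain arrays standing in for the real types (all names illustrative, and the archetype hop is collapsed because the sketch has a single archetype): every step is a flat load or store, nothing hashes.

```java
// Hypothetical condensed setComponent: locations[entity] holds
// {chunkIndex, slotIndex}; chunks[chunkIndex] is the flat
// storagesById array; each storage is one Object[capacity] column.
final class WritePath {
    static final int HEALTH_ID = 2; // dense ComponentId stand-in

    final int[][] locations;
    final Object[][][] chunks;

    WritePath(int[][] locations, Object[][][] chunks) {
        this.locations = locations;
        this.chunks = chunks;
    }

    void setComponent(int entityIndex, Object value) {
        int[] loc = locations[entityIndex];  // cached location: one array load
        Object[][] chunk = chunks[loc[0]];   // chunk by index: one indexed load
        Object[] storage = chunk[HEALTH_ID]; // column by dense id: one aaload
        storage[loc[1]] = value;             // one aastore
    }
}
```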
Related¶
- Tier-1 bytecode generation — how the per-entity iteration loop binds to the column layout described above
- Change tracking — how `ChangeTracker` plugs into the archetype / chunk machinery
- Relations — the side-table alternative for entity-to-entity pairs, which does not fragment archetypes
- Optimisation journey — the war story for some of the specific changes referenced above
- Tutorial — Components — the user-facing view of what a record-based component looks like