Golang Map Internals
Key Points
- Go maps are implemented as hash tables. Each hash table consists of buckets, and each bucket can hold up to 8 key-value pairs.
- The map header contains important metadata such as the number of entries, the number of buckets (which is always a power of two), and a random hash seed.
- Maps grow dynamically when the load factor exceeds a certain threshold or when there are too many overflow buckets. The load factor was reduced from 6.5 to 6 in Go 1.21 to optimize memory usage.
- Maps only grow and do not shrink, even after deletions. This means that the number of buckets remains constant, regardless of whether entries are deleted.
Map Creation and Structure
When you create a map using m := make(map[string]string)
, the Go runtime internally creates a map header structure. The variable m
then points to this structure. The map header contains several critical components:
- Count: This represents the total number of key-value pairs currently stored in the map.
- B: This indicates the number of buckets in the map. The value is always a power of two, and the logarithm base 2 of this value (
log2(buckets)
) is stored to save space. - Bucket Array Pointer: This is a pointer to an array of buckets. The array is stored in contiguous memory to ensure efficient access.
- Random Hash Seed: This ensures that each map instance generates unique hash distributions. It enhances security by mitigating potential denial-of-service attacks caused by hash collisions.
Each bucket in the map can store up to 8 key-value pairs. Additionally, each bucket includes an array of hash codes for faster comparisons during lookups, as well as an overflow pointer. The overflow pointer is used to link additional buckets if the current bucket becomes full.
Map Header
+-----------------------------+
| count | int |
| B | int |
| buckets_ptr | *BucketArray|
| hash_seed | uint32 |
+-----------------------------+
|
v
Bucket Array
+---+---+---+---+
| B0 | B1 | B2 | ... |
+---+---+---+---+
|
v
Bucket Structure
+-----------------------------+
| hash_codes[8] |
| key_value_pairs[8] |
| overflow_ptr (*Bucket) |
+-----------------------------+
Inserting New Values
When inserting a new key-value pair into a map, the following steps occur:
- Hash Computation: The Go runtime computes a hash code for the given key using a hash function. This computation incorporates a random hash seed to ensure uniqueness.
- Bucket Selection: Part of the hash code is used to determine which bucket should store the new key-value pair. Since the number of buckets is always a power of two, this step uses bit masking for efficient indexing.
- Collision Handling: The full hash code is compared with the hash codes already stored in the selected bucket. If no match is found, it means the key is new.
- Insertion:
- If the bucket has an empty slot (i.e., it contains fewer than 8 key-value pairs), the new pair is added directly.
- If the bucket is already full, a new overflow bucket is created, and the overflow pointer is updated to point to this new bucket. The new key-value pair is then stored in the overflow bucket.
This process ensures that the map can handle collisions effectively by chaining overflow buckets when necessary.
Initial State
+---+---+---+---+
| B0 | B1 | B2 | ... |
+---+---+---+---+
|
v
Bucket B0
+-----------------------------+
| hash_codes[8] |
| key_value_pairs[7] + empty |
| overflow_ptr = NULL |
+-----------------------------+
After Insertion (Overflow)
+---+---+---+---+
| B0 | B1 | B2 | ... |
+---+---+---+---+
|
v
Bucket B0
+-----------------------------+
| hash_codes[8] |
| key_value_pairs[8] |
| overflow_ptr -> OverflowBucket |
+-----------------------------+
Map Growth Mechanics
Go maps are designed to grow dynamically to maintain performance under varying conditions. Growth occurs under the following circumstances:
- Load Factor Threshold: The map grows when the total number of entries reaches or exceeds
(B * 8) / loadFactor
. In Go 1.21 and later, the load factor is set to 6. This means that growth is triggered when the average number of elements per bucket approaches approximately 1.333 (since(8 / 6) ≈ 1.333
). This prevents buckets from becoming overly crowded. - Overflow Bucket Limit: Growth is also triggered if the number of overflow buckets exceeds the number of main buckets (
noverflow > B
). This prevents deep overflow chains, which could degrade performance.
When growth occurs, the following actions take place:
- The number of buckets is doubled, creating a new bucket array with twice the capacity.
- All existing entries are rehashed and redistributed into the new bucket array. This redistribution is done incrementally to minimize performance impact during operations.
- A pointer to the old bucket array is maintained at the end of the new array. This allows for efficient copying of entries over time.
This incremental growth strategy ensures that map operations remain responsive, even during resizing.
Before Growth
+---+---+---+---+
| B0 | B1 | B2 | ... |
+---+---+---+---+
After Growth
+---+---+---+---+---+---+---+---+
| B0 | B1 | B2 | ... | B3 | B4 | B5 | ... |
+---+---+---+---+---+---+---+---+
|
v
Old Bucket Array (for incremental copying)
+---+---+---+---+
| B0 | B1 | B2 | ... |
+---+---+---+---+
Deletion Behavior
Deleting an entry from a Go map involves removing it from its respective bucket and decrementing the count
value in the map header. However, it is important to note that maps do not shrink. The number of buckets remains constant, even if all entries are deleted. This design choice means that maps only grow and never reduce their bucket count. As a result, engineers should be mindful of memory usage, especially in scenarios where maps undergo frequent deletions and insertions.
Performance Considerations
- Time Complexity: On average, lookups, insertions, and deletions in Go maps have a time complexity of O(1). This efficiency is due to the hash table implementation. However, in the worst case—such as when there are many collisions (e.g., due to poor hash distribution or adversarial inputs)—the time complexity can degrade to O(n).
- Random Iteration Order: The iteration order of Go maps is random and may differ each time the same map is printed. This randomness is due to the use of a random hash seed, which prevents reliance on iteration order and enhances security against hash collision-based denial-of-service attacks.
- Memory Management: Since maps do not shrink, engineers should carefully manage memory usage, particularly in long-running applications. Strategies such as creating new maps and copying only necessary data can help mitigate memory bloat in sparse maps.
Comparative Insights
For software engineers, understanding the differences between maps and other data structures is essential. For example:
- Maps are ideal for scenarios requiring fast lookups based on keys, such as caching or configuration storage.
- Compared to slices, maps offer O(1) access by key, while slices provide O(1) access by index but require O(n) for searches.
Practical Implications
Software engineers working with Go maps should consider the following:
- Be aware of the load factor (set to 6 in recent versions) and its impact on map growth. Applications should be tested for performance under varying map sizes.
- Monitor memory usage carefully, especially in long-running applications, as maps do not shrink and could lead to memory leaks if not managed properly.
- Leverage the random hash seed for security, particularly in web applications, to mitigate hash collision-based denial-of-service attacks.
By understanding these aspects, engineers can optimize Go applications, balancing performance, memory usage, and security.
Summary Table
Aspect | Details |
---|---|
Internal Structure | Maps are implemented as an array of buckets. Each bucket can store up to 8 key-value pairs. |
Bucket Components | Each bucket contains an array of hash codes, a list of key-value pairs, and an overflow pointer. |
Map Header | Contains metadata: number of entries, log2(buckets) (buckets are a power of two), pointer to bucket array, and a random hash seed. |
Lookup Time | O(1) due to hash functions. |
Load Factor | 6, triggers growth when bucket elements reach this limit (changed from 6.5 in Go 1.21). |
Map Growth | Doubles the number of buckets, redistributes old entries efficiently over time, maintains a pointer to the old bucket array. |
Deletion Behavior | Maps only grow; deleting all entries does not reduce the number of buckets. |
Iteration | Random; outputs differ each time the same map is printed. |
Overflow Handling | When a bucket is full (8 pairs) and receives a new value, a new bucket is created, and the overflow pointer points to it. |