How to Increase Slice Capacity in Go
Do you know what happens when you add a new element to a slice where len == cap?
Most Gophers would confidently answer that the capacity doubles: new space is allocated, the existing elements are copied over, and the slice header's pointer is updated. At first glance, this is correct.
```go
func TestAppendSlice(t *testing.T) {
	s := make([]int, 2, 3)
	s = append(s, 0)
	fmt.Println("Capacity:", cap(s), "Pointer:", unsafe.Pointer(&s[0]))
	s = append(s, 0)
	fmt.Println("Capacity:", cap(s), "Pointer:", unsafe.Pointer(&s[0]))
}

// Capacity: 3 Pointer: 0xc00002a5d0
// Capacity: 6 Pointer: 0xc00002e750
```
As we can see, the address of s[0] has changed, indicating that the slice was reallocated to new space and its pointer updated. The capacity has also doubled, from 3 to 6.
In fact, that's only half correct. What happens if we change int to int16? Let's see:
```go
func TestAppendSlice(t *testing.T) {
	s := make([]int16, 2, 3)
	s = append(s, 0)
	fmt.Println("Capacity:", cap(s), "Pointer:", unsafe.Pointer(&s[0]))
	s = append(s, 0)
	fmt.Println("Capacity:", cap(s), "Pointer:", unsafe.Pointer(&s[0]))
}

// Capacity: 3 Pointer: 0xc000015b30
// Capacity: 8 Pointer: 0xc000015b38
```
This time the capacity grew to 8 instead of doubling to 6. Why does the capacity change this way? Is it just an arbitrary value? How does a slice end up with this larger capacity?
The secret lies in growslice. First, growslice calculates the new capacity, newcap, like this:

newcap := nextslicecap(newLen, oldCap)

The nextslicecap(newLen, oldCap int) int function is implemented in the runtime as follows:
```go
func nextslicecap(newLen, oldCap int) int {
	newcap := oldCap
	doublecap := newcap + newcap
	if newLen > doublecap {
		return newLen
	}

	const threshold = 256
	if oldCap < threshold {
		return doublecap
	}
	for {
		// Transition from growing 2x for small slices
		// to growing 1.25x for large slices. This formula
		// gives a smooth-ish transition between the two.
		newcap += (newcap + 3*threshold) >> 2

		// We need to check `newcap >= newLen` and whether `newcap` overflowed.
		// newLen is guaranteed to be larger than zero, hence
		// when newcap overflows then `uint(newcap) > uint(newLen)`.
		// This allows to check for both with the same comparison.
		if uint(newcap) >= uint(newLen) {
			break
		}
	}

	// Set newcap to the requested cap when
	// the newcap calculation overflowed.
	if newcap <= 0 {
		return newLen
	}
	return newcap
}
```
As you might know, up to a certain threshold (lowered from 1024 to 256 in Go 1.18), the capacity doubles; beyond that threshold, the growth factor smoothly transitions down toward 1.25x.
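To see these rules in action before any size-class rounding complicates the picture, here is a standalone copy of the function (renamed nextslicecapSketch to mark it as my own harness; the body mirrors the runtime code quoted above), probed with a few representative values:

```go
package main

import "fmt"

// nextslicecapSketch mirrors the runtime's nextslicecap, copied here so
// we can call it directly with arbitrary arguments.
func nextslicecapSketch(newLen, oldCap int) int {
	newcap := oldCap
	doublecap := newcap + newcap
	if newLen > doublecap {
		// The requested length already exceeds a doubling.
		return newLen
	}
	const threshold = 256
	if oldCap < threshold {
		return doublecap
	}
	for {
		// Above the threshold, growth transitions toward 1.25x.
		newcap += (newcap + 3*threshold) >> 2
		if uint(newcap) >= uint(newLen) {
			break
		}
	}
	if newcap <= 0 {
		return newLen
	}
	return newcap
}

func main() {
	fmt.Println(nextslicecapSketch(4, 3))      // small slice: doubles, 3 -> 6
	fmt.Println(nextslicecapSketch(257, 256))  // right at the threshold: still 512
	fmt.Println(nextslicecapSketch(513, 512))  // large slice: 832, between 2x and 1.25x
	fmt.Println(nextslicecapSketch(2000, 512)) // newLen > 2*oldCap: newLen wins
}
```

Note that a capacity of 512 grows to 832, not 640: the 3*threshold term keeps the factor above 1.25x near the threshold, which is the "smooth-ish transition" the comment describes.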
If this value were returned as is, the common answer would be accurate. Unfortunately, it then undergoes further post-processing, which looks like this:
```go
newcap := nextslicecap(newLen, oldCap)

var overflow bool
var lenmem, newlenmem, capmem uintptr

// Specialize for common values of et.Size.
// For 1 we don't need any division/multiplication.
// For goarch.PtrSize, compiler will optimize division/multiplication into a shift by a constant.
// For powers of 2, use a variable shift.
noscan := !et.Pointers()
switch {
case et.Size_ == 1:
	lenmem = uintptr(oldLen)
	newlenmem = uintptr(newLen)
	capmem = roundupsize(uintptr(newcap), noscan)
	overflow = uintptr(newcap) > maxAlloc
	newcap = int(capmem)
case et.Size_ == goarch.PtrSize:
	lenmem = uintptr(oldLen) * goarch.PtrSize
	newlenmem = uintptr(newLen) * goarch.PtrSize
	capmem = roundupsize(uintptr(newcap)*goarch.PtrSize, noscan)
	overflow = uintptr(newcap) > maxAlloc/goarch.PtrSize
	newcap = int(capmem / goarch.PtrSize)
case isPowerOfTwo(et.Size_):
	var shift uintptr
	if goarch.PtrSize == 8 {
		// Mask shift for better code generation.
		shift = uintptr(sys.TrailingZeros64(uint64(et.Size_))) & 63
	} else {
		shift = uintptr(sys.TrailingZeros32(uint32(et.Size_))) & 31
	}
	lenmem = uintptr(oldLen) << shift
	newlenmem = uintptr(newLen) << shift
	capmem = roundupsize(uintptr(newcap)<<shift, noscan)
	overflow = uintptr(newcap) > (maxAlloc >> shift)
	newcap = int(capmem >> shift)
	capmem = uintptr(newcap) << shift
default:
	lenmem = uintptr(oldLen) * et.Size_
	newlenmem = uintptr(newLen) * et.Size_
	capmem, overflow = math.MulUintptr(et.Size_, uintptr(newcap))
	capmem = roundupsize(capmem, noscan)
	newcap = int(capmem / et.Size_)
	capmem = uintptr(newcap) * et.Size_
}
```
The implementation of et.Pointers on the first line is shown below. Since noscan is defined as !et.Pointers(), it is true exactly when the type contains no pointers at all: basic numeric types, pointer-free structs, and empty structs. noscan matters because it tells the allocator whether the garbage collector needs to scan the object, which in turn affects how roundupsize rounds the requested size.
```go
type Type struct {
	Size_    uintptr
	PtrBytes uintptr // number of (prefix) bytes in the type that can contain pointers
	...
}

func (t *Type) Pointers() bool { return t.PtrBytes != 0 }
```
As for et.Size_, which drives the branching logic, it is the size of the element type in bytes, i.e. the value returned by unsafe.Sizeof(element).
Let’s examine each case in detail.
et.Size_ == 1
When the element size is 1 byte, as with types like byte.

- lenmem: memory size for the current length.
- newlenmem: memory size for the new length.
- capmem: memory size for the new capacity, aligned with the roundupsize function.
- overflow: checks whether the new capacity exceeds maxAlloc.
- newcap: the new capacity derived from the aligned memory size.
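The effect of this branch is easy to observe with a byte slice (the helper name grownByteCap is mine, and the exact capacity depends on the runtime's size classes, so this assumes a recent Go version):

```go
package main

import "fmt"

// grownByteCap grows a full []byte by one element and reports the new cap.
func grownByteCap() int {
	s := make([]byte, 3, 3)
	s = append(s, 0) // len 4 > cap 3, so growslice runs
	return cap(s)
}

func main() {
	// nextslicecap(4, 3) returns 6, but 6 bytes is not an allocation size
	// class; roundupsize lifts it to the 8-byte class, so the cap becomes 8.
	fmt.Println(grownByteCap())
}
```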
et.Size_ == goarch.PtrSize
When the element size equals the pointer size (goarch.PtrSize). This covers slices of pointers, and on 64-bit platforms also types like int and uint64.

- lenmem and newlenmem: memory sizes scaled by the pointer size.
- capmem: memory size for the new capacity, aligned by roundupsize after scaling by the pointer size.
- overflow: checks whether the new capacity exceeds maxAlloc divided by the pointer size.
- newcap: the new capacity computed from the aligned memory size.
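This is the branch the article's first example ([]int on a 64-bit machine, where unsafe.Sizeof(int(0)) equals the pointer size of 8) went through. It only looked like a clean doubling because the byte count happened to land exactly on a size class (the helper name is mine; this assumes a recent Go runtime):

```go
package main

import "fmt"

// grownIntCap grows a full []int by one element and reports the new cap.
func grownIntCap() int {
	s := make([]int, 3, 3)
	s = append(s, 0)
	return cap(s)
}

func main() {
	// nextslicecap(4, 3) returns 6; on 64-bit, 6*8 = 48 bytes is already an
	// exact allocation size class, so rounding changes nothing and cap is 6.
	fmt.Println(grownIntCap())
}
```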
isPowerOfTwo(et.Size_)
When the element size is a power of two, bit shifts replace multiplication and division for efficiency.

- shift: the number of trailing zero bits in et.Size_, so multiplying by the size becomes a left shift.
- lenmem and newlenmem: memory sizes computed with a left shift.
- capmem: memory size for the new capacity, aligned by roundupsize after shifting.
- overflow: checks whether the new capacity exceeds maxAlloc shifted right by shift.
- newcap: the new capacity computed from the aligned memory size.
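This branch explains the int16 result from the beginning of the post: with et.Size_ == 2, shift is 1, and the rounding becomes visible (the helper name is mine; this assumes a recent Go runtime):

```go
package main

import "fmt"

// grownInt16Cap grows a full []int16 by one element and reports the new cap.
func grownInt16Cap() int {
	s := make([]int16, 3, 3)
	s = append(s, 0)
	return cap(s)
}

func main() {
	// nextslicecap(4, 3) returns 6, so capmem = 6 << 1 = 12 bytes.
	// roundupsize lifts 12 to the 16-byte size class, and
	// newcap = 16 >> 1 = 8, the mysterious 8 from the intro.
	fmt.Println(grownInt16Cap())
}
```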
default
For element sizes that match none of the cases above, plain multiplication and division are used.

- lenmem and newlenmem: memory sizes computed by multiplying by the element size.
- capmem: memory size for the new capacity, computed with overflow-checked multiplication (math.MulUintptr) and then aligned by roundupsize.
- overflow: set if multiplying the element size by the new capacity overflows.
- newcap: the new capacity computed from the aligned memory size.
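A 3-byte struct exercises this default branch, since 3 is neither 1, nor the pointer size, nor a power of two (the type and helper names are mine; this assumes a recent Go runtime):

```go
package main

import "fmt"

// rgb is 3 bytes: not 1, not goarch.PtrSize, not a power of two,
// so growing a []rgb takes the default branch.
type rgb struct{ r, g, b byte }

func grownRGBCap() int {
	s := make([]rgb, 3, 3)
	s = append(s, rgb{})
	return cap(s)
}

func main() {
	// nextslicecap(4, 3) returns 6, so capmem = 6 * 3 = 18 bytes.
	// roundupsize lifts 18 to the 24-byte size class, and
	// newcap = 24 / 3 = 8.
	fmt.Println(grownRGBCap())
}
```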
So what is this roundupsize function that keeps appearing?
roundupsize
This function belongs to Go's memory allocator: it rounds a requested size up to a size the allocator can actually hand out, which improves memory reuse and keeps allocations properly aligned. It handles small and large objects differently, computing an appropriate size for each.
```go
func roundupsize(size uintptr, noscan bool) (reqSize uintptr) {
	reqSize = size
	if reqSize <= maxSmallSize-mallocHeaderSize {
		// Small object.
		if !noscan && reqSize > minSizeForMallocHeader { // !noscan && !heapBitsInSpan(reqSize)
			reqSize += mallocHeaderSize
		}
		// (reqSize - size) is either mallocHeaderSize or 0. We need to subtract mallocHeaderSize
		// from the result if we have one, since mallocgc will add it back in.
		if reqSize <= smallSizeMax-8 {
			return uintptr(class_to_size[size_to_class8[divRoundUp(reqSize, smallSizeDiv)]]) - (reqSize - size)
		}
		return uintptr(class_to_size[size_to_class128[divRoundUp(reqSize-smallSizeMax, largeSizeDiv)]]) - (reqSize - size)
	}
	// Large object. Align reqSize up to the next page. Check for overflow.
	reqSize += pageSize - 1
	if reqSize < size {
		return size
	}
	return reqSize &^ (pageSize - 1)
}
```
Small objects (reqSize <= maxSmallSize - mallocHeaderSize)

- Small objects: if the (possibly header-adjusted) size is at most smallSizeMax - 8, look up the aligned size via the class_to_size and size_to_class8 tables.
- Larger small objects: if the size exceeds that bound, look up the aligned size via the class_to_size and size_to_class128 tables.
Large objects (reqSize > maxSmallSize - mallocHeaderSize)

- Page alignment: round reqSize up to the next page boundary; pageSize is the size of a memory page.
- Overflow check: if the page-aligned size wrapped around and is smaller than the original request, return the original size.
- Return the aligned size: return the memory block size aligned to the page size (reqSize &^ (pageSize - 1)).
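The small-object path can be approximated with a lookup over the first few size classes. The table below is only a prefix of the runtime's real class_to_size table (runtime/sizeclasses.go), and roundUpSketch is my simplified stand-in for roundupsize in the noscan case:

```go
package main

import "fmt"

// smallClasses is the beginning of the runtime's size-class table; the
// real table has dozens of classes, going up to 32 KiB.
var smallClasses = []uintptr{8, 16, 24, 32, 48, 64, 80, 96, 112, 128}

// roundUpSketch mimics what roundupsize does for small noscan objects:
// pick the smallest size class that fits the request.
func roundUpSketch(size uintptr) uintptr {
	for _, c := range smallClasses {
		if size <= c {
			return c
		}
	}
	return size // beyond this sketch's table
}

func main() {
	fmt.Println(roundUpSketch(6))  // []byte example: 6 bytes rounds up to 8
	fmt.Println(roundUpSketch(12)) // []int16 example: 12 bytes rounds up to 16
	fmt.Println(roundUpSketch(18)) // 3-byte struct example: 18 bytes rounds up to 24
	fmt.Println(roundUpSketch(48)) // []int example: 48 is already a class
}
```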
After all these operations are completed, we obtain the new slice!
Conclusion
Honestly, even while writing this post, it was hard to pin down exactly what happens for every type, and explaining it out loud while walking through the code would be harder still.
Still, as a Gopher, it is satisfying to know that slices are not simply doubled: the new capacity comes from different calculations depending on the element type.
With this knowledge in hand, there will surely be chances to put it to use, won't there?