Memory Management
Last updated
Assoc. Prof. Wiroon Sriborrirux, Founder of Advance Innovation Center (AIC) and Bangsaen Design House (BDH), Electrical Engineering Department, Faculty of Engineering, Burapha University
In a computing system, storage can usually be divided into two types: internal and external. Internal storage offers fast, random access by variable address; this is what we usually call RAM (random access memory), which can be thought of as the computer's memory. External storage holds relatively fixed content that survives power loss; this is what is usually called ROM (read-only memory), which can be thought of as the computer's hard disk.
In a computer system, variables and intermediate data are generally stored in RAM and are only moved to the CPU for computation when actually needed. The amount of memory some data requires can only be determined while the program is running, so the system must be able to manage memory dynamically: when a user needs memory, they request it from the system, the system selects a suitable region and allocates it, and after use the user releases it so the system can recycle and reuse that region.
This chapter mainly introduces two memory management methods in RT-Thread, namely dynamic memory heap management and static memory pool management. After studying this chapter, readers will understand the memory management principles and usage of RT-Thread.
Due to the very strict timing requirements in real-time systems, memory management is often much more demanding than in general-purpose operating systems:
1) The time taken to allocate memory must be deterministic. A general-purpose allocator searches the heap for a free block large enough to hold the requested data, and the time this search takes is unbounded. For a real-time system this is unacceptable: allocation must complete within a predictable, bounded time, otherwise the response of real-time tasks to external events becomes non-deterministic.
2) As memory is repeatedly allocated and released, the memory area accumulates fragments: small free blocks whose addresses are not contiguous and therefore cannot be combined into one large block. The system may still have plenty of free memory in total, yet be unable to satisfy a large allocation request. A general-purpose system can mitigate this by rebooting occasionally (say once every month or few months), but for embedded systems that must run unattended in the field all year round, this is unacceptable.
3) The resource environments of embedded systems also vary widely: some systems have only tens of KB of memory available for allocation, while others have several MB. Choosing an allocation algorithm that is efficient across such different systems is complicated.
In terms of memory management, the RT-Thread operating system provides different memory allocation management algorithms according to the upper-layer applications and system resources. Generally speaking, they can be divided into two categories: memory heap management and memory pool management. Memory heap management can be divided into three cases according to the specific memory devices:
The first is allocation management for small memory blocks (the small memory management algorithm);
The second is allocation management for large memory blocks (the slab management algorithm);
The third is allocation management across multiple memory heaps (the memheap management algorithm).
Memory heap management is used to manage a continuous memory space. The memory distribution of RT-Thread has been introduced in the previous "Kernel Basics" chapter. As shown in the figure below, RT-Thread uses the space from the "end of the ZI segment" to the end of the memory as the memory heap.
The memory heap can allocate memory blocks of any size according to user needs when current resources are sufficient. When users no longer need to use these memory blocks, they can be released back to the heap for allocation and use by other applications. In order to meet different needs, the RT-Thread system provides different memory management algorithms, namely small memory management algorithm, slab management algorithm and memheap management algorithm.
The small memory management algorithm is aimed at resource-constrained systems, generally those with less than 2MB of memory. The slab algorithm suits systems with relatively abundant resources and provides a fast scheme similar to a multi-memory-pool allocator. In addition, RT-Thread has the memheap algorithm for managing multiple memory heaps: it can "glue" several memory regions, even with discontinuous addresses, together into one large heap, which is very convenient to use.
At run time, only one of these memory heap management algorithms can be selected (or the heap manager can be left unused entirely). The API they expose to applications is exactly the same.
Note
Note: Because the heap manager must allocate safely in multi-threaded contexts, mutual exclusion between threads is required. Therefore, do not allocate or release dynamic memory in an interrupt service routine: doing so may suspend the current context while it waits for the lock.
The small memory management algorithm is simple. The heap starts as one large block of memory. When an allocation is requested, a matching block is split off from a free block, and the remainder is returned to the heap as a new free block. Each memory block carries a management header, through which used and free blocks are linked into a doubly linked list.
Versions before 4.1.0 use this implementation
As shown in the following figure:
Each memory block (whether allocated or free) contains a header that includes:
1) magic : a magic number initialized to 0x1ea0 (chosen to resemble the English word "heap"). It marks the block as one managed by the memory manager, and it also acts as a memory protection word: if this field is overwritten, the block has been illegally modified (under normal circumstances, only the memory manager touches this memory).
2) used : indicates whether the current memory block has been allocated.
The performance of memory management shows up mainly in allocation and release. The small memory management algorithm can be illustrated by the following example.
Version 4.1.0 and later use this implementation
As shown in the following figure:
Heap start: the heap start address stores the memory usage information, the heap start address pointer, the heap end address pointer, the lowest free address pointer, and the memory size.
Each memory block (whether allocated or free) contains a header, including:
pool_ptr : the address of the small-memory object. If the lowest bit of this field is 1, the block is marked as used; if it is 0, the block is free. The members of the small-memory algorithm structure can be reached quickly by computing from this address.
As shown in the figure below, the free-list pointer lfree initially points to a 32-byte free block. When a user thread requests 64 bytes, the 32-byte block pointed to by lfree cannot satisfy the request, so the memory manager continues to the next block. The next block, at 128 bytes, is large enough. Because this block is relatively large, the allocator splits it: 64 bytes are allocated and the remaining 52 bytes stay on the lfree list, producing the list structure shown below after the 64-byte allocation.
In addition, each allocated block is preceded by a 12-byte header holding the magic number, the used flag, and the linked-list nodes. The address returned to the application is the address just past this header; the header itself must never be touched by the user. (Note: the header length may differ depending on the system's alignment requirements.)
Releasing is the reverse process, except that the allocator also checks whether the adjacent memory blocks are free; if they are, they are merged into one large free block.
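The split-on-allocate and merge-on-free behavior described above can be sketched in plain, host-runnable C. This is an illustrative model only (the names `mem_hdr`, `heap_alloc`, `heap_free` and the header layout are assumptions, not RT-Thread's actual implementation), but it shows first-fit search, block splitting, and coalescing of an adjacent free block:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE 1024
#define HDR_SIZE  sizeof(struct mem_hdr)

struct mem_hdr {
    size_t size;   /* payload size of this block  */
    int    used;   /* 1 = allocated, 0 = free     */
};

static uint8_t heap[HEAP_SIZE];

static void heap_init(void)
{
    struct mem_hdr *h = (struct mem_hdr *)heap;
    h->size = HEAP_SIZE - HDR_SIZE;   /* one big free block initially */
    h->used = 0;
}

static struct mem_hdr *next_hdr(struct mem_hdr *h)
{
    uint8_t *p = (uint8_t *)h + HDR_SIZE + h->size;
    return (p < heap + HEAP_SIZE) ? (struct mem_hdr *)p : NULL;
}

static void *heap_alloc(size_t nbytes)
{
    for (struct mem_hdr *h = (struct mem_hdr *)heap; h; h = next_hdr(h)) {
        if (!h->used && h->size >= nbytes) {
            /* split if the remainder can hold a header plus some payload */
            if (h->size >= nbytes + HDR_SIZE + 8) {
                struct mem_hdr *rest =
                    (struct mem_hdr *)((uint8_t *)h + HDR_SIZE + nbytes);
                rest->size = h->size - nbytes - HDR_SIZE;
                rest->used = 0;
                h->size = nbytes;
            }
            h->used = 1;
            return (uint8_t *)h + HDR_SIZE;   /* address just past header */
        }
    }
    return NULL;                              /* no fitting free block */
}

static void heap_free(void *ptr)
{
    struct mem_hdr *h = (struct mem_hdr *)((uint8_t *)ptr - HDR_SIZE);
    h->used = 0;
    /* merge with the following block if it is also free */
    struct mem_hdr *n = next_hdr(h);
    if (n && !n->used)
        h->size += HDR_SIZE + n->size;
}
```

The real algorithm additionally keeps the magic word, a doubly linked list, and the lfree pointer; the sketch keeps only the parts needed to show splitting and merging.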
RT-Thread's slab allocator is a memory allocation algorithm optimized for embedded systems based on the slab allocator implemented by Matthew Dillon, the founder of DragonFly BSD. The original slab algorithm is an efficient kernel memory allocation algorithm introduced by Jeff Bonwick for the Solaris operating system.
RT-Thread's slab implementation removes the object construction and destruction machinery and keeps only a pure buffer-pool style of memory algorithm. The allocator divides objects into zones by size; each zone can be viewed as a memory pool for one object size, as shown in the following figure:
A zone is between 32KB and 128KB; the allocator chooses the size automatically when the heap is initialized. The system defines up to 72 object size classes, with a maximum single allocation of 16KB; requests larger than 16KB go directly to the page allocator. Blocks within a zone have a fixed size, zones serving the same block size are chained into a linked list, and the 72 zone lists are kept in an array (zone_array[]) for unified management.
The following are the two main operations of the memory allocator:
(1) Memory allocation
Suppose 32 bytes are requested. The slab allocator first finds the zone list corresponding to 32 bytes in the zone_array[]. If that list is empty, a new zone is allocated from the page allocator and its first free block is returned. If the list is not empty, the first zone node in the list must have a free block (otherwise it would not be on the list), so a free block is taken from it. If the allocation leaves the zone with no free blocks, the allocator removes the zone node from the list.
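The allocation path just described can be modeled with a short host-runnable sketch: look up the zone list for the size class, fetch a fresh zone from the "page allocator" when the list is empty, pop a free block, and unlink a zone once it is exhausted. The structures here (`struct zone`, `page_alloc_zone`, `BLOCKS_PER_ZONE`) are simplified illustrations, not RT-Thread's internals, and only a single size class is shown:

```c
#include <assert.h>
#include <stdlib.h>

#define BLOCKS_PER_ZONE 4

struct zone {
    struct zone *next;            /* zones of the same size class */
    int free_count;
    void *free_list[BLOCKS_PER_ZONE];
    char storage[BLOCKS_PER_ZONE][32];
};

static struct zone *zone_list;    /* one size class, for simplicity */

/* stand-in for the page allocator: hand out a fresh, fully free zone */
static struct zone *page_alloc_zone(void)
{
    struct zone *z = calloc(1, sizeof *z);
    for (int i = 0; i < BLOCKS_PER_ZONE; i++)
        z->free_list[i] = z->storage[i];
    z->free_count = BLOCKS_PER_ZONE;
    return z;
}

static void *slab_alloc(void)
{
    if (zone_list == NULL)                 /* list empty: grab a new zone */
        zone_list = page_alloc_zone();

    struct zone *z = zone_list;            /* head must have a free block */
    void *block = z->free_list[--z->free_count];
    if (z->free_count == 0)                /* zone exhausted: unlink it */
        zone_list = z->next;
    return block;
}
```

The release path (not shown) would relink the block onto its zone's free list and, when enough zones become fully free, return them to the page allocator.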
(2) Memory release
The allocator finds the zone containing the memory block and links the block back onto the zone's free-block list. If this makes the zone completely free, and the number of completely free zones in the zone list reaches a certain threshold, the system releases that zone back to the page allocator.
The memheap management algorithm suits systems with multiple memory heaps whose addresses may not be contiguous. Memheap simplifies such systems: users only need to initialize each required memheap during system initialization and enable the memheap feature, and the heaps (even at discontinuous addresses) are glued together into a single system heap.
Note
Note: After memheap is enabled, the original heap functionality is disabled; you can only choose one of the two, via enabling or disabling RT_USING_MEMHEAP_AS_HEAP.
The working mechanism of memheap is shown in the figure below. Multiple blocks of memory are first added to a memheap_item linked list to bind them together. When allocating, memheap first tries the default memory heap; if that fails, it walks the memheap_item list and tries to allocate from the other heaps. The application never needs to know which heap a block came from; it simply operates on one logical heap.
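The fallback order can be sketched in a few lines of host-runnable C. `heap_t`, `heap_try_alloc`, and `memheap_alloc` are illustrative stand-ins (not RT-Thread internals); the point is only the try-default-then-walk-the-list behavior:

```c
#include <assert.h>
#include <stddef.h>

#define NUM_HEAPS 3

typedef struct {
    size_t free_bytes;     /* extremely simplified "heap" state */
} heap_t;

/* index 0 plays the role of the default heap */
static heap_t heaps[NUM_HEAPS] = { { 64 }, { 256 }, { 1024 } };

/* bump-style stand-in: succeed iff the heap still has room */
static int heap_try_alloc(heap_t *h, size_t n)
{
    if (h->free_bytes < n)
        return 0;
    h->free_bytes -= n;
    return 1;
}

/* try the default heap first, then fall back to the others in turn */
static heap_t *memheap_alloc(size_t n)
{
    for (int i = 0; i < NUM_HEAPS; i++)
        if (heap_try_alloc(&heaps[i], n))
            return &heaps[i];   /* caller never sees which heap served it */
    return NULL;
}
```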
When using the memory heap, the heap must be initialized when the system is initialized, which can be done through the following function interface:
This function will use the memory space in the parameter begin_addr and end_addr area as a memory heap. The following table describes the input parameters of this function:
Input parameters of rt_system_heap_init()
Parameter | Description
--- | ---
begin_addr | Start address of the heap memory area
end_addr | End address of the heap memory area
When using memheap memory, the heap memory must be initialized when the system is initialized. This can be done through the following function interface:
If there are multiple discontinuous memheaps, you can call this function multiple times to initialize them and add them to the memheap_item linked list. The following table describes the input parameters and return values of this function:
Input parameters and return values of rt_memheap_init()
Parameter | Description
--- | ---
memheap | memheap control block
name | Name of the memory heap
start_addr | Start address of the heap memory area
size | Size of the heap memory area
Return | ——
RT_EOK | Success
The operations on the memory heap are shown in the figure below: initialization, memory block allocation, and memory release. All dynamically allocated memory must be released once it is no longer used, so that other programs can allocate and use it.
Allocating and freeing memory blocks
Allocate a memory block of a user-specified size from the memory heap. The function interface is as follows:
The rt_malloc function will find a memory block of appropriate size from the system heap space, and then return the available address of the memory block to the user. The following table describes the input parameters and return values of this function:
Input parameters and return value of rt_malloc()
Parameter | Description
--- | ---
nbytes | Size of the memory block to allocate, in bytes
Return | ——
Address of the allocated memory block | Success
RT_NULL | Failure
It is essential to check whether the return value of rt_malloc is NULL. Once the application has finished with memory obtained from the allocator, it must release it promptly; otherwise the memory leaks.
Example using dynamically allocated memory:
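Below is a minimal sketch of such an example with the mandatory NULL check. To keep it runnable on a host PC, the RT-Thread calls are mapped to their libc equivalents via `#define`; on the target you would include `<rtthread.h>` and drop these stand-ins:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define rt_malloc  malloc     /* host stand-in for rt_malloc()  */
#define rt_free    free       /* host stand-in for rt_free()    */
#define rt_kprintf printf     /* host stand-in for rt_kprintf() */

static int make_greeting(void)
{
    char *msg = rt_malloc(32);
    if (msg == NULL) {                 /* always check for allocation failure */
        rt_kprintf("rt_malloc failed\n");
        return -1;
    }
    strcpy(msg, "hello heap");
    rt_kprintf("%s\n", msg);
    rt_free(msg);                      /* release promptly to avoid leaks */
    msg = NULL;                        /* avoid dangling-pointer reuse */
    return 0;
}
```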
The function interface for releasing a memory block is as follows:
The rt_free function will return the memory to be released to the heap manager. When calling this function, the user needs to pass the pointer of the memory block to be released. If it is a null pointer, it will be returned directly. The following table describes the input parameters of this function:
Input parameters of rt_free()
Parameter | Description
--- | ---
ptr | Pointer to the memory block to be released
The argument passed to rt_free must be one of the following:
NULL;
a value previously returned by rt_malloc, rt_calloc, or rt_realloc.
To reallocate the size of a memory block (increase or decrease) based on the allocated memory block, you can use the following function interface:
When reallocating a memory block, the existing data is preserved (when shrinking, data beyond the new size is truncated). The following table describes the input parameters and return values of this function:
rt_realloc() Input Parameters and Return Values
Parameter | Description
--- | ---
rmem | Pointer to the allocated memory block
newsize | Requested new size, in bytes
Return | ——
Address of the reallocated memory block | Success
The rt_realloc function changes the size of a previously allocated block; it can grow or shrink it. When growing, the original contents are preserved and the newly added memory is appended after them; the new memory is not initialized in any way. When shrinking, memory is removed from the end of the block and the remaining contents are preserved.
If the original block cannot be resized in place, rt_realloc allocates a new block of the requested size and copies the old contents into it. After calling rt_realloc, therefore, never use the old pointer; use only the new pointer it returns.
If the first argument to rt_realloc is NULL, it behaves just like rt_malloc.
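The grow-then-shrink behavior above can be demonstrated with a small sketch. As before, the RT-Thread calls are mapped to libc equivalents so it runs on a host; the function name `grow_then_shrink` is illustrative:

```c
#include <stdlib.h>
#include <string.h>

#define rt_malloc  malloc
#define rt_realloc realloc
#define rt_free    free

/* returns a pointer still holding "abc" after a grow and a shrink,
 * or NULL on allocation failure */
static char *grow_then_shrink(void)
{
    char *buf = rt_malloc(8);
    if (buf == NULL)
        return NULL;
    strcpy(buf, "abc");

    /* grow: original bytes preserved, new tail uninitialized;
     * 'buf' must not be used after this call succeeds */
    char *bigger = rt_realloc(buf, 64);
    if (bigger == NULL) {
        rt_free(buf);
        return NULL;
    }

    /* shrink: the leading "abc\0" (4 bytes) survives the truncation */
    char *smaller = rt_realloc(bigger, 4);
    return smaller ? smaller : bigger;
}
```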
Allocating multiple memory blocks
Allocating multiple memory blocks of consecutive memory addresses from the memory heap can be done through the following function interface:
The following table describes the input parameters and return values of this function:
Input parameters and return value of rt_calloc()
Parameter | Description
--- | ---
count | Number of memory blocks
size | Size of each memory block
Return | ——
Pointer to the first memory block | Success; all allocated memory is zero-initialized
RT_NULL | Allocation failure
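The zero-initialization guarantee can be checked with a short sketch. `rt_calloc` is mapped to libc `calloc` for host execution, and the helper name is illustrative:

```c
#include <stddef.h>
#include <stdlib.h>

#define rt_calloc calloc
#define rt_free   free

/* allocate 'count' contiguous blocks of 'size' bytes and verify
 * every byte is zero; returns 1 on success, 0 otherwise */
static int calloc_blocks_are_zeroed(size_t count, size_t size)
{
    unsigned char *p = rt_calloc(count, size);
    if (p == NULL)
        return 0;                        /* allocation failed */
    for (size_t i = 0; i < count * size; i++) {
        if (p[i] != 0) {                 /* should never happen */
            rt_free(p);
            return 0;
        }
    }
    rt_free(p);
    return 1;                            /* every byte was zero */
}
```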
During the memory block allocation process, the user can set a hook function. The function interface to be called is as follows:
The hook function is called back after an allocation completes, receiving the allocated block's address and size as arguments. The following table describes the input parameters of this function:
Input parameters to rt_malloc_sethook()
Parameter | Description
--- | ---
hook | Hook function pointer
The hook function interface is as follows:
The following table describes the input parameters of the hook function:
Input parameters of the allocation hook
Parameter | Description
--- | ---
ptr | Pointer to the allocated memory block
size | Size of the allocated memory block
When releasing memory, the user can set a hook function, and the function interface called is as follows:
The hook function is called back before the memory is released, receiving the address of the block about to be freed (the block has not yet been released at that point). The following table describes the input parameters of this function:
Input parameters to rt_free_sethook()
Parameter | Description
--- | ---
hook | Hook function pointer
The hook function interface is as follows:
The following table describes the input parameters of the hook function:
Input parameters of the release hook
Parameter | Description
--- | ---
ptr | Pointer to the memory block to be released
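The hook mechanism amounts to a function pointer invoked after each allocation and before each release. The sketch below runs on a host with hypothetical `traced_malloc`/`traced_free` wrappers standing in for the heap manager; on RT-Thread you would instead register the hooks with rt_malloc_sethook() and rt_free_sethook(). The example hooks count live blocks, a simple leak-spotting aid:

```c
#include <stddef.h>
#include <stdlib.h>

static void (*malloc_hook)(void *ptr, size_t size);
static void (*free_hook)(void *ptr);

static void *traced_malloc(size_t size)
{
    void *ptr = malloc(size);
    if (ptr && malloc_hook)
        malloc_hook(ptr, size);     /* called after allocation completes */
    return ptr;
}

static void traced_free(void *ptr)
{
    if (ptr && free_hook)
        free_hook(ptr);             /* called before the block is released */
    free(ptr);
}

/* example hooks: count outstanding allocations */
static int live_blocks;
static void on_alloc(void *ptr, size_t size)
{
    (void)ptr; (void)size;
    live_blocks++;
}
static void on_free(void *ptr)
{
    (void)ptr;
    live_blocks--;
}
```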
Summary of dynamic memory usage:
Check if the pointer returned from the rt_malloc function is NULL
Do not access memory other than dynamically allocated memory
Do not pass a pointer to rt_free that was not returned by rt_malloc
Do not access dynamic memory after it has been freed.
Use sizeof to calculate the length of data types and improve program portability
Common dynamic memory errors:
Dereferencing a NULL pointer
Out-of-bounds access to allocated memory
Freeing memory that was not dynamically allocated
Freeing part of a dynamically allocated block (e.g., rt_free(ptr + 4))
Using dynamic memory after it has been freed
Memory fragmentation: frequent allocation and release will fragment memory over time. One strategy to avoid fragmentation is a hybrid approach combining a memory pool with a memory heap.
This is a memory heap application example. The program creates a dynamic thread that repeatedly allocates and releases memory, requesting a larger amount each time, and terminates when an allocation fails, as shown in the following code:
Memory heap management
The simulation results are as follows:
The routine allocates memory successfully and prints information; when it tries to allocate 65536 bytes (64KB), the allocation fails, because the total RAM is only 64KB and the available RAM is less than that.
The memory heap manager can allocate memory blocks of any size, which is very flexible and convenient. However, it also has obvious disadvantages: first, the allocation efficiency is not high, and free memory blocks must be searched for each allocation; second, it is easy to generate memory fragmentation. In order to improve the efficiency of memory allocation and avoid memory fragmentation, RT-Thread provides another memory management method: Memory Pool.
A memory pool is an allocation scheme for large numbers of small memory blocks of the same size. It greatly speeds up allocation and release and minimizes memory fragmentation. In addition, RT-Thread's memory pool supports thread suspension: when the pool has no free blocks, the requesting thread is suspended until a block becomes available, at which point the suspended thread is woken.
The thread suspension function of the memory pool is very suitable for scenarios that require synchronization through memory resources. For example, when playing music, the player thread decodes the music file and then sends it to the sound card driver, thereby driving the hardware to play music.
As shown in the figure above, when the player thread needs to decode data, it will request a memory block from the memory pool. If the memory block has been used up, the thread will be suspended, otherwise it will obtain a memory block to place the decoded data;
The player thread then writes the memory block containing the decoded data into the sound card abstract device (the thread will return immediately and continue to decode more data);
When the sound card device finishes writing, the callback function set by the player thread will be called to release the written memory block. If before this, the player thread is suspended because all the memory blocks in the memory pool have been used up, then it will be awakened at this time and continue decoding.
The memory pool control block is a data structure used by the operating system to manage the memory pool. It stores some information about the memory pool, such as the starting address of the memory pool data area, the size of the memory block, and the list of memory blocks. It also includes a linked list structure used to connect memory blocks, a collection of thread wait events that are suspended due to unavailable memory blocks, etc.
In the RT-Thread real-time operating system, the memory pool control block is represented by the structure struct rt_mempool. The handle type rt_mp_t is a pointer to a memory pool control block. The detailed definition is shown in the following code:
Memory block allocation mechanism
When a memory pool is created, it first requests a large block of memory from the system and divides it into multiple small blocks of equal size, which are chained directly into a linked list (also called the free list). On each allocation, the first block at the head of the free list is removed and handed to the requester. As the figure below shows, physical memory may hold multiple memory pools of different block sizes; each pool consists of multiple free blocks that the kernel uses for memory management. When a memory pool object is created, it is associated with a memory pool control block, whose parameters include the pool name, the memory buffer, the block size, the number of blocks, and a queue of waiting threads.
The kernel allocates the control block for the memory pool and accepts block-allocation requests from user threads, serving those requests from the pool. Once a memory pool is initialized, its internal block size cannot be changed.
Each memory pool object includes the structure above, in which suspend_thread forms the waiting list of requesting threads: when the pool has no free blocks and the requester is allowed to wait, the requesting thread is suspended on the suspend_thread list.
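The free-list mechanism just described can be sketched in host-runnable C: one buffer carved into equal blocks chained through their own first bytes, with allocation popping the head and release pushing the block back. Thread suspension is omitted, and all names (`pool_t`, `pool_alloc`, `pool_free`) are illustrative, not RT-Thread's internals:

```c
#include <stddef.h>

#define BLOCK_SIZE  32
#define BLOCK_COUNT 8

typedef struct pool {
    unsigned char storage[BLOCK_COUNT][BLOCK_SIZE];
    void *free_list;              /* head of the chain of free blocks */
} pool_t;

static void pool_init(pool_t *p)
{
    p->free_list = NULL;
    /* chain every block: the first bytes of a free block hold the link */
    for (int i = 0; i < BLOCK_COUNT; i++) {
        *(void **)p->storage[i] = p->free_list;
        p->free_list = p->storage[i];
    }
}

static void *pool_alloc(pool_t *p)
{
    void *block = p->free_list;
    if (block == NULL)
        return NULL;              /* RT-Thread could suspend the thread here */
    p->free_list = *(void **)block;
    return block;
}

static void pool_free(pool_t *p, void *block)
{
    *(void **)block = p->free_list;   /* push back onto the free list */
    p->free_list = block;
}
```

Because every block has the same size, both operations are constant-time, which is exactly why pools are faster and fragmentation-free compared with the heap.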
The memory pool control block is a structure holding the pool's important parameters and linking together the pool's various states. The memory pool interfaces are shown in the figure below; the operations are: creating/initializing the pool, allocating blocks, releasing blocks, and deleting/detaching the pool. Not every memory pool will be deleted; that depends on the design, but blocks that have been used should always be released.
Creating and Deleting Memory Pools
The operation of creating a memory pool will create a memory pool object and allocate a memory pool from the heap. Creating a memory pool is a prerequisite for allocating and releasing memory blocks from the corresponding memory pool. After creating a memory pool, the thread can perform operations such as applying and releasing from the memory pool. The following function interface is used to create a memory pool, which returns a created memory pool object.
This interface creates a memory pool with the required block size and count, provided system resources (chiefly heap memory) allow it. When creating the pool, you assign it a name. The kernel then requests a memory pool object from the system, allocates from the heap a buffer sized from the block count and block size, initializes the pool object, and organizes the buffer into a free-block list ready for allocation. The following table describes the input parameters and return values of this function:
Input parameters and return value of rt_mp_create()
Parameter | Description
--- | ---
name | Memory pool name
block_count | Number of memory blocks
block_size | Size of each memory block
Return | ——
Handle to the memory pool | Memory pool created successfully
RT_NULL | Creation failed
Deleting a memory pool will delete the memory pool object and release the requested memory. Use the following function interface:
When deleting a memory pool, all threads waiting on the pool object are woken first (returning -RT_ERROR); the pool's data storage area, allocated from the heap, is then released, and finally the pool object itself is deleted. The following table describes the input parameters and return values of this function:
Input parameters and return value of rt_mp_delete()
Parameter | Description
--- | ---
mp | Memory pool object handle returned by rt_mp_create
Return | ——
RT_EOK | Deleted successfully
Initializing and Detaching Memory Pools
Initializing a memory pool is similar to creating one, except that initialization is used for static memory management: the control block is a static object supplied by the user, and the memory used by the pool is a buffer specified by the user. The user passes the buffer pointer to the control block, and the rest of the initialization proceeds as for creation. The function interface is as follows:
When initializing a memory pool, pass the memory pool object to be initialized to the kernel, as well as the memory space used by the memory pool, the number and size of memory blocks managed by the memory pool, and assign a name to the memory pool. In this way, the kernel can initialize the memory pool and organize the memory space used by the memory pool into a linked list of free blocks that can be allocated. The following table describes the input parameters and return values of this function:
Input parameters and return value of rt_mp_init()
Parameter | Description
--- | ---
mp | Memory pool object
name | Memory pool name
start | Start address of the memory pool
size | Size of the memory pool data area
block_size | Size of each memory block
Return | ——
RT_EOK | Initialization successful
-RT_ERROR | Failure
Number of memory pool blocks = size / (block_size + 4), where the 4 bytes are the per-block free-list pointer overhead and the division truncates to an integer.
For example: the total size of the memory pool data area is set to 4096 bytes, and the memory block size block_size is set to 80 bytes; then the number of memory blocks applied for is 4096/(80+4)= 48.
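The formula and the example can be checked numerically with a tiny helper (the function name is illustrative):

```c
/* blocks that fit in a pool data area: each block pays a 4-byte
 * free-list pointer, and integer division truncates the result */
static unsigned mempool_block_count(unsigned size, unsigned block_size)
{
    const unsigned overhead = 4;      /* per-block list pointer */
    return size / (block_size + overhead);
}
```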
Detaching the memory pool will detach the memory pool object from the kernel object manager. Detaching the memory pool uses the following function interface:
After using this function interface, the kernel first wakes up all threads waiting on the memory pool object, and then detaches the memory pool object from the kernel object manager. The following table describes the input parameters and return values of this function:
rt_mp_detach() Input Parameters and Return Values
Parameter | Description
--- | ---
mp | Memory pool object
Return | ——
RT_EOK | Success
Allocating and freeing memory blocks
Allocate a memory block from the specified memory pool using the following interface:
The time parameter is the timeout for the allocation request. If the pool has free blocks, the next block is taken from the free list, the free count is decremented, and the block is returned. If the pool has no free blocks, the timeout setting decides: if the timeout is zero, RT_NULL is returned immediately; if it is greater than zero, the current thread is suspended on the pool object until a free block becomes available or the timeout expires. The following table describes the input parameters and return values of this function:
Input parameters and return value of rt_mp_alloc()
Parameter | Description
--- | ---
mp | Memory pool object
time | Timeout
Return | ——
Address of the allocated memory block | Success
RT_NULL | Failure
Any memory block must be released after use, otherwise it will cause memory leaks. The following interface is used to release the memory block:
This interface first computes, from the pointer of the block being released, the memory pool object that owns the block; it then increments the pool's free-block count and adds the block to the free list. If any thread is suspended on the pool object, the first thread on the suspension list is woken. The following table describes the input parameters of this function:
Input parameters of rt_mp_free()
Parameter | Description
--- | ---
block | Memory block pointer
This is a static memory pool application routine. It creates a static memory pool object and two dynamic threads: one thread tries to obtain memory blocks from the pool, and the other releases them, as shown in the following code:
Memory pool usage examples
The simulation results are as follows:
This routine initializes 4096 / (80 + 4) = 48 memory blocks when initializing the memory pool object.
① After thread 1 allocates 48 memory blocks, the pool is exhausted, and no more blocks can be allocated until some are released elsewhere. Thread 1 then requests one more block in waiting mode; since none is available, thread 1 is suspended.
② Thread 2 begins releasing memory. When it releases a block, one block becomes free, and thread 1 is woken to complete its allocation. It then requests another block, is suspended again, and step ② repeats.
③ Thread 2 releases the remaining memory blocks until all have been released.