
==================
NUMA Memory Policy
==================

What is NUMA Memory Policy?
============================
In the Linux kernel, "memory policy" determines from which node the kernel will
allocate memory in a NUMA system or in an emulated NUMA system. Linux has
supported platforms with Non-Uniform Memory Access architectures since the 2.4
kernel series. The current memory policy support was added to Linux 2.6 around
May 2004. This document attempts to describe the concepts and APIs of the 2.6
memory policy support.
Memory policies should not be confused with cpusets
(``Documentation/admin-guide/cgroup-v1/cpusets.rst``), which is an
administrative mechanism for restricting the nodes from which memory may be
allocated by a set of processes. Memory policies are a programming interface
that a NUMA-aware application can take advantage of. When both cpusets and
memory policies are applied to a task, the restrictions of the cpuset take
priority. See :ref:`Memory Policies and cpusets <mem_pol_and_cpusets>` below
for more details.
Memory Policy Concepts
======================

Scope of Memory Policies
------------------------
The Linux kernel supports _scopes_ of memory policy, described here from most
general to most specific:
System Default Policy
    this policy is hard coded into the kernel and governs all page
    allocations that are not controlled by one of the more specific policy
    scopes below. During boot up, the system default policy interleaves
    allocations across all nodes with "sufficient" memory, so as not to
    overload the initial boot node with boot-time allocations; once the
    system is up and running, the system default policy uses "local
    allocation", described below.
Task/Process Policy
    this is an optional, per-task policy. When defined for a specific task,
    this policy controls all page allocations made by or on behalf of the
    task that are not controlled by a more specific scope.

    The task policy is inherited across fork() and exec*(), which allows a
    parent task to establish the task policy for a child task exec()'d from an
    executable image that has no awareness of memory policy. See the
    :ref:`Memory Policy APIs <memory_policy_apis>` section, below, for an
    overview of the system call used to set or change the task/process
    policy.

    In a multi-threaded task, task policies apply only to the thread
    [Linux kernel task] that installs the policy and to threads subsequently
    created by that thread; sibling threads retain whatever policy they
    already had.

    A task policy applies only to pages allocated after the policy is
    installed. Any pages already faulted in by the task when the task
    changes its task policy remain where they were allocated.
75 A "VMA" or "Virtual Memory Area" refers to a range of a task's
78 :ref:`Memory Policy APIs <memory_policy_apis>` section,
98 mapping-- i.e., at Copy-On-Write.
101 virtual address space--a.k.a. threads--independent of when
106 are NOT inheritable across exec(). Thus, only NUMA-aware
109 * A task may install a new VMA policy on a sub-range of a
111 the existing virtual memory area into 2 or 3 VMAs, each with
Shared Policy
    Conceptually, shared policies apply to "memory objects" mapped shared
    into one or more tasks' distinct address spaces. An application installs
    shared policies the same way as VMA
    policies--using the mbind() system call specifying a range of
    virtual addresses that map the shared object. The policy, however,
    attaches to the shared object itself rather than to a range of one
    task's address space (see the sketch following this list).

    As of 2.6.22, only shared memory segments, created by shmget() or
    mmap(MAP_ANONYMOUS|MAP_SHARED), support shared policy. At the time
    shared policy support was added, hugetlbfs did not
    support allocation at fault time--a.k.a lazy allocation--so hugetlbfs
    shmem segments were never "hooked up" to the shared policy support.

    As mentioned above in the :ref:`VMA policies <vma_policy>` section,
    allocations of page cache pages for regular files mmap()ed with
    MAP_SHARED ignore any VMA policy installed on the range.

    The shared policy infrastructure supports different policies on subset
    ranges of the shared object.
    Thus, different tasks that attach to a shared memory segment can have
    different VMA configurations mapping that one shared object. This can be
    seen by examining the /proc/<pid>/numa_maps of tasks sharing
    a shared memory region, when one task has installed shared policy on
    one or more ranges of the region.
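To make the distinction between VMA and shared policies concrete, here is a
minimal sketch. It is only an illustration: it assumes a system with memory
nodes 0 and 1 and the <numaif.h> wrappers shipped with the numactl/libnuma
package (link with -lnuma)::

    #include <numaif.h>             /* mbind(), MPOL_* (link with -lnuma) */
    #include <sys/mman.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <stdio.h>

    #define LEN (4UL * 1024 * 1024)

    int main(void)
    {
        unsigned long nodes01 = 0x3;    /* nodes 0 and 1 */
        unsigned long node0   = 0x1;    /* node 0 only */

        /* VMA policy: an attribute of this task's private anonymous
         * mapping; pages that back it must come from node 0. */
        void *priv = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (priv != MAP_FAILED &&
            mbind(priv, LEN, MPOL_BIND, &node0, 8 * sizeof(node0), 0))
            perror("mbind (VMA policy)");

        /* Shared policy: the same call on a SysV shared memory segment
         * attaches the policy to the shared object itself. */
        int id = shmget(IPC_PRIVATE, LEN, IPC_CREAT | 0600);
        void *shm = (id >= 0) ? shmat(id, NULL, 0) : (void *)-1;
        if (shm != (void *)-1 &&
            mbind(shm, LEN, MPOL_INTERLEAVE, &nodes01,
                  8 * sizeof(nodes01), 0))
            perror("mbind (shared policy)");

        return 0;
    }

Any task that attaches the SysV segment observes the Interleave policy
installed on it, whereas the MPOL_BIND policy above is an attribute of this
task's private mapping only.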
Components of Memory Policies
-----------------------------
A NUMA memory policy consists of a "mode", optional mode flags, and an
optional set of nodes. The mode determines the behavior of the policy, the
optional mode flags determine the behavior of the mode, and the optional set
of nodes can be viewed as the arguments to the policy behavior.

Internally, memory policies are implemented by a reference counted structure,
struct mempolicy. Details of this structure will be
discussed in context, below, as required to explain the behavior.
NUMA memory policy supports the following behavioral modes:
Default Mode--MPOL_DEFAULT
    This mode is only used in the memory policy APIs. Internally,
    MPOL_DEFAULT is converted to the NULL memory policy in all
    policy scopes. Any existing non-default policy will simply be removed
    when MPOL_DEFAULT is specified, so that allocations fall back to the
    next most specific policy scope.

    When specified in one of the memory policy APIs, the Default mode does
    not use the optional set of nodes; it is an error for the set of nodes
    specified for this policy to be non-empty.
MPOL_BIND
    This mode specifies that memory must come from the set of
    nodes specified by the policy. Memory will be allocated from
    the node in the set with sufficient free memory that is closest to the
    node where the allocation takes place.
MPOL_PREFERRED
    This mode specifies that the allocation should be attempted
    from the single node specified in the policy. If that
    allocation fails, the kernel will search other nodes, in order of
    increasing distance from the preferred node, based on information
    provided by the platform firmware.

    Internally, the Preferred policy uses a single node--the preferred_node
    member of struct mempolicy. "Local" allocation is, in effect, a
    Preferred policy whose preferred node is the node containing the CPU
    where the allocation takes place.
MPOL_INTERLEAVED
    This mode specifies that page allocations be interleaved, on a
    page granularity, across the nodes specified in the policy. Its
    behavior depends on the context in which it is used:

    For allocation of anonymous pages and shared memory pages, Interleave
    mode indexes the set of nodes specified by the policy using the page
    offset of the faulting address into the VMA containing the address,
    modulo the number of nodes specified by the policy, and attempts the
    allocation from the node at the resulting index.

    For allocation of page cache pages, Interleave mode uses a node counter
    maintained per task. This tends to spread pages out over the nodes
    specified by the policy based on the order in which they are allocated,
    rather than on any offset into an address range or file. During system
    boot up, the temporary
    interleaved system default policy works in this mode.
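    For example (an illustrative calculation): with an Interleave policy
    over three nodes, say 1, 3 and 5, an anonymous-page fault at page
    offset 7 into the VMA yields index 7 mod 3 = 1, so the allocation is
    first attempted on node 3, the second node specified by the policy.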
MPOL_PREFERRED_MANY
    This mode specifies that the allocation should preferably be
    satisfied from the nodemask specified in the policy. If there is
    memory pressure on all nodes in the nodemask, the allocation can fall
    back to all existing NUMA nodes. This is effectively MPOL_PREFERRED
    allowed for a mask rather than a single node.
NUMA memory policy supports the following optional mode flags:
MPOL_F_STATIC_NODES
    This flag specifies that the nodemask passed by the user should not be
    remapped if the task or VMA's set of allowed
    nodes changes after the memory policy has been defined.

    Without this flag, any time a mempolicy is rebound because of a
    change in the set of allowed nodes, the preferred nodemask (Preferred
    Many), preferred node (Preferred) or nodemask (Bind, Interleave) is
    remapped to the new set of allowed nodes. This may result in nodes
    being used that were previously undesired.

    With this flag, if the user-specified nodes overlap with the
    nodes allowed by the task's cpuset, then the memory policy is applied
    to their intersection. If the two sets of nodes do not overlap, the
    Default policy is used.

    For example, consider a task that is attached to a cpuset with
    mems 1-3 that sets an Interleave policy over the same set. If
    the cpuset's mems change to 3-5, without this flag the Interleave will
    now occur over nodes 3, 4 and 5; with this flag, since only node 3
    remains in the user's nodemask, the "interleave" occurs over that
    single node.
MPOL_F_RELATIVE_NODES
    This flag specifies that the nodemask passed by the user will be mapped
    relative to the task or VMA's set of allowed nodes. The kernel stores
    the user-passed nodemask, and if the allowed nodes change, the original
    nodemask is remapped relative to the new set of allowed nodes.

    Without this flag (and without MPOL_F_STATIC_NODES), any time a
    mempolicy is rebound because of a change in the set of allowed
    nodes, the node (Preferred) or nodemask (Bind, Interleave) is remapped
    to the new set of allowed nodes. That remap may not preserve the
    relative nature of the user's passed nodemask across successive
    rebinds: a nodemask of
    1,3,5 may be remapped to 7-9 and then to 1-3 if the set of allowed
    nodes is restored to its original state.

    With this flag, the remap keeps the node numbers from the user's passed
    nodemask relative to the set of allowed
    nodes. In other words, if nodes 0, 2, and 4 are set in the user's
    nodemask, the policy will be effected over the first (and in the
    Bind or Interleave case, the third and fifth) nodes in the set of
    allowed nodes.

    If the user's nodemask includes nodes that are outside the range
    of the new set of allowed nodes (for example, node 5 is set in
    the user's nodemask when the set of allowed nodes is only 0-3),
    then the remap wraps around to the beginning of the nodemask and,
    if not already set, sets the node in the mempolicy nodemask.
    For example, consider a task that is attached to a cpuset with
    mems 2-5 that sets an Interleave policy over the same set with
    MPOL_F_RELATIVE_NODES. If the cpuset's mems change to 3-7, the
    interleave now occurs over nodes 3,5-7. If the cpuset's mems
    then change to 0,2-3,5, then the interleave occurs over nodes
    0,2-3,5.

    Thanks to the consistent remapping, applications preparing
    nodemasks to specify memory policies using this flag should
    disregard their current, actual cpuset-imposed memory placement
    and prepare the nodemask as if they were always located on
    memory nodes 0 to N-1, where N is the number of memory nodes the
    policy is intended to manage. The kernel then remaps the mask onto the
    set of memory nodes allowed by the task's cpuset, as that may
    change over time.
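Following the preparation advice above, a task that wants to interleave over
the first four nodes its cpuset happens to allow, whatever their actual node
numbers, could do something like the following sketch. It assumes the mode
and flag definitions from the uapi <linux/mempolicy.h> header mentioned in
the APIs section below, and uses a raw syscall(2) in case the installed
wrapper library does not expose the mode flags::

    #include <linux/mempolicy.h>  /* MPOL_INTERLEAVE, MPOL_F_RELATIVE_NODES */
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        /* Bits 0-3: "the 1st through 4th allowed nodes", whatever their
         * actual node numbers turn out to be. */
        unsigned long relative_mask = 0xf;

        if (syscall(SYS_set_mempolicy,
                    MPOL_INTERLEAVE | MPOL_F_RELATIVE_NODES,
                    &relative_mask, 8 * sizeof(relative_mask)) != 0)
            perror("set_mempolicy");
        return 0;
    }

If the cpuset later gains or loses nodes, the kernel re-folds this relative
mask onto the new allowed set without further action by the application.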
Memory Policy Reference Counting
================================
When a new memory policy is allocated, its reference count is initialized to
'1', representing the reference held by the task that is installing the
new policy. When a pointer to a memory policy structure is stored in another
structure, another reference is added, as the task's reference will be
dropped on completion of the policy installation.

During run-time "usage" of the policy, we attempt to minimize atomic operations
on the reference count, as this can lead to cache lines bouncing between CPUs
and NUMA nodes. In particular, page allocation usage of task or VMA policy
occurs in the fault path, where the mmap_lock is held for read; because
replacing the task or VMA policy requires the mmap_lock to be held for write,
the policy cannot be freed while it is being used for allocation.
Shared memory policies require special consideration. One task can replace a
shared memory policy while another task, with a distinct mmap_lock, is
querying or allocating a page based on the policy. To resolve this, the
shared policy infrastructure takes an extra reference during lookup, and that
extra reference must then be dropped in the same query/allocation paths
used for non-shared policies. For this reason, shared policies are marked
as such, and the extra reference is dropped "conditionally"--i.e., only
for shared policies.

Because of this extra reference counting, and because we must look up
shared policies in a tree structure under spinlock, shared policies are
more expensive to use in the page allocation path. This is especially
true for shared policies on shared memory regions shared by tasks running
on different NUMA nodes. This overhead can be avoided by always
falling back to task or system default policy for shared memory regions,
or by prefaulting the entire shared memory region into memory and locking
it down; however, this might not be appropriate for all applications.
.. _memory_policy_apis:

Memory Policy APIs
==================

Linux supports 4 system calls for controlling memory policy. These APIs
always affect only the calling task, the calling task's address space, or
some shared object mapped into the calling task's address space.
.. note::
   the headers that define these APIs and the parameter data types for
   user space applications reside in a package that is not part of the
   Linux kernel. The kernel system call interfaces, with the 'sys\_'
   prefix, are defined in <linux/syscalls.h>; the mode and flag
   definitions are defined in <linux/mempolicy.h>.
Set [Task] Memory Policy::

    long set_mempolicy(int mode, const unsigned long *nmask,
                       unsigned long maxnode);

Sets the calling task's "task/process memory policy" to the mode specified by
the 'mode' argument and the set of nodes defined by 'nmask'. 'nmask' points
to a bit mask of node ids containing at least 'maxnode' ids. Optional mode
flags may be passed by OR'ing them into the 'mode' argument.

See the set_mempolicy(2) man page for more details.
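As a minimal sketch (assuming a system with memory nodes 0 and 1 and the
<numaif.h> wrappers from the numactl/libnuma package, linked with -lnuma)::

    #include <numaif.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned long nodemask = 0x3;   /* nodes 0 and 1 */

        /* Interleave all future page allocations by this task (and by
         * children created after this call) across nodes 0 and 1. */
        if (set_mempolicy(MPOL_INTERLEAVE, &nodemask,
                          8 * sizeof(nodemask)) != 0)
            perror("set_mempolicy");

        /* ... allocate and use memory here ... */

        /* Drop back to the default policy, i.e. fall back to the next
         * most specific policy scope. */
        if (set_mempolicy(MPOL_DEFAULT, NULL, 0) != 0)
            perror("set_mempolicy(MPOL_DEFAULT)");
        return 0;
    }

Reverting with MPOL_DEFAULT simply removes the task policy, as described in
the Default Mode entry above.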
Get [Task] Memory Policy or Related Information::

    long get_mempolicy(int *mode, unsigned long *nmask,
                       unsigned long maxnode, void *addr,
                       unsigned long flags);

Queries the "task/process memory policy" of the calling task, or the policy
or location of a specified virtual address, depending on the 'flags'
argument.

See the get_mempolicy(2) man page for more details.
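For example, a minimal sketch (same assumptions as above; the
MPOL_F_NODE | MPOL_F_ADDR combination asks for the ID of the node backing a
particular address instead of a policy)::

    #include <numaif.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int mode = -1;
        unsigned long nodemask = 0;
        char *buf = malloc(4096);

        /* Query the calling task's task/process policy and nodemask. */
        if (get_mempolicy(&mode, &nodemask, 8 * sizeof(nodemask),
                          NULL, 0) == 0)
            printf("task policy: mode=%d nodemask=0x%lx\n", mode, nodemask);

        /* Report the node on which the page backing 'buf' resides. */
        if (buf) {
            buf[0] = 1;             /* make sure the page is faulted in */
            if (get_mempolicy(&mode, NULL, 0, buf,
                              MPOL_F_NODE | MPOL_F_ADDR) == 0)
                printf("buf[0] is on node %d\n", mode);
        }
        free(buf);
        return 0;
    }

The address form is handy for verifying where previously installed policies
actually placed pages.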
Set home node for a Range of Task's Address Space::

    long sys_set_mempolicy_home_node(unsigned long start, unsigned long len,
                                     unsigned long home_node,
                                     unsigned long flags);

sys_set_mempolicy_home_node() sets the home node for a VMA policy present in
the task's address range; address ranges without an existing memory policy
are ignored. The home node is the NUMA node from which page allocations for
the range will preferentially be satisfied. Specifying the home node overrides
the default allocation policy of allocating memory close to the local node of
the executing CPU.
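A minimal sketch, assuming no library wrapper is available and kernel headers
new enough to define __NR_set_mempolicy_home_node; it also assumes a memory
policy (such as MPOL_BIND or MPOL_PREFERRED_MANY) is already installed on the
range::

    #include <sys/syscall.h>
    #include <unistd.h>

    /* Ask that allocations for [addr, addr + len) preferentially come
     * from 'home_node'.  Returns 0 on success, nonzero on failure. */
    static long set_home_node(void *addr, unsigned long len,
                              unsigned long home_node)
    {
    #ifdef __NR_set_mempolicy_home_node
        return syscall(__NR_set_mempolicy_home_node,
                       (unsigned long)addr, len, home_node, 0UL);
    #else
        (void)addr; (void)len; (void)home_node;
        return -1;                  /* headers predate this system call */
    #endif
    }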
Memory Policy Command Line Interface
====================================

Although not strictly part of the Linux implementation of memory policy, a
command line tool, numactl(8), exists that allows one to:

+ set the task policy for a specified program via set_mempolicy(2), fork(2)
  and exec(2)

+ set the shared policy for a shared memory segment via mbind(2)
The numactl(8) tool is packaged with the run-time version of the library
containing the memory policy system call wrappers. Some distributions
package the headers and compile-time libraries in a separate development
package.
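For example (an illustrative invocation; ``some_app`` stands for an arbitrary
program), the following runs a program with a task policy that interleaves its
allocations across nodes 0 and 1::

    numactl --interleave=0,1 some_app

numactl(8) offers similar switches for the other policy modes (for example,
--membind and --preferred); see its man page for the full list.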
.. _mem_pol_and_cpusets:

Memory Policies and cpusets
===========================

Memory policies work within cpusets as described above. For memory policies
that require a node or set of nodes, the nodes are restricted to the set of
nodes whose memories are allowed by the cpuset constraints. If the nodemask
specified for the policy contains nodes that are not allowed by the cpuset,
and MPOL_F_RELATIVE_NODES is not used, the intersection of the set of nodes
specified for the policy and the set of allowed nodes with memory is used. If
the result is the empty set, the policy is considered invalid and cannot be
installed.
The interaction of memory policies and cpusets can be problematic when tasks
in two cpusets share access to a memory region, such as shared memory segments
created by shmget() or mmap() with the MAP_ANONYMOUS and MAP_SHARED flags, and
any of the tasks install shared policy on the region: only nodes whose
memories are allowed in both cpusets may be used in the policies. Obtaining
this information requires "stepping outside" the memory policy APIs to use the
cpuset information and requires that one know in what cpusets other tasks
might be attaching to the shared region. Furthermore, if the cpusets' allowed
memory sets are disjoint, "local" allocation is the only valid policy.
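When coordinating policies across cpusets, one small building block is
get_mempolicy() with the MPOL_F_MEMS_ALLOWED flag, which reports the set of
nodes the calling task itself is currently allowed to use (it does not reveal
other tasks' cpusets). A minimal sketch, with the same <numaif.h> assumption
as above::

    #include <numaif.h>
    #include <stdio.h>

    int main(void)
    {
        int mode = 0;
        unsigned long allowed = 0;

        /* MPOL_F_MEMS_ALLOWED: report the nodes this task may use. */
        if (get_mempolicy(&mode, &allowed, 8 * sizeof(allowed),
                          NULL, MPOL_F_MEMS_ALLOWED) == 0)
            printf("allowed nodes: 0x%lx\n", allowed);
        return 0;
    }

Tasks in different cpusets would still need some out-of-band agreement (for
example, via the cpuset filesystem) to compute the common subset of nodes.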