mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-01-25 17:53:34 -05:00
linux/mm
Anton Vorontsov 70ddf637ee memcg: add memory.pressure_level events
With this patch, userland applications that want to keep interactivity
and memory allocation cost under control can use the pressure level
notifications.  The levels are defined like this:

The "low" level means that the system is reclaiming memory for new
allocations.  Monitoring this reclaiming activity might be useful for
maintaining cache levels.  Upon notification, the program (typically
an "Activity Manager") might analyze vmstat and act in advance (e.g.
shut down unimportant services early).

The "medium" level means that the system is experiencing medium memory
pressure: it might be swapping, paging out active file caches, etc.
Upon this event, applications may decide to further analyze
vmstat/zoneinfo/memcg or internal memory usage statistics and free any
resources that can be easily reconstructed or re-read from disk.

The "critical" level means that the system is actively thrashing: it is
about to run out of memory (OOM), or the in-kernel OOM killer is even
about to trigger.  Applications should do whatever they can to help the
system.  It might be too late to consult vmstat or any other
statistics, so it's advisable to take immediate action.
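
As a rough illustration of how a userland program might subscribe to one
of these levels, here is a minimal sketch.  It assumes the cgroup v1
event mechanism (an eventfd registered by writing
"<event_fd> <pressure_fd> <level>" to cgroup.event_control), which this
commit message does not spell out.  The struct and function names, and
any cgroup directory passed in, are hypothetical.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical handle; both descriptors stay open while listening. */
struct pressure_listener {
	int event_fd;     /* signalled when the chosen level is reached */
	int pressure_fd;  /* open handle on memory.pressure_level */
};

/* Build the assumed "<event_fd> <pressure_fd> <level>" control string. */
int format_pressure_registration(char *buf, size_t len, int efd, int pfd,
				 const char *level)
{
	int n = snprintf(buf, len, "%d %d %s", efd, pfd, level);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Register for "low", "medium" or "critical" events on the memory cgroup
 * at cgroup_dir (caller-supplied path).  Returns 0 on success, -1 on error.
 */
int register_pressure_listener(const char *cgroup_dir, const char *level,
			       struct pressure_listener *out)
{
	char path[512], line[64];
	int efd, pfd = -1, cfd = -1, ret = -1;

	efd = eventfd(0, 0);
	if (efd < 0)
		return -1;

	snprintf(path, sizeof(path), "%s/memory.pressure_level", cgroup_dir);
	pfd = open(path, O_RDONLY);
	if (pfd < 0)
		goto out;

	snprintf(path, sizeof(path), "%s/cgroup.event_control", cgroup_dir);
	cfd = open(path, O_WRONLY);
	if (cfd < 0)
		goto out;

	if (format_pressure_registration(line, sizeof(line), efd, pfd, level) < 0)
		goto out;
	if (write(cfd, line, strlen(line)) < 0)
		goto out;

	out->event_fd = efd;
	out->pressure_fd = pfd;
	ret = 0;
out:
	if (cfd >= 0)
		close(cfd);
	if (ret < 0) {
		if (pfd >= 0)
			close(pfd);
		close(efd);
	}
	return ret;
}

/* Block until the eventfd is signalled.  Returns 0 on success, -1 on error. */
int wait_for_pressure(const struct pressure_listener *l)
{
	uint64_t count;

	return read(l->event_fd, &count, sizeof(count)) ==
	       (ssize_t)sizeof(count) ? 0 : -1;
}
```

A caller would register once per level of interest and then loop on
wait_for_pressure(), freeing easily reconstructed resources on "medium"
and acting immediately on "critical", as described above.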

The events are propagated upward until the event is handled, i.e. the
events are not pass-through.  For example, suppose you have three
cgroups, A->B->C, you set up an event listener on each of A, B and C,
and group C experiences some pressure.  In this situation, only group C
will receive the notification; groups A and B will not.  This is done to
avoid excessive "broadcasting" of messages, which disturbs the system
and is especially bad if we are low on memory or thrashing.  So,
organize the cgroups wisely, or propagate the events manually (or ask us
to implement pass-through events, explaining why you would need them).
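
The propagation rule above can be modelled with a small userland sketch.
This is an illustrative model only, not the kernel implementation; the
cgroup_node type and deliver_pressure_event() are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical model of one cgroup in a hierarchy such as A->B->C. */
struct cgroup_node {
	const char *name;
	struct cgroup_node *parent;    /* NULL for the topmost group */
	int has_listener;              /* non-zero if a listener is set up */
	unsigned int events_received;  /* notifications delivered here */
};

/*
 * Walk upward from the group under pressure and deliver the event to the
 * first group that has a listener, then stop (no pass-through).
 * Returns the group that handled the event, or NULL if none did.
 */
struct cgroup_node *deliver_pressure_event(struct cgroup_node *origin)
{
	struct cgroup_node *cg;

	for (cg = origin; cg != NULL; cg = cg->parent) {
		if (cg->has_listener) {
			cg->events_received++;
			return cg;
		}
	}
	return NULL;
}
```

With listeners on A, B and C, an event originating in C stops at C.  If
C had no listener, the same event would be delivered to B instead, and A
would still see nothing.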

Performance-wise, the memory pressure notifications feature itself is
lightweight and does not require much bookkeeping, in contrast to the
rest of the memcg features.  Unfortunately, in the current memcg
implementation, page accounting is an inseparable part and cannot be
turned off.  The good news is that there are some efforts[1] to improve
the situation; in addition, implementing the same, fully
API-compatible[2] interface for the CONFIG_MEMCG=n case (e.g. embedded)
is also a viable option, so no changes would be required on the
userland side.

[1] http://permalink.gmane.org/gmane.linux.kernel.cgroups/6291
[2] http://lkml.org/lkml/2013/2/21/454

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix CONFIG_CGROUPS=n warnings]
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Leonid Moiseichuk <leonid.moiseichuk@nokia.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 15:54:38 -07:00
..
backing-dev.c
balloon_compaction.c
bootmem.c
bounce.c mm: make snapshotting pages for stable writes a per-bio operation 2013-04-29 15:54:33 -07:00
cleancache.c fs: encode_fh: return FILEID_INVALID if invalid fid_type 2013-02-26 02:46:10 -05:00
compaction.c
debug-pagealloc.c
dmapool.c
fadvise.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
failslab.c
filemap.c mm: trace filemap add and del 2013-04-29 15:54:28 -07:00
filemap_xip.c
fremap.c Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace programs" 2013-03-28 17:45:51 -07:00
frontswap.c
highmem.c
huge_memory.c THP: fix comment about memory barrier 2013-04-29 15:54:37 -07:00
hugetlb.c mm, hugetlb: include hugepages in meminfo 2013-04-29 15:54:35 -07:00
hugetlb_cgroup.c
hwpoison-inject.c
init-mm.c
internal.h mm: accelerate munlock() treatment of THP pages 2013-02-27 19:10:09 -08:00
interval_tree.c
Kconfig Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ksm.c ksm: fix m68k build: only NUMA needs pfn_to_nid 2013-03-08 15:05:34 -08:00
maccess.c
madvise.c mm: madvise: complete input validation before taking lock 2013-04-29 15:54:37 -07:00
Makefile memcg: add memory.pressure_level events 2013-04-29 15:54:38 -07:00
memblock.c memblock: add assertion for zero allocation alignment 2013-04-29 15:54:28 -07:00
memcontrol.c memcg: add memory.pressure_level events 2013-04-29 15:54:38 -07:00
memory-failure.c HWPOISON: check dirty flag to match against clean page 2013-04-29 15:54:28 -07:00
memory.c THP: fix comment about memory barrier 2013-04-29 15:54:37 -07:00
memory_hotplug.c mm, hotplug: avoid compiling memory hotremove functions when disabled 2013-04-29 15:54:37 -07:00
mempolicy.c mm/mempolicy.c: fix sp_node_init() argument ordering 2013-03-08 15:05:34 -08:00
mempool.c
migrate.c mm: rewrite the comment over migrate_pages() more comprehensibly 2013-04-29 15:54:37 -07:00
mincore.c
mlock.c Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace programs" 2013-03-28 17:45:51 -07:00
mm_init.c
mmap.c mm: reinititalise user and admin reserves if memory is added or removed 2013-04-29 15:54:37 -07:00
mmu_context.c
mmu_notifier.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
mmzone.c
mprotect.c
mremap.c
msync.c
nobootmem.c
nommu.c mm: replace hardcoded 3% with admin_reserve_pages knob 2013-04-29 15:54:36 -07:00
oom_kill.c
page-writeback.c mm: make snapshotting pages for stable writes a per-bio operation 2013-04-29 15:54:33 -07:00
page_alloc.c page_alloc: make setup_nr_node_ids() usable for arch init code 2013-04-29 15:54:36 -07:00
page_cgroup.c
page_io.c
page_isolation.c
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c
pgtable-generic.c
process_vm_access.c Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys 2013-03-12 11:05:45 -07:00
quicklist.c
readahead.c
rmap.c rmap: recompute pgoff for unmapping huge page 2013-04-29 15:54:28 -07:00
shmem.c mm/shmem.c: remove an ifdef 2013-04-29 15:54:28 -07:00
slab.c
slab.h
slab_common.c
slob.c
slub.c mm/slub.c: use register_hotmemory_notifier() 2013-04-29 15:54:36 -07:00
sparse-vmemmap.c sparse-vmemmap: specify vmemmap population range in bytes 2013-04-29 15:54:35 -07:00
sparse.c mm, hotplug: avoid compiling memory hotremove functions when disabled 2013-04-29 15:54:37 -07:00
swap.c
swap_state.c
swapfile.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
truncate.c
util.c
vmalloc.c kexec, vmalloc: export additional vmalloc layer information 2013-04-29 15:54:34 -07:00
vmpressure.c memcg: add memory.pressure_level events 2013-04-29 15:54:38 -07:00
vmscan.c memcg: add memory.pressure_level events 2013-04-29 15:54:38 -07:00
vmstat.c mm: remove CONFIG_HOTPLUG ifdefs 2013-04-29 15:54:37 -07:00