Analysis of the Linux function warn_alloc(): page allocation failure

By adtxl / 2022-06-24

Reposted and edited from https://www.cnblogs.com/arnoldlu/p/10691034.html

The code excerpts in this article are from kernel version 4.19.176.

When allocating memory you will often run into messages of the form xxx: page allocation failure: order:10..., which is the output of warn_alloc().

warn_alloc() is called by the following functions: __alloc_pages_slowpath(), __vmalloc_area_node(), __vmalloc_node_range() and vmemmap_alloc_block().

The following looks at this problem in three parts:

  • What situations trigger warn_alloc()?
  • What does warn_alloc() actually do?
  • Analyzing the root cause of a real-world case.

1. What triggers warn_alloc()

To understand what situations lead to warn_alloc(), we need to look at where it is called from.

1.1 __alloc_pages_slowpath()

__alloc_pages_slowpath() means the page allocation has entered the slow path, which implies there is also a fast path.

From __alloc_pages_nodemask() we can see that this fast path is get_page_from_freelist(); __alloc_pages_slowpath() is the fallback for allocating pages.

// mm/page_alloc.c

static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
                        struct alloc_context *ac)
{
    bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
    const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER;
    struct page *page = NULL;
    unsigned int alloc_flags;
    unsigned long did_some_progress;
    enum compact_priority compact_priority;
    enum compact_result compact_result;
    int compaction_retries;
    int no_progress_loops;
    unsigned int cpuset_mems_cookie;
    int reserve_flags;

    /*
     * We also sanity check to catch abuse of atomic reserves being used by
     * callers that are not in atomic context.
     */
    if (WARN_ON_ONCE((gfp_mask & (__GFP_ATOMIC|__GFP_DIRECT_RECLAIM)) ==
                (__GFP_ATOMIC|__GFP_DIRECT_RECLAIM)))
        gfp_mask &= ~__GFP_ATOMIC;

retry_cpuset:
    compaction_retries = 0;
    no_progress_loops = 0;
    compact_priority = DEF_COMPACT_PRIORITY;
    cpuset_mems_cookie = read_mems_allowed_begin();

    /*
     * The fast path uses conservative alloc_flags to succeed only until
     * kswapd needs to be woken up, and to avoid the cost of setting up
     * alloc_flags precisely. So we do that now.
     */
    alloc_flags = gfp_to_alloc_flags(gfp_mask);

    /*
     * We need to recalculate the starting point for the zonelist iterator
     * because we might have used different nodemask in the fast path, or
     * there was a cpuset modification and we are retrying - otherwise we
     * could end up iterating over non-eligible zones endlessly.
     */
    ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
                    ac->high_zoneidx, ac->nodemask);
    if (!ac->preferred_zoneref->zone)
        goto nopage;

    if (gfp_mask & __GFP_KSWAPD_RECLAIM)
        wake_all_kswapds(order, gfp_mask, ac);

    /*
     * The adjusted alloc_flags might result in immediate success, so try
     * that first
     */
    page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
    if (page)
        goto got_pg;

    /*
     * For costly allocations, try direct compaction first, as it's likely
     * that we have enough base pages and don't need to reclaim. For non-
     * movable high-order allocations, do that as well, as compaction will
     * try prevent permanent fragmentation by migrating from blocks of the
     * same migratetype.
     * Don't try this for allocations that are allowed to ignore
     * watermarks, as the ALLOC_NO_WATERMARKS attempt didn't yet happen.
     */
    if (can_direct_reclaim &&
            (costly_order ||
               (order > 0 && ac->migratetype != MIGRATE_MOVABLE))
            && !gfp_pfmemalloc_allowed(gfp_mask)) {
        page = __alloc_pages_direct_compact(gfp_mask, order,
                        alloc_flags, ac,
                        INIT_COMPACT_PRIORITY,
                        &compact_result);
        if (page)
            goto got_pg;

        /*
         * Checks for costly allocations with __GFP_NORETRY, which
         * includes THP page fault allocations
         */
        if (costly_order && (gfp_mask & __GFP_NORETRY)) {
            /*
             * If compaction is deferred for high-order allocations,
             * it is because sync compaction recently failed. If
             * this is the case and the caller requested a THP
             * allocation, we do not want to heavily disrupt the
             * system, so we fail the allocation instead of entering
             * direct reclaim.
             */
            if (compact_result == COMPACT_DEFERRED)
                goto nopage;

            /*
             * Looks like reclaim/compaction is worth trying, but
             * sync compaction could be very expensive, so keep
             * using async compaction.
             */
            compact_priority = INIT_COMPACT_PRIORITY;
        }
    }

retry:
    /* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
    if (gfp_mask & __GFP_KSWAPD_RECLAIM)
        wake_all_kswapds(order, gfp_mask, ac);

    reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
    if (reserve_flags)
        alloc_flags = reserve_flags;

    /*
     * Reset the nodemask and zonelist iterators if memory policies can be
     * ignored. These allocations are high priority and system rather than
     * user oriented.
     */
    if (!(alloc_flags & ALLOC_CPUSET) || reserve_flags) {
        ac->nodemask = NULL;
        ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
                    ac->high_zoneidx, ac->nodemask);
    }

    /* Attempt with potentially adjusted zonelist and alloc_flags */
    page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
    if (page)
        goto got_pg;

    /* Caller is not willing to reclaim, we can't balance anything */
    if (!can_direct_reclaim)
        goto nopage;

    /* Avoid recursion of direct reclaim */
    if (current->flags & PF_MEMALLOC)
        goto nopage;

    /* Try direct reclaim and then allocating */
    page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
                            &did_some_progress);
    if (page)
        goto got_pg;

    /* Try direct compaction and then allocating */
    page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
                    compact_priority, &compact_result);
    if (page)
        goto got_pg;

    /* Do not loop if specifically requested */
    if (gfp_mask & __GFP_NORETRY)
        goto nopage;

    /*
     * Do not retry costly high order allocations unless they are
     * __GFP_RETRY_MAYFAIL
     */
    if (costly_order && !(gfp_mask & __GFP_RETRY_MAYFAIL))
        goto nopage;

    if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
                 did_some_progress > 0, &no_progress_loops))
        goto retry;

    /*
     * It doesn't make any sense to retry for the compaction if the order-0
     * reclaim is not able to make any progress because the current
     * implementation of the compaction depends on the sufficient amount
     * of free memory (see __compaction_suitable)
     */
    if (did_some_progress > 0 &&
            should_compact_retry(ac, order, alloc_flags,
                compact_result, &compact_priority,
                &compaction_retries))
        goto retry;


    /* Deal with possible cpuset update races before we start OOM killing */
    if (check_retry_cpuset(cpuset_mems_cookie, ac))
        goto retry_cpuset;

    /* Reclaim has failed us, start killing things */
    page = __alloc_pages_may_oom(gfp_mask, order, ac, &did_some_progress);
    if (page)
        goto got_pg;

    /* Avoid allocations with no watermarks from looping endlessly */
    if (tsk_is_oom_victim(current) &&
        (alloc_flags == ALLOC_OOM ||
         (gfp_mask & __GFP_NOMEMALLOC)))
        goto nopage;

    /* Retry as long as the OOM killer is making progress */
    if (did_some_progress) {
        no_progress_loops = 0;
        goto retry;
    }

nopage:
    /* Deal with possible cpuset update races before we fail */
    if (check_retry_cpuset(cpuset_mems_cookie, ac))
        goto retry_cpuset;

    /*
     * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
     * we always retry
     */
    if (gfp_mask & __GFP_NOFAIL) {
        /*
         * All existing users of the __GFP_NOFAIL are blockable, so warn
         * of any new users that actually require GFP_NOWAIT
         */
        if (WARN_ON_ONCE(!can_direct_reclaim))
            goto fail;

        /*
         * PF_MEMALLOC request from this context is rather bizarre
         * because we cannot reclaim anything and only can loop waiting
         * for somebody to do a work for us
         */
        WARN_ON_ONCE(current->flags & PF_MEMALLOC);

        /*
         * non failing costly orders are a hard requirement which we
         * are not prepared for much so let's warn about these users
         * so that we can identify them and convert them to something
         * else.
         */
        WARN_ON_ONCE(order > PAGE_ALLOC_COSTLY_ORDER);

        /*
         * Help non-failing allocations by giving them access to memory
         * reserves but do not use ALLOC_NO_WATERMARKS because this
         * could deplete whole memory reserves which would just make
         * the situation worse
         */
        page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_HARDER, ac);
        if (page)
            goto got_pg;

        cond_resched();
        goto retry;
    }
fail:
    warn_alloc(gfp_mask, ac->nodemask,
            "page allocation failure: order:%u", order);
got_pg:
    return page;
}

1.2 __vmalloc_area_node()

static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
                 pgprot_t prot, int node)
{
    struct page **pages;
    unsigned int nr_pages, array_size, i;
    const gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
    const gfp_t alloc_mask = gfp_mask | __GFP_NOWARN;
    const gfp_t highmem_mask = (gfp_mask & (GFP_DMA | GFP_DMA32)) ?
                    0 :
                    __GFP_HIGHMEM;

    nr_pages = get_vm_area_size(area) >> PAGE_SHIFT;
    array_size = (nr_pages * sizeof(struct page *));

    /* Please note that the recursion is strictly bounded. */
    if (array_size > PAGE_SIZE) {
        pages = __vmalloc_node(array_size, 1, nested_gfp|highmem_mask,
                PAGE_KERNEL, node, area->caller);
    } else {
        pages = kmalloc_node(array_size, nested_gfp, node);
    }

    if (!pages) {
        remove_vm_area(area->addr);
        kfree(area);
        return NULL;
    }

    area->pages = pages;
    area->nr_pages = nr_pages;

    for (i = 0; i < area->nr_pages; i++) {
        struct page *page;

        if (node == NUMA_NO_NODE)
            page = alloc_page(alloc_mask|highmem_mask);
        else
            page = alloc_pages_node(node, alloc_mask|highmem_mask, 0);

        if (unlikely(!page)) {
            /* Successfully allocated i pages, free them in __vunmap() */
            area->nr_pages = i;
            atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
            goto fail;
        }
        area->pages[i] = page;
        if (gfpflags_allow_blocking(gfp_mask|highmem_mask))
            cond_resched();
    }
    atomic_long_add(area->nr_pages, &nr_vmalloc_pages);

    if (map_vm_area(area, prot, pages))
        goto fail;
    return area->addr;

fail:
    warn_alloc(gfp_mask, NULL,
              "vmalloc: allocation failure, allocated %ld of %ld bytes",
              (area->nr_pages*PAGE_SIZE), area->size);
    vfree(area->addr);
    return NULL;
}

1.3 __vmalloc_node_range()

// mm/vmalloc.c

/**
 *  __vmalloc_node_range  -  allocate virtually contiguous memory
 *  @size:      allocation size
 *  @align:     desired alignment
 *  @start:     vm area range start
 *  @end:       vm area range end
 *  @gfp_mask:  flags for the page level allocator
 *  @prot:      protection mask for the allocated pages
 *  @vm_flags:  additional vm area flags (e.g. %VM_NO_GUARD)
 *  @node:      node to use for allocation or NUMA_NO_NODE
 *  @caller:    caller's return address
 *
 *  Allocate enough pages to cover @size from the page level
 *  allocator with @gfp_mask flags.  Map them into contiguous
 *  kernel virtual space, using a pagetable protection of @prot.
 */
void *__vmalloc_node_range(unsigned long size, unsigned long align,
            unsigned long start, unsigned long end, gfp_t gfp_mask,
            pgprot_t prot, unsigned long vm_flags, int node,
            const void *caller)
{
    struct vm_struct *area;
    void *addr;
    unsigned long real_size = size;

    size = PAGE_ALIGN(size);
    if (!size || (size >> PAGE_SHIFT) > totalram_pages)
        goto fail;

    area = __get_vm_area_node(size, align, VM_ALLOC | VM_UNINITIALIZED |
                vm_flags, start, end, node, gfp_mask, caller);
    if (!area)
        goto fail;

    addr = __vmalloc_area_node(area, gfp_mask, prot, node);
    if (!addr)
        return NULL;

    /*
     * First make sure the mappings are removed from all page-tables
     * before they are freed.
     */
    vmalloc_sync_unmappings();

    /*
     * In this function, newly allocated vm_struct has VM_UNINITIALIZED
     * flag. It means that vm_struct is not fully initialized.
     * Now, it is fully initialized, so remove this flag here.
     */
    clear_vm_uninitialized_flag(area);

    kmemleak_vmalloc(area, size, gfp_mask);

    return addr;

fail:
    warn_alloc(gfp_mask, NULL,
              "vmalloc: allocation failure: %lu bytes", real_size);
    return NULL;
}

1.4 vmemmap_alloc_block()

// mm/sparse-vmemmap.c

void * __meminit vmemmap_alloc_block(unsigned long size, int node)
{
    /* If the main allocator is up use that, fallback to bootmem. */
    if (slab_is_available()) {
        gfp_t gfp_mask = GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_NOWARN;
        int order = get_order(size);
        static bool warned;
        struct page *page;

        page = alloc_pages_node(node, gfp_mask, order);
        if (page)
            return page_address(page);

        if (!warned) {
            warn_alloc(gfp_mask & ~__GFP_NOWARN, NULL,
                   "vmemmap alloc failure: order:%u", order);
            warned = true;
        }
        return NULL;
    } else
        return __earlyonly_bootmem_alloc(node, size, size,
                __pa(MAX_DMA_ADDRESS));
}

2. Dissecting warn_alloc()

warn_alloc() first prints the current process and the gfp_mask of the failed allocation, then dumps the call stack and a memory-info summary:

// mm/page_alloc.c
void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
{
    struct va_format vaf;
    va_list args;
    static DEFINE_RATELIMIT_STATE(nopage_rs, DEFAULT_RATELIMIT_INTERVAL,
                      DEFAULT_RATELIMIT_BURST);

    if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs))
        return;

    va_start(args, fmt);
    vaf.fmt = fmt;
    vaf.va = &args;
    // print the process name, the caller's message (order etc.), mode and nodemask
    pr_warn("%s: %pV, mode:%#x(%pGg), nodemask=%*pbl\n",
            current->comm, &vaf, gfp_mask, &gfp_mask,
            nodemask_pr_args(nodemask));
    // vaf above carries the format string and arguments passed into warn_alloc()
    va_end(args);

    cpuset_print_current_mems_allowed();
    // dump the call stack
    dump_stack();
    // show memory statistics
    warn_alloc_show_mem(gfp_mask, nodemask);
}

Different pages have different attributes (migration types), and the letters in the warn_alloc() output correspond to these attributes, mainly M, U, E and C.

// mm/page_alloc.c

static void show_migration_types(unsigned char type)
{
    static const char types[MIGRATE_TYPES] = {
        [MIGRATE_UNMOVABLE] = 'U',
        [MIGRATE_MOVABLE]   = 'M',
        [MIGRATE_RECLAIMABLE]   = 'E',
        [MIGRATE_HIGHATOMIC]    = 'H',
#ifdef CONFIG_CMA
        [MIGRATE_CMA]       = 'C',
#endif
#ifdef CONFIG_MEMORY_ISOLATION
        [MIGRATE_ISOLATE]   = 'I',
#endif
    };
    char tmp[MIGRATE_TYPES + 1];
    char *p = tmp;
    int i;

    for (i = 0; i < MIGRATE_TYPES; i++) {
        if (type & (1 << i))
            *p++ = types[i];
    }

    *p = '\0';
    printk(KERN_CONT "(%s) ", tmp);
}

3. Case analysis

Example 1

During monkey testing with frequent app switching, a large number of page allocation failures were reported, as shown below. From the log we can see that the wifi process failed while allocating order-3 pages.

[ 1798.431973] .(1)[2164:wifi@1.0-servic]usbcore: deregistering interface driver wlan
[ 1798.802034] .(0)[2164:wifi@1.0-servic][wlan]Set ALL DBG module log level to [0x2f]
[ 1798.817716] .(0)[2164:wifi@1.0-servic][wlan]Reset ALL DBG module log level to DEFAULT!
[ 1798.829657] .(0)[2164:wifi@1.0-servic]wifi@1.0-servic: page allocation failure: order:3, mode:0x484020(GFP_ATOMIC|__GFP_COMP), nodemask=(null)
[ 1798.844969] -(0)[2164:wifi@1.0-servic]CPU: 0 PID: 2164 Comm: wifi@1.0-servic Tainted: P           O      4.19.176 #2
[ 1798.855542] -(0)[2164:wifi@1.0-servic]Hardware name: UniPhier LD20 Global Board v4 (REF_LD20_GP_V4) (DT)
[ 1798.865052] -(0)[2164:wifi@1.0-servic]Call trace:
[ 1798.869772] -(0)[2164:wifi@1.0-servic] dump_backtrace+0x0/0x1b0
[ 1798.875711] -(0)[2164:wifi@1.0-servic] show_stack+0x24/0x30
[ 1798.881301] -(0)[2164:wifi@1.0-servic] dump_stack+0xb4/0xec
[ 1798.886889] -(0)[2164:wifi@1.0-servic] warn_alloc+0xf0/0x158
[ 1798.892560] -(0)[2164:wifi@1.0-servic] __alloc_pages_nodemask+0xb5c/0xd68
[ 1798.899368] -(0)[2164:wifi@1.0-servic] kmalloc_order+0x38/0x78
[ 1798.905216] -(0)[2164:wifi@1.0-servic] kmalloc_order_trace+0x3c/0x110
[ 1798.911770] -(0)[2164:wifi@1.0-servic] glSetHifInfo+0x328/0x610 [wlan_7961_usb]
[ 1798.919163] -(0)[2164:wifi@1.0-servic] wlanGetConfig+0x428/0xc98 [wlan_7961_usb]
[ 1798.926654] -(0)[2164:wifi@1.0-servic] kalP2pIndicateChnlSwitch+0x61c/0x658 [wlan_7961_usb]
[ 1798.935035] -(0)[2164:wifi@1.0-servic] usb_probe_interface+0x190/0x2e8
[ 1798.941585] -(0)[2164:wifi@1.0-servic] really_probe+0x3c4/0x420
[ 1798.947518] -(0)[2164:wifi@1.0-servic] driver_probe_device+0x9c/0x148
[ 1798.953976] -(0)[2164:wifi@1.0-servic] __driver_attach+0x154/0x158
[ 1798.960176] -(0)[2164:wifi@1.0-servic] bus_for_each_dev+0x78/0xe0
[ 1798.966283] -(0)[2164:wifi@1.0-servic] driver_attach+0x30/0x40
[ 1798.972133] -(0)[2164:wifi@1.0-servic] bus_add_driver+0x1f0/0x288
[ 1798.978243] -(0)[2164:wifi@1.0-servic] driver_register+0x68/0x118
[ 1798.984353] -(0)[2164:wifi@1.0-servic] usb_register_driver+0x7c/0x170
[ 1798.990883] -(0)[2164:wifi@1.0-servic] glRegisterBus+0x88/0xa0 [wlan_7961_usb]
[ 1798.998183] -(0)[2164:wifi@1.0-servic] init_module+0x2b8/0x2d8 [wlan_7961_usb]
[ 1799.005429] -(0)[2164:wifi@1.0-servic] do_one_initcall+0x5c/0x260
[ 1799.011539] -(0)[2164:wifi@1.0-servic] do_init_module+0x64/0x1ec
[ 1799.017560] -(0)[2164:wifi@1.0-servic] load_module+0x1c7c/0x1ec0
[ 1799.023582] -(0)[2164:wifi@1.0-servic] __se_sys_finit_module+0xa0/0x100
[ 1799.030215] -(0)[2164:wifi@1.0-servic] __arm64_sys_finit_module+0x24/0x30
[ 1799.037023] -(0)[2164:wifi@1.0-servic] el0_svc_common.constprop.0+0x7c/0x198
[ 1799.044090] -(0)[2164:wifi@1.0-servic] el0_svc_compat_handler+0x2c/0x38
[ 1799.050721] -(0)[2164:wifi@1.0-servic] el0_svc_compat+0x8/0x34
[ 1799.062345] .(0)[2164:wifi@1.0-servic]Mem-Info:
[ 1799.068807] .(0)[2164:wifi@1.0-servic]active_anon:23568 inactive_anon:23598 isolated_anon:0
[ 1799.068807]  active_file:17726 inactive_file:16226 isolated_file:0
[ 1799.068807]  unevictable:957 dirty:25 writeback:0 unstable:0
[ 1799.068807]  slab_reclaimable:10288 slab_unreclaimable:23111
[ 1799.068807]  mapped:27324 shmem:1783 pagetables:9768 bounce:0
[ 1799.068807]  free:90821 free_pcp:1085 free_cma:80512
[ 1799.107204] .(0)[2164:wifi@1.0-servic]Node 0 active_anon:94396kB inactive_anon:94392kB active_file:70904kB inactive_file:65028kB unevictable:3828kB isolated(anon):0kB isolated(file):0kB mapped:109296kB dirty:100kB writeback:0kB shmem:7132kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[ 1799.139110] DMA32 free:362116kB min:4556kB low:29180kB high:30484kB active_anon:94412kB inactive_anon:94396kB active_file:70828kB inactive_file:65156kB unevictable:3828kB writepending:100kB present:1382784kB managed:1306504kB mlocked:3828kB kernel_stack:28688kB pagetables:39072kB bounce:0kB free_pcp:4352kB local_pcp:980kB free_cma:321800kB
[ 1799.172271] .(3)[2164:wifi@1.0-servic]lowmem_reserve[]: 0 0 0
[ 1799.181357] DMA32: 4060*4kB (UMECH) 942*8kB (UMECH) 223*16kB (UMCH) 47*32kB (UC) 176*64kB (UMECH) 53*128kB (UMEC) 1*256kB (C) 5*512kB (C) 3*1024kB (C) 1*2048kB (C) 75*4096kB (C) = 362032kB
[ 1799.213727] .(3)[2164:wifi@1.0-servic]Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 1799.230588] .(3)[2164:wifi@1.0-servic]36685 total pagecache pages
[ 1799.237426] .(3)[2164:wifi@1.0-servic]291 pages in swap cache
[ 1799.244467] .(3)[2164:wifi@1.0-servic]Swap cache stats: add 192421, delete 192130, find 6130/80536
[ 1799.254343] .(3)[2164:wifi@1.0-servic]Free swap  = 59508kB
[ 1799.261722] .(3)[2164:wifi@1.0-servic]Total swap = 495612kB
[ 1799.273726] .(3)[2164:wifi@1.0-servic]345696 pages RAM
[ 1799.279474] .(3)[2164:wifi@1.0-servic]0 pages HighMem/MovableOnly
[ 1799.287385] .(3)[2164:wifi@1.0-servic]19070 pages reserved
[ 1799.294245] .(3)[2164:wifi@1.0-servic]95232 pages cma reserved
[ 1799.300863] .(3)[2164:wifi@1.0-servic]0 pages hwpoisoned
[ 1799.313057] .(3)[2164:wifi@1.0-servic]wlan: probe of 1-3:1.3 failed with error -1
[ 1799.322287] .(3)[2164:wifi@1.0-servic]usbcore: registered new interface driver wlan

From the printed zone statistics, i.e. the line quoted below, we can see that for order-3 pages (32 KB blocks) there are still 47 free: 47*32kB (UC). However, they are all of type UC, i.e. UNMOVABLE and CMA pages. The allocation was made with the flags (GFP_ATOMIC|__GFP_COMP), so it needed an atomic page: GFP_ATOMIC allocations are typically made from interrupt handlers or other contexts outside a process context and must never sleep.

[ 1799.181357] DMA32: 4060*4kB (UMECH) 942*8kB (UMECH) 223*16kB (UMCH) 47*32kB (UC) 176*64kB (UMECH) 53*128kB (UMEC) 1*256kB (C) 5*512kB (C) 3*1024kB (C) 1*2048kB (C) 75*4096kB (C) = 362032kB

The first suspicion for this kind of problem is a memory leak somewhere in the system, leaving too few pages usable by GFP_ATOMIC allocations and causing the failure. However, investigation found no memory leak anywhere in the system.

Looking at /proc/pagetypeinfo right after boot, the number of pages usable by GFP_ATOMIC allocations was already small at startup, so the suspicion shifted to the system simply having too little memory. After shrinking some of the no-map reserved regions and part of the CMA reservation to increase the memory available to the system, the number of such pages at boot grew noticeably, and subsequent testing no longer reproduced the page allocation failure. The root cause is that the board has relatively little DDR, while some modules reserve too much no-map and CMA memory in the device tree. In practice the utilization of the CMA memory turned out to be quite low; how to improve the system's use of CMA memory still needs further study.
