Reposted from https://blog.csdn.net/shift_wwx/article/details/121593698
5. mainloop
We will not go too deep here, since it involves delays that different handlers may introduce.
The main loop is built on the epoll mechanism, blocking in epoll_wait until an event fires:
/* Wait for events until the next polling timeout */
nevents = epoll_wait(epollfd, events, maxevents, delay);
After waking up, it does two main things: it checks whether any data connection has been dropped, then runs the handlers:
/*
* First pass to see if any data socket connections were dropped.
* Dropped connection should be handled before any other events
* to deallocate data connection and correctly handle cases when
* connection gets dropped and reestablished in the same epoll cycle.
* In such cases it's essential to handle connection closures first.
*/
for (i = 0, evt = &events[0]; i < nevents; ++i, evt++) {
if ((evt->events & EPOLLHUP) && evt->data.ptr) {
ALOGI("lmkd data connection dropped");
handler_info = (struct event_handler_info*)evt->data.ptr;
ctrl_data_close(handler_info->data);
}
}
/* Second pass to handle all other events */
for (i = 0, evt = &events[0]; i < nevents; ++i, evt++) {
if (evt->events & EPOLLERR) {
ALOGD("EPOLLERR on event #%d", i);
}
if (evt->events & EPOLLHUP) {
/* This case was handled in the first pass */
continue;
}
if (evt->data.ptr) {
handler_info = (struct event_handler_info*)evt->data.ptr;
call_handler(handler_info, &poll_params, evt->events);
}
}
From init we know that epoll monitors 9 events in total, and each event fd maps to its own handler logic. The handlers break down roughly as:
- one socket listener fd, watching /dev/socket/lmkd, added to epoll in init();
- three client data socket fds for data exchange, added to epoll in ctrl_connect_handler();
- three pressure (PSI) state monitors, added to epoll in init_psi_monitors() -> init_mp_psi() (or in init_mp_common() under the legacy strategy);
- one LMK event kpoll_fd monitor, added to epoll in init(); newer lmkd no longer uses this one;
- one pidfd monitor waiting for a killed process to die, added to epoll in start_wait_for_proc_kill().
Below we examine each handler's processing flow in detail.
6. ctrl listener fd handling: ctrl_connect_handler
First, from init we know that after listen() on the lmkd socket, its fd is added to epoll to watch for connections. From the mainloop section above, we know that when epoll fires, the event's handler is invoked; for lmkd, a successful connect triggers ctrl_connect_handler.
AMS tries to connect to lmkd; if the connect fails, it retries every 1 s until it succeeds.
frameworks/base/services/core/java/com/android/server/am/ProcessList.java
// lmkd reconnect delay in msecs
private static final long LMKD_RECONNECT_DELAY_MS = 1000;
...
final class KillHandler extends Handler {
...
@Override
public void handleMessage(Message msg) {
switch (msg.what) {
case KILL_PROCESS_GROUP_MSG:
...
case LMKD_RECONNECT_MSG:
if (!sLmkdConnection.connect()) {
Slog.i(TAG, "Failed to connect to lmkd, retry after " +
LMKD_RECONNECT_DELAY_MS + " ms");
// retry after LMKD_RECONNECT_DELAY_MS
sKillHandler.sendMessageDelayed(sKillHandler.obtainMessage(
KillHandler.LMKD_RECONNECT_MSG), LMKD_RECONNECT_DELAY_MS);
}
break;
default:
super.handleMessage(msg);
}
}
}
The code shows that AMS attempts to connect to lmkd through sLmkdConnection.connect(), retrying until it succeeds.
Once LmkdConnection connects to lmkd successfully, it notifies the other side; lmkd receives the event via epoll and calls ctrl_connect_handler:
static void ctrl_connect_handler(int data __unused, uint32_t events __unused,
struct polling_params *poll_params __unused) {
struct epoll_event epev;
int free_dscock_idx = get_free_dsock();
if (free_dscock_idx < 0) {
for (int i = 0; i < MAX_DATA_CONN; i++) {
ctrl_data_close(i);
}
free_dscock_idx = 0;
}
data_sock[free_dscock_idx].sock = accept(ctrl_sock.sock, NULL, NULL);
if (data_sock[free_dscock_idx].sock < 0) {
ALOGE("lmkd control socket accept failed; errno=%d", errno);
return;
}
ALOGI("lmkd data connection established");
/* use data to store data connection idx */
data_sock[free_dscock_idx].handler_info.data = free_dscock_idx;
data_sock[free_dscock_idx].handler_info.handler = ctrl_data_handler;
data_sock[free_dscock_idx].async_event_mask = 0;
epev.events = EPOLLIN;
epev.data.ptr = (void *)&(data_sock[free_dscock_idx].handler_info);
if (epoll_ctl(epollfd, EPOLL_CTL_ADD, data_sock[free_dscock_idx].sock, &epev) == -1) {
ALOGE("epoll_ctl for data connection socket failed; errno=%d", errno);
ctrl_data_close(free_dscock_idx);
return;
}
maxevents++;
}
lmkd supports at most 3 concurrent client connections; once all slots are taken, ctrl_data_close() is called on every slot to detach them from epoll and close the sockets.
Otherwise, accept() creates a new data socket, which is then added to epoll.
The key piece to note is the data-exchange function ctrl_data_handler().
7. ctrl data fd handling: ctrl_data_handler
static void ctrl_data_handler(int data, uint32_t events,
struct polling_params *poll_params __unused) {
if (events & EPOLLIN) {
ctrl_command_handler(data);
}
}
The fd was added to epoll with EPOLLIN, so this then calls ctrl_command_handler, which handles the LMK commands sent from ProcessList.java:
enum lmk_cmd {
LMK_TARGET = 0, /* Associate minfree with oom_adj_score */
LMK_PROCPRIO, /* Register a process and set its oom_adj_score */
LMK_PROCREMOVE, /* Unregister a process */
LMK_PROCPURGE, /* Purge all registered processes */
LMK_GETKILLCNT, /* Get number of kills */
LMK_SUBSCRIBE, /* Subscribe for asynchronous events */
LMK_PROCKILL, /* Unsolicited msg to subscribed clients on proc kills */
LMK_UPDATE_PROPS, /* Reinit properties */
};
7.1 cmd_procprio
For example, when a process's oom_adj_score changes, AMS calls setOomAdj to notify lmkd:
frameworks/base/services/core/java/com/android/server/am/ProcessList.java
public static void setOomAdj(int pid, int uid, int amt) {
...
long start = SystemClock.elapsedRealtime();
ByteBuffer buf = ByteBuffer.allocate(4 * 4);
buf.putInt(LMK_PROCPRIO);
buf.putInt(pid);
buf.putInt(uid);
buf.putInt(amt);
writeLmkd(buf, null);
long now = SystemClock.elapsedRealtime();
...
In lmkd, ctrl_command_handler parses LMK_PROCPRIO out of the command and ultimately calls cmd_procprio():
case LMK_PROCPRIO:
/* process type field is optional for backward compatibility */
if (nargs < 3 || nargs > 4)
goto wronglen;
cmd_procprio(packet, nargs, &cred);
break;
Now let's look at how cmd_procprio handles it:
static void cmd_procprio(LMKD_CTRL_PACKET packet, int field_count, struct ucred *cred) {
struct proc *procp;
char path[LINE_MAX];
char val[20];
int soft_limit_mult;
struct lmk_procprio params;
bool is_system_server;
struct passwd *pwdrec;
int tgid;
lmkd_pack_get_procprio(packet, field_count, &params);
...
snprintf(path, sizeof(path), "/proc/%d/oom_score_adj", params.pid);
snprintf(val, sizeof(val), "%d", params.oomadj);
if (!writefilestring(path, val, false)) {
ALOGW("Failed to open %s; errno=%d: process %d might have been killed",
path, errno, params.pid);
/* If this file does not exist the process is dead. */
return;
}
...
procp = pid_lookup(params.pid);
if (!procp) {
int pidfd = -1;
if (pidfd_supported) {
pidfd = TEMP_FAILURE_RETRY(sys_pidfd_open(params.pid, 0));
...
}
procp = static_cast<struct proc*>(calloc(1, sizeof(struct proc)));
if (!procp) {
// Oh, the irony. May need to rebuild our state.
return;
}
procp->pid = params.pid;
procp->pidfd = pidfd;
procp->uid = params.uid;
procp->reg_pid = cred->pid;
procp->oomadj = params.oomadj;
proc_insert(procp);
} else {
...
proc_unslot(procp);
procp->oomadj = params.oomadj;
proc_slot(procp);
}
}
- First, write the oom_score_adj passed down from AMS into the /proc/pid/oom_score_adj node;
- Look up the process with pid_lookup to see whether it is already registered;
- For a new process, obtain a pidfd via sys_pidfd_open and add the record to the procadjslot_list buckets via proc_insert;
- For an existing process, update its oomadj and re-insert it into the procadjslot_list buckets.
7.2 cmd_procremove
As in section 7.1, when an app process is no longer running, ProcessList.remove() sends the LMK_PROCREMOVE command to lmkd, which ends up in cmd_procremove:
static void cmd_procremove(LMKD_CTRL_PACKET packet, struct ucred *cred) {
...
procp = pid_lookup(params.pid);
if (!procp) {
return;
}
...
pid_remove(params.pid);
}
The code is straightforward: if the proc record exists, pid_remove removes it.
7.3 cmd_procpurge
Typically, AMS connects to lmkd when it is constructed; on a successful connect it sends LMK_PROCPURGE so lmkd can clean up its state first, which ends up in cmd_procpurge:
static void cmd_procpurge(struct ucred *cred) {
...
for (i = 0; i < PIDHASH_SZ; i++) {
procp = pidhash[i];
while (procp) {
next = procp->pidhash_next;
/* Purge only records created by the requestor */
if (claim_record(procp, cred->pid)) {
pid_remove(procp->pid);
}
procp = next;
}
}
}
The code is simple: it walks the whole pidhash and removes every record created by the requesting client (see claim_record in the loop).
7.4 cmd_subscribe
After AMS connects to lmkd through ProcessList, it sends the LMK_SUBSCRIBE command:
frameworks/base/services/core/java/com/android/server/am/ProcessList.java
public boolean onLmkdConnect(OutputStream ostream) {
try {
...
// Subscribe for kill event notifications
buf = ByteBuffer.allocate(4 * 2);
buf.putInt(LMK_SUBSCRIBE);
buf.putInt(LMK_ASYNC_EVENT_KILL);
ostream.write(buf.array(), 0, buf.position());
} catch (IOException ex) {
return false;
}
return true;
}
This lets AMS receive a notification after lmkd kills a process. When killing, lmkd decides whether to notify each client based on whether that client subscribed. When a client sends a subscribe request:
static void cmd_subscribe(int dsock_idx, LMKD_CTRL_PACKET packet) {
struct lmk_subscribe params;
lmkd_pack_get_subscribe(packet, &params);
data_sock[dsock_idx].async_event_mask |= 1 << params.evt_type;
}
lmkd sets the LMK_ASYNC_EVENT_KILL bit in async_event_mask of the client's entry in the data_sock array. After lmkd kills a process it calls:
static void ctrl_data_write_lmk_kill_occurred(pid_t pid, uid_t uid) {
LMKD_CTRL_PACKET packet;
size_t len = lmkd_pack_set_prockills(packet, pid, uid);
for (int i = 0; i < MAX_DATA_CONN; i++) {
if (data_sock[i].sock >= 0 && data_sock[i].async_event_mask & 1 << LMK_ASYNC_EVENT_KILL) {
ctrl_data_write(i, (char*)packet, len);
}
}
}
which uses ctrl_data_write to notify ProcessList in AMS:
frameworks/base/services/core/java/com/android/server/am/ProcessList.java
sLmkdConnection = new LmkdConnection(sKillThread.getLooper().getQueue(),
new LmkdConnection.LmkdConnectionListener() {
...
@Override
public boolean handleUnsolicitedMessage(ByteBuffer dataReceived,
int receivedLen) {
...
}
7.5 cmd_target
From ProcessList.java we know this is initialized once when ProcessList is constructed; it is also triggered by ATMS.updateConfiguration:
frameworks/base/services/core/java/com/android/server/wm/ActivityTaskManagerService.java
public boolean updateConfiguration(Configuration values) {
mAmInternal.enforceCallingPermission(CHANGE_CONFIGURATION, "updateConfiguration()");
synchronized (mGlobalLock) {
...
mH.sendMessage(PooledLambda.obtainMessage(
ActivityManagerInternal::updateOomLevelsForDisplay, mAmInternal,
DEFAULT_DISPLAY));
...
}
}
If you are interested, you can trace the code; it eventually reaches ProcessList.updateOomLevels():
frameworks/base/services/core/java/com/android/server/am/ProcessList.java
private void updateOomLevels(int displayWidth, int displayHeight, boolean write) {
...
if (write) {
ByteBuffer buf = ByteBuffer.allocate(4 * (2 * mOomAdj.length + 1));
buf.putInt(LMK_TARGET);
for (int i = 0; i < mOomAdj.length; i++) {
buf.putInt((mOomMinFree[i] * 1024)/PAGE_SIZE);
buf.putInt(mOomAdj[i]);
}
writeLmkd(buf, null);
...
}
}
This function computes the minfree threshold for each oom adj level and passes each level's minfree and oom_adj_score down to lmkd. (The algorithm behind the minfree values will be covered later.) Continuing into lmkd's cmd_target:
static void cmd_target(int ntargets, LMKD_CTRL_PACKET packet) {
int i;
struct lmk_target target;
char minfree_str[PROPERTY_VALUE_MAX];
char *pstr = minfree_str;
char *pend = minfree_str + sizeof(minfree_str);
static struct timespec last_req_tm;
struct timespec curr_tm;
...
for (i = 0; i < ntargets; i++) {
lmkd_pack_get_target(packet, i, &target);
lowmem_minfree[i] = target.minfree;
lowmem_adj[i] = target.oom_adj_score;
pstr += snprintf(pstr, pend - pstr, "%d:%d,", target.minfree,
target.oom_adj_score);
if (pstr >= pend) {
/* if no more space in the buffer then terminate the loop */
pstr = pend;
break;
}
}
lowmem_targets_size = ntargets;
/* Override the last extra comma */
pstr[-1] = '\0';
property_set("sys.lmk.minfree_levels", minfree_str);
...
}
The code is straightforward: it assembles the minfree and oom_adj_score pairs into a string and stores it in the property sys.lmk.minfree_levels.
That property mainly exists for later debugging; what actually matters are two arrays:
static int lowmem_adj[MAX_TARGETS];
static int lowmem_minfree[MAX_TARGETS];
These hold the oom adj levels configured by AMS. Later, when processes must be killed, lmkd uses current memory usage and memory pressure to compute the most appropriate min_score_adj, then kills every process whose score is at or above that value.
This completes the walkthrough of the ctrl data fd handling in ctrl_data_handler. Together with section 6: when AMS is constructed, ProcessList performs the lmkd-related setup, including connecting and listening for kill notifications from lmkd.
- When AMS initializes, it connects to lmkd and sends LMK_PROCPURGE to clean up the environment;
- Right after LMK_PROCPURGE, AMS sends LMK_SUBSCRIBE to receive notifications after lmkd kills a process;
- When AMS stops a process, it sends LMK_PROCREMOVE;
- When AMS updates a process's oom_score_adj, it sends LMK_PROCPRIO via setOomAdj;
- When oom levels are updated, updateOomLevels sends LMK_TARGET.