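Introduction to greys

For how to use greys, see the linked post "Using greys to troubleshoot production issues" (使用greys来排查线上问题).

greys.sh

greys is typically started with the following command: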

sudo -u tomcat -H ./greys.sh [pid]

The last line of greys.sh, main "${@}", passes all command-line arguments to the main function. Here is how main starts:

while getopts "PUJC" ARG
do
    case ${ARG} in
        P) OPTION_CHECK_PERMISSION=0;;
        U) OPTION_UPDATE_IF_NECESSARY=0;;
        J) OPTION_ATTACH_JVM=0;;
        C) OPTION_ACTIVE_CONSOLE=0;;
        ?) usage;exit 1;;
    esac
done

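The script first uses getopts to parse the command-line arguments. It recognizes the -P, -U, -J and -C options and sets the corresponding flags.
Note the ? branch in the case statement: it matches an option that getopts could not recognize, in which case the script prints the usage help and exits. The shell's getopts builtin reports unknown options in the same way as the getopt() library function described here: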

The GNU getopt command uses the GNU getopt() library function to do the parsing of the arguments and options.

If getopt() does not recognize an option character, it prints an error message to stderr, stores the character in optopt, and returns ?. The calling program may prevent the error message by setting opterr to 0.

Next, greys.sh checks whether a newer version of greys is available. Apart from the update check, the two remaining steps are attach jvm and active console:

if [[ ${OPTION_ATTACH_JVM} -eq 1 ]]; then
    attach_jvm ${greys_local_version}\
    || exit_on_err 1 "attach to target jvm(${TARGET_PID}) failed."
fi

if [[ ${OPTION_ACTIVE_CONSOLE} -eq 1 ]]; then
    active_console ${greys_local_version}\
    || exit_on_err 1 "active console failed."
fi

${OPTION_ATTACH_JVM} and ${OPTION_ACTIVE_CONSOLE} both default to 1:

# the option to control greys.sh attach target jvm
OPTION_ATTACH_JVM=1

# the option to control greys.sh active greys-console
OPTION_ACTIVE_CONSOLE=1
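
To make the flag semantics concrete, the following Java sketch models them: every step starts enabled (1), passing its letter switches it off (0), and an unknown letter is rejected like the ?) branch. The class name GreysFlags is hypothetical, and the defaults for P and U are assumed to be 1 as well, since only the J and C defaults are shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical Java model of the -P/-U/-J/-C switches in greys.sh. */
class GreysFlags {

    /** Every flag starts at 1 (assumed for P and U); passing its letter sets it to 0. */
    static Map<Character, Integer> parse(String... args) {
        Map<Character, Integer> flags = new LinkedHashMap<>();
        for (char c : "PUJC".toCharArray()) {
            flags.put(c, 1);
        }
        for (String arg : args) {
            if (!arg.startsWith("-") || arg.length() < 2) {
                break; // the first non-option argument (e.g. the pid) ends option parsing
            }
            for (char c : arg.substring(1).toCharArray()) {
                if (!flags.containsKey(c)) {
                    // plays the role of the ?) branch: usage, then exit 1
                    throw new IllegalArgumentException("unknown option: -" + c);
                }
                flags.put(c, 0);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        // -J disables the attach step; the console step stays enabled
        System.out.println(parse("-J", "12345")); // prints {P=1, U=1, J=0, C=1}
    }
}
```

With -J the script would skip attach_jvm and only open the console, which is useful when the agent is already loaded in the target JVM.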

Analysis of attach jvm

attach_jvm()
{
    local greys_lib_dir=${GREYS_LIB_DIR}/${1}/greys

    # if [ ${TARGET_IP} = ${DEFAULT_TARGET_IP} ]; then
    if [ ! -z ${TARGET_PID} ]; then
        ${JAVA_HOME}/bin/java \
            ${BOOT_CLASSPATH} ${JVM_OPTS} \
            -jar ${greys_lib_dir}/greys-core.jar \
                -pid ${TARGET_PID} \
                -target ${TARGET_IP}":"${TARGET_PORT} \
                -core "${greys_lib_dir}/greys-core.jar" \
                -agent "${greys_lib_dir}/greys-agent.jar"
    fi
}

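The attach_jvm function simply runs the greys-core jar with java -jar. When a jar is executed this way, the JVM calls the main method of the class named by Main-Class.
Main-Class is specified in META-INF/MANIFEST.MF inside the jar. The file looks like this: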

➜  greys  unzip -q -c greys-core.jar  META-INF/MANIFEST.MF
Manifest-Version: 1.0
Archiver-Version: Plexus Archiver
Created-By: Apache Maven
Built-By: vlinux
Build-Jdk: 1.8.0_91
Main-Class: com.github.ompc.greys.core.GreysLauncher

So running this jar invokes the main method of GreysLauncher.
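
The same lookup can be done programmatically. The sketch below uses the standard java.util.jar.Manifest class to read Main-Class from manifest text; ManifestPeek and mainClassOf are hypothetical names chosen for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Hypothetical helper that reads the Main-Class attribute of a manifest. */
class ManifestPeek {

    /** Returns the Main-Class value, or null if the manifest has none. */
    static String mainClassOf(InputStream manifestStream) {
        try {
            Manifest manifest = new Manifest(manifestStream);
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience overload for manifest content held in a string. */
    static String mainClassOf(String manifestText) {
        return mainClassOf(new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        String text = "Manifest-Version: 1.0\n"
                + "Main-Class: com.github.ompc.greys.core.GreysLauncher\n";
        System.out.println(mainClassOf(text));
    }
}
```

For a jar on disk, java.util.jar.JarFile#getManifest returns the same Manifest object without unzipping by hand.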

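The process of attaching to the JVM

GreysLauncher does two things when its main function runs: it parses the command-line configuration, and it attaches to the target JVM.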

public static void main(String[] args) {
    try {
        new GreysLauncher(args);
    } catch (Throwable t) {
        System.err.println("start greys failed, because : " + getCauseMessage(t));
        System.exit(-1);
    }
}

public GreysLauncher(String[] args) throws Exception {

    // parse the configuration
    Configure configure = analyzeConfigure(args);

    // load the agent
    attachAgent(configure);
}

The "configuration" here is simply the set of arguments passed in by the script. Its main fields are:

private String targetIp;             // target host IP
private int targetPort;              // target port
private int javaPid;                 // PID of the target Java process
private int connectTimeout = 6000;   // connection timeout (ms)
private String greysCore;            // location of greys-core.jar
private String greysAgent;           // location of greys-agent.jar
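
As an illustration of how the arguments passed by attach_jvm map onto these fields, here is a hand-rolled parser sketch. LaunchConfig and parse are hypothetical names, the sample values are made up, and greys' own Configure class may well parse its arguments differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for greys' Configure; not the project's real parser. */
class LaunchConfig {
    String targetIp;
    int targetPort;
    int javaPid;
    String greysCore;
    String greysAgent;

    /** Parses "-name value" pairs; no validation, a missing option simply fails. */
    static LaunchConfig parse(String... args) {
        Map<String, String> opts = new HashMap<>();
        for (int i = 0; i + 1 < args.length; i += 2) {
            opts.put(args[i], args[i + 1]);
        }
        LaunchConfig c = new LaunchConfig();
        c.javaPid = Integer.parseInt(opts.get("-pid"));
        // -target arrives as "ip:port", exactly as the shell script concatenates it
        String target = opts.get("-target");
        int colon = target.lastIndexOf(':');
        c.targetIp = target.substring(0, colon);
        c.targetPort = Integer.parseInt(target.substring(colon + 1));
        c.greysCore = opts.get("-core");
        c.greysAgent = opts.get("-agent");
        return c;
    }

    public static void main(String[] args) {
        // sample values only; the port and paths are arbitrary
        LaunchConfig c = parse("-pid", "12345", "-target", "127.0.0.1:3658",
                "-core", "/tmp/greys-core.jar", "-agent", "/tmp/greys-agent.jar");
        System.out.println(c.javaPid + " -> " + c.targetIp + ":" + c.targetPort);
    }
}
```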

How attach works

When inspecting a JVM thread dump with jstack, you will often see these two threads: "Signal Dispatcher" and "Attach Listener".
Both are closely tied to the attach mechanism.

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f23b80d2800 nid=0xb82e runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Attach Listener" #28 daemon prio=9 os_prio=0 tid=0x00007f2328001000 nid=0x3bb5 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

Signal Dispatcher handles SIGQUIT and creates the Attach Listener thread. Attach Listener then sets up the communication channel and executes the commands it receives.
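
As a rough way to look for these threads from inside a process rather than through jstack, the sketch below lists the names of the live threads that the JVM exposes to Java code. ThreadNames is a hypothetical class name, and whether JVM-internal threads such as these two appear in Thread.getAllStackTraces() depends on the JVM, so treat the output as indicative only.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper that collects the names of live threads in this JVM. */
class ThreadNames {

    /** Returns the sorted names of all threads visible to Java code. */
    static Set<String> liveThreadNames() {
        Set<String> names = new TreeSet<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        Set<String> names = liveThreadNames();
        System.out.println("Signal Dispatcher present: " + names.contains("Signal Dispatcher"));
        System.out.println("Attach Listener present:   " + names.contains("Attach Listener"));
    }
}
```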

attach jvm is built on two methods of com.sun.tools.attach.VirtualMachine: attach and loadAgent.

GreysLauncher

Object vmObj = null;
try {
    if (null == attachVmdObj) { // use the attach(String pid) variant
        vmObj = vmClass.getMethod("attach", String.class).invoke(null, "" + configure.getJavaPid());
    } else {
        vmObj = vmClass.getMethod("attach", vmdClass).invoke(null, attachVmdObj);
    }
    vmClass.getMethod("loadAgent", String.class, String.class).invoke(vmObj, configure.getGreysAgent(), configure.getGreysCore() + ";" + configure.toString());
} finally {
    if (null != vmObj) {
        vmClass.getMethod("detach", (Class<?>[]) null).invoke(vmObj, (Object[]) null);
    }
}

With the code above, greys-agent.jar is loaded into the target JVM, and the path of greys-core.jar is handed to the agent as part of its argument string.

The loadAgent method is used to load agents that are written in the Java Language and deployed in a JAR file. (See java.lang.instrument for a detailed description on how these agents are loaded and started).

loadAgent thus brings greys-agent.jar, and through it greys-core.jar, into the target JVM. Once the agent jar is loaded, the JVM reads the configured agent class from its /META-INF/MANIFEST.MF.
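
For comparison, the same attach, loadAgent, detach sequence can be written directly against com.sun.tools.attach.VirtualMachine instead of through reflection. This is only a sketch: AttachSketch and its method names are hypothetical, and it assumes the attach API is available at compile time (the jdk.attach module on JDK 9 and later, tools.jar on the classpath for JDK 8).

```java
import com.sun.tools.attach.VirtualMachine;

/** Hypothetical non-reflective version of the attach sequence. */
class AttachSketch {

    /** Builds the agent argument as in the reflective code: core jar path, ';', then the config string. */
    static String agentArgs(String greysCore, String configure) {
        return greysCore + ";" + configure;
    }

    /** Attaches to the JVM with the given pid, loads the agent jar, and always detaches. */
    static void attachAndLoad(String pid, String agentJar, String args) throws Exception {
        VirtualMachine vm = VirtualMachine.attach(pid);
        try {
            vm.loadAgent(agentJar, args);
        } finally {
            vm.detach();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.err.println("usage: AttachSketch <pid> <agent.jar> <core.jar>");
            return;
        }
        // the config part is left empty here; greys passes configure.toString()
        attachAndLoad(args[0], args[1], agentArgs(args[2], ""));
    }
}
```

Going through reflection, as the original code does, avoids a compile-time dependency on the attach classes; the direct form is easier to read when that dependency is acceptable.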

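Agent startup process

An agent can be started in two ways: together with the JVM at startup, or by dynamically attaching to a JVM that is already running.

Starting via a JVM startup option

To start as an agent at JVM startup, add the following to the JVM options: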

-javaagent:btrace-agent.jar

This loading mode requires implementing one of the following two methods:

/**
 * The JVM tries this method first
 */
public static void premain(String agentArgs, Instrumentation inst);
/**
 * If the method above does not exist, the JVM tries this one
 */
public static void premain(String agentArgs);

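MANIFEST.MF must also contain a Premain-Class entry naming the class that defines the method.

Starting via dynamic attach

The attach mode requires implementing one of the following two methods: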

/**
 * The JVM tries this method first
 */
public static void agentmain(String agentArgs, Instrumentation inst);

/**
 * If the method above does not exist, the JVM tries this one
 */
public static void agentmain(String agentArgs);

The jar must also specify Agent-Class. When the jar is loaded, the JVM reads the configured Premain-Class or Agent-Class from /META-INF/MANIFEST.MF. For greys-agent, the manifest looks like this:

Manifest-Version: 1.0
Archiver-Version: Plexus Archiver
Created-By: Apache Maven
Built-By: vlinux
Build-Jdk: 1.8.0_91
Agent-Class: com.github.ompc.greys.agent.AgentLauncher
Can-Redefine-Classes: true
Can-Retransform-Classes: true
Premain-Class: com.github.ompc.greys.agent.AgentLauncher

So the entry point is AgentLauncher.
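
To make the two entry points concrete, here is a minimal agent class that exposes both premain and agentmain and funnels them into one start routine. DemoAgent is a hypothetical name and does none of what greys' AgentLauncher does. In a real agent jar the class would normally be declared public in its own file and named by Premain-Class and Agent-Class in the manifest.

```java
import java.lang.instrument.Instrumentation;

/** Hypothetical demo agent; greys' real entry point is AgentLauncher. */
class DemoAgent {

    /** Records which entry point ran last, so the behaviour can be observed. */
    static volatile String lastEntry;

    /** Entry point for -javaagent:<jar>=<args>; runs before the application's main. */
    public static void premain(String agentArgs, Instrumentation inst) {
        start("premain", agentArgs, inst);
    }

    /** Entry point when the jar is loaded into a running JVM via loadAgent. */
    public static void agentmain(String agentArgs, Instrumentation inst) {
        start("agentmain", agentArgs, inst);
    }

    /** Shared startup logic; a real agent would register transformers on inst here. */
    private static void start(String entry, String agentArgs, Instrumentation inst) {
        lastEntry = entry + ":" + agentArgs;
        System.out.println("agent started via " + lastEntry);
    }
}
```

Pointing both manifest entries at the same class, as greys-agent does, lets one jar serve both startup modes.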

AgentLauncher

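AgentLauncher mainly does the following:

- Sets up a custom class loader, to minimize intrusion into the existing application
- Starts a `GaServer` listening on the specified port

GaServer reads the commands entered by the user and hands each one to CommandHandler, which processes it in a new thread.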
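
The read-then-dispatch pattern described here can be sketched in a few lines. This is not GaServer's code: MiniCommandServer, handle and roundTrip are hypothetical names, the handler just upper-cases its input, and the sketch serves a single client over a plain line-based protocol.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Locale;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical miniature of the server side; not GaServer's real code. */
class MiniCommandServer {

    /** Stand-in for CommandHandler: upper-cases the command. */
    static String handle(String command) {
        return command.toUpperCase(Locale.ROOT);
    }

    /** Accepts one client, reads commands line by line, and runs each on a worker thread. */
    static void serve(ServerSocket server) {
        ExecutorService workers = Executors.newCachedThreadPool();
        try (Socket client = server.accept();
             BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream())) {
            String line;
            while ((line = in.readLine()) != null) {
                final String command = line;
                workers.execute(() -> {
                    synchronized (out) { // keep concurrent replies from interleaving
                        out.print(handle(command) + "\n");
                        out.flush();
                    }
                });
            }
        } catch (IOException e) {
            System.err.println("server stopped: " + e.getMessage());
        } finally {
            workers.shutdown();
        }
    }

    /** Starts the server on a free loopback port, sends one command, returns the reply. */
    static String roundTrip(String command) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread acceptor = new Thread(() -> serve(server), "mini-server");
            acceptor.setDaemon(true);
            acceptor.start();
            try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 PrintWriter out = new PrintWriter(socket.getOutputStream());
                 BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {
                socket.setSoTimeout(5000); // fail instead of hanging if no reply arrives
                out.print(command + "\n");
                out.flush();
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("help"));
    }
}
```

Running each command on its own worker keeps the reading loop free to accept further input while a long-running command is still executing.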

Analysis of active console

# active console
# $1 : greys_local_version
active_console()
{

    local greys_lib_dir=${GREYS_LIB_DIR}/${1}/greys

    if type ${JAVA_HOME}/bin/java 2>&1 >> /dev/null; then

        # use default console
        ${JAVA_HOME}/bin/java \
            -cp ${greys_lib_dir}/greys-core.jar \
            com.github.ompc.greys.core.GreysConsole \
                ${TARGET_IP} \
                ${TARGET_PORT}

    elif type telnet 2>&1 >> /dev/null; then

        # use telnet
        telnet ${TARGET_IP} ${TARGET_PORT}

    elif type nc 2>&1 >> /dev/null; then

        # use netcat
        nc ${TARGET_IP} ${TARGET_PORT}

    else

        echo "'telnet' or 'nc' is required." 1>&2
        return 1

    fi
}

active console starts a client, picking the first available option among java, telnet and nc. When started via java, it runs the main method of GreysConsole in greys-core.jar:

public static void main(String... args) throws IOException {
    new GreysConsole(new InetSocketAddress(args[0], Integer.valueOf(args[1])));
}

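The GreysConsole constructor connects to the GaServer started above, sends the commands typed by the user to the server, and displays the server's responses in the interactive shell.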

// com.github.ompc.greys.core.GreysConsole#activeConsoleReader
/**
 * Send commands to the server
 */
private void activeConsoleReader() {
    final Thread socketThread = new Thread("ga-console-reader-daemon") {

        private StringBuilder lineBuffer = new StringBuilder();

        @Override
        public void run() {
            try {

                while (isRunning) {

                    final String line = console.readLine();

                    // a trailing \ means the command continues on the next line,
                    // so the line break needs special handling
                    if (StringUtils.endsWith(line, "\\")) {
                        // strip the trailing \
                        lineBuffer.append(line.substring(0, line.length() - 1));
                        continue;
                    } else {
                        lineBuffer.append(line);
                    }

                    final String lineForWrite = lineBuffer.toString();
                    lineBuffer = new StringBuilder();

                    // replace ! to \!
                    // history.add(StringUtils.replace(lineForWrite, "!", "\\!"));

                    // flush if need
                    if (history instanceof Flushable) {
                        ((Flushable) history).flush();
                    }

                    console.setPrompt(EMPTY);
                    if (isNotBlank(lineForWrite)) {
                        socketWriter.write(lineForWrite + "\n");
                    } else {
                        socketWriter.write("\n");
                    }
                    socketWriter.flush();

                }
            } catch (IOException e) {
                err("read fail : %s", e.getMessage());
                shutdown();
            }

        }

    };
    socketThread.setDaemon(true);
    socketThread.start();
}

// com.github.ompc.greys.core.GreysConsole#loopForWriter
// Write the server's responses to the console
private void loopForWriter() {
    try {
        while (isRunning) {
            final int c = socketReader.read();
            if (c == EOF) {
                break;
            }
            if (c == EOT) {
                hackingForReDrawPrompt();
                console.setPrompt(DEFAULT_PROMPT);
                console.redrawLine();
            } else {
                out.write(c);
            }
            out.flush();
        }
    } catch (IOException e) {
        err("write fail : %s", e.getMessage());
        shutdown();
    }

}
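
The two protocol details in the code above, backslash line continuation on the way out and an EOT marker on the way back, can be isolated in a small standalone sketch. ConsoleProtocolSketch and its methods are hypothetical, the EOT value of 4 (ASCII End of Transmission) is an assumption about the constant used by greys, and unlike loopForWriter the reader below simply returns at the first EOT instead of redrawing a prompt.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical standalone model of the console's line and response handling. */
class ConsoleProtocolSketch {

    /** Assumed end-of-response marker: ASCII End of Transmission. */
    static final int EOT = 4;

    /** Joins lines ending in a backslash with the following line into one command. */
    static List<String> joinContinuations(List<String> lines) {
        List<String> commands = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        for (String line : lines) {
            if (line.endsWith("\\")) {
                // drop the trailing backslash and wait for the rest of the command
                buffer.append(line, 0, line.length() - 1);
                continue;
            }
            buffer.append(line);
            commands.add(buffer.toString());
            buffer.setLength(0);
        }
        return commands;
    }

    /** Reads one response: every character up to EOT or the end of the stream. */
    static String readResponse(Reader reader) {
        StringBuilder response = new StringBuilder();
        try {
            int c;
            while ((c = reader.read()) != -1 && c != EOT) {
                response.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return response.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinContinuations(Arrays.asList("watch \\", "-b Foo", "help")));
        System.out.print(readResponse(new StringReader("ok\n" + (char) EOT + "ignored")));
    }
}
```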

Misc

Generating the MANIFEST file with Maven

maven-jar-plugin

<build>
    <finalName>qtracer-agent</finalName>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-jar-plugin</artifactId>
            <version>2.4</version>
            <configuration>
                <archive>
                    <manifestEntries>
                        <Premain-Class>qunar.tc.qtracer.instrument.AgentMain</Premain-Class>
                        <Agent-Class>qunar.tc.qtracer.instrument.AgentMain</Agent-Class>
                        <Can-Redefine-Classes>true</Can-Redefine-Classes>
                        <Can-Retransform-Classes>true</Can-Retransform-Classes>
                    </manifestEntries>
                </archive>
            </configuration>
        </plugin>
    </plugins>
</build>

maven-assembly-plugin

<plugin>
    <artifactId>maven-assembly-plugin</artifactId>
    <configuration>
        <archive>
            <manifestEntries>
                <Premain-Class>**.**.InstrumentTest</Premain-Class>
                <Agent-Class>**.**.InstrumentTest</Agent-Class>
                <Can-Redefine-Classes>true</Can-Redefine-Classes>
                <Can-Retransform-Classes>true</Can-Retransform-Classes>
            </manifestEntries>
        </archive>
    </configuration>
</plugin>

Reference links