
Kubernetes container log collection solutions

2018/12/21

The official Kubernetes documentation describes this in detail: https://kubernetes.io/docs/concepts/cluster-administration/logging/

In summary:

Log collection happens at three levels:

pod level

Pod-level logs go to standard output and standard error by default.

Example: debug/counter-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    args: [/bin/sh, -c,
           'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']

View the logs with kubectl logs:

$ kubectl logs counter
0: Mon Jan 1 00:00:00 UTC 2001
1: Mon Jan 1 00:00:01 UTC 2001
2: Mon Jan 1 00:00:02 UTC 2001
...

node level

The architecture is as follows (diagram in the original post):

Node-level logs are managed by configuring the container runtime's log-driver. This needs to be paired with log rotation (logrotate or the driver's own limits), so that a log file is rotated automatically once it exceeds a threshold such as 10MB. Common log-drivers are listed below; a sample daemon configuration follows the table.

| Driver | Description |
| --- | --- |
| none | No logs are available for the container and docker logs does not return any output. |
| json-file | The logs are formatted as JSON. The default logging driver for Docker. |
| syslog | Writes logging messages to the syslog facility. The syslog daemon must be running on the host machine. |
| journald | Writes log messages to journald. The journald daemon must be running on the host machine. |
| gelf | Writes log messages to a Graylog Extended Log Format (GELF) endpoint such as Graylog or Logstash. |
| fluentd | Writes log messages to fluentd (forward input). The fluentd daemon must be running on the host machine. |
| awslogs | Writes log messages to Amazon CloudWatch Logs. |
| splunk | Writes log messages to splunk using the HTTP Event Collector. |
| etwlogs | Writes log messages as Event Tracing for Windows (ETW) events. Only available on Windows platforms. |
| gcplogs | Writes log messages to Google Cloud Platform (GCP) Logging. |
| logentries | Writes log messages to Rapid7 Logentries. |
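
For example, to keep Docker's default json-file driver but have it rotate logs itself, the daemon can be configured in /etc/docker/daemon.json; the 10m size cap and 3-file limit here are illustrative values, not from the original post:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}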

Cluster-level logging architectures

Cluster-level log collection. There are three approaches:

Using a node-level logging agent

That is, log collection is done at the node level, usually by deploying the agent as a DaemonSet so it runs on every node. The advantage is low resource consumption, since only one agent per node is needed, and it is non-intrusive to the applications. The drawback is that it only works when applications inside the containers write all their logs to standard output; see the sketch after the architecture note below.

The architecture is as follows (diagram in the original post):
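
A minimal sketch of such a node agent, deployed as a DaemonSet that mounts the node's log directories; the fluentd image tag and namespace here are illustrative assumptions, not from the original post:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.3-debian-elasticsearch
        volumeMounts:
        # The container runtime writes each container's stdout/stderr under these host paths
        - name: varlog
          mountPath: /var/log
        - name: dockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: dockercontainers
        hostPath:
          path: /var/lib/docker/containers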

Using a sidecar container as the container logging agent

That is, a log-processing container runs in the pod alongside the application container. This takes two forms. In the first, the sidecar simply reads the application's log files and relays them to its own standard output (a streaming sidecar container).

Note, however, that the host then keeps two copies of the same logs: the files the application writes itself, and the JSON files backing the sidecar's stdout and stderr. This wastes a lot of disk space, so unless it is truly unavoidable, or the application container cannot be modified at all, I still recommend approach one, or the third approach below.

Example:

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    args:
    - /bin/sh
    - -c
    - >
      i=0;
      while true;
      do
        echo "$i: $(date)" >> /var/log/1.log;
        echo "$(date) INFO $i" >> /var/log/2.log;
        i=$((i+1));
        sleep 1;
      done
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: count-log-1
    image: busybox
    args: [/bin/sh, -c, 'tail -n+1 -f /var/log/1.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: count-log-2
    image: busybox
    args: [/bin/sh, -c, 'tail -n+1 -f /var/log/2.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    emptyDir: {}
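
Each log file now has its own stdout stream, so it can be read by naming the sidecar container:

$ kubectl logs counter count-log-1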

Running a log collection agent in every pod (e.g. Logstash or fluentd)

In effect, this takes the logging agent from approach one and puts it inside the pod.

The logging agent can be fluentd, with the application's log files as its source; the fluentd configuration is usually kept in a ConfigMap.

Example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluentd.conf: |
    <source>
      type tail
      format none
      path /var/log/1.log
      pos_file /var/log/1.log.pos
      tag count.format1
    </source>

    <source>
      type tail
      format none
      path /var/log/2.log
      pos_file /var/log/2.log.pos
      tag count.format2
    </source>

    <match **>
      type google_cloud
    </match>
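
Note that this uses the legacy fluentd v0.12 syntax; current fluentd v1.x releases expect @type tail and @type google_cloud instead of type.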

Then, in the application's pod, we can run fluentd as a sidecar to forward 1.log and 2.log to the logging backend. The google_cloud match above sends them to Google Cloud Logging; to target Elasticsearch instead, swap in an elasticsearch output plugin in the match section.

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    args:
    - /bin/sh
    - -c
    - >
      i=0;
      while true;
      do
        echo "$i: $(date)" >> /var/log/1.log;
        echo "$(date) INFO $i" >> /var/log/2.log;
        i=$((i+1));
        sleep 1;
      done
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: count-agent
    image: k8s.gcr.io/fluentd-gcp:1.30
    env:
    - name: FLUENTD_ARGS
      value: -c /etc/fluentd-config/fluentd.conf
    volumeMounts:
    - name: varlog
      mountPath: /var/log
    - name: config-volume
      mountPath: /etc/fluentd-config
  volumes:
  - name: varlog
    emptyDir: {}
  - name: config-volume
    configMap:
      name: fluentd-config
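
Assuming the ConfigMap and Pod above are saved as fluentd-config.yaml and counter-pod.yaml (the file names are my own), they can be deployed with:

$ kubectl apply -f fluentd-config.yaml
$ kubectl apply -f counter-pod.yaml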

Because this approach deploys a sidecar, it costs resources: every pod runs its own log collection container. By comparison, the streaming sidecar container form is a reasonable middle ground, able to collect logs written in several forms without consuming too much. Also note that since this fluentd sidecar does not write the logs to stdout, kubectl logs cannot show them.

Pushing logs directly from the application container to the storage backend

This one is simple: the application hands its logs straight to the backend itself, for instance through a logging SDK, as sketched below.
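
As one possibility, an application could ship structured log events over fluentd's forward protocol with the fluent-logger Python package; the in-cluster service address below is a hypothetical example, not from the original post:

from fluent import sender

# Hypothetical in-cluster fluentd endpoint; replace with your backend's address.
logger = sender.FluentSender('myapp', host='fluentd.logging.svc', port=24224)

# Each emit() sends one structured event, tagged myapp.access.
logger.emit('access', {'path': '/index', 'status': 200})
logger.close()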
