OOM killer is not triggered on Red Hat Enterprise Linux 9
Test environment
Amazon AMI ID: ami-0e54fe8afeb8fa59a
Operating System: Red Hat Enterprise Linux 9.2 (Plow)
Kernel: Linux 5.14.0-284.11.1.el9_2.x86_64
MySQL Version: Ver 8.0.33 for Linux on x86_64 (MySQL Community Server - GPL)
Test steps:
Launch a new instance from the AMI above and connect to it over SSH once it is running.
Install the community edition from the official MySQL repo.
# https://dev.mysql.com/downloads/repo/yum/
# (mysql80-community-release-el9-1.noarch.rpm)
wget https://dev.mysql.com/get/mysql80-community-release-el9-1.noarch.rpm
dnf install -y ./mysql80-community-release-el9-1.noarch.rpm
dnf makecache -y
dnf update -y
# Refer to this doc to install the mysql-community packages.
# https://dev.mysql.com/doc/refman/8.0/en/linux-installation-yum-repo.html
dnf install -y mysql-community-server
# Check that the installation succeeded.
systemctl status mysqld
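Edit the MySQL configuration file and add the following parameter:
vim /etc/my.cnf
# Set the buffer pool larger than physical memory. For example, if the OS has 16G of usable memory, any larger value will do.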
innodb_buffer_pool_size = 40G
sudo systemctl enable --now mysqld
sudo systemctl status mysqld
# Get the temporary initial password of the root user:
sudo grep 'temporary password' /var/log/mysqld.log
# Log in as root and confirm that the setting has taken effect.
mysql -u root -p
MySQL [(none)]> show variables like "%buffer_pool_size%";
+-------------------------+-------------+
| Variable_name | Value |
+-------------------------+-------------+
| innodb_buffer_pool_size | 42949672960 |
+-------------------------+-------------+
1 row in set (0.005 sec)
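The configured value can be cross-checked against physical memory before running the test. The following is a minimal sketch, not part of MySQL: `parse_size` and `exceeds_ram` are hypothetical helper names, only the K/M/G suffixes are handled, and the 16 GiB figure is the example RAM size used in the configuration comment, not a measured value.

```python
# Hypothetical helpers for illustration; they are not part of MySQL.
_SUFFIX = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text: str) -> int:
    """Convert a my.cnf-style size ("40G", "512M", "1024") to bytes."""
    text = text.strip().upper()
    if text and text[-1] in _SUFFIX:
        return int(text[:-1]) * _SUFFIX[text[-1]]
    return int(text)


def exceeds_ram(pool_size: str, ram_bytes: int) -> bool:
    """True when the configured buffer pool is larger than physical memory."""
    return parse_size(pool_size) > ram_bytes


if __name__ == "__main__":
    ram = 16 * 1024**3  # assumed 16 GiB instance
    print(parse_size("40G"), exceeds_ram("40G", ram))
```

`parse_size("40G")` yields the same byte count that `show variables` reports for `innodb_buffer_pool_size`, which is a quick way to confirm the setting was read as intended.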
Create the database and the table:
create database test;
use test;
CREATE TABLE tt1(
id int NOT NULL AUTO_INCREMENT PRIMARY KEY,
person_id tinyint not null ,
person_name1 VARCHAR(3000) ,
person_name2 VARCHAR(3000) ,
person_name3 VARCHAR(3000) ,
person_name4 VARCHAR(3000) ,
person_name5 VARCHAR(3000) ,
gmt_create datetime ,
gmt_modified datetime
) ;
insert into tt1 (person_id, person_name1, person_name2, person_name3, person_name4, person_name5, gmt_create, gmt_modified)
values (1, lpad('',3000,'*'), lpad('',3000,'*'), lpad('',3000,'*'), lpad('',3000,'*'), lpad('',3000,'*'), now(), now());
# Run this statement repeatedly, about 10 times. Each run copies the whole table.
insert into tt1 (person_id, person_name1, person_name2, person_name3, person_name4, person_name5, gmt_create, gmt_modified)
select person_id, person_name1, person_name2, person_name3, person_name4, person_name5, now(), now() from tt1;
# Check the table status to confirm the data volume.
show table status like 'tt1'\G
Query the table from another machine.
select count(*) from tt1;
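If the data set is large enough, memory use exceeds physical memory and should lead to an OOM kill.
The problem
Every other OS tested triggers the OOM killer, after which MySQL is killed and restarted. Only Red Hat 9 does not.
Ubuntu 22, amazonlinux2 and amazonlinux2023 were tested and none of them shows the problem. All of their kernels are 5.10+.
On Red Hat 9, as memory use approaches the physical limit, the OS starts scanning and trying to reclaim memory very frequently, and commands respond slowly. I had sar running as a monitor, and even its output slowed down. Most processes gradually enter the uninterruptible state, and their number keeps growing. Eventually the network subsystem of the instance stops working: it does not respond to any network packet and the instance health check fails. CPU usage is still present. EBS monitoring shows the volume reading at full speed, very steadily, with no anomaly in response time or queue length.
If the OOM killer is triggered manually at this point, it succeeds, and the OS returns to normal once mysqld has been killed.
Analysis approach
What is certain is that the disk is working normally, and CPU usage rises slowly but is also normal. Nothing is wrong in that part.
When the problem occurs the network is down and the SSH session has already dropped, so the state of the system at that moment cannot be observed.
The first guess is a memory problem, but where exactly does the difference lie?
First: the records from the sar command.
Second: compare against an OS that behaves normally by saving the output of sysctl -a to a file on each.
After comparing the memory-related parameters there is no difference at all, so the problem is not caused by a setting or a configuration file.
The min watermark is left at its default, the same as on the other distributions, and the tests used the same instance size, so there is no difference of that kind.
Third: trigger a kdump to see the state of the OS while the network is unresponsive. I am not yet familiar with how to do this.
Fourth: check the kernel build options for differences.
Fifth: study the memory part of the OS to learn how this kernel version reclaims memory and under which conditions it triggers the OOM killer.
Upgrading and cleaning up kernels
Usual steps for upgrading the kernel version
CentOS upgrade steps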
yum makecache -y
yum update -y
grub2-editenv list
grub2-set-default 'CentOS Linux (3.10.xxxxx.el7.elrepo.x86_64) 7 (Core)' # entry_name
systemctl reboot
Cleaning up old versions
RHEL or CentOS
rpm -qa 'kernel*'
# This lists every kernel version currently installed. Remove the unwanted packages manually.
Remove the unneeded versions directly with yum.
yum remove -y kernel-devel-5.10.216-204.855.amzn2.x86_64 kernel-devel-5.10.218-208.862.amzn2.x86_64 kernel-5.10.216-204.855.amzn2.x86_64 kernel-5.10.218-208.862.amzn2.x86_64
rpm -qa | grep kernel
kernel-tools-5.10.219-208.866.amzn2.x86_64
kernel-headers-5.10.219-208.866.amzn2.x86_64
kernel-devel-5.10.219-208.866.amzn2.x86_64
kernel-5.10.219-208.866.amzn2.x86_64
List /boot to confirm that the old kernels have been cleaned up.
ls -alh /boot/
total 29M
dr-xr-xr-x 4 root root 4.0K Jul 19 15:02 ./
dr-xr-xr-x 19 root root 268 Jul 1 17:32 ../
-rw-r--r-- 1 root root 174 Jun 18 22:04 .vmlinuz-5.10.219-208.866.amzn2.x86_64.hmac
-rw------- 1 root root 4.5M Jun 18 22:04 System.map-5.10.219-208.866.amzn2.x86_64
-rw-r--r-- 1 root root 141K Jun 18 22:04 config-5.10.219-208.866.amzn2.x86_64
drwxr-xr-x 3 root root 17 Oct 14 2022 efi/
drwx------ 5 root root 79 Jul 19 15:02 grub2/
-rw------- 1 root root 14M Jul 9 15:03 initramfs-5.10.219-208.866.amzn2.x86_64.img
-rw-r--r-- 1 root root 643K Oct 14 2022 initrd-plymouth.img
-rw-r--r-- 1 root root 268K Jun 18 22:05 symvers-5.10.219-208.866.amzn2.x86_64.gz
-rwxr-xr-x 1 root root 9.7M Jun 18 22:04 vmlinuz-5.10.219-208.866.amzn2.x86_64*
Of course, if everything has been uninstalled, it can be reinstalled (doge).
yum groupinstall -y "Development Tools"
yum install -y kernel kernel-devel kernel-debug
Downgrading on Ubuntu
The running kernel on Ubuntu cannot be removed directly: install the older kernel, switch to it, then remove the newer one.
root@ip-172-31-59-13:~# update-initramfs -k all -c
update-initramfs: Generating /boot/initrd.img-5.15.0-1048-aws
update-initramfs: Generating /boot/initrd.img-5.4.0-1126-aws
root@ip-172-31-59-13:~# update-grub
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/40-force-partuuid.cfg'
Sourcing file `/etc/default/grub.d/50-cloudimg-settings.cfg'
Sourcing file `/etc/default/grub.d/init-select.cfg'
Generating grub configuration file ...
GRUB_FORCE_PARTUUID is set, will attempt initrdless boot
Found linux image: /boot/vmlinuz-5.15.0-1048-aws
Found initrd image: /boot/microcode.cpio /boot/initrd.img-5.15.0-1048-aws
Found linux image: /boot/vmlinuz-5.4.0-1126-aws
Found initrd image: /boot/microcode.cpio /boot/initrd.img-5.4.0-1126-aws
Found Ubuntu 20.04.6 LTS (20.04) on /dev/nvme0n1p1
done
List the available kernel versions
root@ip-172-31-59-13:$ apt search linux-image | grep 5.4.0 | grep linux-image | grep aws
List all installed kernels
root@ip-172-31-59-13:~$ dpkg --get-selections | grep linux
console-setup-linux install
libselinux1:amd64 install
linux-aws install
linux-aws-5.15-headers-5.15.0-1048 install
linux-aws-headers-5.4.0-1126 install
linux-base install
linux-headers-5.15.0-1048-aws install
linux-headers-5.4.0-1126-aws install
linux-headers-aws install
linux-image-5.15.0-1048-aws install
linux-image-5.4.0-1126-aws install
linux-image-aws install
linux-modules-5.15.0-1048-aws install
linux-modules-5.4.0-1126-aws install
util-linux install
Install the kernel
root@ip-172-31-59-13:~$ apt install -y linux-image-5.4.0-1126-aws/focal-updates linux-headers-5.4.0-1126-aws
Select the GRUB entry
root@ip-172-31-59-13:~$ vim /etc/default/grub
Set the default-entry variable (GRUB_DEFAULT) to a value in the following format:
Advanced options for Ubuntu>Ubuntu, with Linux 5.4.0-1126-aws
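Kernel cleanup steps - Version 2
Cleanup with the Deb package tools
List all installed kernel versions: dpkg --list | grep linux-image
Automatically remove old kernels other than the current one: sudo apt-get autoremove --purge
To remove an old kernel manually, use: sudo apt-get remove --purge linux-image-X.X.X-X-generic
Cleanup with the Rpm package tools
List the installed kernels: rpm -qa | grep kernel
Install yum-utils, which provides package-cleanup: sudo yum install yum-utils
Keep only two kernels: sudo package-cleanup --oldkernels --count=2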
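Linux memory management notes
Notes on memory management
Using the crash command
The crash command needs the debuginfo and kernel debug packages, and may also need gdb.
The following repository has to be enabled in the repo configuration: rhel-8-baseos-rhui-debug-rpms
The detailed steps are also in this official Red Hat document: https://access.redhat.com/solutions/9907
yum install -y kernel-debuginfo
# This command installs the package, but it is very large.
crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux
Use crash to look up the mapping between physical memory (PM) and virtual memory (VM):
The kernel debug files are under: /usr/lib/debug/lib/modules/kernel-version/
Running crash: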
~ # ❯❯❯ crash
crash 7.3.2-4.el8
Copyright (C) 2002-2022 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2022 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
WARNING: kernel relocated [592MB]: patching 107327 gdb minimal_symbol values
KERNEL: /usr/lib/debug/lib/modules/4.18.0-477.13.1.el8_8.x86_64/vmlinux [TAINTED]
DUMPFILE: /proc/kcore
CPUS: 2
DATE: Tue Jul 11 17:07:01 CST 2023
UPTIME: 01:04:34
LOAD AVERAGE: 0.15, 0.03, 0.01
TASKS: 226
NODENAME: center
RELEASE: 4.18.0-477.13.1.el8_8.x86_64
VERSION: #1 SMP Thu May 18 10:27:05 EDT 2023
MACHINE: x86_64 (2199 Mhz)
MEMORY: 7.9 GB
PID: 7657
COMMAND: "crash"
TASK: ffff9ce7d835a800 [THREAD_INFO: ffff9ce7d835a800]
CPU: 0
STATE: TASK_RUNNING (ACTIVE)
crash> vm -p [pid]
PID: 913 TASK: ffff9ce7c75fd000 CPU: 0 COMMAND: "sshd"
MM PGD RSS TOTAL_VM
ffff9ce7c11b8000 ffff9ce7c75ae000 7604k 76644k
VMA START END FLAGS FILE
ffff9ce7c759f828 55a7fcc7b000 55a7fcd4c000 8000875 /usr/sbin/sshd
VIRTUAL PHYSICAL
55a7fcc7b000 12026b000
55a7fcc7c000 1201df000
55a7fcc7d000 1201ec000
55a7fcc7e000 1200c7000
55a7fcc7f000 120c43000
55a7fcc80000 10fa79000
55a7fcc81000 11fdd3000
55a7fcc82000 11087f000
55a7fcc83000 11fa8d000
55a7fcc84000 10fe05000
55a7fcc85000 110870000
55a7fcc86000 10fa2c000
55a7fcc87000 10f9fc000
55a7fcc88000 10fdab000
55a7fcc89000 11f296000
55a7fcc8a000 1117ec000
55a7fcc8b000 10fdac000
55a7fcc8c000 120c65000
55a7fcc8d000 12011b000
55a7fcc8e000 110714000
55a7fcc8f000 110c83000
55a7fcc90000 110c90000
55a7fcc91000 110d2b000
55a7fcc92000 120730000
55a7fcc93000 12076f000
55a7fcc94000 1207e8000
55a7fcc95000 110c2f000
55a7fcc96000 110c3c000
55a7fcc97000 120650000
55a7fcc98000 1206c1000
55a7fcc99000 120c67000
55a7fcc9a000 120c0f000
55a7fcc9b000 FILE: /usr/sbin/sshd OFFSET: 20000
55a7fcc9c000 11d46d000
55a7fcc9d000 10fe01000
55a7fcc9e000 10fdb9000
55a7fcc9f000 10fde7000
55a7fcca0000 FILE: /usr/sbin/sshd OFFSET: 25000
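# The rest of the output is omitted because it is too long.
The output shows the memory mapping. Entries marked not mapped are the parts that have no physical memory behind them.
In general the last three hex digits of the virtual and the physical address are identical, because they are the offset inside a 4 KiB page. With THP the last five digits are identical.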
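The matching low digits follow from how an address splits into a page number and an in-page offset. The sketch below illustrates this with one address pair taken from the `vm -p` output above. The function names are hypothetical, and the page sizes (4 KiB base pages, 2 MiB huge pages) are the usual x86_64 values, assumed here rather than read from the system.

```python
PAGE_SHIFT_4K = 12  # 4 KiB base page: 12 offset bits = 3 hex digits
PAGE_SHIFT_2M = 21  # 2 MiB huge page: 21 offset bits = 5 hex digits plus 1 bit


def page_offset(addr: int, shift: int = PAGE_SHIFT_4K) -> int:
    """Return the low bits of addr that address a byte inside its page."""
    return addr & ((1 << shift) - 1)


def translate(vaddr: int, phys_page_base: int, shift: int = PAGE_SHIFT_4K) -> int:
    """Combine the physical page base with the offset carried by vaddr."""
    return phys_page_base | page_offset(vaddr, shift)


if __name__ == "__main__":
    # The first mapping above is 55a7fcc7b000 -> 12026b000.
    vaddr = 0x55A7FCC7B123  # an address 0x123 bytes into that virtual page
    print(hex(page_offset(vaddr)))
    print(hex(translate(vaddr, 0x12026B000)))
```

With 2 MiB pages the offset is 21 bits wide, so the last five hex digits always match and the lowest bit of the sixth digit matches as well.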
The vtop command shows what is stored for a single address and the details of its translation.
crash> vtop 55d5473fc000
VIRTUAL PHYSICAL
55d5473fc000 (not accessible)
The rd command reads memory starting at the given virtual address. The second argument is the number of units to read.
crash> rd 55d54879d000 100
rd: invalid user virtual address: 55d54879d000 type: "64-bit UVADDR"
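Using more memory than was allocated is an out-of-bounds access. For example, if a process allocates 1G and then writes more data than that, the excess lands in memory that does not belong to the allocation. When such a write reaches an address that is not mapped for the process, the kernel raises a segfault (SIGSEGV) and the process is terminated.
The error is not raised immediately: part of the overflow may really be written before the fault occurs.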
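Anonymous pages are the virtual memory created by mmap with the MAP_ANONYMOUS flag. The first time an anonymous page is written, a physical page is mapped to the virtual address and filled with zeros.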
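The zero-fill behavior can be observed from user space. This is a small sketch using Python's `mmap` module, where passing `-1` as the file descriptor requests an anonymous mapping. The function name is made up for illustration.

```python
import mmap


def anonymous_region_starts_zeroed(pages: int = 4) -> bool:
    """Map anonymous memory and check that it reads back as zeros."""
    length = pages * mmap.PAGESIZE
    # fileno=-1 asks for an anonymous mapping with no backing file.
    with mmap.mmap(-1, length) as region:
        untouched = region[:length] == bytes(length)  # read before any write
        region[0:5] = b"hello"                        # first write to the first page
        rest_still_zero = region[5:length] == bytes(length - 5)
        return untouched and rest_still_zero


if __name__ == "__main__":
    print(anonymous_region_starts_zeroed())
```

Only the five written bytes change; everything else in the region still reads as zero.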
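vm.overcommit_memory has three modes: 0 is heuristic and refuses only obviously excessive requests; 1 always overcommits, so virtual memory is unlimited; 2 enforces a limit computed from swap plus a configured ratio of physical memory.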
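For mode 2 the limit can be worked out by hand. The sketch below follows the formula in the kernel's overcommit accounting documentation: CommitLimit = (RAM - huge pages) * overcommit_ratio / 100 + swap, with a default ratio of 50. The function names are hypothetical, and the 16 GiB RAM and 40 GiB request are the example sizes from the MySQL test earlier in these notes, not measurements.

```python
def commit_limit_kib(mem_total_kib: int, swap_total_kib: int,
                     overcommit_ratio: int = 50, hugetlb_kib: int = 0) -> int:
    """CommitLimit under vm.overcommit_memory=2, in KiB."""
    return (mem_total_kib - hugetlb_kib) * overcommit_ratio // 100 + swap_total_kib


def would_be_refused(request_kib: int, committed_as_kib: int, limit_kib: int) -> bool:
    """True when a new allocation would push Committed_AS past CommitLimit."""
    return committed_as_kib + request_kib > limit_kib


if __name__ == "__main__":
    gib = 1024 * 1024  # KiB per GiB
    limit = commit_limit_kib(16 * gib, 0)  # assumed: 16 GiB RAM, no swap
    print(limit // gib, would_be_refused(40 * gib, 0, limit))
```

Under these assumptions the limit is 8 GiB, so a 40 GiB reservation would be refused at allocation time in mode 2 rather than being granted and failing later.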
Notes on using the GDB debugger
I started with the following program:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
int *data;
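data = (int *)malloc(100 * sizeof(int)); // allocate a block large enough for 100 ints
if (data == NULL) {
// handle the allocation failure
return 1;
}
//data[100] = 0;
// Use the data array...
// For example, initialize the array
for (int i = 0; i < 100; ++i) {
data[i] = 0; // set every element to 0
}
free(data);
// Use-after-free: data was released above, so this read is undefined behavior.
printf("%d\n", data[2]);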
}
Compile the program.
╰─>$ gcc ./123.c -g -Og
Run the program to be debugged.
╰─>$ ./a.out
971012533
Debug it with gdb.
╰─>$ gdb ./a.out -d .
╰─>$ gdb ./a.out -c ./COREDUMP_FILE
Commands
list - prints 10 lines of source. It can be repeated, and each call prints the next 10 lines.
(gdb) list
11 return 1;
12 }
13
14 //data[100] = 0;
15
16 // Use the data array...
17 // For example, initialize the array
18 for (int i = 0; i < 100; ++i) {
19 data[i ...
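Adding a Red Hat node to an EKS cluster, following the Packer-based steps
Copied, with permission, from the documentation in zhojiew's private repository.
The official repository provides build scripts based on the packer tool.
Here the relevant steps are run by hand to build a custom AMI based on Red Hat 9. The EKS-optimized AMI is reportedly built with the same steps.
Building the AMI manually
Clone the repository
cd /home/ec2-user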
sudo yum install git -y
git clone https://github.com/awslabs/amazon-eks-ami.git
Set the environment variables
KUBERNETES_VERSION=1.26.4
KUBERNETES_BUILD_DATE=2023-05-11
BINARY_BUCKET_NAME=amazon-eks
BINARY_BUCKET_REGION=cn-north-1
DOCKER_VERSION=20.10.23-1.amzn2.0.1
CONTAINERD_VERSION=1.6.*
RUNC_VERSION=1.1.5-1.amzn2
CNI_PLUGIN_VERSION=v0.8.6
PULL_CNI_FROM_GITHUB=true
SONOBUOY_E2E_REGISTRY=""
PAUSE_CONTAINER_VERSION=3.5
CACHE_CONTAINER_IMAGES=false
WORKING_DIR=/tmp/worker
TEMPLATE_DIR=/home/ec2-user/amazon-eks-ami
Copy the files and update the kernel (the kernel update can be skipped)
mkdir -p $WORKING_DIR
mkdir -p $WORKING_DIR/log-collector-script
mkdir -p $WORKING_DIR/bin
mv $TEMPLATE_DIR/files/* $WORKING_DIR/
mv $TEMPLATE_DIR/log-collector-script/linux/eks-log-collector.sh $WORKING_DIR/log-collector-script/
sudo chmod -R a+x $WORKING_DIR/bin/
sudo mv /tmp/worker/bin/* /usr/bin/
# sudo bash $TEMPLATE_DIR/scripts/upgrade_kernel.sh
KERNEL_VERSION=5.10
sudo grubby \
--update-kernel=ALL \
--args="psi=1"
sudo grubby \
--update-kernel=ALL \
--args="clocksource=tsc tsc=reliable"
sudo reboot
The main build logic is in the script install-worker.sh
# sudo bash $TEMPLATE_DIR/scripts/install-worker.sh
export AWS_DEFAULT_OUTPUT="json"
ARCH="amd64"
sudo yum update -y
sudo yum install -y \
chrony \
conntrack \
curl \
ethtool \
ipvsadm \
jq \
nfs-utils \
socat \
unzip \
wget \
yum-utils \
yum-plugin-versionlock \
mdadm \
pigz
# Remove any old kernel versions.
sudo package-cleanup --oldkernels --count=1 -y
# Remove the ec2-net-utils package
if yum list installed | grep ec2-net-utils; then sudo yum remove ec2-net-utils -y -q; fi
sudo mkdir -p /etc/eks/
sudo mv $WORKING_DIR/configure-clocksource.service /etc/eks/configure-clocksource.service
# iptables
sudo mv $WORKING_DIR/iptables-restore.service /etc/eks/iptables-restore.service
# awscli
sudo yum install less unzip jq -y
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install --update
complete -C '/usr/local/bin/aws_completer' aws
# systemd
sudo mv "${WORKING_DIR}/runtime.slice" /etc/systemd/system/runtime.slice
Build and install runc
# install runc and lock version
# sudo yum install -y runc-${RUNC_VERSION}
sudo yum install libseccomp-devel.x86_64 golang -y
go env -w GOPROXY=https://goproxy.io,direct
git clone https://github.com/opencontainers/runc
cd runc
make
sudo make install
Install containerd
# install containerd and lock version
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
# sudo yum install -y containerd-${CONTAINERD_VERSION}
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sudo yum install -y containerd # 1.6.21
Configure containerd
sudo mkdir -p /etc/eks/containerd
sudo mv $WORKING_DIR/containerd-config.toml /etc/eks/containerd/containerd-config.toml
# containerd and related service
sudo mv $WORKING_DIR/kubelet-containerd.service /etc/eks/containerd/kubelet-containerd.service
sudo mv $WORKING_DIR/sandbox-image.service /etc/eks/containerd/sandbox-image.service
sudo mv $WORKING_DIR/pull-sandbox-image.sh /etc/eks/containerd/pull-sandbox-image.sh
sudo mv $WORKING_DIR/pull-image.sh /etc/eks/containerd/pull-image.sh
sudo chmod +x /etc/eks/containerd/pull-sandbox-image.sh
sudo chmod +x /etc/eks/containerd/pull-image.sh
sudo mkdir -p /etc/systemd/system/containerd.service.d
cat << EOF | sudo tee /etc/systemd/system/containerd.service.d/10-compat-symlink.conf
[Service]
ExecStartPre=/bin/ln -sf /run/containerd/containerd.sock /run/dockershim.sock
EOF
cat << EOF | sudo tee -a /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
cat << EOF | sudo tee -a /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
# skip docker
Log rotation configuration
# logrotate
sudo mv $WORKING_DIR/logrotate-kube-proxy /etc/logrotate.d/kube-proxy
sudo mv $WORKING_DIR/logrotate.conf /etc/logrotate.conf
sudo chown root:root /etc/logrotate.d/kube-proxy
sudo chown root:root /etc/logrotate.conf
sudo mkdir -p /var/log/journal
Download kubelet and aws-iam-authenticator
## download bin in china region
S3_DOMAIN="amazonaws.com.cn"
S3_PATH="s3://amazon-eks/1.26.4/2023-05-11/bin/linux/amd64"
# Verify that the aws-iam-authenticator is at least v0.5.9 or greater
BINARIES=(
kubelet
aws-iam-authenticator
)
for binary in ${BINARIES[*]}; do
aws s3 cp $S3_PATH/$binary . --region cn-north-1
sudo chmod +x $binary
sudo mv $binary /usr/bin/
done
Continue configuring the services
# kubernetes
sudo mkdir -p /etc/kubernetes/manifests
sudo mkdir -p /var/lib/kubernetes
sudo mkdir -p /var/lib/kubelet
sudo mkdir -p /opt/cni/bin
CNI_PLUGIN_FILENAME="cni-plugins-linux-${ARCH}-${CNI_PLUGIN_VERSION}"
aws s3 cp --region $BINARY_BUCKET_REGION $S3_PATH/${CNI_PLUGIN_FILENAME}.tgz .
su ...
BufferIO compared with DirectIO
Test method
File writes with BufferIO:
#!/bin/bash
perf record -T -C 0 -- taskset -c 0 dd if=/dev/zero of=./a.dat bs=4k count=16384
File writes with DirectIO:
#!/bin/bash
perf record -T -C 0 -- taskset -c 0 dd if=/dev/zero of=./a.dat bs=4k count=16384 oflag=direct
Results
BufferIO:
[root@ip-172-31-53-200 perf_records]# ./start_test_bufferio.sh
16384+0 records in
16384+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 0.118848 s, 565 MB/s
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.225 MB perf.data (485 samples) ]
ll -h
-rw-r--r--. 1 root root 64M Jun 1 13:45 a.dat
[root@ip-172-31-53-200 ~]# dstat -tf
----system---- -----cpu0-usage----------cpu1-usage----------cpu2-usage----------cpu3-usage---- dsk/nvme0n1 ---net/lo-----net/eth0- ---paging-- ---system--
time |usr sys idl wai stl:usr sys idl wai stl:usr sys idl wai stl:usr sys idl wai stl| read writ| recv send: recv send| in out | int csw
01-06 13:35:48| 2 0 99 0 0: 1 0 99 0 0: 0 1 98 0 0: 1 0 99 0 0|8192B 35k|1096B 1096B: 968B 828B| 0 0 | 712 971
01-06 13:35:49| 25 9 60 5 0: 0 0 99 0 0: 4 12 84 0 0: 4 8 90 0 0| 0 64M|1096B 1096B: 576B 756B| 0 0 |2283 1311
01-06 13:35:50| 6 1 94 0 0: 1 1 99 0 0: 16 2 83 0 0: 0 1 99 0 0| 0 0 |1096B 1096B: 156B 418B| 0 0 | 954 1018
DirectIO:
[root@ip-172-31-53-200 perf_records]# ./start_test_directio.sh
16384+0 records in
16384+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 10.4225 s, 6.4 MB/s
[ perf record: Woken up 9 times to write data ]
[ perf record: Captured and wrote 2.417 MB perf.data (41489 samples) ]
[root@ip-172-31-53-200 ~]# dstat -tf
----system---- -----cpu0-usage----------cpu1-usage----------cpu2-usage----------cpu3-usage---- dsk/nvme0n1 ---net/lo-----net/eth0- ---paging-- ---system--
time |usr sys idl wai stl:usr sys idl wai stl:usr sys idl wai stl:usr sys idl wai stl| read writ| recv send: recv send| in out | int csw
01-06 13:36:36| 0 1 99 0 0: 1 1 99 0 0: 0 1 100 0 0: 0 1 100 0 0| 0 0 |1097B 1097B: 688B 624B| 0 0 | 622 930
01-06 13:36:37| 3 4 32 61 0: 4 14 81 0 0: 1 0 97 0 0: 0 5 94 0 0| 0 4277k|1095B 1095B: 332B 338B| 0 0 |6434 3133
01-06 13:36:38| 3 3 0 92 0: 1 1 99 0 0: 3 1 96 0 0: 0 0 99 0 0| 0 6421k|1096B 1096B: 52B 174B| 0 0 |8767 4148
01-06 13:36:39| 4 4 0 92 0: 0 0 99 0 0: 4 1 96 0 0: 2 0 100 0 0| 0 6431k|1096B 1096B: 52B 150B| 0 0 |8790 4191
01-06 13:36:40| 4 4 0 91 0: 0 1 99 0 0: 2 1 96 0 0: 0 1 99 0 0| 0 6320k|1096B 1096B: 52B 142B| 0 0 |8744 4092
01-06 13:36:41| 4 4 0 92 0: 1 0 99 0 0: 3 0 96 0 0: 0 0 100 0 0| 0 6216k|1096B 1096B: 52B 142B| 0 0 |8662 4103
01-06 13:36:42| 3 4 0 92 0: 1 1 99 0 0: 2 2 96 0 0: 0 0 99 0 0| 0 7492k|1576B 1576B: 52B 134B| 0 0 |8756 4099
01-06 13:36:43| 3 3 0 91 0: 1 0 99 0 0: 4 1 96 0 0: 0 0 100 0 0| 0 6284k|1096B 1096B: 52B 134B| 0 0 |8720 4077
01-06 13:36:44| 4 2 0 92 0: 0 0 99 0 0: 2 1 96 0 0: 0 0 99 0 0| 0 6296k|1096B 1096B: 52B 134B| 0 0 |8788 4067
01-06 13:36:45| 4 5 0 91 0: 1 0 99 0 0: 4 0 96 0 0: 1 0 99 0 0| 0 6368k|1096B 1096B: 52B 134B| 0 0 |8792 4071
01-06 13:36:46| 3 5 0 92 0: 1 1 99 0 0: 4 1 96 0 0: 0 0 100 0 0| 0 5904k|1096B 1096B: 52B 134B| 0 0 |8576 3893
01-06 13:36:47| 25 7 0 69 0: 0 0 99 0 0: 2 1 96 0 0: 0 0 100 0 0| 0 4811k|1097B 1097B: 364B 763B| 0 0 |7035 3360
01-06 13:36:48| 4 0 96 0 0: 1 0 99 0 0: 22 3 75 1 0: 0 1 100 0 0|2642k 109k|1095B 1095B: 208B 472B| 0 0 | 977 1008
01-06 13:36:49| 0 1 99 0 0: 0 0 100 0 0: 0 0 98 0 0: 0 0 99 0 0| 0 0 |1096B 1096B: 104B 276B| 0 0 | 640 903
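The two dd summaries can be reduced to throughput, write rate and per-block latency for a direct comparison. This is a small sketch using only the figures printed by dd above (64 MiB written in 4 KiB blocks). The function name `summarize` is made up for illustration.

```python
def summarize(total_bytes: int, block_size: int, seconds: float) -> dict:
    """Derive throughput, write rate and mean time per block from a dd run."""
    blocks = total_bytes // block_size
    return {
        "mb_per_s": total_bytes / seconds / 1e6,  # decimal MB, as dd reports
        "writes_per_s": blocks / seconds,
        "ms_per_block": seconds / blocks * 1000,
    }


if __name__ == "__main__":
    buffered = summarize(67108864, 4096, 0.118848)
    direct = summarize(67108864, 4096, 10.4225)
    print(buffered)
    print(direct)
    print("slowdown:", direct["ms_per_block"] / buffered["ms_per_block"])
```

In the direct run every 4 KiB write waits for the device, which comes to roughly 0.64 ms per block. In the buffered run dd returns once the data is in the page cache, and dstat shows the 64M reaching the disk within a single one-second sample.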
Perf sampling results
BufferIO: (image not included)
DirectIO: (image not included)
Reading VPC Flow Logs
How to read a VPC Flow Log
https://docs.amazonaws.cn/vpc/latest/userguide/flow-logs.html#flow-log-records
https://docs.amazonaws.cn/vpc/latest/userguide/flow-logs-records-examples.html#flow-log-example-tcp-flag
The tcp-flags field in a VPC flow log does not record the flags of any single TCP header. It is the combination of every TCP flag seen on the packets of that flow during one observation window.
TCP flags can be OR-ed during the aggregation interval. For short connections, the flags might be set on the same line in the flow log record, for example, 19 for SYN-ACK and FIN, and 3 for SYN and FIN. For an example, see TCP flag sequence. For general information about TCP flags (such as the meaning of flags like FIN, SYN, and ACK), see TCP segment structure on Wikipedia.
The recorded value is computed as follows: each flag is one bit, counted from right to left, starting at the power 0.
FIN 2^0
SYN 2^1
RST 2^2
PSH 2^3
ACK 2^4
URG 2^5
ECE 2^6
CWR 2^7
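The bit table above translates directly into a decoder. This is a minimal sketch with a hypothetical function name that turns the numeric tcp-flags value of a flow log record into flag names:

```python
# Bit values exactly as listed in the table above.
TCP_FLAG_BITS = [
    ("FIN", 1 << 0),
    ("SYN", 1 << 1),
    ("RST", 1 << 2),
    ("PSH", 1 << 3),
    ("ACK", 1 << 4),
    ("URG", 1 << 5),
    ("ECE", 1 << 6),
    ("CWR", 1 << 7),
]


def decode_tcp_flags(value: int) -> list:
    """Return the names of all flags whose bit is set in value."""
    return [name for name, bit in TCP_FLAG_BITS if value & bit]


if __name__ == "__main__":
    for value in (2, 18, 19, 3):
        print(value, decode_tcp_flags(value))
```

A value of 19 decodes to FIN, SYN and ACK, which matches the SYN-ACK and FIN example in the quoted AWS text.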
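Linux OS debugging notes
Triggering an NMI Unknown interrupt on EC2 Linux
Sending a diagnostic request to EC2 raises an NMI Unknown event in the OS. This event triggers kdump, which records the state of the system at that moment.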
aws ec2 send-diagnostic-interrupt --region cn-north-1 --instance-id i-********************
The captured dump files are saved under /var/crash/
[root@mysql 5.14.0-284.11.1.el9_2.x86_64]# ll /var/crash/
total 0
drwxr-xr-x. 2 root root 67 Jun 6 05:22 127.0.0.1-2023-06-06-05:22:11
drwxr-xr-x. 2 root root 67 Jun 6 08:58 127.0.0.1-2023-06-06-08:58:20
drwxr-xr-x. 2 root root 67 Jun 9 09:39 127.0.0.1-2023-06-09-09:39:56
Analyzing the dump with crash requires kernel-debug, kernel-debuginfo and kernel-devel to be installed.
[root@mysql 5.14.0-284.11.1.el9_2.x86_64]# crash /usr/lib/debug/lib/modules/5.14.0-284.11.1.el9_2.x86_64/vmlinux /var/crash/127.0.0.1-2023-06-09-09\:39\:56/vmcore
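Related documents:
New – Trigger a Kernel Panic to Diagnose Unresponsive EC2 Instances
Send a diagnostic interrupt (for advanced users)
Browsing the kernel source with cscope
# Download the source code
yum install -y yum-utils
yum download --source kernel
# Unpack the source package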
rpm2cpio ./kernel-5.14.0-284.11.1.el9_2.src.rpm | cpio -div
tar xf ./linux-5.14.0-284.11.1.el9_2.tar.xz
# Build the cscope database for browsing the source
make cscope ARCH=x86
# Generate the tags file
make tags ARCH=x86
# Browse
cscope -d
Using dracut
# Add drivers to the initramfs
]$ dracut -f -v --add-drivers "nvme ena" /boot/initramfs-$(uname -r).img $(uname -r)
# Check whether the modules are in the initramfs
]$ lsinitrd /boot/initramfs-$(uname -r).img | grep -E "nvme|ena"
Userspace processes becoming unresponsive because of security software
Red Hat documentation on this problem:
https://access.redhat.com/solutions/5201171
https://access.redhat.com/solutions/2838901
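Using Ftrace, and how to use some of the related commands:
[root@ip-172-31-51-167 ~]$ echo 'func fanotify_get_response +p' > /sys/kernel/debug/dynamic_debug/control
This enables the debug messages inside this kernel function. It is the kernel's dynamic debug facility, an old mechanism that predates kprobes, and it writes its output directly to dmesg.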
[root@ip-172-31-51-167 ~]$ perf trace -s -p 2688
[root@ip-172-31-51-167 ~]$ cd /var/crash/127.0.0.1-2023-08-11-06:53:10
[root@ip-172-31-51-167 ~]$ crash /usr/lib/debug/lib/modules/6.1.34-59.116.amzn2023.x86_64/vmlinux vmcore
[root@ip-172-31-51-167 127.0.0.1-2023-08-11-06:53:10]$ ll /var/crash
total 0
drwxr-xr-x. 2 root root 67 Aug 10 05:52 127.0.0.1-2023-08-10-05:52:16
drwxr-xr-x. 2 root root 67 Aug 10 06:13 127.0.0.1-2023-08-10-06:13:44
drwxr-xr-x. 2 root root 67 Aug 11 05:45 127.0.0.1-2023-08-10-13:03:27
drwxr-xr-x. 2 root root 91 Aug 12 15:05 127.0.0.1-2023-08-11-04:57:13
drwxr-xr-x. 2 root root 67 Aug 11 08:41 127.0.0.1-2023-08-11-06:53:10
drwxr-xr-x. 2 root root 67 Aug 11 20:56 badstop
drwxr-xr-x. 2 root root 41 Aug 11 20:46 crash
Basic grubby usage
Setting kernel arguments:
# Show the arguments of all kernels
$ grubby --info=ALL
# Set the default boot kernel
$ grubby --set-default-index=1
# Show the current default boot kernel
$ grubby --default-kernel
# Remove the listed arguments from all kernels
$ grubby --update-kernel=ALL --remove-args="systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M loglevel=8 crashkernel=512M"
# Add the arguments to all kernels
$ grubby --update-kernel=ALL --args="systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M loglevel=8 crashkernel=512M"
# Add arguments to one specific kernel.
$ grubby --update-kernel=/boot/vmlinuz-5.9.1-1.el8.elrepo.x86_64 --args="systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M loglevel=8 crashkernel=512M"
[root@ip-172-31-0-170 ~]# sudo kdumpctl status
kdump: Kdump is operational
[root@ip-172-31-0-170 ~]# sudo kdumpctl showmem
kdump: Reserved 256MB memory for crash kernel
[root@ip-172-31-0-170 ~]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt1)/boot/vmlinuz-6.1.34-59.116.amzn2023.x86_64 root=UUID=483d7075-a0f8-4ba8-a951-a668fa079cac ro console=tty0 console=ttyS0,115200n8 nvme_core.io_timeout=4294967295 rd.emergency=poweroff rd.shell=0 selinux=1 security=selinux quiet systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M loglevel=8 crashkernel=512M
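After a grubby change and a reboot, /proc/cmdline is the place to confirm which arguments the running kernel actually received. The sketch below parses such a line into a dictionary. `parse_cmdline` is a hypothetical helper, and it deliberately ignores quoted values that contain spaces.

```python
from collections import defaultdict


def parse_cmdline(text: str) -> dict:
    """Split a kernel command line into {name: [values]}.

    Bare flags such as "quiet" map to [None]; repeated names keep every value.
    Quoted values containing spaces are not handled in this sketch.
    """
    params = defaultdict(list)
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key].append(value if sep else None)
    return dict(params)


if __name__ == "__main__":
    sample = ("ro console=tty0 console=ttyS0,115200n8 "
              "nvme_core.io_timeout=4294967295 quiet crashkernel=512M")
    args = parse_cmdline(sample)
    print(args["crashkernel"], args["console"])
```

On a live system the same function can be fed the contents of /proc/cmdline read from the file.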
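Quickly starting prometheus and grafana
Quickly create a working prometheus and grafana for testing, keeping the data in a local directory so that it is not lost after a restart:
Create the directories.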
mkdir /opt/monitor
mkdir /opt/monitor/grafana
mkdir /opt/monitor/grafana_data
mkdir /opt/monitor/prometheus
mkdir /opt/monitor/prometheus_data
touch /opt/monitor/docker-compose.yaml
Create the docker-compose file
---
version: "3"
services:
  prometheus:
    container_name: prometheus
    image: reg.liarlee.site/docker.io/prom/prometheus:latest
    restart: always
    network_mode: host
    environment:
      - TZ=Asia/Shanghai
    volumes:
      # - /opt/monitor/prometheus/prometheus.yaml:/etc/prometheus/prometheus.yml
      - /opt/monitor/prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
      - '--storage.tsdb.retention.time=90d'
  grafana:
    container_name: grafana
    image: reg.liarlee.site/docker.io/grafana/grafana-oss:main-ubuntu
    restart: always
    network_mode: host
    environment:
      - TZ=Asia/Shanghai
    volumes:
      - /opt/monitor/grafana_data:/var/lib/grafana
      - /opt/monitor/grafana/datasource:/etc/grafana/provisioning/datasources
      # - /opt/monitor/grafana/grafana.ini:/etc/grafana/grafana.ini
      - /etc/localtime:/etc/localtime:ro
    user: '472'
Prepare the base configuration files
docker compose up -d
docker cp grafana:/etc/grafana/grafana.ini /opt/monitor/grafana/grafana.ini
docker cp prometheus:/etc/prometheus/prometheus.yml /opt/monitor/prometheus/prometheus.yaml
chown -R 472:472 /opt/monitor/grafana_data
chown -R 472:472 /opt/monitor/grafana
chown -R nobody:nobody /opt/monitor/prometheus_data
chown -R nobody:nobody /opt/monitor/prometheus
docker compose down --remove-orphans
Set up prometheus as the default datasource
touch /opt/monitor/grafana/datasource/datasource.yml
---
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://localhost:9090
    isDefault: true
    access: proxy
    editable: true
Change the parameters you need in the configuration files, uncomment the commented-out lines in the configuration, then restart.
docker compose down --remove-orphans && docker compose up -d
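Single-table database test
This test is based on the question: why should a single MySQL table not exceed 20 million rows?
Test procedure:
CREATE TABLE test(
id int NOT NULL AUTO_INCREMENT PRIMARY KEY comment 'primary key',
person_id int not null comment 'user id',
person_name VARCHAR(200) comment 'user name',
gmt_create datetime comment 'creation time',
gmt_modified datetime comment 'modification time'
) comment 'person information table';
Insert data: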
insert into test values(1,1,'user_1', NOW(), now());
insert into test (person_id, person_name, gmt_create, gmt_modified)
select (@i:=@i+1) as rownum, person_name, now(), now() from test, (select @i:=100) as init;
set @i=1;
-- Test SQL; record the run time of each statement
select count(*) from test;
select count(*) from test where id=XXX;
Check the data size of the table:
show table status like 'test'\G
Table with 2 million rows:
mysql> describe table test;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
| 1 | SIMPLE | test | NULL | ALL | NULL | NULL | NULL | NULL | 2092640 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
1 row in set, 1 warning (0.00 sec)
MySQL [test]> select count(*) from test;
1 row in set (0.045 sec)
1 row in set (0.050 sec)
1 row in set (0.050 sec)
4 million rows:
mysql> describe table test;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
| 1 | SIMPLE | test | NULL | ALL | NULL | NULL | NULL | NULL | 4185280 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
1 row in set, 1 warning (0.00 sec)
MySQL [test]> select count(*) from test;
1 row in set (0.126 sec)
1 row in set (0.120 sec)
1 row in set (0.119 sec)
8 million rows:
mysql> describe table test;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
| 1 | SIMPLE | test | NULL | ALL | NULL | NULL | NULL | NULL | 8370120 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+-------+
1 row in set, 1 warning (0.00 sec)
MySQL [test]> select count(*) from test;
1 row in set (0.266 sec)
1 row in set (0.266 sec)
1 row in set (0.253 sec)
16 million rows:
mysql> describe table test;
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-------+
| 1 | SIMPLE | test | NULL | ALL | NULL | NULL | NULL | NULL | 16337090 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-------+
1 row in set, 1 warning (0.00 sec)
MySQL [test]> select count(*) from test;
1 row in set (0.544 sec)
1 row in set (0.524 sec)
1 row in set (0.523 sec)
32 million rows:
mysql> describe table test;
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-------+
| 1 | SIMPLE | test | NULL | ALL | NULL | NULL | NULL | NULL | 32665301 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-------+
1 row in set, 1 warning (0.00 sec)
MySQL [test]> select count(*) from test;
1 row in set (1.068 sec)
1 row in set (1.057 sec)
1 row in set (1.044 sec)
The results are essentially linear. The data set is probably just too small.
mysql> show table status like 'test'\G
*************************** 1. row ***************************
Name: test
Engine: InnoDB
Version: 10
Row_format: Dynamic
Rows: 4182365
Avg_row_length: 48
Data_length: 202063872
Max_data_length: 0
...
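The linearity of the count(*) timings can be checked by normalizing each run to time per row. The sketch below uses the timings recorded above and takes the median of the three runs per table size. The row counts are the optimizer estimates from the describe output, so the per-row figures are approximate. `ns_per_row` is a made-up helper name.

```python
from statistics import median

# Estimated row count -> the three count(*) timings in seconds recorded above.
RUNS = {
    2092640: [0.045, 0.050, 0.050],
    4185280: [0.126, 0.120, 0.119],
    8370120: [0.266, 0.266, 0.253],
    16337090: [0.544, 0.524, 0.523],
    32665301: [1.068, 1.057, 1.044],
}


def ns_per_row(runs: dict) -> dict:
    """Median scan time divided by row count, in nanoseconds per row."""
    return {rows: median(times) / rows * 1e9 for rows, times in runs.items()}


if __name__ == "__main__":
    for rows, cost in ns_per_row(RUNS).items():
        print(f"{rows:>9} rows: {cost:.1f} ns/row")
```

The cost per row stays within a narrow band across all five sizes, which is what a linear full scan looks like.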
Networking references (not yet categorized)
Original version: https://datatracker.ietf.org/doc/html/rfc1180
Chinese version: http://arthurchiao.art/blog/rfc1180-a-tcp-ip-tutorial-zh/
Tuning initcwnd for optimum performance:
https://www.cdnplanet.com/blog/tune-tcp-initcwnd-for-optimum-performance/
https://www.kawabangga.com/posts/5217
Other Linux network stack explanation (Linux kernel networking): https://www.clockblog.life/article/2023/7/4/44.html
https://blogs.runsunway.com/