2009年4月9日星期四

iptables+tc+imq限制网络流量(三)

0 评论

我们把环境都准备好了 现在开始测试吧,比如现在我的网络中192.168.1.211这个ip经常是用p2p等下载导致整个网络丢包慢,我现在要对这个用户设置上传保证带宽为3KB/S 上传最大带宽为5KB/S,设置这个用户下载保证带宽为50KB/S 下载最大带宽为60KB/S。
先看下192.168.1.211的流量情况

192.168.1.211下载流量



192.168.1.211上传流量



编辑脚本


vi qoseth0
#!/bin/bash
#
# qoseth0 - per-IP bandwidth shaping with HTB classes on IMQ devices.
#
# Usage:
#   ./qoseth0          set up shaping (any argument other than stop/status)
#   ./qoseth0 stop     tear down all shaping rules
#   ./qoseth0 status   show qdisc/class/filter state and iptables counters
#
# imq0 shapes upload traffic (hooked in mangle/PREROUTING on the LAN NIC),
# imq1 shapes download traffic (hooked in mangle/POSTROUTING on the LAN NIC).
# Traffic to/from the NAT host itself is not marked and is therefore unshaped.

ifc_zew="eth0"             # LAN-facing network interface
max_upload="512kbit"       # total upstream bandwidth of the link
upload_user="30kbit"       # guaranteed upload rate for per-user classes
upload_serv="70kbit"       # upload rate reserved for services (currently unused)
max_download="6000kbit"    # total downstream bandwidth of the link
download_user="127kbit"    # guaranteed download rate for per-user classes
download_serv="127kbit"    # download rate reserved for services (currently unused)

if [ "$1" = "status" ]; then
  echo "############### Download ###################"
  echo "[qdisc]"
  tc -s qdisc show dev imq1
  echo ""
  echo "[class]"
  tc -s class show dev imq1
  echo ""
  echo "[filter]"
  tc -s filter show dev imq1
  echo ""
  echo "############### Upload ###################"
  echo "[qdisc]"
  tc -s qdisc show dev imq0
  echo ""
  echo "[class]"
  tc -s class show dev imq0
  echo ""
  echo "[filter]"
  tc -s filter show dev imq0
  echo ""
  echo "###########Download iptables##############"
  iptables -t mangle -L QOS-IP-OUT -v -x 2> /dev/null
  echo "############Upload iptables###############"
  iptables -t mangle -L QOS-IP-IN -v -x 2> /dev/null
  exit
fi

# Tear down any previous configuration. Errors are silenced on purpose so a
# first run (nothing configured yet) is not treated as a failure.
tc qdisc del dev imq0 root 2> /dev/null > /dev/null
tc qdisc del dev imq1 root 2> /dev/null > /dev/null
iptables -t mangle -D POSTROUTING -o "$ifc_zew" -m connmark --mark 1000 -j QOS-IP-OUT 2> /dev/null > /dev/null
iptables -t mangle -D PREROUTING -i "$ifc_zew" -m connmark --mark 1000 -j QOS-IP-IN 2> /dev/null > /dev/null
iptables -t mangle -D QOS-IP-OUT -j IMQ --todev 1 >/dev/null 2>&1
iptables -t mangle -D QOS-IP-IN -j IMQ --todev 0 >/dev/null 2>&1
# Also remove the FORWARD marking rule, otherwise repeated starts stack
# duplicate copies of it.
iptables -D FORWARD -i ppp0 -j CONNMARK --set-mark 1000 2> /dev/null > /dev/null
iptables -t mangle -F QOS-IP-IN 2> /dev/null > /dev/null
iptables -t mangle -X QOS-IP-IN 2> /dev/null > /dev/null
iptables -t mangle -F QOS-IP-OUT 2> /dev/null > /dev/null
iptables -t mangle -X QOS-IP-OUT 2> /dev/null > /dev/null

ip link set imq0 down 2> /dev/null > /dev/null
ip link set imq1 down 2> /dev/null > /dev/null

if [ "$1" = "stop" ]; then
  echo "Qos per IP Stop on imq0 and imq1."
  exit
fi

# Load IMQ with two devices (imq0/imq1). numdevs must be passed on the first
# load: a plain "modprobe imq" beforehand would fix the device count at the
# module default and make a later "numdevs=2" a no-op.
modprobe imq numdevs=2 >/dev/null 2>&1
modprobe ipt_IMQ >/dev/null 2>&1

# Mark every connection forwarded in from the external interface (ppp0) with
# connmark 1000 so the QOS chains below only see internet-bound traffic.
iptables -A FORWARD -i ppp0 -j CONNMARK --set-mark 1000

### Upload shaping on imq0 ###
ip link set imq0 up

# Top-level qdisc.
tc qdisc add dev imq0 root handle 1: htb
# Root class 1:1.
tc class add dev imq0 parent 1: classid 1:1 htb rate $max_upload ceil $max_upload
# Leaf class 1:10 (bulk/default upload traffic).
tc class add dev imq0 parent 1:1 classid 1:10 htb rate 488kbit ceil $max_upload quantum 1514
# Leaf class 1:20.
tc class add dev imq0 parent 1:1 classid 1:20 htb rate $upload_user ceil $max_upload quantum 1514
# Leaf class 1:30.
tc class add dev imq0 parent 1:1 classid 1:30 htb rate $upload_user ceil $max_upload quantum 1514
# Leaf class 1:40 (the restricted user: 24kbit guaranteed, 40kbit ceiling,
# i.e. roughly 3-5 KB/s).
tc class add dev imq0 parent 1:1 classid 1:40 htb rate 24kbit ceil 40kbit quantum 1514

# fw filters: packets carrying these firewall marks go to the matching class.
tc filter add dev imq0 parent 1: protocol ip prio 1 handle 201 fw flowid 1:10
tc filter add dev imq0 parent 1: protocol ip prio 1 handle 202 fw flowid 1:20
tc filter add dev imq0 parent 1: protocol ip prio 1 handle 203 fw flowid 1:30
tc filter add dev imq0 parent 1: protocol ip prio 1 handle 204 fw flowid 1:40
# Per-leaf SFQ keeps the flows inside each class fair to one another.
tc qdisc add dev imq0 parent 1:10 handle 10 sfq perturb 10
tc qdisc add dev imq0 parent 1:20 handle 20 sfq perturb 10
tc qdisc add dev imq0 parent 1:30 handle 30 sfq perturb 10
tc qdisc add dev imq0 parent 1:40 handle 40 sfq perturb 10

# QOS-IP-IN: classifies upload traffic arriving on the LAN interface.
iptables -t mangle -N QOS-IP-IN
# Only connections already connmarked 1000 (internet-bound) enter the chain.
iptables -t mangle -I PREROUTING -i "$ifc_zew" -m connmark --mark 1000 -j QOS-IP-IN
# Hand the chain's packets to imq0 for shaping.
iptables -t mangle -A QOS-IP-IN -j IMQ --todev 0
# Packets from 192.168.1.211 get mark 204 -> upload class 1:40.
iptables -t mangle -A QOS-IP-IN -s 192.168.1.211/32 -j MARK --set-mark 204

### Download shaping on imq1 ###
ip link set imq1 up
tc qdisc add dev imq1 root handle 1: htb
tc class add dev imq1 parent 1: classid 1:1 htb rate $max_download ceil $max_download
tc class add dev imq1 parent 1:1 classid 1:10 htb rate 3200kbit ceil 6000kbit burst 5k quantum 1514
tc class add dev imq1 parent 1:1 classid 1:20 htb rate $download_user ceil $max_download burst 5k quantum 1514
tc class add dev imq1 parent 1:1 classid 1:30 htb rate $download_user ceil $max_download burst 5k quantum 1514
# Leaf 1:40: the restricted user's download (400-480kbit, roughly 50-60 KB/s).
tc class add dev imq1 parent 1:1 classid 1:40 htb rate 400kbit ceil 480kbit burst 5k quantum 1514
tc filter add dev imq1 parent 1: protocol ip prio 1 handle 101 fw flowid 1:10
tc filter add dev imq1 parent 1: protocol ip prio 1 handle 102 fw flowid 1:20
tc filter add dev imq1 parent 1: protocol ip prio 1 handle 103 fw flowid 1:30
tc filter add dev imq1 parent 1: protocol ip prio 1 handle 104 fw flowid 1:40

tc qdisc add dev imq1 parent 1:10 handle 10 sfq perturb 10
tc qdisc add dev imq1 parent 1:20 handle 20 sfq perturb 10
tc qdisc add dev imq1 parent 1:30 handle 30 sfq perturb 10
tc qdisc add dev imq1 parent 1:40 handle 40 sfq perturb 10

# QOS-IP-OUT: classifies download traffic leaving on the LAN interface.
iptables -t mangle -N QOS-IP-OUT
iptables -t mangle -I POSTROUTING -o "$ifc_zew" -m connmark --mark 1000 -j QOS-IP-OUT
# Hand the chain's packets to imq1 for shaping.
iptables -t mangle -A QOS-IP-OUT -j IMQ --todev 1
# Packets to 192.168.1.211 get mark 104 -> download class 1:40.
iptables -t mangle -A QOS-IP-OUT -d 192.168.1.211/32 -j MARK --set-mark 104

启动脚本 ./qoseth0 start 在192.168.1.211机器上登录外网的一台ftp下载如图

上传如图

符合我们设置的要求

再来我们看看192.168.1.211对nat主机192.168.1.1的下载和上传会不会也被限制住呢?
下载如图


上传如图
可以看出来对nat主机的传输速度就不会限制了 符合我们的要求。


再来测试下使用迅雷下载



再关闭控制 ./qoseth0 stop
再用迅雷下载


对比关闭控制前后的速度可以看到,迅雷下载也是可以被限制的

iptables+tc+imq限制网络流量(二)

0 评论

在跳出对话框中 选择 Networking --->



确定后 再选择 Networking options --->



确定后 再选择 [*] Network packet filtering framework (Netfilter) --->



确定后 再选择 Core Netfilter Configuration --->



确定后 再选择 "layer7" match support



退回上层菜单 选择 IP:Netfilter configuration -->



确定后 再选择 IMQ target support



退回到第一层菜单 选择 Device drivers --->



确定后 再选择 [*] Network device support --->



确定后 再选择 IMQ (intermediate queueing device) support



安装核心


make
make modules
make modules_install
make install


安装iptables

[root@imq src]# cd iptables-1.4.1-rc3
[root@imq iptables-1.4.1-rc3]# ./configure --with-ksource=/usr/src/linux
[root@imq iptables-1.4.1-rc3]# make
[root@imq iptables-1.4.1-rc3]# make install


安装layer7 的协议文件

[root@imq iptables-1.4.1-rc3]# cd /usr/src/l7-protocols-2008-12-18
[root@imq l7-protocols-2008-12-18]# make install
[root@imq l7-protocols-2008-12-18]# mkdir -p /etc/l7-protocols
[root@imq l7-protocols-2008-12-18]# cp -R * /etc/l7-protocols

重启机器
[root@imq l7-protocols-2008-12-18]# reboot

检查配置结果
[root@imq ~]# uname -a; iptables -V
Linux imq.3322.org 2.6.26.6 #1 SMP Mon Feb 23 00:36:18 CST 2009 i686 i686 i386 GNU/Linux
iptables v1.4.1-rc3

检查是否支持HTB
[root@imq linux]# cat /usr/src/linux/.config | grep HTB
CONFIG_NET_SCH_HTB=m

检查是否安装了iproute TC命令在这个套件里
[root@imq linux]# yum list | grep iproute
iproute.i386 2.6.18-7.el5 installed


[root@imq ~]# modprobe imq 加载imq
[root@imq ~]# modprobe imq numdevs=2 设置imq设备的个数
[root@imq ~]# ip link set imq0 up 启动imq0设备
[root@imq ~]# ifconfig 查看网卡其中就有imq0

imq0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
UP RUNNING NOARP MTU:16000 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:11000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
可以看到imq0已经启用了
[root@imq ~]#

2009年4月8日星期三

iptables+tc+imq限制网络流量(一)

0 评论

网络基本架构图


一个公司的内部网络中,经常会有某些人员使用p2p 或者bt工具下载东西,而这些下载工具在下载的同时也在进行上传的动作,这样就会造成某些人员占用了很大的带宽资源,而导致其他同事无法正常上网工作。要解决这个问题必须对网络带宽进行限制 合理分配资源。


nat主机是centos5.2系统,需要先准备好的软件如下:


[root@imq src]# cd /usr/src
[root@imq src]# ls
iptables-1.4.1-imq.diff l7-protocols-2008-12-18.tar.gz imq-nat.diff
linux-2.6.26.6.tar.bz2 netfilter-layer7-v2.21.tar.gz iptables-1.4.1-rc3.tar.bz2
linux-2.6.25-imq5.diff

编译内核加入Qos 和L7-Filter +imq 支持,在linux中网络流量采用队列式的,队列(queue)是网卡存放外送封包的地方,队列规则就是采用来管制流量的规则,如果没有设定,预设的规则是FIFO(first in first out,就是先进先出),也就是没有任何限制的意思。

QOS 能把网络带宽分成很多群组,然后每个群组都可以设置一定的保证带宽和最大带宽。而且可以设置优先权,这个时候我们就可以根据我们设置的规则使用iptables把资料分门别类的送到QOS所设置的群组内,这样不同的应用就可以分别享有不同大小的带宽资源了。

在linux中QOS 有很多种 有CBQ,HTB,HFSC 其中htb比较容易看懂 而且设置的带宽比较精确,接下来我们就用HTB来做QOS管理。

由于QOS的tc机制在一个设备上只能限制上传或者下载,所以无法在同一张网卡上同时限制上传和下载。

因为nat通常都需要使用两张网卡,很多人就是使用对外的网卡控制上传,对内的网卡控制下载,但是这不是一个好的方式,因为NAT主机在很多时候都有可能做为PROXYD等等服务,这样连到NAT主机的服务也会被限制到了,而我们不希望内部人员连到NAT主机的封包也被限制。

这个时候就要通过IMQ 这种虚拟设备来解决这个问题,我的理解就是imq这个机制 可以把很多网卡虚拟为一个网卡 也可以把一张网卡虚拟成多张网卡。什么情况下需要将多个网卡虚拟为一张网卡呢 比如我内网有很多张网卡 我希望把这些内网卡看做一个整体来限制速度 那么就可以把这些内网卡虚拟为一个imq设备 然后对这个imq设备进行设置就好了,而这篇文章开头所显示的网络结构图就要采用把一张网卡虚拟成多个imq设备。


将套件解压

[root@imq src]# tar -jxvf linux-2.6.26.6.tar.bz2
[root@imq src]# tar -zxvf netfilter-layer7-v2.21.tar.gz
[root@imq src]# tar -jxvf iptables-1.4.1-rc3.tar.bz2
[root@imq src]# tar -zxvf l7-protocols-2008-12-18.tar.gz

将核心的目录做个软连接叫linux

[root@imq src]# ln -s /usr/src/linux-2.6.26.6 linux


给 kernel source 上 layer7 和imq 的 补丁。

[root@imq src]# cd linux
[root@imq linux]# patch -p1 < ../netfilter-layer7-v2.21/kernel-2.6.25-2.6.28-layer7-2.21.patch
[root@imq linux]# patch -p1 < ../linux-2.6.25-imq5.diff


给iptables打上补丁

[root@imq linux]# cd ../iptables-1.4.1-rc3
[root@imq iptables-1.4.1-rc3]# cp ../netfilter-layer7-v2.21/iptables-1.4.1.1-for-kernel-2.6.20forward/libxt_layer7.* extensions/
[root@imq iptables-1.4.1-rc3]# patch -p1 < ../iptables-1.4.1-imq.diff
[root@imq iptables-1.4.1-rc3]# cd extensions
[root@imq extensions]# chmod 0755 .IMQ*
[root@imq extensions]# cd /usr/src/linux/drivers/net/
[root@imq net]# patch < imq-nat.diff


编译内核

[root@imq net]# cd /usr/src/linux
[root@imq linux]# cp /boot/config-2.6.18-92.1.13.el5 ./.config
[root@imq linux]# make menuconfig

heartbeat+DRBD+MySQL(三)

0 评论

登陆node-a机器


[root@node-a ha.d]# /etc/rc.d/init.d/heartbeat start
Starting High-Availability services:
[ OK ]
[root@node-a ha.d]# chkconfig heartbeat on


[root@node-a ~]# ip addr show eth0 <-- 虚拟ip 192.168.1.10已经启用
2: eth0: mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:0c:29:8c:0c:be brd ff:ff:ff:ff:ff:ff
inet 192.168.1.11/24 brd 192.168.1.255 scope global eth0
inet 192.168.1.10/24 brd 192.168.1.255 scope global secondary eth0
inet6 fe80::20c:29ff:fe8c:cbe/64 scope link
valid_lft forever preferred_lft forever

[root@node-a ~]# cd /mnt/mysql
[root@node-a mysql]# ls
ibdata1 ib_logfile0 ib_logfile1 lost+found mysql test

[root@node-a ~] tail -f /var/log/ha-debug <-- 日志可以看出来 设定资源都正常启动了
IPaddr2[2296][2331]: 2009/03/31_15:18:10 INFO: ip -f inet addr add 192.168.1.10/24 brd 192.168.1.255 dev eth0
IPaddr2[2296][2333]: 2009/03/31_15:18:10 INFO: ip link set eth0 up
IPaddr2[2296][2335]: 2009/03/31_15:18:11 INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.1.10 eth0 192.168.1.10 auto not_used not_used
IPaddr2[2267][2338]: 2009/03/31_15:18:11 INFO: Success
Filesystem[2456][2486]: 2009/03/31_15:18:13 INFO: Running start for /dev/drbd0 on /mnt/mysql
Filesystem[2456][2491]: 2009/03/31_15:18:13 INFO: Starting filesystem check on /dev/drbd0
Filesystem[2445][2502]: 2009/03/31_15:18:13 INFO: Success
ResourceManager[2171][2503]: 2009/03/31_15:18:13 debug: /etc/ha.d/resource.d/Filesystem /dev/drbd0 /mnt/mysql start done. RC=0
ResourceManager[2171][2542]: 2009/03/31_15:18:14 info: Running /etc/init.d/mysqld start
ResourceManager[2171][2543]: 2009/03/31_15:18:14 debug: Starting /etc/init.d/mysqld start
ResourceManager[2171][2652]: 2009/03/31_15:18:17 debug: /etc/init.d/mysqld start done. RC=0
heartbeat[2009]: 2009/03/31_15:18:18 info: Local Resource acquisition completed. (none)
heartbeat[2009]: 2009/03/31_15:18:18 info: local resource transition completed.

node-a和node-b机器执行 为了方便测试,开放数据库可以用192.168.1段的ip来连接
GRANT ALL PRIVILEGES ON *.* TO 'root'@'192.168.1.%';

现在你在192.168.1.1机器测试访问数据库
web01# mysql -h 192.168.1.10 -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.0.45 Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql>

说明可以使用了。


最后我们整体来测试一下当机器出现问题后的是否能正常接管
1.测试机器直接关机

目前的状态是 node-b是主要机器 node-a是次要机器,两机器现在正常

node-b 执行关机 shutdown -h now
node-a 机器日志显示
heartbeat[6911]: 2009/04/05_15:13:46 info: all HA resource acquisition completed (standby).
heartbeat[5513]: 2009/04/05_15:13:46 info: Standby resource acquisition done [all].
heartbeat[7503]: 2009/04/05_15:13:46 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[7503][7509]: 2009/04/05_15:13:46 info: Running /etc/ha.d/rc.d/status status
mach_down[7515][7536]: 2009/04/05_15:13:46 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[7515][7540]: 2009/04/05_15:13:46 info: mach_down takeover complete for node node-b.
heartbeat[5513]: 2009/04/05_15:13:46 info: mach_down takeover complete.
heartbeat[7541]: 2009/04/05_15:13:46 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[7541][7547]: 2009/04/05_15:13:46 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[7541][7553]: 2009/04/05_15:13:46 received ip-request-resp IPaddr2::192.168.1.10/24/eth0/192.168.1.255 OK yes
ResourceManager[7554][7565]: 2009/04/05_15:13:46 info: Acquiring resource group: node-a IPaddr2::192.168.1.10/24/eth0/192.168.1.255 drbddisk::mysql Filesystem::/dev/drbd0::/mnt/mysql mysqld
IPaddr2[7577][7634]: 2009/04/05_15:13:47 INFO: Running OK
Filesystem[7660][7704]: 2009/04/05_15:13:48 INFO: Running OK

[root@node-a ~]# netstat -anpt
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 7481/mysqld
tcp 0 0 192.168.100.11:7789 0.0.0.0:* LISTEN -
tcp 0 0 :::22 :::* LISTEN 1742/sshd
tcp 0 148 ::ffff:192.168.1.11:22 ::ffff:192.168.1.210:3542 ESTABLISHED 1869/0

[root@node-a ~]# /etc/rc.d/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-i386-build, 2008-10-03 11:42:32
m:res cs st ds p mounted fstype
0:mysql WFConnection Primary/Unknown UpToDate/DUnknown C /mnt/mysql ext3

node-a机器发现node-b机器死了 node-a成功接管node-b为master


再重启node-b机器
node-a机器还是为master 并没有发现脑裂


接着我们
node-a 关机shutdown -h now

node-b机器显示 node-b成功接管node-a为master
Apr 5 15:22:13 node-b kernel: drbd0: role( Secondary -> Primary )
Apr 5 15:22:13 node-b kernel: drbd0: Writing meta data super block now.
Apr 5 15:22:14 node-b Filesystem[2346]: [2390]: INFO: Resource is stopped
Apr 5 15:22:14 node-b ResourceManager[2054]: [2404]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /mnt/mysql start
Apr 5 15:22:14 node-b Filesystem[2417]: [2447]: INFO: Running start for /dev/drbd0 on /mnt/mysql
Apr 5 15:22:14 node-b Filesystem[2417]: [2452]: INFO: Starting filesystem check on /dev/drbd0
Apr 5 15:22:15 node-b kernel: kjournald starting. Commit interval 5 seconds
Apr 5 15:22:15 node-b kernel: EXT3 FS on drbd0, internal journal
Apr 5 15:22:15 node-b kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 5 15:22:15 node-b Filesystem[2406]: [2463]: INFO: Success
Apr 5 15:22:15 node-b ResourceManager[2054]: [2503]: info: Running /etc/init.d/mysqld start
Apr 5 15:22:18 node-b heartbeat: [2022]: info: all HA resource acquisition completed (standby).
Apr 5 15:22:18 node-b heartbeat: [1907]: info: Standby resource acquisition done [all].
Apr 5 15:22:18 node-b harc[2614]: [2620]: info: Running /etc/ha.d/rc.d/status status
Apr 5 15:22:19 node-b mach_down[2626]: [2647]: info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
Apr 5 15:22:19 node-b mach_down[2626]: [2651]: info: mach_down takeover complete for node node-a.
Apr 5 15:22:19 node-b heartbeat: [1907]: info: mach_down takeover complete.
Apr 5 15:22:19 node-b harc[2652]: [2658]: info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
Apr 5 15:22:19 node-b ip-request-resp[2652]: [2664]: received ip-request-resp IPaddr2::192.168.1.10/24/eth0/192.168.1.255 OK yes
Apr 5 15:22:19 node-b ResourceManager[2665]: [2676]: info: Acquiring resource group: node-b IPaddr2::192.168.1.10/24/eth0/192.168.1.255 drbddisk::mysql Filesystem::/dev/drbd0::/mnt/mysql mysqld
Apr 5 15:22:20 node-b IPaddr2[2688]: [2745]: INFO: Running OK

[root@node-b ~]# netstat -anpt
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 2592/mysqld
tcp 0 0 192.168.100.12:7789 0.0.0.0:* LISTEN -
tcp 0 0 192.168.100.12:59833 192.168.100.11:7789 TIME_WAIT -
tcp 0 0 :::22 :::* LISTEN 1719/sshd
tcp 0 0 ::ffff:192.168.1.12:22 ::ffff:192.168.1.210:3682 ESTABLISHED 2003/0



再重启node-a机器
node-b机器还是为master 并没有发现脑裂



2. 测试网卡失效
目前node-a 为主要机器 node-b为次要机器

[root@node-a ~]# ifdown eth0 <-- 将node-a 的eth0网卡失效

在node-b机器的日志显示
[root@node-b ~]# tail -f /var/log/messages
Apr 8 16:10:23 node-b heartbeat: [1902]: info: Link node-a:eth0 dead.<-- 发现eth0死
Apr 8 16:10:24 node-b ipfail: [1994]: info: Telling other node that we have more visible ping nodes.
Apr 8 16:10:24 node-b ipfail: [1994]: info: Link Status update: Link node-a/eth0 now has status dead
Apr 8 16:10:26 node-b ipfail: [1994]: info: Asking other side for ping node count.
Apr 8 16:10:26 node-b ipfail: [1994]: info: Checking remote count of ping nodes.
Apr 8 16:10:27 node-b ipfail: [1994]: info: Telling other node that we have more visible ping nodes.
Apr 8 16:10:32 node-b heartbeat: [1902]: info: node-a wants to go standby [all]
<-- 将node-a 变更为 standby
Apr 8 16:10:36 node-b kernel: drbd0: peer( Primary -> Secondary )
<-- 将node-a 变更为 次要机器( Primary -> Secondary )
Apr 8 16:10:36 node-b heartbeat: [1902]: info: standby: acquire [all] resources from node-a
Apr 8 16:10:36 node-b heartbeat: [3022]: info: acquire all HA resources (standby).
Apr 8 16:10:37 node-b ResourceManager[3035]: [3046]: info: Acquiring resource group: node-b IPaddr2::192.168.1.10/24/eth0/192.168.1.255 drbddisk::mysql Filesystem::/dev/drbd0::/mnt/mysql mysqld
Apr 8 16:10:37 node-b IPaddr2[3058]: [3115]: INFO: Resource is stopped
Apr 8 16:10:37 node-b ResourceManager[3035]: [3129]: info: Running /etc/ha.d/resource.d/IPaddr2 192.168.1.10/24/eth0/192.168.1.255 start
Apr 8 16:10:37 node-b IPaddr2[3160]: [3195]: INFO: ip -f inet addr add 192.168.1.10/24 brd 192.168.1.255 dev eth0
Apr 8 16:10:37 node-b IPaddr2[3160]: [3197]: INFO: ip link set eth0 up
Apr 8 16:10:37 node-b IPaddr2[3160]: [3199]: INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.1.10 eth0 192.168.1.10 auto not_used not_used
Apr 8 16:10:37 node-b IPaddr2[3131]: [3203]: INFO: Success
Apr 8 16:10:37 node-b ResourceManager[3035]: [3232]: info: Running /etc/ha.d/resource.d/drbddisk mysql start
Apr 8 16:10:37 node-b kernel: drbd0: role( Secondary -> Primary )
Apr 8 16:10:37 node-b kernel: drbd0: Writing meta data super block now.
Apr 8 16:10:38 node-b Filesystem[3249]: [3293]: INFO: Resource is stopped
Apr 8 16:10:38 node-b ResourceManager[3035]: [3307]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /mnt/mysql start
Apr 8 16:10:38 node-b Filesystem[3320]: [3350]: INFO: Running start for /dev/drbd0 on /mnt/mysql
Apr 8 16:10:38 node-b Filesystem[3320]: [3355]: INFO: Starting filesystem check on /dev/drbd0
Apr 8 16:10:38 node-b kernel: kjournald starting. Commit interval 5 seconds
Apr 8 16:10:38 node-b kernel: EXT3 FS on drbd0, internal journal
Apr 8 16:10:38 node-b kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 8 16:10:38 node-b Filesystem[3309]: [3366]: INFO: Success
Apr 8 16:10:38 node-b ResourceManager[3035]: [3406]: info: Running /etc/init.d/mysqld start

[root@node-b ~]# ip addr show eth0
2: eth0: mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:0c:29:41:a0:8e brd ff:ff:ff:ff:ff:ff
inet 192.168.1.12/24 brd 192.168.1.255 scope global eth0
inet 192.168.1.10/24 brd 192.168.1.255 scope global secondary eth0
inet6 fe80::20c:29ff:fe41:a08e/64 scope link
valid_lft forever preferred_lft forever

说明node-b机器 已经接管node-a为master

当node-a 机器恢复eth0

ifup eth0

在node-b机器日志显示 侦测到node-a的eth0已经恢复
Apr 8 16:15:11 node-b heartbeat: [1902]: info: Link node-a:eth0 up.
Apr 8 16:15:11 node-b ipfail: [1994]: info: Link Status update: Link node-a/eth0 now has status up
Apr 8 16:15:14 node-b ipfail: [1994]: info: Ping node count is balanced.

node-b 还是 master机器


3.测试heartbeat服务死掉


[root@node-b ~]# /etc/rc.d/init.d/heartbeat stop
Stopping High-Availability services:
[ OK ]

在node-a机器的日志上显示
[root@node-a ~]# tail -f /var/log/ha-debug
ipfail[2021]: 2009/04/08_16:14:00 debug: Other side is unstable.
heartbeat[1929]: 2009/04/08_16:14:04 info: Received shutdown notice from 'node-b'.
heartbeat[1929]: 2009/04/08_16:14:04 info: Resources being acquired from node-b.
heartbeat[1929]: 2009/04/08_16:14:04 debug: StartNextRemoteRscReq(): child count 1
heartbeat[3311]: 2009/04/08_16:14:04 info: acquire all HA resources (standby).
ResourceManager[3340][3357]: 2009/04/08_16:14:04 info: Acquiring resource group: node-a IPaddr2::192.168.1.10/24/eth0/192.168.1.255 drbddisk::mysql Filesystem::/dev/drbd0::/mnt/mysql mysqld
IPaddr2[3381][3495]: 2009/04/08_16:14:06 INFO: Resource is stopped
IPaddr2[3389][3503]: 2009/04/08_16:14:06 INFO: Resource is stopped
heartbeat[3312]: 2009/04/08_16:14:06 info: Local Resource acquisition completed.
heartbeat[1929]: 2009/04/08_16:14:06 debug: StartNextRemoteRscReq(): child count 2
heartbeat[1929]: 2009/04/08_16:14:06 debug: StartNextRemoteRscReq(): child count 1
ResourceManager[3340][3515]: 2009/04/08_16:14:06 info: Running /etc/ha.d/resource.d/IPaddr2 192.168.1.10/24/eth0/192.168.1.255 start
ResourceManager[3340][3516]: 2009/04/08_16:14:06 debug: Starting /etc/ha.d/resource.d/IPaddr2 192.168.1.10/24/eth0/192.168.1.255 start
IPaddr2[3546][3581]: 2009/04/08_16:14:07 INFO: ip -f inet addr add 192.168.1.10/24 brd 192.168.1.255 dev eth0
IPaddr2[3546][3583]: 2009/04/08_16:14:07 INFO: ip link set eth0 up
IPaddr2[3546][3585]: 2009/04/08_16:14:07 INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.1.10 eth0 192.168.1.10 auto not_used not_used
IPaddr2[3517][3589]: 2009/04/08_16:14:07 INFO: Success
ResourceManager[3340][3590]: 2009/04/08_16:14:07 debug: /etc/ha.d/resource.d/IPaddr2 192.168.1.10/24/eth0/192.168.1.255 start done. RC=0
ResourceManager[3340][3618]: 2009/04/08_16:14:07 info: Running /etc/ha.d/resource.d/drbddisk mysql start
ResourceManager[3340][3619]: 2009/04/08_16:14:07 debug: Starting /etc/ha.d/resource.d/drbddisk mysql start
ResourceManager[3340][3623]: 2009/04/08_16:14:07 debug: /etc/ha.d/resource.d/drbddisk mysql start done. RC=0
Filesystem[3635][3679]: 2009/04/08_16:14:08 INFO: Resource is stopped
ResourceManager[3340][3693]: 2009/04/08_16:14:08 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /mnt/mysql start
ResourceManager[3340][3694]: 2009/04/08_16:14:08 debug: Starting /etc/ha.d/resource.d/Filesystem /dev/drbd0 /mnt/mysql start
Filesystem[3706][3736]: 2009/04/08_16:14:09 INFO: Running start for /dev/drbd0 on /mnt/mysql
Filesystem[3706][3741]: 2009/04/08_16:14:09 INFO: Starting filesystem check on /dev/drbd0
Filesystem[3695][3752]: 2009/04/08_16:14:09 INFO: Success


[root@node-a ~]# ip addr show eth0
2: eth0: mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:0c:29:8c:0c:be brd ff:ff:ff:ff:ff:ff
inet 192.168.1.11/24 brd 192.168.1.255 scope global eth0
inet 192.168.1.10/24 brd 192.168.1.255 scope global secondary eth0
inet6 fe80::20c:29ff:fe8c:cbe/64 scope link
valid_lft forever preferred_lft forever

[root@node-a ~]# netstat -anpt
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 3881/mysqld
tcp 0 0 192.168.100.11:49788 192.168.100.12:7789 ESTABLISHED -
tcp 0 0 192.168.100.11:7789 192.168.100.12:53825 ESTABLISHED -
tcp 0 0 :::22 :::* LISTEN 1741/sshd
tcp 0 0 ::ffff:192.168.1.11:22 ::ffff:192.168.1.211:1285 ESTABLISHED 2037/0

node-a机器已经接管了node-b机器为master


[root@node-b ~]# /etc/rc.d/init.d/heartbeat start <-- node-b再启动 heartbeat
Starting High-Availability services:
2009/04/08_16:20:42 INFO: Resource is stopped
[ OK ]
node-a机器日志显示 侦测出node-b的heartbeat已经启动
[root@node-a ~]# tail -f /var/log/ha-debug
heartbeat[1929]: 2009/04/08_16:16:35 info: Heartbeat restart on node node-b
heartbeat[1929]: 2009/04/08_16:16:35 info: Link node-b:eth0 up.
heartbeat[1929]: 2009/04/08_16:16:35 info: Status update for node node-b: status init
heartbeat[1929]: 2009/04/08_16:16:35 info: Link node-b:eth1 up.
heartbeat[1929]: 2009/04/08_16:16:35 info: Status update for node node-b: status up

此时 node-a机器任然为master
服务一切正常,到此所有测试已经完毕!


注意事项:
资源启动是从上到下依次进行,而关闭资源是从下到上依次进行 比如:


IPaddress::192.168.12.30/24 - Runs /etc/ha.d/resources.d/IPaddress 192.168.12.30/24 {start,stop}
drbddsk::mysql - Runs /etc/ha.d/resources.d/drbddsk mysql {start,stop}
Filesystem::/dev/drbd0::/mnt/mysql::ext3::defaults - Runs /etc/ha.d/resources.d/Filesystem /dev/drbd0 /mnt/mysql ext3 defaults {start,stop}
mysqld - Runs mysqld {start,stop}

heartbeat+DRBD+MySQL(二)

0 评论

回到node-a机器


[root@node-a ~]# drbdadm -- --overwrite-data-of-peer primary mysql
以node-a为主服务开始同步

[root@node-a ~]# watch cat /proc/drbd 查看同步信息
Every 2.0s: cat /proc/drbd Sat Mar 28 21:46:44 2009

version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-i386-bui
ld, 2008-10-03 11:42:32
0: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent C r---
ns:745388 nr:0 dw:0 dr:753568 al:0 bm:45 lo:1 pe:1 ua:256 ap:0 oos:2891724
[===>................] sync'ed: 20.7% (2891724/3637100)K

同时在node-b机器上查看同步信息
[root@node-b ~]# cat /proc/drbd
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-i386-build, 2008-10-03 11:42:32
0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate C r---
ns:0 nr:2894368 dw:2894304 dr:0 al:0 bm:176 lo:3 pe:2008 ua:2 ap:0 oos:742796
[==============>.....] sync'ed: 79.7% (742796/3637100)K
finish: 0:00:26 speed: 27,744 (17,124) K/sec <-- 同步的传输速度

node-a机器查看同步完后的信息 总共花了total 147 sec
[root@node-a ~]# tail -f /var/log/messages
Mar 28 21:46:09 node-a kernel: drbd0: writing of bitmap took 9 jiffies
Mar 28 21:46:09 node-a kernel: drbd0: 3552 MB (909275 bits) marked out-of-sync by on disk bit-map.
Mar 28 21:46:09 node-a kernel: drbd0: Writing meta data super block now.
Mar 28 21:46:09 node-a kernel: drbd0: conn( Connected -> WFBitMapS )
Mar 28 21:46:09 node-a kernel: drbd0: conn( WFBitMapS -> SyncSource )
Mar 28 21:46:09 node-a kernel: drbd0: Began resync as SyncSource (will sync 3637100 KB [909275 bits set]).
Mar 28 21:46:09 node-a kernel: drbd0: Writing meta data super block now.
Mar 28 21:48:37 node-a kernel: drbd0: Resync done (total 147 sec; paused 0 sec; 24740 K/sec)
Mar 28 21:48:37 node-a kernel: drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Mar 28 21:48:37 node-a kernel: drbd0: Writing meta data super block now.


开始建立node-a 的DRBD使用的文件磁盘

[root@node-a ~]# mkfs.ext3 -L mysql /dev/drbd0
mke2fs 1.39 (29-May-2006)
Filesystem label=mysql
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
455168 inodes, 909275 blocks
45463 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=931135488
28 block groups
32768 blocks per group, 32768 fragments per group
16256 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736

Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 21 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.

[root@node-a mnt]# mkdir /mnt/mysql
[root@node-a mnt]# mount /dev/drbd0 /mnt/mysql
[root@node-a mysql]# vi /etc/my.cnf <-- 修改mysql数据库存放路径
datadir=/mnt/mysql

[root@node-a mysql]# /etc/init.d/mysqld start <-- 启动mysql
Initializing MySQL database: Installing MySQL system tables...
OK
Filling help tables...
OK

To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system

PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:
/usr/bin/mysqladmin -u root password 'new-password'
/usr/bin/mysqladmin -u root -h node-a password 'new-password'
See the manual for more instructions.
You can start the MySQL daemon with:
cd /usr ; /usr/bin/mysqld_safe &

You can test the MySQL daemon with mysql-test-run.pl
cd mysql-test ; perl mysql-test-run.pl

Please report any problems with the /usr/bin/mysqlbug script!

The latest information about MySQL is available on the web at
http://www.mysql.com
Support MySQL by buying support/licenses at http://shop.mysql.com
[ OK ]
Starting MySQL: [ OK ]

[root@node-a mysql]# cd /mnt/mysql/
[root@node-a mysql]# ls
ibdata1 ib_logfile0 ib_logfile1 lost+found mysql test
有数据存在了

[root@node-a mysql]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-logV0100
7.7G 868M 6.4G 12% /
/dev/hda1 99M 24M 71M 25% /boot
tmpfs 125M 0 125M 0% /dev/shm
/dev/drbd0 3.5G 92M 3.2G 3% /mnt/mysql <-- 已经挂载好了

[root@node-a mysql]# scp /etc/my.cnf root@192.168.1.12:/etc/my.cnf

[root@node-a mysql]# /etc/rc.d/init.d/drbd status <-- 状态为一主一副了
drbd driver loaded OK; device status:
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-i386-build, 2008-10-03 11:42:32
m:res cs st ds p mounted fstype
0:mysql Connected Primary/Secondary UpToDate/UpToDate C /mnt/mysql ext3

cd .. 退出/mnt/mysql目录
umount /mnt/mysql <-- 卸载/mnt/mysql
[root@node-a mnt]# drbdadm secondary mysql (把node-a机器设置为次用机器)


登陆node-b机器

[root@node-b ~]# mkdir /mnt/mysql

[root@node-b mysql]# /etc/init.d/drbd start
[root@node-b mysql]# drbdadm primary mysql (把node-b机器设置为主机 这样才能挂载到/mnt/mysql)
[root@node-b mysql]# mount /dev/drbd0 /mnt/mysql

等几十秒后
[root@node-b mysql]# cd /mnt/mysql
[root@node-b mysql]# ls
ibdata1 ib_logfile0 ib_logfile1 lost+found mysql test

说明node-a机器上建立的数据库 已经自动同步到node-b机器上了


[root@node-b ~]# umount /mnt/mysql
[root@node-b ~]# drbdadm secondary mysql <-- 再将node-b设置为次要机器

[root@node-a mnt]# drbdadm primary mysql <-- 将node-a设置为主要机器


修改heartbeat的配置文件

[root@node-a] cp /usr/share/doc/heartbeat-*/ha.cf /etc/ha.d/
[root@node-a] vi /etc/ha.d/ha.cf
ucast eth0 192.168.1.12
ucast eth1 192.168.100.12
ping 192.168.1.1
respawn hacluster /usr/lib/heartbeat/ipfail
respawn hacluster /usr/lib/heartbeat/dopd
apiauth dopd gid=haclient uid=hacluster
udpport 694
warntime 5
deadtime 15
initdead 60
keepalive 2
node node-a
node node-b
auto_failback off
watchdog /dev/watchdog
use_logd yes
#crm yes

在node-b机器上编辑
[root@node-b ~]# vi /etc/ha.d/ha.cf
ucast eth0 192.168.1.11
ucast eth1 192.168.100.11
ping 192.168.1.1
respawn hacluster /usr/lib/heartbeat/ipfail
respawn hacluster /usr/lib/heartbeat/dopd
apiauth dopd gid=haclient uid=hacluster
udpport 694
warntime 5
deadtime 15
initdead 60
keepalive 2
node node-a
node node-b
auto_failback off
watchdog /dev/watchdog
use_logd yes
#crm yes



[root@node-a ~]# cp /usr/share/doc/heartbeat-*/logd.cf /etc/

[root@node-a ~]# vi /etc/logd.cf <-- 设置软件日志
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility none


[root@node-a ~]# cp /usr/share/doc/heartbeat-*/authkeys /etc/ha.d/

[root@node-a ~]# vi /etc/ha.d/authkeys <-- 设置验证方式采用 crc 可以节省cpu的消耗
auth 1
1 crc

[root@node-a ~]# chmod 600 /etc/ha.d/authkeys <-- 设置认证文件只能是root访问


编辑资源文件

[root@node-a ~]# vi /etc/ha.d/haresources
node-a \ <-- 节点名称 用"uname -n"就能显示出来
IPaddr2::192.168.1.10/24/eth0/192.168.1.255 \ <-- 资源虚拟的ip和网卡和子网掩码
drbddisk::mysql \ <-- drbd的资源
Filesystem::/dev/drbd0::/mnt/mysql \ <-- 使用得文件系统
mysqld <-- 使用mysql


把authkeys haresources ha.cf 都复制到node-b机器
scp ha.cf haresources authkeys root@192.168.1.12:/etc/ha.d/

[root@node-b ~]# vi /etc/ha.d/haresources
node-b \ <-- 这个需要修改为node-b
IPaddr2::192.168.1.10/24/eth0/192.168.1.255 \
drbddisk::mysql \
Filesystem::/dev/drbd0::/mnt/mysql \
mysqld

2009年4月7日星期二

heartbeat+DRBD+MySQL(一)

0 评论

最近测试了下使用heartbeat+DRBD+MySQL来实现mysql的高可用性,规划如下图。无论那台机器死了 另一台都可以监控到并在设定的时间内接管对方的服务 这样就保证了mysql的高可用性,



测试之前 先按照我之前写的“一张盘安装centos5.2”安装好系统,
具体测试环境如下:

虚拟机 虚拟多一块已硬盘 和多一张虚拟网卡
系统 centos5.2安装在虚拟机里

第一台虚拟机器
node-a 主服务器
eth0 192.168.1.11
eth0 192.168.1.10 虚拟ip 由HeartBeat控制接管作为主要提供mysql访问的ip
eth1 192.168.100.11 提供给DRBD传输数据使用

第二台虚拟机器
node-b 次服务器
eth0 192.168.1.12
eth0 192.168.1.10 虚拟ip 由HeartBeat控制接管作为主要提供mysql访问的ip
eth1 192.168.100.12 提供给DRBD传输数据使用


系统分区采用lvm模式

[root@node-a ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-logV0100
7.7G 601M 6.7G 9% /
/dev/hda1 99M 12M 83M 13% /boot
tmpfs 125M 0 125M 0% /dev/shm



[root@node-a ~]# lvdisplay -C
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
LogVol01 VolGroup00 -wi-ao 512.00M
logV0100 VolGroup00 -wi-ao 7.88G
lvol0 VolGroup00 -wi-a- 3.47G <---这个分区就是留给drbd使用的 安装的时候先不要格式化


安装drbd


[root@node-a yum.repos.d]# yum -y install drbd82 kmod-drbd82

[root@node-a yum.repos.d]# rpm -qi kmod-drbd82 <--- 查看kmod-drbd82对应的内核版本 Name : kmod-drbd82 Relocations: (not relocatable) Version : 8.2.6 Vendor: CentOS
Release : 2 Build Date: Sat 04 Oct 2008 12:46:57 AM CST
Install Date: Fri 27 Mar 2009 11:56:57 PM CST Build Host: c5-i386-build Group : System Environment/Kernel Source RPM: drbd82-kmod-8.2.6-2.src.rpm Size : 2047377 License: GPL
Signature : DSA/SHA1, Sat 04 Oct 2008 12:47:52 AM CST, Key ID a8a447dce8562897 URL : http://www.drbd.org/ Summary : drbd82 kernel module(s) Description :
This package provides the drbd82 kernel modules built for the
Linux kernel 2.6.18-92.1.13.el5 for the i686 family of processors.<--- 这里显示出来的是kernel 2.6.18-92.1.13.el5 所以我们更新下内核版本
[root@node-a yum.repos.d]# uname -r <--- 查看当前系统使用的内核版本 2.6.18-92.el5 [root@node-a yum.repos.d]# rpm -q kernel kernel-2.6.18-92.el5 kernel-2.6.18-92.1.22.el5
[root@node-a yum.repos.d]# yum -y install yum-allowdowngrade
[root@node-a yum.repos.d]# yum --allow-downgrade -y install kernel-2.6.18-92.1.13.el5 安装上drbd对应的内核版本 reboot 重启机器
[root@node-a ~]# uname -r 2.6.18-92.1.13.el5 <--- 内核版本已经正确对应上了


设置node-a的drbd配置了

[root@node-a ~]# vi /etc/drbd.conf
#
# please have a a look at the example configuration file in
# /usr/share/doc/drbd82/drbd.conf
#
global {
minor-count 1;
#usage-count yes;
}

common { syncer { rate 100M; } }


resource mysql {
protocol C; # There are A, B and C protocols. Stick with C.
# incon-degr-cmd "echo 'DRBD Degraded!' | wall; sleep 60 ; halt -f";
# If a cluster starts up in degraded mode, it will echo a message to all
# users. It'll wait 60 seconds then halt the system.

net {
cram-hmac-alg sha1;
shared-secret "FooFunFactory";
}

on node-a {
device /dev/drbd0; # The name of our drbd device.
disk /dev/VolGroup00/lvol0; # Partition we wish drbd to use.
address 192.168.100.11:7789; # node0 IP address and port number.
meta-disk internal; # Stores meta-data in lower portion of the device.
}

on node-b {
device /dev/drbd0; # Our drbd device, must match node0.
disk /dev/VolGroup00/lvol0; # Partition drbd should use.
address 192.168.100.12:7789; # IP address of node1, and port number.
meta-disk internal; # Stores meta-data in lower portion of the device.
}

disk {
on-io-error detach; # What to do when the lower level device errors.
}


startup {
wfc-timeout 0; # drbd init script will wait infinitely on resources.
degr-wfc-timeout 120; # 2 minutes.
}
} # End of resource mysql



安装一些相关的套件


yum -y install mysql-server.i386 which ntsysv ntp crontabs


安装heartbeat


yum -y install heartbeat 出现错误提示
useradd: user hacluster exists
error: %pre(heartbeat-2.1.3-3.el5.centos.i386) scriptlet failed, exit status 9
error: install: %pre scriptlet failed (2), skipping heartbeat-2.1.3-3.el5.centos

再安装一次 yum -y install heartbeat 就好了

node-a 执行
drbdadm create-md mysql <--- drbd 初始化
启动drbd
[root@node-a ~]# /etc/rc.d/init.d/drbd start
Starting DRBD resources:    [ d(mysql) s(mysql) n(mysql) ].
..........
***************************************************************
 DRBD's startup script waits for the peer node(s) to appear.
 - In case this node was already a degraded cluster before the
   reboot the timeout is 120 seconds. [degr-wfc-timeout]
 - If the peer was available before the reboot the timeout will
   expire after 0 seconds. [wfc-timeout]
   (These values are for resource 'mysql'; 0 sec -> wait forever)
 To abort waiting enter 'yes' [ 17]: yes <-- 输入yes
chkconfig drbd on 设置开机启动 drbd
设置node-a 防火墙不要开启 SELinux disabled 掉
shutdown -h now 把node-a 关机


为了方便 可以直接把node-a的虚拟机器所在的目录copy一份出来 作为node-b 这样就不用重新安装系统了 启动node-b 后 登陆机器 修改一下配置
1.修改 /etc/hosts 和 /etc/sysconfig/network 设置机器名为node-b
2.修改两张网卡为正确的ip
3.重新启动node-b机器


将node-a机器重启开机

[root@node-a ~]# cat /proc/drbd <-- 查看drbd的状态 自己是Secondary 对方为Unknown
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-i386-build, 2008-10-03 11:42:32
 0: cs:WFConnection st:Secondary/Unknown ds:Inconsistent/DUnknown C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 oos:3637100
（Inconsistent 表示还没有同步）


登录node-b机器
编辑
[root@node-b ~]# vi /etc/drbd.conf
#
# please have a look at the example configuration file in
# /usr/share/doc/drbd82/drbd.conf
#
global {
minor-count 1;
#usage-count yes;
}

common { syncer { rate 100M; } }


resource mysql {
protocol C; # There are A, B and C protocols. Stick with C.
# incon-degr-cmd "echo 'DRBD Degraded!' | wall; sleep 60 ; halt -f";
# If a cluster starts up in degraded mode, it will echo a message to all
# users. It'll wait 60 seconds then halt the system.

net {
cram-hmac-alg sha1;
shared-secret "FooFunFactory";
}

on node-a {
device /dev/drbd0; # The name of our drbd device.
disk /dev/VolGroup00/lvol0; # Partition we wish drbd to use.
address 192.168.100.11:7789; # node0 IP address and port number.
meta-disk internal; # Stores meta-data in lower portion of the device.
}

on node-b {
device /dev/drbd0; # Our drbd device, must match node0.
disk /dev/VolGroup00/lvol0; # Partition drbd should use.
address 192.168.100.12:7789; # IP address of node1, and port number.
meta-disk internal; # Stores meta-data in lower portion of the device.
}

disk {
on-io-error detach; # What to do when the lower level device errors.
}


startup {
wfc-timeout 0; # drbd init script will wait infinitely on resources.
degr-wfc-timeout 120; # 2 minutes.
}
} # End of resource mysql

同样执行
drbdadm create-md mysql

/etc/rc.d/init.d/drbd start

[root@node-b ~]# cat /proc/drbd <-- 状态是Secondary
version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-i386-build, 2008-10-03 11:42:32
 0: cs:Connected st:Secondary/Secondary ds:Inconsistent/Inconsistent C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 oos:3637100

一张盘安装centos5.2(三)

0 评论

配置第二张网卡eth1的参数

随机器启动

配置ip为192.168.100.11

选择ok

设置网关 和主次dns服务区地址

手动配置机器名称 如 node-a

配置时区 下拉菜单中选择亚洲上海

设置超级用户密码

选择Customize software selection自己定义安装软件

把默认有*号的都取消 也就是什么软件都不要安装,等到系统安装完后 需要什么软件我们使用yum再来安装 否则一张盘安装不下去。

选择ok 后开始安装

安装完后选择 reboot自动重启进入系统 系统安装到此完毕。

一张盘安装centos5.2(二)

0 评论

再查看和修改分区

采用lvm卷分区如下 其中lvol0先不需要格式化以后测试其他服务需要用到

选择grub启动管理

再选择yes

选择yes 不设置启动密码

选择yes

选择ok

选择eth0 后edit 编辑第一块网卡的参数

选随网卡随机器启动而启用 使用ipv4

手动设置好eth0 192.168.1.11的ip和掩码

一张盘安装centos5.2(一)

0 评论

首先下载http://mirrors.163.com/centos/5/isos/i386/CentOS-5.2-i386-bin-1of6.iso 这个文件,我是用vmware虚拟出机器来安装centos5.2。VMware虚拟出来的硬件如下图

图中显示有两个虚拟磁盘和两张虚拟网卡,其中hard disk2 和ethernet 2是自己添加上去的准备提供给以后需要其他服务测试使用。重启虚拟机器进入安装界面

我们选择文字安装输入linux text 后回车

选择ok


语言选择英语

键盘选择美式

选择yes


选择删除linux分区和hda 区


确认选择yes