
Running Hadoop in the Cloud series: Adding and Removing Hadoop Nodes

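The "Running Hadoop in the Cloud" series shows how to combine virtualization with Hadoop, so that a Hadoop cluster runs on VPS virtual hosts and offers storage and compute services to users through the cloud.

Hardware keeps getting cheaper. An unbranded server with two CPUs (24 cores in total), 48 GB of RAM, and a 2 TB disk now costs less than 20,000 RMB. Using such a machine to host a few web applications is plainly wasteful, and even a single-node Hadoop install leaves most of its computing resources idle. Making effective use of a machine this powerful therefore becomes an important part of cost control.

With virtualization we can split one server into 12 VPS instances, each with 2 CPU cores, 4 GB of RAM, and a 40 GB disk, and the resources can be reallocated later. What a great technology! We now have a 12-node Hadoop cluster. Let Hadoop run in the cloud and speed up the world.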

About the author:

  • Zhang Dan (Conan), programmer: Java, R, PHP, JavaScript
  • weibo:@Conan_Z
  • blog: http://blog.fens.me
  • email: bsspirit@gmail.com

Please credit the source when reposting:

http://blog.fens.me/hadoop-clone-add-delete/


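Preface

The earlier articles in this series showed how to create and manage virtual machines and how to add Hadoop nodes. This article simply collects those operations into a step-by-step summary, so that readers without a computing background can follow along.

Contents

  1. Add the cloned Hadoop node c6
  2. Remove the c6 node
  3. Implementation scripts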

 

1. Add the cloned Hadoop node c6

1. Log in to the host machine and check whether the domain name c6.wtmart.com already resolves.


~ ssh cos@host.wtmart.com

~ ping c6.wtmart.com
ping: unknown host c6.wtmart.com

2. Log in to the dns.wtmart.com server and bind the domain name.


~ ssh cos@dns.wtmart.com

~ sudo vi /etc/bind/db.wtmart.com
# add this record
c6      IN      A       192.168.1.35

# restart the DNS server
~ sudo /etc/init.d/bind9 restart
 * Stopping domain name service... bind9 waiting for pid 1418 to die    [ OK ]
 * Starting domain name service... bind9

~ exit
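
The record above is added by hand in vi. As a minimal sketch, the same edit could be scripted so that running it twice does not create a duplicate record. The helper name add_a_record is hypothetical and not part of BIND; the zone file path is the one used above.

```shell
#!/bin/bash
# Append "NAME IN A IP" to a zone file unless NAME already has an A record.
# add_a_record is a hypothetical helper name, not a BIND tool.
add_a_record() {
    local zone_file=$1 name=$2 ip=$3
    # Look for an existing A record for exactly this host name.
    if grep -Eq "^${name}[[:space:]]+IN[[:space:]]+A[[:space:]]" "$zone_file"; then
        echo "record for ${name} already present" >&2
        return 0
    fi
    printf '%s\tIN\tA\t%s\n' "$name" "$ip" >> "$zone_file"
}

# Possible use on the DNS server (run as root, then restart bind9 as shown above):
#   add_a_record /etc/bind/db.wtmart.com c6 192.168.1.35
```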

3. Back on the host, check again whether c6.wtmart.com resolves.


~ ping c6.wtmart.com -n
PING c6.wtmart.com (192.168.1.35) 56(84) bytes of data.
From 192.168.1.79 icmp_seq=1 Destination Host Unreachable
From 192.168.1.79 icmp_seq=2 Destination Host Unreachable
From 192.168.1.79 icmp_seq=3 Destination Host Unreachable

c6.wtmart.com now resolves to 192.168.1.35, but no machine answers at that address yet. Next we create a virtual machine for c6.

4. On the host, clone the virtual machine.


~ sudo virt-clone --connect qemu:///system -o hadoop-base -n c6 -f /disk/sdb1/c6.img
Cloning hadoop-base.img               1% [                          ]  42 MB/s | 531 MB     15:53 ETA
Cloning hadoop-base.img                                                        |  40 GB     07:54
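
Since only the node name changes from one clone to the next, the command above can be wrapped in a small function. This is a sketch under assumptions: clone_node and the DRY_RUN switch are hypothetical names, and by default the function only prints the command so it can be checked before it is run.

```shell
#!/bin/bash
# Build the virt-clone command for a new node and print it (or run it).
# clone_node and DRY_RUN are hypothetical names used only in this sketch.
clone_node() {
    local base=$1 node=$2 image_dir=$3
    local cmd=(sudo virt-clone --connect qemu:///system
               -o "$base" -n "$node" -f "${image_dir}/${node}.img")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "${cmd[*]}"      # dry run: show the command only
    else
        "${cmd[@]}"           # real run: execute virt-clone
    fi
}

# Print the command for c6; set DRY_RUN=0 to actually clone.
#   clone_node hadoop-base c6 /disk/sdb1
```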

5. Open virsh, the virtual machine management console.


~ sudo virsh

# list the virtual machines and their state
virsh # list --all
 Id    Name                           State
----------------------------------------------------
 5     server3                        running
 6     server4                        running
 7     d2                             running
 8     r1                             running
 9     server2                        running
 18    server5                        running
 48    c3                             running
 50    c1                             running
 52    c4                             running
 53    c2                             running
 55    c5                             running
 -     c6                             shut off
 -     d1                             shut off
 -     hadoop-base                    shut off
 -     u1210-base                     shut off


# edit the c6 VM and attach the disk partition /dev/sdb10
virsh # edit c6
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/sdb10'/>
<target dev='vdb' bus='virtio'/>
</disk>

# start c6
virsh # start c6
Domain c6 started

# open the c6 console
virsh # console c6

6. Inside c6, run the quick configuration script.


~ pwd
/home/cos

~ ls -l
drwxrwxr-x 2 cos cos 4096 Jul  9 23:50 hadoop
-rw-rw-r-- 1 cos cos 1404 Jul 11 16:50 quick.sh
drwxrwxr-x 7 cos cos 4096 Jul  9 23:31 toolkit

# set the parameters for this VM
~ vi quick.sh
export HOST=c6
export IP=192.168.1.35


# run the script with sudo
~ sudo sh ./quick.sh

====================hostname host============================
====================ip address============================
Rather than invoking init scripts through /etc/init.d, use the service(8)
utility, e.g. service networking restart

Since the script you are attempting to invoke has been converted to an
Upstart job, you may also use the stop(8) and then start(8) utilities,
e.g. stop networking ; start networking. The restart(8) utility is also available.
networking stop/waiting
networking start/running
====================dns============================
====================fdisk mount============================
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x8f02312d.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): Partition number (1-4, default 1): Using default value 1
First sector (2048-379580414, default 2048): Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-379580414, default 379580414): Using default value 379580414

Command (m for help): The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
mke2fs 1.42.5 (29-Jul-2012)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
11862016 inodes, 47447295 blocks
2372364 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
1448 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872

Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

==================== hadoop folder============================
====================ssh============================
Generating public/private rsa key pair.
Your identification has been saved in /home/cos/.ssh/id_rsa.
Your public key has been saved in /home/cos/.ssh/id_rsa.pub.
The key fingerprint is:
55:0d:3c:61:cc:53:e5:68:24:aa:33:18:3b:fc:08:75 cos@c6
The key's randomart image is:
+--[ RSA 2048]----+
|           +*=o..|
|           +*o.o |
|      o E o  oo .|
|     o = o   .   |
|    . = S        |
|     . + o       |
|      . .        |
|                 |
|                 |
+-----------------+


# leave the virtual machine
~ exit

7. Log in to the Hadoop master node, c1.wtmart.com.


~ ssh cos@c1.wtmart.com

# check the current cluster state: 5 Hadoop nodes are running normally
~ hadoop dfsadmin -report
Configured Capacity: 792662536192 (738.22 GB)
Present Capacity: 744482840576 (693.35 GB)
DFS Remaining: 744482676736 (693.35 GB)
DFS Used: 163840 (160 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 5 (5 total, 0 dead)

Name: 192.168.1.30:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 49152 (48 KB)
Non DFS Used: 15181529088 (14.14 GB)
DFS Remaining: 143351558144(133.51 GB)
DFS Used%: 0%
DFS Remaining%: 90.42%
Last contact: Fri Jul 12 23:16:09 CST 2013


Name: 192.168.1.31:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:16:08 CST 2013


Name: 192.168.1.32:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249581568 (7.68 GB)
DFS Remaining: 150283526144(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:16:08 CST 2013


Name: 192.168.1.34:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:16:09 CST 2013


Name: 192.168.1.33:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:16:08 CST 2013

8. Add c6 to the Hadoop cluster.


~ pwd
/home/cos

~ ls
-rw-rw-r-- 1 cos cos 1078 Jul 12 23:30 clone-node-add.sh
-rw-rw-r-- 1 cos cos  918 Jul 12 23:44 clone-node-del.sh
drwxrwxr-x 2 cos cos 4096 Jul  9 21:42 download
drwxr-xr-x 6 cos cos 4096 Jul  9 23:31 hadoop
drwxrwxr-x 7 cos cos 4096 Jul  9 23:31 toolkit

# edit the configuration parameters
~ vi clone-node-add.sh

# the new node, c6.wtmart.com
export NEW_NODE=c6.wtmart.com
# the full list of slave nodes
export SLAVES=c1.wtmart.com:c2.wtmart.com:c3.wtmart.com:c4.wtmart.com:c5.wtmart.com:c6.wtmart.com

# run the script as the current user
~ sh ./clone-node-add.sh
===============sync ssh=========================
Warning: Permanently added 'c6.wtmart.com,192.168.1.35' (ECDSA) to the list of known hosts.
scp c1.wtmart.com
scp c2.wtmart.com
scp c3.wtmart.com
scp c4.wtmart.com
scp c5.wtmart.com
scp c6.wtmart.com
===============sync hadoop slaves=========================
scp c1.wtmart.com
slaves                                                               100%   84     0.1KB/s   00:00
scp c2.wtmart.com
slaves                                                               100%   84     0.1KB/s   00:00
scp c3.wtmart.com
slaves                                                               100%   84     0.1KB/s   00:00
scp c4.wtmart.com
slaves                                                               100%   84     0.1KB/s   00:00
scp c5.wtmart.com
slaves                                                               100%   84     0.1KB/s   00:00
scp c6.wtmart.com
slaves                                                               100%   84     0.1KB/s   00:00
===============restart hadoop cluster=========================
Warning: $HADOOP_HOME is deprecated.

stopping jobtracker
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c5.wtmart.com:
c1.wtmart.com: stopping tasktracker
c5.wtmart.com: stopping tasktracker
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c4.wtmart.com: stopping tasktracker
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c3.wtmart.com: stopping tasktracker
c2.wtmart.com: stopping tasktracker
c6.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c6.wtmart.com:
c6.wtmart.com: no tasktracker to stop
stopping namenode
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: stopping datanode
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c5.wtmart.com:
c6.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c6.wtmart.com:
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c6.wtmart.com: no datanode to stop
c5.wtmart.com: stopping datanode
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c2.wtmart.com: stopping datanode
c3.wtmart.com: stopping datanode
c4.wtmart.com: stopping datanode
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: stopping secondarynamenode
Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-namenode-c1.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c5.wtmart.com:
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c1.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c1.out
c6.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c6.wtmart.com:
c5.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c5.out
c2.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c2.out
c6.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c6.out
c3.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c3.out
c4.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c4.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting secondarynamenode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-secondarynamenode-c1.out
starting jobtracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-jobtracker-c1.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c1.out
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c5.wtmart.com:
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c6.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c6.wtmart.com:
c3.wtmart.com:
c5.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c5.out
c2.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c2.out
c6.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c6.out
c4.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c4.out
c3.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c3.out

9. Check the Hadoop nodes: c6 has joined the cluster.


# list the Java processes
~ jps
12019 TaskTracker
11763 SecondaryNameNode
12098 Jps
11878 JobTracker
11633 DataNode
11499 NameNode

# list the Hadoop nodes
~ hadoop dfsadmin -report
Configured Capacity: 983957319680 (916.38 GB)
Present Capacity: 925863960576 (862.28 GB)
DFS Remaining: 925863768064 (862.28 GB)
DFS Used: 192512 (188 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 6 (6 total, 0 dead)

Name: 192.168.1.30:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 49152 (48 KB)
Non DFS Used: 15181541376 (14.14 GB)
DFS Remaining: 143351545856(133.51 GB)
DFS Used%: 0%
DFS Remaining%: 90.42%
Last contact: Fri Jul 12 23:27:01 CST 2013


Name: 192.168.1.31:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:27:00 CST 2013


Name: 192.168.1.32:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249581568 (7.68 GB)
DFS Remaining: 150283526144(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:27:00 CST 2013


Name: 192.168.1.34:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:27:00 CST 2013


Name: 192.168.1.33:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:27:00 CST 2013


Name: 192.168.1.35:50010
Decommission Status : Normal
Configured Capacity: 191294783488 (178.16 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 9913651200 (9.23 GB)
DFS Remaining: 181381103616(168.92 GB)
DFS Used%: 0%
DFS Remaining%: 94.82%
Last contact: Fri Jul 12 23:27:02 CST 2013
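
The report is long, and the line that matters for this check is "Datanodes available". As a small sketch, that number can be pulled out of the report text with sed. The helper name datanode_count is hypothetical; it only reads text on stdin and does not call Hadoop itself.

```shell
#!/bin/bash
# Read `hadoop dfsadmin -report` output on stdin and print the number of
# available datanodes. datanode_count is a hypothetical helper name.
datanode_count() {
    sed -n 's/^Datanodes available: \([0-9][0-9]*\) .*/\1/p'
}

# Possible use on the master node (assumes hadoop is on the PATH):
#   hadoop dfsadmin -report | datanode_count
```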

2. Remove the c6 node

1. Log in to the Hadoop master node, c1.wtmart.com.


~ ssh cos@c1.wtmart.com

~ pwd
/home/cos

~ ls -l
-rw-rw-r-- 1 cos cos 1078 Jul 12 23:30 clone-node-add.sh
-rw-rw-r-- 1 cos cos  918 Jul 12 23:44 clone-node-del.sh
drwxrwxr-x 2 cos cos 4096 Jul  9 21:42 download
drwxr-xr-x 6 cos cos 4096 Jul  9 23:31 hadoop
drwxrwxr-x 7 cos cos 4096 Jul  9 23:31 toolkit

2. Edit the configuration script and run it.


~ vi clone-node-del.sh
export DEL_NODE=c6
export SLAVES=c1.wtmart.com:c2.wtmart.com:c3.wtmart.com:c4.wtmart.com:c5.wtmart.com

# run the script as the current user
~ sh ./clone-node-del.sh
===============sync ssh=========================
scp c1.wtmart.com
scp c2.wtmart.com
scp c3.wtmart.com
scp c4.wtmart.com
scp c5.wtmart.com
===============sync hadoop slaves=========================
scp c1.wtmart.com
slaves                                                               100%   70     0.1KB/s   00:00
scp c2.wtmart.com
slaves                                                               100%   70     0.1KB/s   00:00
scp c3.wtmart.com
slaves                                                               100%   70     0.1KB/s   00:00
scp c4.wtmart.com
slaves                                                               100%   70     0.1KB/s   00:00
scp c5.wtmart.com
slaves                                                               100%   70     0.1KB/s   00:00
===============restart hadoop cluster=========================
Warning: $HADOOP_HOME is deprecated.

stopping jobtracker
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: stopping tasktracker
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c2.wtmart.com: stopping tasktracker
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c5.wtmart.com:
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c5.wtmart.com: stopping tasktracker
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c3.wtmart.com: stopping tasktracker
c4.wtmart.com: stopping tasktracker
stopping namenode
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: stopping datanode
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c5.wtmart.com:
c4.wtmart.com: stopping datanode
c5.wtmart.com: stopping datanode
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c3.wtmart.com: stopping datanode
c2.wtmart.com: stopping datanode
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: stopping secondarynamenode
Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-namenode-c1.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c5.wtmart.com:
c3.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c3.out
c1.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c1.out
c5.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c5.out
c2.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c2.out
c4.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c4.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting secondarynamenode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-secondarynamenode-c1.out
starting jobtracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-jobtracker-c1.out
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c5.wtmart.com:
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c5.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c5.out
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c4.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c4.out
c3.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c3.out
c2.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c2.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c1.out

3. Check the Hadoop nodes: c6 has been removed.


~ hadoop dfsadmin -report
Warning: $HADOOP_HOME is deprecated.

Safe mode is ON
Configured Capacity: 792662536192 (738.22 GB)
Present Capacity: 744482836480 (693.35 GB)
DFS Remaining: 744482672640 (693.35 GB)
DFS Used: 163840 (160 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 5 (5 total, 0 dead)

Name: 192.168.1.30:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 49152 (48 KB)
Non DFS Used: 15181533184 (14.14 GB)
DFS Remaining: 143351554048(133.51 GB)
DFS Used%: 0%
DFS Remaining%: 90.42%
Last contact: Fri Jul 12 23:45:29 CST 2013


Name: 192.168.1.31:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:45:29 CST 2013


Name: 192.168.1.32:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249581568 (7.68 GB)
DFS Remaining: 150283526144(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:45:29 CST 2013


Name: 192.168.1.34:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:45:32 CST 2013


Name: 192.168.1.33:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:45:32 CST 2013
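
The report above begins with "Safe mode is ON", which is normal right after the NameNode restarts; HDFS is read-only until it leaves safe mode. A minimal sketch for detecting that state from the report text follows. The helper name safe_mode_on is hypothetical.

```shell
#!/bin/bash
# Succeed if the report text on stdin says the NameNode is in safe mode.
# safe_mode_on is a hypothetical helper name.
safe_mode_on() {
    grep -q '^Safe mode is ON'
}

# On a live Hadoop 1.x cluster the state can also be queried directly:
#   hadoop dfsadmin -safemode get     # prints the current state
#   hadoop dfsadmin -safemode wait    # blocks until safe mode is off
```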

4. Log in to the host and delete the c6 virtual machine.


~ ssh cos@host.wtmart.com
~ sudo virsh

# force-stop the c6 VM
virsh # destroy c6
Domain c6 destroyed

# remove the c6 VM definition
virsh # undefine c6
Domain c6 has been undefined

# list the VMs again; c6 is gone
virsh # list --all
 Id    Name                           State
----------------------------------------------------
 5     server3                        running
 6     server4                        running
 7     d2                             running
 8     r1                             running
 9     server2                        running
 18    server5                        running
 48    c3                             running
 50    c1                             running
 52    c4                             running
 53    c2                             running
 55    c5                             running
 -     d1                             shut off
 -     hadoop-base                    shut off
 -     u1210-base                     shut off

5. Delete the c6 image file from the physical disk.


~ cd /disk/sdb1
~ sudo rm c6.img

The virtual node c6 is now fully removed.
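
Steps 4 and 5 are destructive, so it can help to print the exact commands for a node before typing them. The sketch below does only that: remove_node_plan is a hypothetical helper that prints the three commands and refuses the two template VMs seen in the list above.

```shell
#!/bin/bash
# Print the commands that remove a cloned VM and its disk image.
# remove_node_plan is a hypothetical helper; it prints and never executes.
remove_node_plan() {
    local node=$1 image_dir=$2
    case "$node" in
        hadoop-base|u1210-base)
            # These are the clone templates; removing them would block new clones.
            echo "refusing to remove template ${node}" >&2
            return 1 ;;
    esac
    echo "sudo virsh destroy ${node}"      # force the VM off
    echo "sudo virsh undefine ${node}"     # drop the VM definition
    echo "sudo rm ${image_dir}/${node}.img"
}

# Show the plan for c6:
#   remove_node_plan c6 /disk/sdb1
```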

3. Implementation scripts

quick.sh


~ vi quick.sh

#!/bin/bash
export HOST=c5
export DOMAIN=$HOST.wtmart.com
export IP=192.168.1.34
export DNS=192.168.1.7

#1. hostname host
echo "====================hostname host============================"
hostname $HOST
echo $HOST >  /etc/hostname
sed -i -e "/127.0.0.1/d" /etc/hosts
sed -i -e 1"i\127.0.0.1 localhost ${HOST}" /etc/hosts

#2. ip
echo "====================ip address============================"
sed -i -e "/address/d;/^iface eth0 inet static/a\address ${IP}" /etc/network/interfaces
/etc/init.d/networking restart

#3. dns
echo "====================dns============================"
echo "nameserver ${DNS}" > /etc/resolv.conf

#4. fdisk mount
echo "====================fdisk mount============================"
(echo n; echo p; echo ; echo ; echo ; echo w) | fdisk /dev/vdb
mkfs -t ext4 /dev/vdb1
mount /dev/vdb1 /home/cos/hadoop
echo "/dev/vdb1 /home/cos/hadoop ext4 defaults 0 0 " >> /etc/fstab

#5. hadoop folder
echo "==================== hadoop folder============================"
mkdir /home/cos/hadoop/data
mkdir /home/cos/hadoop/tmp
chown -R cos:cos /home/cos/hadoop/
chmod 755 /home/cos/hadoop/data
chmod 755 /home/cos/hadoop/tmp

#6. ssh
echo "====================ssh============================"
rm -rf /home/cos/.ssh/*
sudo -u cos ssh-keygen -t rsa -N "" -f /home/cos/.ssh/id_rsa
sudo cat /home/cos/.ssh/id_rsa.pub >> /home/cos/.ssh/authorized_keys
chown -R cos:cos /home/cos/.ssh/
exit
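
One thing to watch in quick.sh: the fstab line is appended with a plain `>>`, so running the script a second time on the same VM would add a duplicate entry. A minimal sketch of an append that is safe to repeat is shown here. The helper name append_once is hypothetical.

```shell
#!/bin/bash
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper name.
append_once() {
    local line=$1 file=$2
    # -x: match the whole line, -F: treat the line as a fixed string.
    grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Possible replacement for the plain append in step 4 of quick.sh:
#   append_once "/dev/vdb1 /home/cos/hadoop ext4 defaults 0 0" /etc/fstab
```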

clone-node-add.sh


~ vi clone-node-add.sh

#!/bin/bash
export NEW_NODE=c6.wtmart.com
export PASS=cos
export SLAVES=c1.wtmart.com:c2.wtmart.com:c3.wtmart.com:c4.wtmart.com:c5.wtmart.com:c6.wtmart.com
IFS=:

#sudo apt-get install sshpass
#1. sync ssh
echo "===============sync ssh========================="
sshpass -p ${PASS} scp -o StrictHostKeyChecking=no cos@${NEW_NODE}:/home/cos/.ssh/authorized_keys .
cat authorized_keys >> /home/cos/.ssh/authorized_keys
rm authorized_keys

for SLAVE in $SLAVES
do
        echo scp $SLAVE
        sshpass -p ${PASS} scp /home/cos/.ssh/authorized_keys cos@$SLAVE:/home/cos/.ssh/authorized_keys
done

#2. sync hadoop slaves
echo "===============sync hadoop slaves========================="
rm /home/cos/toolkit/hadoop-1.0.3/conf/slaves
for SLAVE in $SLAVES
do
   echo $SLAVE >> /home/cos/toolkit/hadoop-1.0.3/conf/slaves
done

for SLAVE in $SLAVES
do
        echo scp $SLAVE
        scp /home/cos/toolkit/hadoop-1.0.3/conf/slaves cos@$SLAVE:/home/cos/toolkit/hadoop-1.0.3/conf/slaves
done

#3. restart hadoop cluster
echo "===============restart hadoop cluster========================="
stop-all.sh
start-all.sh
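
Step 1 of clone-node-add.sh appends the new node's keys to authorized_keys with `cat >>`, so running the script twice for the same node leaves duplicate lines in the file. The sketch below merges keys while dropping exact duplicates. The helper name merge_keys is hypothetical.

```shell
#!/bin/bash
# Merge the public keys in NEW_KEYS into TARGET, dropping exact duplicates
# and blank lines while keeping the original order.
# merge_keys is a hypothetical helper name.
merge_keys() {
    local new_keys=$1 target=$2 tmp
    tmp=$(mktemp)
    # awk prints a non-empty line only the first time it is seen.
    cat "$target" "$new_keys" | awk 'NF && !seen[$0]++' > "$tmp"
    cat "$tmp" > "$target"    # overwrite in place to keep the owner and mode
    rm -f "$tmp"
}

# Possible use in place of the `cat authorized_keys >> ...` line:
#   merge_keys authorized_keys /home/cos/.ssh/authorized_keys
```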

clone-node-del.sh


~ vi clone-node-del.sh

#!/bin/bash
export DEL_NODE=c6
export PASS=cos
export SLAVES=c1.wtmart.com:c2.wtmart.com:c3.wtmart.com:c4.wtmart.com:c5.wtmart.com
IFS=:

#0 stop
stop-all.sh

#1. sync ssh
echo "===============sync ssh========================="
sed -i "/cos@${DEL_NODE}/d" /home/cos/.ssh/authorized_keys

for SLAVE in $SLAVES
do
        echo scp $SLAVE
        sshpass -p ${PASS} scp /home/cos/.ssh/authorized_keys cos@$SLAVE:/home/cos/.ssh/authorized_keys
done

#2. sync hadoop slaves
echo "===============sync hadoop slaves========================="
rm /home/cos/toolkit/hadoop-1.0.3/conf/slaves
for SLAVE in $SLAVES
do
   echo $SLAVE >> /home/cos/toolkit/hadoop-1.0.3/conf/slaves
done

for SLAVE in $SLAVES
do
        echo scp $SLAVE
        scp /home/cos/toolkit/hadoop-1.0.3/conf/slaves cos@$SLAVE:/home/cos/toolkit/hadoop-1.0.3/conf/slaves
done

#3. restart hadoop cluster
echo "===============restart hadoop cluster========================="
start-all.sh
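
A caution about step 1 of clone-node-del.sh: the pattern `cos@${DEL_NODE}` matches any line that contains that text, so with DEL_NODE=c6 it would also delete the keys of a node named c60. As a sketch, anchoring the match at the end of the line limits it to one node. The helper name remove_node_key is hypothetical, and `sed -i` is used as in the original script (GNU sed).

```shell
#!/bin/bash
# Delete only the keys whose trailing comment is exactly USER@NODE.
# remove_node_key is a hypothetical helper name.
remove_node_key() {
    local node=$1 file=$2 user=${3:-cos}
    # The "$" anchor keeps c6 from also matching c60, c61, and so on.
    sed -i "/ ${user}@${node}\$/d" "$file"
}

# Possible use in place of the sed line in clone-node-del.sh:
#   remove_node_key c6 /home/cos/.ssh/authorized_keys
```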

These scripts let maintainers add and remove Hadoop nodes quickly.


Running Hadoop in the Cloud series: Optimizing Virtual Machine Cloning, Part 1: Installation and Configuration


Please credit the source when reposting:

http://blog.fens.me/hadoop-clone-improve


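Preface

Once the virtualized Hadoop environment is up, the next question is how to optimize it. From an operations point of view I see four starting points: installation, configuration, monitoring, and management.
The goal is for one person to manage 1000 nodes, and every small optimization is a building block toward that.

I am working on it...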

 

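Contents

  1. A brief analysis of system optimization
  2. Optimization problem 1: c1, as the clone source, must be shut down for every clone
  3. Optimization problem 2: too many manual steps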

 

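1. A brief analysis of system optimization (10 nodes)

Above we named four starting points for optimization from an operations point of view: installation, configuration, monitoring, and management.

The system currently runs 2 nodes. How can we grow it, step by step, to 10 nodes without much effort?
Note: if we aim for 1000 nodes from the start, we will lose our direction. Readers who already know how to run 1000 nodes can skip this article!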

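Installation
In short, installation should be simple, ideally a single command or a single script. In our virtual environment, installing a Hadoop node really just means creating a new virtual machine, which takes one command.

With the current layout, however, c1 is the clone source and must be shut down for every clone, which means the whole Hadoop environment goes down. That is not what we want, so we will discuss how to improve it.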

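Configuration
After a cloned Hadoop node is created, its hostname, IP address, DNS settings, mounted disk, and so on are all copied from the source VM. These settings must differ on every node, so they have to be changed by hand.

We should therefore write a script that changes these settings automatically. This cuts down on manual edits and complexity, and ensures that the new node joins the existing Hadoop cluster smoothly.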
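
As a sketch of how such a script might avoid hand-edited values altogether, the IP address could be derived from the node name. This rests on an assumption, not on anything the series states: that node cN always uses 192.168.1.(29+N), which is the pattern seen for c4 (.33), c5 (.34), and c6 (.35). The helper name node_ip is hypothetical.

```shell
#!/bin/bash
# Derive a node's IP address from its name.
# Assumption: node cN uses 192.168.1.(29+N), the pattern seen for c4 to c6.
# node_ip is a hypothetical helper name.
node_ip() {
    local host=$1 n
    n=${host#c}                          # strip the leading "c"
    case "$n" in
        ''|*[!0-9]*)                     # reject names that are not c<number>
            echo "unexpected node name: ${host}" >&2
            return 1 ;;
    esac
    echo "192.168.1.$((29 + 10#$n))"     # 10# forces base 10 for values like 08
}

# Possible use at the top of a configuration script:
#   export HOST=c6
#   export IP=$(node_ip "$HOST")
```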

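Monitoring
We use KVM virtual machines, so the state of each node can be viewed directly from the VM management console on the host. That information is not enough, of course. We also need to install other system monitoring tools and various Hadoop monitoring tools.

Monitoring will be covered in later articles.

Management
To make the whole Hadoop environment easier to use, we could manage it with OpenStack, which would be a better solution.

This article does not cover that topic; it will be covered in later articles.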

2. Optimization problem 1: c1, as the clone source, must be shut down for every clone


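c1 is the NameNode of the Hadoop cluster and should not be stopped, so we build a new clone source named hadoop-base and create clones from it.

First stop Hadoop, then clone a new source VM, hadoop-base, using the method from the article "Running Hadoop in the Cloud series: Adding Hadoop Nodes by Cloning Virtual Machines".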


~ sudo virt-clone --connect qemu:///system -o c1 -n hadoop-base -f /disk/sdb1/hadoop-base.img

List the virtual machines:


virsh # list --all
 Id    Name                           State
----------------------------------------------------
 5     server3                        running
 6     server4                        running
 7     d2                             running
 8     r1                             running
 9     server2                        running
 18    server5                        running
 48    c3                             running
 50    c1                             running
 -     c2                             shut off
 -     d1                             shut off
 -     hadoop-base                    shut off
 -     u1210-base                     shut off

Restart the Hadoop cluster on the two nodes c1 and c3.

Check the Hadoop nodes from c1:


~ hadoop dfsadmin -report
Warning: $HADOOP_HOME is deprecated.

Safe mode is ON
Configured Capacity: 317066272768 (295.29 GB)
Present Capacity: 293635162112 (273.47 GB)
DFS Remaining: 293635084288 (273.47 GB)
DFS Used: 77824 (76 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Name: 192.168.1.30:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 49152 (48 KB)
Non DFS Used: 15181529088 (14.14 GB)
DFS Remaining: 143351558144(133.51 GB)
DFS Used%: 0%
DFS Remaining%: 90.42%
Last contact: Thu Jul 11 13:00:50 CST 2013

Name: 192.168.1.32:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249581568 (7.68 GB)
DFS Remaining: 150283526144(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Thu Jul 11 13:00:52 CST 2013

Create the clone c4 from the new source VM hadoop-base:


~ sudo virt-clone --connect qemu:///system -o hadoop-base -n c4 -f /disk/sdb1/c4.img

# attach the disk partition /dev/sdb8
~ sudo virsh
virsh # edit c4
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/sdb8'/>
<target dev='vdb' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>

# start c4
virsh # start c4

# open the c4 console
virsh # console c4

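The clone source has now changed from c1 to hadoop-base, so c1 no longer needs to be shut down.

3. Optimization problem 2: too many manual steps

We split the manual work into two scripts:

  • Configure the newly cloned virtual machine
  • Add the clone to the cluster

Configuring the newly cloned virtual machine
After logging in to c4, use a script in place of the manual configuration.
The script changes the configuration in the following 6 steps: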

  1. hostname host
  2. ip
  3. dns
  4. fdisk mount
  5. hadoop folder
  6. ssh

For an explanation of these 6 steps, see: Running Hadoop in the Cloud series: Adding Hadoop Nodes by Cloning Virtual Machines


#!/bin/bash
export HOST=c4
export DOMAIN=$HOST.wtmart.com
export IP=192.168.1.33
export DNS=192.168.1.7

#1. hostname host
echo "====================hostname host============================"
hostname $HOST
echo $HOST >  /etc/hostname
sed -i -e "/127.0.0.1/d" /etc/hosts
sed -i -e 1"i\127.0.0.1 localhost ${HOST}" /etc/hosts

#2. ip
echo "====================ip address============================"
sed -i -e "/address/d;/^iface eth0 inet static/a\address ${IP}" /etc/network/interfaces
/etc/init.d/networking restart

#3. dns
echo "====================dns============================"
echo "nameserver ${DNS}" > /etc/resolv.conf

#4. fdisk mount
echo "====================fdisk mount============================"
(echo n; echo p; echo ; echo ; echo ; echo w) | fdisk /dev/vdb
mkfs -t ext4 /dev/vdb1
mount /dev/vdb1 /home/cos/hadoop
echo "/dev/vdb1 /home/cos/hadoop ext4 defaults 0 0 " >> /etc/fstab

#5. hadoop folder
echo "==================== hadoop folder============================"
mkdir /home/cos/hadoop/data
mkdir /home/cos/hadoop/tmp
chown -R cos:cos /home/cos/hadoop/
chmod 755 /home/cos/hadoop/data
chmod 755 /home/cos/hadoop/tmp

#6. ssh
echo "====================ssh============================"
rm -rf /home/cos/.ssh/*
sudo -u cos ssh-keygen -t rsa -N "" -f /home/cos/.ssh/id_rsa 
# Run the redirection as cos too, so authorized_keys is not created by root
sudo -u cos sh -c 'cat /home/cos/.ssh/id_rsa.pub >> /home/cos/.ssh/authorized_keys'

exit
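The script above edits system files in place, so it can be worth rehearsing a single step on a copy first. The sketch below wraps the static IP step in a function that takes the file as a parameter. It assumes GNU sed; the function name set_static_ip and the copy path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: replace the address line of the static eth0 stanza.
# $1 = interfaces file (pass a copy while testing), $2 = new IPv4 address
set_static_ip() {
    # Drop any existing "address" line, then append the new one
    # right after the "iface eth0 inet static" line (GNU sed one-line "a").
    sed -i -e "/^[[:space:]]*address /d" \
           -e "/^iface eth0 inet static/a address $2" "$1"
}

# Example (not executed here):
#   cp /etc/network/interfaces /tmp/interfaces.test
#   set_static_ip /tmp/interfaces.test 192.168.1.33
```

Anchoring the delete pattern to the start of the line is a small deviation from the script above, whose /address/ pattern would also remove any comment line that mentions the word.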

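Adding the clone to the cluster
Return to node c1 and use a script to bring c4 into the cluster.

The script performs the following 3 steps:

  1. Sync the ssh public keys
  2. Sync the hadoop slaves file
  3. Add c4 to the cluster

The script below uses sshpass, so install it first: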


sudo apt-get install sshpass

Script:


#!/bin/bash
export NEW_NODE=c4.wtmart.com
export PASS=cos
export SLAVES=c1.wtmart.com:c3.wtmart.com:c4.wtmart.com
IFS=:

#1. sync ssh
sshpass -p ${PASS} scp -o StrictHostKeyChecking=no cos@${NEW_NODE}:/home/cos/.ssh/authorized_keys .
cat authorized_keys >> /home/cos/.ssh/authorized_keys

for SLAVE in $SLAVES
do
	echo scp $SLAVE
	sshpass -p ${PASS} scp /home/cos/.ssh/authorized_keys cos@$SLAVE:/home/cos/.ssh/authorized_keys
done

#2. sync hadoop slaves

rm /home/cos/toolkit/hadoop-1.0.3/conf/slaves
for SLAVE in $SLAVES
do
   echo $SLAVE >> /home/cos/toolkit/hadoop-1.0.3/conf/slaves
done

for SLAVE in $SLAVES
do
	echo scp $SLAVE
	scp /home/cos/toolkit/hadoop-1.0.3/conf/slaves cos@$SLAVE:/home/cos/toolkit/hadoop-1.0.3/conf/slaves
done

#3. restart hadoop cluster
stop-all.sh
start-all.sh
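The script sets IFS=: for the whole shell in order to loop over the colon-separated host list. As an alternative sketch, the slaves file can be produced without touching IFS at all. The function name write_slaves is an assumption made for this example.

```shell
#!/bin/sh
# Hypothetical helper: write one host per line from a colon-separated list.
# $1 = colon-separated host list, $2 = output file (overwritten)
write_slaves() {
    printf '%s\n' "$1" | tr ':' '\n' > "$2"
}

# Example (not executed here):
#   write_slaves "c1.wtmart.com:c3.wtmart.com:c4.wtmart.com" \
#       /home/cos/toolkit/hadoop-1.0.3/conf/slaves
```

Because the file is overwritten in one step, the separate rm of the old slaves file is no longer needed in this variant.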

After restarting the hadoop cluster, the new node c4 appears in the report:


~ hadoop dfsadmin -report

Safe mode is ON
Configured Capacity: 475598360576 (442.94 GB)
Present Capacity: 443917721600 (413.43 GB)
DFS Remaining: 443917615104 (413.43 GB)
DFS Used: 106496 (104 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)

Name: 192.168.1.30:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 49152 (48 KB)
Non DFS Used: 15181529088 (14.14 GB)
DFS Remaining: 143351558144(133.51 GB)
DFS Used%: 0%
DFS Remaining%: 90.42%
Last contact: Thu Jul 11 15:55:57 CST 2013

Name: 192.168.1.32:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249581568 (7.68 GB)
DFS Remaining: 150283526144(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Thu Jul 11 15:55:56 CST 2013

Name: 192.168.1.33:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Thu Jul 11 15:55:58 CST 2013

With these 2 scripts, creating 10 to 20 nodes takes little more than the time needed to copy the image files.
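When nodes are added by script, reading the report by eye does not scale well. A rough sketch of extracting the datanode counts from the report follows; it assumes the "Datanodes available" line keeps the format shown above, and the function name datanode_count is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a dfsadmin report on stdin and
# print "<available> <dead>" taken from the "Datanodes available" line.
datanode_count() {
    sed -n 's/^Datanodes available: \([0-9][0-9]*\) ([0-9][0-9]* total, \([0-9][0-9]*\) dead).*$/\1 \2/p'
}

# Example (not executed here):
#   hadoop dfsadmin -report | datanode_count
```

A wrapper script could compare the first number with the number of lines in the slaves file and warn when they differ.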

We will keep digging into further optimizations.

Please credit the source when reposting:

http://blog.fens.me/hadoop-clone-improve


Running Hadoop in the Cloud series: Adding Hadoop Nodes by Cloning Virtual Machines

About the author:

  • Zhang Dan (Conan), programmer: Java, R, PHP, Javascript
  • weibo:@Conan_Z
  • blog: http://blog.fens.me
  • email: bsspirit@gmail.com

Please credit the source when reposting:

http://blog.fens.me/hadoop-clone-node/

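Preface

Virtualization makes it easy to add or remove a virtual machine. Hadoop is complex to install, configure, operate, and manage, so lowering its operating cost through virtualization is a welcome prospect. Imagine one person managing 1000 hadoop nodes: even a small company could then build a computing cluster as powerful as those of Baidu or Alibaba.

This article does not explain how one person can manage 1000 hadoop nodes. It does describe one building block: adding Hadoop nodes by cloning virtual machines. Readers may be able to build on it in practice toward a setup where one person operates a 1000-node cluster.

Contents

  1. System environment
  2. Cloning a virtual machine
  3. Completing a 2-node hadoop cluster

1. System environment

This article continues with the system environment from the previous one, Running Hadoop in the Cloud series: Creating the Hadoop Template Virtual Machine.

We have already created the Hadoop template virtual machine c1. Next, we use cloning to create 4 clones: c2, c3, c4, and c5.

2. Cloning a virtual machine

On the host, open the virsh management shell and check the state of c1.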


~ sudo virsh

virsh # list
 Id    Name                           State
----------------------------------------------------
 5     server3                        running
 6     server4                        running
 7     d2                             running
 8     r1                             running
 9     server2                        running
 18    server5                        running
 42    c1                             running

c1 is running. Since c2 was created earlier, we use c3 as the example.

Create clone c3


~ sudo virt-clone --connect qemu:///system -o c1 -n c3 -f /disk/sdb1/c3.img
ERROR    Domain with devices to clone must be paused or shutoff.
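The error above shows that virt-clone refuses to clone a running domain. A small guard, sketched below, could check the state before starting a clone. It assumes the state strings printed by virsh domstate ("running", "paused", "shut off"); the function name can_clone is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given domain state allows cloning.
# $1 = state string as printed by `virsh domstate <domain>`
can_clone() {
    case "$1" in
        "shut off"|paused) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (not executed here):
#   can_clone "$(sudo virsh domstate c1)" || echo "pause or shut down c1 first"
```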

Shut down c1 (virsh destroy forces an immediate power-off) and clone again


virsh # destroy c1
Domain c1 destroyed

~ sudo virt-clone --connect qemu:///system -o c1 -n c3 -f /disk/sdb1/c3.img
ERROR    A disk path must be specified to clone '/dev/sdb5'

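The error is caused by the partition disk attached to c1. (Even the mighty Google turned up no explanation of this error.)

Next steps:

  1. Restart c1 and comment out the /etc/fstab entry that auto-mounts /dev/vdb1 (left to the reader)
  2. Detach the partition disk /dev/sdb5 assigned to c1

virsh # edit c1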

<!-- 
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/sdb5'/>
<target dev='vdb' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>
-->
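Before cloning, it may help to list any host block devices a domain still references, since those are what trigger the "disk path must be specified" error here. The sketch below scans domain XML for source elements with a dev attribute. It is a rough text match rather than a real XML parser, and the function name block_disk_sources is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: read domain XML on stdin and print every
# path found in a <source dev='...'/> element, one per line.
block_disk_sources() {
    sed -n "s/.*<source dev='\([^']*\)'.*/\1/p"
}

# Example (not executed here):
#   sudo virsh dumpxml c1 | block_disk_sources
```

An empty result suggests that only file-backed disks remain. Note that the match would also pick up other elements that use a dev attribute on source, so the output should be read as a hint, not a definitive list.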

Create clone c3 again


~ sudo virt-clone --connect qemu:///system -o c1 -n c3 -f /disk/sdb1/c3.img
Cloning c1.img                             1% [                               ]  47 MB/s | 426 MB     14:14 ETA
Cloning c1.img                                                                           |  40 GB     08:18

Clone 'c3' created successfully.

Attach the partition disk /dev/sdb7 to clone c3 and start c3


virsh # edit c3

<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/sdb7'/>
<target dev='vdb' bus='virtio'/>
</disk>

virsh # start c3

virsh # console c3
Connected to domain c3
Escape character is ^]

Ubuntu 12.10 c1 ttyS0

c1 login: cos
Password:
Last login: Wed Jul 10 12:00:44 CST 2013 from 192.168.1.79 on pts/0
Welcome to Ubuntu 12.10 (GNU/Linux 3.5.0-32-generic x86_64)

 * Documentation:  https://help.ubuntu.com/
New release '13.04' available.
Run 'do-release-upgrade' to upgrade to it.

cos@c1:~$

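Several things need to be changed on clone c3:

  1. hostname, hosts
  2. Static IP
  3. DNS
  4. Auto-mount /dev/vdb1
  5. Permissions of the hadoop storage directories
  6. Passwordless ssh authentication
  7. hadoop masters and slaves

1. hostname, hosts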


~ sudo hostname c3
~ sudo vi /etc/hostname
c3

~ sudo vi /etc/hosts
127.0.1.1       c3
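The hosts edit above is done interactively in vi. A non-interactive sketch of the same change is shown here. It assumes GNU sed and the Ubuntu convention of a 127.0.1.1 line for the machine's own name; the function name set_hosts_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: point the 127.0.1.1 entry of a hosts file at a new name.
# $1 = hosts file (pass a copy while testing), $2 = new hostname
set_hosts_name() {
    if grep -q '^127\.0\.1\.1[[:space:]]' "$1"; then
        # Rewrite the existing entry in place
        sed -i -e "s/^127\.0\.1\.1[[:space:]].*/127.0.1.1       $2/" "$1"
    else
        # No entry yet: append one
        printf '127.0.1.1       %s\n' "$2" >> "$1"
    fi
}

# Example (not executed here):
#   sudo cp /etc/hosts /tmp/hosts.test && set_hosts_name /tmp/hosts.test c3
```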

2. Static IP


~ sudo vi /etc/network/interfaces
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
#iface eth0 inet dhcp
iface eth0 inet static
address 192.168.1.32
netmask 255.255.255.0
gateway 192.168.1.1

After changing the IP, reboot and connect over ssh.


~ ssh cos@c3.wtmart.com

~ ifconfig
eth0      Link encap:Ethernet  HWaddr 52:54:00:06:b2:3a
          inet addr:192.168.1.32  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::5054:ff:fe06:b23a/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:42 errors:0 dropped:0 overruns:0 frame:0
          TX packets:33 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6084 (6.0 KB)  TX bytes:4641 (4.6 KB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:34 errors:0 dropped:0 overruns:0 frame:0
          TX packets:34 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2560 (2.5 KB)  TX bytes:2560 (2.5 KB)

3. DNS (internal DNS)


~ sudo vi /etc/resolv.conf
nameserver 192.168.1.7

~ ping c1.wtmart.com
PING c1.wtmart.com (192.168.1.30) 56(84) bytes of data.
64 bytes from 192.168.1.30: icmp_req=1 ttl=64 time=0.485 ms
64 bytes from 192.168.1.30: icmp_req=2 ttl=64 time=0.552 ms

4. Auto-mount /dev/vdb1


~ sudo mount /dev/vdb1 /home/cos/hadoop
~ sudo vi /etc/fstab
/dev/vdb1        /home/cos/hadoop      ext4    defaults 0       0

~ df -h
Filesystem              Size  Used Avail Use% Mounted on
/dev/mapper/u1210-root   36G  2.4G   32G   7% /
udev                    2.0G  4.0K  2.0G   1% /dev
tmpfs                   791M  224K  791M   1% /run
none                    5.0M     0  5.0M   0% /run/lock
none                    2.0G     0  2.0G   0% /run/shm
none                    100M     0  100M   0% /run/user
/dev/vda1               228M   29M  188M  14% /boot
/dev/vdb                148G  188M  140G   1% /home/cos/hadoop
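If the fstab step is ever scripted and run more than once, a plain append would add duplicate entries. The sketch below appends the line only when the device is not listed yet. The function name ensure_fstab_entry is an assumption, and the ext4 and defaults fields are copied from the entry above.

```shell
#!/bin/sh
# Hypothetical helper: add an ext4 fstab entry unless the device is already listed.
# $1 = fstab file (pass a copy while testing), $2 = device, $3 = mount point
ensure_fstab_entry() {
    # awk exits 0 only if some line has the device as its first field
    if ! awk -v dev="$2" '$1 == dev { found = 1 } END { exit !found }' "$1"; then
        printf '%s %s ext4 defaults 0 0\n' "$2" "$3" >> "$1"
    fi
}

# Example (not executed here):
#   sudo cp /etc/fstab /tmp/fstab.test
#   ensure_fstab_entry /tmp/fstab.test /dev/vdb1 /home/cos/hadoop
```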

5. Permissions of the hadoop storage directories


~ sudo chown -R cos:cos hadoop/

~ mkdir /home/cos/hadoop/data
~ mkdir /home/cos/hadoop/tmp

~ sudo chmod 755 /home/cos/hadoop/data
~ sudo chmod 755 /home/cos/hadoop/tmp
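The mkdir commands above fail if the directories already exist, which matters once the step is rerun from a script. A rerunnable sketch of the same step follows; the function name make_hadoop_dirs is hypothetical, and the directory names data and tmp are taken from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: create the data and tmp directories with mode 755.
# $1 = base directory (the article uses /home/cos/hadoop)
make_hadoop_dirs() {
    for d in data tmp; do
        # mkdir -p does not fail when the directory already exists
        mkdir -p "$1/$d" && chmod 755 "$1/$d" || return 1
    done
}

# Example (not executed here):
#   make_hadoop_dirs /home/cos/hadoop
```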

6. Passwordless ssh authentication


~ rm -rf ~/.ssh/

~ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/cos/.ssh/id_rsa):
Created directory '/home/cos/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/cos/.ssh/id_rsa.
Your public key has been saved in /home/cos/.ssh/id_rsa.pub.
The key fingerprint is:
54:1f:51:5f:5c:1d:ac:8a:9d:8f:fd:da:65:7e:f9:8d cos@c3
The key's randomart image is:
+--[ RSA 2048]----+
|          . oooo*|
|         . . . o+|
|        .   . . .|
|       .     .   |
|        S o o    |
|         . +     |
|            +   +|
|           . o.=+|
|             .Eo*|
+-----------------+
~ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

~ cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCtIK3+hGeJQOigIN1ydDVpUzFeg/urnX2DaHhuv5ik8bGGDZWmSA+qPWAZQoZFBGqGMTshvMQeksjA2WiINbFpQCGuxSFx5a9Ad0XU8bGDFi+yYfKlp1ROZ+7Jz1+cO4tjrCX4Ncr3nZlddl6Uh/yMYU+iuRQ4Ga7GXgeYoSf7C5vQmzkYija0XNa8wFr0+aeD3hau6s8dWwsw7Dn/xVA3eMqK+4LDGH6dkn6tc6nVyNbzofXdtPOsCkwHpwozwuRBL37LYcmELe+oT/GifWf0Qp4rQD/9ObtHkhKrSW45bRH/WzvkyNxl04dKlIj26zIsh9zjMHF8o0ce+zjUl7aD cos@c3

7. hadoop masters and slaves


~ cd /home/cos/toolkit/hadoop-1.0.3/conf
~  vi slaves
c1.wtmart.com
c3.wtmart.com

3. Completing a 2-node hadoop cluster

Restart node c1 and configure the 2-node hadoop cluster


~ ssh cos@c1.wtmart.com
~ sudo mount /dev/vdb1 /home/cos/hadoop

Configure the slaves file


~ cd /home/cos/toolkit/hadoop-1.0.3/conf
~  vi slaves
c1.wtmart.com
c3.wtmart.com

DNS (internal DNS)


~ sudo vi /etc/resolv.conf
nameserver 192.168.1.7

~  ping c3.wtmart.com
PING c3.wtmart.com (192.168.1.32) 56(84) bytes of data.
64 bytes from 192.168.1.32: icmp_req=1 ttl=64 time=0.673 ms
64 bytes from 192.168.1.32: icmp_req=2 ttl=64 time=0.429 ms

Exchange ssh public keys
Append c3's authorized_keys to c1's authorized_keys


~ scp cos@c3.wtmart.com:/home/cos/.ssh/authorized_keys .
~ cat authorized_keys >> /home/cos/.ssh/authorized_keys

~ cat /home/cos/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCuWCgfB8qSyEpKmUwnXvWtiCbzza1oD8oM3jm3EJzBRivEC/QV3k8v8zOHt2Cf4H6hYIvYHoSMilA2Wqbh1ny/70zcIYrwFiX5QvSjbEfsj7UmvxevxjB1/5F66TRLN/PyTiw3FmPYLkxvTP8CET02D2cgAN0n+XGTXQaaBqBElQGuiOJUAIvMUl0yg3mH7eP8TRlS4qYpllh04kirSbkOm6IYRDtGsrb90ew63l6F/MpPk/H6DVGC23PnD7ZcMr7VFyCkNPNqcEBwbL8qL1Hhnf7Lvlinp3M3zVU0Aet3TjkgwvcLPX8BWmrguVdBJJ3yqcocylh0YKN9WImAhVm7 cos@c1
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDU4sIHWxuyicIwFbRpDcoegDfSaUhmaBTSFSVTd/Uk8w0fTUcERNx9bZrtuiiPsOImvsBy9hN2Au/x36Jp4hO8DFxzqTNBcfD9xcEgOzHGqH7ZC5qMqx3Poy6DzN/rk9t/+wXyNCfR4kRhrPBNiF6gMDMTRQLpKK9VnY0nQ6FvOQ407gMAsyf0Fn31sHZJLtM4/kxsSOEsSNIk1V+gbIxJv2fJ/AON0/iW1zu7inUC/+YvEuTLClnUMzqFb/xKp25XxSV6a5YzThxs58KO5JCRq2Kk/SM0GSmCSkjKImUYDDjmi1P6wbrd4mh/4frQr1DqYyPeHE4UlHXD90agw767 cos@c3

Then distribute c1's authorized_keys back to c3


~ scp /home/cos/.ssh/authorized_keys cos@c3.wtmart.com:/home/cos/.ssh/authorized_keys
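Appending with cat works for a first exchange, but repeating it for every new node leaves duplicate keys in authorized_keys. A sketch of a merge that skips keys already present is given below; the function name merge_keys is an assumption made for this example.

```shell
#!/bin/sh
# Hypothetical helper: append keys from one file to another, skipping duplicates.
# $1 = file with new keys, $2 = destination authorized_keys file
merge_keys() {
    touch "$2"
    # "|| [ -n "$key" ]" keeps a final line that lacks a trailing newline
    while IFS= read -r key || [ -n "$key" ]; do
        [ -n "$key" ] || continue
        # -x whole line, -F fixed string: exact match only
        grep -qxF -- "$key" "$2" || printf '%s\n' "$key" >> "$2"
    done < "$1"
}

# Example (not executed here), after copying c3's file to the current directory:
#   merge_keys authorized_keys /home/cos/.ssh/authorized_keys
```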

Start the hadoop cluster from c1


~ start-all.sh

starting namenode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-namenode-c1.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c1.out
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c3.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c3.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting secondarynamenode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-secondarynamenode-c1.out
starting jobtracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-jobtracker-c1.out
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c3.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c3.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c1.out

Check the processes on c1


~ jps
8861 JobTracker
8486 NameNode
8750 SecondaryNameNode
9123 Jps
9001 TaskTracker
8612 DataNode

Check the processes on c3


~ jps
3180 TaskTracker
3074 DataNode
3275 Jps

This gives us a 2-node hadoop cluster.
The manual procedure above is fairly involved, so the next task is to write a script that automates it.
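One piece of that automation could be checking the jps output instead of reading it by eye. The sketch below prints the names of expected daemons that are missing from a jps listing. The function name missing_daemons is hypothetical; the daemon names are the ones shown in the jps output above.

```shell
#!/bin/sh
# Hypothetical helper: read `jps` output on stdin and print each expected
# daemon name (given as arguments) that does not appear in it.
missing_daemons() {
    jps_out=$(cat)
    for name in "$@"; do
        # The second column of jps output is the process name
        printf '%s\n' "$jps_out" \
            | awk -v n="$name" '$2 == n { f = 1 } END { exit !f }' \
            || printf '%s\n' "$name"
    done
}

# Example (not executed here), on a slave node such as c3:
#   jps | missing_daemons DataNode TaskTracker
```

No output means every expected daemon is running on that node.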

Please credit the source when reposting:

http://blog.fens.me/hadoop-clone-node/
