
Blog Archives

RHadoop Training: A Linux Basics Course

The RHadoop in Practice series covers combining the R language with Hadoop for large-scale data analysis. Hadoop is used mainly to store massive data sets, while R implements the MapReduce algorithms, replacing Java MapReduce implementations. With RHadoop, R enthusiasts gain a far more powerful tool for handling big data at 1 GB, 10 GB, 100 GB, TB, and PB scale, and the single-machine performance limits that big data imposes may be gone for good.

RHadoop in Practice is a series of articles, including "Setting up the Hadoop Environment", "Installing and Using RHadoop", "Implementing a MapReduce Collaborative Filtering Algorithm in R", and "Installing and Using HBase and rhbase". For someone who only knows R, only Java, or only Hadoop, mastering all three at once is not easy. Although this is an introductory article, readers should already have basic knowledge of R, Java, and Hadoop.

About the Author

Zhang Dan (Conan), programmer: Java, R, PHP, JavaScript
weibo: @Conan_Z
blog: http://blog.fens.me
email: bsspirit@gmail.com

Please credit the source when reposting:
http://blog.fens.me/rhadoop-linux-basic/

linux-basic

Preface

A basics course covering Linux fundamentals: get up to speed quickly and lay the groundwork for an RHadoop environment.

Table of Contents

  1. Background
  2. The File System
  3. Common Commands
  4. The vi Editor
  5. User Management
  6. Disk Management
  7. Network Management
  8. System Management
  9. Package Management
  10. Common Software

1. Background

Origins of Linux
Linux is a remarkable operating system that was born on, grew up on, and matured on the network. In 1991, the Finnish university student Linus Torvalds conceived the idea of building a free UNIX-like operating system.

Ubuntu Linux first appeared in 2004 as a derivative of Debian, founded by the South African entrepreneur Mark Shuttleworth. The Ubuntu project is developed jointly by Shuttleworth's company Canonical and community volunteers, with the goal of building a modern Linux distribution that is genuinely competitive on the desktop and well suited to mainstream, non-technical users.

Ubuntu focuses on ease of use and sticks to a fixed release schedule: a new version every six months, which keeps users from running outdated software.

http://www.ubuntu.com/

Installing Linux

  • Installing Linux on physical hardware (omitted…)
  • Installing Linux in a virtual machine

vbox-linux

 

Client software

PuTTY: http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html

putty

 

2. The File System

a. Directory structure


ls /
bin dev home lib lost+found mnt proc run selinux sys usr vmlinuz
boot etc initrd.img lib64 media opt root sbin srv tmp var

b. Common directories


/bin   commands executable by all users
/dev   device special files
/etc   system administration and configuration files
/home  base of the users' home directories
/sbin  commands only the system administrator may run
/tmp   shared scratch space for temporary files
/root  the system administrator's home directory
/mnt   provided for temporarily mounting other file systems
/var   overflow area for large, growing files, such as the log files of various services

c. Directories worth a quick look


/lib        standard programming libraries
/lib64      standard programming libraries (64-bit)
/lost+found normally empty; holds files recovered after an unclean shutdown
/proc       mapping of information about the running kernel
/sys        information on hardware device drivers
/usr        the largest directory; almost all the applications and files you use live here
/usr/bin    the bulk of the applications
/boot       by default holds the Linux boot files and kernel
/media      mount point for removable media devices
/opt        default install location sought by third-party software, so it stays empty until you install such software

d. Files and directories


ls -l /etc/passwd
-rw-r--r-- 1 root root 1146 May 28 20:48 /etc/passwd
Column 1: file type and permissions
Column 2: number of hard links
Column 3: owner
Column 4: group
Column 5: file size in bytes
Column 6: last modification time (mtime)
Column 7: file or directory name

Column 1 above: -rw-r--r--
position 1: file type (- regular file, d directory, l symbolic link)
positions 2-10: owner, group, and other permissions; read r(4), write w(2), execute x(1)
rw- : 4+2+0=6
r-- : 4+0+0=4
r-- : 4+0+0=4
chmod 644 /etc/passwd
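
The octal arithmetic above can be verified with a quick sketch, using a throwaway file under /tmp rather than touching /etc/passwd:

```shell
# create a throwaway file and set rw-r--r-- via the octal sum 6/4/4
f=$(mktemp /tmp/perm-demo.XXXXXX)
chmod 644 "$f"            # owner rw- (4+2), group r-- (4), others r-- (4)
stat -c '%a %A' "$f"      # prints: 644 -rw-r--r--
rm -f "$f"
```

stat -c prints the octal and symbolic forms side by side, which makes the 4/2/1 arithmetic easy to check for any mode.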

3. Common Commands


man: look up manual pages
ls: list the contents of a directory (-a, -l)
cd: change the working directory (~, ., .., -)
pwd: print the current directory
touch: create an empty file or update a file's timestamps
mkdir: create a directory (-p)
rmdir: remove an empty directory (-p)
rm: remove files and directories (-rf)
cp: copy files and directories (-rf)
mv: move files and directories
cat: print a file's contents
head: show the first N lines of a file (-n)
tail: show the last N lines of a file (-n)

less: page through a file's contents screen by screen
wc: count lines, words, and characters (-l, -w, -m)
chmod: set permissions on files and directories (-R)
chown: change the owner and/or group of files and directories (-R)
chgrp: change the group of files and directories (-R)
which: locate a command by searching PATH
whereis: locate a command's binary, source, and man pages from a database
locate: find files by name using a prebuilt index
find: search for files by arbitrary conditions
gzip: compress and decompress files (-d to decompress)
zcat: view the contents of a gzip-compressed file

zip: compress files and directories into a zip archive (-r)
unzip: extract a zip archive
bzip2: compress and decompress files with bz2 (-z, -d)
bzcat: view the contents of a bz2-compressed file
tar: bundle and compress files (czvf to create, xzvf to extract)
shutdown: shut down at a scheduled time (-h 18:00; -r to reboot)
halt: halt the machine
init: switch runlevel (0 = power off)
poweroff: power off
reboot: reboot
history: show command-line history
grep: search text with regular expressions
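
A minimal, self-contained sketch of the tar flags listed above (czvf to create, xzvf to extract), run in a temporary directory:

```shell
cd "$(mktemp -d)"
mkdir data && echo hello > data/a.txt
tar czf data.tar.gz data      # c=create, z=gzip, f=archive file (add v for verbose)
rm -r data                    # remove the original…
tar xzf data.tar.gz           # …and restore it from the archive (x=extract)
cat data/a.txt                # prints: hello
```

The same pattern works for bzip2 archives by swapping z for j (tar cjf / tar xjf).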

echo: print text or variable values
export: list exported variables, or mark a variable for export
clear: clear the screen
alias: define an alias (alias cls='clear')
locale: show the current system locale
sudo: run a command as root
> : redirect output to a file, overwriting it (locale > a.txt)
>> : append output to the end of a file (locale >> a.txt)
| : pipe output into another command (cat a.txt | grep en_US)
ln: create a link (-s for a symbolic link)
sh: run a shell script
date: show the current system time
time: measure how long a program takes to run
4. The vi Editor

a. Basic concepts
vi has essentially three operating states: command mode, insert mode, and last-line mode.

  • Command mode: move the cursor around the screen, delete characters or regions, move and copy regions, and switch into Insert mode or Last-line mode.
  • Insert mode: text can only be entered in Insert mode; press Esc to return to Command mode.
  • Last-line mode: save the file or leave the editor; also used to configure the editing environment, e.g. searching for strings or showing line numbers.

b. Command mode

Cursor movement commands:

h : move the cursor left one character
l : move the cursor right one character
Space : move the cursor right one character
Backspace : move the cursor left one character
k or Ctrl+p : move the cursor up one line
j or Ctrl+n : move the cursor down one line
Enter : move the cursor down one line
w or W : move the cursor right to the start of the next word
b or B : move the cursor left to the start of the previous word
e or E : move the cursor right to the end of the word
) : move the cursor to the start of the next sentence
( : move the cursor to the start of the previous sentence
} : move the cursor forward to the next paragraph
{ : move the cursor back to the previous paragraph
$ : move the cursor to the end of the current line

Screen scrolling commands

Ctrl+u : scroll half a screen toward the start of the file
Ctrl+d : scroll half a screen toward the end of the file
Ctrl+f : scroll a full screen toward the end of the file
Ctrl+b : scroll a full screen toward the start of the file

Text insertion commands

i : insert before the cursor
I : insert at the start of the current line
a : append after the cursor
A : append at the end of the current line
o : open a new line below the current line
O : open a new line above the current line

Deletion commands

d0 : delete to the start of the line
d$ : delete to the end of the line
ndd : delete the current line and the n-1 lines after it

c. Insert mode
Enter and edit the text itself.

d. Last-line mode

:n1,n2 co n3 : copy lines n1 through n2 to below line n3
:n1,n2 m n3 : move lines n1 through n2 to below line n3
:n1,n2 d : delete lines n1 through n2
:w : save the current file
:e filename : open the file filename for editing
:x : save the current file and exit
:q : quit vi
:q! : quit vi without saving
:!command : run the shell command command
:n1,n2 w!command : feed lines n1 through n2 of the file to command as input and run it; with no n1,n2 the whole file is used as input
:r!command : insert the output of command at the current line

5. User Management


useradd: create a user account and its initial home directory.
passwd: set the password for an account
groupadd: add a new group to the system
adduser: add a user to a group (here, the sudo group)

groupadd hadoop
useradd hadoop -g hadoop
passwd hadoop
adduser hadoop sudo
mkdir /home/hadoop
chown -R hadoop:hadoop /home/hadoop

sudo: run a command as root
  sudo -i: switch to the root account

su: switch to another user
who: show which users are logged in to the system.
whoami: show the current effective user name

ls -l /etc/passwd
-rw-r--r-- 1 root root 1146 May 28 20:48 /etc/passwd

6. Disk Management

fdisk: disk partitioning tool (-l)
partprobe: make new partitions take effect without rebooting
mkfs: format a partition (-t ext4)

mount: mount a partition
umount: unmount a partition

/etc/fstab: mount automatically at boot
    /dev/vdb1        /home/cos/hadoop      ext4    defaults 0       0

df: show current disk usage by filesystem (-a, -h)
du: show the size of directories and files (-s, -h, -d)
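
A sketch of du with the flags above (-s for a summary, -h for human-readable sizes), run against a directory we create ourselves so the numbers are predictable:

```shell
d=$(mktemp -d)
dd if=/dev/zero of="$d/blob" bs=1024 count=100 2>/dev/null  # a ~100 KB file
du -sh "$d"    # one summary line for the whole directory, human-readable
df -h "$d"     # free space on the filesystem that holds $d
rm -r "$d"
```

du walks the directory tree and reports allocated blocks, while df reports the filesystem's totals, so the two answer different questions about the same path.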

7. Network Management

a. Network configuration

hostname: show or change the host name
/etc/hostname
    u1

/etc/hosts: map host names to IP addresses
    127.0.0.1       localhost
    127.0.1.1       u1

    # The following lines are desirable for IPv6 capable hosts
    ::1     ip6-localhost ip6-loopback
    fe00::0 ip6-localnet
    ff00::0 ip6-mcastprefix
    ff02::1 ip6-allnodes
    ff02::2 ip6-allrouters

/etc/network/interfaces: network interface configuration
    auto lo
    iface lo inet loopback

    auto eth0
    #iface eth0 inet dhcp
    iface eth0 inet static
    address 192.168.1.200
    netmask 255.255.255.0
    gateway 192.168.1.1

/etc/init.d/networking restart: restart the network configuration

ifup eth0: bring up the eth0 interface
ifdown eth0: take down the eth0 interface

/etc/resolv.conf: DNS configuration
    nameserver 192.168.1.1
    nameserver 8.8.8.8

b. Network commands

ping: check a host's network reachability
netstat: inspect the overall Linux network state
route: inspect the local routing table
ifconfig: inspect network interface information
nslookup: look up the IP address for a domain name, and vice versa
dig: a flexible tool for querying DNS name servers
traceroute: trace the route a packet takes through the network

8. System Management

a. System commands

ps: inspect system processes (aux)
kill: kill a process by process ID (-9)
killall: kill processes by name, in batch (-9)
top: view the running system, sorted by highest CPU usage
free: view system memory usage

b. System logs

dmesg: show boot messages; the kernel stores them in a ring buffer.
/var/log/syslog: the system log

9. Package Management

a. dpkg
dpkg is the low-level package manager on Debian systems, analogous to RPM. It is the backbone of Debian package management, responsible for safely installing and removing packages, configuring them, and maintaining the installed set. Since Ubuntu descends directly from Debian, most of these commands work identically on both.

dpkg -i package.deb        install a package
dpkg -r package            remove a package
dpkg -P package            purge a package (including its configuration files)
dpkg -L package            list the files belonging to a package
dpkg -l package            show a package's version
dpkg --unpack package.deb  unpack the contents of a deb package
dpkg -S keyword            search for the package that owns a file
dpkg -l                    list currently installed packages
dpkg -c package.deb        list the contents of a deb package
dpkg --configure package   configure a package

b. apt-get
apt-get is a Linux command for deb-based operating systems; it automatically searches, installs, upgrades, and removes software from Internet package repositories. It is the package-management tool of the Debian and Ubuntu distributions, closely analogous to yum on Red Hat.

Package sources:

/etc/apt/sources.list

deb http://mirrors.163.com/ubuntu/ precise main universe restricted multiverse
deb-src http://mirrors.163.com/ubuntu/ precise main universe restricted multiverse
deb http://mirrors.163.com/ubuntu/ precise-security universe main multiverse restricted
deb-src http://mirrors.163.com/ubuntu/ precise-security universe main multiverse restricted
deb http://mirrors.163.com/ubuntu/ precise-updates universe main multiverse restricted
deb http://mirrors.163.com/ubuntu/ precise-proposed universe main multiverse restricted
deb-src http://mirrors.163.com/ubuntu/ precise-proposed universe main multiverse restricted
deb http://mirrors.163.com/ubuntu/ precise-backports universe main multiverse restricted
deb-src http://mirrors.163.com/ubuntu/ precise-backports universe main multiverse restricted
deb-src http://mirrors.163.com/ubuntu/ precise-updates universe main multiverse restricted
deb http://mirror.bjtu.edu.cn/cran/bin/linux/ubuntu precise/

Commands:

apt-cache search package : search for a package
apt-cache show package : show information about a package (description, size, version, …)
sudo apt-get install package : install a package
sudo apt-get install package --reinstall : reinstall a package
sudo apt-get -f install : fix broken dependencies (-f = --fix-broken)
sudo apt-get remove package : remove a package
sudo apt-get remove --purge package : remove a package together with its configuration files
sudo apt-get autoremove --purge package : remove a package plus its now-unneeded dependencies and configuration files
sudo apt-get update : refresh the package lists from the sources
sudo apt-get upgrade : upgrade the installed packages
sudo apt-get dist-upgrade : upgrade the distribution
sudo apt-get dselect-upgrade : upgrade using dselect
apt-cache depends package : show a package's dependencies
apt-cache rdepends package : show a package's reverse dependencies (what depends on it)
sudo apt-get build-dep package : install the build dependencies for a package
apt-get source package : download a package's source code
sudo apt-get clean && sudo apt-get autoclean : clear the download cache && clear only obsolete downloads
sudo apt-get check : check for broken dependencies

10. Common Software

a. Remote administration

telnet: log in to a remote Linux host
ssh: log in to a remote Linux host securely
scp: copy files remotely over an SSH connection
ftp: transfer files remotely via the FTP protocol
wget: download files remotely via HTTP

b. Building programs

./configure: probes the characteristics of the target platform. configure is a script that determines the details of the system it runs on, such as which compiler and libraries to use and where they live, substitutes those details into Makefile.in, and produces a Makefile.
make: compiles; it reads its instructions from the Makefile and builds accordingly.
make install: installs; it too reads the Makefile and copies the build into its target location.
make clean: removes temporary build files
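
The configure → make → make install flow can be seen in miniature with a hand-written Makefile; the targets below just echo messages, standing in for a real project's compile and install steps:

```shell
cd "$(mktemp -d)"
# a two-target Makefile: "all" is the default build target, "install" the install step
# (recipes are indented with a tab, which printf writes as \t)
printf 'all:\n\t@echo building app\ninstall:\n\t@echo installing app\n' > Makefile
make            # runs the first target, all; prints: building app
make install    # runs the named target; prints: installing app
```

In a real autotools project, ./configure is what generates this Makefile before make is ever run.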

c. Generating keys

ssh-keygen: generate RSA keys for SSH authentication
ssh is a protocol that secures remote login sessions and other network services. By default an ssh connection requires password authentication; by switching to key-based authentication (a public/private key pair), you can move between systems without typing passwords or re-authenticating.

~ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/conan/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/conan/.ssh/id_rsa.
Your public key has been saved in /home/conan/.ssh/id_rsa.pub.
The key fingerprint is:
40:9b:fe:ad:76:29:79:20:b5:7c:74:e4:c9:46:73:f1 conan@conan
The key's randomart image is:
+--[ RSA 2048]----+
|      .       .. |
|     . o    + .. |
|      +    = +  E|
|     . .. . *    |
|      .oSo o     |
|      ..+..      |
|       ..+..     |
|        +.+      |
|       ..+       |
+-----------------+

ls ~/.ssh
id_rsa  id_rsa.pub  known_hosts authorized_keys

id_rsa: the private key
id_rsa.pub: the public key
known_hosts: information about previously authenticated hosts
authorized_keys: the public-key authorization file
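
Passwordless login is set up by generating a key pair and appending the public key to the remote account's authorized_keys. A non-interactive local sketch, using a temporary directory instead of ~/.ssh:

```shell
d=$(mktemp -d)
# -N "" gives an empty passphrase, -f sets the output path, -q is quiet
ssh-keygen -t rsa -b 2048 -N "" -f "$d/id_rsa" -q
ls "$d"                                  # id_rsa  id_rsa.pub
# on the remote host, the public key would be appended to ~/.ssh/authorized_keys:
cat "$d/id_rsa.pub" >> "$d/authorized_keys"
chmod 600 "$d/authorized_keys"
```

In practice ssh-copy-id automates the append-and-chmod step on the remote side.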

d. Time management
Reference article: http://blog.fens.me/linux-ntp/

ntp: NTP is a protocol for synchronizing computer clocks. It lets a machine synchronize against its servers or a clock source and provides highly accurate corrections: within 1 ms of the reference on a LAN and within a few tens of ms over a WAN, with cryptographic authentication to guard against malicious protocol attacks.

# Server setup:
sudo apt-get install ntp

ps -aux|grep ntp
ntp      23717  0.0  0.0  39752  2184 ?        Ss   19:40   0:00 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 126:135

/etc/init.d/ntp status
* NTP server is running

# Client synchronization:
ntpdate 0.cn.pool.ntp.org  
22 Jul 16:45:05 ntpdate[6790]: step time server 218.75.4.130 offset 7618978.781348 sec

e. Scheduled tasks

crontab: manage scheduled jobs (-e, -l)
minute m (0-59), hour h (0-23), day of month dom (1-31), month mon (1-12), day of week dow (0-6, 0 = Sunday)

crontab -e
# m h    dom mon dow   command
# Archive the directory at 5:00 every Monday morning
# 0 5     *   *   1    tar -zcf /var/backups/home.tgz /home/

# Sync the time twice a day (8:00, 16:00)
# 0 8,16  *   *   *    /usr/sbin/ntpdate 192.168.1.79

f. Network file system

nfs: the Network File System (NFS) is a mechanism for mounting a partition (directory) on a remote host onto the local system over the network. With NFS support, users can operate on a remote host's shared partitions (directories) as if they were local partitions.

# Server-side install
sudo apt-get install nfs-kernel-server

# Configure access permissions
vi /etc/exports
/home/conan/.ssh *(rw,no_root_squash,sync)

# Restart
sudo /etc/init.d/nfs-kernel-server restart

# View the export list
showmount -e
Export list for conan:
/home/conan/.ssh  *

# Create a mount point
sudo mkdir /mnt/ssh

# Mount the directory
sudo mount 192.168.1.201:/home/conan/.ssh /mnt/ssh/

# List the directory
ls /mnt/ssh/
id_rsa  id_rsa.pub  known_hosts

g. Domain name server
Reference article: http://blog.fens.me/vps-ip-dns/

bind9: an open-source implementation of the DNS (Domain Name System) protocol, containing all the software needed to query and answer for domain names. It is the most widely used DNS server on the Internet.

# Install
sudo apt-get install bind9
cd /etc/bind

# Configure
vi named.conf.options
forwarders {
      8.8.8.8;
      8.8.4.4;
};

# Define the wtmart.com zone
~ vi named.conf.local
zone "wtmart.com" {
     type master;
     file "/etc/bind/db.wtmart.com";
};

# Point names at IP addresses
~ vi db.wtmart.com
$TTL    604800
@       IN      SOA     wtmart.com. root.wtmart.com. (
                              2         ; Serial
                         604800         ; Refresh
                          86400         ; Retry
                        2419200         ; Expire
                         604800 )       ; Negative Cache TTL
;
@       IN      NS      wtmart.com.
@       IN      A       173.255.193.179
@       IN      AAAA    ::1

ip      IN      A       173.255.193.179

lin    IN    A    116.24.133.86

# Restart the server
sudo /etc/init.d/bind9 restart

h. Database server

mysql: a relational database management system, originally developed by the Swedish company MySQL AB and now owned by Oracle. As a relational system, MySQL keeps data in separate tables rather than one big store, which improves speed and flexibility. Its SQL dialect is the most common standardized language for accessing databases. MySQL is dual-licensed into a community edition and a commercial edition; thanks to its small footprint, speed, low total cost of ownership, and above all its open source code, it is the usual database choice for small and medium web sites.

# Install
sudo apt-get install mysql-server

# Allow access from other machines
sudo sed -i 's/127.0.0.1/0.0.0.0/g' /etc/mysql/my.cnf

# Set the character set to UTF-8
vi /etc/mysql/my.cnf

[client]
port            = 3306
socket          = /var/run/mysqld/mysqld.sock
default-character-set=utf8

[mysqld]
#
# * Basic Settings
#
default-storage-engine=INNODB
character-set-server=utf8
collation-server=utf8_general_ci

# Check the character sets
mysql> show variables like '%char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

# Restart
sudo service mysql restart
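
The sed -i one-liner above does an in-place global substitution; here is the same edit sketched safely on a throwaway file (a single hypothetical config line, standing in for my.cnf):

```shell
f=$(mktemp)
echo "bind-address = 127.0.0.1" > "$f"
sed -i 's/127.0.0.1/0.0.0.0/g' "$f"   # -i edits in place, /g replaces every match
cat "$f"                              # prints: bind-address = 0.0.0.0
```

Testing the substitution on a copy first is a good habit before pointing sed -i at a real system file.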


Nova Installation Guide

This series on building your own VPS explains how to use your own computing resources to run a VPS with virtualization technology.

In the Web 2.0 era, everyone has a blog and plenty of personal Internet applications, most of them provided by Internet companies. Capable developers (geeks) want to build applications of their own, using the newest and flashiest technology, with their own domain name and their own server. That usually means renting a VPS on the Internet: a VPS is essentially a remote computer where you can deploy your applications, register a domain, and publish to the Internet. This site, "@晒粉丝", runs on a Linode VPS in the Dallas data center.

In fact, you can build a VPS yourself. All it takes is a high-performance server, an IP address, and a router: one powerful server can quickly become 5, 10, or 20 virtual VPS instances. On your own VPS you can deploy all kinds of applications, and even rent the spare server capacity to other Internet users. This series is split into the following parts: "Choosing a Virtualization Technology", "Dynamic IP Resolution", "Installing KVM on Ubuntu and Building a Virtual Environment", "Adding Disks to a KVM Virtual Machine", "Nova Installation Guide", "Network Architecture for a VPS Intranet", and "Renting Out VPS Cloud Services".

About the author:

  • Zhang Dan (Conan), programmer: Java, R, PHP, JavaScript
  • weibo: @Conan_Z
  • blog: http://blog.fens.me
  • email: bsspirit@gmail.com

Please credit the source when reposting:
http://blog.fens.me/vps-nova-setup/

openstack

Preface

Nova is a key component of OpenStack; its core function is managing virtual machines (kvm, qemu, xen, vmware, virtual box).

This installation experiment has a strict requirement on the Ubuntu version: it must be 12.04 LTS. Attempts on other versions did not succeed, so please follow this exactly.

Reference book for this experiment:
OpenStack Cloud Computing Cookbook
Chapter 1: Starting OpenStack Compute

Table of Contents

  1. nova installation plan
  2. VirtualBox VM environment
  3. Operating system environment
  4. Software dependencies
  5. nova configuration
  6. Creating a nova instance
  7. Logging in to the cloud instance
  8. Error roundup

1. nova installation plan

Use a VirtualBox virtual machine with nested qemu virtual machines: nova is installed inside the VirtualBox environment, and the cloud instances run in the qemu environment, so nova manages the qemu cloud instances.

2. VirtualBox VM environment

VirtualBox VM: 6 GB RAM, 4-core CPU, Linux Ubuntu 12.04 LTS
CPU with VT-x/AMD-V, nested paging, PAE/NX

nova1

Three virtual network adapters:

  • Adapter 1: bridged
  • Adapter 2: Host-Only
  • Adapter 3: Host-Only

vbox2

Global Host-Only adapter settings in VirtualBox:
vbox1

3. Operating system environment

To repeat: the Ubuntu version for this experiment must be 12.04 LTS.


~ uname -a
Linux nova 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

~ cat /etc/issue
Ubuntu 12.04 LTS \n \l

~ ifconfig
eth0      Link encap:Ethernet  HWaddr 08:00:27:90:e8:19
          inet addr:192.168.1.200  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe90:e819/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:162 errors:0 dropped:0 overruns:0 frame:0
          TX packets:132 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:16399 (16.3 KB)  TX bytes:22792 (22.7 KB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

Create the openstack user
Create a new user openstack with password openstack, and add it to the sudo group


~ sudo useradd openstack
~ sudo passwd openstack  
~ sudo adduser openstack sudo

~ sudo mkdir -p /home/openstack
~ sudo chown openstack:openstack /home/openstack

~ ls -l /home
drwxr-xr-x 8 conan     conan     4096 Jul 13 17:07 conan
drwxr-xr-x 2 openstack openstack 4096 Jul 13 17:21 openstack

Log back in as the openstack account and test the sudo command.
Run all of the following operations as the openstack user.


ssh openstack@192.168.1.200

openstack@u1:~$ whoami
openstack

openstack@u1:~$ sudo -i  
[sudo] password for openstack:

root@u1:~# whoami
root

root@u1:~# exit
logout

VM network interface configuration


~ sudo vi /etc/network/interfaces

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
address 192.168.1.200
netmask 255.255.255.0
network 192.168.1.0
broadcast 192.168.1.255
gateway 192.168.1.1

#public interface
auto eth1
iface eth1 inet static
address 172.16.0.1
netmask 255.255.0.0
network 172.16.0.0
broadcast 172.16.255.255

#private interface
auto eth2
iface eth2 inet manual
up ifconfig eth2 up

Restart the network interfaces


~ sudo /etc/init.d/networking restart

 * Running /etc/init.d/networking restart is deprecated because it may not enable again some interfaces
 * Reconfiguring network interfaces...                                                 ssh stop/waiting
ssh start/running, process 2040
ssh stop/waiting
ssh start/running, process 2082
ssh stop/waiting
ssh start/running, process 2121
                                                                            [ OK ]
~ ifconfig
eth0      Link encap:Ethernet  HWaddr 08:00:27:90:e8:19
          inet addr:192.168.1.200  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe90:e819/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3408 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2244 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3321759 (3.3 MB)  TX bytes:250703 (250.7 KB)

eth1      Link encap:Ethernet  HWaddr 08:00:27:4e:06:74
          inet addr:172.16.0.1  Bcast:172.16.255.255  Mask:255.255.0.0
          inet6 addr: fe80::a00:27ff:fe4e:674/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:18 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1656 (1.6 KB)  TX bytes:468 (468.0 B)

eth2      Link encap:Ethernet  HWaddr 08:00:27:5a:b1:1f
          inet6 addr: fe80::a00:27ff:fe5a:b11f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:33 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3156 (3.1 KB)  TX bytes:378 (378.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

DNS configuration


~ vi /etc/resolv.conf 
nameserver 8.8.8.8

~ ping www.163.com
PING 163.xdwscache.glb0.lxdns.com (101.23.128.17) 56(84) bytes of data.
64 bytes from 101.23.128.17: icmp_req=1 ttl=53 time=20.3 ms
64 bytes from 101.23.128.17: icmp_req=2 ttl=53 time=18.5 ms

4. Software Dependencies

Update the package sources


~ sudo vi /etc/apt/sources.list

deb http://mirrors.163.com/ubuntu/ precise main universe restricted multiverse 
deb-src http://mirrors.163.com/ubuntu/ precise main universe restricted multiverse 
deb http://mirrors.163.com/ubuntu/ precise-security universe main multiverse restricted 
deb-src http://mirrors.163.com/ubuntu/ precise-security universe main multiverse restricted 
deb http://mirrors.163.com/ubuntu/ precise-updates universe main multiverse restricted 
deb http://mirrors.163.com/ubuntu/ precise-proposed universe main multiverse restricted 
deb-src http://mirrors.163.com/ubuntu/ precise-proposed universe main multiverse restricted 
deb http://mirrors.163.com/ubuntu/ precise-backports universe main multiverse restricted 
deb-src http://mirrors.163.com/ubuntu/ precise-backports universe main multiverse restricted 
deb-src http://mirrors.163.com/ubuntu/ precise-updates universe main multiverse restricted

Install the nova-related packages


~ sudo apt-get update
~ sudo apt-get -y install rabbitmq-server nova-api nova-objectstore nova-scheduler nova-network nova-compute nova-cert glance qemu unzip
~ sudo apt-get install pm-utils

Note: if pm-utils is not installed, the libvirtd log will show this error


Cannot find 'pm-is-supported' in path: No such file or directory

Inspect the system processes


~ pstree
init─┬─acpid
     ├─atd
     ├─beam.smp─┬─cpu_sup
     │          ├─inet_gethost───inet_gethost
     │          └─38*[{beam.smp}]
     ├─cron
     ├─dbus-daemon
     ├─dhclient3
     ├─dnsmasq
     ├─epmd
     ├─5*[getty]
     ├─irqbalance
     ├─2*[iscsid]
     ├─libvirtd───10*[{libvirtd}]
     ├─login───bash
     ├─rsyslogd───3*[{rsyslogd}]
     ├─sshd───sshd───bash
     ├─sshd─┬─sshd───sshd───sh───bash───pstree
     │      └─sshd───sshd───sh───bash
     ├─su───glance-api
     ├─su───glance-registry
     ├─su───nova-api
     ├─su───nova-cert
     ├─su───nova-network
     ├─su───nova-objectstor
     ├─su───nova-scheduler
     ├─su───nova-compute
     ├─udevd───2*[udevd]
     ├─upstart-socket-
     ├─upstart-udev-br
     └─whoopsie───{whoopsie}

Install the ntp time-sync service


~ sudo apt-get -y install ntp

# Edit the configuration file
~ sudo vi /etc/ntp.conf

#Replace ntp.ubuntu.com with an NTP server on your network
server ntp.ubuntu.com
server 127.127.1.0
fudge 127.127.1.0 stratum 10

# Restart ntp
~ sudo service ntp restart

~ ps -aux|grep ntp
ntp       6990  0.0  0.0  37696  2180 ?        Ss   19:50   0:00 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 113:120

# Check the current system time
~ date
Sat Jul 13 19:50:12 CST 2013

Install MySQL


~ sudo apt-get install mysql-server

# Allow access from other machines
~ sudo sed -i 's/127.0.0.1/0.0.0.0/g' /etc/mysql/my.cnf
~ sudo service mysql restart

# Create the nova database and configure the nova user
~ MYSQL_PASS=mysql
~ mysql -uroot -p$MYSQL_PASS -e 'CREATE DATABASE nova;'
~ mysql -uroot -p$MYSQL_PASS -e "GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'%'"
~ mysql -uroot -p$MYSQL_PASS -e "SET PASSWORD FOR 'nova'@'%' = PASSWORD('openstack');"

5. nova Configuration

Edit the nova.conf configuration file.
--verbose can be removed; it is only included to log more detail.


~ sudo vi /etc/nova/nova.conf

--dhcpbridge_flagfile=/etc/nova/nova.conf
--dhcpbridge=/usr/bin/nova-dhcpbridge
--logdir=/var/log/nova
--state_path=/var/lib/nova
--lock_path=/var/lock/nova
--force_dhcp_release
--iscsi_helper=tgtadm
--libvirt_use_virtio_for_bridges
--connection_type=libvirt
--root_helper=sudo nova-rootwrap
--verbose
--ec2_private_dns_show_ip
--sql_connection=mysql://nova:openstack@172.16.0.1/nova
--use_deprecated_auth
--s3_host=172.16.0.1
--rabbit_host=172.16.0.1
--ec2_host=172.16.0.1
--ec2_dmz_host=172.16.0.1
--public_interface=eth1
--image_service=nova.image.glance.GlanceImageService
--glance_api_servers=172.16.0.1:9292
--auto_assign_floating_ip=true
--scheduler_default_filters=AllHostsFilter

Edit the VMM setting in nova-compute.conf.
We must use the qemu hypervisor here; outside a nested-virtualization setup, kvm is recommended instead.


~ sudo vi /etc/nova/nova-compute.conf

--libvirt_type=qemu

Write the nova schema into MySQL


~ sudo nova-manage db sync

2013-07-14 21:18:56 DEBUG nova.utils [-] backend  from (pid=8750) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663
2013-07-14 21:19:06 WARNING nova.utils [-] /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py:639: SADeprecationWarning: The 'listeners' argument to Pool (and create_engine()) is deprecated.  Use event.listen().
  Pool.__init__(self, creator, **kw)

2013-07-14 21:19:06 WARNING nova.utils [-] /usr/lib/python2.7/dist-packages/sqlalchemy/pool.py:145: SADeprecationWarning: Pool.add_listener is deprecated.  Use event.listen()
  self.add_listener(l)

2013-07-14 21:19:06 AUDIT nova.db.sqlalchemy.fix_dns_domains [-] Applying database fix for Essex dns_domains table.

Create the openstack private network


~ sudo nova-manage network create vmnet --fixed_range_v4=10.0.0.0/8 --network_size=64 --bridge_interface=eth2

2013-07-14 21:19:34 DEBUG nova.utils [req-152fee41-ddc9-4ac5-902d-fb93f7af67a8 None None] backend  from (pid=8807) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663

Set up the openstack floating IP range


~ sudo nova-manage floating create --ip_range=172.16.1.0/24

2013-07-14 21:19:48 DEBUG nova.utils [req-7171e8bc-6542-40d2-b24c-b4593505fd87 None None] backend  from (pid=8814) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663

Inspect the nova database in MySQL


~ mysql -uroot -p

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| ape_biz            |
| mysql              |
| nova               |
| performance_schema |
| test               |
+--------------------+
6 rows in set (0.00 sec)

mysql> use nova

mysql> show tables;
+-------------------------------------+
| Tables_in_nova                      |
+-------------------------------------+
| agent_builds                        |
| aggregate_hosts                     |
| aggregate_metadata                  |
| aggregates                          |
| auth_tokens                         |
| block_device_mapping                |
| bw_usage_cache                      |
| cells                               |
| certificates                        |
| compute_nodes                       |
| console_pools                       |
| consoles                            |
| dns_domains                         |
| fixed_ips                           |
| floating_ips                        |
| instance_actions                    |
| instance_faults                     |
| instance_info_caches                |
| instance_metadata                   |
| instance_type_extra_specs           |
| instance_types                      |
| instances                           |
| iscsi_targets                       |
| key_pairs                           |
| migrate_version                     |
| migrations                          |
| networks                            |
| projects                            |
| provider_fw_rules                   |
| quotas                              |
| s3_images                           |
| security_group_instance_association |
| security_group_rules                |
| security_groups                     |
| services                            |
| sm_backend_config                   |
| sm_flavors                          |
| sm_volume                           |
| snapshots                           |
| user_project_association            |
| user_project_role_association       |
| user_role_association               |
| users                               |
| virtual_interfaces                  |
| virtual_storage_arrays              |
| volume_metadata                     |
| volume_type_extra_specs             |
| volume_types                        |
| volumes                             |
+-------------------------------------+
49 rows in set (0.00 sec)

Restart the nova, libvirt, and glance services


# Stop
~ sudo stop nova-compute
~ sudo stop nova-network
~ sudo stop nova-api
~ sudo stop nova-scheduler
~ sudo stop nova-objectstore
~ sudo stop nova-cert

~ sudo stop libvirt-bin
~ sudo stop glance-registry
~ sudo stop glance-api

# Start
~ sudo start nova-compute
~ sudo start nova-network
~ sudo start nova-api
~ sudo start nova-scheduler
~ sudo start nova-objectstore
~ sudo start nova-cert

~ sudo start libvirt-bin
~ sudo start glance-registry
~ sudo start glance-api

# View the process tree
~ pstree
init─┬─acpid
     ├─atd
     ├─beam.smp─┬─cpu_sup
     │          ├─inet_gethost───inet_gethost
     │          └─38*[{beam.smp}]
     ├─cron
     ├─dbus-daemon
     ├─dhclient3
     ├─dnsmasq
     ├─epmd
     ├─5*[getty]
     ├─irqbalance
     ├─2*[iscsid]
     ├─libvirtd───10*[{libvirtd}]
     ├─login───bash
     ├─mysqld───19*[{mysqld}]
     ├─ntpd
     ├─rsyslogd───3*[{rsyslogd}]
     ├─sshd───sshd───bash
     ├─sshd─┬─sshd───sshd───sh───bash───pstree
     │      └─sshd───sshd───sh───bash
     ├─su───glance-registry
     ├─su───glance-api
     ├─su───nova-network
     ├─su───nova-api
     ├─su───nova-scheduler
     ├─su───nova-objectstor
     ├─su───nova-cert
     ├─su───nova-compute
     ├─udevd───2*[udevd]
     ├─upstart-socket-
     ├─upstart-udev-br
     └─whoopsie───{whoopsie}

Create the nova user, role, and project


# Create the user
~ sudo nova-manage user admin openstack
2013-07-14 21:22:00 DEBUG nova.utils [req-6a95dd03-04db-4f60-9198-d77a4d4936e8 None None] backend  from (pid=9254) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663
2013-07-14 21:22:00 AUDIT nova.auth.manager [-] Created user openstack (admin: True)
export EC2_ACCESS_KEY=62ff82fa-74a9-4ffb-a420-ea190e893863
export EC2_SECRET_KEY=f1f32aed-85fe-406d-8f28-bbf02d7a7134

# Create the role
~ sudo nova-manage role add openstack cloudadmin
2013-07-14 21:22:15 AUDIT nova.auth.manager [-] Adding sitewide role cloudadmin to user openstack
2013-07-14 21:22:15 DEBUG nova.utils [req-a9d8cdfa-263c-4d6a-8c69-d6571aabee00 None None] backend  from (pid=9262) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663

# Create the project
~ sudo nova-manage project create cookbook openstack
2013-07-14 21:22:34 DEBUG nova.utils [req-3a340500-6674-439e-ac95-e28954637cf5 None None] backend  from (pid=9395) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663
2013-07-14 21:22:34 AUDIT nova.auth.manager [-] Created project cookbook with manager openstack

~ sudo nova-manage project zipfile cookbook openstack
2013-07-14 21:22:49 DEBUG nova.utils [req-429e7839-6009-4862-98c5-af01ceac9cee None None] backend  from (pid=9402) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:663
2013-07-14 21:22:49 DEBUG nova.utils [-] Running cmd (subprocess): openssl genrsa -out /tmp/tmprZNpYT/temp.key 1024 from (pid=9402) execute /usr/lib/python2.7/dist-packages/nova/utils.py:224
2013-07-14 21:22:49 DEBUG nova.utils [-] Running cmd (subprocess): openssl req -new -key /tmp/tmprZNpYT/temp.key -out /tmp/tmprZNpYT/temp.csr -batch -subj /C=US/ST=California/O=OpenStack/OU=NovaDev/CN=cookbook-openstack-2013-07-14T13:22:49Z from (pid=9402) execute /usr/lib/python2.7/dist-packages/nova/utils.py:224
2013-07-14 21:22:49 DEBUG nova.crypto [-] Flags path: /var/lib/nova/CA from (pid=9402) _sign_csr /usr/lib/python2.7/dist-packages/nova/crypto.py:290
2013-07-14 21:22:49 DEBUG nova.utils [-] Running cmd (subprocess): openssl ca -batch -out /tmp/tmpJZvmrM/outbound.csr -config ./openssl.cnf -infiles /tmp/tmpJZvmrM/inbound.csr from (pid=9402) execute /usr/lib/python2.7/dist-packages/nova/utils.py:224
2013-07-14 21:22:49 DEBUG nova.utils [-] Running cmd (subprocess): openssl x509 -in /tmp/tmpJZvmrM/outbound.csr -serial -noout from (pid=9402) execute /usr/lib/python2.7/dist-packages/nova/utils.py:224
2013-07-14 21:22:49 WARNING nova.auth.manager [-] No vpn data for project cookbook

Install and configure the command-line tools


~ sudo apt-get install euca2ools python-novaclient unzip

~ pwd
/home/openstack

~ ls -l
-rw-r--r-- 1 root root 5930 Jul 13 20:38 nova.zip

# Unzip the credentials bundle
~ unzip nova.zip
Archive:  nova.zip
extracting: novarc
extracting: pk.pem
extracting: cert.pem
extracting: cacert.pem

# Load the environment variables
~ . novarc

# Inspect the environment variables
~ env
LC_PAPER=en_US
LC_ADDRESS=en_US
LC_MONETARY=en_US
SHELL=/bin/sh
TERM=xterm
SSH_CLIENT=192.168.1.11 60377 22
LC_NUMERIC=en_US
EUCALYPTUS_CERT=/home/openstack/cacert.pem
OLDPWD=/var/log/libvirt
SSH_TTY=/dev/pts/0
LC_ALL=en_US.UTF-8
USER=openstack
LC_TELEPHONE=en_US
NOVA_CERT=/home/openstack/cacert.pem
EC2_SECRET_KEY=6f964a16-6036-44ef-bdf3-23dff94f5b94
NOVA_PROJECT_ID=cookbook
EC2_USER_ID=42
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/home/conan/toolkit/jdk16/bin:/home/conan/toolkit/cassandra/bin
MAIL=/var/mail/openstack
NOVA_VERSION=1.1
LC_IDENTIFICATION=en_US
NOVA_USERNAME=openstack
PWD=/home/openstack
JAVA_HOME=/home/conan/toolkit/jdk16
LANG=en_US.UTF-8
CASSANDRA_HOME=/home/conan/toolkit/cassandra125
LC_MEASUREMENT=en_US
NOVA_API_KEY=openstack
NOVA_URL=http://172.16.0.1:8774/v1.1/
SHLVL=1
HOME=/home/openstack
LANGUAGE=en_US:en
EC2_URL=http://172.16.0.1:8773/services/Cloud
LOGNAME=openstack
SSH_CONNECTION=192.168.1.11 60377 192.168.1.200 22
EC2_ACCESS_KEY=openstack:cookbook
EC2_PRIVATE_KEY=/home/openstack/pk.pem
DISPLAY=localhost:10.0
S3_URL=http://172.16.0.1:3333
LC_TIME=en_US
EC2_CERT=/home/openstack/cert.pem
LC_NAME=en_US
_=/usr/bin/env

#create a keypair
~ euca-add-keypair openstack > openstack.pem
~ chmod 0600 *.pem

~ ls -l
-rw------- 1 openstack openstack 1029 Jul 13 20:38 cacert.pem
-rw------- 1 openstack openstack 2515 Jul 13 20:38 cert.pem
-rw------- 1 openstack openstack 1113 Jul 13 20:38 novarc
-rw-r--r-- 1 root      root      5930 Jul 13 20:38 nova.zip
-rw------- 1 openstack openstack  954 Jul 13 20:50 openstack.pem
-rw------- 1 openstack openstack  887 Jul 13 20:38 pk.pem
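Mode 0600 makes the key files readable and writable by the owner only; ssh refuses to use an identity file that other users can read. A quick self-contained check on a throwaway file:

```shell
# 0600 = rw for owner, nothing for group/other
touch /tmp/demo.pem
chmod 0600 /tmp/demo.pem

# print the octal mode (GNU coreutils stat)
stat -c '%a' /tmp/demo.pem
```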

Check the nova services


~ euca-describe-availability-zones verbose
AVAILABILITYZONE        nova    available
AVAILABILITYZONE        |- nova
AVAILABILITYZONE        | |- nova-scheduler     enabled :-) 2013-07-14 13:23:53
AVAILABILITYZONE        | |- nova-compute       enabled :-) 2013-07-14 13:23:53
AVAILABILITYZONE        | |- nova-cert  enabled :-) 2013-07-14 13:23:53
AVAILABILITYZONE        | |- nova-network       enabled :-) 2013-07-14 13:23:53

6. Creating a nova instance

Upload the cloud image to the host
Download ubuntu-12.04-server-cloudimg-i386.tar.gz yourself from: http://uec-images.ubuntu.com/releases/precise/release/ubuntu-12.04-server-cloudimg-i386.tar.gz


~ scp ubuntu-12.04-server-cloudimg-i386.tar.gz openstack@192.168.1.200:/home/openstack
ubuntu-12.04-server-cloudimg-i386.tar.gz                                              100%  206MB   00:07

~ ls -l
-rw------- 1 openstack openstack      1029 Jul 14 21:22 cacert.pem
-rw------- 1 openstack openstack      2515 Jul 14 21:22 cert.pem
-rw------- 1 openstack openstack      1113 Jul 14 21:22 novarc
-rw-r--r-- 1 root      root           5930 Jul 14 21:22 nova.zip
-rw------- 1 openstack openstack       954 Jul 14 21:23 openstack.pem
-rw------- 1 openstack openstack       887 Jul 14 21:22 pk.pem
-rw-r--r-- 1 openstack openstack 215487341 Jul 14 21:25 ubuntu-12.04-server-cloudimg-i386.tar.gz

Publish the cloud image


~ cloud-publish-tarball ubuntu-12.04-server-cloudimg-i386.tar.gz images i386

Sun Jul 14 21:25:34 CST 2013: ====== extracting image ======
Warning: no ramdisk found, assuming '--ramdisk none'
kernel : precise-server-cloudimg-i386-vmlinuz-virtual
ramdisk: none
image  : precise-server-cloudimg-i386.img
Sun Jul 14 21:25:40 CST 2013: ====== bundle/upload kernel ======
Sun Jul 14 21:26:06 CST 2013: ====== bundle/upload image ======
Sun Jul 14 21:27:04 CST 2013: ====== done ======
emi="ami-00000002"; eri="none"; eki="aki-00000001";

Inspect the image
There are two ways to check: the euca client and the nova client.

Note: image registration passes through the decrypting, untarring, uploading, and available states, so expect to wait a few minutes.


~ euca-describe-images

IMAGE   aki-00000001    images/precise-server-cloudimg-i386-vmlinuz-virtual.manifest.xml        available        private         i386    kernel                          instance-store
IMAGE   ami-00000002    images/precise-server-cloudimg-i386.img.manifest.xml            available        private         i386    machine aki-00000001                    instance-store

~ nova image-list
+--------------------------------------+-----------------------------------------------------+--------+--------+
|                  ID                  |                         Name                        | Status | Server |
+--------------------------------------+-----------------------------------------------------+--------+--------+
| 306eb471-bbc5-495e-b7a1-484e11f71502 | images/precise-server-cloudimg-i386-vmlinuz-virtual | ACTIVE |        |
| 9dbf632e-b0d8-4230-a0e5-ee3836040492 | images/precise-server-cloudimg-i386.img             | ACTIVE |        |
+--------------------------------------+-----------------------------------------------------+--------+--------+
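Since registration takes a while, it helps to poll until the image reaches the available state. A hypothetical sketch: the real `euca-describe-images` call is replaced here by a mock function so the example is self-contained.

```shell
# Stand-in for querying the registration state of one image.
# In real use this would parse `euca-describe-images` output.
image_status() {
  echo "available"
}

# Poll until the image is available, giving up after 30 tries.
wait_for_image() {
  img="$1"
  tries=0
  while [ "$(image_status "$img")" != "available" ]; do
    tries=$((tries + 1))
    [ "$tries" -ge 30 ] && return 1
    sleep 10
  done
  echo "$img is available"
}

wait_for_image ami-00000002
```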

Configure instance networking


~ euca-authorize default -P tcp -p 22 -s 0.0.0.0/0
GROUP   default
PERMISSION      default ALLOWS  tcp     22      22      FROM    CIDR    0.0.0.0/0

~ euca-authorize default -P icmp -t -1:-1
GROUP   default
PERMISSION      default ALLOWS  icmp    -1      -1      FROM    CIDR    0.0.0.0/0

Check free disk space


~ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        34G  3.7G   29G  12% /
udev            3.0G  4.0K  3.0G   1% /dev
tmpfs           1.2G  316K  1.2G   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            3.0G     0  3.0G   0% /run/shm
cgroup          3.0G     0  3.0G   0% /sys/fs/cgroup

~ nova flavor-list
+----+-----------+-----------+------+-----------+------+-------+-------------+
| ID |    Name   | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor |
+----+-----------+-----------+------+-----------+------+-------+-------------+
| 1  | m1.tiny   | 512       | 0    | 0         |      | 1     | 1.0         |
| 2  | m1.small  | 2048      | 10   | 20        |      | 1     | 1.0         |
| 3  | m1.medium | 4096      | 10   | 40        |      | 2     | 1.0         |
| 4  | m1.large  | 8192      | 10   | 80        |      | 4     | 1.0         |
| 5  | m1.xlarge | 16384     | 10   | 160       |      | 8     | 1.0         |
+----+-----------+-----------+------+-----------+------+-------+-------------+
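Picking a flavor comes down to whether its disk footprint (Disk + Ephemeral) fits in the free space on the host. A self-contained sketch that filters a captured copy of the table above with awk:

```shell
free_gb=29

# Column 5 is Disk, column 6 is Ephemeral when splitting on '|'.
# Print the names of flavors whose total disk fits in free_gb.
fit=$(awk -F'|' -v free="$free_gb" '
  NR > 1 && $5 + $6 <= free { gsub(/ /, "", $3); print $3 }
' <<'EOF'
| ID |    Name   | Memory_MB | Disk | Ephemeral | Swap | VCPUs |
| 1  | m1.tiny   | 512       | 0    | 0         |      | 1     |
| 2  | m1.small  | 2048      | 10   | 20        |      | 1     |
| 3  | m1.medium | 4096      | 10   | 40        |      | 2     |
EOF
)
echo "$fit"
```

With 29 GB free, only m1.tiny clears this strict check (m1.small needs 10 + 20 = 30 GB), which is consistent with choosing tiny below.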

With 29 GB of disk still free, I can pick from the flavor list above; tiny or small would do.

Here I choose the tiny flavor.


~ euca-run-instances ami-00000002 -t m1.tiny -k openstack
RESERVATION     r-f6y5tydu      cookbook        default
INSTANCE        i-00000001      ami-00000002                    pending openstack (cookbook, None)       0               m1.tiny 2013-07-14T13:30:02.000Z        unknown zone    aki-00000001                     monitoring-disabled     

List the running instances
This step also takes a few minutes:


~ euca-describe-instances
RESERVATION     r-f6y5tydu      cookbook        default
INSTANCE        i-00000001      ami-00000002    172.16.1.1      10.0.0.3        running openstack (cookbook, nova)       0               m1.tiny 2013-07-14T13:30:02.000Z        nova     aki-00000001                    monitoring-disabled     172.16.1.1      10.0.0.3instance-store

~ nova list
+--------------------------------------+----------+--------+----------------------------+
|                  ID                  |   Name   | Status |          Networks          |
+--------------------------------------+----------+--------+----------------------------+
| d6e5fe88-1950-48f4-853a-2fd57e6c72f4 | Server 1 | ACTIVE | vmnet=10.0.0.3, 172.16.1.1 |
+--------------------------------------+----------+--------+----------------------------+

~ top
11424 libvirt-  20   0 1451m 322m 7284 S  105  5.4   1:20.58 qemu-system-x86
 8962 nova      20   0  265m 104m 5936 S    3  1.8   0:17.57 nova-api
   35 root      25   5     0    0    0 S    1  0.0   0:00.66 ksmd
 5933 rabbitmq  20   0 2086m  28m 2468 S    1  0.5   0:03.33 beam.smp
 8969 nova      20   0  195m  48m 4748 S    1  0.8   0:02.77 nova-scheduler
 9190 nova      20   0 1642m  63m 6660 S    1  1.1   0:08.14 nova-compute
 8522 mysql     20   0 1118m  54m 8276 S    1  0.9   0:04.45 mysqld
 8951 nova      20   0  197m  50m 4756 S    1  0.9   0:03.09 nova-network

7. Logging in to the instance

Ping the instance


~ ping 172.16.1.1
PING 172.16.1.1 (172.16.1.1) 56(84) bytes of data.
64 bytes from 172.16.1.1: icmp_req=1 ttl=64 time=5.98 ms
64 bytes from 172.16.1.1: icmp_req=2 ttl=64 time=2.00 ms
64 bytes from 172.16.1.1: icmp_req=3 ttl=64 time=3.27 ms

Log in to the instance with the key


~ ssh -i openstack.pem ubuntu@172.16.1.1
The authenticity of host '172.16.1.1 (172.16.1.1)' can't be established.
ECDSA key fingerprint is b8:0b:a6:18:0d:30:06:ea:79:c7:17:e5:29:34:55:39.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '172.16.1.1' (ECDSA) to the list of known hosts.

Simple operations inside the instance


~ ubuntu@server-1:~$ who
ubuntu   pts/0        2013-07-14 13:35 (172.16.1.1)

~ ubuntu@server-1:~$ ifconfig
eth0      Link encap:Ethernet  HWaddr fa:16:3e:22:75:8f
          inet addr:10.0.0.3  Bcast:10.0.0.63  Mask:255.255.255.192
          inet6 addr: fe80::f816:3eff:fe22:758f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:263 errors:0 dropped:0 overruns:0 frame:0
          TX packets:250 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:29823 (29.8 KB)  TX bytes:28061 (28.0 KB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

The nova installation experiment is complete!

8. Troubleshooting

The steps above look like plain command-line work, but in practice many things can go wrong.

1. Version issue: with Ubuntu 12.04.2, exactly the same steps fail at instance creation. The instance never comes up: it goes straight from RUNNING to SHUTDOWN, and neither ping nor ssh can reach it.

2. Version issue: with Ubuntu 13.04, the commands, the Nova database, the Nova configuration files, and the Nova services have all changed from the book's examples, so the book's steps no longer work.

3. Dependency issue: pm-utils is, surprisingly, not declared as a dependency of libvirtd, so it must be installed by hand.
Without it you will see the following errors:


~ sudo cat /var/log/libvirt/libvirtd.log

2013-07-13 12:20:25.511+0000: 9292: info : libvirt version: 0.9.8
2013-07-13 12:20:25.511+0000: 9292: error : virExecWithHook:327 : Cannot find 'pm-is-supported' in path: No such file or directory
2013-07-13 12:20:25.511+0000: 9292: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2013-07-13 12:20:25.653+0000: 9292: error : virExecWithHook:327 : Cannot find 'pm-is-supported' in path: No such file or directory
2013-07-13 12:20:25.653+0000: 9292: warning : lxcCapsInit:77 : Failed to get host power management capabilities
2013-07-13 12:20:25.654+0000: 9292: error : virExecWithHook:327 : Cannot find 'pm-is-supported' in path: No such file or directory
2013-07-13 12:20:25.655+0000: 9292: warning : umlCapsInit:87 : Failed to get host power management capabilities

#fix the pm-is-supported error
~ sudo apt-get -y install pm-utils
~ sudo stop libvirt-bin
~ sudo start libvirt-bin

4. Nova state issue: image registration can hang at untarring forever, with nova image-list showing the status stuck at saving.
This means the image failed to register for some reason. My first attempt got stuck this way because the disk was full, and no error message was shown, which was very frustrating.

Workaround: deregister the stuck image and register it again.


euca-deregister aki-00000001
euca-deregister ami-00000002

Patience is essential; persistence wins.

Please credit the source when reposting:
http://blog.fens.me/vps-nova-setup/


Running Hadoop in the Cloud series: Adding and Removing Hadoop Nodes

The Running Hadoop in the Cloud series shows how to combine virtualization with Hadoop, running a Hadoop cluster on VPS virtual hosts to provide storage and compute services through the cloud.

Hardware keeps getting cheaper: a no-name server with two 24-core CPUs, 48 GB of RAM, and a 2 TB disk now costs under 20,000 RMB. Hosting a few web applications on such a box would clearly be an extravagant waste, and even a single-node hadoop deployment squanders most of its compute. For machines this powerful, making effective use of the compute resources becomes a key question of cost control.

With virtualization, we can split one server into 12 VPS instances, each with 2 CPU cores, 4 GB of RAM, and 40 GB of disk, and with support for reallocating resources. What a great technology! Now we have a 12-node hadoop cluster: let Hadoop run in the cloud, and let the world speed up.

About the author:

  • 张丹(Conan), programmer (Java, R, PHP, Javascript)
  • weibo:@Conan_Z
  • blog: http://blog.fens.me
  • email: bsspirit@gmail.com

Please credit the source when reposting:

http://blog.fens.me/hadoop-clone-add-delete/

clone-add-del

Introduction

The earlier articles in this series covered how to create and manage virtual machines and add hadoop nodes. This article simply consolidates those operations into a step-by-step summary, so that readers without a computing background can follow along.

Table of contents

  1. Add cloned hadoop node c6
  2. Remove node c6
  3. The scripts

 

1. Add cloned hadoop node c6

1. Log in to the host machine and check whether the domain c6.wtmart.com resolves correctly.


~ ssh cos@host.wtmart.com

~ ping c6.wtmart.com
ping: unknown host c6.wtmart.com

2. Log in to dns.wtmart.com and bind the domain name.


~ ssh cos@dns.wtmart.com

~  sudo vi /etc/bind/db.wtmart.com
#add this record
c6      IN      A       192.168.1.35

#restart the DNS server
~ sudo /etc/init.d/bind9 restart
 * Stopping domain name service... bind9                                                              waiting for pid 1418 to die                                                                                              [ OK ]
 * Starting domain name service... bind9

~ exit

3. Back on the host, check again whether c6.wtmart.com resolves correctly.


~ ping c6.wtmart.com -n
PING c6.wtmart.com (192.168.1.35) 56(84) bytes of data.
From 192.168.1.79 icmp_seq=1 Destination Host Unreachable
From 192.168.1.79 icmp_seq=2 Destination Host Unreachable
From 192.168.1.79 icmp_seq=3 Destination Host Unreachable

c6.wtmart.com now resolves to 192.168.1.35; there is just no host behind it yet. Next we create a virtual machine for c6.

4. On the host, clone the virtual machine


~ sudo virt-clone --connect qemu:///system -o hadoop-base -n c6 -f /disk/sdb1/c6.img
Cloning hadoop-base.img               1% [                          ]  42 MB/s | 531 MB     15:53 ETA
Cloning hadoop-base.img                                                        |  40 GB     07:54

5. Open virsh, the virtual machine management console


~ sudo virsh

#check VM status
virsh # list --all
 Id    Name                           State
----------------------------------------------------
 5     server3                        running
 6     server4                        running
 7     d2                             running
 8     r1                             running
 9     server2                        running
 18    server5                        running
 48    c3                             running
 50    c1                             running
 52    c4                             running
 53    c2                             running
 55    c5                             running
 -     c6                             shut off
 -     d1                             shut off
 -     hadoop-base                    shut off
 -     u1210-base                     shut off


#edit the c6 VM definition, attaching the disk partition /dev/sdb10
~ edit c6
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/sdb10'/>
<target dev='vdb' bus='virtio'/>
</disk>

#start c6
~ start c6
Domain c6 started

#open a console into c6
~ console c6

6. Inside c6, run the quick-setup script.


~ pwd
/home/cos

~ ls -l
drwxrwxr-x 2 cos cos 4096 Jul  9 23:50 hadoop
-rw-rw-r-- 1 cos cos 1404 Jul 11 16:50 quick.sh
drwxrwxr-x 7 cos cos 4096 Jul  9 23:31 toolkit

#adjust the VM parameters
~ vi quick.sh
export HOST=c6
export IP=192.168.1.35


#run the script with sudo
~ sudo sh ./quick.sh

====================hostname host============================
====================ip address============================
Rather than invoking init scripts through /etc/init.d, use the service(8)
utility, e.g. service networking restart

Since the script you are attempting to invoke has been converted to an
Upstart job, you may also use the stop(8) and then start(8) utilities,
e.g. stop networking ; start networking. The restart(8) utility is also available.
networking stop/waiting
networking start/running
====================dns============================
====================fdisk mount============================
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x8f02312d.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): Partition number (1-4, default 1): Using default value 1
First sector (2048-379580414, default 2048): Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-379580414, default 379580414): Using default value 379580414

Command (m for help): The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
mke2fs 1.42.5 (29-Jul-2012)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
11862016 inodes, 47447295 blocks
2372364 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
1448 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872

Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

==================== hadoop folder============================
====================ssh============================
Generating public/private rsa key pair.
Your identification has been saved in /home/cos/.ssh/id_rsa.
Your public key has been saved in /home/cos/.ssh/id_rsa.pub.
The key fingerprint is:
55:0d:3c:61:cc:53:e5:68:24:aa:33:18:3b:fc:08:75 cos@c6
The key's randomart image is:
+--[ RSA 2048]----+
|           +*=o..|
|           +*o.o |
|      o E o  oo .|
|     o = o   .   |
|    . = S        |
|     . + o       |
|      . .        |
|                 |
|                 |
+-----------------+


#leave the VM
~ exit

7. Log in to the hadoop master node, c1.wtmart.com


~ ssh cos@c1.wtmart.com

#check cluster status: 5 hadoop nodes running normally
~ hadoop dfsadmin -report
Configured Capacity: 792662536192 (738.22 GB)
Present Capacity: 744482840576 (693.35 GB)
DFS Remaining: 744482676736 (693.35 GB)
DFS Used: 163840 (160 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 5 (5 total, 0 dead)

Name: 192.168.1.30:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 49152 (48 KB)
Non DFS Used: 15181529088 (14.14 GB)
DFS Remaining: 143351558144(133.51 GB)
DFS Used%: 0%
DFS Remaining%: 90.42%
Last contact: Fri Jul 12 23:16:09 CST 2013


Name: 192.168.1.31:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:16:08 CST 2013


Name: 192.168.1.32:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249581568 (7.68 GB)
DFS Remaining: 150283526144(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:16:08 CST 2013


Name: 192.168.1.34:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:16:09 CST 2013


Name: 192.168.1.33:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:16:08 CST 2013

8. Add c6 to the hadoop cluster


~ pwd
/home/cos

~ ls
-rw-rw-r-- 1 cos cos 1078 Jul 12 23:30 clone-node-add.sh
-rw-rw-r-- 1 cos cos  918 Jul 12 23:44 clone-node-del.sh
drwxrwxr-x 2 cos cos 4096 Jul  9 21:42 download
drwxr-xr-x 6 cos cos 4096 Jul  9 23:31 hadoop
drwxrwxr-x 7 cos cos 4096 Jul  9 23:31 toolkit

#edit the configuration parameters
~ vi clone-node-add.sh

#the new node, c6.wtmart.com
export NEW_NODE=c6.wtmart.com
#the slaves list
export SLAVES=c1.wtmart.com:c2.wtmart.com:c3.wtmart.com:c4.wtmart.com:c5.wtmart.com:c6.wtmart.com

#run the script as the current user
~ sh ./clone-node-add.sh
===============sync ssh=========================
Warning: Permanently added 'c6.wtmart.com,192.168.1.35' (ECDSA) to the list of known hosts.
scp c1.wtmart.com
scp c2.wtmart.com
scp c3.wtmart.com
scp c4.wtmart.com
scp c5.wtmart.com
scp c6.wtmart.com
===============sync hadoop slaves=========================
scp c1.wtmart.com
slaves                                                               100%   84     0.1KB/s   00:00
scp c2.wtmart.com
slaves                                                               100%   84     0.1KB/s   00:00
scp c3.wtmart.com
slaves                                                               100%   84     0.1KB/s   00:00
scp c4.wtmart.com
slaves                                                               100%   84     0.1KB/s   00:00
scp c5.wtmart.com
slaves                                                               100%   84     0.1KB/s   00:00
scp c6.wtmart.com
slaves                                                               100%   84     0.1KB/s   00:00
===============restart hadoop cluster=========================
Warning: $HADOOP_HOME is deprecated.

stopping jobtracker
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c5.wtmart.com:
c1.wtmart.com: stopping tasktracker
c5.wtmart.com: stopping tasktracker
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c4.wtmart.com: stopping tasktracker
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c3.wtmart.com: stopping tasktracker
c2.wtmart.com: stopping tasktracker
c6.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c6.wtmart.com:
c6.wtmart.com: no tasktracker to stop
stopping namenode
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: stopping datanode
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c5.wtmart.com:
c6.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c6.wtmart.com:
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c6.wtmart.com: no datanode to stop
c5.wtmart.com: stopping datanode
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c2.wtmart.com: stopping datanode
c3.wtmart.com: stopping datanode
c4.wtmart.com: stopping datanode
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: stopping secondarynamenode
Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-namenode-c1.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c5.wtmart.com:
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c1.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c1.out
c6.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c6.wtmart.com:
c5.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c5.out
c2.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c2.out
c6.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c6.out
c3.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c3.out
c4.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c4.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting secondarynamenode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-secondarynamenode-c1.out
starting jobtracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-jobtracker-c1.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c1.out
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c5.wtmart.com:
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c6.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c6.wtmart.com:
c3.wtmart.com:
c5.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c5.out
c2.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c2.out
c6.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c6.out
c4.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c4.out
c3.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c3.out
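The clone-node-add.sh script itself is not listed here, but its "sync hadoop slaves" step can be sketched as follows. This is a hypothetical reconstruction: the file path and scp destination are assumptions, and the scp loop is left as a comment.

```shell
# Turn the colon-separated SLAVES list into Hadoop's conf/slaves
# file: one hostname per line.
export SLAVES=c1.wtmart.com:c2.wtmart.com:c3.wtmart.com

echo "$SLAVES" | tr ':' '\n' > /tmp/slaves.demo

# The real script would then copy it to every node, roughly:
#   for node in $(cat /tmp/slaves.demo); do
#     scp /tmp/slaves.demo cos@$node:toolkit/hadoop-1.0.3/conf/slaves
#   done
cat /tmp/slaves.demo
```

This explains the per-node "scp c1.wtmart.com … slaves 100%" lines in the output above: the same slaves file is pushed to every member of the cluster before it is restarted.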

9. Verify that the node was added: c6 is now part of the hadoop cluster


#check system processes
~ jps
12019 TaskTracker
11763 SecondaryNameNode
12098 Jps
11878 JobTracker
11633 DataNode
11499 NameNode

#check hadoop nodes
~ hadoop dfsadmin -report
Configured Capacity: 983957319680 (916.38 GB)
Present Capacity: 925863960576 (862.28 GB)
DFS Remaining: 925863768064 (862.28 GB)
DFS Used: 192512 (188 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 6 (6 total, 0 dead)

Name: 192.168.1.30:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 49152 (48 KB)
Non DFS Used: 15181541376 (14.14 GB)
DFS Remaining: 143351545856(133.51 GB)
DFS Used%: 0%
DFS Remaining%: 90.42%
Last contact: Fri Jul 12 23:27:01 CST 2013


Name: 192.168.1.31:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:27:00 CST 2013


Name: 192.168.1.32:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249581568 (7.68 GB)
DFS Remaining: 150283526144(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:27:00 CST 2013


Name: 192.168.1.34:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:27:00 CST 2013


Name: 192.168.1.33:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:27:00 CST 2013


Name: 192.168.1.35:50010
Decommission Status : Normal
Configured Capacity: 191294783488 (178.16 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 9913651200 (9.23 GB)
DFS Remaining: 181381103616(168.92 GB)
DFS Used%: 0%
DFS Remaining%: 94.82%
Last contact: Fri Jul 12 23:27:02 CST 2013

2. Remove node c6

1. Log in to the hadoop master node, c1.wtmart.com


~ ssh cos@c1.wtmart.com

~ pwd
/home/cos

~ ls -l
-rw-rw-r-- 1 cos cos 1078 Jul 12 23:30 clone-node-add.sh
-rw-rw-r-- 1 cos cos  918 Jul 12 23:44 clone-node-del.sh
drwxrwxr-x 2 cos cos 4096 Jul  9 21:42 download
drwxr-xr-x 6 cos cos 4096 Jul  9 23:31 hadoop
drwxrwxr-x 7 cos cos 4096 Jul  9 23:31 toolkit

2. Edit the configuration script


~ vi clone-node-del.sh
export DEL_NODE=c6
export SLAVES=c1.wtmart.com:c2.wtmart.com:c3.wtmart.com:c4.wtmart.com:c5.wtmart.com

#run the script as the current user
~  sh ./clone-node-del.sh
===============sync ssh=========================
scp c1.wtmart.com
scp c2.wtmart.com
scp c3.wtmart.com
scp c4.wtmart.com
scp c5.wtmart.com
===============sync hadoop slaves=========================
scp c1.wtmart.com
slaves                                                               100%   70     0.1KB/s   00:00
scp c2.wtmart.com
slaves                                                               100%   70     0.1KB/s   00:00
scp c3.wtmart.com
slaves                                                               100%   70     0.1KB/s   00:00
scp c4.wtmart.com
slaves                                                               100%   70     0.1KB/s   00:00
scp c5.wtmart.com
slaves                                                               100%   70     0.1KB/s   00:00
===============restart hadoop cluster=========================
Warning: $HADOOP_HOME is deprecated.

stopping jobtracker
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: stopping tasktracker
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c2.wtmart.com: stopping tasktracker
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c5.wtmart.com:
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c5.wtmart.com: stopping tasktracker
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c3.wtmart.com: stopping tasktracker
c4.wtmart.com: stopping tasktracker
stopping namenode
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: stopping datanode
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c5.wtmart.com:
c4.wtmart.com: stopping datanode
c5.wtmart.com: stopping datanode
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c3.wtmart.com: stopping datanode
c2.wtmart.com: stopping datanode
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: stopping secondarynamenode
Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-namenode-c1.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c5.wtmart.com:
c3.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c3.out
c1.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c1.out
c5.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c5.out
c2.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c2.out
c4.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c4.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting secondarynamenode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-secondarynamenode-c1.out
starting jobtracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-jobtracker-c1.out
c4.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c4.wtmart.com:
c5.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c5.wtmart.com:
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c5.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c5.out
c2.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c2.wtmart.com:
c4.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c4.out
c3.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c3.out
c2.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c2.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c1.out

3. Check the hadoop nodes: c6 has been removed


~ hadoop dfsadmin -report
Warning: $HADOOP_HOME is deprecated.

Safe mode is ON
Configured Capacity: 792662536192 (738.22 GB)
Present Capacity: 744482836480 (693.35 GB)
DFS Remaining: 744482672640 (693.35 GB)
DFS Used: 163840 (160 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 5 (5 total, 0 dead)

Name: 192.168.1.30:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 49152 (48 KB)
Non DFS Used: 15181533184 (14.14 GB)
DFS Remaining: 143351554048(133.51 GB)
DFS Used%: 0%
DFS Remaining%: 90.42%
Last contact: Fri Jul 12 23:45:29 CST 2013


Name: 192.168.1.31:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:45:29 CST 2013


Name: 192.168.1.32:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249581568 (7.68 GB)
DFS Remaining: 150283526144(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:45:29 CST 2013


Name: 192.168.1.34:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:45:32 CST 2013


Name: 192.168.1.33:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Fri Jul 12 23:45:32 CST 2013

4. Log in to the host and delete the c6 virtual machine


~ ssh cos@host.wtmart.com
~ sudo virsh

#Force-stop the c6 virtual machine
virsh # destroy c6
Domain c6 destroyed

#Remove the c6 virtual machine definition
virsh # undefine c6
Domain c6 has been undefined

#List the virtual machines; c6 no longer exists
virsh # list --all
 Id    Name                           State
----------------------------------------------------
 5     server3                        running
 6     server4                        running
 7     d2                             running
 8     r1                             running
 9     server2                        running
 18    server5                        running
 48    c3                             running
 50    c1                             running
 52    c4                             running
 53    c2                             running
 55    c5                             running
 -     d1                             shut off
 -     hadoop-base                    shut off
 -     u1210-base                     shut off

5. Delete c6's image file from the physical disk


~ cd /disk/sdb1
~ sudo rm c6.img

This completes the removal of the virtual node c6.
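The steps above remove c6 by force-stopping the VM. On a production HDFS cluster one would normally decommission the datanode first, so its blocks are re-replicated before the machine disappears. A minimal sketch, assuming Hadoop 1.x's `dfs.hosts.exclude` mechanism (the excludes path below is a stand-in for illustration):

```shell
# Graceful decommission sketch (Hadoop 1.x). The excludes path is a
# stand-in; in practice it is whatever dfs.hosts.exclude points to
# in conf/hdfs-site.xml.
EXCLUDES=/tmp/hdfs-excludes
echo "c6.wtmart.com" >> "$EXCLUDES"
# On the namenode one would then run (not executed here):
#   hadoop dfsadmin -refreshNodes
# and wait until `hadoop dfsadmin -report` shows the node as
# Decommissioned before destroying the VM.
```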

3. Implementation scripts

quick.sh


~ vi quick.sh

#!/bin/bash
export HOST=c5
export DOMAIN=$HOST.wtmart.com
export IP=192.168.1.34
export DNS=192.168.1.7

#1. hostname host
echo "====================hostname host============================"
hostname $HOST
echo $HOST >  /etc/hostname
sed -i -e "/127.0.0.1/d" /etc/hosts
sed -i -e 1"i\127.0.0.1 localhost ${HOST}" /etc/hosts

#2. ip
echo "====================ip address============================"
sed -i -e "/address/d;/^iface eth0 inet static/a\address ${IP}" /etc/network/interfaces
/etc/init.d/networking restart

#3. dns
echo "====================dns============================"
echo "nameserver ${DNS}" > /etc/resolv.conf

#4. fdisk mount
echo "====================fdisk mount============================"
(echo n; echo p; echo ; echo ; echo ; echo w) | fdisk /dev/vdb
mkfs -t ext4 /dev/vdb1
mount /dev/vdb1 /home/cos/hadoop
echo "/dev/vdb1 /home/cos/hadoop ext4 defaults 0 0 " >> /etc/fstab

#5. hadoop folder
echo "==================== hadoop folder============================"
mkdir /home/cos/hadoop/data
mkdir /home/cos/hadoop/tmp
chown -R cos:cos /home/cos/hadoop/
chmod 755 /home/cos/hadoop/data
chmod 755 /home/cos/hadoop/tmp

#6. ssh
echo "====================ssh============================"
rm -rf /home/cos/.ssh/*
sudo -u cos ssh-keygen -t rsa -N "" -f /home/cos/.ssh/id_rsa
sudo cat /home/cos/.ssh/id_rsa.pub >> /home/cos/.ssh/authorized_keys
chown -R cos:cos /home/cos/.ssh/
exit

clone-node-add.sh


~ vi clone-node-add.sh

#!/bin/bash
export NEW_NODE=c6.wtmart.com
export PASS=cos
export SLAVES=c1.wtmart.com:c2.wtmart.com:c3.wtmart.com:c4.wtmart.com:c5.wtmart.com:c6.wtmart.com
IFS=:

#sudo apt-get install sshpass
#1. sync ssh
echo "===============sync ssh========================="
sshpass -p ${PASS} scp -o StrictHostKeyChecking=no cos@${NEW_NODE}:/home/cos/.ssh/authorized_keys .
cat authorized_keys >> /home/cos/.ssh/authorized_keys
rm authorized_keys

for SLAVE in $SLAVES
do
        echo scp $SLAVE
        sshpass -p ${PASS} scp /home/cos/.ssh/authorized_keys cos@$SLAVE:/home/cos/.ssh/authorized_keys
done

#2. sync hadoop slaves
echo "===============sync hadoop slaves========================="
rm /home/cos/toolkit/hadoop-1.0.3/conf/slaves
for SLAVE in $SLAVES
do
   echo $SLAVE >> /home/cos/toolkit/hadoop-1.0.3/conf/slaves
done

for SLAVE in $SLAVES
do
        echo scp $SLAVE
        scp /home/cos/toolkit/hadoop-1.0.3/conf/slaves cos@$SLAVE:/home/cos/toolkit/hadoop-1.0.3/conf/slaves
done

#3. restart hadoop cluster
echo "===============restart hadoop cluster========================="
stop-all.sh
start-all.sh
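The script above iterates over the colon-separated SLAVES list by setting `IFS=:`, which makes the unquoted `$SLAVES` expansion split on colons instead of whitespace. A standalone illustration of the trick (example hostnames only):

```shell
#!/bin/bash
# Word-splitting demo: IFS controls how an unquoted expansion is split.
SLAVES=c1.example.com:c2.example.com:c3.example.com
IFS=:
for SLAVE in $SLAVES; do
    echo "$SLAVE"
done
# prints the three hostnames, one per line
```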

clone-node-del.sh


~ vi clone-node-del.sh

#!/bin/bash
export DEL_NODE=c6
export PASS=cos
export SLAVES=c1.wtmart.com:c2.wtmart.com:c3.wtmart.com:c4.wtmart.com:c5.wtmart.com
IFS=:

#0 stop
stop-all.sh

#1. sync ssh
echo "===============sync ssh========================="
sed -i "/cos@${DEL_NODE}/d" /home/cos/.ssh/authorized_keys

for SLAVE in $SLAVES
do
        echo scp $SLAVE
        sshpass -p ${PASS} scp /home/cos/.ssh/authorized_keys cos@$SLAVE:/home/cos/.ssh/authorized_keys
done

#2. sync hadoop slaves
echo "===============sync hadoop slaves========================="
rm /home/cos/toolkit/hadoop-1.0.3/conf/slaves
for SLAVE in $SLAVES
do
   echo $SLAVE >> /home/cos/toolkit/hadoop-1.0.3/conf/slaves
done

for SLAVE in $SLAVES
do
        echo scp $SLAVE
        scp /home/cos/toolkit/hadoop-1.0.3/conf/slaves cos@$SLAVE:/home/cos/toolkit/hadoop-1.0.3/conf/slaves
done

#3. restart hadoop cluster
echo "===============restart hadoop cluster========================="
start-all.sh

These scripts let an administrator quickly add or remove Hadoop nodes.
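One remaining friction point is that the node name and IP are hard-coded in quick.sh, so every new node means editing the script by hand. A small hypothetical helper (not from the original article; the function and file names are assumptions) could stamp out a per-node copy instead:

```shell
#!/bin/bash
# Hypothetical helper: stamp out a per-node copy of quick.sh by
# rewriting the exported HOST and IP lines, instead of editing by hand.
make_quick() {
    local src=$1 host=$2 ip=$3
    sed -e "s/^export HOST=.*/export HOST=${host}/" \
        -e "s/^export IP=.*/export IP=${ip}/" "$src" > "quick-${host}.sh"
}

# Demo with a stand-in for quick.sh (the real one lives on each clone):
cd /tmp
printf 'export HOST=c5\nexport IP=192.168.1.34\n' > quick.sh
make_quick quick.sh c7 192.168.1.36   # hypothetical new node c7
cat quick-c7.sh
# now contains: export HOST=c7 / export IP=192.168.1.36
```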

Please cite the source when reposting:

http://blog.fens.me/hadoop-clone-add-delete/


Running Hadoop in the Cloud series: Clone VM optimization part 1, installation and configuration

The Running Hadoop in the Cloud series explains how to combine virtualization with Hadoop: run a Hadoop cluster on VPS virtual hosts and offer storage and computing as cloud services.

Hardware keeps getting cheaper. A no-name server with two 24-core CPUs, 48 GB of RAM, and a 2 TB disk now costs under 20,000 RMB. Using such a machine for a few web applications would be an extravagant waste, and even a single-node Hadoop deployment squanders most of its computing resources. Making efficient use of such a powerful machine is therefore an important cost-control issue.

With virtualization we can split one server into 12 VPS instances, each with 2 CPU cores, 4 GB of RAM, and a 40 GB disk, and reallocate resources as needed. What a great technology! Now we have a 12-node Hadoop cluster. Run Hadoop in the cloud and let the world accelerate.

About the author:

  • Zhang Dan (Conan), programmer: Java, R, PHP, JavaScript
  • weibo:@Conan_Z
  • blog: http://blog.fens.me
  • email: bsspirit@gmail.com

Please cite the source when reposting:

http://blog.fens.me/hadoop-clone-improve

clone-improve

Foreword

With the virtualized Hadoop environment in place, the next question is how to optimize the system. From an operations perspective I see four starting points: installation, configuration, monitoring, and management.
Toward the goal of one person managing 1,000 nodes, every small optimization is a building block of future success.

I'm working on it...

 

Contents

  1. A brief analysis of system optimization
  2. Optimization problem 1: c1, as the base image, must be shut down for every clone.
  3. Optimization problem 2: too many manual steps.

 

1. A brief analysis of system optimization (10 nodes)

From the operations perspective we just proposed four starting points for optimization: installation, configuration, monitoring, and management.

The system currently runs 2 nodes. Step by step, how do we conveniently get to 10 nodes?
Note: aiming straight for 1,000 nodes would only make us lose our way. If you are already familiar with 1,000-node solutions, feel free to skip this article!

Installation
In short, installation should be simple, ideally a single command or script. In our virtual environment, installing a Hadoop node just means creating a new virtual machine: one command!

But in the current setup, c1 serves as the base image and must be shut down for every clone, which means the Hadoop environment goes down too. That is not what we want; we will discuss how to improve it.

Configuration
When a cloned Hadoop node is created, its hostname, IP, DNS, disk mounts, and so on are all copied from the base image. These settings must differ on every node, so they have to be changed by hand.

So we should write a script that changes these settings automatically, reducing manual edits and complexity, and ensuring a newly added node joins the existing Hadoop cluster smoothly.

Monitoring
Since we use KVM, we can watch each node directly from the host's VM management console, but that information is not enough. We also need other system monitoring tools, plus various Hadoop monitoring tools.

We will cover monitoring in a later article.

Management
To make the whole Hadoop environment easier to use, managing it through OpenStack would be an even better option.

This article will not go into that; we will return to it later.

2. Optimization problem 1: c1, as the base image, must be shut down for every clone.

up1

c1 is the Hadoop cluster's namenode and should not be stopped, so we build a new base image named hadoop-base and produce clones from it instead.

First stop Hadoop, then, following the method in Running Hadoop in the Cloud: Cloning Virtual Machines to Add Hadoop Nodes, clone a new base image, hadoop-base.


~ sudo virt-clone --connect qemu:///system -o c1 -n hadoop-base -f /disk/sdb1/hadoop-base.img

List the virtual machines


virsh # list --all
 Id    Name                           State
----------------------------------------------------
 5     server3                        running
 6     server4                        running
 7     d2                             running
 8     r1                             running
 9     server2                        running
 18    server5                        running
 48    c3                             running
 50    c1                             running
 -     c2                             shut off
 -     d1                             shut off
 -     hadoop-base                    shut off
 -     u1210-base                     shut off

Restart the Hadoop cluster's two nodes, c1 and c3.

Check the Hadoop nodes from c1


~ hadoop dfsadmin -report
Warning: $HADOOP_HOME is deprecated.

Safe mode is ON
Configured Capacity: 317066272768 (295.29 GB)
Present Capacity: 293635162112 (273.47 GB)
DFS Remaining: 293635084288 (273.47 GB)
DFS Used: 77824 (76 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Name: 192.168.1.30:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 49152 (48 KB)
Non DFS Used: 15181529088 (14.14 GB)
DFS Remaining: 143351558144(133.51 GB)
DFS Used%: 0%
DFS Remaining%: 90.42%
Last contact: Thu Jul 11 13:00:50 CST 2013

Name: 192.168.1.32:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249581568 (7.68 GB)
DFS Remaining: 150283526144(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Thu Jul 11 13:00:52 CST 2013

Create the clone c4 from the new base image hadoop-base


~ sudo virt-clone --connect qemu:///system -o hadoop-base -n c4 -f /disk/sdb1/c4.img

#Attach the disk partition /dev/sdb8
~ sudo virsh
virsh # edit c4
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/sdb8'/>
<target dev='vdb' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>

#Start c4
virsh # start c4

#Connect to c4's console
virsh # console c4

The base image is now hadoop-base instead of c1, so c1 no longer needs to be shut down.

3. Optimization problem 2: too many manual steps.

We split the manual work into two scripts:

  • Configuring the newly cloned VM
  • Adding the clone to the cluster

Configuring the newly cloned VM
After logging in to c4, a script replaces the manual configuration.
The script performs the following six configuration steps:

  1. hostname host
  2. ip
  3. dns
  4. fdisk mount
  5. hadoop folder
  6. ssh

For an explanation of these six steps, see Running Hadoop in the Cloud: Cloning Virtual Machines to Add Hadoop Nodes.


#!/bin/bash
export HOST=c4
export DOMAIN=$HOST.wtmart.com
export IP=192.168.1.33
export DNS=192.168.1.7

#1. hostname host
echo "====================hostname host============================"
hostname $HOST
echo $HOST >  /etc/hostname
sed -i -e "/127.0.0.1/d" /etc/hosts
sed -i -e 1"i\127.0.0.1 localhost ${HOST}" /etc/hosts

#2. ip
echo "====================ip address============================"
sed -i -e "/address/d;/^iface eth0 inet static/a\address ${IP}" /etc/network/interfaces
/etc/init.d/networking restart

#3. dns
echo "====================dns============================"
echo "nameserver ${DNS}" > /etc/resolv.conf

#4. fdisk mount
echo "====================fdisk mount============================"
(echo n; echo p; echo ; echo ; echo ; echo w) | fdisk /dev/vdb
mkfs -t ext4 /dev/vdb1
mount /dev/vdb1 /home/cos/hadoop
echo "/dev/vdb1 /home/cos/hadoop ext4 defaults 0 0 " >> /etc/fstab

#5. hadoop folder
echo "==================== hadoop folder============================"
mkdir /home/cos/hadoop/data
mkdir /home/cos/hadoop/tmp
chown -R cos:cos /home/cos/hadoop/
chmod 755 /home/cos/hadoop/data
chmod 755 /home/cos/hadoop/tmp

#6. ssh
echo "====================ssh============================"
rm -rf /home/cos/.ssh/*
sudo -u cos ssh-keygen -t rsa -N "" -f /home/cos/.ssh/id_rsa 
sudo -u cos cat /home/cos/.ssh/id_rsa.pub >> /home/cos/.ssh/authorized_keys

exit

Adding the clone to the cluster
Back on node c1, a script completes bringing c4 into the cluster.

The script performs the following three steps:

  1. Sync the ssh public keys
  2. Sync Hadoop's slaves file
  3. Add c4 to the cluster

The script below uses sshpass; install it beforehand.


sudo apt-get install sshpass

The script:


#!/bin/bash
export NEW_NODE=c4.wtmart.com
export PASS=cos
export SLAVES=c1.wtmart.com:c3.wtmart.com:c4.wtmart.com
IFS=:

#1. sync ssh
sshpass -p ${PASS} scp -o StrictHostKeyChecking=no cos@${NEW_NODE}:/home/cos/.ssh/authorized_keys .
cat authorized_keys >> /home/cos/.ssh/authorized_keys

for SLAVE in $SLAVES
do
	echo scp $SLAVE
	sshpass -p ${PASS} scp /home/cos/.ssh/authorized_keys cos@$SLAVE:/home/cos/.ssh/authorized_keys
done

#2. sync hadoop slaves

rm /home/cos/toolkit/hadoop-1.0.3/conf/slaves
for SLAVE in $SLAVES
do
   echo $SLAVE >> /home/cos/toolkit/hadoop-1.0.3/conf/slaves
done

for SLAVE in $SLAVES
do
	echo scp $SLAVE
	scp /home/cos/toolkit/hadoop-1.0.3/conf/slaves cos@$SLAVE:/home/cos/toolkit/hadoop-1.0.3/conf/slaves
done

#3. restart hadoop cluster
stop-all.sh
start-all.sh

Restart the Hadoop cluster; the new node c4 has joined the cluster.


~ hadoop dfsadmin -report

Safe mode is ON
Configured Capacity: 475598360576 (442.94 GB)
Present Capacity: 443917721600 (413.43 GB)
DFS Remaining: 443917615104 (413.43 GB)
DFS Used: 106496 (104 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)

Name: 192.168.1.30:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 49152 (48 KB)
Non DFS Used: 15181529088 (14.14 GB)
DFS Remaining: 143351558144(133.51 GB)
DFS Used%: 0%
DFS Remaining%: 90.42%
Last contact: Thu Jul 11 15:55:57 CST 2013

Name: 192.168.1.32:50010
Decommission Status : Normal
Configured Capacity: 158533136384 (147.65 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249581568 (7.68 GB)
DFS Remaining: 150283526144(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Thu Jul 11 15:55:56 CST 2013

Name: 192.168.1.33:50010
Decommission Status : Normal
Configured Capacity: 158532087808 (147.64 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 8249528320 (7.68 GB)
DFS Remaining: 150282530816(139.96 GB)
DFS Used%: 0%
DFS Remaining%: 94.8%
Last contact: Thu Jul 11 15:55:58 CST 2013

With these two scripts, spinning up 10 to 20 nodes takes little more than the time to copy files.
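At that scale the virt-clone step itself is also worth looping. A dry-run sketch that only prints the clone commands for a batch of hypothetical nodes c5..c8 (names and image paths follow the conventions above but are assumptions):

```shell
#!/bin/bash
# Print the clone commands for a batch of new nodes (dry run; pipe the
# output to sh on the host to actually execute them).
for i in 5 6 7 8; do
    echo "sudo virt-clone --connect qemu:///system" \
         "-o hadoop-base -n c${i} -f /disk/sdb1/c${i}.img"
done
```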

We will keep digging into these optimization problems...

Please cite the source when reposting:

http://blog.fens.me/hadoop-clone-improve


Running Hadoop in the Cloud series: Cloning virtual machines to add Hadoop nodes


Please cite the source when reposting:

http://blog.fens.me/hadoop-clone-node/

clone-guest

Foreword

Virtualization lets us easily add or remove a virtual machine. Hadoop, by contrast, is complex to install, configure, operate, and manage; if virtualization can lower those operational costs, what a happy thing that would be! Imagine: if one person could manage 1,000 Hadoop nodes, even a small company could casually build a computing cluster as powerful as Baidu's or Alibaba's. The world might become even more wonderful!

Of course, this article is not about how one person manages 1,000 Hadoop nodes. But I will introduce one approach: adding Hadoop nodes by cloning virtual machines. Perhaps, through your own practice, you can turn it into a plan for one person operating a 1,000-node cluster.

Contents

  1. System environment
  2. Cloning virtual machines
  3. Completing a two-node Hadoop cluster

1. System environment

We continue with the environment from the previous article, Running Hadoop in the Cloud: Creating the Hadoop Base Virtual Machine.

We have already created the Hadoop base VM c1. Next we will use cloning to create four clone VMs: c2, c3, c4, and c5.

2. Cloning virtual machines

On the host, open the VM manager and check c1's status.


~ sudo virsh

virsh # list
 Id    Name                           State
----------------------------------------------------
 5     server3                        running
 6     server4                        running
 7     d2                             running
 8     r1                             running
 9     server2                        running
 18    server5                        running
 42    c1                             running

c1 is running. Since c2 was created earlier, we will use c3 as the example.

Create the clone c3


~ sudo virt-clone --connect qemu:///system -o c1 -n c3 -f /disk/sdb1/c3.img
ERROR    Domain with devices to clone must be paused or shutoff.

Shut down c1 and clone again


virsh # destroy c1
Domain c1 destroyed

~ sudo virt-clone --connect qemu:///system -o c1 -n c3 -f /disk/sdb1/c3.img
ERROR    A disk path must be specified to clone '/dev/sdb5'

This error is caused by the attached disk partition. (Even the all-powerful Google turns up no explanation for it.)

Next steps:

  1. Restart c1 and comment out the /etc/fstab entry that auto-mounts /dev/vdb1 (left as an exercise)
  2. Detach the disk partition /dev/sdb5 assigned to c1

virsh # edit c1

<!-- 
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/sdb5'/>
<target dev='vdb' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>
-->

Create the clone c3 again


~ sudo virt-clone --connect qemu:///system -o c1 -n c3 -f /disk/sdb1/c3.img
Cloning c1.img                             1% [                               ]  47 MB/s | 426 MB     14:14 ETA
Cloning c1.img                                                                           |  40 GB     08:18

Clone 'c3' created successfully.

Attach the disk partition /dev/sdb7 to the clone c3, and start c3


virsh # edit c3

<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/sdb7'/>
<target dev='vdb' bus='virtio'/>
</disk>

virsh # start c3

virsh # console c3
Connected to domain c3
Escape character is ^]

Ubuntu 12.10 c1 ttyS0

c1 login: cos
Password:
Last login: Wed Jul 10 12:00:44 CST 2013 from 192.168.1.79 on pts/0
Welcome to Ubuntu 12.10 (GNU/Linux 3.5.0-32-generic x86_64)

 * Documentation:  https://help.ubuntu.com/
New release '13.04' available.
Run 'do-release-upgrade' to upgrade to it.

cos@c1:~$

A few things need to be changed on the clone c3:

  1. hostname, hosts
  2. Static IP
  3. DNS
  4. Auto-mount /dev/vdb1
  5. Hadoop storage directory permissions
  6. Passwordless ssh
  7. Hadoop master and slaves

1. hostname,hosts


~ sudo hostname c3
~ sudo vi /etc/hostname
c3

~ sudo vi /etc/hosts
127.0.1.1       c3

2. Static IP


~ sudo vi /etc/network/interfaces
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
#iface eth0 inet dhcp
iface eth0 inet static
address 192.168.1.32
netmask 255.255.255.0
gateway 192.168.1.1

After changing the IP, reboot and connect over ssh.


~ ssh cos@c3.wtmart.com

~ ifconfig
eth0      Link encap:Ethernet  HWaddr 52:54:00:06:b2:3a
          inet addr:192.168.1.32  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::5054:ff:fe06:b23a/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:42 errors:0 dropped:0 overruns:0 frame:0
          TX packets:33 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6084 (6.0 KB)  TX bytes:4641 (4.6 KB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:34 errors:0 dropped:0 overruns:0 frame:0
          TX packets:34 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2560 (2.5 KB)  TX bytes:2560 (2.5 KB)

3. DNS (internal DNS)


~ sudo vi /etc/resolv.conf
nameserver 192.168.1.7

~ ping c1.wtmart.com
PING c1.wtmart.com (192.168.1.30) 56(84) bytes of data.
64 bytes from 192.168.1.30: icmp_req=1 ttl=64 time=0.485 ms
64 bytes from 192.168.1.30: icmp_req=2 ttl=64 time=0.552 ms

4. Auto-mount /dev/vdb1


~ sudo mount /dev/vdb1 /home/cos/hadoop
~ sudo vi /etc/fstab
/dev/vdb1        /home/cos/hadoop      ext4    defaults 0       0

~ df -h
Filesystem              Size  Used Avail Use% Mounted on
/dev/mapper/u1210-root   36G  2.4G   32G   7% /
udev                    2.0G  4.0K  2.0G   1% /dev
tmpfs                   791M  224K  791M   1% /run
none                    5.0M     0  5.0M   0% /run/lock
none                    2.0G     0  2.0G   0% /run/shm
none                    100M     0  100M   0% /run/user
/dev/vda1               228M   29M  188M  14% /boot
/dev/vdb                148G  188M  140G   1% /home/cos/hadoop

5. Hadoop storage directory permissions


~ sudo chown -R cos:cos hadoop/

~ mkdir /home/cos/hadoop/data
~ mkdir /home/cos/hadoop/tmp

~ sudo chmod 755 /home/cos/hadoop/data
~ sudo chmod 755 /home/cos/hadoop/tmp

6. Passwordless ssh


~ rm -rf ~/.ssh/

~ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/cos/.ssh/id_rsa):
Created directory '/home/cos/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/cos/.ssh/id_rsa.
Your public key has been saved in /home/cos/.ssh/id_rsa.pub.
The key fingerprint is:
54:1f:51:5f:5c:1d:ac:8a:9d:8f:fd:da:65:7e:f9:8d cos@c3
The key's randomart image is:
+--[ RSA 2048]----+
|          . oooo*|
|         . . . o+|
|        .   . . .|
|       .     .   |
|        S o o    |
|         . +     |
|            +   +|
|           . o.=+|
|             .Eo*|
+-----------------+
~ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

~ cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCtIK3+hGeJQOigIN1ydDVpUzFeg/urnX2DaHhuv5ik8bGGDZWmSA+qPWAZQoZFBGqGMTshvMQeksjA2WiINbFpQCGuxSFx5a9Ad0XU8bGDFi+yYfKlp1ROZ+7Jz1+cO4tjrCX4Ncr3nZlddl6Uh/yMYU+iuRQ4Ga7GXgeYoSf7C5vQmzkYija0XNa8wFr0+aeD3hau6s8dWwsw7Dn/xVA3eMqK+4LDGH6dkn6tc6nVyNbzofXdtPOsCkwHpwozwuRBL37LYcmELe+oT/GifWf0Qp4rQD/9ObtHkhKrSW45bRH/WzvkyNxl04dKlIj26zIsh9zjMHF8o0ce+zjUl7aD cos@c3

7. Hadoop master and slaves


~ cd /home/cos/toolkit/hadoop-1.0.3/conf
~  vi slaves
c1.wtmart.com
c3.wtmart.com

3. Completing a two-node Hadoop cluster

Restart node c1 and configure the two-node Hadoop cluster


~ ssh cos@c1.wtmart.com
~ sudo mount /dev/vdb1 /home/cos/hadoop

Configure slaves


~ cd /home/cos/toolkit/hadoop-1.0.3/conf
~  vi slaves
c1.wtmart.com
c3.wtmart.com

DNS (internal DNS)


~ sudo vi /etc/resolv.conf
nameserver 192.168.1.7

~  ping c3.wtmart.com
PING c3.wtmart.com (192.168.1.32) 56(84) bytes of data.
64 bytes from 192.168.1.32: icmp_req=1 ttl=64 time=0.673 ms
64 bytes from 192.168.1.32: icmp_req=2 ttl=64 time=0.429 ms

Exchange ssh public keys
Copy c3's authorized_keys into c1's authorized_keys


~ scp cos@c3.wtmart.com:/home/cos/.ssh/authorized_keys .
~ cat authorized_keys >> /home/cos/.ssh/authorized_keys

~ cat /home/cos/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCuWCgfB8qSyEpKmUwnXvWtiCbzza1oD8oM3jm3EJzBRivEC/QV3k8v8zOHt2Cf4H6hYIvYHoSMilA2Wqbh1ny/70zcIYrwFiX5QvSjbEfsj7UmvxevxjB1/5F66TRLN/PyTiw3FmPYLkxvTP8CET02D2cgAN0n+XGTXQaaBqBElQGuiOJUAIvMUl0yg3mH7eP8TRlS4qYpllh04kirSbkOm6IYRDtGsrb90ew63l6F/MpPk/H6DVGC23PnD7ZcMr7VFyCkNPNqcEBwbL8qL1Hhnf7Lvlinp3M3zVU0Aet3TjkgwvcLPX8BWmrguVdBJJ3yqcocylh0YKN9WImAhVm7 cos@c1
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDU4sIHWxuyicIwFbRpDcoegDfSaUhmaBTSFSVTd/Uk8w0fTUcERNx9bZrtuiiPsOImvsBy9hN2Au/x36Jp4hO8DFxzqTNBcfD9xcEgOzHGqH7ZC5qMqx3Poy6DzN/rk9t/+wXyNCfR4kRhrPBNiF6gMDMTRQLpKK9VnY0nQ6FvOQ407gMAsyf0Fn31sHZJLtM4/kxsSOEsSNIk1V+gbIxJv2fJ/AON0/iW1zu7inUC/+YvEuTLClnUMzqFb/xKp25XxSV6a5YzThxs58KO5JCRq2Kk/SM0GSmCSkjKImUYDDjmi1P6wbrd4mh/4frQr1DqYyPeHE4UlHXD90agw767 cos@c3

Then distribute c1's authorized_keys back to c3


~ scp /home/cos/.ssh/authorized_keys cos@c3.wtmart.com:/home/cos/.ssh/authorized_keys

Start the Hadoop cluster from c1


~ start-all.sh

starting namenode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-namenode-c1.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c1.out
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c3.wtmart.com: starting datanode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-datanode-c3.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting secondarynamenode, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-secondarynamenode-c1.out
starting jobtracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-jobtracker-c1.out
c3.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c3.wtmart.com:
c3.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c3.out
c1.wtmart.com: Warning: $HADOOP_HOME is deprecated.
c1.wtmart.com:
c1.wtmart.com: starting tasktracker, logging to /home/cos/toolkit/hadoop-1.0.3/libexec/../logs/hadoop-cos-tasktracker-c1.out

Check the processes on c1


~ jps
8861 JobTracker
8486 NameNode
8750 SecondaryNameNode
9123 Jps
9001 TaskTracker
8612 DataNode

Check the processes on c3


~ jps
3180 TaskTracker
3074 DataNode
3275 Jps

That gives us a two-node Hadoop cluster.
The manual process above is fairly involved; our next task is to write a script that automates it.

Please cite the source when reposting:

http://blog.fens.me/hadoop-clone-node/
