OpenStack HA Cluster, Part 3: Pacemaker and Corosync

Hostnames must resolve between all nodes

[root@controller1 ~]# cat /etc/hosts

192.168.17.149  controller1

192.168.17.141  controller2

192.168.17.166  controller3

192.168.17.111  demo.open-stack.cn

The nodes must trust each other so that passwordless SSH login works

[root@controller1 ~]# ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

20:79:d4:a4:9f:8b:75:cf:12:58:f4:47:a4:c1:29:f3 root@controller1

The key's randomart image is:

+--[ RSA 2048]----+

|      .o. …oo  |

|     o …o.o+   |

|    o +   .+o .  |

|     o o +  E.   |

|        S o      |

|       o o +     |

|      . . . o    |

|           .     |

|                 |

+-----------------+

[root@controller1 ~]# ssh-copy-id controller2

[root@controller1 ~]# ssh-copy-id controller3
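The per-node ssh-copy-id calls above can be collapsed into a loop (a sketch, assuming the same hostnames from /etc/hosts; each iteration still prompts for that node's root password):

```shell
# Distribute the local root public key to every other controller.
for node in controller2 controller3; do
    ssh-copy-id "root@${node}"
done
```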

Configure the YUM repository

# vim /etc/yum.repos.d/ha-clustering.repo

[network_ha-clustering_Stable]

name=Stable High Availability/Clustering packages (CentOS-7)

type=rpm-md

baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/

gpgcheck=0

gpgkey=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/repodata/repomd.xml.key

enabled=1

This repository can conflict with other repos. Set enabled=0 first; if afterwards only the crmsh package is still missing, set enabled=1 again and install it from here.
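One way to apply that workaround without editing the repo file by hand is yum-config-manager from yum-utils (a sketch, using the repo id defined above):

```shell
# Disable the opensuse repo while installing the main packages,
# then re-enable it only for crmsh.
yum-config-manager --disable network_ha-clustering_Stable
yum install -y pacemaker pcs resource-agents corosync fence-agents-all
yum-config-manager --enable network_ha-clustering_Stable
yum install -y crmsh
```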

Corosync download location (latest version at the time of writing: 2.4.2):

http://build.clusterlabs.org/corosync/releases/

http://build.clusterlabs.org/corosync/releases/corosync-2.4.2.tar.gz

[root@controller1 ~]# ansible controller -m copy -a "src=/etc/yum.repos.d/ha-clustering.repo dest=/etc/yum.repos.d/"

Install the packages

# yum install -y pacemaker pcs resource-agents cifs-utils quota psmisc corosync fence-agents-all lvm2

#  yum install crmsh  -y

Enable and start pcsd, and confirm it is running

# systemctl enable pcsd

# systemctl enable corosync

# systemctl start pcsd

# systemctl status pcsd

[root@controller2 ~]# pacemakerd --version

Pacemaker 1.1.15-11.el7_3.2

Written by Andrew Beekhof

[root@controller1 ~]# ansible controller -m command -a "pacemakerd --version"

Set the hacluster password

[all nodes] # echo zoomtech | passwd --stdin hacluster

[root@controller1 ~]# ansible controller -m shell -a "echo zoomtech | passwd --stdin hacluster"

# passwd hacluster

Edit corosync.conf

[root@controller3 ~]# vim /etc/corosync/corosync.conf

totem {

        version: 2

        secauth: off

        cluster_name: openstack-cluster

        transport: udpu

}

nodelist {

        node {

                ring0_addr: controller1

                nodeid: 1

        }

        node {

                ring0_addr: controller2

                nodeid: 2

        }

        node {

                ring0_addr: controller3

                nodeid: 3

        }

}

logging {

        to_logfile: yes

        logfile: /var/log/cluster/corosync.log

        to_syslog: yes

}

quorum {

        provider: corosync_votequorum

}
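corosync_votequorum grants quorum to the partition holding a strict majority of votes; with one vote per node that is floor(n/2)+1, so this 3-node cluster needs 2 votes and survives a single node failure. A quick sanity check of that arithmetic (hypothetical helper, not part of corosync):

```shell
# Minimum votes needed for quorum among n one-vote nodes.
quorum() { echo $(( $1 / 2 + 1 )); }

quorum 3   # 3-node cluster: 2 votes needed
quorum 5   # 5-node cluster: 3 votes needed
```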

[root@controller1 ~]# scp /etc/corosync/corosync.conf controller2:/etc/corosync/

[root@controller1 ~]# scp /etc/corosync/corosync.conf controller3:/etc/corosync/

[root@controller1 corosync]# ansible controller -m copy -a "src=corosync.conf dest=/etc/corosync/"

Create the cluster

Authenticate the cluster nodes with pcs

[root@controller1 ~]# pcs cluster auth controller1 controller2 controller3 -u hacluster -p zoomtech --force

controller3: Authorized

controller2: Authorized

controller1: Authorized

Now create the cluster and add the nodes. Note that the cluster name must not exceed 15 characters.

[root@controller1 ~]# pcs cluster setup --force --name openstack-cluster controller1 controller2 controller3

Destroying cluster on nodes: controller1, controller2, controller3…

controller3: Stopping Cluster (pacemaker)…

controller2: Stopping Cluster (pacemaker)…

controller1: Stopping Cluster (pacemaker)…

controller2: Successfully destroyed cluster

controller1: Successfully destroyed cluster

controller3: Successfully destroyed cluster

Sending cluster config files to the nodes…

controller1: Succeeded

controller2: Succeeded

controller3: Succeeded

Synchronizing pcsd certificates on nodes controller1, controller2, controller3…

controller3: Success

controller2: Success

controller1: Success

Restarting pcsd on the nodes in order to reload the certificates…

controller3: Success

controller2: Success

controller1: Success

Start the cluster

[root@controller1 ~]# pcs cluster enable –all

controller1: Cluster Enabled

controller2: Cluster Enabled

controller3: Cluster Enabled

[root@controller1 ~]# pcs cluster start –all

controller2: Starting Cluster…

controller1: Starting Cluster…

controller3: Starting Cluster…

Check the cluster status

[root@controller1 corosync]# ansible controller -m command -a "pcs cluster status"

[root@controller1 ~]# pcs cluster status

Cluster Status:

 Stack: corosync

 Current DC: controller3 (version 1.1.15-11.el7_3.2-e174ec8) - partition with quorum

 Last updated: Fri Feb 17 10:39:38 2017        Last change: Fri Feb 17 10:39:29 2017 by hacluster via crmd on controller3

 3 nodes and 0 resources configured

PCSD Status:

  controller2: Online

  controller3: Online

  controller1: Online

[root@controller1 corosync]# ansible controller -m command -a "pcs status"

[root@controller1 ~]# pcs status

Cluster name: openstack-cluster

Stack: corosync

Current DC: controller2 (version 1.1.15-11.el7_3.2-e174ec8) - partition with quorum

Last updated: Thu Mar  2 17:07:34 2017        Last change: Thu Mar  2 01:44:44 2017 by root via cibadmin on controller1

3 nodes and 1 resource configured

Online: [ controller1 controller2 controller3 ]

Full list of resources:

 vip    (ocf::heartbeat:IPaddr2):    Started controller2

Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

Check the cluster status with crm_mon

[root@controller1 corosync]# ansible controller -m command -a "crm_mon -1"

[root@controller1 ~]# crm_mon -1

Stack: corosync

Current DC: controller2 (version 1.1.15-11.el7_3.2-e174ec8) - partition with quorum

Last updated: Wed Mar  1 17:54:04 2017          Last change: Wed Mar  1 17:44:38 2017 by root via cibadmin on controller1

3 nodes and 1 resource configured

Online: [ controller1 controller2 controller3 ]

Active resources:

vip     (ocf::heartbeat:IPaddr2):    Started controller1

Check the pacemaker processes

[root@controller1 ~]# ps aux | grep pacemaker

root      75900  0.2  0.5 132632  9216 ?        Ss   10:39   0:00 /usr/sbin/pacemakerd -f

haclust+  75901  0.3  0.8 135268 15376 ?        Ss   10:39   0:00 /usr/libexec/pacemaker/cib

root      75902  0.1  0.4 135608  7920 ?        Ss   10:39   0:00 /usr/libexec/pacemaker/stonithd

root      75903  0.0  0.2 105092  5020 ?        Ss   10:39   0:00 /usr/libexec/pacemaker/lrmd

haclust+  75904  0.0  0.4 126924  7636 ?        Ss   10:39   0:00 /usr/libexec/pacemaker/attrd

haclust+  75905  0.0  0.2 117040  4560 ?        Ss   10:39   0:00 /usr/libexec/pacemaker/pengine

haclust+  75906  0.1  0.5 145328  8988 ?        Ss   10:39   0:00 /usr/libexec/pacemaker/crmd

root      75997  0.0  0.0 112648   948 pts/0    R+   10:40   0:00 grep --color=auto pacemaker

Check the corosync ring status

[root@controller1 ~]# corosync-cfgtool -s

Printing ring status.

Local node ID 1

RING ID 0

    id    = 192.168.17.132

    status    = ring 0 active with no faults

[root@controller2 corosync]# corosync-cfgtool -s

Printing ring status.

Local node ID 2

RING ID 0

    id    = 192.168.17.146

    status    = ring 0 active with no faults

[root@controller3 ~]# corosync-cfgtool -s

Printing ring status.

Local node ID 3

RING ID 0

    id    = 192.168.17.138

    status    = ring 0 active with no faults

[root@controller1 ~]# corosync-cmapctl | grep members

runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0

runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.17.132)

runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1

runtime.totem.pg.mrp.srp.members.1.status (str) = joined

runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0

runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.17.146)

runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1

runtime.totem.pg.mrp.srp.members.2.status (str) = joined

runtime.totem.pg.mrp.srp.members.3.config_version (u64) = 0

runtime.totem.pg.mrp.srp.members.3.ip (str) = r(0) ip(192.168.17.138)

runtime.totem.pg.mrp.srp.members.3.join_count (u32) = 1

runtime.totem.pg.mrp.srp.members.3.status (str) = joined

Check corosync membership

[root@controller1 ~]# pcs status corosync

Membership information

----------------------

    Nodeid      Votes Name

         1          1 controller1 (local)

         3          1 controller3

         2          1 controller2

[root@controller2 corosync]# pcs status corosync

Membership information

----------------------

    Nodeid      Votes Name

         1          1 controller1

         3          1 controller3

         2          1 controller2 (local)

[root@controller3 ~]# pcs status corosync

Membership information

----------------------

    Nodeid      Votes Name

         1          1 controller1

         3          1 controller3 (local)

         2          1 controller2

[root@controller1 ~]# crm_verify -L -V

   error: unpack_resources:    Resource start-up disabled since no STONITH resources have been defined

   error: unpack_resources:    Either configure some or disable STONITH with the stonith-enabled option

   error: unpack_resources:    NOTE: Clusters with shared data need STONITH to ensure data integrity

Errors found during check: config not valid

[root@controller1 ~]#

[root@controller1 ~]# pcs property set stonith-enabled=false

[root@controller1 ~]# pcs property set no-quorum-policy=ignore

[root@controller1 ~]# crm_verify -L -V

[root@controller1 corosync]# ansible controller -m command -a "pcs property set stonith-enabled=false"

[root@controller1 corosync]# ansible controller -m command -a “pcs property set no-quorum-policy=ignore”

[root@controller1 corosync]# ansible controller -m command -a “crm_verify -L -V”

Configure the VIP

[root@controller1 ~]# crm

crm(live)# configure

crm(live)configure# show

node 1: controller1

node 2: controller2

node 3: controller3

property cib-bootstrap-options: \

    have-watchdog=false \

    dc-version=1.1.15-11.el7_3.2-e174ec8 \

    cluster-infrastructure=corosync \

    cluster-name=openstack-cluster \

    stonith-enabled=false \

    no-quorum-policy=ignore

crm(live)configure# primitive vip ocf:heartbeat:IPaddr2 params ip=192.168.17.111 cidr_netmask=24 nic=ens37 op start interval=0s timeout=20s op stop interval=0s timeout=20s op monitor interval=30s meta priority=100

crm(live)configure# show

node 1: controller1

node 2: controller2

node 3: controller3

primitive vip IPaddr2 \

    params ip=192.168.17.111 cidr_netmask=24 nic=ens37 \

    op start interval=0s timeout=20s \

    op stop interval=0s timeout=20s \

    op monitor interval=30s \

    meta priority=100

property cib-bootstrap-options: \

    have-watchdog=false \

    dc-version=1.1.15-11.el7_3.2-e174ec8 \

    cluster-infrastructure=corosync \

    cluster-name=openstack-cluster \

    stonith-enabled=false \

    no-quorum-policy=ignore

crm(live)configure# commit

crm(live)configure# exit
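The same VIP resource could also be created with pcs instead of crmsh (an untested equivalent sketch; the parameters mirror the crm primitive above):

```shell
pcs resource create vip ocf:heartbeat:IPaddr2 \
    ip=192.168.17.111 cidr_netmask=24 nic=ens37 \
    op start interval=0s timeout=20s \
    op stop interval=0s timeout=20s \
    op monitor interval=30s \
    meta priority=100
```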

Confirm the VIP is now bound to the ens37 interface

[root@controller1 ~]# ip a

4: ens37: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

    link/ether 00:0c:29:ff:8b:4b brd ff:ff:ff:ff:ff:ff

    inet 192.168.17.141/24 brd 192.168.17.255 scope global dynamic ens37

       valid_lft 2388741sec preferred_lft 2388741sec

    inet 192.168.17.111/24 brd 192.168.17.255 scope global secondary ens37

       valid_lft forever preferred_lft forever

The interface name specified above must be identical on all three nodes; otherwise failover will break, because the VIP cannot be brought up on the target node.
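A quick way to confirm the interface name matches everywhere before creating the resource (a sketch using the same ansible inventory as above):

```shell
# Should print one ens37 line per controller; a "does not exist"
# error means that node needs a different nic= value.
ansible controller -m command -a "ip -o link show ens37"
```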

[root@controller1 ~]# crm status

Stack: corosync

Current DC: controller1 (version 1.1.15-11.el7_3.2-e174ec8) - partition with quorum

Last updated: Wed Feb 22 11:42:07 2017        Last change: Wed Feb 22 11:22:56 2017 by root via cibadmin on controller1


3 nodes and 1 resource configured


Online: [ controller1 controller2 controller3 ]


Full list of resources:


 vip    (ocf::heartbeat:IPaddr2):    Started controller1
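To verify that the VIP actually fails over, put the node currently running it into standby and watch the resource move (a sketch; unstandby restores the node afterwards):

```shell
pcs cluster standby controller1     # force resources off controller1
crm_mon -1 | grep vip               # vip should now show Started on another node
pcs cluster unstandby controller1   # bring controller1 back into rotation
```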

Verify that the Corosync engine started correctly

[root@controller1 ~]# grep -e “Corosync Cluster Engine” -e “configuration file” /var/log/cluster/corosync.log

[51405] controller1 corosyncnotice  [MAIN  ] Corosync Cluster Engine ('2.4.0'): started and ready to provide service.

Mar 01 17:35:20 [51425] controller1        cib:     info: retrieveCib:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.xml (digest: /var/lib/pacemaker/cib/cib.xml.sig)

Mar 01 17:35:20 [51425] controller1        cib:  warning: cib_file_read_and_verify:    Could not verify cluster configuration file /var/lib/pacemaker/cib/cib.xml: No such file or directory (2)

Mar 01 17:35:20 [51425] controller1        cib:  warning: cib_file_read_and_verify:    Could not verify cluster configuration file /var/lib/pacemaker/cib/cib.xml: No such file or directory (2)

Mar 01 17:35:20 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.Apziws (digest: /var/lib/pacemaker/cib/cib.0ZxsVW)

Mar 01 17:35:21 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.ObYehI (digest: /var/lib/pacemaker/cib/cib.O8Rntg)

Mar 01 17:35:42 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.eqrhsF (digest: /var/lib/pacemaker/cib/cib.6BCfNj)

Mar 01 17:35:42 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.riot2E (digest: /var/lib/pacemaker/cib/cib.SAqtzj)

Mar 01 17:35:42 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.Q8H9BL (digest: /var/lib/pacemaker/cib/cib.MBljlq)

Mar 01 17:38:29 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.OTIiU4 (digest: /var/lib/pacemaker/cib/cib.JnHr1v)

Mar 01 17:38:36 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.2cK9Yk (digest: /var/lib/pacemaker/cib/cib.JSqEH8)

Mar 01 17:44:38 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.aPFtr3 (digest: /var/lib/pacemaker/cib/cib.E3Ve7X)

[root@controller1 ~]#

Verify that the initial membership notifications went out correctly

[root@controller1 ~]# grep  TOTEM /var/log/cluster/corosync.log 

[51405] controller1 corosyncnotice  [TOTEM ] Initializing transport (UDP/IP Unicast).

[51405] controller1 corosyncnotice  [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none

[51405] controller1 corosyncnotice  [TOTEM ] The network interface [192.168.17.149] is now up.

[51405] controller1 corosyncnotice  [TOTEM ] adding new UDPU member {192.168.17.149}

[51405] controller1 corosyncnotice  [TOTEM ] adding new UDPU member {192.168.17.141}

[51405] controller1 corosyncnotice  [TOTEM ] adding new UDPU member {192.168.17.166}

[51405] controller1 corosyncnotice  [TOTEM ] A new membership (192.168.17.149:4) was formed. Members joined: 1

[51405] controller1 corosyncnotice  [TOTEM ] A new membership (192.168.17.141:12) was formed. Members joined: 2 3

Check for errors during startup (no output means none were found)

[root@controller1 ~]# grep ERROR: /var/log/cluster/corosync.log

This article was originally published on the OpenStack2015 51CTO blog: http://blog.51cto.com/andyliu/1917391
