BookKeeper Explained (1): Introduction to BookKeeper and Quick Start

What is BookKeeper

BookKeeper is a service framework that provides persistent storage for streams of log entries. It is particularly well suited to log-stream storage; a classic use is as the persistence layer of the Pulsar message queue.

So how did BookKeeper come about? The inspiration came from the Hadoop ecosystem. Hadoop's file storage is HDFS, and HDFS includes a kind of node called the NameNode, which records every operation so that the state can be recovered from those records after a crash.

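  • Efficient writes
  • High fault tolerance through replication (messages are replicated across the bookies of an ensemble, a concept covered later)
  • High write throughput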

Quick Deployment

git clone https://github.com/apache/bookkeeper.git

After cloning, the conf directory contains:

conf
└─bk_cli_env.sh
└─bk_server.conf
└─bkenv.sh
└─jaas_example.conf
└─log4j.properties
└─log4j.cli.properties
└─log4j.shell.properties
└─nettyenv.sh
└─standalone.conf
└─zookeeper.conf
└─zookeeper.conf.dynamic

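BookKeeper provides a class for starting a local test instance, org.apache.bookkeeper.util.LocalBookKeeper, but it is not very convenient for debugging. To understand how BookKeeper actually works and how it is deployed, it is better to build a reasonably complete local environment.

Compile the code, then start a ZooKeeper instance locally: download the latest version and start it with the default configuration (the default port is 2181).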

Bookie1 (conf/bk_server_1.conf):

# Server parameters
bookiePort=3181
extraServerComponents=

# server settings
httpServerEnabled=false
httpServerPort=8080
httpServerClass=org.apache.bookkeeper.http.vertx.VertxHttpServer

# Bookie Storage
# Journal settings
journalDirectories=d:/tmp1/bk-txn

# Ledger storage settings
ledgerDirectories=d:/tmp1/bk-data
indexDirectories=d:/tmp1/bk-data

# Metadata Services
# Metadata Service settings
metadataServiceUri=zk+hierarchical://localhost:2181/ledgers

# ZooKeeper Metadata Service settings
zkServers=localhost:2181
zkTimeout=10000
zkEnableSecurity=false

# Settings below are used by stream/table service
# Grpc Server
storageserver.grpc.port=4181

# Dlog Settings for table service
# Replication Settings
dlog.bkcEnsembleSize=3
dlog.bkcWriteQuorumSize=2
dlog.bkcAckQuorumSize=2

# Storage
# local storage directories for storing table ranges data (e.g. rocksdb sst files)
storage.range.store.dirs=d:/tmp1/bookkeeper/ranges
# whether the storage server capable of serving readonly tables. default is false.
storage.serve.readonly.tables=false
# the cluster controller schedule interval, in milliseconds. default is 30 seconds.
storage.cluster.controller.schedule.interval.ms=30000

Bookie2 (conf/bk_server_2.conf):

# Server parameters
bookiePort=3182
extraServerComponents=

# server settings
httpServerEnabled=false
httpServerPort=8081
httpServerClass=org.apache.bookkeeper.http.vertx.VertxHttpServer

# Bookie Storage
# Journal settings
journalDirectories=d:/tmp2/bk-txn

# Ledger storage settings
ledgerDirectories=d:/tmp2/bk-data
indexDirectories=d:/tmp2/bk-data

# Metadata Services
# Metadata Service settings
metadataServiceUri=zk+hierarchical://localhost:2181/ledgers

# ZooKeeper Metadata Service settings
zkServers=localhost:2181
zkTimeout=10000
zkEnableSecurity=false

# Settings below are used by stream/table service
# Grpc Server
storageserver.grpc.port=4182

# Dlog Settings for table service
# Replication Settings
dlog.bkcEnsembleSize=3
dlog.bkcWriteQuorumSize=2
dlog.bkcAckQuorumSize=2

# Storage
# local storage directories for storing table ranges data (e.g. rocksdb sst files)
storage.range.store.dirs=d:/tmp2/bookkeeper/ranges
# whether the storage server capable of serving readonly tables. default is false.
storage.serve.readonly.tables=false
# the cluster controller schedule interval, in milliseconds. default is 30 seconds.
storage.cluster.controller.schedule.interval.ms=30000

Bookie3 (conf/bk_server_3.conf):

# Server parameters
bookiePort=3183
extraServerComponents=

# server settings
httpServerEnabled=false
httpServerPort=8082
httpServerClass=org.apache.bookkeeper.http.vertx.VertxHttpServer

# Bookie Storage
# Journal settings
journalDirectories=d:/tmp3/bk-txn

# Ledger storage settings
ledgerDirectories=d:/tmp3/bk-data
indexDirectories=d:/tmp3/bk-data

# Metadata Services
# Metadata Service settings
metadataServiceUri=zk+hierarchical://localhost:2181/ledgers

# ZooKeeper Metadata Service settings
zkServers=localhost:2181
zkTimeout=10000
zkEnableSecurity=false

# Settings below are used by stream/table service
# Grpc Server (each bookie on the same host needs its own port)
storageserver.grpc.port=4183

# Dlog Settings for table service
# Replication Settings
dlog.bkcEnsembleSize=3
dlog.bkcWriteQuorumSize=2
dlog.bkcAckQuorumSize=2

# Storage
# local storage directories for storing table ranges data (e.g. rocksdb sst files)
storage.range.store.dirs=d:/tmp3/bookkeeper/ranges
# whether the storage server capable of serving readonly tables. default is false.
storage.serve.readonly.tables=false
# the cluster controller schedule interval, in milliseconds. default is 30 seconds.
storage.cluster.controller.schedule.interval.ms=30000
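The three bookie configurations differ only in their ports and in the directory prefix. As a rough sketch of that pattern, the per-bookie values can be derived from the bookie index. The class BookieConfigSketch and the method bookieOverrides are hypothetical names used for illustration and are not part of BookKeeper; the sketch assumes every port is offset by the bookie index and that the directories follow the d:/tmpN layout.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of BookKeeper: derives the settings that
// differ between the local bookies from the bookie index (1, 2, 3).
final class BookieConfigSketch {

    static Map<String, String> bookieOverrides(int index) {
        // Directory layout assumed from the sample configurations: d:/tmp1, d:/tmp2, ...
        String base = "d:/tmp" + index;
        Map<String, String> p = new LinkedHashMap<>();
        p.put("bookiePort", String.valueOf(3180 + index));
        p.put("httpServerPort", String.valueOf(8079 + index));
        p.put("journalDirectories", base + "/bk-txn");
        p.put("ledgerDirectories", base + "/bk-data");
        p.put("indexDirectories", base + "/bk-data");
        p.put("storageserver.grpc.port", String.valueOf(4180 + index));
        p.put("storage.range.store.dirs", base + "/bookkeeper/ranges");
        return p;
    }

    public static void main(String[] args) {
        // Print the key=value lines that change from one bookie to the next.
        for (int i = 1; i <= 3; i++) {
            System.out.println("# overrides for bookie " + i);
            bookieOverrides(i).forEach((k, v) -> System.out.println(k + "=" + v));
            System.out.println();
        }
    }
}
```

All other keys (the ZooKeeper address, the metadata service URI, the dlog.* sizes) are shared by the three files and can be copied unchanged.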

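I use IDEA as my IDE. First configure a BookieShell run to initialize the metadata:

  • Main class: org.apache.bookkeeper.bookie.BookieShell
  • VM options: -DentryFormatterClass=${ENTRY_FORMATTER_CLASS:-org.apache.bookkeeper.util.StringEntryFormatter} (the ${...:-...} form is shell syntax; if your IDE passes it through literally, use -DentryFormatterClass=org.apache.bookkeeper.util.StringEntryFormatter instead)
  • Program arguments: --conf ./conf/bk_server_1.conf metaformat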

Then run it; the log shows:

2018-10-16 15:43:14,766 - INFO - [main:MetadataDrivers@107] - BookKeeper metadata driver manager initialized 

Next, set up the run configurations for the three bookies:

  • Main class: org.apache.bookkeeper.server.Main
  • Program arguments: --conf ./conf/bk_server_1.conf for the first bookie; the other two use --conf ./conf/bk_server_2.conf and --conf ./conf/bk_server_3.conf

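Once the configurations are in place, start the three bookies. Three are needed because, with the client defaults, a ledger uses an ensemble of three bookies and each entry is written to two of them, so a ledger cannot be created with fewer than three available bookies: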

2018-10-16 16:00:01,923 - INFO - [main:Main@110] - Using configuration file ./conf/bk_server_1.conf
2018-10-16 16:00:01,935 - INFO - [main:Main@267] - Hello, I'm your bookie, listening on port 3181. Metadata service uri is zk+hierarchical://localhost:2181/ledgers. Journals are in [d:/tmp1/bk-txn]. Ledgers are stored in d:/tmp1/bk-data.
2018-10-16 16:00:02,045 - INFO - [main:Main@296] - Load lifecycle component : org.apache.bookkeeper.server.service.StatsProviderService
2018-10-16 16:00:02,304 - INFO - [main:BookieServer@94] - { "storage.cluster.controller.schedule.interval.ms" : "30000", "zkEnableSecurity" : "false", "dlog.bkcAckQuorumSize" : "2", "indexDirectories" : "d:/tmp1/bk-data", "zkServers" : "localhost:2181", "storage.range.store.dirs" : "d:/tmp1/bookkeeper/ranges", "httpServerPort" : "8080", "dlog.bkcWriteQuorumSize" : "2", "bookiePort" : "3181", "storage.serve.readonly.tables" : "false", "ledgerDirectories" : "d:/tmp1/bk-data", "zkTimeout" : "10000", "httpServerClass" : "org.apache.bookkeeper.http.vertx.VertxHttpServer", "httpServerEnabled" : "false", "metadataServiceUri" : "zk+hierarchical://localhost:2181/ledgers", "dlog.bkcEnsembleSize" : "3", "journalDirectories" : "d:/tmp1/bk-txn", "storageserver.grpc.port" : "4181", "extraServerComponents" : "" }
2018-10-16 16:00:03,032 - INFO - [main:MetadataDrivers@107] - BookKeeper metadata driver manager initialized
2018-10-16 16:00:03,033 - INFO - [main:MetadataDrivers@107] - BookKeeper metadata driver manager initialized
2018-10-16 16:00:03,033 - INFO - [main:MetadataDrivers@107] - BookKeeper metadata driver manager initialized
2018-10-16 16:00:03,167 - INFO - [main:Bookie@399] - Stamping new cookies on all dirs [d:\tmp1\bk-txn\current] [d:\tmp1\bk-data\current, d:\tmp1\bk-data\current]
2018-10-16 16:00:03,368 - WARN - [main:Bookie@302] - Dirs: [d:\tmp1\bk-data\current, d:\tmp1\bk-data\current] are in same DiskPartition/FileSystem: (d:)
2018-10-16 16:00:03,373 - INFO - [main:Bookie@644] - instantiate ledger manager org.apache.bookkeeper.meta.HierarchicalLedgerManagerFactory
2018-10-16 16:00:03,415 - ERROR - [main:Journal$LastLogMark@244] - Problems reading from d:\tmp1\bk-data\current\lastMark (this is okay if it is the first time starting this bookie
2018-10-16 16:00:03,420 - INFO - [main:Bookie@700] - Using ledger storage: org.apache.bookkeeper.bookie.SortedLedgerStorage
2018-10-16 16:00:03,500 - INFO - [main:IndexPersistenceMgr@107] - openFileLimit = 20000
2018-10-16 16:00:03,539 - INFO - [main:IndexInMemPageMgr@377] - maxDirectMemory = , pageSize = 8192, pageLimit = 10560
2018-10-16 16:00:03,594 - INFO - [main:ScanAndCompareGarbageCollector@107] - Over Replicated Ledger Deletion : enabled=true, interval=
2018-10-16 16:00:03,603 - INFO - [main:GarbageCollectorThread@253] - Minor Compaction : enabled=true, threshold=0.023224, interval=
2018-10-16 16:00:03,603 - INFO - [main:GarbageCollectorThread@255] - Major Compaction : enabled=true, threshold=0.0929, interval=
2018-10-16 16:00:03,734 - INFO - [main:Main@303] - Load lifecycle component : org.apache.bookkeeper.server.service.BookieService
2018-10-16 16:00:03,748 - INFO - [main:ComponentStarter@79] - Starting component bookie-server.
2018-10-16 16:00:03,753 - INFO - [main:Bookie@853] - Finished replaying journal in 3 ms.
2018-10-16 16:00:03,756 - INFO - [SyncThread-7-1:SyncThread@136] - Flush ledger storage at checkpoint CheckpointList{checkpoints=[LogMark: logFileId - 0 , logFileOffset - 0]}.
2018-10-16 16:00:03,786 - INFO - [main:Bookie@892] - Finished reading journal, starting bookie
2018-10-16 16:00:03,787 - INFO - [BookieJournal-3181:Journal@932] - Starting journal on d:\tmp1\bk-txn\current
2018-10-16 16:00:03,789 - INFO - [ForceWriteThread:Journal$ForceWriteThread@471] - ForceWrite Thread started
2018-10-16 16:00:03,828 - INFO - [BookieJournal-3181:JournalChannel@154] - Opening journal d:\tmp1\bk-txn\current\1667be392ce.txn
2018-10-16 16:00:04,078 - INFO - [main:ComponentStarter@81] - Started component bookie-server.
2018-10-16 16:00:04,309 - INFO - [BookieJournal-3181:NativeIO@48] - Unable to link C library. Native methods will be disabled.
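On the question of how many bookies to start, the arithmetic can be sketched as follows. This is a minimal illustration, assuming an ensemble size of 3, a write quorum of 2 and an ack quorum of 2 (the same numbers that appear in the dlog.bkc* settings), and assuming that a ledger can only be created when a full ensemble of bookies is available. QuorumSketch and its methods are hypothetical names, not BookKeeper classes.

```java
// Hypothetical helper, not a BookKeeper class.
final class QuorumSketch {

    // The three sizes must be ordered: ensemble >= write quorum >= ack quorum >= 1.
    static boolean isValid(int ensembleSize, int writeQuorum, int ackQuorum) {
        return ensembleSize >= writeQuorum
                && writeQuorum >= ackQuorum
                && ackQuorum >= 1;
    }

    // A ledger can only be created if a full ensemble can be picked
    // from the bookies that are currently available.
    static boolean canCreateLedger(int availableBookies, int ensembleSize,
                                   int writeQuorum, int ackQuorum) {
        return isValid(ensembleSize, writeQuorum, ackQuorum)
                && availableBookies >= ensembleSize;
    }

    public static void main(String[] args) {
        // Assumed sizes: ensemble 3, write quorum 2, ack quorum 2.
        for (int bookies = 1; bookies <= 3; bookies++) {
            System.out.println(bookies + " bookie(s): canCreateLedger="
                    + canCreateLedger(bookies, 3, 2, 2));
        }
    }
}
```

With these assumed sizes, only the run with three bookies reports true, which matches the local setup of three bookie processes.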

Next, write a simple test program to see log entries being written and read back:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Enumeration;

import org.apache.bookkeeper.client.BKException;
import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.LedgerEntry;
import org.apache.bookkeeper.client.LedgerHandle;

public class TestClient {
    public static void main(String[] args) throws InterruptedException, BKException, IOException {
        // Connect through the local ZooKeeper started earlier
        BookKeeper bkc = new BookKeeper("localhost:2181");

        // A password for the new ledger
        byte[] ledgerPassword = "test".getBytes();

        // Create a new ledger and fetch its identifier
        LedgerHandle lh = bkc.createLedger(BookKeeper.DigestType.MAC, ledgerPassword);
        long ledgerId = lh.getId();

        // Create a buffer for four-byte entries
        ByteBuffer entry = ByteBuffer.allocate(4);
        int numberOfEntries = 10;

        // Add entries to the ledger, then close it
        for (int i = 0; i < numberOfEntries; i++) {
            entry.putInt(i);
            entry.position(0);
            lh.addEntry(entry.array());
        }
        lh.close();

        // Open the ledger for reading
        lh = bkc.openLedger(ledgerId, BookKeeper.DigestType.MAC, ledgerPassword);

        // Read all available entries
        Enumeration<LedgerEntry> entries = lh.readEntries(0, numberOfEntries - 1);
        while (entries.hasMoreElements()) {
            ByteBuffer result = ByteBuffer.wrap(entries.nextElement().getEntry());
            Integer retrEntry = result.getInt();

            // Print the integer stored in each entry
            System.out.println(String.format("Result: %s", retrEntry));
        }

        // Close the ledger and the client
        lh.close();
        bkc.close();
    }
}

Run it and you can see the output:

Result: 0
Result: 1
Result: 2
Result: 3
Result: 4
Result: 5
Result: 6
Result: 7
Result: 8
Result: 9
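The buffer handling in the test program can be tried on its own, without a running cluster. The sketch below reproduces only the encoding and decoding of the four-byte entries; EntryBufferSketch, encode and decode are hypothetical names. It copies the backing array because it keeps every payload in a list, whereas the test program hands the array to addEntry before reusing the buffer.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: the same ByteBuffer steps as the test program, minus BookKeeper.
final class EntryBufferSketch {

    // Encodes 0..numberOfEntries-1 into four-byte payloads using one reused buffer.
    static List<byte[]> encode(int numberOfEntries) {
        ByteBuffer entry = ByteBuffer.allocate(4);
        List<byte[]> payloads = new ArrayList<>();
        for (int i = 0; i < numberOfEntries; i++) {
            entry.putInt(i);     // writes 4 bytes and moves the position to 4
            entry.position(0);   // rewind; without this the next putInt would overflow
            payloads.add(entry.array().clone()); // copy, since the buffer is reused
        }
        return payloads;
    }

    // Decodes one payload back into the integer it holds.
    static int decode(byte[] payload) {
        return ByteBuffer.wrap(payload).getInt();
    }

    public static void main(String[] args) {
        for (byte[] payload : encode(10)) {
            System.out.println(String.format("Result: %s", decode(payload)));
        }
    }
}
```

Running it prints the same ten lines, Result: 0 through Result: 9, which shows that the values come purely from the ByteBuffer round trip.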

BookKeeper Basic Concepts

In BookKeeper:

  • Each unit of a log is an entry
  • A stream of log entries is called a ledger
  • Each individual server that stores ledgers is called a bookie

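Entries

Field                 Type    Description
Ledger number         long    ID of the ledger the entry belongs to
Entry number          long    ID of the entry itself (unique within its ledger)
Last confirmed (LC)   long    ID of the last confirmed (recorded) entry
Data                  byte[]  The log data
Authentication code   byte[]  Check code (message authentication code)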
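To make the role of the authentication code concrete, here is a small sketch that computes a check code over the other four fields and uses it to detect tampering. It uses HMAC-SHA1 from the JDK purely as an illustration; the byte layout, the key handling and the names EntrySketch, authCode and verify are assumptions made for this example and are not BookKeeper's actual wire format or API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical model of the fields in the table; not BookKeeper's wire format.
final class EntrySketch {

    // Computes a check code over ledger number, entry number, LC and data.
    static byte[] authCode(long ledgerId, long entryId, long lastConfirmed,
                           byte[] data, byte[] key) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key, "HmacSHA1"));
            ByteBuffer header = ByteBuffer.allocate(3 * Long.BYTES);
            header.putLong(ledgerId).putLong(entryId).putLong(lastConfirmed);
            mac.update(header.array());
            mac.update(data);
            return mac.doFinal();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recomputes the code and compares it with the stored one.
    static boolean verify(long ledgerId, long entryId, long lastConfirmed,
                          byte[] data, byte[] key, byte[] expected) {
        return MessageDigest.isEqual(
                expected, authCode(ledgerId, entryId, lastConfirmed, data, key));
    }

    public static void main(String[] args) {
        byte[] key = "test".getBytes(StandardCharsets.UTF_8);
        byte[] data = {0, 0, 0, 7};
        byte[] code = authCode(1L, 7L, 6L, data, key);
        System.out.println("intact:   " + verify(1L, 7L, 6L, data, key, code));
        data[3] = 8; // simulate corruption of the stored data
        System.out.println("tampered: " + verify(1L, 7L, 6L, data, key, code));
    }
}
```

The first check prints true and the second prints false: because the code covers every other field, changing the data or any of the IDs invalidates it.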

Ledgers

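Bookies and Ensembles

A bookie is an individual BookKeeper server that handles fragments of ledgers (each bookie stores fragments of ledgers, not complete ledgers).

An ensemble is a set of bookies that together store all the entries of one ledger. An ensemble is usually a subset of the whole bookie cluster.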
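One way to picture why each bookie ends up with only a fragment of a ledger is round-robin striping of entries across the ensemble. The sketch below assumes that scheme, with entry i going to writeQuorum consecutive bookies starting at position i mod ensembleSize. StripingSketch and writeSet are hypothetical names, and the scheme is an assumption made for illustration rather than something the article specifies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of round-robin striping over an ensemble.
final class StripingSketch {

    // Returns the ensemble positions (0-based) that store the given entry.
    static List<Integer> writeSet(long entryId, int ensembleSize, int writeQuorum) {
        List<Integer> positions = new ArrayList<>();
        for (int j = 0; j < writeQuorum; j++) {
            positions.add((int) ((entryId + j) % ensembleSize));
        }
        return positions;
    }

    public static void main(String[] args) {
        // Ensemble of 3 bookies, each entry written to 2 of them.
        for (long entryId = 0; entryId < 6; entryId++) {
            System.out.println("entry " + entryId + " -> bookies "
                    + writeSet(entryId, 3, 2));
        }
    }
}
```

Under these assumptions every bookie holds two out of every three entries, so the ensemble as a whole holds the complete ledger while no single bookie does.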

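Introduction to the BookKeeper API

The BookKeeper client is mainly responsible for creating and deleting ledgers, and for writing entries to and reading entries from ledgers.

BookKeeper provides two APIs, the Ledger API and the DistributedLog API. As the names suggest, the Ledger API offers an interface for working with ledgers directly, while the DistributedLog API lets you work with a BookKeeper cluster without handling ledgers directly.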

BookKeeper Metadata Storage


Publisher: 全栈程序员-站长. When reposting, please credit the source: https://javaforall.net/203065.html. Original link: https://javaforall.net
