Installing and Testing a Pseudo-Distributed Tachyon 0.7.0 Cluster

  Let's first look at how the official documentation describes Tachyon:

Tachyon is a memory-centric distributed storage system enabling reliable data sharing at memory-speed across cluster frameworks, such as Spark and MapReduce. It achieves high performance by leveraging lineage information and using memory aggressively. Tachyon caches working set files in memory, thereby avoiding going to disk to load datasets that are frequently read. This enables different jobs/queries and frameworks to access cached files at memory speed.

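  In other words, Tachyon is a memory-centric distributed storage system that lets cluster frameworks such as Spark and MapReduce share data reliably at memory speed. It gets its performance from lineage information and aggressive use of memory. Tachyon caches working-set files in memory, so frequently read datasets no longer have to be loaded from disk, and different jobs, queries, and frameworks can all access the cached files at memory speed.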

  This post shows how to set up a pseudo-distributed Tachyon cluster.

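  1. By default, Tachyon is compiled against Hadoop 1.0.4, which is fairly old, so here we download the Tachyon source from GitHub and build it ourselves. The latest Tachyon release at the time of writing is 0.7.1; this post builds and installs 0.7.0 as the example.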

[iteblog@www.iteblog.com tachyon]$ wget https://github.com/amplab/tachyon/archive/v0.7.0.zip

  2. The download leaves a file named v0.7.0 in the tachyon folder. It is actually a zip archive, so we first rename it to tachyon-0.7.0.zip and then unpack it:

[iteblog@www.iteblog.com tachyon]$ mv v0.7.0 tachyon-0.7.0.zip
[iteblog@www.iteblog.com tachyon]$ unzip tachyon-0.7.0.zip

  3. Unpacking creates a folder named tachyon-0.7.0 in the current directory. Enter that folder and build with Maven; depending on the network, the build takes about 2 minutes:

[iteblog@www.iteblog.com tachyon]$  cd tachyon-0.7.0/
[iteblog@www.iteblog.com tachyon-0.7.0]$  mvn -Dhadoop.version=2.2.0 clean package -DskipTests=true
.......
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Tachyon Parent ..................................... SUCCESS [  6.419 s]
[INFO] Tachyon Common ..................................... SUCCESS [ 22.506 s]
[INFO] Tachyon Under File System .......................... SUCCESS [  0.472 s]
[INFO] Tachyon Under File System - Local FS ............... SUCCESS [  1.887 s]
[INFO] Tachyon Under File System - HDFS ................... SUCCESS [  2.472 s]
[INFO] Tachyon Under File System - Gluster FS ............. SUCCESS [  4.153 s]
[INFO] Tachyon Under File System - Swift .................. SUCCESS [  2.923 s]
[INFO] Tachyon Under File System - S3 ..................... SUCCESS [  1.956 s]
[INFO] Tachyon Clients .................................... SUCCESS [  0.384 s]
[INFO] Tachyon Clients - Implementation ................... SUCCESS [  3.559 s]
[INFO] Tachyon Clients - Distribution ..................... SUCCESS [  6.535 s]
[INFO] Tachyon Servers .................................... SUCCESS [  7.533 s]
[INFO] Mock Tachyon Cluster ............................... SUCCESS [  2.098 s]
[INFO] Tachyon Integration Tests .......................... SUCCESS [  1.256 s]
[INFO] Tachyon Shell ...................................... SUCCESS [  1.852 s]
[INFO] Tachyon Examples ................................... SUCCESS [  2.020 s]
[INFO] Tachyon Assemblies ................................. SUCCESS [  5.820 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:14 min
[INFO] Finished at: 2015-08-27T11:52:27+08:00
[INFO] Final Memory: 160M/1793M
[INFO] ------------------------------------------------------------------------
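A multi-module build like this one can be checked mechanically by scanning the Reactor Summary. The following is a minimal sketch, not part of the original walkthrough: `unbuilt_modules` is a hypothetical helper name, and it assumes the Maven output was saved to a log file (for example with `mvn ... | tee build.log`).

```shell
# Sketch: list Reactor Summary modules that did not build.
# unbuilt_modules is a hypothetical helper, not a Maven or Tachyon command.
unbuilt_modules() {
  # Summary lines look like: "[INFO] <module> ........ <STATUS> [ time ]".
  # Print only the modules whose status is FAILURE or SKIPPED.
  grep -E '^\[INFO\] .+ \.{3,} (FAILURE|SKIPPED)' "$1"
}

# Usage (build.log is an assumed file name):
# mvn -Dhadoop.version=2.2.0 clean package -DskipTests=true | tee build.log
# unbuilt_modules build.log || echo "all modules built"
```

If the function prints nothing, every module in the summary reported SUCCESS.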

  4. Configure Tachyon by running the following steps in order:

[iteblog@www.iteblog.com tachyon-0.7.0]$  sudo mkdir -p ../tachyon/journal
[iteblog@www.iteblog.com tachyon-0.7.0]$  sudo mkdir -p ../tachyon/ramdisk
[iteblog@www.iteblog.com tachyon-0.7.0]$  mv conf/tachyon-env.sh.template conf/tachyon-env.sh
[iteblog@www.iteblog.com tachyon-0.7.0]$ vim conf/tachyon-env.sh
Add the following settings to it:
HADOOP_HOME=/home/iteblog/hadoop/hadoop-2.2.0
export TACHYON_HOME=/home/iteblog/spark/tachyon-0.7.0
export TACHYON_UNDERFS_ADDRESS=$TACHYON_HOME/underfs
export TACHYON_RAM_FOLDER=/home/iteblog/spark/tachyon/ramdisk

Then change the -Dtachyon.master.journal.folder parameter in conf/tachyon-env.sh as follows:

-Dtachyon.master.journal.folder=/home/iteblog/spark/tachyon/journal
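The directory setup in this step can also be scripted. The sketch below is an illustration only: `prepare_tachyon_dirs` is a hypothetical helper, and it assumes the base directory is writable by the current user, so it uses plain `mkdir -p` rather than `sudo`. It prints the two settings that point at the new folders so they can be copied into conf/tachyon-env.sh.

```shell
# Sketch: create the journal and ramdisk folders used in this step.
# prepare_tachyon_dirs is a hypothetical helper, not a Tachyon command.
prepare_tachyon_dirs() {
  local base="$1"   # e.g. /home/iteblog/spark/tachyon
  mkdir -p "$base/journal" "$base/ramdisk"
  # Print the settings that refer to these folders in conf/tachyon-env.sh
  echo "export TACHYON_RAM_FOLDER=$base/ramdisk"
  echo "-Dtachyon.master.journal.folder=$base/journal"
}

# Usage (path taken from this post):
# prepare_tachyon_dirs /home/iteblog/spark/tachyon
```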

  5. Change the directory where Tachyon stores its logs:

[iteblog@www.iteblog.com tachyon-0.7.0]$ mkdir logs
[iteblog@www.iteblog.com tachyon-0.7.0]$ vim conf/log4j.properties

tachyon.logs.dir=/home/iteblog/spark/tachyon-0.7.0/logs
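The same edit can be made without opening an editor. This is a minimal sketch under stated assumptions: `set_property` is a hypothetical helper, the file is assumed to use plain `key=value` lines, and the value is assumed not to contain `|` or `&` (which would confuse `sed`).

```shell
# Sketch: set KEY=VALUE in a properties file, replacing an existing line
# or appending a new one. set_property is a hypothetical helper.
set_property() {
  local file="$1" key="$2" value="$3"
  if grep -q "^${key}=" "$file"; then
    # Use | as the sed delimiter because the value is a path containing /.
    # Dots in the key are treated as regex wildcards, which is good enough here.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
  else
    echo "${key}=${value}" >> "$file"
  fi
}

# Usage (path taken from this post):
# set_property conf/log4j.properties tachyon.logs.dir /home/iteblog/spark/tachyon-0.7.0/logs
```

Writing to a temporary file and moving it into place avoids relying on `sed -i`, whose syntax differs between GNU and BSD systems.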

  6. Create a core-site.xml file in the conf folder:

[iteblog@www.iteblog.com tachyon-0.7.0]$ vim conf/core-site.xml

<configuration>
  <property>
    <name>fs.tachyon.impl</name>
    <value>tachyon.hadoop.TFS</value>
  </property>
</configuration>
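For a repeatable setup, the file above can be written with a here-document instead of an editor. A minimal sketch, where `write_core_site` is a hypothetical helper and the target directory is passed in by the caller:

```shell
# Sketch: write the core-site.xml shown above into a given conf directory.
# write_core_site is a hypothetical helper, not a Tachyon command.
write_core_site() {
  local conf_dir="$1"
  # The quoted 'EOF' keeps the shell from expanding anything in the XML.
  cat > "$conf_dir/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.tachyon.impl</name>
    <value>tachyon.hadoop.TFS</value>
  </property>
</configuration>
EOF
}

# Usage:
# write_core_site conf
```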

  7. Format and start Tachyon:

[iteblog@www.iteblog.com tachyon-0.7.0]$ bin/tachyon format
Connecting to localhost as iteblog...
root@localhost's password: 
Formatting Tachyon Worker @ localhost
Connection to localhost closed.
Formatting Tachyon Master @ localhost

[iteblog@www.iteblog.com tachyon-0.7.0]$ bin/tachyon-start.sh local
Killed 0 processes on localhost.localdomain
Killed 0 processes on localhost.localdomain
Connecting to localhost as iteblog...
root@localhost's password: 
Killed 0 processes on localhost.localdomain
Connection to localhost closed.
Formatting RamFS: /mnt/ramdisk (512mb)
Starting master @ localhost
Starting worker @ localhost

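  The tachyon-start.sh local command starts both the Master and the Worker processes on the local machine. Note that you must have the root password to run it, otherwise startup fails, because formatting the RamFS requires root privileges. Once startup completes, the web UI is available at http://hostname:19999.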
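A quick way to confirm that the web UI is up is to request it from the command line. The sketch below assumes `curl` is installed; both function names are hypothetical helpers, and the default port is the 19999 mentioned above.

```shell
# Sketch: build the web UI URL and check that it responds.
# Both functions are hypothetical helpers, not Tachyon commands.
tachyon_web_url() {
  local host="${1:-localhost}"
  local port="${2:-19999}"   # web UI port mentioned in this post
  echo "http://${host}:${port}"
}

web_ui_is_up() {
  # -f: fail on HTTP errors, -s: silent, --max-time: give up after 5 seconds
  curl -fs -o /dev/null --max-time 5 "$(tachyon_web_url "$@")"
}

# Usage:
# web_ui_is_up localhost && echo "web UI is reachable"
```

To check from another machine, pass the hostname or IP of the machine running the Master instead of localhost.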

  8. Run the test cases:

[iteblog@www.iteblog.com tachyon-0.7.0]$ bin/tachyon runTest Basic CACHE_THROUGH
/default_tests_files/BasicFile_CACHE_THROUGH has been removed
2015-08-28 16:23:13,645 INFO   (MasterClient.java:connect) - Tachyon client (version 0.7.1) is trying to connect with master @ localhost/127.0.0.1:19998
2015-08-28 16:23:13,687 INFO   (MasterClient.java:connect) - User registered with the master @ localhost/127.0.0.1:19998; got UserId 4
2015-08-28 16:23:13,740 INFO   (CommonUtils.java:printTimeTakenMs) - createFile with fileId 3 took 102 ms.
2015-08-28 16:23:13,793 INFO   (WorkerClient.java:connect) - Trying to get local worker host : hadoop
2015-08-28 16:23:13,818 INFO   (WorkerClient.java:connect) - Connecting local worker @ hadoop/192.168.56.101:29998
2015-08-28 16:23:13,983 INFO   (BlockOutStream.java:get) - Writing with local stream. tachyonFile: /default_tests_files/BasicFile_CACHE_THROUGH, blockIndex: 0, opType: CACHE_THROUGH
2015-08-28 16:23:14,108 INFO   (CommonUtils.java:createBlockPath) - Folder /mnt/ramdisk/tachyonworker/4 was created!
2015-08-28 16:23:14,116 INFO   (LocalBlockOutStream.java:<init>) - /mnt/ramdisk/tachyonworker/4/3221225472 was created! tachyonFile: /default_tests_files/BasicFile_CACHE_THROUGH, blockIndex: 0, blockId: 3221225472, blockCapacityByte: 536870912
2015-08-28 16:23:14,274 INFO   (CommonUtils.java:printTimeTakenMs) - writeFile to file /default_tests_files/BasicFile_CACHE_THROUGH took 533 ms.
2015-08-28 16:23:14,343 INFO   (CommonUtils.java:printTimeTakenMs) - readFile file /default_tests_files/BasicFile_CACHE_THROUGH took 65 ms.
Passed the test!


[iteblog@www.iteblog.com tachyon-0.7.0]$ bin/tachyon runTests
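When these tests are run from a script, one option is to look for the `Passed the test!` line that appears in the output above. A minimal sketch, where `tachyon_test_passed` is a hypothetical helper that reads the command output on standard input:

```shell
# Sketch: succeed only if the output contains the line "Passed the test!".
# tachyon_test_passed is a hypothetical helper, not a Tachyon command.
tachyon_test_passed() {
  # -x: match the whole line, -F: fixed string, -q: no output
  grep -qxF 'Passed the test!'
}

# Usage:
# bin/tachyon runTest Basic CACHE_THROUGH 2>&1 | tachyon_test_passed \
#   && echo "test passed" || echo "test did not pass"
```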

  9. Stop the Tachyon processes:

[iteblog@www.iteblog.com tachyon-0.7.0]$ bin/tachyon-stop.sh
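
Unless otherwise stated, all posts on this blog are original.
Copyright of original posts belongs to 过往记忆大数据 (过往记忆); reproduction without permission is prohibited.
Permalink: Installing and Testing a Pseudo-Distributed Tachyon 0.7.0 Cluster (https://www.iteblog.com/archives/1490.html)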
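
Comments (4)
  1. Hi, a question: if I am not on the local machine and want to reach the web UI by IP address, can I replace the local in tachyon-start.sh local with the IP address and then open IP:port in a browser? I tried that and it did not work. Could you suggest a fix?

    /kel℡丨猜不透? 2015-11-12 09:43
    • Then the IP has to be the IP of the machine where you started Tachyon.

      w397090770 2015-11-12 11:00
      • Yes, the IP I used is that of the machine where I started Tachyon, and it still did not work. Do I need to change any configuration file?

        /kel℡丨猜不透? 2015-11-12 11:38
        • Edit this line in tachyon_home/conf/tachyon-env.sh:
          export TACHYON_MASTER_ADDRESS=${TACHYON_MASTER_ADDRESS:-localhost}

          w397090770 2015-11-12 11:53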