Distributed File System: MooseFS
2013-04-09 13:55:27

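Moose File System (MooseFS) is a fault-tolerant, network distributed file system. It spreads data across different servers on the network, and through FUSE it appears to the user as an ordinary Unix file system. It can be installed on a range of Unix systems (including the BSDs) and on Linux, is written in C, and is licensed under GPLv2.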


MooseFS is a fault-tolerant, network distributed file system. It spreads data over several physical servers, which are visible to the user as one resource. For standard file operations MooseFS behaves like other Unix-like file systems:

 A hierarchical structure (directory tree)
 Stores POSIX file attributes (permissions, last access and modification times)
 Supports special files (block and character devices, pipes and sockets)
 Symbolic links (file names pointing to target files, not necessarily on MooseFS) and hard links (different names of files which refer to the same data on MooseFS)
 Access to the file system can be limited based on IP address and/or password


Distinctive features of MooseFS are:

 High reliability (several copies of the data can be stored across separate computers)
 Capacity is dynamically expandable by attaching new computers/disks
 Deleted files are retained for a configurable period of time (a file system level "trash bin")
 Coherent snapshots of files, even while the file is being written/accessed
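The "trash bin" behaviour listed above can be illustrated with a small toy model. This is only a sketch of the idea (a deleted file stays recoverable until a configurable retention period has passed); the class name `TrashBin`, its methods, and the injectable `clock` parameter are hypothetical and are not part of MooseFS.

```python
import time


class TrashBin:
    """Toy model of a file-system-level trash bin (hypothetical, not MooseFS code)."""

    def __init__(self, retention_seconds, clock=time.time):
        self.retention_seconds = retention_seconds
        self._clock = clock   # injectable so the behaviour can be tested
        self._live = {}       # path -> data
        self._trash = {}      # path -> (data, deletion timestamp)

    def write(self, path, data):
        self._live[path] = data

    def read(self, path):
        return self._live[path]

    def delete(self, path):
        # Deleting moves the file into the trash instead of destroying it.
        self._trash[path] = (self._live.pop(path), self._clock())

    def purge_expired(self):
        # Permanently drop entries whose retention period has elapsed.
        now = self._clock()
        expired = [path for path, (_, deleted_at) in self._trash.items()
                   if now - deleted_at >= self.retention_seconds]
        for path in expired:
            del self._trash[path]

    def undelete(self, path):
        # Raises KeyError if the file was never deleted or was already purged.
        self.purge_expired()
        data, _deleted_at = self._trash.pop(path)
        self._live[path] = data
```

In this model a file deleted less than `retention_seconds` ago can be restored with `undelete`, while an older one is gone for good.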


ARCHITECTURE

MooseFS consists of four components:


Managing server (master server) – a single machine managing the whole filesystem, storing metadata for every file (information on size, attributes and file location(s), including all information about non-regular files, i.e. directories, sockets, pipes and devices).

Data servers (chunk servers) - any number of commodity servers storing file data and synchronizing it among themselves (if a certain file is supposed to exist in more than one copy).

Metadata backup server(s) (metalogger server) - any number of servers, all of which store metadata changelogs and periodically download the main metadata file, so that one of them can be promoted to the role of the managing server when the primary master stops working.

Client computers that access (mount) the files in MooseFS - any number of machines using the mfsmount process to communicate with the managing server (to receive and modify file metadata) and with chunkservers (to exchange actual file data).
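The division of labour described above (the master holds only metadata, chunk servers hold the data, and the client talks to both) can be sketched as a toy in-memory model. Everything here is an illustrative assumption: the class names, the tiny `CHUNK_SIZE`, the round-robin placement, and the forwarding between chunk servers are hypothetical and do not reflect the real MooseFS protocol or code.

```python
import itertools

CHUNK_SIZE = 4  # deliberately tiny for the example; not the real chunk size


class ChunkServer:
    """Stores chunk data and forwards copies to peer chunk servers."""

    def __init__(self, name):
        self.name = name
        self.chunks = {}   # chunk id -> bytes
        self.alive = True

    def put(self, chunk_id, data, forward_to=()):
        self.chunks[chunk_id] = data
        for peer in forward_to:   # chunk servers replicate among themselves
            peer.put(chunk_id, data)

    def get(self, chunk_id):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.chunks[chunk_id]


class Master:
    """Holds metadata only: which chunks form a file and where the copies live."""

    def __init__(self, servers):
        self.servers = servers
        self.files = {}    # path -> list of (chunk id, [ChunkServer, ...])
        self._ids = itertools.count(1)
        self._next = 0     # round-robin cursor (an assumed placement policy)

    def allocate(self, path, n_chunks, copies):
        layout = []
        for _ in range(n_chunks):
            targets = [self.servers[(self._next + i) % len(self.servers)]
                       for i in range(copies)]
            self._next += 1
            layout.append((next(self._ids), targets))
        self.files[path] = layout
        return layout

    def locate(self, path):
        return self.files[path]


class Client:
    """Asks the master for metadata, then exchanges data with chunk servers."""

    def __init__(self, master):
        self.master = master

    def write(self, path, data, copies=2):
        pieces = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
        layout = self.master.allocate(path, len(pieces), copies)
        for piece, (chunk_id, targets) in zip(pieces, layout):
            targets[0].put(chunk_id, piece, forward_to=targets[1:])

    def read(self, path):
        out = []
        for chunk_id, holders in self.master.locate(path):
            for server in holders:
                try:
                    out.append(server.get(chunk_id))
                    break
                except ConnectionError:
                    continue   # try the next copy
            else:
                raise OSError(f"no live copy of chunk {chunk_id}")
        return b"".join(out)
```

With three chunk servers and two copies per chunk, a file written through `Client.write` can still be read back in full after any single chunk server is marked as not alive, which is the fault tolerance the component list describes.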

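In addition, the project offers a commercial edition, MooseFS Pro, which is said to remove the master as a single point of failure and is well worth trying.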


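Official introduction

MooseFS is a breakthrough concept in the big data storage industry. It allows us to combine data storage and data processing on commodity hardware, delivering an extremely high ROI. With this innovative approach, we offer professional services and expert consulting for storage solutions, and provide implementation and support for all of your operations. The MooseFS project was launched about 15 years ago as a spin-off of Gemius (a leading European company that measures the internet in over 20 countries) and has since become one of the most sought-after data storage software products for companies worldwide. It is still used to store the large volumes of data behind Gemius's core business, running 24/7 and collecting and processing more than 300,000 events per second. As a result, every solution we offer our customers has been tested in a real working environment involving big data analytics.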


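Related references on this site:

LizardFS, a derived version

Summary of MooseFS usage

Frequently asked questions about LizardFS and MooseFS

Using the distributed file system MooseFS for shared storage (version 1.6)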


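Version 2.0
This version supports running multiple master servers in multiple roles. One of the roles is "leader", which is the master used by chunkservers and clients; a running system never has more than one leader. The release comes in a community edition and a professional edition; see the release notes for further changes.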

Latest version: 3.0
Released in 2016. For usage notes, see "Introduction to and Trial of MooseFS 3.0".


Project homepage: http://www.moosefs.org/