MOOSEFS
MooseFS stands for Moose File System, which is a distributed file system. A file system that is shared among multiple systems over a network is known as a distributed file system. Let me explain more about sharing storage with MooseFS.
Advantages of Distributed File System
Lets users access all resources from a single point
Because the files are distributed, the chance of a single point of failure is minimal
Server load balancing can be achieved through even usage of storage resources
Here I am going to explain the procedure to set up a MooseFS server.
For better understanding, I’ll explain the setup from a practical point of view. We have LDAP centralized authentication implemented on our production floor. As the home directories of all our users are housed on our LDAP server, we had around 9.76562 TB of free hard disk space left in total on our production machines. We needed to allocate a considerable amount of space for our CCTV as well as the non-official files of our employees. After a week of Googling, we found that MooseFS could meet our requirement, and I’m going to explain the steps I followed to set up the distributed file system.
A MooseFS setup has four major components: Master Server, Chunk Server, MetaLogger and MFSClient. Here is a brief description of each:
Master Server
The machine that manages the whole Moose file system is called the Master Server. It stores the metadata (file attributes, location in the distributed file system, time of file creation, file standard, size, etc.). The Master Server keeps all of the metadata in RAM while processing requests, so it’s strongly recommended to size the RAM according to the number of files to be handled. It takes about 300 MB of RAM to handle 1 million files on the ChunkServers, and about 1 GB of hard disk space to hold the metadata of 1 million files.
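To make the rule of thumb above concrete, here is a small sketch that turns a file count into a rough RAM and disk estimate. The function name mfs_master_sizing is my own (it is not a MooseFS tool), and the figures are only the 300 MB and 1 GB per million files quoted above, so treat the result as a starting point rather than a guarantee.

```shell
# Rough sizing helper for a MooseFS master, based on the rule of thumb
# quoted in this article: ~300 MB of RAM and ~1 GB (1024 MB) of disk
# per 1 million files.
# mfs_master_sizing is a hypothetical helper name, not a MooseFS command.
mfs_master_sizing() {
    files=$1
    # Integer arithmetic in MB, rounded up to the next whole MB.
    ram_mb=$(( (files * 300 + 999999) / 1000000 ))
    disk_mb=$(( (files * 1024 + 999999) / 1000000 ))
    echo "files=$files ram_mb=$ram_mb disk_mb=$disk_mb"
}
```

For example, `mfs_master_sizing 5000000` estimates the requirement for five million files.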
Chunk Server
ChunkServers are the physical locations where the chunks (pieces) of files actually reside. The chunks are automatically synchronized between the ChunkServers. You can use any number of Chunk Servers, depending on your requirements and the free space available. The recommended RAM size is 1 GB.
MetaLogger
The MetaLogger keeps a copy of the metadata that the Master Server holds. If you plan to turn the MetaLogger into your Master Server in the event of a Master Server failure, it’s recommended that the MetaLogger has the same amount of disk space as the Master Server.
MFSClient
Any machine that supports FUSE can be a client. (FUSE, File System in User Space, allows non-privileged users to create their own file systems without altering kernel code.)
Here’s the setup procedure for the Master Server
First we need to add a system user, say mfs, to run the MFS service. MooseFS can be downloaded from the link:
http://pro.hit.gemius.pl/hitredir/id=BxY7_eM43EskGBuSFtz3YNVqP1JNYK7dbkJhAZiPIXH.w7/url=moosefs.org/tl_files/mfscode/mfs-1.6.25.tar.gz
Untarring the downloaded mfs-1.6.25.tar.gz file produces a directory with the same name minus the compressed file extension (i.e. mfs-1.6.25). Switch to the newly created directory and follow the steps given below:
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var/lib --with-default-user=mfs --with-default-group=mfs --disable-mfschunkserver --disable-mfsmount
make
make install
cp /etc/mfsmaster.cfg.dist /etc/mfsmaster.cfg
cp /etc/mfsexports.cfg.dist /etc/mfsexports.cfg
vi /etc/mfsmaster.cfg
Find the following lines:
#WORKING_USER = mfs
#WORKING_GROUP = mfs
#REPLICATIONS_DELAY_INIT = 300
#REPLICATIONS_DELAY_DISCONNECT = 3600
#CHUNKS_LOOP_TIME = 300
#CHUNKS_DEL_LIMIT = 100
#CHUNKS_WRITE_REP_LIMIT = 1
#CHUNKS_READ_REP_LIMIT = 5
And modify the lines as given below:
WORKING_USER = mfs
WORKING_GROUP = mfs
REPLICATIONS_DELAY_INIT = 300
REPLICATIONS_DELAY_DISCONNECT = 3600
CHUNKS_LOOP_TIME = 300
CHUNKS_DEL_LIMIT = 100
CHUNKS_WRITE_REP_LIMIT = 1
CHUNKS_READ_REP_LIMIT = 5
The first two values, WORKING_USER & WORKING_GROUP, define the system user name and group name responsible for running the daemon. REPLICATIONS_DELAY_INIT is the initial time (in seconds) the Master Server waits before it starts replicating chunks. REPLICATIONS_DELAY_DISCONNECT is the time (in seconds) the Master Server waits before replicating chunks after a Chunk Server goes offline. CHUNKS_LOOP_TIME is the interval (in seconds) at which the Master Server repeats the operations mentioned below:
CHUNKS_DEL_LIMIT - Maximum number of chunks that can be deleted in one loop (the loop length depends on CHUNKS_LOOP_TIME).
CHUNKS_WRITE_REP_LIMIT & CHUNKS_READ_REP_LIMIT - When the number of copies of a particular chunk is found to be below the Goal, only a limited number of replication operations are performed in one loop: these values cap the chunks replicated to one Chunk Server (write) and from one Chunk Server (read).
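If you would rather not edit the file by hand in vi, the uncommenting step can be sketched as a small shell function. The name uncomment_cfg_keys is hypothetical (it is not part of MooseFS), and the sketch assumes the config lines look like `#KEY = value` as shown above.

```shell
# Uncomment "#KEY = value" lines in a MooseFS-style config file.
# uncomment_cfg_keys is a hypothetical helper, not part of MooseFS.
# Usage: uncomment_cfg_keys FILE KEY [KEY...]
uncomment_cfg_keys() {
    cfg=$1
    shift
    for key in "$@"; do
        # Strip a leading '#' (and optional spaces) only in front of this key.
        sed "s/^#[[:space:]]*\(${key}[[:space:]]*=\)/\1/" "$cfg" > "$cfg.tmp" &&
            mv "$cfg.tmp" "$cfg"
    done
}
```

A call such as `uncomment_cfg_keys /etc/mfsmaster.cfg WORKING_USER WORKING_GROUP` would then activate those two settings and leave every other line untouched. Note that the file is rewritten through a temporary copy, so check its ownership and permissions afterwards.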
Note: mfsexports.cfg contains the MooseFS access list for mfsmount clients.
vi /etc/mfsexports.cfg
192.168.1.0/24 / rw,alldirs,maproot=0
Here I’ve given access to all the machines that reside in the subnet 192.168.1.0/24. The syntax of mfsexports.cfg is given below:
Address        Directory      Options
Address can be specified as a single IP address, an IP range, or an IP class given by network address and number of bits. Directory is given as a path, either / or a path relative to the MooseFS root. The options field accepts the following:
ro, readonly - Export the tree in read-only mode.
rw, readwrite - Export the tree in read & write mode.
ignoregid - Disable checking of group access at the MFS master server.
dynamicip - Allow reconnects from already authenticated clients from any IP address.
maproot=USER[:GROUP] - Maps root user access to the user & group specified. If no group is specified, the default group of that user is used.
mapall=USER[:GROUP] - Maps all non-privileged users to the user & group specified.
minversion=VER - Rejects access from clients older than the specified version.
password=PASS - Requires clients to authenticate with a password.
alldirs - Allows mounting any subdirectory under the directory specified.
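Since every entry follows the Address / Directory / Options layout described above, a quick sanity check can catch a missing field before the master is restarted. This is only a sketch: check_mfsexports is a name I made up, and it checks the field count only, not whether the option names are valid.

```shell
# Sanity-check an mfsexports.cfg-style file: every non-blank, non-comment
# line should have exactly three fields (address, directory, options).
# check_mfsexports is a hypothetical helper; it does not validate option names.
check_mfsexports() {
    awk '
        BEGIN { bad = 0 }
        /^[ \t]*(#|$)/ { next }    # skip comments and blank lines
        NF != 3 {
            printf "line %d: expected 3 fields, got %d\n", NR, NF
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

Running `check_mfsexports /etc/mfsexports.cfg` prints the offending line numbers and returns a non-zero status if any entry is malformed.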
Finally we can add the MFSmaster server to the startup by following the steps given below:
mv /usr/src/mfs-1.6.25/debian/mfs-master.init /etc/init.d/mfs-master.init
chmod 755 /etc/init.d/mfs-master.init
update-rc.d mfs-master.init defaults
/etc/init.d/mfs-master.init start
Note: Make sure that the mfs user owns the /var/lib/mfs directory.
Chunk Server Configuration:
Follow the exact steps that we have gone through for Master Server with slight changes as given below:
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var/lib --with-default-user=mfs --with-default-group=mfs --disable-mfsmaster --disable-mfsmount
make
make install
cp /etc/mfschunkserver.cfg.dist /etc/mfschunkserver.cfg
vi /etc/mfschunkserver.cfg
Find the following lines:
#WORKING_USER = mfs
#WORKING_GROUP = mfs
#MASTER_HOST = host name
#HDD_TEST_FREQ = 10
And modify the lines as given below:
WORKING_USER = mfs
WORKING_GROUP = mfs
MASTER_HOST = Master.SupportSages.com
HDD_TEST_FREQ = 10
Note: If we leave any variable commented out, MFS will use the default value built into the source code.
WORKING_USER & WORKING_GROUP - The user & group responsible for running the MFS ChunkServer.
MASTER_HOST - The host name or IP address of the Master Server.
The rest of the variables can be left commented out, so that MooseFS uses the default values as mentioned above.
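With several Chunk Servers to configure, pointing each one at the master can be scripted. The sketch below replaces a `KEY = ...` line, commented or not, with a new value. set_cfg_value is a hypothetical helper name, and it assumes the value contains no characters that are special to sed (such as `|`, `&` or backslashes).

```shell
# set_cfg_value FILE KEY VALUE
# Replace a "KEY = ..." line, commented or not, with "KEY = VALUE".
# Hypothetical helper; VALUE must not contain '|', '&' or backslashes,
# because it is pasted into a sed replacement.
set_cfg_value() {
    cfg=$1
    key=$2
    value=$3
    sed "s|^#\{0,1\}[ ]*${key}[ ]*=.*|${key} = ${value}|" "$cfg" > "$cfg.tmp" &&
        mv "$cfg.tmp" "$cfg"
}
```

For the setup in this article the call would be `set_cfg_value /etc/mfschunkserver.cfg MASTER_HOST Master.SupportSages.com`.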
The mfshdd.cfg file contains the mount points used to store the Chunk Data.
vi /etc/mfshdd.cfg
Add the mount point as given below:
/sage5-shared/mfs1
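Adding the mount point can also be done without opening an editor. Here is a sketch that appends a directory to the file only if it is not already listed, so it is safe to run more than once. add_mfs_hdd is a made-up helper name. I am also assuming the directory already exists and is writable by the mfs user, which the function does not check.

```shell
# add_mfs_hdd CFG DIR: append DIR to an mfshdd.cfg-style file unless it is
# already listed. Hypothetical helper; creating the directory and making it
# writable for the mfs user is left to the caller.
add_mfs_hdd() {
    cfg=$1
    dir=$2
    # -x: match the whole line, -F: fixed string (no regex), -q: quiet
    if grep -qxF "$dir" "$cfg" 2>/dev/null; then
        return 0
    fi
    echo "$dir" >> "$cfg"
}
```

In this setup that would be `add_mfs_hdd /etc/mfshdd.cfg /sage5-shared/mfs1`.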
As we did with Master server we can add the ChunkServer to system startup by the following steps:
cp /usr/src/mfs-1.6.25/debian/mfs-chunkserver.init /etc/init.d/mfs-chunkserver.init
(/usr/src/ is the location where I’ve downloaded MFS)
chmod 755 /etc/init.d/mfs-chunkserver.init
update-rc.d mfs-chunkserver.init defaults
/etc/init.d/mfs-chunkserver.init start
MFS-Client Configuration:
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var/lib --with-default-user=mfs --with-default-group=mfs --disable-mfsmaster --disable-mfschunkserver
make
make install
Create a mount point to mount the shared space provided by MooseFS and change the ownership to mfs in order to grant full access to mfs.
mkdir /var/mfs-shared
chown mfs:mfs /var/mfs-shared
Finally, you just need to follow the syntax below to mount the shared space:
mfsmount <mount point> -H <IP address/host name of the Master Server>
E.g.:
mfsmount /var/mfs-shared -H Master.SupportSages.com
Once the mount is complete, you can check the space available on the shared space using the following command:
df -h | grep mfs
You can also perform read/write operations on the mount point to make sure that the MooseFS system is working. With a tool such as dd, the transfer rate is displayed at the end of each read/write operation.
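One way to run such a read/write check is with dd, which prints the transfer rate when it finishes. The sketch below writes a file of zeros into a directory, reads it back and confirms the size. mfs_rw_test is a hypothetical helper name, the `bs=1M` form assumes GNU dd, and the read figure may partly reflect local caching rather than the network.

```shell
# mfs_rw_test DIR [SIZE_MB]
# Write SIZE_MB (default 10) of zeros into DIR with dd, read the file back,
# and check that the expected number of bytes landed on disk.
# Hypothetical helper; assumes GNU dd (bs=1M).
mfs_rw_test() {
    dir=$1
    size_mb=${2:-10}
    f="$dir/.mfs_rw_test.$$"
    # dd reports the transfer rate on stderr when it finishes.
    dd if=/dev/zero of="$f" bs=1M count="$size_mb" || return 1
    dd if="$f" of=/dev/null bs=1M || { rm -f "$f"; return 1; }
    bytes=$(wc -c < "$f")
    rm -f "$f"
    [ "$bytes" -eq $(( size_mb * 1024 * 1024 )) ]
}
```

On a client you would call it as `mfs_rw_test /var/mfs-shared 100`. The test file is removed afterwards.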
MetaLogger Configuration:
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var/lib --with-default-user=mfs --with-default-group=mfs --disable-mfschunkserver --disable-mfsmount
make
make install
cp /etc/mfsmetalogger.cfg.dist /etc/mfsmetalogger.cfg
vi /etc/mfsmetalogger.cfg
Find the following lines:
#WORKING_USER = mfs
#WORKING_GROUP = mfs
#MASTER_HOST =
And replace them with the following:
WORKING_USER = mfs
WORKING_GROUP = mfs
MASTER_HOST = Master.SupportSages.com
As with the other servers, we can add the MetaLogger to system startup with the following steps:
cp /usr/src/mfs-1.6.25/debian/mfs-metalogger.init /etc/init.d/mfs-metalogger.init
chmod 755 /etc/init.d/mfs-metalogger.init
update-rc.d mfs-metalogger.init defaults
/etc/init.d/mfs-metalogger.init start
Setting up Goal:
The number of copies of each chunk stored on the Chunk Servers is called the Goal. We can set the Goal for a whole directory tree or for individual files, based on the priority of the particular file.
mfssetgoal - Command used to set the Goal for a directory/file.
mfsgetgoal - Command used to get the Goal assigned to a file/directory.
Syntax:
mfssetgoal GoalValue file
eg:
mfssetgoal 2 /var/mfs-shared/test
Using the above, we have set Goal 2 for the file called test. This means there will be two copies of each chunk of the file "test".
mfssetgoal -r 2 /var/mfs-shared/directory
We can set the Goal for a whole directory tree using the recursive switch -r as shown above. The actual number of copies of a file can be verified using the commands:
mfscheckfile /var/mfs-shared/test
/var/mfs-shared/test:
2 copies: 1 chunks
mfsfileinfo /var/mfs-shared/test
/var/mfs-shared/test:
chunk 0: 00000000000520DF_00000001 / (id:356275 ver:1)
copy 1: 192.168.0.14:9622
copy 2: 192.168.0.15:9622
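To check many files at once, the mfsfileinfo output can be scanned for chunks that have fewer copies than the Goal. This is a sketch that assumes the output format shown above (a "chunk N:" line followed by "copy N:" lines). check_copies is a name I picked, not a MooseFS command.

```shell
# check_copies GOAL: read mfsfileinfo-style output on stdin and report every
# chunk that has fewer than GOAL copies. Returns 1 if any chunk is short.
# Hypothetical helper; relies on the "chunk N:" / "copy N:" layout shown above.
check_copies() {
    awk -v goal="$1" '
        function report() {
            if (chunk != "" && copies < goal) {
                printf "chunk %s: %d of %d copies\n", chunk, copies, goal
                bad = 1
            }
        }
        BEGIN { bad = 0; chunk = "" }
        /^[ \t]*chunk [0-9]+:/ {
            report()                 # close out the previous chunk
            chunk = $2
            sub(/:$/, "", chunk)     # "0:" -> "0"
            copies = 0
        }
        /^[ \t]*copy [0-9]+:/ { copies++ }
        END { report(); exit bad }
    '
}
```

For the file above you would run `mfsfileinfo /var/mfs-shared/test | check_copies 2`. No output and a zero exit status mean every chunk has at least two copies.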
Disaster Recovery:
In the event of a Master Server failure, the last metadata change log needs to be merged with the main metadata file. Once you recover the Master Server, you can merge the change log with the following steps:
mfsmetarestore -a
mfsmetarestore -a -d <directory path> (if the backup files are stored in a non-standard location)
If you are setting up a new Master Server, it needs the same master server configuration file as the old Master Server. After that, copy the last change log file and metadata.mfs.back from the MetaLogger server to the new mfsmaster configuration directory. Finally, follow the step below to recover the metadata:
mfsmetarestore -m metadata.mfs.back -o metadata.mfs changelog.0.mfs
The most recent change log holds the last changes made to the metadata before the Master Server crashed. Under normal conditions, the last change log is enough to recover the metadata.
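Before running the manual restore on a new master, it helps to confirm that the files it needs were actually copied over. The sketch below only checks for the file names used in this article (metadata.mfs.back and a changelog.*.mfs file). mfs_restore_precheck is a hypothetical helper, and the names on your MetaLogger may differ.

```shell
# mfs_restore_precheck DIR
# Confirm DIR holds what the manual restore above expects: metadata.mfs.back
# and at least one changelog.*.mfs file.
# Hypothetical helper; file names are the ones used in this article.
mfs_restore_precheck() {
    dir=$1
    if [ ! -f "$dir/metadata.mfs.back" ]; then
        echo "missing metadata.mfs.back in $dir" >&2
        return 1
    fi
    for f in "$dir"/changelog.*.mfs; do
        # An unmatched glob stays literal, so test that the file really exists.
        [ -f "$f" ] && return 0
    done
    echo "no changelog.*.mfs found in $dir" >&2
    return 1
}
```

Pass it the directory you copied the files into. If it returns successfully, the mfsmetarestore command above has the inputs it expects.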
Let me know if there are any corrections :)