This post describes a step-by-step procedure for configuring the Quantcast File System (QFS) in a Cloudera Hadoop cluster. The QFS wiki on GitHub contains a lot of really valuable information which I used to configure QFS in my Hadoop environment, but for individuals with little time the instructions below could come in handy.
My Cloudera Hadoop deployment runs in a VM environment on CentOS 6.4, and I wanted to test the functionality of QFS's HDFS interface. As the intent was just to have a functional QFS in place for testing, in a production environment you may end up taking steps beyond the ones specified in this post (such as rotating the log files). Below is a diagram that describes the CDH setup before QFS is installed and configured.
Two other posts will come in handy for a seamless install; before following the steps in this post, I recommend that, at a bare minimum, Oracle Java be properly installed:
- Configuring Oracle Java on CentOS
- Building QFS from source
This post assumes that you have Java installed and configured properly, with the environment variable JAVA_HOME correctly pointing to the installed version of Oracle Java, and that you have a tarball containing the binaries for the latest QFS release.
Step 1: Install dependencies
First of all, let's ensure all the required packages are installed on each node. Not all of the packages listed below may be required, but I have found this subset to be a good collection for meeting most of my needs.
$ yum groupinstall -y "Development tools"
$ rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
$ yum install -y boost boost-devel git xfsprogs xfsprogs-devel libuuid-devel \
      cmake openssl-devel wget redhat-lsb daemonize
Step 2: Prepare the datastore data & metadata
Just like HDFS, QFS depends upon a local file system as the datastore for storing the chunks / blocks of the distributed file system. I believe ext4 is the most common choice for vanilla HDFS installations, but with QFS the recommended file system is XFS, as described in the QFS Deployment Guide.
Before QFS can be configured and started, these XFS volumes should be mounted on the nodes that will serve QFS. Below are the steps that I followed for configuring the XFS volumes. The device /dev/sdd on each of my nodes is a block device dedicated to QFS. In a real-world scenario there would be multiple such block devices on each of the nodes (one per HDD):
$ mkfs.xfs /dev/sdd
$ echo "/dev/sdd /mnt/disk2 xfs defaults 1 1" >> /etc/fstab
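When a node has several dedicated devices, the two commands above can be looped. A minimal sketch (hypothetical device list; adjust to your hardware; the destructive mkfs/mkdir steps are commented out so the loop can be dry-run, writing the candidate fstab lines to a scratch file for review):

```shell
#!/bin/sh
# Build /etc/fstab entries for a list of QFS data devices (hypothetical
# names). Destructive steps stay commented out for the dry run.
i=2
: > fstab.addendum
for dev in /dev/sdd /dev/sde; do
    mnt="/mnt/disk$i"
    # mkfs.xfs "$dev"            # format the device as XFS
    # mkdir -p "$mnt"            # create the mount point
    echo "$dev $mnt xfs defaults 1 1" >> fstab.addendum
    i=$((i + 1))
done
cat fstab.addendum               # review, then append to /etc/fstab
```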
XFS is a very versatile file system and offers plenty of tunables for optimization, some at file-system creation time (like log size) and others at mount time (like logbufs). There is a good blog post that talks about some of these parameters and their effects.
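Mount-time tunables can simply go into the fstab entry; the option values below are illustrative, not recommendations:

```
/dev/sdd  /mnt/disk2  xfs  defaults,noatime,logbufs=8  1 1
```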
Step 3: Install QFS Binaries
Next, extract the QFS binary tarball to /opt on each node. I use a parallel shell to make the process quicker; functionally that translates to the two commands described below executed on each node. The package qfs-centos-6.4-1.0.2-x86_64.tgz is the binary distribution and should be placed on each node under /tmp.
$ tar -zxvf /tmp/qfs-centos-6.4-1.0.2-x86_64.tgz -C /opt/
$ ln -s /opt/qfs-centos-6.4-1.0.2-x86_64 /opt/qfs
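If you do not have a parallel shell handy, distributing the tarball by hand looks roughly like this (a hypothetical sketch assuming passwordless ssh/scp to each node; the commands are echoed as a dry run, so remove the leading echo to execute):

```shell
#!/bin/sh
# Dry-run sketch: push and extract the QFS tarball on every node.
# Node names are from my setup; replace with yours.
cmds=$(
  for node in hdp1 hdp4 hdp5 hdp6; do
    echo "scp /tmp/qfs-centos-6.4-1.0.2-x86_64.tgz $node:/tmp/"
    echo "ssh $node 'tar -zxf /tmp/qfs-centos-6.4-1.0.2-x86_64.tgz -C /opt/ && ln -s /opt/qfs-centos-6.4-1.0.2-x86_64 /opt/qfs'"
  done
)
echo "$cmds"
```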
Once the binaries are extracted, let's create some essential directories that will be used by the QFS subsystem for configuration and logging:
$ mkdir -p /etc/qfs
$ mkdir -p /var/log/qfs/transaction_logs
$ mkdir -p /var/lib/qfs/checkpoint
Also, for each directory that is to be used as a QFS datastore, I prefer creating a directory named qfs for storing the chunks rather than placing them at the root of the mount. So create an empty directory qfs on each of the mounted XFS partitions:
$ mkdir -p /mnt/disk1/qfs
$ mkdir -p /mnt/disk2/qfs
Now we are ready to place the configuration files and init scripts so that the QFS services can be started automatically. At the end of the post I have included the complete, commented files, which can be modified for your environment. A zip file containing these scripts can also be downloaded from this link: qfs_config_files.
There is a very good description of the QFS architecture on the QFS Wiki. In my setup I will use the host hdp1 as the Metaserver and Webserver, and the DataNodes of my Hadoop cluster (hdp4, hdp5 and hdp6) will be used as Chunkservers.
Step 4: Configure Metaserver
Filtering out comments and empty lines, the block below shows my Metaserver.prp file, which is placed at /etc/qfs/Metaserver.prp on host hdp1. Notice line 1, where I have changed the default clientPort from 20000 to 24000 as it conflicts with Impala, and line 8, where the IP addresses represent the hosts hdp4, hdp5 and hdp6, which will be used as Chunkservers:
metaServer.clientPort = 24000
metaServer.chunkServerPort = 30000
metaServer.logDir = /var/log/qfs/transaction_logs
metaServer.cpDir = /var/lib/qfs/checkpoint
metaServer.createEmptyFs = 1
metaServer.recoveryInterval = 30
metaServer.clusterKey = qfs-cdh-1.0.2
metaServer.rackPrefixes = 192.168.79.84 1 192.168.79.85 2 192.168.79.86 3
metaServer.msgLogWriter.logLevel = INFO
chunkServer.msgLogWriter.logLevel = NOTICE
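To view an effective configuration the same way as the listing above (comments and blank lines stripped), a simple grep works; it is shown here against a small sample file so the snippet is self-contained:

```shell
# Create a sample .prp file, then print only the effective settings.
cat > /tmp/sample.prp <<'EOF'
# QFS metaserver configuration
metaServer.clientPort = 24000

metaServer.chunkServerPort = 30000
EOF
grep -vE '^[[:space:]]*(#|$)' /tmp/sample.prp
```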
Next, place the file qfs-webui.conf under /etc/qfs/ on host hdp1. The complete content of my qfs-webui.conf is at the end of the post.
Step 5: Configure Chunkservers
For the Chunkservers, place the file Chunkserver.prp under /etc/qfs/ on each of the hosts that are going to act as Chunkservers; in my setup these are hdp4, hdp5 and hdp6. Filtering out comments and empty lines, my Chunkserver.prp is shown in the block below (the entire file, which can be modified accordingly, is at the end of the blog post). Notice lines 1 and 3: replace the IP address with the IP address of your Metaserver, and the port with 24000 (default is 20000) in order to resolve the conflict with Impala:
chunkServer.metaServer.hostname = 192.168.79.81
chunkServer.metaServer.port = 30000
chunkServer.clientPort = 24000
chunkServer.chunkDir = /mnt/disk2/qfs/
chunkServer.clusterKey = qfs-cdh-1.0.2
chunkServer.stdout = /dev/null
chunkServer.stderr = /dev/null
chunkServer.ioBufferPool.partitionBufferCount = 131072
chunkServer.msgLogWriter.logLevel = INFO
Step 6: Configure environment and install Init Scripts
In order to make QFS work seamlessly as a replacement for HDFS, some environment variables need to be configured. I find it easiest to define them in the global bash profile; placing the following content under /etc/profile.d/qfs.sh on each of the nodes is one method that works for me.
#!/bin/bash
export PATH=$PATH:/opt/qfs/bin:/opt/qfs/bin/tools/
export HADOOP_CLASSPATH=/opt/qfs/lib/*
export JAVA_LIBRARY_PATH=/opt/qfs/lib/
export LD_LIBRARY_PATH=/opt/qfs/lib/
export QFSPARAM="-Dfs.qfs.impl=com.quantcast.qfs.hadoop.QuantcastFileSystem -Dfs.defaultFS=qfs://hdp1.apnet.la:24000 -Dfs.qfs.metaServerHost=hdp1.apnet.la -Dfs.qfs.metaServerPort=24000"
The use of the environment variable QFSPARAM will be explained later; remember to replace the hostname hdp1.apnet.la with the hostname of your Metaserver.
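A quick sanity check that the profile script exports what we expect; the script is recreated under /tmp here so the snippet is self-contained (on a real node you would source /etc/profile.d/qfs.sh directly):

```shell
# Recreate the profile script and verify QFSPARAM after sourcing it.
cat > /tmp/qfs.sh <<'EOF'
export PATH=$PATH:/opt/qfs/bin:/opt/qfs/bin/tools/
export HADOOP_CLASSPATH=/opt/qfs/lib/*
export QFSPARAM="-Dfs.qfs.impl=com.quantcast.qfs.hadoop.QuantcastFileSystem -Dfs.defaultFS=qfs://hdp1.apnet.la:24000 -Dfs.qfs.metaServerHost=hdp1.apnet.la -Dfs.qfs.metaServerPort=24000"
EOF
. /tmp/qfs.sh
echo "$QFSPARAM"
```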
Next we will install the init scripts so that the QFS services can start automatically. Place the files qfs-metaserver and qfs-webui under /etc/init.d/ on hdp1 (or your Metaserver), and the file qfs-chunkserver under /etc/init.d/ on each Chunkserver (in my case hdp4, hdp5 and hdp6). The complete files are available at the end of the post (Appendix).
On the Metaserver (hdp1 in my case) execute the following commands to install the services:
$ chkconfig --add qfs-metaserver
$ chkconfig --add qfs-webui
$ chmod +x /etc/init.d/qfs-*
And on each node acting as a Chunkserver execute the following commands:
$ chkconfig --add qfs-chunkserver
$ chmod +x /etc/init.d/qfs-*
Step 7: Start the services
Now we are ready to start the services. Execute service qfs-metaserver start and service qfs-webui start on the Metaserver (hdp1), and service qfs-chunkserver start on each of the Chunkservers (hdp4, hdp5 and hdp6).
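The whole startup can be scripted from one shell. A sketch assuming passwordless ssh, echoed as a dry run (drop the echo to execute):

```shell
#!/bin/sh
# Dry run: start the metaserver/webui on hdp1 and a chunkserver on each
# data node. Host names are from my setup; replace with yours.
cmds=$(
  echo "ssh hdp1 'service qfs-metaserver start && service qfs-webui start'"
  for node in hdp4 hdp5 hdp6; do
    echo "ssh $node 'service qfs-chunkserver start'"
  done
)
echo "$cmds"
```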
Do keep an eye on the log files, such as /var/log/qfs/metaserver.err, to see if you encounter any issues.
Step 8: Sample session
Below is the output of a QFS shell session showing some commands to interact with the QFS namespace:
[root@hdp1 qfs]$ qfsshell -s hdp1.apnet.la -p 24000
QfsShell> ls
dumpster root user
QfsShell> cd user
QfsShell> ls
abisen
QfsShell> ?
Unknown cmd: ?
Supported cmds are:
append cd changeReplication chgrp chmod chown cinfo cp exit finfo help ls mkdir mv pwd quit rm rmdir stat
Type <cmd name> --help for command specific help
QfsShell>
For using QFS as a replacement for HDFS, the only modification required to the usual process is the inclusion of the $QFSPARAM environment variable that we defined in /etc/profile.d/qfs.sh:

[root@hdp1 qfs]$ hdfs dfs $QFSPARAM -ls /
Found 3 items
drwx------   - root root          0 2013-06-13 00:29 /dumpster
drwxr-xr-x   - root root          0 2013-06-13 00:59 /root
drwxr-xr-x   - root root          0 2013-06-13 09:40 /user
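Typing $QFSPARAM on every invocation gets tedious; a small wrapper function (a hypothetical helper, not part of QFS) keeps the commands short:

```shell
# qdfs: run "hdfs dfs" with the QFS connection properties prepended.
# QFSPARAM is expanded unquoted on purpose so the -D options split
# into separate arguments.
QFSPARAM=${QFSPARAM:-"-Dfs.qfs.impl=com.quantcast.qfs.hadoop.QuantcastFileSystem -Dfs.defaultFS=qfs://hdp1.apnet.la:24000 -Dfs.qfs.metaServerHost=hdp1.apnet.la -Dfs.qfs.metaServerPort=24000"}
qdfs() { hdfs dfs $QFSPARAM "$@"; }
# usage: qdfs -ls /    qdfs -mkdir /tmp-test    qdfs -put local.txt /user/
```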
The session below shows a sample teragen run to ensure that we can successfully execute a MapReduce job:
[root@hdp1 ~]$ hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar teragen $QFSPARAM 100000 /root/tgen-out
13/06/18 09:22:32 WARN conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
13/06/18 09:22:32 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
13/06/18 09:22:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
Generating 100000 using 1 maps with step of 100000
13/06/18 09:22:32 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
13/06/18 09:22:33 INFO mapred.JobClient: Running job: job_local107370576_0001
13/06/18 09:22:33 INFO mapred.LocalJobRunner: OutputCommitter set in config null
13/06/18 09:22:33 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter
13/06/18 09:22:33 INFO mapred.LocalJobRunner: Waiting for map tasks
13/06/18 09:22:33 INFO mapred.LocalJobRunner: Starting task: attempt_local107370576_0001_m_000000_0
13/06/18 09:22:33 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
13/06/18 09:22:33 INFO util.ProcessTree: setsid exited with exit code 0
13/06/18 09:22:33 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@5097d026
13/06/18 09:22:33 INFO mapred.MapTask: Processing split: org.apache.hadoop.examples.terasort.TeraGen$RangeInputFormat$RangeInputSplit@238b8914
13/06/18 09:22:33 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and BYTES_READ as counter name instead
13/06/18 09:22:33 INFO mapred.MapTask: numReduceTasks: 0
13/06/18 09:22:34 INFO mapred.JobClient:  map 0% reduce 0%
13/06/18 09:22:34 INFO mapred.Task: Task:attempt_local107370576_0001_m_000000_0 is done. And is in the process of commiting
13/06/18 09:22:34 INFO mapred.LocalJobRunner:
13/06/18 09:22:34 INFO mapred.Task: Task attempt_local107370576_0001_m_000000_0 is allowed to commit now
13/06/18 09:22:34 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local107370576_0001_m_000000_0' to qfs://hdp1.apnet.la:24000/root/tgen-out
13/06/18 09:22:34 INFO mapred.LocalJobRunner:
13/06/18 09:22:34 INFO mapred.Task: Task 'attempt_local107370576_0001_m_000000_0' done.
13/06/18 09:22:34 INFO mapred.LocalJobRunner: Finishing task: attempt_local107370576_0001_m_000000_0
13/06/18 09:22:34 INFO mapred.LocalJobRunner: Map task executor complete.
13/06/18 09:22:35 INFO mapred.JobClient:  map 100% reduce 0%
13/06/18 09:22:35 INFO mapred.JobClient: Job complete: job_local107370576_0001
13/06/18 09:22:35 INFO mapred.JobClient: Counters: 19
13/06/18 09:22:35 INFO mapred.JobClient:   File System Counters
13/06/18 09:22:35 INFO mapred.JobClient:     FILE: Number of bytes read=142955
13/06/18 09:22:35 INFO mapred.JobClient:     FILE: Number of bytes written=238311
13/06/18 09:22:35 INFO mapred.JobClient:     FILE: Number of read operations=0
13/06/18 09:22:35 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/06/18 09:22:35 INFO mapred.JobClient:     FILE: Number of write operations=0
13/06/18 09:22:35 INFO mapred.JobClient:     QFS: Number of bytes read=0
13/06/18 09:22:35 INFO mapred.JobClient:     QFS: Number of bytes written=10000000
13/06/18 09:22:35 INFO mapred.JobClient:     QFS: Number of read operations=0
13/06/18 09:22:35 INFO mapred.JobClient:     QFS: Number of large read operations=0
13/06/18 09:22:35 INFO mapred.JobClient:     QFS: Number of write operations=0
13/06/18 09:22:35 INFO mapred.JobClient:   Map-Reduce Framework
13/06/18 09:22:35 INFO mapred.JobClient:     Map input records=100000
13/06/18 09:22:35 INFO mapred.JobClient:     Map output records=100000
13/06/18 09:22:35 INFO mapred.JobClient:     Input split bytes=82
13/06/18 09:22:35 INFO mapred.JobClient:     Spilled Records=0
13/06/18 09:22:35 INFO mapred.JobClient:     CPU time spent (ms)=0
13/06/18 09:22:35 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
13/06/18 09:22:35 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
13/06/18 09:22:35 INFO mapred.JobClient:     Total committed heap usage (bytes)=45744128
13/06/18 09:22:35 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/06/18 09:22:35 INFO mapred.JobClient:     BYTES_READ=100000
For some reason the environment variables in /etc/profile.d/qfs.sh were not coming into effect when a job was being submitted from a node other than the Metaserver, and I was experiencing the exception below (see this Discussion):

java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.quantcast.qfs.hadoop.QuantcastFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1587)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2279)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2292)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:87)
    …
The workaround that worked for me was copying all the QFS JAR files to /opt/cloudera/parcels/CDH/lib/hadoop and the shared libraries to /opt/cloudera/parcels/CDH/lib/hadoop/lib/native on each node.
$ cp /opt/qfs/lib/*qfs*.so /opt/cloudera/parcels/CDH/lib/hadoop/lib/native/
$ cp /opt/qfs/lib/*qfs*.jar /opt/cloudera/parcels/CDH/lib/hadoop/
Appendix: Complete commented configuration files
I have pasted below the complete configuration files that I have used in my setup. These files, in some form, were pulled from the GitHub repository.