I am using stock Apache Hadoop 1.1.1 and I can't get a datanode to start due to:

2015-04-23 09:12:48,138 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-04-23 09:12:48,152 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2015-04-23 09:12:48,154 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-04-23 09:12:48,154 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2015-04-23 09:12:48,254 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2015-04-23 09:12:48,608 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot access storage directory /hadoop/data/05
2015-04-23 09:12:48,608 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /hadoop/data/05 does not exist.
2015-04-23 09:12:48,731 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid value for volsFailed : 1 , Volumes tolerated : 0
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.<init>(FSDataset.java:974)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:403)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:309)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1651)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1590)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1608)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1734)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1751)

2015-04-23 09:12:48,732 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at hadoop03
************************************************************/

Now, I know that I can set the number of tolerated failed volumes higher than zero, but how do I find out which volume is actually failing? I am assuming this is an actual disk failure, since this is rather old hardware. Is there anything Hadoop-specific (or even standard Linux tooling) that I can use to determine which disk is failing?
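Note that the log above already names one candidate: `/hadoop/data/05` is reported as not existing just before the `DiskErrorException`. To check all configured data directories in one pass, the following is a minimal sketch, not Hadoop's own logic. It assumes the paths passed in are the ones listed under `dfs.data.dir` and that the script runs as the user the DataNode runs as. The function name `check_data_dirs` is hypothetical, and the tests only approximate the directory checks a DataNode performs at startup.

```python
import os
import sys


def check_data_dirs(dirs):
    """Return (path, problem) pairs for directories that fail basic checks.

    Only an approximation of what a DataNode verifies at startup:
    the path exists, is a directory, and is accessible to the current user.
    """
    problems = []
    for path in dirs:
        if not os.path.exists(path):
            problems.append((path, "does not exist"))
        elif not os.path.isdir(path):
            problems.append((path, "not a directory"))
        elif not os.access(path, os.R_OK | os.W_OK | os.X_OK):
            problems.append((path, "not readable/writable/executable"))
    return problems


if __name__ == "__main__":
    # Pass the data directories as arguments, e.g. the entries of dfs.data.dir.
    for path, problem in check_data_dirs(sys.argv[1:]):
        print(f"{path}: {problem}")
```

A directory that is missing because its disk failed to mount would show up here as "does not exist". Whether the underlying disk is physically failing would still have to be confirmed with OS-level tools such as the kernel log or SMART data.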
