java - How do I count the number of files in HDFS from an MR job?
I'm new to Hadoop, and to Java for that matter. I'm trying to count the number of files in a folder on HDFS from the MapReduce driver I'm writing. I'd like to do this without calling the HDFS shell, because I want to be able to pass in the directory to use when I run the MapReduce job. I've tried a number of methods but have had no success with the implementation, due to my inexperience with Java.
Any help would be appreciated.
Thanks,
Nomad
You can use FileSystem and iterate over the files inside the path. Here is some example code:
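    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.LocatedFileStatus;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.RemoteIterator;

    int count = 0;
    // getConf() is available when the driver extends Configured
    FileSystem fs = FileSystem.get(getConf());
    // false: count only the files directly inside the path, not in subdirectories
    boolean recursive = false;
    RemoteIterator<LocatedFileStatus> ri = fs.listFiles(new Path("hdfs://my/path"), recursive);
    // listFiles returns files only, so directories are not counted
    while (ri.hasNext()) {
        count++;
        ri.next();
    }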
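Since the question asks about passing the directory in when the job is run, here is a minimal sketch of how the same loop might sit inside a driver that reads the directory from its first command-line argument. The class name `FileCountDriver`, the helper `countFiles`, and the usage message are hypothetical names made up for illustration. The sketch assumes the driver is launched through `ToolRunner`, and it leaves the actual job setup out.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver class; the name is an assumption, not from the original answer.
class FileCountDriver extends Configured implements Tool {

    // Counts the files under dir. With recursive == false, only files directly
    // inside dir are counted; directories themselves are never counted.
    static long countFiles(Configuration conf, Path dir, boolean recursive) throws IOException {
        // Resolve the file system from the path itself, so a fully qualified
        // hdfs:// URI and a plain path on the default file system both work.
        FileSystem fs = dir.getFileSystem(conf);
        RemoteIterator<LocatedFileStatus> files = fs.listFiles(dir, recursive);
        long count = 0;
        while (files.hasNext()) {
            files.next();
            count++;
        }
        return count;
    }

    @Override
    public int run(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("Usage: FileCountDriver <directory>");
            return 2;
        }
        long count = countFiles(getConf(), new Path(args[0]), false);
        System.out.println(count + " files in " + args[0]);
        // The MapReduce job itself would be configured and submitted here.
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses the generic Hadoop options before calling run().
        System.exit(ToolRunner.run(new Configuration(), new FileCountDriver(), args));
    }
}
```

With this layout the directory is whatever is given as the first argument after the class name when the job is launched, so nothing needs to be hard-coded in the driver.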