XML processing with Hadoop
I have an XML file of the following form:

    <tree>
      <subtree> some_text1 </subtree>
      <subtree> some_text2 </subtree>
    </tree>

I have 10 nodes in my cluster, and I want each mapper to receive one subtree each time the 'map' method is called. That is, when the map method is called the first time, the map tasks running on the 10 nodes should be able to access the first subtree element; when it is called the second time, they should be able to access the second subtree element in the XML file. Is there a way I can do this?
There is no out-of-the-box (OOB) feature available for this. You will have to write a custom InputFormat that supplies a RecordReader implemented to your requirements. You might find this link helpful: http://www.undercloud.org/?p=408
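To give a rough idea of what such a RecordReader has to do, here is a minimal sketch of the record-splitting step in plain Java, with no Hadoop dependency. The class name SubtreeRecordScanner and its methods are hypothetical and are not part of Hadoop. In a real job, logic like this would presumably live inside the nextKeyValue method of the RecordReader returned by your custom InputFormat. The sketch assumes the tags are written exactly as <subtree> and </subtree> (no attributes, no nesting), and it ignores input split boundaries, which a real implementation must handle.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Hadoop class: pulls one <subtree>...</subtree>
// element at a time out of a character stream.
final class SubtreeRecordScanner {
    // Assumption: tags appear exactly in this form, without attributes.
    private static final String START_TAG = "<subtree>";
    private static final String END_TAG = "</subtree>";

    private final Reader in;

    SubtreeRecordScanner(Reader in) {
        this.in = in;
    }

    // Returns the next complete element as a string, or null when the
    // input holds no further complete element.
    String nextRecord() throws IOException {
        if (!readUntil(START_TAG, null)) {
            return null;
        }
        StringBuilder record = new StringBuilder(START_TAG);
        if (!readUntil(END_TAG, record)) {
            return null; // input ended inside an element
        }
        return record.toString();
    }

    // Consumes characters until tag has been seen in full. Characters are
    // copied into sink when sink is non-null. Restarting the match at '<'
    // is sufficient because '<' occurs only at index 0 of either tag.
    private boolean readUntil(String tag, StringBuilder sink) throws IOException {
        int matched = 0;
        int c;
        while ((c = in.read()) != -1) {
            if (sink != null) {
                sink.append((char) c);
            }
            if (c == tag.charAt(matched)) {
                matched++;
                if (matched == tag.length()) {
                    return true;
                }
            } else {
                matched = (c == tag.charAt(0)) ? 1 : 0;
            }
        }
        return false;
    }

    // Convenience wrapper for trying the scanner on an in-memory string.
    static List<String> splitRecords(String xml) throws IOException {
        SubtreeRecordScanner scanner = new SubtreeRecordScanner(new StringReader(xml));
        List<String> records = new ArrayList<>();
        String record;
        while ((record = scanner.nextRecord()) != null) {
            records.add(record);
        }
        return records;
    }
}
```

Calling SubtreeRecordScanner.splitRecords on the sample file from the question yields one string per subtree element. Inside a RecordReader, each of those strings would become the value handed to a single 'map' call.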