hadoop - Convert HDFS SequenceFile from SnappyCodec to DefaultCodec


Given the choice, I'd love something I can run from the hadoop shell rather than an MR job. I have a few files that need to be converted.

Some untested code that should do the trick (obviously the file names are made up - sequence files don't typically have extensions):

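    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionCodecFactory;
    import org.apache.hadoop.io.compress.DefaultCodec;
    import org.apache.hadoop.util.ReflectionUtils;

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    Path inputPath = new Path("part-r-00000.snappy");
    Path outputPath = new Path("part-r-00000.deflate");
    FSDataOutputStream dos = fs.create(outputPath);

    // The reader detects the source codec from the file header.
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, inputPath, conf);
    Writable key = (Writable) ReflectionUtils.newInstance(
            reader.getKeyClass(), conf);
    Writable value = (Writable) ReflectionUtils.newInstance(
            reader.getValueClass(), conf);

    // Write with DefaultCodec, keeping the source compression type.
    CompressionCodecFactory ccf = new CompressionCodecFactory(conf);
    CompressionCodec codec = ccf.getCodecByClassName(DefaultCodec.class.getName());
    SequenceFile.Writer writer = SequenceFile.createWriter(conf, dos,
            key.getClass(), value.getClass(), reader.getCompressionType(),
            codec);

    while (reader.next(key, value)) {
        writer.append(key, value);
    }

    reader.close();
    // Close the writer first so buffered records are flushed to the stream.
    writer.close();
    dos.close();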

You should acquire the Configuration via the ToolRunner / Tool pattern; a similar question outlines it in case the principle is new to you.
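
For readers new to the ToolRunner / Tool pattern, here is a minimal sketch of how such a converter might be wrapped. The class name ConvertSequenceFile and the two positional arguments are hypothetical, and the conversion body is left as a placeholder; only the way the Configuration is obtained is the point here.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical class name, used for illustration only.
public class ConvertSequenceFile extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: ConvertSequenceFile <input> <output>");
            return 2;
        }
        // getConf() returns the Configuration that ToolRunner populated from
        // generic options such as -D key=value, -conf <file> and -fs <uri>.
        Configuration conf = getConf();
        Path inputPath = new Path(args[0]);
        Path outputPath = new Path(args[1]);

        // Placeholder: open the SequenceFile reader and writer with conf
        // and copy the records from inputPath to outputPath here.
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options before calling run().
        System.exit(ToolRunner.run(new Configuration(), new ConvertSequenceFile(), args));
    }
}
```

Packaged into a jar, a class like this could be started with the hadoop jar command, which fits the wish for something that runs from the shell without writing an MR job.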

