Q: What could be wrong if I get the error:
"Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable"
Ans: This is generally a programmer error; there are several possible causes (a minimal sketch follows the list):
- a typo in the class passed to job.setMapperClass(), or the call is missing altogether
- the mapper class is declared with the wrong generic types, e.g. Mapper<LongWritable,Text,Text,DoubleWritable> while the driver declares different map output key/value classes
- a mix of old-API classes (org.apache.hadoop.mapred.*) and new-API classes (org.apache.hadoop.mapreduce.*)
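For example, here is a minimal sketch (the mapper name, input format, and field layout are hypothetical) showing how the mapper's declared output types must line up with the driver settings:

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Declared output types (Text, DoubleWritable) must match what context.write() emits
// and what the driver sets via setMapOutputKeyClass()/setMapOutputValueClass().
public class PriceMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws java.io.IOException, InterruptedException {
        String[] fields = value.toString().split(",");
        context.write(new Text(fields[0]), new DoubleWritable(Double.parseDouble(fields[1])));
    }
}

// Driver side: these must agree with the mapper's generic parameters.
// If job.setMapperClass(PriceMapper.class) is omitted, the default identity Mapper
// passes the LongWritable offset key through, producing exactly the
// "expected Text, received LongWritable" error.
// job.setMapperClass(PriceMapper.class);
// job.setMapOutputKeyClass(Text.class);
// job.setMapOutputValueClass(DoubleWritable.class);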
Q: What hidden files are created in the output directory along with the part files and the _SUCCESS file?
Ans: Hidden .crc (cyclic redundancy check) files are generated for both the part files and the _SUCCESS file.
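These hidden files do not break job chaining: FileInputFormat skips any path whose name starts with "_" or ".". The sketch below mirrors that built-in convention as a standalone PathFilter, so a follow-up job reading the previous output directory would ignore _SUCCESS and the .crc files:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

// Equivalent of the hidden-file filter FileInputFormat applies by default.
public class HiddenPathFilter implements PathFilter {
    @Override
    public boolean accept(Path path) {
        String name = path.getName();
        return !name.startsWith("_") && !name.startsWith(".");
    }
}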
Q: Which types do you use to create an ORC file?
Ans: NullWritable and OrcStruct.
Example: Reducer<Text,DoubleWritable,NullWritable,OrcStruct>
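A minimal sketch of such a reducer, using the orc-mapreduce library (the schema and field names "word"/"avg" are hypothetical), could look like this:

import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.orc.TypeDescription;
import org.apache.orc.mapred.OrcStruct;

// Averages the values for each key and writes one ORC row per key.
public class AvgOrcReducer extends Reducer<Text, DoubleWritable, NullWritable, OrcStruct> {

    private static final TypeDescription SCHEMA =
            TypeDescription.fromString("struct<word:string,avg:double>");
    private final OrcStruct row = new OrcStruct(SCHEMA);

    @Override
    protected void reduce(Text key, Iterable<DoubleWritable> values, Context context)
            throws IOException, InterruptedException {
        double sum = 0;
        long count = 0;
        for (DoubleWritable v : values) {
            sum += v.get();
            count++;
        }
        row.setFieldValue("word", key);
        row.setFieldValue("avg", new DoubleWritable(sum / count));
        context.write(NullWritable.get(), row);
    }
}

In the driver you would also set job.setOutputFormatClass(OrcOutputFormat.class) and register the same schema string via OrcConf.MAPRED_OUTPUT_SCHEMA so the output format knows how to serialize the rows.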
Q: How do you access Hive from MapReduce?
Ans: Through HCatalog, which makes Hive metadata available to other Hadoop tools such as Pig and MapReduce. It provides connectors for MapReduce and Pig so that users of those tools can read data from and write data to Hive's warehouse.