performance - Hadoop Hive slow queries
I am new to Hadoop and Hive, and I am developing a reporting solution. The problem is that query performance is really slow (Hive 0.10, HBase 0.94, Hadoop 1.1.1). One of the queries is:

select a.*, b.country, b.city from p_country_town_hotel b inner join p_hotel_rev_agg_period a on (a.key.hotel = b.hotel) where b.hotel = 'adriapraha' and a.min_date < '20130701' order by a.min_date desc limit 10;

which takes quite a long time (50s). I know, I know, the join is on a string field and not on an integer, but the data sets are not big (circa 3300 and 100000 records). I tried hints on this SQL, but it didn't turn out any faster. The same query on MS SQL Server takes 1s. Even a simple count(*) on a table takes 7-8s, which is shocking (the table has 3300 records). I really don't know what the issue is. Any ideas, or did I misinterpret Hadoop?
Yes, you have misinterpreted Hadoop. Hadoop, and Hive as well, are not meant for real-time stuff. They are most suitable for offline, batch-processing kind of work. They are not at all a replacement for RDBMSs. You can do some fine tuning, but 'absolute real time' is not possible. A lot of things happen under the hood when you run a Hive query, which I think you are not aware of. First of all, your Hive query gets converted into a corresponding MR job, followed by a few other things like split creation, record generation, mapper generation, etc. I would never suggest Hadoop (or Hive) if you have real-time needs.
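If you want to see some of this for yourself, Hive's EXPLAIN statement prints the plan a query is compiled into, including its MapReduce stages. Here is a rough sketch using the query from the question as I read it (the table and column names come from the question; the plan you get depends on your Hive version and settings, so I am not showing any output):

```sql
-- EXPLAIN only prints the execution plan; it does not run the query.
-- Table and column names are the ones from the question.
EXPLAIN
SELECT a.*, b.country, b.city
FROM p_country_town_hotel b
INNER JOIN p_hotel_rev_agg_period a ON (a.key.hotel = b.hotel)
WHERE b.hotel = 'adriapraha'
  AND a.min_date < '20130701'
ORDER BY a.min_date DESC
LIMIT 10;
```

Each map-reduce stage listed in the plan is launched as a separate job, and job startup is a largely fixed cost that does not shrink with the table, which is probably why even a count(*) over 3300 records takes several seconds.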
You might wanna have a look at Impala for your real-time needs.