performance - Hadoop Hive slow queries


I am new to Hadoop Hive and am developing a reporting solution. The problem is that query performance is really slow (Hive 0.10, HBase 0.94, Hadoop 1.1.1). One of the queries is:

    select a.*, b.country, b.city
    from p_country_town_hotel b
    inner join p_hotel_rev_agg_period a
        on (a.key.hotel = b.hotel)
    where b.hotel = 'adriapraha' and a.min_date < '20130701'
    order by a.min_date desc
    limit 10;

which takes quite a long time (50s). I know, I know, the join is on a string field and not on an integer, but the data sets are not big (approx. 3,300 and 100,000 records). I tried hints in the SQL, but it didn't turn out any faster. The same query on MS SQL Server takes 1s. Even a simple count(*) on a table takes 7-8s, which is shocking (the table has 3,300 records). I don't know what the issue is. Any ideas, or did I misinterpret Hadoop?

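Yes, you have misinterpreted Hadoop. Hadoop, and Hive as well, are not meant for real-time stuff. They are most suitable for offline, batch-processing kind of work. They are not at all a replacement for RDBMSs. Though you can do some fine tuning, 'absolute real time' is not possible. There are a lot of things that happen under the hood when you run a Hive query, which I think you are not unaware of. First of all, your Hive query gets converted into a corresponding MR job, followed by a few other things like split creation, record generation, mapper generation etc. Each of these steps costs time regardless of how small the tables are, which is why even a count(*) over 3,300 records takes several seconds. I would never suggest Hadoop (or Hive) if you have real-time needs.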
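To see how a Hive query is turned into MapReduce work, you can prefix it with Hive's EXPLAIN keyword. This prints the plan as a set of stages without running the query. The sketch below reuses the query from the question. The `from`, `where`, `and`, `order by` keywords and the alias `a` are filled in by assumption, since they are missing from the query as posted.

```sql
-- Print the execution plan instead of running the query.
-- Assumption: alias "a" refers to p_hotel_rev_agg_period, as implied by a.key.hotel.
EXPLAIN
SELECT a.*, b.country, b.city
FROM p_country_town_hotel b
INNER JOIN p_hotel_rev_agg_period a
    ON (a.key.hotel = b.hotel)
WHERE b.hotel = 'adriapraha'
  AND a.min_date < '20130701'
ORDER BY a.min_date DESC
LIMIT 10;
```

The output lists the stage dependencies. A join followed by an ORDER BY will typically show up as more than one map-reduce stage, and each stage carries its own job startup cost.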

You might want to have a look at Impala for your real-time needs.

