java - Massive multiprogramming and read-only file access

I am trying to create a dictionary-based tagger running on a Hadoop cluster using Pig. Basically, what it does is, for each document (quite large text documents, a few MB each), run each word in each sentence against the dictionary and read the corresponding value.

There will be a few hundred Java programs (not threads) running in parallel, all using the same dictionary file in read-only mode. The idea is to load the dictionary as text and create a map to query against.

Question: what should I be prepared for? Is it even remotely logical to want to read the same file in a multiprogramming environment, or should I first copy the (relatively small) file to each instance of the program? Which BufferedReader should I use while reading the file?

There is little structured documentation on multiprogramming (compared to multithreading), so I am a bit afraid of running against a wall doing so.

Note: you are allowed to answer that my way of thinking is totally wrong, if you provide me with a better way ;-)

I think your approach is fine. You should load the dictionary from the DistributedCache into memory, and then run the checks against the memory-loaded dictionary (e.g., a HashMap).
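
A minimal sketch of the "load once, then look up in memory" part, using only the standard library. The class name `DictionaryTagger`, the method names, and the one-entry-per-line `term<TAB>tag` file format are assumptions for illustration, not something from the question. It also assumes the dictionary is already available as a local file on the task node (which is what the DistributedCache is meant to provide); the Hadoop and Pig wiring itself is not shown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: each process builds its own private, read-only copy of the map.
final class DictionaryTagger {

    // Reads the whole dictionary once into memory.
    // Assumed format: one "term<TAB>tag" entry per line, UTF-8 encoded.
    static Map<String, String> loadDictionary(Path file) throws IOException {
        Map<String, String> dictionary = new HashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab <= 0) {
                    continue; // skip blank lines and lines without a term
                }
                String term = line.substring(0, tab).toLowerCase(Locale.ROOT);
                dictionary.put(term, line.substring(tab + 1));
            }
        }
        return dictionary;
    }

    // Looks up every whitespace-separated word of a sentence in the in-memory map
    // and returns "word/tag" for each word that has an entry.
    static List<String> tag(String sentence, Map<String, String> dictionary) {
        List<String> tags = new ArrayList<>();
        for (String word : sentence.toLowerCase(Locale.ROOT).split("\\s+")) {
            String tag = dictionary.get(word);
            if (tag != null) {
                tags.add(word + "/" + tag);
            }
        }
        return tags;
    }
}
```

Because the file is only ever read, each program can open it with a plain `BufferedReader` without any locking; once loaded, all lookups hit the process-local `HashMap` and never touch the file again.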
