python - Bigrams with NLTK: problems with script

I am trying to "calculate" bigrams in a corpus with NLTK. However, there still seem to be bugs in my script. I can't figure out what I am doing wrong, and I hope someone is able to give me at least a clue. Please keep in mind that I am new to this. Thanks!

    import collections
    import nltk
    from nltk.collocations import BigramCollocationFinder

    tekst.collocations()  # tekst is assumed to be an nltk.Text built from the corpus

    bgm = nltk.collocations.BigramAssocMeasures()
    # mijn_corpus is the corpus; from_words expects a sequence of tokens
    finder = BigramCollocationFinder.from_words(mijn_corpus)
    finder.apply_freq_filter(3)  # drop bigrams that occur fewer than 3 times
    finder.nbest(bgm.pmi, 10)    # the 10 bigrams with the highest PMI

    scored_bgm = finder.score_ngrams(bgm.likelihood_ratio)
    prefix_keys = collections.defaultdict(list)

    for key, scores in scored_bgm:  # group on the first word of the bigram
        prefix_keys[key[0]].append((key[1], scores))
    for key in prefix_keys:  # strongest association first
        prefix_keys[key].sort(key=lambda x: -x[1])
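To make the intent of the grouping step easier to check, here is a minimal sketch that uses only the standard library. It is a stand-in, not NLTK's implementation: it ranks by raw bigram counts rather than likelihood ratio. The function name `bigrams_by_prefix`, the `min_freq` parameter, and the sample sentence are all assumptions made for illustration.

```python
import collections


def bigrams_by_prefix(tokens, min_freq=3):
    """Group bigrams by their first word, most frequent follower first."""
    # Count every pair of adjacent tokens.
    counts = collections.Counter(zip(tokens, tokens[1:]))

    prefix_keys = collections.defaultdict(list)
    for (first, second), freq in counts.items():
        # Same idea as finder.apply_freq_filter(min_freq):
        # keep only bigrams seen at least min_freq times.
        if freq >= min_freq:
            prefix_keys[first].append((second, freq))

    # Sort each follower list by descending count.
    for first in prefix_keys:
        prefix_keys[first].sort(key=lambda pair: -pair[1])
    return dict(prefix_keys)


# Hypothetical toy corpus, already split into tokens.
tokens = ("the cat sat the cat ran the cat sat "
          "the dog sat the dog ran the dog sat").split()
result = bigrams_by_prefix(tokens, min_freq=2)
print(result["cat"])
```

Note that the input is a list of tokens, not a raw string or a file path. Passing a string to a bigram counter would pair up individual characters instead of words, which is a common source of surprising results.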
