Algorithm to calculate how much of text A is in text B?
I need to calculate how much of a block of text (A) is contained in another block of text (B). Simple algorithms such as Soundex aren't giving me great results, because text B contains additional text that isn't, and shouldn't be, in text A, which throws the figures off. I need to determine the percentage of A that appears within B, and ignore the additions in B.
My first thought is that a simple algorithm might work in this case: split A into sentences, note the total number of sentences, then search B for an instance of each sentence to produce a percentage. While this should work, it feels quite hacky, and I'm sure someone more intelligent has devised an algorithm that provides a better calculation on a similar principle.
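For reference, the sentence-by-sentence idea could look roughly like this in Python. This is only a sketch: the function names `split_sentences` and `sentence_coverage` are made up for illustration, the sentence splitter is deliberately naive (it breaks after `.`, `!` or `?` followed by whitespace), and a sentence only counts if it appears in B verbatim.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def sentence_coverage(a, b):
    """Percentage of the sentences in `a` that occur verbatim in `b`."""
    sentences = split_sentences(a)
    if not sentences:
        return 0.0
    # Plain substring search; extra text in `b` does not affect the score.
    found = sum(1 for s in sentences if s in b)
    return 100.0 * found / len(sentences)


if __name__ == "__main__":
    a = "The cat sat. The dog ran. Birds fly."
    b = "Intro text. The cat sat. Something extra. Birds fly. The end."
    print(round(sentence_coverage(a, b), 1))
```

The weakness is visible in the substring test: a single changed word or different whitespace inside a sentence makes the whole sentence count as missing.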
Longest common subsequence looks best suited to your purposes.
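A minimal sketch of how that might be applied, assuming you compare the texts word by word and divide the length of the longest common subsequence by the number of words in A. The names `lcs_length` and `percent_of_a_in_b` are illustrative, and lowercasing plus whitespace splitting is just one possible way to tokenize.

```python
def lcs_length(xs, ys):
    """Length of the longest common subsequence of two sequences."""
    # Standard dynamic programming, keeping only the previous row.
    prev = [0] * (len(ys) + 1)
    for x in xs:
        curr = [0]
        for j, y in enumerate(ys, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def percent_of_a_in_b(a, b):
    """Percentage of the words of `a` found, in order, within `b`."""
    a_words = a.lower().split()
    b_words = b.lower().split()
    if not a_words:
        return 0.0
    # Dividing by len(a_words) means extra words in `b` cost nothing.
    return 100.0 * lcs_length(a_words, b_words) / len(a_words)


if __name__ == "__main__":
    a = "the quick brown fox"
    b = "well the very quick and brown dog"
    print(percent_of_a_in_b(a, b))  # 75.0
```

Because the score is normalized by the length of A only, additions in B are ignored, which is what the question asks for. The table-based approach takes time proportional to len(A) × len(B) in words, so very large texts may need something faster.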