Algorithm to calculate how much of text A is in text B?
I need to calculate how much of a block of text (a) is in another block of text (b). Simple algorithms such as Soundex aren't providing great results for me, because text b has additional text within it that isn't/shouldn't be in text a, which throws the figures off. I need to get the percentage of a that is within b, and ignore the additions in b.
My first thought was that a simple algorithm might work in this case: split a into sentences, note the total number of sentences, search b for an instance of each sentence, and provide a percentage. While this should work, it feels quite hacky, and I'm sure someone more intelligent has devised an algorithm that provides a better calculation on a similar principle.
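For reference, the sentence-matching idea could be sketched roughly like this in Python. The function names, the naive punctuation-based sentence splitter, and the lowercase/whitespace normalisation are all assumptions made for illustration; nothing in the question specifies them.

```python
import re

def split_sentences(text):
    # Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def normalize(s):
    # Lowercase and collapse whitespace so trivial differences do not matter.
    return " ".join(s.lower().split())

def sentence_coverage(a, b):
    # Percentage of a's sentences that appear verbatim (after normalising) in b.
    sentences = split_sentences(a)
    if not sentences:
        return 0.0
    haystack = normalize(b)
    found = sum(1 for s in sentences if normalize(s) in haystack)
    return 100.0 * found / len(sentences)
```

Extra text in b does not lower the score, but a single changed word inside a sentence makes that whole sentence count as missing, which is probably why the approach feels hacky.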
Longest common subsequence looks best suited to your purposes.
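A minimal sketch of how that might look, assuming the texts are compared word by word and the score is the LCS length divided by the number of words in a. Both choices, and the function names, are assumptions rather than anything stated in the question.

```python
def lcs_length(x, y):
    # Standard dynamic-programming LCS, keeping only the previous row.
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def percent_of_a_in_b(a, b):
    # Assumption: tokens are lowercase whitespace-separated words.
    words_a = a.lower().split()
    words_b = b.lower().split()
    if not words_a:
        return 0.0
    # Dividing by len(words_a) means additions in b never reduce the score.
    return 100.0 * lcs_length(words_a, words_b) / len(words_a)
```

With a = "the quick brown fox" and b = "well the very quick and lazy brown dog", the common subsequence is "the quick brown", so the result is 75.0. The running time grows with the product of the two word counts, so very large texts may need a different approach.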