algorithm - Finding all pairs of sequences that differ at exactly one position -


i need data structure representing set of sequences (all of same, known, length) following non-standard operation:

find 2 sequences in set differ @ 1 index. (or establish no such pair exists.)

if n length of sequences , m number of sequences, there obvious o(n*m*m) algorithm. wonder if there standard way of solving more efficiently. i'm willing apply pre-processing if needed.

bonus points if instead of returning pair, algorithm returns all sequences differ @ same point.

alternatively, interested in solution can check efficiently whether particular sequence differs @ 1 index sequence in set. if helps, can assume in set, no 2 sequences have property.

edit: can assume n reasonably small. this, mean improvements such o(log(n)*m*m) not useful use case.

for each sequence , each position in sequence, calculate hash of sequence without position , add hash table. if there entry in table, have found potential pair differs in 1 position. using rolling hashes both start , end , combining them, can calculate each hash in constant time. total running time expected o(n*m).


Comments

Popular posts from this blog

linux - Does gcc have any options to add version info in ELF binary file? -

android - send complex objects as post php java -

charts - What graph/dashboard product is facebook using in Dashboard: PUE & WUE -