c - Graph Paths Abstraction Algorithm Needed
I have a data structure holding a graph like the one in the following picture:
In this tree, a node can have any number of unique children from the levels below it. The tree in the picture represents a set of paths. Every path should begin with a node from level 1 and end with a node marked with "*". The paths of the tree in the picture are:
a c g, a c g j, a d g, a d g j, a d k, and so on...
Actually, the original tree is huge (around 2 million sequences), and the maximum number of nodes per level is 61 (out of 11 levels). This causes many memory consumption problems in my application (a computer vision application for Samsung).
My target is to have an iterative algorithm that represents these paths in a more compact string format. I think the problem can be divided into 3 steps, as follows. I have already built the tree data structure (step 2), but I still cannot derive an iterative algorithm that gets the output string/sequence of step 3 from the tree.
1- Input string:
(a c g) | (a c g j) | (a d g) | (a d g j) | (a d k) | ....
where "|" represents alternatives.
2- Building a tree data structure from these paths.
3- Required output string:
(a (c g [j]) | (d (g [j]) | k)) | (b ....)
where "|" represents alternatives and "[ ]" encloses options. The target output string should be optimized so that no more common factors can be taken out to simplify it further.
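Step 2 above can be sketched roughly as follows. This is only a minimal illustration, assuming single-character labels and a plain prefix tree in which repeated labels (such as g under c and g under d) are separate nodes. The names `PathNode`, `path_node_new`, `path_insert` and `path_free` are hypothetical and do not come from the original post, which does not show its data structure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHILDREN 61   /* the question's stated maximum of nodes per level */

/* Hypothetical node layout for the path tree. */
typedef struct PathNode {
    char label;                       /* '\0' for the artificial root */
    int is_end;                       /* a path ends here (the "*" mark) */
    int child_count;
    struct PathNode *children[MAX_CHILDREN];
} PathNode;

/* Allocates a zero-initialised node carrying the given label. */
PathNode *path_node_new(char label)
{
    PathNode *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->label = label;
    return n;
}

/* Returns the child of parent with this label, creating it if missing. */
static PathNode *child_for(PathNode *parent, char label)
{
    int i;
    for (i = 0; i < parent->child_count; i++)
        if (parent->children[i]->label == label)
            return parent->children[i];
    assert(parent->child_count < MAX_CHILDREN);
    parent->children[parent->child_count] = path_node_new(label);
    return parent->children[parent->child_count++];
}

/* Adds one path such as "a c g j" below root; spaces are skipped. */
void path_insert(PathNode *root, const char *path)
{
    PathNode *cur = root;
    for (; *path != '\0'; path++) {
        if (*path == ' ')
            continue;
        cur = child_for(cur, *path);
    }
    if (cur != root)
        cur->is_end = 1;              /* mark the last node of the path */
}

/* Releases a whole subtree; recursion depth equals the number of levels. */
void path_free(PathNode *n)
{
    int i;
    for (i = 0; i < n->child_count; i++)
        path_free(n->children[i]);
    free(n);
}
```

Inserting the five example paths under an artificial root gives one level-1 node a with children c and d, and marks both g nodes, j and k as path ends. Sharing the common prefixes this way is what later allows them to be factored out in the output string.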
You can use a modification of iterative DFS, which utilizes a stack to keep track of unprocessed nodes. This algorithm never stores more than 6 characters on the stack* for any one node, and there will always be fewer than N nodes on the stack (where N is the number of nodes in the graph). You've indicated that N will be at most 61*11=671, so there is a maximum of roughly 4000 elements possible on the stack.
In the pseudocode below, a "destination" node is a starred node in the example above, e.g. g*.
Initialization:
A dummy node φ is introduced with an edge from φ to each of the "root" nodes, e.g. nodes a and b above. The token for φ is assumed to be a non-printing character, or you can explicitly check for it before adding to the output string. The node φ is pushed onto the stack before calling the function.
    outstring := ""
    while stack not empty
        pop token
        if token is node
            outstring := outstring + node(token)   // line 5 - explanation below
            if node(token) has children
                if node(token) is destination
                    outstring := outstring + "["
                    push "]"
                end
                if node(token) has multiple children
                    for each child of node(token), from right to left
                        push ")"
                        push child
                        push "("
                        push "|"
                    end
                    pop   // remove last "|"
                else
                    push child
                end
            end
        else   // token is ()[]|
            outstring := outstring + token
        end
    end
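Since the question is tagged C, the pseudocode above might translate to something like the sketch below. It assumes single-character labels, a dummy root whose label is '\0' (so it prints nothing), and a caller-supplied output buffer. The `Node` and `Token` layouts, the function name `factor_paths`, and both size constants are assumptions made for illustration; none of them appear in the original answer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHILDREN 61     /* the question's stated maximum per level */
#define MAX_STACK    4096   /* assumed bound, in line with the estimate above */

/* Hypothetical node layout. */
typedef struct Node {
    char label;                        /* '\0' for the dummy root */
    int is_destination;                /* starred node: a path may end here */
    int child_count;
    struct Node *children[MAX_CHILDREN];
} Node;

/* A stack token is either a node or one of the characters ()[]| */
typedef struct Token {
    const Node *node;                  /* non-NULL for a node token */
    char punct;                        /* used only when node is NULL */
} Token;

static void push(Token *stack, size_t *top, const Node *node, char punct)
{
    assert(*top < MAX_STACK);
    stack[*top].node = node;
    stack[*top].punct = punct;
    (*top)++;
}

/* Appends c to out, keeping room for the terminator; '\0' prints nothing. */
static void emit(char *out, size_t cap, size_t *len, char c)
{
    if (c == '\0')
        return;
    assert(*len + 1 < cap);
    out[(*len)++] = c;
}

/* Writes the factored path expression below root into out; returns its length. */
size_t factor_paths(const Node *root, char *out, size_t cap)
{
    Token stack[MAX_STACK];
    size_t top = 0, len = 0;
    int i;

    assert(cap > 0);
    push(stack, &top, root, 0);
    while (top > 0) {
        Token t = stack[--top];
        if (t.node == NULL) {                 /* token is one of ()[]| */
            emit(out, cap, &len, t.punct);
            continue;
        }
        emit(out, cap, &len, t.node->label);  /* "line 5" of the pseudocode */
        if (t.node->child_count == 0)
            continue;
        if (t.node->is_destination) {         /* the continuation is optional */
            emit(out, cap, &len, '[');
            push(stack, &top, NULL, ']');
        }
        if (t.node->child_count > 1) {
            for (i = t.node->child_count - 1; i >= 0; i--) {
                push(stack, &top, NULL, ')');
                push(stack, &top, t.node->children[i], 0);
                push(stack, &top, NULL, '(');
                push(stack, &top, NULL, '|');
            }
            top--;                            /* remove the last "|" pushed */
        } else {
            push(stack, &top, t.node->children[0], 0);
        }
    }
    out[len] = '\0';
    return len;
}
```

Because the function only follows child pointers, it works the same whether repeated labels are separate nodes or a single shared node, as long as the structure has no cycles. No spaces are written; they could be added in `emit` if wanted.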
The output of this algorithm for the first part of the graph (a and its children) is as follows (with spaces added for clarity; spaces can easily be added in the code):
a (c g [j]) | (d (g [j]) | (k))
You'll notice a deviation between your result and mine: the final node k is enclosed in parentheses in my solution. If this is undesirable (it could result in ugliness like a[(b)|(c)]), you can eliminate it by performing an additional check when you pop a node token off of the stack, at the cost of some additional overhead. Replace line 5 above with:
    if (node(token) has no children
        and last character of outstring is "("
        and next token on stack is ")")
        remove trailing "(" from outstring
        concatenate token to outstring
        pop ")" from stack and ignore
    else
        outstring := outstring + node(token)   // as above
    end
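An alternative to the in-loop check, if touching the main loop is inconvenient, is a separate pass over the finished string. This is a different technique from the replacement for line 5 described above: it rewrites every "(x)" into "x" after the fact. It assumes single-character labels, no spaces in the string, and that no label is one of the characters ()[]|. The function name `strip_leaf_parens` is made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns nonzero if c is one of the grouping characters ()[]| */
static int is_group_char(char c)
{
    return c != '\0' && strchr("()[]|", c) != NULL;
}

/*
 * Rewrites every "(x)" into "x" in place, where x is a single label
 * character, and returns the new length of the string.
 */
size_t strip_leaf_parens(char *s)
{
    size_t r = 0;   /* read position */
    size_t w = 0;   /* write position, never ahead of r */

    assert(s != NULL);
    while (s[r] != '\0') {
        if (s[r] == '(' && s[r + 1] != '\0' &&
            !is_group_char(s[r + 1]) && s[r + 2] == ')') {
            s[w++] = s[r + 1];        /* keep the label, drop both parens */
            r += 3;
        } else {
            s[w++] = s[r++];
        }
    }
    s[w] = '\0';
    return w;
}
```

A label that is immediately followed by ")" in the output can only be a node without children, because a node with children always writes "[", "(" or a child label next. That is why matching the three-character pattern is enough here. Applied to a(cg[j])|(d(g[j])|(k)) it yields a(cg[j])|(d(g[j])|k), and a[(b)|(c)] becomes a[b|c].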
Let me know if you have any questions or if I've missed anything.
* This could happen in the (probably highly unlikely) case of a node being written as |[(a)]. Most nodes will take 4 or fewer characters in the stack.