C# - Parsing a formatted string with RegEx or similar
I have an application that sends a TCP message to a server and gets one back.
The message it gets back is in this format:
0,"120"1,"data field 1"2,"2401"3,"data field 3"1403-1,"multiple occurence 1"1403-2,"multiple occurence 2"99,""
So it is a set of fields concatenated together.
Each field has a tag, a comma, and a value, in that order.
The tag is a number, the value is in quotes, and the comma separates them.
0,"120"
Here 0 is the tag and 120 is the value.
A complete message always starts with a 0 field and ends with a 99,"" field.
To complicate things, some tags have dashes because they are split into more than one value.
The order of the numbers is not significant.
(For reference, this is a "FedEx Tagged Transaction" message.)
So I'm looking for a decent way of validating that I have a "complete" message (i.e. it has the 0 and 99 fields). Because it's a TCP message, I guess I have to account for not having received the full message yet.
Then I need to split it into the values I need.
The best I have come up with is parsing it with a poor regex and cleaning up afterwards. The heart of this is (\d?\d?\d?\d?-?\d?\d,") to split it:
    string r = @"(\d?\d?\d?\d?-?\d?\d,"")";
    string[] strArray = Regex.Split(receivedData, r);
    Assert.AreEqual(14, strArray.Length, "Array length should be 14, since we have 7 fields.");
    Dictionary<string, string> fields = new Dictionary<string, string>();
    // Now put it in a dictionary, which should be easier to work with than an array
    for (int i = 0; i <= strArray.Length - 2; i += 2)
    {
        fields.Add(strArray[i].Trim('"').Trim(','), strArray[i + 1].Trim('"'));
    }

Which doesn't work.
It has a lot of quotes and commas left over, and doesn't seem particularly well-formed...
I'm not good with regex, so I can't put together what I need it to do.
I don't even know if this is the best way.
Any help is appreciated.
I suggest you use Regex.Matches rather than Regex.Split. That way you can iterate over all the matches and use capture groups to grab the data you want directly, while still maintaining the structure. The regex in this example should work for your data:
    MatchCollection matchList = Regex.Matches(receivedData, @"(?<tag>\d+(?:-\d+)?),""(?<data>.*?)""");
    foreach (Match match in matchList)
    {
        string tag = match.Groups["tag"].Value;   // e.g. "0" or "1403-1"
        string data = match.Groups["data"].Value; // the text between the quotes
    }
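As for the "complete message" check and the dictionary, here is a rough sketch of how the same pattern could cover both. TaggedMessage and TryParse are names made up for illustration, not part of any FedEx library. It assumes that values never contain an embedded double quote and that receivedData has no trailing whitespace; if either assumption is wrong for your data, the pattern will need adjusting.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class TaggedMessage
{
    // Numeric tag with an optional "-n" suffix, a comma, then a quoted value.
    // Assumption: values never contain a double quote.
    private static readonly Regex FieldPattern =
        new Regex(@"(?<tag>\d+(?:-\d+)?),""(?<data>.*?)""", RegexOptions.Singleline);

    // Returns true only if the text starts with the 0 field and ends with 99,"".
    // On success, fields maps each tag (e.g. "1403-1") to its value.
    public static bool TryParse(string receivedData, out Dictionary<string, string> fields)
    {
        fields = new Dictionary<string, string>();
        MatchCollection matches = FieldPattern.Matches(receivedData);
        if (matches.Count == 0)
        {
            return false;
        }

        Match first = matches[0];
        Match last = matches[matches.Count - 1];

        bool startsWithZero = first.Index == 0 && first.Groups["tag"].Value == "0";
        bool endsWith99 = last.Groups["tag"].Value == "99"
            && last.Groups["data"].Value.Length == 0
            && last.Index + last.Length == receivedData.Length;

        if (!startsWithZero || !endsWith99)
        {
            return false; // probably a partial message: wait for more data
        }

        foreach (Match match in matches)
        {
            // The indexer overwrites a repeated tag instead of throwing like Add does.
            fields[match.Groups["tag"].Value] = match.Groups["data"].Value;
        }
        return true;
    }
}
```

Since TCP can deliver the message in pieces, one option is to keep appending what you receive to a buffer and call TryParse on the buffer after each read until it returns true.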