Need regex to extract fields from string

I need to extract the title, location, and price from a string like this:

10' starcraft pop camper (newport) $5500

It should be obvious which is which.

However, there are cases like this:

10' (approx.) starcraft pop camper (drigg's town, pa) $5500

When I use a simple regex, I can match the first string correctly, but not the second:

^(?<title>.+?) \((?<area>.+?)\) \$(?<price>[\d]+)$

I'm pretty sure lookaheads/backreferences can handle this, but I don't know how. Can someone help me out with an explanation? (And maybe a reference to an easy-to-read article on the subject.)

With only these 2 examples, the best I can suggest is to change the lazy quantifier to a greedy quantifier in the title capturing group:

^(?<title>.+) \((?<area>.+?)\) \$(?<price>[\d]+)$
          ^^
          here

Effectively, the pattern in the area capturing group will capture the text inside the last pair of brackets () (provided that it is followed by text that can be matched by the price capturing group).

The greedy quantifier in title consumes as much text as possible, and so forces the area capturing group to start at the furthest possible opening bracket.
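As a rough illustration of the difference, here is a minimal Python sketch that runs the lazy and greedy variants against the two example strings. Note the assumptions: Python's `re` module writes named groups as `(?P<name>...)` rather than `(?<name>...)`, the character class `[\d]` is written as the equivalent `\d`, and the names `LAZY`, `GREEDY`, `SAMPLES` and `extract` are made up for this example.

```python
import re

# Python's re module spells named groups (?P<name>...) instead of (?<name>...).
LAZY = re.compile(r"^(?P<title>.+?) \((?P<area>.+?)\) \$(?P<price>\d+)$")
GREEDY = re.compile(r"^(?P<title>.+) \((?P<area>.+?)\) \$(?P<price>\d+)$")

SAMPLES = [
    "10' starcraft pop camper (newport) $5500",
    "10' (approx.) starcraft pop camper (drigg's town, pa) $5500",
]


def extract(pattern, text):
    """Return the named groups as a dict, or None when there is no match."""
    match = pattern.match(text)
    return match.groupdict() if match else None


for sample in SAMPLES:
    print("lazy:  ", extract(LAZY, sample))
    print("greedy:", extract(GREEDY, sample))
```

With the lazy title, the second string still matches, but the title stops at `10'` and the area swallows everything from `approx.)` up to `pa`. With the greedy title, the area is taken from the last bracket pair before the price.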

Another way is to make sure the sub-pattern in the area capturing group cannot contain ():

^(?<title>.+) \((?<area>[^()]+)\) \$(?<price>[\d]+)$
          ^^            ^^^^^^
          here          here

I removed the lazy quantifier from the area group, since it is redundant. There is only 1 way to match the bracket () characters before and after the text captured by the area capturing group.
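A small Python sketch of this second pattern follows, under the same assumptions as before: Python's `(?P<name>...)` spelling for named groups, and `\d` in place of `[\d]`. The function name `parse_listing` and the conversion of the price to an integer are illustrative choices, not part of the original pattern.

```python
import re

# The area group uses a negated character class, so it can never cross a bracket.
AREA_NO_BRACKETS = re.compile(
    r"^(?P<title>.+) \((?P<area>[^()]+)\) \$(?P<price>\d+)$"
)


def parse_listing(text):
    """Split a listing into title, area and price, or return None if it does not fit."""
    match = AREA_NO_BRACKETS.match(text)
    if match is None:
        return None
    fields = match.groupdict()
    fields["price"] = int(fields["price"])  # convenience conversion for this sketch
    return fields


print(parse_listing("10' starcraft pop camper (newport) $5500"))
print(parse_listing("10' (approx.) starcraft pop camper (drigg's town, pa) $5500"))
```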

The 2 solutions above assume that the area never contains bracket () characters. The pattern is going to be more complicated if you want to allow that.
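To make that assumption concrete, here is a hedged Python sketch using a made-up listing whose area itself contains brackets. The input string and the names `GREEDY_TITLE`, `AREA_NO_BRACKETS` and `NESTED` are hypothetical, and the patterns again use Python's `(?P<name>...)` syntax.

```python
import re

GREEDY_TITLE = re.compile(r"^(?P<title>.+) \((?P<area>.+?)\) \$(?P<price>\d+)$")
AREA_NO_BRACKETS = re.compile(r"^(?P<title>.+) \((?P<area>[^()]+)\) \$(?P<price>\d+)$")

# Hypothetical listing: the intended area is "newport (north side)".
NESTED = "10' camper (newport (north side)) $5500"

greedy_match = GREEDY_TITLE.match(NESTED)
strict_match = AREA_NO_BRACKETS.match(NESTED)

# The greedy-title pattern matches, but splits the fields in the wrong place.
print(greedy_match.groupdict() if greedy_match else None)

# The negated-class pattern refuses to match at all.
print(strict_match)
```

In this case the first pattern reports a title of `10' camper (newport` and an area of `north side)`, while the second returns no match. Which failure mode is preferable depends on whether a silently wrong split or a rejected line is easier to deal with downstream.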
