regex - How to extract a zip code from a very messy string using a regular expression in SAS?


I am trying to extract the 5-digit zip code from an address field. I have included sample data (see below). The data has 5-digit street numbers at the beginning, a 5-digit PO box number in the middle, and 5-9 digit zip codes, some in the middle and some at the end of the string. My objective is to extract the 5-digit zip code from the string, but not the 5-digit street number or PO box number, using a regular expression in SAS. Please take a look at the sample data and help me resolve this issue. I would highly appreciate your kind assistance.

    13001 nw42 ave opa locka fl 33054 usa
    13001 nw 42 avenue opa locka fl 33054 usa
    po box 98748 chicago il 60693 usa
    601 w 80th street chicago il 60620 2502
    12651 s dixie hwy, suite 321 miami,florida33156
    12713 sw 125th ave miamifl 331865932

This works for your specific example.

    data have;
      length str $150;
      infile datalines truncover;
      input @1 str $150.;
      datalines;
    13001 nw42 ave opa locka fl 33054 usa
    13001 nw 42 avenue opa locka fl 33054 usa
    po box 98748 chicago il 60693 usa
    601 w 80th street chicago il 60620 2502
    12651 s dixie hwy, suite 321 miami,florida33156
    12713 sw 125th ave miamifl 331865932
    ;;;;
    run;

    data want;
      set have;
      /* 5 digits followed by end of string, "usa", or a 4-digit suffix */
      z_re = prxparse('/(\d{5}) ?(?:$|usa|\d{4})/o');
      rc_z = prxmatch(z_re, trimn(str));
      /* capture group 1 holds the 5-digit zip code */
      if rc_z then zip = prxposn(z_re, 1, str);
      put zip=;
    run;

You can either adjust this to include other things, or add reasonability checks on the possible places a 5(+) digit string might appear as a zip code. For example, you might require it to be within 10 characters of the end of the string, and at least 10 characters from the beginning of the string:

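    data want;
      set have;
      /* at least 10 characters, a non-digit, the 5-digit zip, */
      /* then at most 10 more characters before end of string  */
      z_re = prxparse('/^.{10,}\D(\d{5}).{0,10}$/o');
      rc_z = prxmatch(z_re, trimn(str));
      if rc_z then zip = prxposn(z_re, 1, str);
      put zip=;
    run;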

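I have to include \D (a non-digit) to make sure it matches 33186 instead of 65932 in the last record: without it, the greedy leading .{10,} pushes the capture as far right as possible. This rule may be better or may be worse depending on various other possibilities; depending on your data, it's possible that no single pattern will be enough to catch 100%. You might consider doing both methods and looking at the records where they disagree.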
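As a rough sketch of that last suggestion, the step below runs both patterns against each record and flags the rows where the results differ. It assumes the HAVE data set from the first example; the names compare, zip_a, zip_b, re_a, re_b and disagree are made up for illustration and are not part of the original answer.

```sas
/* Sketch only: compare the two extraction rules record by record. */
/* Assumes data set HAVE (variable STR) already exists.            */
data compare;
  set have;
  length zip_a zip_b $5;
  retain re_a re_b;

  /* Compile each pattern once, on the first iteration */
  if _n_ = 1 then do;
    re_a = prxparse('/(\d{5}) ?(?:$|usa|\d{4})/');
    re_b = prxparse('/^.{10,}\D(\d{5}).{0,10}$/');
  end;

  /* Rule A: 5 digits followed by end of string, "usa", or 4 more digits */
  if prxmatch(re_a, trimn(str)) then zip_a = prxposn(re_a, 1, str);

  /* Rule B: 5 digits preceded by a non-digit, near the end of the string */
  if prxmatch(re_b, trimn(str)) then zip_b = prxposn(re_b, 1, str);

  /* 1 when the rules differ, including when only one of them matched */
  disagree = (zip_a ne zip_b);

  /* Write the records that need a manual look to the log */
  if disagree then put 'Check: ' str= zip_a= zip_b=;

  drop re_a re_b;
run;
```

On the six sample records the two rules should return the same value, so the flag is mainly useful on larger and messier data, where the disagreeing rows show which rule needs adjusting.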

