string - Need to speed up this Python script - I think the StringIO is very slow


This is working, but slowly.

I have written a custom .py script to convert .gpx to .kml. It works as needed, but it is sooooo slow: for a small .gpx of 477k, writing a 207k .kml file takes 198 seconds to complete! That's absurd, and I haven't even got to a meaty .gpx size yet.

My hunch is that it's the StringIO.StringIO(x) that's slow. Any ideas how to speed it up?

Thanks in anticipation.

Here are the key snips only:

    f = open(filename, "r")
    x = f.read()
    # flags are passed by keyword: the 4th positional argument of re.sub is count
    x = re.sub(r'\n', '', x, flags=re.S)  # remove newline returns
    name = re.search('<name>(.*)</name>', x, re.S)
    print "attachment name (as recorded gps device): " + name.group(1)
    x = re.sub(r'<(.*)<trkseg>', '', x, flags=re.S)  # strip header
    x = x.replace("</trkseg></trk></gpx>", "")  # strip footer
    x = x.replace("<trkpt", "\n<trkpt")  # make the file into lines
    x = re.sub(r'<speed>(.*?)</speed>', '', x, flags=re.S)  # strip speed
    x = re.sub(r'<extensions>(.*?)</extensions>', '', x, flags=re.S)  # strip out extensions

Then

    # .kml header goes here
    kmltrack = """<?xml version="1.0" encoding="utf-8"?><kml xmlns="http://www.ope......etc etc

Then

    buf = StringIO.StringIO(x)
    for line in buf:
        if line is not None:
            timm = re.search('time>(.*?)</time', line, re.S)
            if timm is not None:
                kmltrack += ("          <when>" + timm.group(1) + "</when>\n")
                checksuma += 1

    buf = StringIO.StringIO(x)
    for line in buf:
        if line is not None:
            lat = re.search('lat="(.*?)" lo', line, re.S)
            lon = re.search('lon="(.*?)"><ele>', line, re.S)
            ele = re.search('<ele>(.*?)</ele>', line, re.S)
            if lat is not None:
                kmltrack += ("          <gx:coord>" + lon.group(1) + " " + lat.group(1) + " " + ele.group(1) + "</gx:coord>\n")
                checksumb += 1

    if checksuma == checksumb:
        # put footer on
        kmltrack += """     </gx:Track></Placemark></Document></kml>"""
    else:
        print ("checksum error")
        return None

    with open("outfile.kml", "a") as myfile:
        myfile.write(kmltrack)
    return ("successful .kml file-write completed in :" + str(c.seconds) + " seconds.")
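For what it's worth, the two loops above could probably be merged into one pass without StringIO. The following is only a minimal sketch, assuming Python 3 and that x has already been split into one <trkpt> per line as in the snips above. The names build_track_body, TIME_RE, LAT_RE, LON_RE and ELE_RE are made up for illustration; the patterns are copied from the snips. Unlike the original, it skips a point unless lat, lon and ele are all found.

```python
import re

# Hypothetical names; the patterns themselves come from the snips above.
TIME_RE = re.compile(r'time>(.*?)</time')
LAT_RE = re.compile(r'lat="(.*?)" lo')
LON_RE = re.compile(r'lon="(.*?)"><ele>')
ELE_RE = re.compile(r'<ele>(.*?)</ele>')


def build_track_body(x):
    """Return the <when> lines followed by the <gx:coord> lines, or None
    if the two counts differ (the same check as checksuma == checksumb)."""
    whens = []
    coords = []
    for line in x.splitlines():
        timm = TIME_RE.search(line)
        if timm is not None:
            whens.append("          <when>" + timm.group(1) + "</when>\n")
        lat = LAT_RE.search(line)
        lon = LON_RE.search(line)
        ele = ELE_RE.search(line)
        # Assumption: a point is only usable if all three values are present.
        if lat is not None and lon is not None and ele is not None:
            coords.append("          <gx:coord>" + lon.group(1) + " "
                          + lat.group(1) + " " + ele.group(1) + "</gx:coord>\n")
    if len(whens) != len(coords):
        return None
    # Joining lists once avoids building the output with repeated +=.
    return "".join(whens) + "".join(coords)
```

The returned string would be appended to kmltrack between the header and the footer. This tidies the loops but is unlikely to be where the 198 seconds go.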

Once again, it works as required, but very slowly. If anyone can see how to speed it up, please let me know! Cheers.


Updated

Thanks for the suggestions, all. I'm new to Python and appreciated hearing about profiling. I found out about it and added it to my script. It looks like it comes down to one thing: 208 of the total 209 seconds of run time happen in one line. Here's a snip:

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    ....
      4052    0.013    0.000    0.021    0.000 StringIO.py:139(readline)
      8104    0.004    0.000    0.004    0.000 StringIO.py:38(_complain_ifclosed)
         2    0.000    0.000    0.000    0.000 StringIO.py:54(__init__)
         2    0.000    0.000    0.000    0.000 StringIO.py:65(__iter__)
      4052    0.010    0.000    0.033    0.000 StringIO.py:68(next)
      8101    0.018    0.000    0.078    0.000 re.py:139(search)
         4    0.000    0.000  208.656   52.164 re.py:144(sub)
      8105    0.016    0.000    0.025    0.000 re.py:226(_compile)
        35    0.000    0.000    0.000    0.000 rpc.py:149(debug)
         5    0.000    0.000    0.010    0.002 rpc.py:208(remotecall)
    ......

There are 4 calls at 52 seconds per call. cProfile says it happens on line number 144, but my script only goes to 94 lines. How do I move on from this? Thanks very much.
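The line number in the profile belongs to re.py (the file named in the filename:lineno column), not to the script, which is why it is past line 94. One way to see which function in the script made the slow call is to ask pstats for the callers of re.sub. This is a rough sketch, assuming Python 3; strip_speed and profile_callers are made-up names standing in for the real script.

```python
import cProfile
import io
import pstats
import re


def strip_speed(text):
    # Hypothetical stand-in for one step of the conversion script.
    return re.sub(r'<speed>(.*?)</speed>', '', text, flags=re.S)


def profile_callers(func, *args):
    """Profile func(*args) and return pstats' report of who called re.sub."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative")
    # The restriction is a regex matched against "filename:lineno(function)".
    stats.print_callers(r"\(sub\)")
    return out.getvalue()


report = profile_callers(strip_speed, "<trkpt><speed>1</speed></trkpt>")
print(report)
```

The report names the calling function but not the individual line, so when several re.sub calls sit in the same function they still have to be timed or commented out one at a time.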

OK all. cProfile showed it was a re.sub call, though I wasn't sure which one. Through trial and error, though, it didn't take long to isolate it. The solution was to change the re.sub from a 'greedy' call to a 'non-greedy' one.

So the old header strip call

    x = re.sub(r'<(.*)<trkseg>', '', x, flags=re.S)  # strip header

becomes

    x = re.sub(r'<?xml(.*?)<trkseg>', '', x, flags=re.S)  # strip header

which is very fast.
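A plausible explanation for the difference: the old pattern can start a match at every "<" in the file, and after the header is gone each of those attempts runs .* to the end of the text and backtracks all the way without finding another <trkseg>, so the work grows roughly with the square of the file size. The new pattern can only start where "xml" appears. Note that in a regex the unescaped "?" makes the "<" optional, so the pattern as posted starts matching at "xml" and leaves a stray "<?" behind; escaping it as "<\?xml" removes the whole header. The sketch below, assuming Python 3 and made-up sample data (make_sample and timed_strip are hypothetical helpers), compares the three variants.

```python
import re
import time


def make_sample(points):
    """Build a synthetic, newline-free GPX-like string (made-up data)."""
    header = '<?xml version="1.0"?><gpx><trk><name>demo</name><trkseg>'
    body = '<trkpt lat="1.0" lon="2.0"><ele>3.0</ele></trkpt>' * points
    return header + body + '</trkseg></trk></gpx>'


SLOW = re.compile(r'<(.*)<trkseg>', re.S)            # original pattern
AS_POSTED = re.compile(r'<?xml(.*?)<trkseg>', re.S)  # fix as posted ("<" optional)
ESCAPED = re.compile(r'<\?xml(.*?)<trkseg>', re.S)   # literal "<?xml"


def timed_strip(pattern, text):
    """Return (text with the header removed, elapsed seconds)."""
    start = time.perf_counter()
    result = pattern.sub('', text)
    return result, time.perf_counter() - start


sample = make_sample(300)
slow_result, slow_secs = timed_strip(SLOW, sample)
posted_result, posted_secs = timed_strip(AS_POSTED, sample)
escaped_result, escaped_secs = timed_strip(ESCAPED, sample)
print("slow:", slow_secs, "as posted:", posted_secs, "escaped:", escaped_secs)
```

The timings will vary by machine, so treat the printed numbers as indicative only. Raising the argument to make_sample should make the gap between the first pattern and the other two grow quickly.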

It now finishes even heavy .gpx conversions in 0 seconds. What a difference that makes!

