c# - Which solution is faster for extracting content in a web crawler?

I have made a web crawler using ASP.NET. It works well. The problem comes when I want to extract content from the crawled pages. The content is wrapped in HTML tags. I have a few possible solutions for extracting it, but I don't know which one is better. It should perform well and be easy to implement.

Using regex with many patterns to extract the content.
Using LINQ to XML to extract the content.
Using XPath to extract the content.

Can somebody please help me choose the better solution? I think I should go with XPath, but I'm not sure whether its performance is better than regex or LINQ to XML.

Thanks for any ideas.

None of those solutions is particularly good.

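HTML is not a regular language, and as such it is not a good fit for regular expressions. See the standard response to parsing HTML with regex.
HTML is not valid XML, so XML tools such as LINQ to XML will reject many real pages.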
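
The second point can be checked with a minimal sketch. It assumes markup with an unclosed `<br>`, which is common on real pages; the names `XmlCheck` and `IsWellFormedXml` are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class XmlCheck
{
    // Hypothetical helper: returns true only if the markup parses as well-formed XML.
    public static bool IsWellFormedXml(string markup)
    {
        try
        {
            XDocument.Parse(markup);
            return true;
        }
        catch (XmlException)
        {
            // LINQ to XML refuses anything that is not well-formed.
            return false;
        }
    }

    public static void Main()
    {
        // Valid HTML, but <br> is never closed, so the XML parser rejects it.
        Console.WriteLine(IsWellFormedXml("<p>Hello<br>world</p>"));
        // The self-closed form happens to be well-formed XML.
        Console.WriteLine(IsWellFormedXml("<p>Hello<br/>world</p>"));
    }
}
```

A crawler cannot control how the pages it fetches are written, so the first case is the one to plan for.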

Instead, you should use an HTML parsing library such as the HTML Agility Pack.
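
As a rough sketch of what that could look like: the HTML Agility Pack accepts XPath queries, so the approach the question leans toward still applies. The example assumes the `HtmlAgilityPack` NuGet package is installed; the `ContentExtractor` class, the `ExtractText` method, the sample markup, and the `post` class name are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

public static class ContentExtractor
{
    // Hypothetical helper: returns the text of every element matched by the XPath query.
    public static List<string> ExtractText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerant of markup that is not well-formed XML

        var results = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return results;
        }

        foreach (var node in nodes)
        {
            // DeEntitize converts entities such as &amp; back into plain characters.
            results.Add(HtmlEntity.DeEntitize(node.InnerText).Trim());
        }
        return results;
    }

    public static void Main()
    {
        // Sample markup with an unclosed <img>, which an XML parser would reject.
        const string html =
            "<div class='post'><p>First paragraph <img src='a.png'></p>" +
            "<p>Second &amp; last</p></div>";

        foreach (var text in ExtractText(html, "//div[@class='post']/p"))
        {
            Console.WriteLine(text);
        }
    }
}
```

The null check matters in a crawler, where many pages will not contain the elements the query looks for.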
