Open-source HTML parsers?


I use libxml2 to parse HTML. Overall I'm satisfied, but I'd like something faster.
I looked at some open-source search engines (Xapian, DataparkSearch); they have their own parsers. I'm not yet ready to dig into their sources and adapt them to my needs, although I'm close to it.
Does anyone know of other open-source parsers, lighter and faster than libxml2? Neither Google nor Yandex helped. Maybe I'm asking the wrong query.

7 Answers

0 like 0 dislike
Why not use regular expressions if you only need to pull pieces out of a page? Getting the title: /<title>(\w+)<\/title>/gi; collecting links: something like /<a[^>]*href="([^>"]*)"[^>]*>(\w+)<\/a>/gi (though this regex won't work if the link text itself contains tags). Sit down and rack your brain over them for a while, and it will probably work.
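As a rough illustration of the regex approach, here is a minimal Python sketch using the standard `re` module. The sample HTML string and the helper names `extract_title` and `extract_links` are assumptions made for this example, and the patterns are slightly loosened (`[^<]*` instead of `\w+`) so that titles and link texts may contain spaces. The link pattern still skips anchors whose text contains nested tags.

```python
import re

# Patterns loosely based on the ones suggested in this answer.
TITLE_RE = re.compile(r"<title>([^<]*)</title>", re.IGNORECASE)
LINK_RE = re.compile(r'<a[^>]*href="([^>"]*)"[^>]*>([^<]*)</a>', re.IGNORECASE)


def extract_title(html):
    """Return the text of the first <title> element, or None if absent."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def extract_links(html):
    """Return (href, text) pairs for anchors whose text has no nested tags."""
    return [(href, text.strip()) for href, text in LINK_RE.findall(html)]


# Hypothetical sample page, used only for illustration.
SAMPLE = (
    "<html><head><title>Example page</title></head><body>"
    '<a href="/one">First</a> '
    '<a class="x" href="/two">Second link</a> '
    '<a href="/three"><b>Bold</b></a>'
    "</body></html>"
)

if __name__ == "__main__":
    print(extract_title(SAMPLE))
    print(extract_links(SAMPLE))
```

The third anchor in the sample is deliberately not matched, which shows the nested-tag limitation mentioned above.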
by
0 like 0 dislike
Collecting all links of a particular form from the page: isn't that done with a single regular expression?
by
0 like 0 dislike
You're unlikely to find anything faster than a parser you write yourself, tailored to one specific purpose.
Do you have some very specific and difficult task that requires libxml? Maybe I'm just clumsy, but every time I tried to parse complex XML I came to the conclusion that doing it by hand is both faster and more reliable :)
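To make the "write it yourself" idea concrete, here is a minimal Python sketch of a single-pass scanner that only pulls `href` values out of `<a>` tags and ignores everything else. The names `scan_hrefs` and `_attr_value` and the sample input are assumptions for illustration. It handles only quoted attribute values, does not cope with `>` inside attribute values, and is not a general HTML parser. Python is used only to keep the sketch short; the same loop carries over to C.

```python
def _attr_value(tag, name):
    """Return the quoted value of attribute `name` inside a tag body, or None."""
    n = len(name)
    i = 0
    while i < len(tag):
        # The attribute name must follow whitespace; compare case-insensitively.
        if tag[i].isspace() and tag[i + 1:i + 1 + n].lower() == name:
            j = i + 1 + n
            while j < len(tag) and tag[j].isspace():
                j += 1
            if j < len(tag) and tag[j] == "=":
                j += 1
                while j < len(tag) and tag[j].isspace():
                    j += 1
                if j < len(tag) and tag[j] in "\"'":
                    close = tag.find(tag[j], j + 1)
                    if close != -1:
                        return tag[j + 1:close]
        i += 1
    return None


def scan_hrefs(html):
    """Collect href values from <a ...> tags in one left-to-right pass."""
    hrefs = []
    pos = 0
    while True:
        start = html.find("<", pos)
        if start == -1:
            break
        end = html.find(">", start)
        if end == -1:
            break  # unterminated tag: stop scanning
        pos = end + 1
        tag = html[start + 1:end]
        # Keep only anchors with attributes: "a" followed by whitespace,
        # which skips <abbr>, <article>, closing tags and bare <a>.
        if len(tag) < 2 or tag[0] not in "aA" or not tag[1].isspace():
            continue
        value = _attr_value(tag, "href")
        if value is not None:
            hrefs.append(value)
    return hrefs


# Hypothetical sample input, used only for illustration.
SAMPLE = (
    '<p>Intro</p><a href="/one">First</a>'
    "<A class=\"x\" HREF='/two'><b>Bold</b></A>"
    '<abbr title="t">x</abbr><a name="n">no href</a>'
)

if __name__ == "__main__":
    print(scan_hrefs(SAMPLE))
```

The trade-off is the one described above: the scanner does far less work than a full DOM builder, but only because it ignores everything outside its one narrow task.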
by
0 like 0 dislike
Maybe you'd be interested in simplehtmldom.sourceforge.net
by
0 like 0 dislike
phpQuery has great functionality but isn't very fast. It's better to convert the HTML into XML and process it with XSLT. I think the speed will fully satisfy you.
by
0 like 0 dislike
You could look into Mechanize.
by
0 like 0 dislike
I'm also interested in parsers. These might fit: Grab, Scrapy, or PHP HTML DOM Parser.
by
