DOM versus Regex in web scraping

In web scraping there are two common methods for extracting data, and the question is: which one is best?

The correct answer is: it depends.

The first is to use a DOM (Document Object Model) parser and the second is regex matching (regex is short for regular expression). Both have advantages and disadvantages.

DOM Parser

Advantages	Disadvantages
Simple to code	Uses more memory
	Sensitive to malformed HTML
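
To make the trade-offs in this table concrete, here is a minimal sketch in Python. It assumes the strict XML parser from the standard library (xml.dom.minidom) as a stand-in for a DOM parser; dedicated HTML parsers are usually more forgiving, so treat the failure on broken markup as an illustration of the general risk rather than the behavior of every library. The sample markup, the "price" class and the function name are made up for this example.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical sample markup, invented for this illustration.
GOOD_HTML = '<ul><li class="price">10</li><li class="price">25</li></ul>'
BAD_HTML = '<ul><li class="price">10<li class="price">25</ul>'  # <li> tags never closed


def extract_prices_dom(markup):
    """Build a DOM tree and return the text of every <li class="price">."""
    document = minidom.parseString(markup)  # the whole tree is held in memory
    prices = []
    for node in document.getElementsByTagName("li"):
        if node.getAttribute("class") == "price" and node.firstChild is not None:
            prices.append(node.firstChild.data)
    return prices


# Well-formed markup: the extraction is only a few lines of tree navigation.
dom_prices = extract_prices_dom(GOOD_HTML)

# Malformed markup: this strict parser refuses to build a tree at all.
try:
    extract_prices_dom(BAD_HTML)
    bad_html_parsed = True
except ExpatError:
    bad_html_parsed = False
```

The extraction itself is short and readable, which is the "simple to code" advantage, but the parser has to build the complete tree first and gives up when the markup is broken.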

Regex

Advantages	Disadvantages
Insensitive to malformed HTML	Uses more CPU
	More difficult to code
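
A regex version of the same kind of extraction might look like the sketch below. It uses Python's built-in re module; the sample markup, the "price" class and the pattern are assumptions made up for this example, and the pattern is tuned to that exact markup rather than being a general-purpose HTML matcher.

```python
import re

# Hypothetical sample markup, invented for this illustration.
GOOD_HTML = '<ul><li class="price">10</li><li class="price">25</li></ul>'
BAD_HTML = '<ul><li class="price">10<li class="price">25</ul>'  # <li> tags never closed

# Match the opening tag, then capture everything up to the next "<".
PRICE_PATTERN = re.compile(r'<li\s+class="price"\s*>\s*([^<]+)')


def extract_prices_regex(markup):
    """Return the captured text after every <li class="price"> opening tag."""
    return [match.strip() for match in PRICE_PATTERN.findall(markup)]


# The pattern never looks at closing tags, so broken markup gives the same result.
regex_prices_good = extract_prices_regex(GOOD_HTML)
regex_prices_bad = extract_prices_regex(BAD_HTML)
```

The pattern does not care whether the tags are balanced, so it survives bad HTML, but it is harder to write and read, and it silently stops matching if the site changes the attribute order or adds a second attribute.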

TheWebMiner Blog
