In web scraping there are two common methods for filtering data, and the question is: which one is best?
The correct answer is: it depends.
The first is to use a DOM (Document Object Model) parser and the second is regex matching (regex is short for regular expressions). Both have advantages and disadvantages.
|Method|Advantages|Disadvantages|
|---|---|---|
|DOM parser|Simple to code|Uses more memory; sensitive to bad HTML|
|Regex|Insensitive to bad HTML|Uses more CPU; more difficult to code|
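
To make the trade-off concrete, here is a minimal sketch in Python that extracts link targets both ways. It rests on a few assumptions that are not in the text above: the sample markup, the function names `links_with_dom` and `links_with_regex`, and the use of the standard library's strict `xml.dom.minidom` as a stand-in for a DOM parser. Real scrapers often use more forgiving HTML parsers, so treat the failure on bad markup as an illustration of the "sensitive to bad HTML" point rather than a rule for every DOM library.

```python
import re
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical sample pages; the second one has an unclosed <a> tag.
GOOD_HTML = "<html><body><a href='/a'>A</a><a href='/b'>B</a></body></html>"
BAD_HTML = "<html><body><a href='/a'>A</a><a href='/b'>B</body></html>"


def links_with_dom(markup):
    """Build the whole document tree in memory, then query it."""
    doc = minidom.parseString(markup)
    return [a.getAttribute("href") for a in doc.getElementsByTagName("a")]


# Matches the href value of an <a> tag without building any tree.
HREF_RE = re.compile(r"""<a\s[^>]*href=["']([^"']*)["']""", re.IGNORECASE)


def links_with_regex(markup):
    """Scan the raw text for href attributes; tag nesting is ignored."""
    return HREF_RE.findall(markup)


if __name__ == "__main__":
    for name, markup in (("good", GOOD_HTML), ("bad", BAD_HTML)):
        try:
            dom_result = links_with_dom(markup)
        except ExpatError as exc:
            # The strict parser refuses malformed markup.
            dom_result = f"parse error: {exc}"
        print(name, "DOM:", dom_result)
        print(name, "regex:", links_with_regex(markup))
```

The DOM version is the shorter one to write and read, but it fails on the malformed page, while the regex version returns the same links for both pages at the cost of a pattern that is harder to get right.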