I need to scrape the number 622104 from this HTML.
How do I extract the digits?
    <div class="numberBackground">
        <div id="ctl00_mainContent_playersOnlineNumberRepeater_ctl00_numberPanel" class="number">
            <div class="numberWrapper"><span>6</span></div>
        </div>
        <div id="ctl00_mainContent_playersOnlineNumberRepeater_ctl01_numberPanel" class="number">
            <div class="numberWrapper"><span>2</span></div>
        </div>
        <div id="ctl00_mainContent_playersOnlineNumberRepeater_ctl02_numberPanel" class="number">
            <div class="numberWrapper"><span>2</span></div>
        </div>
        <div id="ctl00_mainContent_playersOnlineNumberRepeater_ctl03_commaPanel" class="comma"></div>
        <div id="ctl00_mainContent_playersOnlineNumberRepeater_ctl04_numberPanel" class="number">
            <div class="numberWrapper"><span>1</span></div>
        </div>
        <div id="ctl00_mainContent_playersOnlineNumberRepeater_ctl05_numberPanel" class="number">
            <div class="numberWrapper"><span>0</span></div>
        </div>
        <div id="ctl00_mainContent_playersOnlineNumberRepeater_ctl06_numberPanel" class="number">
            <div class="numberWrapper"><span>4</span></div>
        </div>
    </div>
Using the DOMDocument class to parse the HTML string, thanks to its loadHTML method, you can then use an XPath query (using the DOMXPath class) to search for the <div> tags with the class="numberWrapper" attribute. Then, iterate over them, concatenating their content to a variable which, at the end of the loop, will contain your number. For example, you could have this kind of code:
    $str = <<<HTML
    ... here's your HTML ...
    HTML;

    $number = '';
    $dom = new DOMDocument();
    if ($dom->loadHTML($str)) {
        $xpath = new DOMXPath($dom);
        // Select every <div> whose class attribute is exactly "numberWrapper"
        $results = $xpath->query('//div[@class="numberWrapper"]');
        foreach ($results as $div) {
            // Append the text of each matched <div> (one digit each)
            $number .= $div->nodeValue;
        }
    }
    var_dump($number);

And, as output, you get:

    string '622104' (length=6)

You can also use the following XPath query, to make sure that you only work with <span> tags:

    $results = $xpath->query('//div[@class="numberWrapper"]/span');

As the <div>s only contain <span>s, the result will be the same, but it could change in other situations.
And, definitely (just to make sure, as you asked): regular expressions are not the right way to extract information from an HTML string.

Edit after the comment: if there are other <div>s that you do not want to take into account, you will need to adapt the XPath query, since it is what selects the elements to extract.
For example, you might do something like this:
    $results = $xpath->query('//div[@class="numbersBackground"]/div[@class="numberWrapper"]/span');

Of course, it is up to you to find out what matches the structure of your HTML document.
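As a rough illustration of adapting the query to a document's structure, here is a minimal sketch that assumes the nesting shown in the question (numberBackground, then number, then numberWrapper, then span). The extractNumber() helper and the extra stray <div> at the end of the sample are made up for the example; the ids are omitted for brevity.

```php
<?php
// Sample markup modeled on the question; the last <div> is a decoy
// added here to show that the stricter path ignores it.
$html = <<<HTML
<div class="numberBackground">
<div class="number"><div class="numberWrapper"><span>6</span></div></div>
<div class="number"><div class="numberWrapper"><span>2</span></div></div>
<div class="number"><div class="numberWrapper"><span>2</span></div></div>
<div class="comma"></div>
<div class="number"><div class="numberWrapper"><span>1</span></div></div>
<div class="number"><div class="numberWrapper"><span>0</span></div></div>
<div class="number"><div class="numberWrapper"><span>4</span></div></div>
</div>
<div class="numberWrapper"><span>9</span></div>
HTML;

// Hypothetical helper: returns the concatenated digits, or '' on failure.
function extractNumber(string $html): string
{
    $dom = new DOMDocument();
    if (!$dom->loadHTML($html)) {
        return '';
    }
    $xpath = new DOMXPath($dom);
    // Spell out each level so only spans inside the counter are matched.
    $spans = $xpath->query(
        '//div[@class="numberBackground"]/div[@class="number"]'
        . '/div[@class="numberWrapper"]/span'
    );
    $number = '';
    foreach ($spans as $span) {
        $number .= trim($span->nodeValue);
    }
    return $number;
}

echo extractNumber($html), PHP_EOL; // 622104
```

The comma panel contains no numberWrapper, so it is skipped without any special handling, and the decoy 9 is ignored because it does not sit under numberBackground.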
If you want to download the HTML, you have two solutions:
- If allow_url_fopen is enabled on your server, you should be able to pass the URL as a parameter to DOMDocument::loadHTMLFile.
- Otherwise, you will have to download the HTML yourself, for example with curl, and pass its content to DOMDocument::loadHTML.
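The two options could be combined roughly as follows. This is only a sketch: it assumes the server setting in question is allow_url_fopen, that the URL-accepting method is DOMDocument::loadHTMLFile, and that the curl extension is the fallback downloader. loadDocument() is a hypothetical helper name, and the URL in the usage comment is a placeholder.

```php
<?php
// Hypothetical helper: returns a parsed document, or null if nothing could be loaded.
function loadDocument(string $url): ?DOMDocument
{
    $dom = new DOMDocument();

    // Option 1: URL wrappers are enabled, so the parser can read the URL itself.
    if (ini_get('allow_url_fopen')) {
        return @$dom->loadHTMLFile($url) ? $dom : null;
    }

    // Option 2: fetch the markup with curl, then parse the resulting string.
    if (function_exists('curl_init')) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
        $html = curl_exec($ch);
        if (is_string($html) && $html !== '' && @$dom->loadHTML($html)) {
            return $dom;
        }
    }

    return null;
}

// Usage (placeholder URL):
// $dom = loadDocument('https://example.com/players-online');
```

The @ operator is used here only to keep the sketch short; it hides parser warnings rather than handling them.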
As a sidenote, if you get warnings because your HTML is not valid, you might want to take a look at the libxml_use_internal_errors function ;-)
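One way to keep the parser quiet on invalid markup, assuming the function meant here is libxml_use_internal_errors, is to switch libxml to internal error handling around the loadHTML call. The parseLenient() helper below is a made-up name for illustration, and the sample markup is deliberately sloppy.

```php
<?php
// Hypothetical helper: parses markup without emitting PHP warnings and
// returns [document or null, number of problems libxml reported].
function parseLenient(string $html): array
{
    // Collect parse errors internally instead of raising warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $ok = $dom->loadHTML($html);

    // Read the collected errors, then empty the buffer.
    $errors = libxml_get_errors();
    libxml_clear_errors();

    // Restore whatever error mode was active before.
    libxml_use_internal_errors($previous);

    return [$ok ? $dom : null, count($errors)];
}

// Unknown tag and an unclosed <span>: the parser recovers from both.
[$dom, $errorCount] = parseLenient('<div class="numberWrapper"><foo><span>6</div>');
```

How many errors are reported for a given input depends on the libxml2 version, so the count is best treated as diagnostic information only.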