I am trying to parse some data out of some HTML using the HTML Agility Pack. The data is laid out as multiple table rows:
    <tr>
      <td><a href="showindex.cfm"><span class="style1">Companies</span></a></td>
      <td><p>71</p></td>
    </tr>
    <tr>
      <td><font><b><a href="showindex.cfm">Political status</a></b></font></td>
      <td><p>76</p></td>
    </tr>
    <tr>
      <td><p title="This is political stability data; Score: 0.01; sene:"><a href="showdatatable.cfm">Political stability denge</a></p></td>
      <td>7</td>
    </tr>
    <tr>
      <td><p title="This index combines policies; Score: -0.34; sen:"><a href="showdatatable.cfm">Local government support</a></p></td>
      <td>8</td>
    </tr>
    <tr>
      <td><p title="This legal status combines data; Score: 3.59; Set:"><a href="showdatatable.cfm">Legal status</a></p></td>
      <td>9</td>
    </tr>

I have created a sequence of the outer "TD" tags.
I am interested in three things:
1 - The title attribute of the "P" tag, if there is a "P" tag
2 - The inner text of the "A" tag
3 - The inner text of the last "TD" tag
and I want them combined into tuples like
("This is political stability data; Score: 0.01; sene:", "Companies", "71")
I tuple up every two "TD" tags (my method is probably quite poor, sorry about that), then I try to extract the data from each pair. My code has this form:
    tdSeq : seq<HtmlNode>

    tdSeq
    |> Seq.pairwise
    |> Seq.mapi (fun i item -> (i, item))
    |> Seq.filter (fun (no, _) -> no % 2 = 0)
    |> List.ofSeq
    |> List.map (fun (no, item) -> item)
    |> List.map (fun (a, b) ->
        let data = a.InnerText
        ...
        if p.Attributes.["title"] <> null then p.Attributes.["title"].Value ...
        ...)

I only ever get one title value, repeated in every (title, data, value) tuple. Any idea why the title of the "P" tag comes back like this?
I'm not familiar with F#, but here is the C# equivalent:
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(html); // assumption: html is a string holding the table markup

    foreach (HtmlNode tr in doc.DocumentNode.SelectNodes("//tr"))
    {
        // Title attribute of the first <p> inside this row, if any.
        string title = null;
        HtmlNode titleNode = tr.SelectSingleNode(".//p");
        if (titleNode != null)
        {
            title = titleNode.GetAttributeValue("title", null);
        }

        // Inner text of the first <a> inside this row, if any.
        string anchor = null;
        HtmlNode anchorNode = tr.SelectSingleNode(".//a");
        if (anchorNode != null)
        {
            anchor = anchorNode.InnerText;
        }

        // Inner text of the last <td> directly under this row.
        string value = null;
        HtmlNode valueNode = tr.SelectSingleNode("td[last()]");
        if (valueNode != null)
        {
            value = valueNode.InnerText.Trim();
        }

        Console.WriteLine("title=" + title);
        Console.WriteLine("anchor=" + anchor);
        Console.WriteLine("value=" + value);
    }

The main problem you had is that you used the "//p" expression, which starts from the document root instead of the current node, so every lookup returned the first "P" tag in the whole document. Use ".//p" to search below the current node.
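Since the question was asked in F#, here is a rough F# sketch of the same idea, in case it helps. It is only a sketch under a few assumptions: it runs as a script with `dotnet fsi`, the sample markup is an abridged copy of two rows from the question, `tryRead` and `extract` are hypothetical helper names of mine, and I use ".//p[@title]" rather than ".//p" so that rows whose only "P" tag has no title give `None`.

```fsharp
#r "nuget: HtmlAgilityPack"

open HtmlAgilityPack

// Two rows from the question (title text abridged), wrapped in a <table>.
let html = """
<table>
  <tr>
    <td><a href="showindex.cfm"><span class="style1">Companies</span></a></td>
    <td><p>71</p></td>
  </tr>
  <tr>
    <td><p title="This is political stability data; Score: 0.01;"><a href="showdatatable.cfm">Political stability</a></p></td>
    <td>7</td>
  </tr>
</table>"""

let doc = HtmlDocument()
doc.LoadHtml(html)

// SelectNodes returns null (not an empty collection) when nothing matches.
let rows : seq<HtmlNode> =
    match doc.DocumentNode.SelectNodes("//tr") with
    | null -> Seq.empty
    | nodes -> nodes :> seq<HtmlNode>

// Hypothetical helper: run an XPath relative to `row` and read a string
// from the first match, or return None when there is no match.
let tryRead (xpath: string) (read: HtmlNode -> string) (row: HtmlNode) =
    row.SelectSingleNode(xpath) |> Option.ofObj |> Option.map read

// Build the (title, anchor text, value) tuple for one row.
let extract (tr: HtmlNode) =
    // The leading "." keeps each search inside this row;
    // "//p" would search from the document root instead.
    let title = tr |> tryRead ".//p[@title]" (fun p -> p.GetAttributeValue("title", ""))
    let anchor = tr |> tryRead ".//a" (fun a -> a.InnerText)
    let value = tr |> tryRead "td[last()]" (fun td -> td.InnerText.Trim())
    title, anchor, value

rows |> Seq.map extract |> Seq.iter (printfn "%A")
```

Working row by row like this also avoids the pairwise/filter step on the "TD" sequence, because the two cells of a row are already grouped under their "TR".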