How can I extract the html table from kml_description?



I am trying to extract the attributes of a point layer in a KMZ file. They are stored in an HTML table within the kml_description attribute.

I have tried the HTMLToXHTMLConverter, but the features always fail. I am currently using FME 2013 SP3. FME Data Inspector is able to read the values and lists them under "Attributes".

I followed the instructions from:

https://knowledge.safe.com/articles/19918/how-to-expose-feature-attributes-from-kml-tag.html

Here is an example of the HTML that appears between <description> and </description>:

<html lang="en">
<head><meta charset="utf-8"/><style> * { margin: 0; } html, body { height: 100%; margin: 0; padding: 0; } h2 { text-align: center; background: #e5eCf9; } #summaryData { font-family: 'Helvetica Neue', Arial, Helvetica, sans-serif; font-size: 12px; width: 350px; } #summaryData th{ padding-right: 20px; width: 40%; } .fire_title { background: #ffffff; margin-bottom: 15px; } .box { text-align: left; padding: 1.5em; padding-top: 1em; margin-bottom: 1.5em; background: #e5eCf9; width: 100%; } .spacer { height: 15px; } </style></head>
<body>
<div id="summaryData"><h2>Friday, June 3, 2016</h2>
<table>
<tr><th>Type</th><td>WF</td></tr></table><h2>Totals</h2>
<table>
<tr><th>Area Burned</th><td>28.22 acres</td></tr><tr/>
<tr><th>CO2</th><td>17.91 tons</td></tr>
<tr><th>CO</th><td>2.44 tons</td></tr>
<tr><th>PM10</th><td>0.23 tons</td></tr>
<tr><th>VOC</th><td>0.57 tons</td></tr>
<tr><th>SO2</th><td>0.01 tons</td></tr>
<tr><th>NOX</th><td>0.01 tons</td></tr>
<tr><th>NH3</th><td>0.04 tons</td></tr>
<tr><th>CH4</th><td>0.12 tons</td></tr>
<tr><th>PM25</th><td>0.2 tons</td></tr>
</table></div></body></html>

Best answer by takashi 2 June 2016, 05:04

Hi @colin_forsyth, I was able to convert your sample HTML document to an XHTML document with the HTMLToXHTMLConverter transformer. However, its structure differs from the example in the article you linked, so you will have to define your own XQuery expression, e.g.:

declare default element namespace "http://www.w3.org/1999/xhtml";
for $x in /html/body/div/table
for $y in $x/tr
return fme:set-attribute($y/th/text(), $y/td/text())

If the first table (Type: WF) is not necessary:

declare default element namespace "http://www.w3.org/1999/xhtml";
for $x in /html/body/div/table[2]/tr
return fme:set-attribute($x/th/text(), $x/td/text())
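
For anyone who wants to check the th/td pairing outside FME, here is a minimal sketch in Python using only the standard library. It is not part of the FME workflow above; it just mirrors the idea of the XQuery, taking each table row and using the th text as the attribute name and the td text as the value. The names ThTdPairs and extract_pairs are made up for this illustration, and the sample is a shortened copy of the HTML from the question.

```python
from html.parser import HTMLParser


class ThTdPairs(HTMLParser):
    """Collect (header, value) pairs from <tr><th>..</th><td>..</td></tr> rows.

    Hypothetical helper for illustration only; not an FME API.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []    # finished (th, td) pairs, in document order
        self._cell = None  # "th" or "td" while the parser is inside such a cell
        self._row = {}     # text collected so far for the current row

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {}
        elif tag in ("th", "td"):
            self._cell = tag
            self._row[tag] = ""

    def handle_data(self, data):
        # Text outside th/td cells (for example the h2 headings) is ignored.
        if self._cell is not None:
            self._row[self._cell] += data

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._cell = None
        elif tag == "tr" and "th" in self._row and "td" in self._row:
            # Rows without both cells, such as the empty <tr/>, are skipped.
            self.pairs.append((self._row["th"].strip(), self._row["td"].strip()))
            self._row = {}


def extract_pairs(html_text):
    """Return the list of (th text, td text) pairs found in html_text."""
    parser = ThTdPairs()
    parser.feed(html_text)
    parser.close()
    return parser.pairs


# Shortened copy of the sample from the question.
sample = """
<div id="summaryData"><h2>Friday, June 3, 2016</h2>
<table>
<tr><th>Type</th><td>WF</td></tr></table><h2>Totals</h2>
<table>
<tr><th>Area Burned</th><td>28.22 acres</td></tr><tr/>
<tr><th>CO2</th><td>17.91 tons</td></tr>
</table></div>
"""

attributes = dict(extract_pairs(sample))
print(attributes)
# {'Type': 'WF', 'Area Burned': '28.22 acres', 'CO2': '17.91 tons'}
```

Note that building a dict keeps only the last value when a header appears more than once, so keep the list of pairs if repeated headers matter.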

Thanks Takashi. It is working for me now with the XQuery.

I don't know if anyone wants to take a crack at another KML file that is difficult to parse, but I'm at a loss:

 

<?xml version="1.0" encoding="UTF-8"?><kml xmlns="http://www.opengis.net/kml/2.2"><Document><description>Area: Left: 885936.156160; Right: 926105.126601; Bottom: 15171486.649022; Top: 15190745.601094</description>
<Region><LatLonAltBox><north>41.7898664686</north>
<west>-89.7675684189</west>
<east>-89.6182244316</east>
<south>41.7405034708</south></LatLonAltBox></Region>
<name>spatialNET View Around Query Boundary: RF</name><Folder><description>Fiber Entities</description>
<visibility>1</visibility>
<open>0</open>
<name>Fiber Network</name><Folder><description>All Fiber Splice Cases in the target area.</description>
<visibility>1</visibility>
<Snippet></Snippet>
<open>0</open>
<name>Fiber Splice Cases</name><Placemark id="{SPLICE_CASE,10199669}"><description><![CDATA[
<img src="google_header.png"/>
<table>
<tr><td><h1>Splice Case: RFAVEA-F-DS08</h1></tr>
<tr><td><table>
<tr><td><h3>Attributes:</h3></td></tr>
<tr><td><table>
<tr><th>CLLI Code</th><td>None</td></tr><tr><th>Nodal Location:</th><td>920787.625332,15181617.4465</td></tr><tr><th>Entity status</th><td>Proposed<br/>Modified<br/>New<br/>Design Change</td></tr><tr><th>Account Code</th><td>None</td></tr><tr><th>Street Address</th><td>None</td></tr><tr><th>Billing Address</th><td>None</td></tr><tr><th>Number of Cables Spliced</th><td>0</td></tr><tr><th>Site Code</th><td>None</td></tr><tr><th>Designation</th><td>RFAVEA-F-DS08</td></tr><tr><th>Symbol Scale</th><td>None</td></tr><tr><th>Alternate Name</th><td>None</td></tr><tr><th>Construction Status</th><td>None</td></tr><tr><th>Location</th><td>None</td></tr><tr><th>Contact</th><td>None</td></tr><tr><th>Owner</th><td>None</td></tr><tr><th>Fiber Design Profile</th><td>None</td></tr><tr><th>Site Type</th><td>FOSC 450-B (24)</td></tr><tr><th>Type Description</th><td>TFD - Distribution Splice Case - B (24)</td></tr><tr><th>State</th><td>None</td></tr><tr><th>Town</th><td>None</td></tr><tr><th>ZIP Code</th><td>None</td></tr><tr><th>Nodal Rotation:</th><td>0.154072566076</td></tr><tr><th>Service Status Code</th><td>I</td></tr><tr><th>Service Status Date</th><td>None</td></tr><tr><th>Owning Drawing</th><td>None</td></tr><tr><th>ID codes for owner</th><td></td></tr><tr><th>Media Type</th><td>F</td></tr><tr><th>Incoming Cables</th><td><a href="#{FIBER_CABLE_UNCON,10200655};balloonFlyto">24-Armor SMode Loose: RFAVEA-F-DF08</a></td></tr><tr><th>Outgoing Cables</th><td></td></tr><tr><th>Passthrough Cables</th><td></td></tr><tr><th>Equipment Attribute 1</th><td>None</td></tr><tr><th>Equipment Attribute 2</th><td>None</td></tr><tr><th>Size of equipment</th><td>None</td></tr><tr><th>Equipment Type</th><td>None</td></tr><tr><th>Installation Date</th><td>None</td></tr><tr><th>Plant Owner</th><td>None</td></tr><tr><th>Calculated Latitude and Longitude</th><td> 41.767841/-89.638834</td></tr><tr><th>Noun (class descriptor)</th><td>Fiber Splice Case</td></tr><tr><th>Format (entity 
descriptor)</th><td>Splice Case: RFAVEA-F-DS08</td></tr><tr><th>Operational State</th><td>0</td></tr><tr><th>Operational State</th><td>In Service</td></tr><tr><th>Service Status</th><td>New</td></tr><tr><th>Workflow State</th><td>0</td></tr><tr><th>Workflow State</th><td>Real World</td></tr>

</table></td></tr>
<tr><td><h3>Documents:</h3></td></tr>
<tr><td><table>
</table></td></tr>
</table></td></tr>
</table>
<img src="google_footer.png" />
]]></description>

Hi @mwilliamson, welcome to the FME Forum. I'd highly encourage you to post this as a separate new question. That way you can give more detail on what information you'd like to parse out of the KML in your scenario, and a new question will likely get more attention.
