Solved

bug in shape-export (tech preview)?

  • 15 November 2019
  • 6 replies
  • 4 views

Userlevel 3
Badge +18

I noticed some strange behaviour while exporting some simple data to shapefile: when using the "tech preview" format (in a writer or FeatureWriter), my attribute names and values are trimmed to one character...

I've read this format is still under active development, but I couldn't find this listed as a known issue. Is this a bug, or am I doing something wrong?

 

expected output (using the 'classic' shapefilewriters):

 

tech preview output:

 

using FME(R) 2019.1.3.0 (20191007 - Build 19642 - WIN64)

example workbench:

Best answer by mark2atsafe 15 November 2019, 19:58

Userlevel 4
Badge +26

Thanks for the heads up. Just taking a look now.

Userlevel 4
Badge +26

Bizarrely, the new Shapefile format reads it back OK. It's the old Shapefile format that can't read it back. But somehow one of the formats has a definite issue. I'll keep looking.

Userlevel 4
Badge +26

So, like I mentioned, it's the old Shapefile reader that can't read back data from the new Shapefile writer. My preliminary finding is that the issue is one of encoding. A message in the log tells me:

Worker 844 > DBF Reader: No encoding specified, assuming System. If attribute names are incorrect or attributes cannot be read, check reader's Encoding parameter

It's a little odd that the new reader will read it back without problem, but I guess the new writer is writing data and being a little more strict on encoding. The old reader can't handle that strictness.

So in this case I think there are two options:

  1. Make sure you are using the new reader to read the data back
  2. When you are using the new writer, be sure to set the encoding parameter to a specific value

If you do either (or both) of those, then I don't think you'll have a problem.

All the same, I've got a query in with our developers to find out exactly what is going on, and why.
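For anyone who wants to check a log for that warning without scrolling through it, here is a minimal sketch. It assumes the warning wording matches the message quoted above (it may differ between FME builds), and the log file name is hypothetical.

```python
from pathlib import Path

# Substring taken from the warning quoted in this thread. Matching on it is an
# assumption about how stable the wording is between FME builds.
WARNING_TEXT = "No encoding specified"


def find_encoding_warnings(log_text: str) -> list[str]:
    """Return the log lines that contain the encoding warning text."""
    return [line.strip() for line in log_text.splitlines() if WARNING_TEXT in line]


if __name__ == "__main__":
    log_path = Path("example_workspace.log")  # hypothetical log file name
    if log_path.exists():
        for hit in find_encoding_warnings(log_path.read_text(errors="replace")):
            print(hit)
```

If this prints anything for a workspace that reads a Shapefile, the reader's Encoding parameter is the first thing to look at.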

Userlevel 3
Badge +18

Thanks for investigating, @mark2atsafe. I'll try your suggestions after the weekend, as I don't have my FME laptop with me anymore.

This was just a small sample from my workbench. Other FeatureWriters with the same settings delivered normal results; only some of them trimmed the attributes. I'll check next week.

thanks again!

Userlevel 4
Badge +26

You're welcome. I chatted with the developer and he says that the data is UTF16. But basically it's as I mentioned above. If you set the encoding explicitly on the writer, it will help a lot. You can choose UTF8 to be internationally compatible, or System if your data is only going to be used locally. And check the log file for warnings, which is where this sort of information would appear. Hope this helps.

Userlevel 3
Badge +18

Hi @mark2atsafe, this helps for sure. I never had to worry about encoding before, so I'm not fully aware of the limitations and consequences of ignoring this setting. And it still feels strange that the 'old' shp output is OK, while the 'tech preview' makes it necessary to set the encoding manually.

But indeed, setting the encoding to UTF8 or 'fme-system' solves the issue, while with the parameter set to UTF-16LE (the same encoding the data is read in) the values are trimmed.
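One plausible explanation for the one-character trimming, not confirmed anywhere in this thread: in UTF-16LE every plain ASCII character is followed by a zero byte, and a reader that treats the text as a single-byte, null-terminated string stops at that first zero. The sketch below only illustrates that effect in Python; the attribute name and the `read_c_string` helper are made up for the example and are not part of FME.

```python
def read_c_string(raw: bytes, encoding: str = "cp1252") -> str:
    """Decode bytes the way a single-byte, null-terminated reader would:
    cut at the first 0x00 byte, then decode whatever is left."""
    return raw.split(b"\x00", 1)[0].decode(encoding)


name = "PARCEL_ID"  # hypothetical attribute name

as_utf8 = name.encode("utf-8")
as_utf16 = name.encode("utf-16-le")

print(as_utf16[:6])             # b'P\x00A\x00R\x00'
print(read_c_string(as_utf8))   # PARCEL_ID
print(read_c_string(as_utf16))  # P
```

That would match what I see: UTF8 and system encodings survive, UTF-16LE ends up as a single character.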
