Reading from ZIP files - extract only file specified to use less disk space in TEMP folder

Related products: FME Form

Reading files from inside ZIP files is a great feature of FME. Behind the scenes, though, FME extracts the files from the ZIP file to the FME TEMP folder and reads them from there. That's fine for small files, but when the extracted files are large, they can easily fill up the disk that holds your FME TEMP folder.

The problem gets worse if you run Workspaces concurrently, because each Workspace extracts the ENTIRE contents of the ZIP file, even if you only want to read ONE file out of it. I spent most of today trying to figure out why a process was failing, and eventually discovered that running 3 Workspaces concurrently, each reading a single file from a ZIP file that takes up 10 GB when fully extracted, used up all my FME TEMP space.
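As a rough way to spot this situation before it bites, the sketch below adds up the uncompressed sizes recorded in a ZIP file and compares them with the free space in a temp folder. It uses only Python's standard library and is not part of FME. The function names `uncompressed_size` and `fits_in_temp` are made up for illustration, and it assumes the worst case described above, where every concurrent job extracts the whole archive.

```python
import shutil
import tempfile
import zipfile


def uncompressed_size(zip_path):
    """Return the total size in bytes of all members once extracted."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def fits_in_temp(zip_path, concurrent_jobs=1, temp_dir=None):
    """Check whether `concurrent_jobs` full extractions fit in `temp_dir`.

    Assumption: each job extracts the entire archive. If no folder is
    given, the system temp folder is used, which may differ from the
    folder FME is configured to use.
    """
    temp_dir = temp_dir or tempfile.gettempdir()
    needed = uncompressed_size(zip_path) * concurrent_jobs
    free = shutil.disk_usage(temp_dir).free
    return needed <= free
```

For the case above you would call something like `fits_in_temp(path_to_zip, concurrent_jobs=3, temp_dir=path_to_fme_temp)`, where both paths are whatever applies on your machine.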

I have a couple of suggestions:

1) The FME log is not very helpful in this situation. It simply says that it failed to open the file for reading and tells you to check that the file exists and that you have privileges. But it refers to the original copy of the file inside the ZIP file, not the extracted version in the FME TEMP folder. It would be more helpful if it could alert you to the lack of disk space and/or point you to the location in the FME TEMP folder.

2) I'm using the Path Reader to get the list of files in the ZIP file, then sending one file name to a child Workspace to process. It would therefore be more efficient for the Workspace to extract only the file it needs from the ZIP file, rather than the whole lot.
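To illustrate what suggestion 2 is asking for, here is a minimal Python sketch that pulls a single named member out of a ZIP file and leaves everything else compressed. It relies only on the standard `zipfile` module and says nothing about how FME handles ZIP files internally. The helper name `extract_one` and its parameters are hypothetical.

```python
import zipfile


def extract_one(zip_path, member_name, dest_dir):
    """Extract only `member_name` from the archive and return its path.

    `member_name` must match the name stored in the archive, including
    any folder prefix, for example "data/roads.shp".
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Only the requested member is written to disk.
        return archive.extract(member_name, path=dest_dir)
```

Something along these lines could also serve as a workaround: extract the one file to a scratch folder first, then point the child Workspace at the extracted copy instead of at the ZIP file.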
