Feature Caching (or Run with Full Inspection in FME 2017 and earlier) allows you to store the intermediate results of your translation and inspect them. This helps with developing and debugging your workspaces. But it consumes disk space and resources when writing the cache. Make sure Feature Caching is off when running production workspaces. Feature Caching is automatically turned off when running a workspace on FME Server.
In FME 2015 and up you can watch the dynamic feature counts. This often gives an indication of which transformers are blocking your data. You can also Run with Full Inspection, which shows the feature counts and caches the data at each link in the workspace. This is useful for debugging workspaces but should not be used when testing for performance issues, since there is an overhead to creating the cache files.
Before being able to tune a workspace it's vital to understand how to read an FME log file. Without this knowledge a user will often jump to incorrect conclusions about a translation, and start looking for performance issues in the wrong places.
When trying to speed up a translation it is always beneficial to check the log file to determine exactly where the time is being spent.
Tip #1: Check that Tools > FME Options > Translation > Log timestamp information is checked to ensure that the timings are turned on in your log. They are always on in the log file saved to disk, but this option also displays them in the log window.
The first thing to note in the log is that the time reported in seconds is FME Processing time - it may not be the same as the overall length of the process. For example here the elapsed time shows the process took over 6 minutes, but the Total field reports FME used only 25 seconds.
Elapsed Time | Total| Incremental CPU (secs)
2006-07-10 14:43:06| 8.5| 0.0|
2006-07-10 14:43:13| 8.8| 0.3|
2006-07-10 14:46:29| 18.0| 9.1|
2006-07-10 14:49:29| 25.8| 7.9|
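The gap between wall-clock time and FME processing time can be read straight off those lines. The following is a minimal sketch, assuming the log keeps the `timestamp| total| incremental|` layout shown above; the function names are illustrative and not part of FME. Note that the excerpt starts at a Total of 8.5, so the CPU figure computed here covers only the window shown.

```python
from datetime import datetime

# The four timing lines from the log excerpt above.
LOG_LINES = [
    "2006-07-10 14:43:06| 8.5| 0.0|",
    "2006-07-10 14:43:13| 8.8| 0.3|",
    "2006-07-10 14:46:29| 18.0| 9.1|",
    "2006-07-10 14:49:29| 25.8| 7.9|",
]

def parse_timing(line):
    """Split one log line into (timestamp, total CPU secs, incremental CPU secs)."""
    stamp, total, incremental = [part.strip() for part in line.split("|")[:3]]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), float(total), float(incremental)

def summarize(lines):
    """Return (wall-clock seconds, CPU seconds) between the first and last line."""
    rows = [parse_timing(line) for line in lines]
    elapsed = (rows[-1][0] - rows[0][0]).total_seconds()
    cpu = rows[-1][1] - rows[0][1]
    return elapsed, cpu

elapsed, cpu = summarize(LOG_LINES)
print(f"wall clock: {elapsed:.0f} s, FME CPU: {cpu:.1f} s, spent elsewhere: {elapsed - cpu:.1f} s")
```

A large "spent elsewhere" figure points at something outside FME, such as a database or the network, rather than at the workspace itself.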
Tip #2: See How FME logs processing time for more information on why logged FME processing time is not the whole story, and why it matters.
Because FME works by pushing features through the workspace on an individual basis (not a group - see How FME processes features and how it affects the translation process for more info) it is not possible to give exact timings for every individual transformer. Therefore a lot of time doesn't get logged separately but is lumped together under the next function that does support timing.
One of the first items of importance in the log file is the temporary directory. You'll see this reported as something like this (timings removed for clarity)...
INFORM|FME Configuration: Temporary directory is `C:\DOCUME~1\xxxx\LOCALS~1\Temp'
You'll also see a line commenting on the amount of disk space available in that directory...
INFORM|System Status: 37700MB of disk space available in the FME temporary directory
When FME runs a large translation it often requires a lot of temporary disk space. This is particularly true when using multiple writers or a fanout. So the amount of available disk space is important. But on a performance issue we're more concerned about the speed of all this disk activity.
Tip #3: Where possible set your temporary directory to point to the fastest disk you have available. See Setting the Temporary Directory for how to use FME_TEMP to set a different temporary directory.
Tip #3a: If possible use an SSD (solid state drive) for your temporary directory. These are very fast and their use will have a big impact on FME file caching performance.
Tip #3b: Where possible don't set your temporary directory on the same disk that the operating system uses; FME might be slowed down by the operating system writing to the same disk at the same time.
Tip #3c: Where possible set the temporary directory to a disk that has a large amount of free space - it won't improve the speed but it may prevent a large translation from failing due to a lack of disk space.
Tip #3d: As a last resort try that old standby - the RAM drive (also called a RAM disk)! Pointing the FME temporary directory to a RAM drive, or placing a copy of the source data on a RAM drive, would have the same benefits as using an SSD.
Workspace completion time can be halved (4 hours down to 2 in some cases) using a RAM drive, even on an SSD machine. In particular, it really helps when there are files that are read multiple times during a translation. In the case of MicroStation Geographics, the Access file is read hundreds of times, which becomes a real bottleneck. Putting that input file on a RAM drive effectively caches the input data in RAM before the translation even starts. This can be useful for a wide range of formats that have high disk I/O, such as XML/GML or file-based databases (Esri file geodatabase). RAM drives are available on both Windows and Linux.
The next part of the log file relates to reading data. Remember we said above that FME works by pushing features through the workspace on an individual basis? Well it starts processing each feature as soon as it is read from the source data. It doesn't read all features then start processing. Therefore it's similarly difficult to look at a log file and try to calculate the time spent reading the data because workspace processing time will be lumped in with it. See this example. 'Emptying factory pipeline' in this example marks the point at which we've finished reading data.
2006-02-03 11:37:47| 342.7| 0.5|INFORM|Emptying factory pipeline
Here it took 342.7 seconds (about 6 minutes) to read the source data. But, as you now know, this includes time spent processing the features within the workspace. When all the transformers were removed from this workspace we got...
2006-02-03 14:44:43| 66.5| 0.3|INFORM|Emptying factory pipeline
Wow! Only one minute instead of six. This tells us that about 80% of the time was spent processing the data and only about 20% reading it. In this case the user thought the reading was the bottleneck, but this shows it was the transformation. The user should therefore check the workspace to make sure it is as efficient as possible and that there are no unnecessary transformers.
Tip #4: If you're worried about the reading performance of a workspace disconnect the readers from the transformers in Workbench and run the translation again. Then compare the log files. It may be that a lot of the time you assume was spent reading data is actually used by the workspace transformers and this will show where to concentrate your performance efforts.
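The comparison in Tip #4 is simple arithmetic on the two 'Emptying factory pipeline' lines. A minimal sketch, assuming the pipe-delimited layout shown in the excerpts above (the helper names are illustrative):

```python
def pipeline_emptied_at(log_lines):
    """Return the Total CPU seconds logged on the 'Emptying factory pipeline' line."""
    for line in log_lines:
        if "Emptying factory pipeline" in line:
            return float(line.split("|")[1])
    raise ValueError("'Emptying factory pipeline' not found in log")

# The two runs from the example above: full workspace vs. readers only.
with_transformers = ["2006-02-03 11:37:47| 342.7| 0.5|INFORM|Emptying factory pipeline"]
readers_only = ["2006-02-03 14:44:43| 66.5| 0.3|INFORM|Emptying factory pipeline"]

read_secs = pipeline_emptied_at(readers_only)
total_secs = pipeline_emptied_at(with_transformers)
read_share = read_secs / total_secs

print(f"reading: {read_share:.0%}, transforming: {1 - read_share:.0%}")
```

For these two logs the split comes out close to the 20/80 figure quoted above.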
Databases are an important component of many datasets and the log file will help us determine both how good our database performance is, plus how well FME is interacting with the database. The link on tip 2 above provides a good example relating to a prefetch query carried out on an Oracle database…
2004-05-14 17:18:52| 476.1| 0.0|INFORM|Started SQL cache prefetch
2004-05-14 17:25:10| 476.2| 0.1|INFORM|Finished SQL cache
Note the difference in actual time on the left; you can see that the time between when we issue the SQL prefetch and when it’s done is roughly six minutes. However FME logs only 0.1 seconds of CPU time. From this we can say that the remaining time was spent by Oracle retrieving the data using the query it was given. To get that time down the user would need to look at how the Oracle database is structured and how the query is written. Perhaps the field being searched on isn’t indexed? Maybe the query supplied isn’t as efficient as it could be?
Tip #5a: Check the log carefully to find out how much database-related time is spent outside of FME and see if you need to improve your database efficiency.
Speaking of indexes (indices?) - whether or not a table is indexed can have a great effect on performance when writing data to it.
Writing to an unindexed table is quick because the database has no overhead work to do.
Writing data to a table that is indexed takes a lot longer because - for each row committed - the database has to index the data immediately.
As above, the reported CPU time doesn't change because it is the database server - and not FME - that is doing the indexing work.
Tip #5b: Where possible, drop indexes before doing a bulk load into a table, then recreate them after the load is complete. It is often quicker than leaving the index in place during the data load.
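The drop, load, recreate pattern from Tip #5b looks like this in outline. This is only a sketch under stated assumptions: it uses Python's built-in SQLite as a stand-in for whatever database you are loading, and the table and index names (`roads`, `idx_roads_name`) are made up for illustration. In a real FME workflow the equivalent statements would typically be issued by the database itself or by SQL run before and after the translation.

```python
import sqlite3

def bulk_load(conn, rows):
    """Drop the index, load all rows, then rebuild the index once at the end."""
    cur = conn.cursor()
    cur.execute("DROP INDEX IF EXISTS idx_roads_name")           # no per-row index upkeep
    cur.executemany("INSERT INTO roads (id, name) VALUES (?, ?)", rows)
    cur.execute("CREATE INDEX idx_roads_name ON roads (name)")   # index built in one pass
    conn.commit()

# Hypothetical table with an index already in place before the load.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roads (id INTEGER, name TEXT)")
conn.execute("CREATE INDEX idx_roads_name ON roads (name)")

bulk_load(conn, [(i, f"road_{i}") for i in range(10000)])

row_count = conn.execute("SELECT COUNT(*) FROM roads").fetchone()[0]
index_names = [row[1] for row in conn.execute("PRAGMA index_list('roads')")]
print(row_count, index_names)
```

Whether this is actually faster depends on the database and the data volume, so it is worth timing both approaches on a representative load.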
Related to this is the difference between truncating a table and dropping it. FME has settings to do either, but when you truncate a table the index remains and subsequent data loading is slower. When you drop a table first, a side effect is that all indexes are also dropped, hence data can be written faster because no indexing is taking place.
Tip #5c: Consider using the option to Drop a table, rather than Truncate, in order to get better performance from a bulk data load.
In the example illustrating tip 5a, you can see a prefetch for a cache. A cache is used by the Joiner transformer. The Joiner matches records to graphic features. When FME reads a matched database record it will hold it in a cache. For subsequent features this cache is checked for a match before FME checks the database. The advantage is that database records that are matched by multiple features do not cause FME to do a database read each time, because the information is already held in memory (cached). This makes the join quicker and results in less network traffic. Here is a good example...
@Relate: Database query statistics for table `JOINER:MY_TABLE': 7 queries made of which 0 were sequential duplicates and 1 hit the record cache of 3 records (14% overall cache hit)
Firstly this doesn't explicitly state how many records were matched, but we can make a good guess that it was four. Three features matched records, all with differing IDs, and FME added these records to the cache. The fourth matching feature didn't have to query the database because it hit one of the records held in cache. That's where the 14% comes in, by the way. There were seven queries and one of these matched a cached record (1/7 = 14%). So FME has automatically reduced network traffic on this query by 14%. The sequential duplicates part, by the way, indicates how many features had identical key IDs. For example...
@Relate: Database query statistics for table `JOINER:MY_TABLE': 7 queries made of which 3 were sequential duplicates and 2 hit the record cache of 2 records (71% overall cache hit)
Here there were two hits on the cache, but also three duplicate features. Duplicate features don't need a database query, provided they are sequential, so 2 (cache hits) + 3 (duplicates) = 5 and 5/7 = 71%.
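The percentage arithmetic above can be checked against any such log line. A minimal sketch, assuming the statistics line keeps the wording shown in the two examples; the regular expression and function name are illustrative, not an FME API:

```python
import re

# Pulls the three counts out of a "Database query statistics" log line.
STATS = re.compile(
    r"(\d+) queries made of which (\d+) were sequential duplicates "
    r"and (\d+) hit the record cache"
)

def cache_hit_percent(log_line):
    """Percentage of queries answered without a database read."""
    match = STATS.search(log_line)
    if match is None:
        raise ValueError("not a query statistics line")
    queries, duplicates, cache_hits = map(int, match.groups())
    return round(100 * (duplicates + cache_hits) / queries)

first = ("@Relate: Database query statistics for table `JOINER:MY_TABLE': "
         "7 queries made of which 0 were sequential duplicates and 1 hit "
         "the record cache of 3 records (14% overall cache hit)")
second = ("@Relate: Database query statistics for table `JOINER:MY_TABLE': "
          "7 queries made of which 3 were sequential duplicates and 2 hit "
          "the record cache of 2 records (71% overall cache hit)")

print(cache_hit_percent(first), cache_hit_percent(second))
```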
So caching affects performance, but what can a user do to help? Well there are two settings that can be applied within the Joiner.
The first setting is cache size. Usually only a subset of records is cached. The cache size setting specifies how many records this subset will hold. Once the cache is full, new records can only be added by dropping existing ones. Therefore the larger the number, the more records will be held in memory and the fewer database reads will occur.
Obviously the size of the setting will depend on how many records you have, how often they will be matched by an individual record and how much memory your system contains. At a certain point it will be more efficient to read the database regardless, if the cache is holding so many records your system runs out of memory.
Tip #5d: With Joiner transformers set a cache size that is appropriate for the size of your dataset and the number of matches that are likely to be made in that cache.
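To get a feel for how cache size changes the number of database reads, the behaviour can be simulated. This is a rough sketch only: the article does not say which records the Joiner drops when the cache is full, so least-recently-used eviction is an assumption here, and the key sequence is invented for illustration.

```python
from collections import OrderedDict

def count_database_reads(keys, cache_size):
    """Count lookups that miss a fixed-size cache (LRU eviction is an assumption)."""
    cache = OrderedDict()
    reads = 0
    for key in keys:
        if key in cache:
            cache.move_to_end(key)        # cache hit: no database read needed
            continue
        reads += 1                         # cache miss: record fetched from the database
        cache[key] = True
        if len(cache) > cache_size:
            cache.popitem(last=False)      # cache full: drop the least recently used record
    return reads

# Hypothetical join keys carried by ten incoming features.
keys = [1, 2, 3, 1, 2, 3, 4, 1, 2, 3]

for size in (2, 3, 4):
    print(f"cache size {size}: {count_database_reads(keys, size)} database reads")
```

With this key sequence, a cache that is slightly too small gives almost no benefit, while one large enough to hold every distinct key reduces the reads to one per key.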
A second cache-related setting is the prefetch. Instead of filling the cache with records as they are matched, the cache can be preloaded (i.e. filled with a specific set of data before matching takes place) by the user issuing a prefetch query. This prefetch query can select an entire table, or the part of a table that is most likely to be matched by the feature attributes.
For example, a number of FME features of type 'roads' require a database match. The database table (myrecords) has a field (record_type) with a number of values; roads, highways, avenues, streets. The FME features will only ever be matched to where record_type=roads so the overall join process would be much more efficient if the following prefetch was issued...
select * from myrecords where record_type = 'roads'
Tip #5e: Where Joiner transformers will match only on a known subset of records within a table it will be more efficient to prefetch that subset of records before matching takes place.
NB: It doesn't matter if a required record is not in the prefetch - FME will just go directly to the database to get it. Also, the cache size is only used in conjunction with a prefetch when that prefetch is NOT exhaustive, i.e. has a where clause. So 'select * from mytable' as a prefetch will cause the cache size to be ignored because the entire set of records is already being held by FME. But 'select * from mytable where type=mytype' will make use of the cache because the prefetch query has not fetched the entire set of records.
Tip #5f: Increase transaction interval.
You can speed up translations by lengthening the interval between transaction commits in the writer. Committing transactions is an expensive operation, and therefore it is recommended that you make the transaction interval as big as possible. In speed tests performed at Safe Software, changing the transaction interval from 500 to 1000 resulted in a specific translation being 2.5% faster. Changing the transaction interval to 5000 resulted in the same translation running 5.5% faster. Turning transactions OFF resulted in an improvement of either 12% or 19%. The performance advantages of changing the transaction interval or of turning transactions off will differ between datasets.
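The effect of the transaction interval in Tip #5f is easiest to see by counting commits. A minimal sketch, using Python's built-in SQLite as a stand-in for the target database; the table name and row data are invented, and in FME itself the interval is a writer parameter rather than code you write:

```python
import sqlite3

def load_with_interval(rows, interval):
    """Insert rows, committing every `interval` rows; return (row count, commit count)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE features (id INTEGER, name TEXT)")
    commits = 0
    for n, row in enumerate(rows, start=1):
        conn.execute("INSERT INTO features VALUES (?, ?)", row)
        if n % interval == 0:
            conn.commit()                  # one expensive commit per full batch
            commits += 1
    if len(rows) % interval != 0:
        conn.commit()                      # flush the final partial batch
        commits += 1
    count = conn.execute("SELECT COUNT(*) FROM features").fetchone()[0]
    conn.close()
    return count, commits

rows = [(i, f"feature_{i}") for i in range(10000)]
for interval in (500, 1000, 5000):
    count, commits = load_with_interval(rows, interval)
    print(f"interval {interval}: {count} rows written in {commits} commits")
```

The same rows are written each time; only the number of commits changes, which is where the saving comes from. The trade-off is that a failure part-way through loses more uncommitted rows when the interval is large.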
You can run a profile of your FME Workspace to obtain a detailed log of how much time is spent in each underlying factory or function. Use Tools > Edit Header to add the directive FME_PROFILE_RESULT_CSV followed by the output file, for example: FME_PROFILE_RESULT_CSV C:\TEMP\profile.csv
Note: Don't forget to remove the profile directive for your production workspace since profiling also has an impact on performance!
It's sometimes worth using a Performance Monitoring tool (such as PerfMon) to log the CPU and memory usage of a process.
By default, Windows restricts the memory available to a single 32-bit process to 2 GB. FME is a single-threaded process and so falls prey to this restriction. When it exceeds the available memory, either the system will crash or FME will need to start caching features to disk, which has a very negative effect on performance.
2 GB is not a large number given the size of current datasets. However, you can increase the amount of memory available to 3 GB by setting an operating system switch. See Using the /3GB Switch for how to do this. Obviously, you need to have a computer with at least 3 GB of RAM installed before this setting would make any difference.
You can also use a 32-bit FME running on a 64-bit workstation, which gives access to 4 GB of RAM.
You might also want to consider using 64-bit FME running on a 64-bit processor. There are restrictions on some of the supported formats in 64-bit FME. Also, to realize the true benefits of 64-bit applications, it is recommended that you DOUBLE the amount of RAM you would normally have - i.e. 8 GB of RAM minimum.
Tip #6: Increasing the amount of available memory using the /3GB switch will make large-scale translations run faster, and permit some translations that would previously fail due to a lack of memory.
Remember that we said that FME pushes features through the workspace one at a time? Well that's not always the case. While some transformers in Workbench operate on one feature at a time (feature based) others need to work on groups of features (see About Group-Based Transformers). Group-based transformers are the ones that process multiple features simultaneously; for example intersecting many line features to produce a topological network.
Obviously, any transformer that works on a group of features must hold them all in memory at a single time to do so.
So one issue is that if you have multiple group-based transformers strung together in a workspace, particularly when they are in separate streams (parallel connections), then you are potentially storing multiple copies of the data at any one time. Therefore you're using up vital system resources and potentially slowing the translation because it ends up caching data to disk instead.
Tip #7a: Obviously if you need a certain arrangement of transformers then you must use that arrangement, but be aware that multiple group-based transformers can eat up memory very quickly, and try to avoid the situation if at all possible.
Tip #7b: Sit back, relax, and watch as FME handles memory in a way that will maximize performance!
Tip #7c: Some group-based transformers have a "features first" option. For example, PointOnAreaOverlay has an Areas First option, FeatureMerger has Suppliers First, and Clipper has Clippers First. If you can order your features appropriately, then using these options will reduce the amount of blocking and reduce memory use.
During the translation - as we've noted several times above - FME will either be holding your data in memory or caching it to a disk. Obviously, the smaller the dataset the less memory used and the better the performance, and this includes the number of attributes.
One particular problem would be carrying around spatial data as attributes. Spatial database formats - for example Oracle or GeoMedia - usually store geometry within a field in the database; for example GEOM. When FME reads the data it converts the GEOM field into FME-style geometry and drops the field from the data.
But it's usually possible to store geometry in more than one field. Sometimes you wish to create a backup copy, and sometimes the original application creates copies for its own purposes. FME will only convert one field into geometry, leaving any others as attributes. These are very large and complex attributes that take up a great deal of system resources.
One user we assisted had just this problem. A compress function in his GIS, instead of simply compressing the original geometry field, created an entirely new field. When FME read the data it used the compressed field for the geometry, but also read the original uncompressed data as a plain attribute. This caused a major slowdown, but by simply applying an AttributeRemover transformer at the start of the translation, the excess geometry column could be removed before it started to get read by group-based transformers, and the translation performance vastly increased.
Another type of attribute to beware of is a List. A list can carry many, many sets of attributes, which is a big drain on resources. For example, use a Joiner to join a feature to 1000 records and you have a list with 1000 sets of records. This is bad enough, but if you explode the list and keep all of the original attributes, then you're getting 1000 features each with 1000 sets of attributes!
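The blow-up described above is quadratic, which a couple of lines of arithmetic make concrete. This is only an illustrative calculation (the function name is invented), not a model of how FME stores lists internally:

```python
def attribute_sets_after_explode(list_length, keep_list=True):
    """Attribute sets held after exploding a list with `list_length` elements."""
    features = list_length                              # one output feature per list element
    sets_per_feature = list_length if keep_list else 1  # each feature still carries the whole list
    return features * sets_per_feature

kept = attribute_sets_after_explode(1000)
dropped = attribute_sets_after_explode(1000, keep_list=False)
print(f"list kept on every feature: {kept:,} attribute sets")
print(f"list removed after explode: {dropped:,} attribute sets")
```

Dropping the list as soon as it has been exploded keeps the total proportional to the number of records rather than to its square.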
Tip #8: Only carry through the translation any geometry and attributes you intend to be available on the output. Remove any excess items as early in the translation process as possible.
Wherever possible let the database do the work. The FME Oracle and most other database readers support both full SQL Statements and SQL WHERE clauses. Use SQL Joins instead of using the FeatureMerger transformer, if possible. Create a database materialised view for even better performance and to simplify your workspace (although sometimes DBAs won't allow you to do this).
For ArcSDE, SQL Statements are only supported for non-spatial tables. For spatial tables use: sdetable -o create_view to create a view that contains a spatial column in the join.
The ArcSDE help has useful tips on how to do this, see the ArcGIS Help - A quick tour of views in the geodatabase: http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/A_quick_tour_of_views_in_the_geodatabase/002n000000t0000000/
You can create complex table joins using a combination of sdetable -o create_view followed by SQL ALTER VIEW.
Tip #9: FME can improve performance in some cases by handing off processing to a database.
When you have multiple writers in a workspace the data for the first writer gets written straightaway, whereas subsequent writers get their data cached for later writing. This helps performance in itself, but also makes the first writer in the Navigator - the order of which you can control by right-click > Move Up - more efficient than any other. The Advanced Workspace Parameter Order Writers By allows you to set the order in which features in your workspace are written to the writers.
Note: The Order Writers By parameter is available as of FME 2016.
Tip #10a: When you have multiple writers in a workspace, always ensure the one getting the larger amount of data is the first writer in the list. Need an example? This FAQ tells you all you need to know.
Tip #10b: For peak performance, tiles output from the RasterTiler or WebMapTiler should be written in the order they are output from these transformers. Do not use a Dataset Fanout here as it re-orders the output and negatively affects performance.
Sometimes a complex mathematical formula is most easily calculated by splitting the expression up into smaller parts and calculating these parts within individual ExpressionEvaluator transformers. However, chaining together ExpressionEvaluators in this way is not the most efficient way of processing data.
This is because FME uses attribute values in a Tcl script not by reading them from commands within the script, but by recreating the script for each feature with its relevant values embedded (note that this is my very loose explanation of what is undoubtedly a more complex issue).
The point is that the Tcl code gets recompiled for each feature in each ExpressionEvaluator, and chaining a series of these transformers together just compounds the problem.
Tip #11a: When you have multiple ExpressionEvaluator transformers in a workspace, consider condensing them into a single ExpressionEvaluator to cut down on Tcl calls and compiling.
Another option is to replace all of the ExpressionEvaluator transformers with a single Tcl script. This might sound daunting, but can be relatively simple compared to the previous idea of condensing a tricky algorithm into a single expression.
The TCLCaller transformer is a great way to do this, but remember that performance is optimized by manipulating attributes in Tcl through the FME_GetAttribute and FME_SetAttribute functions that are provided specifically for this purpose.
Tip #11b: When you have multiple ExpressionEvaluator transformers in a workspace, consider replacing these with a single TCLCaller transformer that contains all of the expressions within a single procedure.
Don't bite off more than you can chew. If you have a huge amount of data to process, you may want to consider dividing your processing by some kind of grouping such as region. This way you don't have to do joins across your entire dataset all at once.
For example, you could script a Where clause to select only the data from each of Canada's 10 provinces one at a time, so that only roughly 10 - 20% of the data is processed at any one time by the FME engine. Or you could do successive spatial extent queries. Either way, this would ultimately allow the entire country to be processed. The script to call the workspace would only need to be called 10 times, each time passing the name of the province to be processed to a runtime parameter that is in turn embedded in a SQL or Where clause statement within the workspace.
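A driver script for the province-by-province approach might look like the sketch below. Several things here are assumptions made for illustration: the workspace name `country_load.fmw`, a published parameter called `PROVINCE` that the workspace embeds in its Where clause, and an `fme` executable available on the PATH. The commands are only built and printed; call `run_all` to actually execute them.

```python
import subprocess

PROVINCES = [
    "Alberta", "British Columbia", "Manitoba", "New Brunswick",
    "Newfoundland and Labrador", "Nova Scotia", "Ontario",
    "Prince Edward Island", "Quebec", "Saskatchewan",
]

def build_commands(workspace, parameter="PROVINCE", fme_executable="fme"):
    """Build one command line per province, passing its name as a published parameter."""
    return [
        [fme_executable, workspace, f"--{parameter}", province]
        for province in PROVINCES
    ]

def run_all(commands):
    """Run the translations one after another, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)

# Hypothetical workspace name; the runs are listed here rather than executed.
commands = build_commands("country_load.fmw")
for command in commands:
    print(" ".join(command))
```

Running the provinces sequentially keeps peak memory down; whether to run some of them in parallel instead depends on how many FME processes your licence and hardware allow.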
NOTE: The value of this parameter should NOT be adjusted unless you have exceptional circumstances or have been directed to do so by Safe Software. Changing this value can cause unexpected behaviour and reduce performance. FME 2018 and up should no longer benefit from adjusting the memory redline parameter.
FME_ENGINE_MEMORY_REDLINE <factor>
The Resource Manager automatically determines the optimal total memory the FME Engine process should use. It also dynamically allocates this total memory optimally to the algorithms within FME requesting it.
The FME_ENGINE_MEMORY_REDLINE directive is a hint to the FME Engine on how aggressive it should be in consuming memory. It takes a value between 0 and 1 (0.5 is the default value). For more aggressive memory usage, a value above 0.5 should be used. For reduced memory usage, a value below 0.5 should be used. The risk in being too aggressive is the process running out of memory or the machine thrashing. The risk in being too conservative is that the process may take longer to complete.
:: Value = 0.5 (Default)
The Resource Manager will stash "at a reasonable point", with the goal of running as fast as possible without risking stability.
:: Value = 0.0
"Optimizing Memory Usage" should occur immediately. This practice will incur longer processing time as it is more costly to write data to disk than using memory resources.
:: Value = 1.0
No memory limits on individual processes. Stashing will only occur if the entire system is dangerously low on memory.
Note: Values between those listed above can be chosen to further tweak memory usage.
Please note that when lowering the parameter value below 0.5 it is important that there exists a sufficient amount of temporary (physical) space in your /tmp or FME_TEMP directories.
Note: In FME 2017.1.0 and earlier, Linux was biased towards never stashing due to performance issues. Linux behaviour was harmonized with Windows in FME 2017.1.1.
To set this value in FME Workbench use Tools > Edit Header and enter:
FME_ENGINE_MEMORY_REDLINE 0.5
as the very first line. For FME Server Engines see How to Control FME Server Engine Memory Usage.
Improving Performance when working with Esri Geodatabases
Feature Caching and Performance
Maximum concurrent FME processes error
In what order are features processed when there are parallel transformers
Dataset Fanout Negatively Impacts Performance when Writing Raster Tiles
Improve Performance when Editing an FME Workspace
Error reading large number of Raster files
How to Read and Translate all Feature Classes from Multiple ESRI Geodatabases
© 2019 Safe Software Inc | Legal