Solved

CSV Reader (in FME Server 2012 SP4)


I'm using the CSV reader and have run into an extremely annoying problem: by default the columns are named col0, col1, etc., up to col20, and are output in that order. That is fine, EXCEPT that col9 is added at the end and not where it should be, between col8 and col10.

 

 

Can't see how I can re-order.

 

 

Any ideas? Am I doing something wrong?

Best answer by ebygomm 30 May 2013, 16:41


16 replies

Userlevel 5
Hi,

 

 

could you post the first two lines of your CSV here?

 

 

Guessing can be difficult without examples :-)

 

 

David
Badge
Here's some example lines from the file ...

 

 

There are 7 types of record (determined by the value of the first column) ... types 10, 11, 12, 13, 14, 15 and 99

 

 

the file begins with a single type 10 record then many thousands of records of types 11 ... 15 and closes with just one type 99 record

 

 

the maximum number of columns in any line is 20, but not every line has 20 columns, and the data format of (say) column 4 in the type 10 line isn't the same as it is for column 4 in a type 13 line

 

 

 

many thanks

 

 

10,"LEEDS DISTRICT",4720,2013-03-29,1,2013-03-29,60315,7.1,"F"
11,"I",1,23000005,1,4720,2,2000-12-13,,,1,2000-12-13,2004-01-13,2000-12-13,,425715.00,435212.00,425898.00,435219.00,10
11,"I",2,23000006,1,4720,2,2004-01-13,,,1,2004-01-13,2004-01-13,2004-01-13,,426286.00,435666.00,426260.00,435701.00,10
12,"I",13982,2,23000008,1,4257810436297,1
12,"I",13983,2,23000008,1,4258580436263,1
12,"I",13984,2,23000008,1,4258620436273,1
13,"I",65946,4185500444803,1,1998-05-14,,3,418522.00,444820.00,418581.00,444791.00,0,1998-05-14,1998-05-14,1
13,"I",65947,4185510442071,1,2003-06-26,,4,418567.00,442076.00,418536.00,442062.00,0,2003-06-26,2003-06-26,1
13,"I",65948,4185510442788,1,2004-01-13,,2,418549.00,442807.00,418552.00,442768.00,0,2004-01-13,2004-01-13,1
14,"I",227618,4455190430173,1,2,445519.00,430172.00
14,"I",227619,4455390429771,1,3,445550.00,429772.00
15,"I",241933,23080144,"MELROSE VILLAS","","HORSFORTH","LEEDS","ENG"
15,"I",241934,23092499,"ZZZ PRINCE HENRYS COURT","","OTLEY","LEEDS","ENG"
15,"I",241935,23092500,"A168","","","LEEDS","ENG"
99,0,241935,2013-03-29,60315

 

Userlevel 5
Hi,

 

 

If this was my project, I suspect my very first transformers after the Reader would be something like a TestFilter to split out the record types and then renaming the other fields to something that makes more sense than col1, col2, col3, etc.
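To make the split-and-rename idea concrete outside FME, here is a minimal Python sketch. It is not what a TestFilter does internally; it only mirrors the logic of routing rows by the first column and then renaming positions. The names in `FIELD_NAMES` and the helper `split_by_record_type` are placeholders invented for illustration, and any position without a known name keeps its colN label.

```python
import csv
import io
from collections import defaultdict

# Placeholder attribute names per record type (assumptions for illustration only).
FIELD_NAMES = {
    "10": ["record_type", "area_name", "area_code"],
    "14": ["record_type", "change_type", "sequence_no"],
}

def split_by_record_type(text):
    """Group CSV rows by their first column and rename positions where a name is known."""
    groups = defaultdict(list)
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip empty lines
            continue
        names = FIELD_NAMES.get(row[0], [])
        # Fall back to colN for any position without a friendly name
        record = {
            (names[i] if i < len(names) else f"col{i}"): value
            for i, value in enumerate(row)
        }
        groups[row[0]].append(record)
    return groups

# Three lines taken from the sample data posted in this thread
sample = (
    '10,"LEEDS DISTRICT",4720,2013-03-29,1,2013-03-29,60315,7.1,"F"\n'
    '14,"I",227618,4455190430173,1,2,445519.00,430172.00\n'
    '99,0,241935,2013-03-29,60315\n'
)
groups = split_by_record_type(sample)
print(groups["10"][0]["area_name"])
print(groups["99"][0])
```

Because each record type gets its own list, the differing column counts and formats per type stop being a problem after the split.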

 

 

I believe the column names are sorted alphabetically, thus placing col9 behind col10, as expected in that regard.
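The effect of a plain alphabetical sort on names like these can be checked with a few lines of Python. This is only a sketch of string sorting in general, not of FME's internals, and `natural_key` is a hypothetical helper written for this example.

```python
import re

def natural_key(name):
    """Split 'col10' into ['col', 10, ''] so digit runs compare as integers."""
    return [int(part) if part.isdigit() else part for part in re.split(r"(\d+)", name)]

names = [f"col{i}" for i in range(20)]

alphabetical = sorted(names)               # plain string comparison
natural = sorted(names, key=natural_key)   # numeric-aware comparison

print(alphabetical)
print(natural)
```

With plain string comparison, col10 to col19 sort directly after col1 and col9 ends up last, while the numeric-aware key keeps col9 between col8 and col10.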

 

 

On the other hand, the order of the attributes inside the workspaces shouldn't influence the processing at all, so I'm not really sure if I understand the problem.

 

 

You can order the fields as you wish on the Writer, which is where it counts.

 

 

David
Badge
indeed - separating the record types is the first thing that I do as soon as I've read the file, using a Tester against column0

 

 

and then I give the attributes sensible names (dependent upon record types)

 

 

but as soon as the data come out of the tester what would have been column 9 has got lost

 

 

i have already tried re-ordering the attributes on the way out of every transformer they pass through to no avail

 

 

perplexing
Userlevel 1
Badge +21
Do you need to keep the header record (type 10)?

 

 

If not, try changing the csv reader parameters to skip the first line. I know in FME 2012 desktop the csv reader only reads the first 10 columns (col0 - col8)  in a LPG/LSG type file as the first line is only 10 columns long. With it being col9 that's the issue, i.e. column number 11, I wonder if this is a similar issue?
Userlevel 1
Badge +21
Can you tell i get mixed up with column numbering!

 

 

Should read

 

 

I know in FME 2012 desktop the csv reader only reads the first 9 columns (col0 - col8)  in a LPG/LSG type file as the first line is only 9 columns long. With it being col9 that's the issue, i.e. column number 10, I wonder if this is a similar issue?
Userlevel 3
Badge +17
Hi,

 

 

Using your example data, I couldn't reproduce any problems with the order of attributes in FME 2013 SP1, so I'm not sure what the actual problem is. If you want to try rearranging the order in which attributes are displayed at the reader, enable the 'Allow feature type editing' option in Workbench; you can then do that in the Feature Type Properties dialog box of the reader. To set the option, click menu Tools > FME Options.

 

 

Takashi
Badge
looking into it further it seems as if the attributes are getting shuffled by the reader

 

 

i tweaked my input to make things a bit more obvious ... so that the data is the same as its column name (starting at 0)

 

 

----------------
zero,1,TWO,3,4,FIVE,6,7.7,EIGHT,9,col10,col11,12,this is thirteen,14.14,15/01/2015,six and ten,seventeen,18,19
0,1,TWO,3,4,FIVE,6,7.7,EIGHT,nine,col10,col11,12,this is thirteen,14.14,15/01/2015,six and ten,seventeen,18,19
none,1,TWO,3,4,55555,6,7.7,EIGHT,9.9,col10,col11,12,this is thirteen,14.14,15/01/2015,six and ten,seventeen,18,19
----------------

 

 

then i read the data in ... used an AttributeRemover to remove most of the columns ... i retained cols 6 thru 10 ... then added a writer to csv and copied the attributes from the transformer

 

 

here's what came out ...

 

 

col10,col6,col7,col8,col9
9,6,7.7,EIGHT,19
nine,6,7.7,EIGHT,19
9.9,6,7.7,EIGHT,19

 

 

clearly "col10" now contains the data that was in column 9 and "col9" now contains the data that was in column 19

 

 

 

 

Userlevel 5
Hi Ian,

 

 

I have used the CSV reader/writer quite a bit and I have never experienced anything like this. I just tested with your sample data using FME 2013 SP1 and it all came out correctly.

I would double-check that it is not your workspace that does something unexpected before the features reach the writer. Consider setting an inspection point just after the reader and following a feature all the way to the writer to see what happens along the way.

 

 

David
Badge
i just added a data inspector fed directly from the reader and the incorrect mapping of columns is just as wrong as before - so it's happening on the way in
Userlevel 5
Have you tried to delete and re-create the CSV reader? It might be necessary if the file contents have changed since you created it the first time.

 

 

The Update function doesn't always work on the CSV reader either, unfortunately.

 

 

David
Badge
more confusing still - if i go into the Navigation Pane and Inspect the source file there ... it's all OK ... all the data is paired with the correct attribute labels ... it's only wrong once it's gone into the reader and come out the other side

is there some setting somewhere to do with CSV readers where i might have accidentally tweaked something global, to do with column headers or column order, without realizing?
Userlevel 3
Badge +17
Hi,

 

 

Empty lines in the file and too small a value for the reader's 'Maximum Lines to Scan' parameter might cause that unexpected behavior.
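Why a short scan window matters can be shown with a small Python sketch. This is a stand-in for schema scanning in general and makes no claim about how the FME reader implements it; `inferred_width` and its `max_lines_to_scan` argument are hypothetical names used only here. The sample lines are taken from the data posted earlier in the thread.

```python
import csv
import io

def inferred_width(text, max_lines_to_scan):
    """Widest row among the first max_lines_to_scan rows (a stand-in for schema scanning)."""
    reader = csv.reader(io.StringIO(text))
    width = 0
    for line_no, row in enumerate(reader):
        if line_no >= max_lines_to_scan:
            break
        width = max(width, len(row))  # an empty line counts as a row of width 0
    return width

sample = (
    '10,"LEEDS DISTRICT",4720,2013-03-29,1,2013-03-29,60315,7.1,"F"\n'
    '11,"I",1,23000005,1,4720,2,2000-12-13,,,1,2000-12-13,2004-01-13,2000-12-13,,'
    '425715.00,435212.00,425898.00,435219.00,10\n'
    '99,0,241935,2013-03-29,60315\n'
)

print(inferred_width(sample, 1))  # 9: only the type 10 line is seen
print(inferred_width(sample, 3))  # 20: the type 11 line widens the schema
```

If only the 9-column first line is scanned, any schema built from it knows nothing about the later columns, which fits the symptoms described in this thread.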

 

Takashi
Badge
i've gone all the way back to a blank workspace and gone through it all again ... i think the problem is exactly what E Gomm said a while back ... the first line of my file contains just 9 columns but all the stuff that follows has more, which makes it go wonky

 

 

... so instead when i first add the reader and go to Parameters if i tell the reader to skip the first line it seems to come out ok after that

 

 

o'course the downside to that is that i can't process the first line - so i'd have to add another reader and deal with it separately
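The "handle the first line separately" idea looks roughly like this in plain Python. It is a sketch of the logic only, assuming the whole file fits in memory; `read_header_and_rest` is a hypothetical helper, and in FME the same effect would come from two readers as described above.

```python
import csv
import io

def read_header_and_rest(text):
    """Return the first record on its own, plus every non-empty record after it."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)             # the single type 10 record
    rest = [row for row in reader if row]   # everything that follows
    return header, rest

# Lines taken from the sample data posted in this thread
sample = (
    '10,"LEEDS DISTRICT",4720,2013-03-29,1,2013-03-29,60315,7.1,"F"\n'
    '12,"I",13982,2,23000008,1,4257810436297,1\n'
    '99,0,241935,2013-03-29,60315\n'
)

header, rest = read_header_and_rest(sample)
print(header)
print(len(rest))
```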

 

 

it'd be nice to think this was fixed in 2013?

 

 

many thanks to everyone for suggestions and advice

 

 

IanM
Userlevel 5
Hi,

 

 

good to hear you found a solution. You might also find the following alternative of interest, using a PythonCreator with the built-in csv module to parse the file into FME features.

 

 

This solution covers a few edge cases where the FME CSV reader falls short, such as super-wide CSV files, text fields with line breaks, etc.
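The original post did not include the code, so the following is only a minimal sketch of the parsing half, using Python's standard csv module. The function name `rows_as_attributes` is hypothetical, and the step that turns each dict into an FME feature inside a PythonCreator is left out because it needs FME's own Python environment. The second sample line is made up to show a quoted field containing a line break.

```python
import csv
import io

def rows_as_attributes(text):
    """Yield one dict per CSV record, keyed col0, col1, ... by position."""
    for row in csv.reader(io.StringIO(text)):
        if row:  # skip empty lines
            yield {f"col{i}": value for i, value in enumerate(row)}

sample = (
    '15,"I",241933,23080144,"MELROSE VILLAS","","HORSFORTH","LEEDS","ENG"\n'
    '15,"I",0,0,"TWO\nLINES","","","LEEDS","ENG"\n'  # invented row with an embedded line break
)

records = list(rows_as_attributes(sample))
print(len(records))
print(records[0]["col4"])
```

Because the keys are generated per row from each value's position, a short first line cannot influence how later, wider rows are labelled.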

 

 

David
Userlevel 1
Badge +21
You'll be pleased to know the CSV reader in 2013 works correctly for this type of file without having to skip the header line.