< >CSV Helper - Read, manipulate and write CSV files from urls or ingested via email


Table of contents:


The CSV Helper has two actions, one to read CSV files and one to write CSV files:


  1. The read action can read a CSV string from a remote FTP location or http URL location, including authorization tokens in headers.

  2. The write action is used to create a CSV string from some previously generated data. The user input will contain a data-reference - a ref ID pointing to a list of values, and a list of column field names. These column field names will be used when iterating through the data given by the ref ID to extract values from the list element. If the columns are empty, then the CSV will contain just one column, namely with the string representation of the corresponding line.



Reading a CSV file


Read produces a list (one entry for each row) of lists (one entry for each column), so the result of read_csv might look like this:

[

    [1, "Mike", "Germany"],

    [3, "Akito", "Russia"],

    [123, "Ivan", "Japan"],

    [9867534, "Dennis", "China"]

]


Writing a CSV file


Configuration outcome matrix (TODO rewrite in plaintext?):


Print_headers     s  

selected_columns 

new_column_names

outcome

yes

Some valid columns

Some valid names

If selected and new names have different length -> error

Use new column names

yes

Some valid columns

nothing

Use selected column names

yes

nothing

Some valid names

Expect new_colum_names to have ONE entry

yes

nothing

nothing

Error

no

Some valid columns

Some valid names

Info(Unused new names)

no

Some valid columns

nothing

Regular csv without headers

no

nothing

Some valid names

Info(Unused new names)

no

nothing

nothing

CSV with one column and no headers


The write CSV action produces a CSV string which can be written to a file on an FTP server, or sent as a mail attachment in a file.


Filtering a CSV file for certain rows (records) only

In order to a CSV file, you have to use the Dict Helper that reads in the CSV file and connect them like the below (Names you can see here are edited to better describe the particular flow action).



Once the two are connected with the arrow, select the Action "Filter a list e.g. CSV". Next select the reference like in the example  csv1. In this case we use the reference to the previous connector without curly brackets {{ }}. Within the file condition you can use standard Python filtering language, while each column is referenced by the word item in combination with a number, starting at zero. So in essence item[0] references the first column, item[1] the 2nd one and so on.

The above example filters a CSV for all rows that have "Product A" or "Product B" in the 2nd column. 


Filter Operators

The below list of operators work:

<
<=
>
>=
==
!=
in
not in




Here are more examples:

item[1][:1]  ==  'A' or item[1][:1]  ==  'B'

This will return all rows that start with A or B in column 1. So you can use standard Python string references for beginning, end of or within a text string.


" Test 123" --> item[1][:3]  -->  Returns first 3 letters like: "Tes"
" Test 123" --> item[1][1:4]  -->  Returns letter 2 to 4 like: "est"
" Test 123" --> item[1][4:]   -->  Returns letter 5 till the end like: " 123"
" Test 123" --> item[1][-3:]  -->  Returns letter the last three letters till the end like: "123"
" Test 123" --> item[1][-4:-1]  -->  Returns 3 letters before the last letter like: " 12"


Alternatively, you could also use the below to filter for a string starting or ending with, which is just a different notation as in the block above:

item[1].startswith("Test”) 
item[1].endswith("123”)




For more examples, search for "Python string slicing"


As for the querying language with the boolean operators AND OR you can combine them as you wish as you can see below in a more complex example. For more details search for "Python boolean operators".


item[1][:5]  ==  'Prod1' or ( item[1][:5]  !=  'Prod3'  and  item[4][:3] == 'USA' )


Filters rows to all those that have Prod1 standing in column 2 or have Prod3 standing in column 2 AND column 5 starts with USA. So as you see,



Modifying a whole column in a CSV or dictionary

Often times, formats of e.g. dates are different in every system and need to be adjusted and modified to reflect the format of a different system. 


Imagine a CSV with the the column header datetime and the below three datetimestamps. That need to be changed to only dates, like: 2019-06-18

; datetime
; 2019-06-18 23:23:23.232736734
; 2019-04-12 23:23:23.232736798
; 2017-06-11 23:23:23.232736727


To do the above, you need to use a combination of:

  1. The connector or helper passing the object
  2. A looper that changes something in every object
  3. and a dict_helper that changes a particular field in a dict
  4. and a date helper that formats data in a particular way