Eats smaller websites for breakfast.

July 10, 2008

Using AWK to convert CSV to XML

Filed under: awk,programming — Dave @ 10:52 am

I needed to convert some CSV files to XML, so it was time to return to my text-processing hero: AWK.

The “CSV to XML” script gets me 90% of the way there, except that it doesn’t actually convert from comma-separated values to XML: it uses spaces to separate values. Updated script:

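# node is never set in the script itself; presumably it is passed in, e.g. awk -v node=item
BEGIN {RS = "\n"
FS = "," }
# First row: keep the column headings as tag names and open the root element
NR == 1 {for (i = 1; i <= NF; i++)
tag[i] = $i
print "<" node "XML>"}
# Every other row: one <node> element with one child element per column
NR != 1 {print " <" node ">"
for (i = 1; i <= NF; i++)
print "  <" tag[i] ">" $i "</" tag[i] ">"
print " </" node ">"}
END {print "</" node "XML>"}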

Notes on cleaning your data:

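  • Your column headings need to be sane, i.e. no spaces or weird characters, because each heading is used verbatim as an XML tag name

  • Numbers with commas, e.g. “12,231”, won’t work: the script splits on every comma, so the value is read as two fields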
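
Some of that cleaning can be pushed into the script itself. What follows is only a sketch of a variant of the script above, not a tested replacement: it rewrites awkward headings into usable tag names and escapes the characters XML treats as markup. The function names csv_to_xml and esc, the element name "dish" and the sample data are all made up for illustration, and it still assumes no field contains a quoted comma.

```shell
#!/bin/sh
# Sketch: CSV to XML with heading clean-up and XML escaping.
# Assumptions: the element name is passed as the first argument, the CSV
# arrives on stdin, and no field contains a quoted comma.

csv_to_xml() {
  awk -v node="$1" '
    # Escape the characters XML treats as markup; "&" has to be handled first.
    function esc(s) {
      gsub(/&/, "\\&amp;", s)
      gsub(/</, "\\&lt;", s)
      gsub(/>/, "\\&gt;", s)
      return s
    }
    BEGIN { FS = "," }
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        tag[i] = $i
        gsub(/[^A-Za-z0-9_]/, "_", tag[i])                   # spaces and odd characters become "_"
        if (tag[i] !~ /^[A-Za-z_]/) tag[i] = "_" tag[i]      # a tag name cannot start with a digit
      }
      print "<" node "XML>"
      next
    }
    {
      print " <" node ">"
      for (i = 1; i <= NF; i++)
        print "  <" tag[i] ">" esc($i) "</" tag[i] ">"
      print " </" node ">"
    }
    END { print "</" node "XML>" }
  '
}

# Made-up sample: a heading with a space and a value containing "&".
printf 'item name,price\nfish & chips,5\n' | csv_to_xml dish
```

Run as is, this should print:

```
<dishXML>
 <dish>
  <item_name>fish &amp; chips</item_name>
  <price>5</price>
 </dish>
</dishXML>
```

Values like “12,231” are still split in two. Handling quoted fields properly needs a real CSV parser (or something like gawk’s FPAT), which is beyond a few lines of portable AWK.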
