CSV import considerations

Robert Helling helling at atdotde.de
Tue Sep 17 00:52:21 UTC 2013

On 17.09.2013, at 07:55, Miika Turkia <miika.turkia at gmail.com> wrote:

Hi everyone,

> Parsing the CSV in C is doable but of course scripting languages like perl or python are a lot more flexible when it comes to parsing strings and outputting XML. But then again these script languages are not universal, especially one cannot assume them to be available on Windows (even though they can be installed there).

Hmm. How often would a typical user want to import such a file? My guess would be only once, during the transition to Subsurface. Being a lazy perlmonger myself, my approach would be to write a Perl script to do this and, for people who don't want to install a Perl environment, set up a web service (basically running the same script with a CGI wrapper) for the translation. But maybe this is too lazy for you more ambitious people.
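
To make the "small translation script" idea concrete, here is a minimal sketch of a CSV-to-XML converter. It is written in Python rather than Perl purely for illustration, and the `<dives>`/`<dive>` element names and the sample columns are made-up placeholders, not the actual Subsurface XML format.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(text, delimiter=","):
    """Turn CSV text with a header row into a small XML document.

    The <dives>/<dive> element names are hypothetical placeholders,
    not the real Subsurface schema.
    """
    root = ET.Element("dives")
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        dive = ET.SubElement(root, "dive")
        for column, value in row.items():
            # Skip ragged rows (missing or surplus fields) in this sketch.
            if column is None or value is None:
                continue
            # Store every column as an attribute; a real importer would
            # map and validate each field individually.
            dive.set(column, value)
    return ET.tostring(root, encoding="unicode")


# Made-up sample data for illustration only.
sample = "date,depth\n2013-09-17,18.5\n"
print(csv_to_xml(sample))
```

The same function could sit behind a CGI wrapper or any other web front end, since it only maps text to text.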

I don't know what type of CSV files we are talking about, but the bad news is that CSV can mean a lot of things, in particular with respect to quoting characters and field separators. Not to mention localization nonsense. I fell into that trap recently: for the course I was teaching, we had the marks from homework and exam in a LibreOffice spreadsheet, and I wanted to convert that data to TeX to print certificates. So I wrote some Perl glue for the translation. Unfortunately, the spreadsheet was set to German localization, which puts commas in grades like 2,3 (meaning 2.3). Perl's string-to-number conversion starts at the beginning of the string, reads characters as long as they make sense as a number, and then discards the rest. Of course it wants a '.' as the decimal separator and thus reads "2,3" as 2 when a numerical value is needed. Which ended in me handing out certificates with wrong grades.
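
As an illustration of the separator and decimal-comma pitfalls, here is a minimal Python sketch that reads a semicolon-separated, German-style export and converts the numbers explicitly. The function names, the sample data, and the choice of `;` as field separator are assumptions made for the example. Unlike a prefix-only conversion, Python's `float()` rejects a string like "2,3" outright, so a wrong locale setting surfaces as an error instead of a silently truncated value.

```python
import csv
import io


def parse_decimal(text, decimal_sep="."):
    """Convert text to float, raising ValueError instead of truncating."""
    cleaned = text.strip()
    if decimal_sep != ".":
        if "." in cleaned:
            raise ValueError(f"unexpected '.' in {text!r}")
        cleaned = cleaned.replace(decimal_sep, ".", 1)
    # float() raises ValueError on leftovers such as "2,3" instead of
    # returning 2 the way a prefix-only conversion would.
    return float(cleaned)


def read_grades(text, delimiter=";", decimal_sep=","):
    """Read (name, grade) rows from CSV text that starts with a header row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar='"')
    next(reader)  # skip the header row
    return [(name, parse_decimal(grade, decimal_sep)) for name, grade in reader]


# Made-up sample: quoted names containing commas, grades with decimal commas.
german_export = 'name;grade\n"Doe, Jane";2,3\n"Roe, Max";1,7\n'
print(read_grades(german_export))
```

Passing the separator and decimal mark in explicitly, rather than guessing them, keeps the localization assumption visible at the call site.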

What I wanted to say: parsing CSV might be more difficult than it appears at first sight.


Robert C. Helling     Elite Master Course Theoretical and Mathematical Physics
                      Scientific Coordinator
                      Ludwig Maximilians Universitaet Muenchen, Dept. Physik
                      Phone: +49 89 2180-4523  Theresienstr. 39, rm. B339

Enhance your privacy, use cryptography! My PGP keys have fingerprints
A9D1 A01D 13A5 31FA 6515  BB44 0820 367C 36BC 0C1D    and
DCED 37B6 251C 7861 270D  5613 95C7 9D32 9A8D 9B8F

