Comments - LOAD DATA INFILE

3 years, 11 months ago Jan Steinman

I would count that as pre-import massaging.

Unless you're importing a logical SQL dump from MariaDB itself (in which case, it's clearly a bug if the DATE format is wrong), you almost always have to massage your input data in some way — removing "$" and "," from currency columns, for example.

In most cases, it's no more than a few lines of Perl, Python, or Ruby to turn dates into the requisite "YYYY-MM-DD" format. (In fact, it's ONE LINE in Ruby! Well, two, if you count "require 'date'".)
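As a rough illustration of that kind of pre-import massaging, here is a minimal Python sketch. The function names and the assumed MM/DD/YYYY input format are hypothetical, not taken from any particular export; adjust the format string to whatever your source actually emits.

```python
from datetime import datetime


def clean_currency(value):
    """Strip "$" and "," so that "$1,234.50" becomes "1234.50"."""
    return value.replace("$", "").replace(",", "").strip()


def to_sql_date(value, source_format="%m/%d/%Y"):
    """Reformat a date string as YYYY-MM-DD.

    source_format is an assumption (US-style MM/DD/YYYY); change it to
    match the data you are actually importing.
    """
    parsed = datetime.strptime(value.strip(), source_format)
    return parsed.strftime("%Y-%m-%d")


if __name__ == "__main__":
    print(clean_currency("$1,234.50"))
    print(to_sql_date("12/9/1906"))
```

Running each field of the input through helpers like these before LOAD DATA INFILE sees the file is usually all the massaging that is needed.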

 
3 years, 11 months ago Olivier Bertrand

In that case, the date is contained in a CSV file exported from MongoDB, and it has the standard format for all NOSQL data, namely: YY-MM-DDThh:mm:ssZ. I understand the LOAD INFILE command cannot take care of all existing date formats; this is why I think it should have an additional parameter.

 
3 years, 11 months ago Jan Steinman

That may be the "standard" for all NOSQL databases, but I can assure you that the ISO (and before that, ANSI) standard for SQL databases is "YYYY-MM-DD hh:mm:ss".

So, you've got duelling standards here.

I could be snide and point out that ANSI standard SQL has been around long before Y2K, long before MongoDB made the questionable choice to not include the first two digits of the year (what if you have to represent a date in the last century?), long before anyone even thought of creating NOSQL, but instead, I'll just say the difference looks like just two lines of Perl. :-) Or one line of Ruby. Just do it.

You could easily put it in a pipeline between your export stream and your import stream.
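A pipeline stage of that kind might look like the following Python sketch. It assumes timestamps of the form YYYY-MM-DDThh:mm:ssZ and simply drops the "T" and "Z" without any time zone conversion; the names `ISO_UTC`, `to_sql_datetime`, and `convert_stream` are made up for this example.

```python
import io
import re

# Matches timestamps such as 1906-12-09T05:00:00Z anywhere in a line.
ISO_UTC = re.compile(r"\b(\d{4}-\d{2}-\d{2})T(\d{2}:\d{2}:\d{2})Z\b")


def to_sql_datetime(line):
    """Rewrite every YYYY-MM-DDThh:mm:ssZ in line as YYYY-MM-DD hh:mm:ss."""
    return ISO_UTC.sub(r"\1 \2", line)


def convert_stream(stream_in, stream_out):
    """Copy stream_in to stream_out line by line, converting timestamps."""
    for line in stream_in:
        stream_out.write(to_sql_datetime(line))


if __name__ == "__main__":
    # Small in-memory demonstration instead of real export/import streams.
    source = io.StringIO("1,Grace Hopper,1906-12-09T05:00:00Z\n")
    target = io.StringIO()
    convert_stream(source, target)
    print(target.getvalue(), end="")
```

Calling `convert_stream(sys.stdin, sys.stdout)` instead of the in-memory demonstration would turn this into a filter that can sit between the export and the import.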

PS: I think you should submit an enhancement request to MongoDB, asking to be able to export to ISO standard SQL. :-)

 
3 years, 11 months ago Olivier Bertrand

Things are strange. My table:
{code}
create table tld (
  id int not null,
  name char(32) not null,
  birth datetime) engine=MyISAM;
{code}
The file to load:
{no format}
1,Grace Hopper,1906-12-09T05:00:00Z
2,Kristen Nygaard,1926-08-27T04:00:00Z
3,Ole-Johan Dahl,1931-10-12T04:00:00Z
{no format}
When I try:
{code}
load data infile 'C:/Data/tld.csv' INTO TABLE tld FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
{code}
I get the error:
1292: Incorrect datetime value: '1906-12-09T05:00:00Z' for column `test`.`tld`.`birth` at row 1

Now when I edit the first record of the file to have:
{no format}
1,Grace Hopper,1906-12-09 05:00:00
2,Kristen Nygaard,1926-08-27T04:00:00Z
3,Ole-Johan Dahl,1931-10-12T04:00:00Z
{no format}
I can load it with 2 warnings:
1265 Data truncated for column `birth` at row 2
1265 Data truncated for column `birth` at row 3

However, when I execute:
{code}
select * from tld;
{code}
It works fine and I get:
{no format}
+----+-----------------+-----------------------+
| id | name            | birth                 |
+----+-----------------+-----------------------+
|  1 | Grace Hopper    | 12/9/1906 5:00:00 AM  |
|  2 | Kristen Nygaard | 8/27/1926 4:00:00 AM  |
|  3 | Ole-Johan Dahl  | 10/12/1931 4:00:00 AM |
+----+-----------------+-----------------------+
{no format}
So, even with a warning, rows 2 and 3 have been accepted. Why not the first one?

 
3 years, 11 months ago Olivier Bertrand

Sorry for the typo: the format made by MongoDB is actually YYYY-MM-DDThh:mm:ssZ. It is also what REST queries return when putting a date in their JSON answer.

 
3 years, 11 months ago Jan Steinman

#!/usr/bin/sed -Ef
# Turn ",YYYY-MM-DDThh:mm:ssZ" into ",YYYY-MM-DD hh:mm:ss" (mid-line or at end of line).
s/,([12][0-9][0-9][0-9]-[01][0-9]-[0-3][0-9])T([012][0-9]:[0-5][0-9]:[0-5][0-9])Z(,|$)/,\1 \2\3/

It assumes comma-delimited fields. If your fields are quoted, put a double-quote after the first comma and before the last one.

I'm assuming "Z" has something to do with time zone? Or does it refer to "Zulu time," commonly called GMT? If you need to account for time zone differences, that is a further complication. If out and in have the same time zone, just ignore the zed.

I haven't actually run this. It probably has a syntax error or two. :-)
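If the time zones do differ, the "Z" matters: in ISO 8601 it marks the timestamp as UTC. The Python sketch below shows one way to shift such a value before import. The function name `utc_to_offset` and the fixed hour offset are assumptions for illustration only; a fixed offset ignores daylight saving rules, which a real conversion would have to take from a time zone database.

```python
from datetime import datetime, timedelta, timezone


def utc_to_offset(stamp, offset_hours):
    """Convert YYYY-MM-DDThh:mm:ssZ (UTC) to YYYY-MM-DD hh:mm:ss at a fixed offset.

    offset_hours is a hypothetical fixed offset from UTC, e.g. -5.
    """
    utc_time = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    local_time = utc_time.astimezone(timezone(timedelta(hours=offset_hours)))
    return local_time.strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(utc_to_offset("1906-12-09T05:00:00Z", -5))
```

With an offset of 0 the function just reformats the string, which matches the "ignore the zed" case.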

 