Advanced Bulk Insert¶
In this example we will insert 2 rows in a more complex table. This will demonstrate using different kinds of data types, chanined methods and getting the summary information at the end of a transaction.
You will need the following table in the test database to execute this:
1 2 3 4 5 6 7 | CREATE TABLE `t2` (
`id` int(11) DEFAULT NULL,
`name` varchar(40) DEFAULT NULL,
`dob` date DEFAULT NULL,
`added` datetime DEFAULT NULL,
`salary` decimal(9,2) DEFAULT NULL
) ENGINE=Columnstore;
|
26 27 28 29 30 | import pymcsapi
try:
driver = pymcsapi.ColumnStoreDriver()
bulk = driver.createBulkInsert("test", "t2", 0, 0)
|
As with the basic example we create an instance of the driver and use it to create a bulk insert instance.
31 32 33 34 35 36 | bulk.setColumn(0, 1)
bulk.setColumn(1, "Andrew")
bulk.setColumn(2, "1936-12-24")
bulk.setColumn(3, "2017-07-07 15:14:12")
bulk.setColumn(4, "15239.45");
bulk.writeRow()
|
This demonstrates setting several different data types using strings of data. The second column (column 1
) is a VARCHAR(40)
and is set to “Andrew”. The third column is a DATE
column and the API will automatically convert this into a binary date format before transmitting it to ColumnStore. The fourth is a DATETIME
and the fifth a DECIMAL
column which again the API will convert from the strings into the binary format.
37 | bulk.setColumn(0, 2)[0].setColumn(1, "David")[0].setColumn(2, pymcsapi.ColumnStoreDateTime(1972, 5, 23))[0].setColumn(3, pymcsapi.ColumnStoreDateTime(2017, 7, 7, 15, 20, 18))[0].setColumn(4, pymcsapi.ColumnStoreDecimal(2347623, 2))[0].writeRow()
|
Many of the ColumnStoreBulkInsert
methods return a pointer to the class and a return status which means multiple calls can be chained together in a similar way to ORM APIs. Here we use additional datatypes ColumnStoreDateTime
and ColumnStoreDecimal
.
ColumnStoreDateTime
is used to create an entry for a DATE
or DATETIME
column. It can be used to define custom formats for dates and times using the strptime format.
A decimal is created using the ColumnStoreDecimal
class. It can be set using a string, double
or a pair of integers. The first integer is the precision and the second integer is the scale. So this number becomes 23476.23
.
38 39 40 41 42 43 44 | bulk.commit()
summary = bulk.getSummary()
print("Execution time: %s" % (summary.getExecutionTime(),))
print("Rows inserted: %s" % (summary.getRowsInsertedCount(),))
print("Truncation count: %s" %(summary.getTruncationCount(),))
print("Saturated count: %s" %(summary.getSaturatedCount(),))
print("Invalid count: %s" %(summary.getInvalidCount(),))
|
After a commit or rollback we can obtain summary information from the bulk insert class. This is done using the ColumnStoreBulkInsert.getSummary()
method which will return a reference ColumnStoreSummary
class. In this example we get the number of rows inserted (or would be inserted if there was a rollback) and the execution time from the moment the bulk insert class is created until the commit or rollback is complete.
45 46 | except RuntimeError as err:
print("Error caught: %s" % (err,))
|
At the end we clean up in the same was as the basic bulk insert example.