Basic Bulk Insert

In this example we will insert 1000 rows of two integer values into table test.t1. The full code for this can be found in the example/basic_bulk_insert.py file in the mcsapi codebase.

You will need the following table in the test database to execute this:

example/basic_bulk_insert.sql
1
2
3
4
CREATE TABLE `t1` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL
) ENGINE=Columnstore;
example/basic_bulk_insert.py
26
import pymcsapi

We need to import pymcsapi which is the main module to use mcsapi.

example/basic_bulk_insert.py
28
29
try:
    driver = pymcsapi.ColumnStoreDriver()

A new instance of ColumnStoreDriver is created which will attempt to find the Columnstore.xml configuration file by first searching for the environment variable COLUMNSTORE_INSTALL_DIR and then the default path of /usr/local/mariadb/columnstore/etc/Columnstore.xml. Alternatively we could provide a path as a parameter to ColumnStoreDriver.

example/basic_bulk_insert.py
30
    bulk = driver.createBulkInsert("test", "t1", 0, 0)

Once we have the ColumnStore installation’s configuration in the driver we use this to initiate a bulk insert using ColumnStoreDriver.createBulkInsert(). We are using the test database and the t1 table. The remaining two parameters are unused for now and set to 0.

example/basic_bulk_insert.py
31
32
33
34
    for i in range(0,1000):
        bulk.setColumn(0, i)
        bulk.setColumn(1, 1000-i)
        bulk.writeRow()

A “for” loop is used to loop over 1000 arbitrary inserts in this example. We use ColumnStoreBulkInsert.setColumn() to specify that column 0 (column a) should be set to the integer from the “for” loop and column 1 (column b) is set to 1000 minus the integer from the “for” loop.

When we have added something to every column ColumnStoreBulkInsert.writeRow() is used to indicate we are finished with the row. The library won’t necessarily write the row at this stage, it buffers up to 100,000 rows by default.

example/basic_bulk_insert.py
35
    bulk.commit()

At the end of the loop we execute ColumnStoreBulkInsert.commit() which will send any final rows and initiate the commit of the data. If we do not do this the transaction will be implicitly rolled back instead.

example/basic_bulk_insert.py
36
37
except RuntimeError as err:
    print("Error caught: %s" % (err,))

If anything fails then we should catch RuntimeError to handle it.