Basic Bulk Insert¶
In this example we will insert 1000 rows of two integer values into table test.t1
. The full code for this can be found in the example/basic_bulk_insert.py
file in the mcsapi codebase.
You will need the following table in the test database to execute this:
1 2 3 4 | CREATE TABLE `t1` (
`a` int(11) DEFAULT NULL,
`b` int(11) DEFAULT NULL
) ENGINE=Columnstore;
|
26 | import pymcsapi
|
We need to import pymcsapi
which is the main module to use mcsapi.
28 29 | try:
driver = pymcsapi.ColumnStoreDriver()
|
A new instance of ColumnStoreDriver
is created which will attempt to find the Columnstore.xml
configuration file by first searching for the environment variable COLUMNSTORE_INSTALL_DIR
and then the default path of /usr/local/mariadb/columnstore/etc/Columnstore.xml
. Alternatively we could provide a path as a parameter to ColumnStoreDriver
.
30 | bulk = driver.createBulkInsert("test", "t1", 0, 0)
|
Once we have the ColumnStore installation’s configuration in the driver we use this to initiate a bulk insert using ColumnStoreDriver.createBulkInsert()
. We are using the test
database and the t1
table. The remaining two parameters are unused for now and set to 0
.
31 32 33 34 | for i in range(0,1000):
bulk.setColumn(0, i)
bulk.setColumn(1, 1000-i)
bulk.writeRow()
|
A “for” loop is used to loop over 1000 arbitrary inserts in this example. We use ColumnStoreBulkInsert.setColumn()
to specify that column 0
(column a
) should be set to the integer from the “for” loop and column 1
(column b
) is set to 1000
minus the integer from the “for” loop.
When we have added something to every column ColumnStoreBulkInsert.writeRow()
is used to indicate we are finished with the row. The library won’t necessarily write the row at this stage, it buffers up to 100,000 rows by default.
35 | bulk.commit()
|
At the end of the loop we execute ColumnStoreBulkInsert.commit()
which will send any final rows and initiate the commit of the data. If we do not do this the transaction will be implicitly rolled back instead.
36 37 | except RuntimeError as err:
print("Error caught: %s" % (err,))
|
If anything fails then we should catch RuntimeError
to handle it.