ColumnStoreExporter Object
Methods
generateTableStatement
public String generateTableStatement(DataFrame dataFrame)
Returns a DDL CREATE TABLE statement without a database prefix, based on the schema of the submitted DataFrame. The table name is set to “spark_export”.
Parameters:
- dataFrame – The DataFrame from which the structure of the generated table statement is inferred.
public String generateTableStatement(DataFrame dataFrame, String database)
Returns a DDL CREATE TABLE statement with a database prefix, based on the schema of the submitted DataFrame. The table name is set to “spark_export”.
Parameters:
- dataFrame – The DataFrame from which the structure of the generated table statement is inferred.
- database – The database name used in the generated table statement.
Note
If the submitted database name does not already comply with the ColumnStore naming convention, it is automatically converted to it.
public String generateTableStatement(DataFrame dataFrame, String database, String table)
Returns a DDL CREATE TABLE statement for database.table, based on the schema of the submitted DataFrame.
Parameters:
- dataFrame – The DataFrame from which the structure of the generated table statement is inferred.
- database – The database name used in the generated table statement.
- table – The table name used in the generated table statement.
Note
If the submitted database and table names do not already comply with the ColumnStore naming convention, they are automatically converted to it.
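As a rough illustration of the kind of statement these overloads return, the sketch below assembles a CREATE TABLE statement from an ordered map of column names to SQL types. The class TableStatementSketch, the method createTable, and the example column types are hypothetical and not part of the ColumnStoreExporter API. The real method infers the columns from the DataFrame schema, and its exact output format may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of the ColumnStoreExporter API.
final class TableStatementSketch {

    // Builds "CREATE TABLE [database.]table (col TYPE, ...) ENGINE=columnstore;".
    // A null or empty database omits the database prefix.
    static String createTable(String database, String table, Map<String, String> columns) {
        String qualified = (database == null || database.isEmpty())
                ? table
                : database + "." + table;
        StringJoiner cols = new StringJoiner(", ", "(", ")");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            cols.add(column.getKey() + " " + column.getValue());
        }
        return "CREATE TABLE " + qualified + " " + cols + " ENGINE=columnstore;";
    }

    public static void main(String[] args) {
        // Assumed example schema; a LinkedHashMap keeps the column order.
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "INT");
        columns.put("name", "VARCHAR(64)");
        System.out.println(createTable(null, "spark_export", columns));
        System.out.println(createTable("test", "spark_export", columns));
    }
}
```

The first call corresponds to the overload without a database prefix, the second to the overload that takes a database name.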
public String generateTableStatement(DataFrame dataFrame, String database, String table, bool determineTypeLength)
Returns a DDL CREATE TABLE statement for database.table, based on the schema (and content) of the submitted DataFrame.
Parameters:
- dataFrame – The DataFrame from which the structure of the generated table statement is inferred.
- database – The database name used in the generated table statement.
- table – The table name used in the generated table statement.
- determineTypeLength – If set to true, the content of the DataFrame is analysed to determine the best-fitting SQL data type for each column. Otherwise, default types are used.
Note
If the submitted database and table names do not already comply with the ColumnStore naming convention, they are automatically converted to it.
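The effect of determineTypeLength can be pictured with a small sketch for a single string column: with the flag set, the values are scanned and the column is sized to the longest one; without it, a default type is used. Everything here is an assumption made for illustration, including the class TypeLengthSketch, the TEXT default, and the VARCHAR sizing rule. The types the exporter actually chooses are not specified in this section.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the ColumnStoreExporter API.
final class TypeLengthSketch {

    // Assumed default type for string columns when the content is not analysed.
    static final String DEFAULT_STRING_TYPE = "TEXT";

    // Picks an SQL type for one string column.
    static String stringColumnType(List<String> values, boolean determineTypeLength) {
        if (!determineTypeLength) {
            return DEFAULT_STRING_TYPE;
        }
        int maxLength = 1; // VARCHAR needs a length of at least 1
        for (String value : values) {
            if (value != null) {
                maxLength = Math.max(maxLength, value.length());
            }
        }
        return "VARCHAR(" + maxLength + ")";
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "bartholomew", null);
        System.out.println(stringColumnType(names, true));
        System.out.println(stringColumnType(names, false));
    }
}
```

Analysing the content requires a pass over the data, so the flag trades extra work at statement generation time for tighter column types.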
export
public void export(String database, String table, DataFrame df)
Exports the given DataFrame into an existing ColumnStore database.table using the default Columnstore.xml configuration.
Parameters:
- database – The target database the DataFrame is exported into.
- table – The target table the DataFrame is exported into.
- df – The DataFrame to export.
Note
To guarantee that the DataFrame import into ColumnStore is a single transaction that is rolled back in case of an error, the DataFrame is first collected at the Spark master and written to the ColumnStore system from there. It therefore has to fit into the memory of the Spark master.
Note
The schema of the DataFrame to export and the schema of the target ColumnStore table have to match. Otherwise, the import fails.
public void export(String database, String table, DataFrame df, String configuration)
Exports the given DataFrame into an existing ColumnStore database.table using a specific Columnstore.xml configuration.
Parameters:
- database – The target database the DataFrame is exported into.
- table – The target table the DataFrame is exported into.
- df – The DataFrame to export.
- configuration – Path to the Columnstore.xml configuration to use for the export.
Note
To guarantee that the DataFrame import into ColumnStore is a single transaction that is rolled back in case of an error, the DataFrame is first collected at the Spark master and written to the ColumnStore system from there. It therefore has to fit into the memory of the Spark master.
Note
The schema of the DataFrame to export and the schema of the target ColumnStore table have to match. Otherwise, the import fails.
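The all-or-nothing behaviour described in the notes can be modelled without Spark or ColumnStore. In the sketch below, the already collected rows are staged in memory and only appended to the target if every row is accepted. The class SingleTransactionSketch, the plain List standing in for the table, and the use of a null row to simulate a write error are all assumptions made for illustration, not the exporter's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a single-transaction export; not the exporter's code.
final class SingleTransactionSketch {

    // Writes all rows or none. Returns true on commit, false on rollback.
    static boolean exportAll(List<String> collectedRows, List<String> target) {
        List<String> pending = new ArrayList<>();
        for (String row : collectedRows) {
            if (row == null) {
                // Simulated error: pending rows are dropped, target is untouched.
                return false;
            }
            pending.add(row);
        }
        target.addAll(pending); // commit
        return true;
    }

    public static void main(String[] args) {
        List<String> table = new ArrayList<>();
        boolean failed = exportAll(Arrays.asList("a", null, "c"), table);
        System.out.println(failed + " " + table.size());
        boolean succeeded = exportAll(Arrays.asList("a", "b"), table);
        System.out.println(succeeded + " " + table.size());
    }
}
```

Because all rows pass through one place before the commit, the whole data set has to be held in memory there, which mirrors the memory requirement stated above.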
exportFromWorkers
public void exportFromWorkers(String database, String table, RDD rdd)
Exports the given RDD into an existing ColumnStore database.table from the worker nodes using the default Columnstore.xml configuration.
Parameters:
- database – The target database the RDD is exported into.
- table – The target table the RDD is exported into.
- rdd – The RDD to export.
Note
Each partition of the RDD is imported into ColumnStore as a single transaction. In case of an error, only the partitions in which the error occurred are rolled back. Partitions that are already committed remain in the database.
Note
The schema of the RDD to export and the schema of the target ColumnStore table have to match. Otherwise, the import fails.
public void exportFromWorkers(String database, String table, RDD rdd, List<Int> partitions)
Exports the given partitions of the RDD into an existing ColumnStore database.table from the worker nodes using the default Columnstore.xml configuration.
Parameters:
- database – The target database the RDD is exported into.
- table – The target table the RDD is exported into.
- rdd – The RDD to export.
- partitions – List of the partitions to export, identified by their integer index. If an empty List is submitted, all partitions are exported.
Note
Each partition of the RDD is imported into ColumnStore as a single transaction. In case of an error, only the partitions in which the error occurred are rolled back. Partitions that are already committed remain in the database.
Note
The schema of the RDD to export and the schema of the target ColumnStore table have to match. Otherwise, the import fails.
public void exportFromWorkers(String database, String table, RDD rdd, List<Int> partitions, String configuration)
Exports the given partitions of the RDD into an existing ColumnStore database.table from the worker nodes using a specific Columnstore.xml configuration.
Parameters:
- database – The target database the RDD is exported into.
- table – The target table the RDD is exported into.
- rdd – The RDD to export.
- partitions – List of the partitions to export, identified by their integer index. If an empty List is submitted, all partitions are exported.
- configuration – Path to the Columnstore.xml configuration to use for the export.
Note
Each partition of the RDD is imported into ColumnStore as a single transaction. In case of an error, only the partitions in which the error occurred are rolled back. Partitions that are already committed remain in the database.
Note
The schema of the RDD to export and the schema of the target ColumnStore table have to match. Otherwise, the import fails.
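The per-partition transaction semantics and the meaning of an empty partitions list can be modelled as follows. Each partition is staged and committed on its own, so a failing partition leaves the others in place. The class PartitionExportSketch, the nested lists standing in for RDD partitions, and the null row used to simulate an error are illustrative assumptions. The sketch also runs sequentially, whereas the real export runs on the worker nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of per-partition transactions; not the exporter's code.
final class PartitionExportSketch {

    // Exports the selected partitions (all of them if 'selected' is empty).
    // Returns the indices of the partitions that were committed.
    static List<Integer> exportPartitions(List<List<String>> partitions,
                                          List<Integer> selected,
                                          List<String> target) {
        List<Integer> committed = new ArrayList<>();
        for (int i = 0; i < partitions.size(); i++) {
            if (!selected.isEmpty() && !selected.contains(i)) {
                continue; // partition was not requested
            }
            List<String> pending = new ArrayList<>();
            boolean ok = true;
            for (String row : partitions.get(i)) {
                if (row == null) { // simulated error inside this partition
                    ok = false;
                    break;
                }
                pending.add(row);
            }
            if (ok) {
                target.addAll(pending); // commit this partition only
                committed.add(i);
            } // otherwise the pending rows are dropped: rollback of this partition
        }
        return committed;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c", null), // this partition fails
                Arrays.asList("d"));
        List<String> table = new ArrayList<>();
        System.out.println(exportPartitions(partitions, new ArrayList<>(), table));
        System.out.println(table);
    }
}
```

After a partial failure, only the failed partitions need to be retried, which is one use for the overloads that accept a partitions list.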
parseTableColumnNameToCSConvention
public String parseTableColumnNameToCSConvention(String input)
Parses the input String according to the ColumnStore naming convention and returns the result.
Parameters:
- input – The String to parse.
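This section does not spell out the naming convention itself, so the sketch below only illustrates the general shape of such a normalisation step. All three rules in it are assumptions made for illustration: replacing characters outside letters, digits and underscores, prefixing names that do not start with a letter, and truncating to a maximum length. The class NamingSketch, the "p_" prefix, and the limit of 64 characters are likewise hypothetical and may not match what parseTableColumnNameToCSConvention does.

```java
// Hypothetical normalisation rules; not the actual ColumnStore convention.
final class NamingSketch {

    static final int ASSUMED_MAX_LENGTH = 64; // assumed limit
    static final String ASSUMED_PREFIX = "p_"; // assumed prefix

    static String normalise(String input) {
        // Assumed rule 1: only ASCII letters, digits and underscores are kept.
        String name = input.replaceAll("[^A-Za-z0-9_]", "_");
        // Assumed rule 2: a name must start with a letter.
        if (name.isEmpty() || !Character.isLetter(name.charAt(0))) {
            name = ASSUMED_PREFIX + name;
        }
        // Assumed rule 3: names are limited in length.
        if (name.length() > ASSUMED_MAX_LENGTH) {
            name = name.substring(0, ASSUMED_MAX_LENGTH);
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(normalise("my table"));
        System.out.println(normalise("1col"));
        System.out.println(normalise("valid_name"));
    }
}
```

A name that already satisfies the rules is returned unchanged, which matches the "if not already compatible" wording used in the notes for generateTableStatement.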