System Troubleshooting MariaDB ColumnStore
Contents
- MariaDB ColumnStore alias commands
- MariaDB ColumnStore support tool
- MariaDB ColumnStore logging
- Enable/Disable Core Files
- MariaDB ColumnStore database files
- MariaDB ColumnStore troubleshooting utilities
- MariaDB ColumnStore Administration multi-node commands
- MariaDB ColumnStore administration misc. commands
- Multi-node install problems and how to diagnose
- Add Module install problems and how to diagnose
MariaDB ColumnStore alias commands
During the installation, these alias commands are defined and placed in the .bashrc file of the install user. This example is from a non-root install:
alias mcsmysql='/home/mariadb-user/mariadb/columnstore/mysql/bin/mysql --defaults-file=/home/mariadb-user/mariadb/columnstore/mysql/my.cnf -u root' alias ma=/home/mariadb-user/mariadb/columnstore/bin/mcsadmin alias mcsadmin=/home/mariadb-user/mariadb/columnstore/bin/mcsadmin alias cpimport=/home/mariadb-user/mariadb/columnstore/bin/cpimport alias home='cd /home/mariadb-user/mariadb/columnstore' alias log='cd /var/log/mariadb/columnstore/' alias dbrm='cd /home/mariadb-user/mariadb/columnstore/data1/systemFiles/dbrm' alias module='cat /home/mariadb-user/mariadb/columnstore/local/module'
mcsmysql - Access the MariaDB ColumnStore MySQL Console ma and mcsadmin - Access the MariaDB ColumnStore Admin Console cpimport - short-cut to run the Bulk Load Process, cpimport home - cd to MariaDB ColumnStore home directory log - cd to MariaDB ColumnStore log directory dbrm - cd to MariaDB ColumnStore DBRM file directory module - outputs the MariaDB ColumnStore local mode name, like 'pm1'
MariaDB ColumnStore support tool
This tool can be executed by users, called "columnstoreSupport", that will generated a report that contains the log files and other system data that is used by MariaDB personnel to help diagnose system related issues and errors within the MariaDB ColumnStore Product.
Here is how to run it:
On a single server:
/usr/local/mariadb/columnstore/bin/columnstoreSupport -a
On a multi-node combo server, run on pm1
/usr/local/mariadb/columnstore/bin/columnstoreSupport -a -p 'user-password'
On a multi-node separate server, run on um1
/usr/local/mariadb/columnstore/bin/columnstoreSupport -a -p 'user-password'
Here is an example of it getting run the the report that is generated:
/usr/local/mariadb/columnstore/bin/columnstoreSupport -a Get software report data for pm1 Get config report data for pm1 Get log report data for pm1 Get log config data for pm1 Get hardware report data for pm1 Get resource report data for pm1 Get dbms report data for pm1 Columnstore Support Script Successfully completed, files located in columnstoreSupportReport.tar.gz
columnstoreSupportReport.tar.gz is what you would provide to MariaDB personnel or attach to a JIRA.
This is what the report consist of:
1. Compressed log files from each module : pm1_logReport.tar.gz a. This is the directory /var/log/mariadb/columnstore from pm1 which will contain:
- system logs for ColumnStore, debug, info, err, warning, and critical
- the alarm logs, alarm.log and activealarmLog
- UI command log, uiCommands.log, which are commands entered into mcsadmin
2. Config report from each module: pm1_configReport.txt. NOTE: on a single server system, pm1 report will contain more configuration data. On a multi-node seperate system um1 report will contain more data
- /etc/tstab
- Server processes - ps command info and top
- System network information
- System configuration information including storage
- System status information at the time the report was run
- System configuration file, Columnstore.xml
3. Hardware report for each module: pm1_hardwareReport.txt
- OS version
- CPU information
- Memory information
- Storage mount information
- Ip address information, ifconfig
4. Resource report for each module: pm1_resourceReport.txt
- Shared memory
- Disk usage
- DBRM files
- Active table locks
- BRM extent map
5. Software report for each module: pm1_softwareReport.txt
- MariaDB ColumnStore software version
6. DBMR report, from the front-end modules: um1_dbmsReport.txt
- MariaDB version
- System catalog and tables
- MariaDB Columnstore usernames
- MariaDB ColumnStore variables
- MariaDB ColumnStore configuration file, my.cnf
- List of active queries at the time the report was run
7. MariaDB ColumnStore MySQL log file: um1_mysqllogReport.tar.gz
MariaDB ColumnStore logging
MariaDB ColumnStore utilizes the install system logging tool, whether its syslog, rsyslog, or ng-syslog. The logs are located in /var/log/mariadb/columnstore. There are these 5 logs:
- crit.log
- err.log
- warning.log
- info.log
- debug.log We also utilize the log rotate tool and by default we are configured to keep 7 days of log files. They are stored in /var/log/mariadb/columnstore/archive.
Also in the /var/log/mariadb/columnstore directory, there are a few other logs that are kept:
- activeAlarms – List of active alarms currently set on the system
- alarm.log – list of all the alarms and associated clear alarms
- uiCommands.log – list of the mcsadmin commands entered
The MariaDB ColumnStore process corefiles would be stored in /var/log/mariadb/columnstore/corefiles, that is if core file dumping is enabled on the system.
MariaDB ColumnStore log files and what goes in them
- Crit, err, and warning used to log problems by a MariaDB ColumnStore Process.
- Info will have logs showing high level actions that are going on in the system. During a stop/startsystem, it will show the high level commands of the process/modules being stopped and started. The bulk-load (cpimport) tool logs is high level actions there also.
- Debug will have the lower level actions from the MariaDB ColumnStore Processes, which will include queries. So all the mysql query commands will be logged here.
MariaDB ColumnStore-MySQL logs are stored here:
/usr/local/mariadb/columnstore/mysql/db/’server-name’.err
NOTE: Other log files will be written to the log directory as well as the /tmp directory from MariaDB ColumnStore process during certain operations. So you will see a few other logs show up in the log directory besides these.
Enable/Disable Core Files
Since core files are very large and can take up a lot of disk space, Core File Generating for MariaDB ColumnStore platform processes are disabled.
- Enable from pm1
# ma shutdownsystem // edit Columnstore.xml set CoreFileFlag to y # ma startsystem
- Disable from pm1
# ma shutdownsystem // edit Columnstore.xml set CoreFileFlag to n # ma startsystem
MariaDB ColumnStore database files
how to enable corefiles: (I will add to the troubleshooting guide also)
- ma shutdownsystem edit Columnstore.xml set CoreFileFlag to y
- ma startsystem
The MariaDB ColumnStore has 3 sets of database files. These files are also always backed up and restored together as part of the backup and restore process.
- MariaDB ColumnStore-MySQL schemes - /usr/local/mariadb/columnstore/mysql/db/*
- MariaDB ColumnStore Database - /usr/local/mariadb/columnstore/dataX/000.dir
- MariaDB ColumnStore DBRM files - /usr/local/mariadb/columnstore/data1/systemFiles/dbrm/*
MariaDB ColumnStore Database: The X represents the DBroot ID #. DBroot is the file directory containing the meta-data. We generally assign 1 DBroot per Performance Module. This DBroot can be pointing to local disk storage area or mounted to external disk. So for a single-server setup, we would have ‘data1’ as the dbroot. If we had a system will 2 Performance modules, then we would have ‘data1’ on PM1 and ‘data2’ on PM2.
MariaDB ColumnStore DBRM files: This is where the Extent Map and Versioning files are located. These make up the DBRM files. There are 3 copies of the files that are keep, one is the current active set and the other 2 are the backup. The current active copy file name is located in this file:
/usr/local/mariadb/columnstore/data1/systemFiles/dbrm/BRM_saves_current
The Extent Map is loaded into shared memory on each of the nodes during the start-system process time. The version in shared memory on the PM1 node is the main copy. Changes are applied to that version in memory. Then changes are made to the disk version that only exist on PM1 disk storage and a copy of those changes are sent out to the other nodes and their memory copies are updated.
NOTE: The following utility can be used to dump the internal memory copy of the Extent Map /usr/local/mariadb/columnstore/bin/editem
: there are a few options with this command, -i dumps a raw copy and -d dumps a formatted copy
MariaDB ColumnStore troubleshooting utilities
Here are a few of the common utilities that are used to view and troubleshoot issues. All of these commands are located in /usr/local/mariadb/columnstore/bin/
- editem – used to view the Extent Map in internal memory, discussed in previous section
- clearShm – used to clear the shared internal memory, used at times after a system-shutdown command is do just to make sure the memory is cleared
- configxml.sh – used to get and set parameters from the system config file, ColumnStore.xml
- ./configxml.sh getconfig ExeMgr1 Port
- ./configxml.sh setconfig ExeMgr1 Port 8601
- dbrmctl – used to display are change the DBRM status
- Example, the dbrm might be in a readonly state, this would unlock
- ./dbrmctl resume
- Example, the dbrm might be in a readonly state, this would unlock
- viewtablelock – will display what tables are locked. There might be times when a DML command might fail and leave a table in a locked state. You can run this command to find which tables are locked, and then use the ‘cleartablelock’ command
- cleartablelock – use to clear a table lock. As explained above, could be used to clear a lock on a table that was left set on a failed command
MariaDB ColumnStore Administration multi-node commands
This quick start guide shows how to bring up a single-node system. MariaDB ColumnStore also supports multi-node systems. From reading the other documents, you will have seen where we can support different configurations.
Some systems will have the User Module functionality on a separate server than the Performance Module functionality. These types of system will have both User Module and Performance Module servers. User Modules is where the MySQL front-end will reside and where the sql commands are entered. The Performance Modules will have access to the meta-data both stored on disk and cached in its internal memory.
Some systems will have the User Module functionality on the same server as the Performance Module functionality. The single-server node is like this and you can also have multiple-nodes using this configuration. Here are some common mcsadmin commands that are used on multi-node systems:
- Adding modules
- Example of adding another Performance Module to the system. This is assuming the server is setup for a MariaDB ColumnStore install, like dependent packages are installed and it will allow a root user login. This command will add 1 performance module to the system and will auto assign its ID. So you might end up with say ‘pm2’ add if you started with only a 1 pm mulit-node-system
- addmodule pm 1 ‘host-name’
- Example of adding another Performance Module to the system. This is assuming the server is setup for a MariaDB ColumnStore install, like dependent packages are installed and it will allow a root user login. This command will add 1 performance module to the system and will auto assign its ID. So you might end up with say ‘pm2’ add if you started with only a 1 pm mulit-node-system
- Adding DBroots
- When a new Performance Module is added, you will need to add a new DBroot to the
system, which will then be assigned to that newly added module. This command would add
1 DBroot and will auto assign the ID. So if the system had just 1 DBroot configured, DBRoot #2 would be added
- adddbroot 1
- When a new Performance Module is added, you will need to add a new DBroot to the
system, which will then be assigned to that newly added module. This command would add
1 DBroot and will auto assign the ID. So if the system had just 1 DBroot configured, DBRoot #2 would be added
- Assigning DBroot
- If you have added a new Performance Module and a new DBroot, then you will need to
assign that Dbroot to the module. You can also assign a new dbroot to an existing module.
- assigndbrootpmconfig 2 pm2
- If you have added a new Performance Module and a new DBroot, then you will need to
assign that Dbroot to the module. You can also assign a new dbroot to an existing module.
- Moving DBroot
- This command might be used if you decided to remove or disable a Performance Module on a system, but you wanted to keep the data in the DBroot that was assigned to the module. You can use this command to move the assignment from one pm to another. This example is moving DBRoot 2 from pm2 to pm1
- movepmdbrootconfig pm2 2 pm1
- NOTE: DBroot #1 is always assigned to Performance Module #1, so this DBroot can’t be manually moved to another Performance Module.
- This command might be used if you decided to remove or disable a Performance Module on a system, but you wanted to keep the data in the DBroot that was assigned to the module. You can use this command to move the assignment from one pm to another. This example is moving DBRoot 2 from pm2 to pm1
- Enabling Module
- One a module has been added, it will be in a disabled state. This means its now configured, but not in-use.
You would run this command to enable a newly added module. If it's a User Module, then that module will be
enabled and started, making it part of the system. If it's a Performance Module, it will only be enabled and set to a Manually Offline state. At this point, you would use the assigndbrootpmconfig or movepmdbrootconfig command to make a DBroot assigned to that module. Then you need to run startSystem to bring online.
- alterSystem-enableModule um2
- One a module has been added, it will be in a disabled state. This means its now configured, but not in-use.
You would run this command to enable a newly added module. If it's a User Module, then that module will be
enabled and started, making it part of the system. If it's a Performance Module, it will only be enabled and set to a Manually Offline state. At this point, you would use the assigndbrootpmconfig or movepmdbrootconfig command to make a DBroot assigned to that module. Then you need to run startSystem to bring online.
- Disabling Module
- You can disable a module. This will put a module in a disabled state, meaning the system will not be utilizing it. But it's still configured and could be enabled back at some point
- alterSystem-disableModule um2
- You can disable a module. This will put a module in a disabled state, meaning the system will not be utilizing it. But it's still configured and could be enabled back at some point
- Removing Module
- You can remove a module. This will de-configure a module in the system. A stopSystem is required to be run first before a remove can be removed.
- removeModule um2
- You can remove a module. This will de-configure a module in the system. A stopSystem is required to be run first before a remove can be removed.
Command sequence to add an User Module using the 'mcsadmin' tool
mcadmin > addModule um2 'host-name' mcadmin > alterSystem-enableModule um2
Command sequence to add a Performance Module using the 'mcsadmin' tool
mcadmin > addModule pm2 'host-name' mcadmin > alterSystem-enableModule pm2 mcadmin > adddbroot 1 mcadmin > assigndbrootpmconfig 2 pm2 mcadmin > startSystem
MariaDB ColumnStore administration misc. commands
- Get List of active SQL Statements
- This command will show the active SQL statements in the system
- getActiveSQLStatements
- This command will show the active SQL statements in the system
- Get Module Configuration
- This command shows the Module Configuration of the system
- getModuleConfig
- This command shows the Module Configuration of the system
- Get MariaDB Columnstore software version
- This command shows the MariaDB Columnstore software version installed
- getSoftwareInfo
- This command shows the MariaDB Columnstore software version installed
- Get Storage Configuration
- This command shows the Storage Configuration of the system
- getStorageConfig
- This command shows the Storage Configuration of the system
- Get System Information
- This command shows the System, Module, and Process Status's of the system
- getSystemInfo
- This command shows the System, Module, and Process Status's of the system
- Get System Network Configuration
- This command shows the Module Network Configuration of the system
- getSystemNetworkConfig
- This command shows the Module Network Configuration of the system
- Get System Network Configuration
- This command shows the Module Resource usage of the system, which includes system and process CPU usage, Memory and Swap usage, and Disk Usage
- getSystemResourceUsage
- This command shows the Module Resource usage of the system, which includes system and process CPU usage, Memory and Swap usage, and Disk Usage
Multi-node install problems and how to diagnose
Once you install the packages on the initial server, pm1, run post-install and postConfigure.
If its failing in the remote server install section, go via the install logs in /tmp to see why the failure occurred. It could be related to these issues:
1. user password or ssh-keys not setup, failing to log in 2. Dependent package isn't installed on a remote server 3. Incompatible OS's between nodes, all have to be the same
If you get to the point where it says Starting system processes, but it seems to hang or not return. Here are some things to check:
1. Check the locale setting on all servers and make sure they are all the same
2. on pm1, create the alias if you haven't already
- . /usr/local/mariadb/columnstore/bin/columnstoreAlias then run following command and check the process status:
- mcsadmin getsysteminfo check if ProcMon is ACTIVE on all configured servers, if not, check the log files on the asscouiated server to see what error ProcMon is reporting. Also make sure the ProcMgr is ACTIVE on pm1.
logs are located in:
/var/log/mariadb/columnstore
generally when ProcMon/ProcMgr isn't active, its because one of these issues: 1. if external storage, an pm /etc/fstab isnt setup 2. message issue between the servers that is causing ProcMon's and ProcMgr to fail to communicate. Make sure all server firewalls are disable along with SElinux.
Add Module install problems and how to diagnose
There are a number of reasons why an addModule command might fail, missing dependent packages, password or ssh key is not setup, etc. Here are some things to investigate when this 'mcsadmin' command does fail.
1. Check the log files on the local node where the command is running. Generally an log in the error, warning, or critical log will be reported when a failure occurs. 2. Depending on how far the addModule command got, another log will be generated in /tmp. Look for a log file with binary_installer, user/performance_installer. This will echo back the installer script the commands it runs, so it would flag an issue in there. 3. Also make sure you can log into the new server/instance from the local one. Not being able to log in would cause a failure. 4. Also depending on how far the command got, it might have added an entry in the system configuration file for the new module. So you will need to check that and if you were adding pm2 and it failed, then it shows up in the system configuration via the 'getsystemn'. You would then need to remove that module before trying the addModule command.