System Troubleshooting MariaDB ColumnStore

You are viewing an old version of this article. View the current version here.

MariaDB ColumnStore alias commands

During the installation, these alias commands are defined and placed in the .bashrc file of the install user. This example is from a non-root install:

alias mcsmysql='/home/mariadb-user/mariadb/columnstore/mysql/bin/mysql --defaults-file=/home/mariadb-user/mariadb/columnstore/mysql/my.cnf -u root'
alias ma=/home/mariadb-user/mariadb/columnstore/bin/mcsadmin
alias mcsadmin=/home/mariadb-user/mariadb/columnstore/bin/mcsadmin
alias cpimport=/home/mariadb-user/mariadb/columnstore/bin/cpimport
alias home='cd /home/mariadb-user/mariadb/columnstore'
alias log='cd /var/log/mariadb/columnstore/'
alias dbrm='cd /home/mariadb-user/mariadb/columnstore/data1/systemFiles/dbrm'
alias module='cat /home/mariadb-user/mariadb/columnstore/local/module'

mcsmysql - Access the MariaDB ColumnStore MySQL Console
ma and mcsadmin - Access the MariaDB ColumnStore Admin Console
cpimport - short-cut to run the Bulk Load Process, cpimport
home - cd to MariaDB ColumnStore home directory
log - cd to MariaDB ColumnStore log directory
dbrm - cd to MariaDB ColumnStore DBRM file directory
module - outputs the MariaDB ColumnStore local mode name, like 'pm1'

MariaDB ColumnStore support tool

This tool can be executed by users, called "columnstoreSupport", that will generated a report that contains the log files and other system data that is used by MariaDB personnel to help diagnose system related issues and errors within the MariaDB ColumnStore Product.

Here is how to run it:

On a single server:

/usr/local/mariadb/columnstore/bin/columnstoreSupport -a

On a multi-node combo server, run on pm1

/usr/local/mariadb/columnstore/bin/columnstoreSupport -a -p 'user-password'

On a multi-node separate server, run on um1

/usr/local/mariadb/columnstore/bin/columnstoreSupport -a -p 'user-password'

Here is an example of it getting run the the report that is generated:

/usr/local/mariadb/columnstore/bin/columnstoreSupport -a
Get software report data for pm1
Get config report data for pm1
Get log report data for pm1
Get log config data for pm1
Get hardware report data for pm1
Get resource report data for pm1
Get dbms report data for pm1

Columnstore Support Script Successfully completed, files located in columnstoreSupportReport.tar.gz

columnstoreSupportReport.tar.gz is what you would provide to MariaDB personnel or attach to a JIRA.

This is what the report consist of:

1. Compressed log files from each module : pm1_logReport.tar.gz a. This is the directory /var/log/mariadb/columnstore from pm1 which will contain:

  • system logs for ColumnStore, debug, info, err, warning, and critical
  • the alarm logs, alarm.log and activealarmLog
  • UI command log, uiCommands.log, which are commands entered into mcsadmin

2. Config report from each module: pm1_configReport.txt. NOTE: on a single server system, pm1 report will contain more configuration data. On a multi-node seperate system um1 report will contain more data

  • /etc/tstab
  • Server processes - ps command info and top
  • System network information
  • System configuration information including storage
  • System status information at the time the report was run
  • System configuration file, Columnstore.xml

3. Hardware report for each module: pm1_hardwareReport.txt

  • OS version
  • CPU information
  • Memory information
  • Storage mount information
  • Ip address information, ifconfig

4. Resource report for each module: pm1_resourceReport.txt

  • Shared memory
  • Disk usage
  • DBRM files
  • Active table locks
  • BRM extent map

5. Software report for each module: pm1_softwareReport.txt

  • MariaDB ColumnStore software version

6. DBMR report, from the front-end modules: um1_dbmsReport.txt

  • MariaDB version
  • System catalog and tables
  • MariaDB Columnstore usernames
  • MariaDB ColumnStore variables
  • MariaDB ColumnStore configuration file, my.cnf
  • List of active queries at the time the report was run

7. MariaDB ColumnStore MySQL log file: um1_mysqllogReport.tar.gz

MariaDB ColumnStore logging

MariaDB ColumnStore utilizes the install system logging tool, whether its syslog, rsyslog, or ng-syslog. The logs are located in /var/log/mariadb/columnstore. There are these 5 logs:

  • crit.log
  • err.log
  • warning.log
  • info.log
  • debug.log We also utilize the log rotate tool and by default we are configured to keep 7 days of log files. They are stored in /var/log/mariadb/columnstore/archive.

Also in the /var/log/mariadb/columnstore directory, there are a few other logs that are kept:

  • activeAlarms – List of active alarms currently set on the system
  • alarm.log – list of all the alarms and associated clear alarms
  • uiCommands.log – list of the mcsadmin commands entered

The MariaDB ColumnStore process corefiles would be stored in /var/log/mariadb/columnstore/corefiles, that is if core file dumping is enabled on the system.

MariaDB ColumnStore log files and what goes in them

  • Crit, err, and warning used to log problems by a MariaDB ColumnStore Process.
  • Info will have logs showing high level actions that are going on in the system. During a stop/startsystem, it will show the high level commands of the process/modules being stopped and started. The bulk-load (cpimport) tool logs is high level actions there also.
  • Debug will have the lower level actions from the MariaDB ColumnStore Processes, which will include queries. So all the mysql query commands will be logged here.

MariaDB ColumnStore-MySQL logs are stored here:

/usr/local/mariadb/columnstore/mysql/db/’server-name’.err

NOTE: Other log files will be written to the log directory as well as the /tmp directory from MariaDB ColumnStore process during certain operations. So you will see a few other logs show up in the log directory besides these.

Enable/Disable Core Files

Since core files are very large and can take up a lot of disk space, Core File Generating for MariaDB ColumnStore platform processes are disabled.

The location of the MariaDB ColumnStore Process corefiles get placed here:

/var/log/mariadb/columnstore/corefiles/

  • Enable from pm1
# ma shutdownsystem
# /usr/local/mariadb/columnstore/bin/setConfig Installation CoreFileFlag y
# ma startsystem
  • Disable from pm1
# ma shutdownsystem
# /usr/local/mariadb/columnstore/bin/setConfig Installation CoreFileFlag n
# ma startsystem

MariaDB ColumnStore database files

The MariaDB ColumnStore has 3 sets of database files. These files are also always backed up and restored together as part of the backup and restore process.

  • MariaDB ColumnStore-MySQL schemes - /usr/local/mariadb/columnstore/mysql/db/*
  • MariaDB ColumnStore Database - /usr/local/mariadb/columnstore/dataX/000.dir
  • MariaDB ColumnStore DBRM files - /usr/local/mariadb/columnstore/data1/systemFiles/dbrm/*

MariaDB ColumnStore Database: The X represents the DBroot ID #. DBroot is the file directory containing the meta-data. We generally assign 1 DBroot per Performance Module. This DBroot can be pointing to local disk storage area or mounted to external disk. So for a single-server setup, we would have ‘data1’ as the dbroot. If we had a system will 2 Performance modules, then we would have ‘data1’ on PM1 and ‘data2’ on PM2.

MariaDB ColumnStore DBRM files: This is where the Extent Map and Versioning files are located. These make up the DBRM files. There are 3 copies of the files that are keep, one is the current active set and the other 2 are the backup. The current active copy file name is located in this file:

/usr/local/mariadb/columnstore/data1/systemFiles/dbrm/BRM_saves_current

The Extent Map is loaded into shared memory on each of the nodes during the start-system process time. The version in shared memory on the PM1 node is the main copy. Changes are applied to that version in memory. Then changes are made to the disk version that only exist on PM1 disk storage and a copy of those changes are sent out to the other nodes and their memory copies are updated.

NOTE: The following utility can be used to dump the internal memory copy of the Extent Map /usr/local/mariadb/columnstore/bin/editem : there are a few options with this command, -i dumps a raw copy and -d dumps a formatted copy

MariaDB ColumnStore troubleshooting utilities

Here are a few of the common utilities that are used to view and troubleshoot issues. All of these commands are located in /usr/local/mariadb/columnstore/bin/

  • editem – used to view the Extent Map in internal memory, discussed in previous section
  • clearShm – used to clear the shared internal memory, used at times after a system-shutdown command is do just to make sure the memory is cleared
  • configxml.sh – used to get and set parameters from the system config file, ColumnStore.xml
    • ./configxml.sh getconfig ExeMgr1 Port
    • ./configxml.sh setconfig ExeMgr1 Port 8601
  • dbrmctl – used to display are change the DBRM status
    • Example, the dbrm might be in a readonly state, this would unlock
      • ./dbrmctl resume
  • viewtablelock – will display what tables are locked. There might be times when a DML command might fail and leave a table in a locked state. You can run this command to find which tables are locked, and then use the ‘cleartablelock’ command
  • cleartablelock – use to clear a table lock. As explained above, could be used to clear a lock on a table that was left set on a failed command

MariaDB ColumnStore Administration multi-node commands

This quick start guide shows how to bring up a single-node system. MariaDB ColumnStore also supports multi-node systems. From reading the other documents, you will have seen where we can support different configurations.

Some systems will have the User Module functionality on a separate server than the Performance Module functionality. These types of system will have both User Module and Performance Module servers. User Modules is where the MySQL front-end will reside and where the sql commands are entered. The Performance Modules will have access to the meta-data both stored on disk and cached in its internal memory.

Some systems will have the User Module functionality on the same server as the Performance Module functionality. The single-server node is like this and you can also have multiple-nodes using this configuration. Here are some common mcsadmin commands that are used on multi-node systems:

  1. Adding modules
    • Example of adding another Performance Module to the system. This is assuming the server is setup for a MariaDB ColumnStore install, like dependent packages are installed and it will allow a root user login. This command will add 1 performance module to the system and will auto assign its ID. So you might end up with say ‘pm2’ add if you started with only a 1 pm mulit-node-system
      • addmodule pm 1 ‘host-name’
  2. Adding DBroots
    • When a new Performance Module is added, you will need to add a new DBroot to the system, which will then be assigned to that newly added module. This command would add 1 DBroot and will auto assign the ID. So if the system had just 1 DBroot configured, DBRoot #2 would be added
      • adddbroot 1
  3. Assigning DBroot
    • If you have added a new Performance Module and a new DBroot, then you will need to assign that Dbroot to the module. You can also assign a new dbroot to an existing module.
      • assigndbrootpmconfig 2 pm2
  4. Moving DBroot
    • This command might be used if you decided to remove or disable a Performance Module on a system, but you wanted to keep the data in the DBroot that was assigned to the module. You can use this command to move the assignment from one pm to another. This example is moving DBRoot 2 from pm2 to pm1
      • movepmdbrootconfig pm2 2 pm1
    • NOTE: DBroot #1 is always assigned to Performance Module #1, so this DBroot can’t be manually moved to another Performance Module.
  5. Enabling Module
    • One a module has been added, it will be in a disabled state. This means its now configured, but not in-use. You would run this command to enable a newly added module. If it's a User Module, then that module will be enabled and started, making it part of the system. If it's a Performance Module, it will only be enabled and set to a Manually Offline state. At this point, you would use the assigndbrootpmconfig or movepmdbrootconfig command to make a DBroot assigned to that module. Then you need to run startSystem to bring online.
      • alterSystem-enableModule um2
  6. Disabling Module
    • You can disable a module. This will put a module in a disabled state, meaning the system will not be utilizing it. But it's still configured and could be enabled back at some point
      • alterSystem-disableModule um2
  7. Removing Module
    • You can remove a module. This will de-configure a module in the system. A stopSystem is required to be run first before a remove can be removed.
      • removeModule um2

Command sequence to add an User Module using the 'mcsadmin' tool

mcadmin > addModule um2 'host-name'
mcadmin > alterSystem-enableModule um2

Command sequence to add a Performance Module using the 'mcsadmin' tool

mcadmin > addModule pm2 'host-name'
mcadmin > alterSystem-enableModule pm2
mcadmin > adddbroot 1
mcadmin > assigndbrootpmconfig 2 pm2
mcadmin > startSystem

MariaDB ColumnStore administration misc. commands

  1. Get List of active SQL Statements
    • This command will show the active SQL statements in the system
      • getActiveSQLStatements
  2. Get Module Configuration
    • This command shows the Module Configuration of the system
      • getModuleConfig
  3. Get MariaDB Columnstore software version
    • This command shows the MariaDB Columnstore software version installed
      • getSoftwareInfo
  4. Get Storage Configuration
    • This command shows the Storage Configuration of the system
      • getStorageConfig
  5. Get System Information
    • This command shows the System, Module, and Process Status's of the system
      • getSystemInfo
  6. Get System Network Configuration
    • This command shows the Module Network Configuration of the system
      • getSystemNetworkConfig
  7. Get System Network Configuration
    • This command shows the Module Resource usage of the system, which includes system and process CPU usage, Memory and Swap usage, and Disk Usage
      • getSystemResourceUsage

Multi-node install problems and how to diagnose

Once you install the packages on the initial server, pm1, run post-install and postConfigure.

If its failing in the remote server install section, go via the install logs in /tmp to see why the failure occurred. It could be related to these issues:

1. user password or ssh-keys not setup, failing to log in

2. Dependent package isn't installed on a remote server

3. Incompatible OS's between nodes, all have to be the same

If you get to the point where it says Starting system processes, but it seems to hang or not return. Here are some things to check:

1. Check the locale setting on all servers and make sure they are all the same

2. on pm1, create the alias if you haven't already

  1. . /usr/local/mariadb/columnstore/bin/columnstoreAlias then run following command and check the process status:
  2. mcsadmin getsysteminfo check if ProcMon is ACTIVE on all configured servers, if not, check the log files on the asscouiated server to see what error ProcMon is reporting. Also make sure the ProcMgr is ACTIVE on pm1.

logs are located in:

/var/log/mariadb/columnstore

generally when ProcMon/ProcMgr isn't active, its because one of these issues:

1. if external storage, an pm /etc/fstab isnt setup

2. message issue between the servers that is causing ProcMon's and ProcMgr to fail to communicate. Make sure all server firewalls are disable along with SElinux.

Add Module install problems and how to diagnose

There are a number of reasons why an addModule command might fail, missing dependent packages, password or ssh key is not setup, etc. Here are some things to investigate when this 'mcsadmin' command does fail.

1. Check the log files on the local node where the command is running. Generally an log in the error, warning, or critical log will be reported when a failure occurs.

2. Depending on how far the addModule command got, another log will be generated in /tmp. Look for a log file with binary_installer, user/performance_installer. This will echo back the installer script the commands it runs, so it would flag an issue in there.

3. Also make sure you can log into the new server/instance from the local one. Not being able to log in would cause a failure.

4. Also depending on how far the command got, it might have added an entry in the system configuration file for the new module. So you will need to check that and if you were adding pm2 and it failed, then it shows up in the system configuration via the 'getsystemn'. You would then need to remove that module before trying the addModule command.

postConfigure install problems and how to diagnose

The installation script, postConfigure, is run at install and upgrade times. The first part of the script takes information from the user and setup the system configuration, which is updating the Columnstore.xml and the ProcessConfig.xml configuration files.

The second part of the script performs a remote install of all of the other servers in the system, which is for a multi-node install configuration. The installation of the remote nodes are done simultaneously and the remote install logs are plaed in /tmp on 'pm1', i.e. "pm1_installer.log". The actualy log file name will be different based on if you are doing an rpm, debian, or binary install. So if postConfigure reports that a failure occurred during the remote server install phase, you can look at these logs in /tmp. The main reasons why this might fail:

1. ssh access to the remote node from pm1 failed, password or ssh setup issues

2. A missing dependency package on the remote node

The third part of postConfigure is the starting up of the system, which consist of starting up of the 'columnstore' service script on each node. If a failure happens during this time frame, do the following to help determine the failure:

1. first, you might need to run this script to get the 'mcsadmin' alias command defined

# . ./usr/local/mariadb/columnstore/bin/aliasColumnstore

2. get the system status's

# mcsadmin getsystemi

3. Here is some things that can point to you why the system didn't come up a. Make sure that all ProcMon processes are active on all nodes. If they aren't, then hear are some of the reasons why they might not be: i. Firewall is enabled on the 'pm1' node are the installing node, check that ii. ProcMon might have run into another issues at startup, like failing to mount to an external disk. So check the log files on the remote server where ProcMon failed to go ACTIVE b. If it reports a module status of FAILED, then check the log files from that module.

4. Also check the log files from the local 'pm1' module.

startSystem problems and how to diagnose

So this is assuming that the system has made it successfully though a postConfigure install or upgrade. At some point, you might need to do a stop or shutdownsystem for some maintenance or some other reason. And then do the startsystem. If any failures occur with the startSystem command, you can check the following:

1. get the system status's

# mcsadmin getsystemi

2. Here is some things that can point to you why the system didn't come up a. Make sure that all ProcMon processes are active on all nodes. If they aren't, then hear are some of the reasons why they might not be: i. Firewall is enabled on the 'pm1' node are the installing node, check that ii. ProcMon might have run into another issues at startup, like failing to mount to an external disk. So check the log files on the remote server where ProcMon failed to go ACTIVE b. If any of the DBRM Processes, Controller or Worker nodes are in a FAILED state, the most likely reason for this the there is an issue with the DBRM files. These fails are loaded from disk into shared-memory. This is load fails, it will mark the DBRM Process as FAILED. If this is the case, then please contact MariaDB Customer Support. The DBRM fails contain the Extent Map and other Metadata related to the MariaDB Columnstore Database Files. c. If it reports a module status of FAILED, then check the log files from that module.

3. Also check the log files from the local 'pm1' module.

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.