Comments - mariadb is crashing

12 years, 4 months ago Elena Stepanova

Hi,

No, so far I was using a different Linux flavor.

The point is, your server is not just crashing every few minutes, it is crashing on a perfect schedule, like this:

16:00                      
16:05                      18:05
16:10                      18:10
16:15                      18:15
16:20                      18:20
16:25                      18:25
16:30                      18:30
16:35                      18:35
16:40                      18:40
16:45                      18:45
16:50                      18:50
16:55                      18:55
17:00                      19:00
17:10  <-- ATTENTION! -->  19:10
18:00  <-- ATTENTION! -->  20:00

and so on, without fault, on the same 2-hour cycle.
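For anyone who wants to check a schedule like this on their own box, here is a minimal sketch, assuming the restart times have already been pulled out of the error log as HH:MM strings. The times below are simply the first column of the table above, and the function names are hypothetical helpers, not part of any MariaDB tooling.

```python
from collections import Counter
from datetime import datetime


def gaps_in_minutes(times):
    """Return the gaps, in minutes, between consecutive HH:MM timestamps."""
    parsed = [datetime.strptime(t, "%H:%M") for t in times]
    return [int((b - a).total_seconds() // 60) for a, b in zip(parsed, parsed[1:])]


def summarize(times):
    """Return the most common gap and every interval that deviates from it."""
    gaps = gaps_in_minutes(times)
    usual_gap, _ = Counter(gaps).most_common(1)[0]
    irregular = [
        (times[i], times[i + 1], gap)
        for i, gap in enumerate(gaps)
        if gap != usual_gap
    ]
    return usual_gap, irregular


# First column of the table above (assumed to be copied by hand from the log).
crash_times = [
    "16:00", "16:05", "16:10", "16:15", "16:20", "16:25", "16:30",
    "16:35", "16:40", "16:45", "16:50", "16:55", "17:00", "17:10", "18:00",
]

usual_gap, irregular = summarize(crash_times)
print(f"usual gap: {usual_gap} min")
for start, end, gap in irregular:
    print(f"irregular interval: {start} -> {end} ({gap} min)")
```

A steady gap with the same few exceptions repeating in every cycle points to something scheduled rather than to a load-dependent failure.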

There are also connectivity errors in the log that occur with one-second precision, which also suggests that something is scheduled.

I cannot completely rule out the idea that it is OS-specific, but it seems more likely to me that something is happening on the machine(s) or on the DB server(s) where you observe these regular crashes. It might be related to the database, such as scheduled connections, or something completely independent, such as disk maintenance or anything else. The former is easier to rule out: once we see the general log, we will hopefully know more.

One more thing to check is whether you have any events configured in the database itself -- there are none in the SQL file you sent, but maybe they were created separately.

Regards, ---

 
12 years, 4 months ago Scott Feldstein

Yes, this is what I pointed out in my initial comments: "This occurs just about every 5 minutes on the dot, but sometimes it crashes out of band. mysqld_safe continuously restarts it." I also noted that the segfault seems to come from the checkpoint thread.

Here are some things to consider as well:

1) I have been running Percona MySQL on these same servers, using the same disks, and it has been completely stable for several months now.

2) This is occurring on two different servers running the same OS, but not on the server running RHEL 5.2.

3) The stack trace says it is a segfault.

From these I can say with a lot of confidence that nothing is killing the daemon.

I'll get you the required data when my box is accessible again.

 
12 years, 4 months ago Elena Stepanova

There is no doubt it's a crash inside Aria checkpoint execution; we just need to find a way to reproduce it locally. The more we know about what is happening on your servers, the faster we will get there.

The fact that it is not observed on the RHEL 5.2 server does not yet mean the bad memory access is not happening there, only that it does not cause the crash. We might see more on debug binaries when we have the right DML flow to make Aria checkpoints work the same way as they do on your servers. I tried simple INSERTs on your schema, but did not get any valgrind complaints so far, so it's more efficient to use the real pattern than to keep trying random ones.

 