
Vicibox stopping every night???

PostPosted: Tue May 11, 2010 6:39 am
by callingcard
It seems to work wonderfully every day (I couldn't get predictive dialing to work, unfortunately, but I will keep working on that).

It is a Vicidial 2.2.0 installation with Asterisk 1.2.30, on Ubuntu.

But the server seems to stop every night. Where should I check?

PostPosted: Tue May 11, 2010 8:01 am
by mflorell
Does it stop at the same time exactly every night?

Is there anything in the crontab at that time?

PostPosted: Tue May 11, 2010 10:10 am
by williamconley
when you post, please post your entire configuration including (but not limited to) your installation method, vicidial version and build, asterisk version, telephony hardware (model number is helpful here), cluster information if you have one, and whether any other software is installed in the box.

this IS a requirement for posting along with reading the stickies (at the top of each forum) and the manager's manual (available on EFLO.net, both free and paid versions)

installation method is fairly important in this question.

a couple guesses: does it hang during the 6:45 am crontab scheduled reboot? is there more than one reboot scheduled?

does it crash when attempting the backup?

is the hard drive over 90% full?
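Two of these guesses (a second scheduled reboot, a nearly full disk) can be checked from a shell. The following is only a minimal sketch: the function names `count_reboot_entries` and `disks_over` are made up for illustration, and the file paths in the usage comments are assumptions about where cron entries might live on this box.

```shell
#!/bin/sh
# Count active (non-commented) cron lines that invoke reboot or shutdown.
# Usage: count_reboot_entries FILE...
count_reboot_entries() {
    # Drop comment lines first, then count the remaining matches.
    cat "$@" 2>/dev/null \
        | grep -v '^[[:space:]]*#' \
        | grep -c -E '(^|[[:space:]/])(reboot|shutdown)([[:space:]]|$)' || true
}

# Print "mountpoint usage" for every filesystem at or above a threshold.
# Reads `df -P` output on stdin; the threshold defaults to 90 (percent).
disks_over() {
    threshold=${1:-90}
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= t) print $6, $5
    }'
}

# Possible usage on the live system (paths are assumptions):
#   crontab -l > /tmp/root.cron
#   count_reboot_entries /tmp/root.cron /etc/crontab
#   df -P | disks_over 90
```

A count above 1 would point at the "more than one reboot scheduled" case. Any output from `disks_over 90` would point at the full-disk case.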

PostPosted: Fri May 14, 2010 12:49 am
by callingcard
Thank you both for getting back to me.

It is a Vicidial 2.2.0 installation with Asterisk 1.2.30, on Ubuntu.

Installed from scratch, then upgraded to 2.2.0 using the instructions from vicibox.

Admin : VERSION: 2.2.0-236
BUILD: 100413-2328

Client : VERSION: 2.2.0-258 BUILD: 100413-1347

No telephony hardware.

I installed Webmin on Ubuntu as additional software.

It seems to stop every night at the same time, and it seems to be related to the 6:45 am reboot.

This is the crontab from the installation that stops every night:

Code: Select all

### reset several temporary-info tables in the database
2 1 * * * /usr/share/astguiclient/AST_reset_mysql_vars.pl

### optimize the database tables within the asterisk database
3 1 * * * /usr/share/astguiclient/AST_DB_optimize.pl

### VICIDIAL agent time log weekly summary report generation
2 0 * * 0 /usr/share/astguiclient/AST_agent_week.pl

### remove old recordings more than 7 days old
#24 0 * * * /usr/bin/find /var/spool/asterisk/monitorDONE -maxdepth 2 -type f -mtime +2 -print | xargs rm -f

### remove old vicidial logs and asterisk logs more than 2 days old
28 0 * * * /usr/bin/find /var/log/astguiclient -maxdepth 1 -type f -mtime +2 -print | xargs rm -f
29 0 * * * /usr/bin/find /var/log/asterisk -maxdepth 3 -type f -mtime +2 -print | xargs rm -f

### Reboot for good measure
45 6 * * * /sbin/reboot
18 1 * * * /etc/webmin/cron/tempdelete.pl





Compared with a stock Vicibox installation that has not been upgraded, the only difference I was able to see was the last line:

Code: Select all
18 1 * * * /etc/webmin/cron/tempdelete.pl

Does the problem reside here?

And this is the crontab from that stock, non-upgraded installation:



root@moon:~# crontab -e
GNU nano 2.0.7 File: /tmp/crontab.DPJJuB/crontab

### reset several temporary-info tables in the database
2 1 * * * /usr/share/astguiclient/AST_reset_mysql_vars.pl

### optimize the database tables within the asterisk database
3 1 * * * /usr/share/astguiclient/AST_DB_optimize.pl

### VICIDIAL agent time log weekly summary report generation
2 0 * * 0 /usr/share/astguiclient/AST_agent_week.pl

### remove old recordings more than 7 days old
#24 0 * * * /usr/bin/find /var/spool/asterisk/monitorDONE -maxdepth 2 -type f -mtime +2 -print | xargs rm -f

### remove old vicidial logs and asterisk logs more than 2 days old
28 0 * * * /usr/bin/find /var/log/astguiclient -maxdepth 1 -type f -mtime +2 -print | xargs rm -f
29 0 * * * /usr/bin/find /var/log/asterisk -maxdepth 3 -type f -mtime +2 -print | xargs rm -f

### Reboot for good measure
45 6 * * * /sbin/reboot
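Comparing the two crontabs by eye is error-prone, so the comparison can be sketched mechanically. This assumes each crontab has first been dumped to a file with `crontab -l`. The function name `cron_diff` and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Print active cron entries that appear in only one of two crontab dumps.
# "<" marks lines only in the first file, ">" lines only in the second.
# Usage: cron_diff FIRST.cron SECOND.cron
cron_diff() {
    a=$(mktemp)
    b=$(mktemp)
    # Ignore comments and blank lines, and sort so ordering does not matter.
    grep -v -E '^[[:space:]]*(#|$)' "$1" | sort > "$a"
    grep -v -E '^[[:space:]]*(#|$)' "$2" | sort > "$b"
    diff "$a" "$b" | grep '^[<>]' || true
    rm -f "$a" "$b"
}

# Possible usage (file names are assumptions):
#   on the failing server:  crontab -l > failing.cron
#   on the working server:  crontab -l > working.cron
#   cron_diff failing.cron working.cron
```

For the two listings posted above, this should report only the Webmin `tempdelete.pl` entry.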

PostPosted: Fri May 14, 2010 1:48 am
by okli
When the server has stopped, in what state do you find it? Powered off, hung?

Check /var/log/messages, /var/log/dmesg and /var/log/asterisk/messages. Look for clues.
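One way to look for clues in those three files is to pull out only the lines that mention a failure or a shutdown. This is a rough sketch: the keyword list is a guess at what might be relevant, not a definitive set, and `scan_logs` is a made-up helper name.

```shell
#!/bin/sh
# Show the last 20 suspicious lines from each readable log file.
# Usage: scan_logs FILE...
scan_logs() {
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "skip: $f (not readable)"
            continue
        fi
        # /dev/null as a second file makes grep prefix each hit with the
        # file name; -n adds the line number within that file.
        grep -i -n -E 'error|fail|panic|oops|segfault|shutdown|reboot' \
            "$f" /dev/null | tail -n 20
    done
}

# Possible usage with the paths named above:
#   scan_logs /var/log/messages /var/log/dmesg /var/log/asterisk/messages
```

If nothing is logged after the shutdown begins, that absence is itself a hint that the hang happens after syslog has already stopped.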

PostPosted: Fri May 14, 2010 8:24 am
by williamconley
my experience with reboot crashes to date has been two possibilities:

1) multiple reboot commands (i.e., the vicidial crontab entry attempts a reboot, another process attempts a reboot before the original reboot has completed all the way to "up", and when the system attempts to "down" in the middle of ... whoa ... it dies)

2) dependency failure (DHCP, cluster with missing master server ...) which only happens at 6:45 am because all servers are rebooting at the same time (in which case one simply reboots all "master" servers 5 minutes early so they are available before the others)

Try changing the time (drop it to 4AM) to see if there is a "coincidence" occurring.

Try changing the time (10 minutes from now!) to see if you can spot the moment of death error (and/or comb the logs for the issue)
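Moving the reboot time means editing the first two fields (minute, hour) of the `/sbin/reboot` line in the crontab. A minimal sketch of doing that without opening an editor follows. `set_reboot_time` is a hypothetical helper, and it assumes the reboot entry has exactly the form shown in the crontab posted earlier in this thread.

```shell
#!/bin/sh
# Rewrite the minute and hour of every active "/sbin/reboot" cron line.
# Reads a crontab on stdin and writes the modified crontab to stdout.
# Usage: set_reboot_time MINUTE HOUR < old.cron > new.cron
set_reboot_time() {
    awk -v m="$1" -v h="$2" '
        /^[[:space:]]*#/ { print; next }          # leave comments untouched
        $6 == "/sbin/reboot" { $1 = m; $2 = h }   # min hour dom mon dow command
        { print }
    '
}

# Possible usage (keeps a backup first; run as the crontab owner):
#   crontab -l > /tmp/cron.bak
#   set_reboot_time 0 4 < /tmp/cron.bak | crontab -
```

For the "10 minutes from now" test, pass the minute and hour of a time shortly ahead of the current clock and watch the console while the reboot runs.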

PostPosted: Tue May 18, 2010 1:23 am
by callingcard
Tried rebooting at 4 am and it did not work.

Tried combing the logs but found nothing there either.

I am sorry but I am clueless here. It seems that everything is happening around the power button press.

Everything stops at "Unmounting local filesystems"... then nothing.

On a similar server (not upgraded to 2.2.0) the next message is: "Will now restart"

Please, what am I missing?


Code: Select all
May 18 09:09:48 soleil kernel: [   21.390583] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
May 18 09:09:48 soleil kernel: [   21.411814] Linux agpgart interface v0.102
May 18 09:09:48 soleil kernel: [   21.523854] input: Power Button (FF) as /devices/virtual/input/input3
May 18 09:09:48 soleil kernel: [   21.600071] ACPI: Power Button (FF) [PWRF]
May 18 09:09:48 soleil kernel: [   21.600176] input: Power Button (CM) as /devices/virtual/input/input4
May 18 09:09:48 soleil kernel: [   21.600397] agpgart: Detected SiS chipset - id:1633
May 18 09:09:48 soleil kernel: [   21.605611] agpgart: AGP aperture is 64M @ 0xe8000000
May 18 09:09:48 soleil kernel: [   21.669942] ACPI: Power Button (CM) [PWRB]
May 18 09:09:48 soleil kernel: [   21.735512] input: ImPS/2 Generic Wheel Mouse as /devices/platform/i8042/serio1/input/input5

PostPosted: Tue May 18, 2010 1:26 am
by callingcard
I find it hung.

okli wrote: When the server has stopped, in what state do you find it? Powered off, hung?

Check /var/log/messages, /var/log/dmesg and /var/log/asterisk/messages. Look for clues.

PostPosted: Tue May 18, 2010 8:37 am
by williamconley
your issue will "likely" be acpi

either drivers are wrong (which happens) or some function refuses to quit which cannot be severed (also happens) or there is a setting in your BIOS that is wrong for this configuration

consider making SURE that everything running on this box has a shutdown procedure registered and is OFF before the umount

silly question: why not a standard .iso installation? (that would give you optimized ubuntu for vicidial, and i've never had reboot problems with Vicibox with the sole exception of one client who accidentally had TWO reboots in his cron, LOL, and his system would attempt 2nd reboot before fully starting, then crash as a result)
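The advice above about making sure everything is OFF before the umount can be partly checked by looking at which services have a kill link in the reboot runlevel. This sketch assumes a SysV-style layout where runlevel 6 scripts live in `/etc/rc6.d` as `K??name` links. The helper name `missing_kill_links` and the service names in the usage comment are assumptions, not a list taken from this server.

```shell
#!/bin/sh
# Report which of the named services lack a K?? (kill) link in a
# runlevel directory, i.e. would not be stopped by init on reboot.
# Usage: missing_kill_links RCDIR SERVICE...
missing_kill_links() {
    rcdir=$1
    shift
    for svc in "$@"; do
        # Kill links look like K20mysql -> ../init.d/mysql
        ls "$rcdir"/K[0-9][0-9]"$svc" >/dev/null 2>&1 || echo "$svc"
    done
}

# Possible usage (service names are guesses):
#   missing_kill_links /etc/rc6.d mysql apache2 webmin
```

Any service printed here has no registered stop script for reboot. If such a process keeps files open on a local filesystem, it could be a candidate for whatever blocks "Unmounting local filesystems".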