mod_webapp "Communitcation interrupted" error message


    From time to time my web applications became unavailable. When the problem occurred, my
web browser just kept trying to access the servlets and JSPs, but it always ended with that spooky
time-out error message or said that the page could not be displayed.
At regular intervals I had to restart the Tomcat and Apache daemons to make my applications available
again.

    I found these messages:

[Tue Oct 21 15:45:57 2003] [error] Communitcation interrupted
[Tue Oct 21 16:39:07 2003] [error] Communitcation interrupted
[Tue Oct 21 16:41:35 2003] [error] Communitcation interrupted
[Tue Oct 21 16:41:42 2003] [error] Communitcation interrupted
[Tue Oct 21 16:41:48 2003] [error] Communitcation interrupted
[Tue Oct 21 16:41:48 2003] [error] Communitcation interrupted
[Tue Oct 21 16:41:53 2003] [error] Communitcation interrupted
[Tue Oct 21 16:42:19 2003] [error] Communitcation interrupted

in the Apache error log files. I'm using Tomcat 4.0.4, Apache 1.3.23 and the mod_webapp module. Both
run on Slackware 7.1 servers (by the way, I love Linux).
I spent a lot of time searching the web for the cause of this problem and found lots of pages, but
all of them just report the problem without giving any solution.

    I had used these components (Apache, Tomcat and mod_webapp) in some applications before without
running into this problem. In those successful cases, Apache and Tomcat ran on the same machine. This
gave me a hint about the cause of the problem.
    When the problem occurred, Apache and Tomcat were running on different servers, and between these
servers there was a firewall that shuts down idle connections.
    It seems that mod_webapp opens several TCP connections to Tomcat and never shuts them down. This
may be the cause of the problem: if any network element between the Apache and
Tomcat servers closes idle connections, sooner or later it will close all the connections
between the servers, and the communication between Apache and Tomcat will be interrupted.


The Solution

    To solve this problem I had to modify the source code of the mod_webapp module and two Linux kernel parameters. I
activated the TCP KEEPALIVE option on all sockets created in the source file webapp-module-1.0-tc40/apr/network_io/unix/sockets.c
(feel free to request the modified sockets.c source file by e-mail, but I'm not responsible for its use or for any injury, damage or loss of any
kind it may cause to you, your company, kids, pets, whatever... Also, I'm still testing this modified code and I don't know whether the changes I made
indeed solved the problem, but so far it's working pretty well).

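For example:
/* verdadeiro ("true" in Portuguese) must be an int; a non-zero value enables the option. */
verdadeiro = 1;
if (setsockopt((*new)->socketdes, SOL_SOCKET, SO_KEEPALIVE, (char *)&verdadeiro,
               sizeof(verdadeiro)) < 0) {
    /* The socket stays usable, but without keepalive probes. */
    fprintf(stderr, "Failed to enable SO_KEEPALIVE on the socket.\n");
}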
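
The fragment above depends on APR internals (the "new" socket structure), so it cannot be compiled on its own. As a minimal standalone sketch of the same idea, assuming a plain POSIX socket descriptor instead of the APR structure, the following C code enables SO_KEEPALIVE and reads the option back to confirm that it took effect. The helper names enable_keepalive and keepalive_enabled are hypothetical; they are not part of mod_webapp or APR.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of mod_webapp or APR): turn on SO_KEEPALIVE
 * for an already created socket. Returns 0 on success, -1 on failure. */
int enable_keepalive(int fd)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, (char *)&on, sizeof(on)) < 0) {
        perror("setsockopt(SO_KEEPALIVE)");
        return -1;
    }
    return 0;
}

/* Hypothetical helper: read the option back.
 * Returns 1 if keepalive is enabled, 0 if it is not, -1 on error. */
int keepalive_enabled(int fd)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, (char *)&value, &len) < 0) {
        perror("getsockopt(SO_KEEPALIVE)");
        return -1;
    }
    return value != 0;
}
```

Calling enable_keepalive(fd) right after the socket is created should be enough; how soon the kernel actually starts sending probes is decided by the kernel parameters, not by this call.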

I also had to change the kernel parameter values in the special files /proc/sys/net/ipv4/tcp_keepalive_time and
/proc/sys/net/ipv4/tcp_max_ka_probes on my web server, so that the KEEPALIVE packets are sent at an interval
shorter than the idle timeout configured in the firewall between my servers.
(This solution WILL NOT work if the KEEPALIVE packets are sent at an interval longer than the idle time after which the firewall
shuts down connections.)
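
The condition above is simple arithmetic, so it can be checked before relying on the fix. The sketch below is only an illustration: the function names are hypothetical, and the firewall idle timeout has to be supplied by you, since the kernel knows nothing about it.

```c
#include <stdio.h>

/* Hypothetical helper: read a single integer from a /proc/sys file,
 * e.g. /proc/sys/net/ipv4/tcp_keepalive_time.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
int read_sysctl_long(const char *path, long *value)
{
    FILE *f = fopen(path, "r");
    int parsed;

    if (f == NULL) {
        return -1;
    }
    parsed = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return parsed ? 0 : -1;
}

/* Hypothetical helper: returns 1 when the keepalive time (seconds) is
 * strictly shorter than the firewall idle timeout (seconds), 0 otherwise.
 * Equal values count as failure, because the firewall may win the race. */
int keepalive_beats_firewall(long keepalive_time_s, long firewall_idle_s)
{
    return keepalive_time_s > 0 && keepalive_time_s < firewall_idle_s;
}
```

For instance, with an assumed firewall idle timeout of 300 seconds, keepalive_beats_firewall(7200, 300) returns 0, while keepalive_beats_firewall(10, 300) returns 1.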

The following table shows the old and new values of the kernel parameters:

Config File           Old (default) Value   New Value
tcp_keepalive_time    7200                  10
tcp_max_ka_probes     5                     1024

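The first parameter in the table defines how long, in seconds, a connection may stay idle before the kernel starts
sending KEEPALIVE packets on it. The second one limits how many KEEPALIVE probes the kernel sends on each run of its
keepalive timer, which in practice limits how many sockets can use the KEEPALIVE option.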

If you try this, remember that values written to /proc are lost at reboot, so you have to set them again each time the machine boots.
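
Because the values do not survive a reboot, one option is a tiny program run as root from a boot script (on Slackware, for instance, /etc/rc.d/rc.local). The sketch below is one possible way to do it in C, not the method used in the setup described here. The function names are hypothetical, the values 10 and 1024 are the ones from the table above, and tcp_max_ka_probes is assumed to exist, which may not be true on kernels newer than the one used here.

```c
#include <stdio.h>

/* Hypothetical helper: write one integer value to a /proc/sys file.
 * Returns 0 on success, -1 on failure (missing file, no permission...). */
int write_sysctl_long(const char *path, long value)
{
    FILE *f = fopen(path, "w");

    if (f == NULL) {
        perror(path);
        return -1;
    }
    if (fprintf(f, "%ld\n", value) < 0) {
        perror(path);
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0) {
        perror(path);
        return -1;
    }
    return 0;
}

/* Hypothetical helper: apply both settings from the table. "dir" is the
 * directory holding the parameter files, normally /proc/sys/net/ipv4.
 * Returns 0 if both writes succeed, -1 if either of them fails. */
int apply_keepalive_settings(const char *dir)
{
    char path[256];
    int rc = 0;

    snprintf(path, sizeof(path), "%s/tcp_keepalive_time", dir);
    if (write_sysctl_long(path, 10) != 0) {
        rc = -1;
    }
    snprintf(path, sizeof(path), "%s/tcp_max_ka_probes", dir);
    if (write_sysctl_long(path, 1024) != 0) {
        rc = -1;
    }
    return rc;
}
```

A main() that simply returns apply_keepalive_settings("/proc/sys/net/ipv4") would be enough to call from the boot script. Taking the directory as a parameter also makes it easy to try the code against an ordinary directory before touching /proc.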

Well, I hope this document is useful to someone...


Wagner Santana - October 22nd, 2003
Systems Engineer/Software Developer

If you think I can help you, I can be reached at:

wagner@nlink.com.br
wagner.santana@timnordeste.com.br

wagnersantana (Yahoo Messenger Id)
wagnersantana1974@hotmail.com (MSN Messenger Id only - please, don't send e-mail to this address)
5603781 (ICQ Id)