Bug 38788 - problems with ip masquerading flooding destination addresses with packets
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Platform: i386 Linux
Severity: medium
Assigned To: David Miller
QA Contact: Brock Organ
Reported: 2001-05-02 12:14 EDT by Need Real Name
Modified: 2007-04-18 12:32 EDT

Doc Type: Bug Fix
Last Closed: 2003-06-06 08:08:38 EDT


Attachments
tcpdump -i eth0 on atlas after applying new ipchains rules (8.33 KB, text/plain), 2001-05-03 09:38 EDT, Need Real Name
tcpdump -i eth0 on blizzard after applying new ipchains rules (7.53 KB, text/plain), 2001-05-03 09:39 EDT, Need Real Name

Description Need Real Name 2001-05-02 12:14:26 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.77 [en] (X11; U; Linux 2.2.16-22 i686)


LAN architecture for relevant PCs:

192.168.0.1 - blizzard (RH 7.1 gateway machine allowing IP masquerading
over a dial-up ppp connection to the internet)
192.168.0.5 - typhoon (RH 7.1 work station)
192.168.0.7 - atlas (RH 7.0 work station)

Two days ago I cleanly installed Red Hat 7.1 on a gateway machine (blizzard)
for our LAN, which was previously running RH 7.0.

I am having problems with NAT/IP masquerading. When browsing web-sites from
either of the workstations I notice that every now and again the modem and
hub go wild (the modem send and receive lights keep flickering and the hub
collision lights for the gateway machine and the workstation which is
browsing the web flicker.) The available bandwidth for browsing the
internet drops substantially, slowing access.

Doing:
[root@atlas /root]# tcpdump -i eth0

on a workstation when this is happening gives:

15:56:12.594199 < www.linuxformat.com.www > atlas.oaklea.intranet.1656: .
0:0(0) ack 1 win 8760 <nop,nop,timestamp 5142741 1336738> (DF)
15:56:12.594309 > atlas.oaklea.intranet.1656 > www.linuxformat.com.www: .
14257264:14257264(0) ack 4294796990 win 31856 <nop,nop,timestamp 1456158
5140397> (DF)
15:56:12.594537 < www.linuxformat.com.www > atlas.oaklea.intranet.1656: .
0:0(0) ack 1 win 8760 <nop,nop,timestamp 5142741 1336738> (DF)
15:56:12.594597 > atlas.oaklea.intranet.1656 > www.linuxformat.com.www: .
14257264:14257264(0) ack 4294796990 win 31856 <nop,nop,timestamp 1456158
5140397> (DF)
15:56:12.604243 < www.linuxformat.com.www > atlas.oaklea.intranet.1656: .
0:0(0) ack 1 win 8760 <nop,nop,timestamp 5142742 1336738> (DF)
15:56:12.604494 > atlas.oaklea.intranet.1656 > www.linuxformat.com.www: .
14257264:14257264(0) ack 4294796990 win 31856 <nop,nop,timestamp 1456159
5140397> (DF)
15:56:12.604450 < www.linuxformat.com.www > atlas.oaklea.intranet.1656: .
0:0(0) ack 1 win 8760 <nop,nop,timestamp 5142742 1336738> (DF)
15:56:12.604623 > atlas.oaklea.intranet.1656 > www.linuxformat.com.www: .
14257264:14257264(0) ack 4294796990 win 31856 <nop,nop,timestamp 1456159
5140397> (DF)
15:56:12.614177 < www.linuxformat.com.www > atlas.oaklea.intranet.1656: .
0:0(0) ack 1 win 8760 <nop,nop,timestamp 5142742 1336738> (DF)
15:56:12.614415 > atlas.oaklea.intranet.1656 > www.linuxformat.com.www: .
14257264:14257264(0) ack 4294796990 win 31856 <nop,nop,timestamp 1456160
5140397> (DF)
15:56:12.614367 < www.linuxformat.com.www > atlas.oaklea.intranet.1656: .
0:0(0) ack 1 win 8760 <nop,nop,timestamp 5142742 1336738> (DF)

ps -ef lists nothing out of the ordinary.

The only way to stop it is to do:

service ipchains restart

Unplugging the network cable for the relevant work station stops the
packets, however, upon plugging the cable back in again it immediately
starts up again.

It appears to happen at random! You can be browsing sites for 5 minutes and
then all of a sudden it starts up. It happens very often, though.

At first I thought that the box had been cracked; however, after checking,
that does not appear to be the case. In any case, the box has only been
running the new RH 7.1 install for a couple of days and only connects when
someone sends email or wants to browse the web.

I am using IPChains as the firewall on blizzard. This is the same script
that was used when this machine was running RH7.0 and is installed in
/etc/sysconfig/ipchains. I do get an error in /var/log/messages about the
masquerading timeouts already being set by the kernel, but otherwise I have
not been able to spot any other errors.

The current IPChains script:

:input ACCEPT
:forward ACCEPT
:output ACCEPT
-M -S 7200 10 160
-P forward DENY
-A input -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 23:23 -i ppp0 -p 6 -j DENY -l
-A input -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 21:21 -i ppp0 -p 6 -j DENY -l
-A input -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 110:110 -i ppp0 -p 6 -j DENY -l
-A input -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 98:98 -i ppp0 -p 6 -j DENY -l
-A input -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 98:98 -i ppp0 -p 17 -j DENY -l
-A input -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 30:30 -i ppp0 -p 1 -j DENY -l
-A input -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 79:79 -i ppp0 -p 6 -j DENY -l
-A input -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 113:113 -i ppp0 -p 6 -j DENY
-A input -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 111:111 -i ppp0 -p 6 -j DENY -l
-A input -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 111:111 -i ppp0 -p 17 -j DENY -l
-A output -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 111:111 -i ppp0 -p 6 -j DENY -l
-A output -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 111:111 -i ppp0 -p 17 -j DENY -l
-A input -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 137:139 -i ppp0 -p 17 -j DENY -l
-A input -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 137:139 -i ppp0 -p 6 -j DENY -l
-A output -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 137:139 -i ppp0 -p 17 -j DENY -l
-A output -s 0.0.0.0/0.0.0.0 -d 0.0.0.0/0.0.0.0 137:139 -i ppp0 -p 6 -j DENY -l
-A forward -s 192.168.0.0/255.255.255.0 -d 0.0.0.0/0.0.0.0 -j MASQ
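
[Editorial sketch, not part of the original report: under the 2.4 netfilter/iptables stack the core of this policy would look roughly like the following. Interface and network names are taken from the report above; since iptables has no combined deny-and-log target, each logged DENY becomes a LOG rule followed by a DROP rule. This is an approximation for comparison, not a verified translation of every rule.]

```shell
# Approximate iptables equivalent (assumes a 2.4 kernel with the
# iptable_nat module available; names taken from the report above).

# Drop and log the same inbound TCP service ports on the dial-up link.
for p in 23 21 110 98 79 111; do
    iptables -A INPUT -i ppp0 -p tcp --dport $p -j LOG
    iptables -A INPUT -i ppp0 -p tcp --dport $p -j DROP
done
iptables -A INPUT -i ppp0 -p tcp --dport 113 -j DROP       # identd, not logged
iptables -A INPUT -i ppp0 -p udp --dport 98  -j DROP
iptables -A INPUT -i ppp0 -p udp --dport 111 -j DROP
iptables -A INPUT -i ppp0 -p tcp --dport 137:139 -j DROP   # NetBIOS
iptables -A INPUT -i ppp0 -p udp --dport 137:139 -j DROP

# Masquerade the internal LAN out over the ppp link.
iptables -t nat -A POSTROUTING -s 192.168.0.0/24 -o ppp0 -j MASQUERADE
```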

Reproducible: Sometimes
Steps to Reproduce:
1. Enable masquerading and ppp on the machine.
2. Set this machine as the gateway for all other workstations.
	

Actual Results:  As per description. This is quite unusual and I have been
unable to find anyone else reporting similar problems :(

Expected Results:  Normal masquerading. I have been running a RH gateway in
this fashion with these machines for over a year now with previous versions
of Red Hat and have never seen anything like this.

I changed the cabling to ensure it is not at fault, and it has made no
difference. It still occurs with either workstation.
Comment 1 Need Real Name 2001-05-02 13:32:30 EDT
I should add that I am using 3Com NICs in the gateway machine. I have tried a
3C905 TX-M using the 3c59x module and a spare 3c515; both have the same problem.
Comment 2 David Miller 2001-05-02 19:15:55 EDT
Is blizzard an SMP machine?
Comment 3 Need Real Name 2001-05-02 20:03:26 EDT
No, it has just a single, humble Intel Pentium 75 processor.

I have been unable to reproduce these problems when the client machines are both
running versions of MS Windows (98 and 2000 have been tried). This is just so
strange!

The IP addresses of the client machines have not been changed and the subnet
masks are correct, they also do not clash with any other IPs.

The only thing that has changed recently is the upgrade of blizzard to RH 7.1
with the standard RH 2.4.2-2 kernel. (This kernel has not been recompiled.)

Tomorrow, I'll try the Red Hat 7.0 kernel (2.2.19-7) on blizzard to see if the
problem goes away.
Comment 4 David Miller 2001-05-03 02:38:57 EDT
Can you give this a try?  Change your:

-A forward -s 192.168.0.0/24 -j MASQ

rule into these two rules:

-A forward -s 192.168.0.0/24 ! -p tcp -j MASQ
-A forward -s 192.168.0.0/24 -p tcp -y -j MASQ

Let me know if this makes any difference in behavior.
Comment 5 Need Real Name 2001-05-03 09:35:12 EDT
Applying those rules in place of

-A forward -s 192.168.0.0/24 -j MASQ

does change the behaviour, however the problem is not resolved.

What appears to happen now is that packets are sent out from the network from
atlas, however, incoming packets destined for masqed machines do not appear to
get beyond blizzard.

I set tcpdump to listen on eth0 on blizzard and atlas at the same time. And have
attached the 2 files to this form. (I've just noticed that the time on atlas is
a bit out).

One thing that I haven't mentioned which may or may not be of help is that there
is an additional IP aliased to eth0 on blizzard. eth0:0 is 192.168.0.10 and is
used for IP based virtual hosting in apache for an intranet CGI script.

Also, I have tried 2.2.19 again and have not been able to replicate this packet
flooding problem. I'll be using that for the time being; however, I am more than
happy to apply any other changes in order to resolve this problem with 2.4.

Many thanks.
Comment 6 Need Real Name 2001-05-03 09:38:33 EDT
Created attachment 17226 [details]
tcpdump -i eth0 on atlas after applying new ipchains rules
Comment 7 Need Real Name 2001-05-03 09:39:57 EDT
Created attachment 17227 [details]
tcpdump -i eth0 on blizzard after applying new ipchains rules
Comment 8 Ian Koenig 2001-05-28 23:13:40 EDT
I have noticed similar conditions on my box.

I upgraded from 7.0 to 7.1 (an upgrade, not a clean install) and have run into
"slowdowns" on the gateway.

The configuration:
Cable Modem --> NIC_A on server/gateway --> NIC_B on server/gateway --> Internal
NAT LAN

On the server I am running RH7.1 with ipchains presently.   I have not changed
anything in the ipchains configuration from the RH7.0 setup.

Symptoms:
Since upgrading (yesterday morning) I have run into a situation where we have
inconsistent connections to the outside internet, HTTP page download stalls,
and an inability to reach or connect to some hosts (which I know are up from
talking with other people).

What can I look at or send or attach to be able to assist in this issue?
Comment 9 Ian Koenig 2001-05-30 17:49:16 EDT
During the major pain of moving from ipchains to iptables I noticed some
symptoms that told me that ipchains does not work the same between 2.2 and 2.4.

The primary example was the gateway box that runs the firewall is named dump.  
It has two interfaces (eth0 --> Internet & eth1 --> Internal LAN) which connect 
the internal to the external.  

With no changes to the ipchains script... 
eth1 was no longer accepting / allowing DHCP packets to come into it.  
Some of the ports that were closed on eth0 (137:139) were now open on v2.4.
About every 5 minutes client machines in the internal LAN would either lose the
connection that they had open or the connection would simply stop accepting
data (i.e. stalled downloads in Netscape or IE).

I have since moved to iptables and have all the old scripts still available if 
you want to see them or use them for testing.  I will be happy to supply more 
information if you want or need.
Comment 10 Ian Koenig 2001-06-13 10:11:01 EDT
Much of the headbashing I've been doing over the last couple of weeks over
the ipchains/iptables problem has been resolved by moving up to iptables 1.2.2
from iptables 1.2.1a. There are rather important bug fixes that have resolved
many of the connection tracking problems (which is what the slowdowns were
attributed to).

I simply changed the iptables.spec file to build 1.2.2 instead of 1.2.1a.

It is my understanding that this also fixes some problems related to ipchains.
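
[Editorial sketch, not part of the original comment: the version bump described above amounts to a small edit to the spec file before rebuilding the RPM. The exact release tag and surrounding spec contents are assumptions; the 1.2.2 source tarball must also be placed in the build tree.]

```shell
# iptables.spec (excerpt) -- change the version field, then rebuild:
#
#   Version: 1.2.2        (was: 1.2.1a)
#
# Rebuild with the RPM tooling of the era, e.g.:
#   rpm -ba iptables.spec
```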
