The File Transfer Protocol (FTP) and Your Firewall / Network Address Translation (NAT) Router / Load-Balancing Router

The File Transfer Protocol has held up remarkably well over the years. The protocol was first standardized in the early 1970s, decades before most networks were protected by strict firewalls that drop incoming packets first and ask questions later.

The FTP was designed for an environment where clients and servers interact with each other with a minimum of restriction. Additionally, the FTP was designed to operate over communications channels where packets travel directly to their destination, not for today's environment where there may be a transparent intermediary that is responsible for sending the packets to and from a host on a private network.

The Problems

The primary problems that the FTP poses to firewalls, NAT devices, and load-balancing devices (all of which will simply be referred to as "routing devices" and not "routers" since gateway machines generally aren't problematic) are:

  1. Additional TCP/IP connections are used for data transfers;

  2. Data connections may be sent to random port numbers;

  3. Data connections may originate from the server to the client, as well as originating from the client to the server;

  4. Data connections' destination addresses are negotiated on the fly between the client and server over the channel used for the control connection;

  5. The control connection is idle while the data transfer takes place on the data connection.

The ramifications for problem (1) are that routing devices must maintain state information for the control connection, where the FTP conversation between client and server takes place, and for subsequent data connections. For load-balancing devices especially, this means that it is imperative to send each data connection to the same internal server that is handling its associated control connection.

For problem (2), this means that it is impossible for FTP to work with a configuration where only a handful of well-known ports are allowed in and all other ports are denied. Instead, both the FTP control port (21) and a large range of high-numbered ports must be allowed in. But, as a consequence of problem (1), a routing device that tracks FTP sessions can, with a little work, lock down that range of ports for everything except use by FTP.

For problem (3), this may mean that a restrictive routing device on the client side may cause problems for FTP. 

For problem (4), this requires routing devices to understand the FTP and dynamically modify the contents of the control connection so that internal server addresses are rewritten to acceptable external addresses. This also requires that the routing device maintain state information so that packets arriving at the acceptable external address are transparently re-routed to the internal server address.

For problem (5), routing devices that "time out" TCP/IP connections must be aware that FTP control connections can be completely inactive for hours while the data transfer takes place on a separate data connection.  The classic example of a problem with this case is the common occurrence where a lengthy download finishes and the client wishes to start another download, but the routing device has timed-out the control connection since no activity took place for 15 minutes.  The client program then locks up waiting for the server to reply to a message it never received because the routing device did not route it to the server.

The Two Types of Data Transfers - Active (PORT) and Passive (PASV)

The FTP specifies a mechanism for a default data connection, where the server can connect back to the client from port 20 to the same IP address and port number that the client is originating from on the control connection.  However, it really isn't feasible because the preferred transfer mode is "stream mode" and would require that the default data connection be reopened with each data transfer (and TCP won't let you do that until TIME_WAIT expires on the previous default data connection that has the same connection endpoints).

Therefore, all modern FTP clients negotiate with the server on where the data is sent and who initiates the connection.  The client program can specify active mode by sending the "PORT" command to instruct that the server should connect back to a specified IP address and port number and then send the data.  Or, a client program can choose passive mode by using the "PASV" command to ask that the server tell the client an IP address and port number that the client can connect to and receive the data.

In a nutshell, PORT is used to have the server connect to the client, and PASV is used to have the client connect to the server.  Since the client connects to the server to establish the control connection, it would seem logical that the client should connect to the server to establish the data connection, which would imply that PASV would be preferred (and at the same time eliminate the single biggest problem with FTP and firewalls).  Mysteriously, the implementers chose to specify in the FTP specification that PORT should be preferred and PASV need not be implemented at all by FTP client programs.

Example Sessions Using Active and Passive Data Transfers

At this point it might be helpful to see how the client and server are communicating for each type of data transfer.  The first example is an Active session that logs in anonymously and does a single active data transfer, a directory listing.  Note that directory listings are treated as data transfers just like uploading and downloading of files!

Client: USER anonymous
Server: 331 Guest login ok, send your e-mail address as password.
Client: PASS NcFTP@
Server: 230 Logged in anonymously.
Client: PORT 192,168,1,2,7,138
        (The client wants the server to send to port number 1930 on IP address 192.168.1.2.)
Server: 200 PORT command successful.
Client: LIST
Server: 150 Opening ASCII mode data connection for /bin/ls.
        (The server now connects out from port 20 to port 1930 on 192.168.1.2.)
Server: 226 Listing completed.
        (That succeeded, and the data was sent over the established data connection.)
Client: QUIT
Server: 221 Goodbye.
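
The PORT argument in the session above packs the IP address and the port number into six decimal fields: the four address octets, followed by the high and low bytes of the port.  A minimal Python sketch of that encoding follows; the function name encode_port_argument is ours, chosen only for illustration.

```python
def encode_port_argument(ip: str, port: int) -> str:
    """Encode an IPv4 address and TCP port in the h1,h2,h3,h4,p1,p2 form used by PORT."""
    octets = ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not a dotted-quad IPv4 address: {ip!r}")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    # The port is split into its high byte (port // 256) and low byte (port % 256).
    return ",".join(octets + [str(port // 256), str(port % 256)])


print("PORT " + encode_port_argument("192.168.1.2", 1930))
# prints: PORT 192,168,1,2,7,138
```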

Next is the Passive example.

Client: USER anonymous
Server: 331 Guest login ok, send your e-mail address as password.
Client: PASS NcFTP@
Server: 230 Logged in anonymously.
Client: PASV
        (The client is asking where it should connect.)
Server: 227 Entering Passive Mode (172,16,3,4,204,173)
        (The server replies with port 52397 on IP address 172.16.3.4.)
Client: LIST
Server: 150 Data connection accepted from 172.16.3.4:52397; transfer starting.
        (The client has now connected to the server at port 52397 on IP address 172.16.3.4.)
Server: 226 Listing completed.
        (That succeeded, and the data was sent over the established data connection.)
Client: QUIT
Server: 221 Goodbye.
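
Going the other way, a client has to recover the address and port from the 227 reply.  The sketch below assumes the reply carries the six numbers in parentheses, as in the example session; real servers vary in the surrounding wording, so treat this as an illustration rather than a robust parser.  The name parse_pasv_reply is hypothetical.

```python
import re


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract (ip, port) from a '227 ... (h1,h2,h3,h4,p1,p2)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a parsable 227 reply: {reply!r}")
    numbers = [int(n) for n in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError(f"field out of range in reply: {reply!r}")
    ip = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte * 256 + low byte
    return ip, port


print(parse_pasv_reply("227 Entering Passive Mode (172,16,3,4,204,173)"))
# prints: ('172.16.3.4', 52397)
```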

Why PORT Poses Problems for Routing Devices

The biggest problem caused by FTP client programs choosing to use "PORT" to negotiate FTP data connections is the fact that the server must be the one connecting back out to the client's IP address.  For restrictive firewalls, it is desirable to forbid all incoming connections, so using PORT would cause the incoming connection from the server to fail.

Another big problem arises when a client program is using network address translation to hide behind a routing device on an internal network: with PORT, the client tells a server on the external network to connect to an address on the client's internal network.  From the example above:

Client: PORT 192,168,1,2,7,138

That almost always results in the routing device denying the connection, or in the connection failing completely if the IP address is an RFC 1918 reserved private address (i.e. 10.x.x.x, 172.16.x.x through 172.31.x.x, or 192.168.x.x).  In either case, the client user will typically experience a discarded connection, which is very frustrating since the client program will just lock up until the connection is considered permanently timed-out.
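
To check whether an address advertised in a PORT command falls in one of the RFC 1918 private ranges, something like the following Python sketch could be used.  The helper name is_rfc1918 is ours; the three networks are the ones RFC 1918 reserves.

```python
import ipaddress

# The three private IPv4 blocks reserved by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(ip: str) -> bool:
    """Return True if ip lies inside one of the RFC 1918 private ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in RFC1918_NETWORKS)


for candidate in ("192.168.1.2", "172.16.3.4", "17.254.0.26"):
    print(candidate, is_rfc1918(candidate))
# prints:
# 192.168.1.2 True
# 172.16.3.4 True
# 17.254.0.26 False
```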

Solution 1:  The client user should configure their FTP client program to use PASV rather than PORT.  Using passive mode may not solve the problem if there is a similar restrictive firewall on the server side.
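
How passive mode is selected depends on the client program.  As one illustration, Python's standard ftplib module exposes it through set_pasv().  The sketch below only configures the client object and does not connect anywhere; the host name in the comments is a placeholder, not a real server.

```python
from ftplib import FTP

ftp = FTP()          # no host given, so no connection is made yet
ftp.set_pasv(True)   # use PASV for data connections (this is ftplib's default)
# ftp.set_pasv(False) would make ftplib use PORT (active mode) instead.

# With a real server, the session might continue along these lines
# (ftp.example.com is a placeholder host name):
#   ftp.connect("ftp.example.com")
#   ftp.login()             # anonymous login
#   ftp.retrlines("LIST")   # directory listing over a passive data connection
#   ftp.quit()
```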

Solution 2:  A better solution is for the network administrator of the client network to use high-quality network address translation software.  Such devices can keep track of FTP data connections, and when a client on a private network uses "PORT" with an internal network address, the device should dynamically rewrite the packet containing the PORT command and change the address so that it refers to the external IP address of the routing device.  The device would then have to route the connection incoming from the remote FTP server back to the internal network address of the client.  From the example above we had:

Client: PORT 192,168,1,2,7,138

When the packet containing this PORT reaches the routing device, it should be rewritten like this, assuming the external address is 17.254.0.26:

Client: PORT 17,254,0,26,7,138

The remote server would then attempt to connect to 17.254.0.26:1930.  The routing device in this example would then forward all traffic for this connection to and from the client address at 192.168.1.2:1930.
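
The rewrite described above can be sketched in Python as follows.  This is a simplified model, not how any particular product works: the function name and the forward_table dictionary are hypothetical, the same port number is reused on the external side as in the example, and a real device would also have to adjust TCP sequence numbers whenever the rewritten command changes length.

```python
import re

PORT_RE = re.compile(r"^PORT (\d+),(\d+),(\d+),(\d+),(\d+),(\d+)$")


def rewrite_port_command(command: str, external_ip: str, forward_table: dict) -> str:
    """Replace the internal address in a PORT command and record where to forward."""
    match = PORT_RE.match(command.strip())
    if match is None:
        return command  # not a PORT command; pass it through unchanged
    fields = match.groups()
    internal_ip = ".".join(fields[:4])
    port = int(fields[4]) * 256 + int(fields[5])
    # Remember that connections arriving at the external address and port
    # belong to the internal client.
    forward_table[(external_ip, port)] = (internal_ip, port)
    return "PORT " + ",".join(external_ip.split(".") + [fields[4], fields[5]])


forward_table = {}
print(rewrite_port_command("PORT 192,168,1,2,7,138", "17.254.0.26", forward_table))
# prints: PORT 17,254,0,26,7,138
print(forward_table)
# prints: {('17.254.0.26', 1930): ('192.168.1.2', 1930)}
```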

Why PASV Poses Problems for Firewalls

When an FTP server is behind a firewall, there can be problems when FTP clients try to use passive mode to connect to an ephemeral port number (temporary random port number) on the FTP server machine.  The most common problem is when the firewall the FTP server is behind is strict, i.e. the firewall allows only a few well-known port numbers in and denies access to all other ports.

Solution 1:  The network administrator of the server network can configure the firewall to allow in the entire ephemeral port range.  The range of ephemeral ports that need to be opened up is dependent on the configuration of the server machine that is running the FTP server software -- not the ephemeral ports on the firewall!

So, find out how the FTP server machine has configured the ephemeral port range (whose default range varies with the operating system) and then open those ports on the firewall.  Ideally, the firewall should be configured so that only that range of ports is accessible to the FTP server machine.  Also double check to be sure that there aren't any other TCP services with port numbers in the ephemeral port range listening on the FTP server machine.
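
Where the ephemeral port range is recorded depends on the operating system.  As one example, Linux exposes it in /proc/sys/net/ipv4/ip_local_port_range as two whitespace-separated numbers.  The sketch below reads that file if it exists; other systems need a different lookup, and some FTP server software has its own passive port range setting that would take precedence.  The function name parse_port_range is ours.

```python
from pathlib import Path

# Linux-specific location of the ephemeral (local) port range.
LINUX_PORT_RANGE_FILE = Path("/proc/sys/net/ipv4/ip_local_port_range")


def parse_port_range(text: str) -> tuple[int, int]:
    """Parse 'LOW HIGH' (whitespace-separated) into a (low, high) tuple."""
    low, high = (int(field) for field in text.split())
    return low, high


if LINUX_PORT_RANGE_FILE.exists():
    low, high = parse_port_range(LINUX_PORT_RANGE_FILE.read_text())
    print(f"open TCP ports {low}-{high} to the FTP server machine")
else:
    print("no Linux-style port range file here; consult the OS documentation")
```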

Solution 2:  The network administrator of the server network can consult the firewall vendor's documentation to see if FTP connections can be dynamically monitored and ports dynamically opened when a passive FTP connection is detected.  This is similar to what intelligent network address translation software can do on the client side for PORT -- the FTP control connections are monitored, and when a packet containing the server's reply to "PASV" is detected, the firewall can automatically open the port named in that reply.

Using our PASV example above, when the FTP server replies to the PASV request:

Server:  227 Entering Passive Mode (172,16,3,4,204,173)

The firewall would then parse the reply and find that the client will be instructed to connect to port 52397 on the address 172.16.3.4.  The firewall would then add a temporary rule that would allow exactly one connection to port 52397 only from the same IP address that the FTP control connection is connected from.
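
The temporary rule can be modeled as a one-shot entry in a table.  The Python sketch below is only an illustration of that bookkeeping; the class name PinholeTable and the client addresses 192.0.2.10 and 198.51.100.7 are hypothetical, and a real firewall would also expire unused entries after a short time.

```python
class PinholeTable:
    """Temporary allow rules derived from observed 227 replies (illustrative only)."""

    def __init__(self):
        # Each rule is (client_ip, server_ip, server_port).
        self._rules = set()

    def allow_once(self, client_ip, server_ip, server_port):
        self._rules.add((client_ip, server_ip, server_port))

    def permits(self, src_ip, dst_ip, dst_port):
        rule = (src_ip, dst_ip, dst_port)
        if rule in self._rules:
            self._rules.remove(rule)  # exactly one connection is allowed
            return True
        return False


table = PinholeTable()
# The control connection comes from 192.0.2.10; the server announced 172.16.3.4:52397.
table.allow_once("192.0.2.10", "172.16.3.4", 52397)
print(table.permits("198.51.100.7", "172.16.3.4", 52397))  # False: wrong client address
print(table.permits("192.0.2.10", "172.16.3.4", 52397))    # True: the expected connection
print(table.permits("192.0.2.10", "172.16.3.4", 52397))    # False: rule already used
```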

Why PASV Poses Problems for FTP Servers on Internal Networks

The other server-side problem that can occur is when a client is trying to access an FTP server on an internal network protected by a routing device.  Because a server response from PASV includes an IP address and port number, if this IP address corresponds to a private network then the client will not be able to connect to that private address.  From our PASV example above, we have:

Server:  227 Entering Passive Mode (172,16,3,4,204,173)

If left unaltered, the client would try to connect to port 52397 on the IP address 172.16.3.4.  If the client is not on the private internal network, the client would time-out trying to connect to that address, when in reality it should be connecting to the external IP address of the routing device.

Solution 1:  The network administrator of the server network can consult the routing device vendor's documentation to see if FTP connections can be dynamically monitored and the IP address specification dynamically replaced in packets containing the PASV response.

Using our PASV example above, when the FTP server replies to the PASV request:

Server:  227 Entering Passive Mode (172,16,3,4,204,173)

The routing device should rewrite the packet like this, assuming the external address is 17.254.0.91:

Server:  227 Entering Passive Mode (17,254,0,91,204,173)

The remote client would then attempt to connect to the routing device at 17.254.0.91:52397. The routing device in this example would then forward all traffic for this connection between the remote client and the internal FTP server at IP address 172.16.3.4.
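
A minimal Python sketch of that reply rewrite follows.  It assumes the 227 reply carries its six numbers in parentheses, as in the example, and it leaves out the forwarding state and the TCP sequence number adjustments a real device would need.  The function name rewrite_pasv_reply is hypothetical.

```python
import re

PASV_ADDR_RE = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def rewrite_pasv_reply(reply: str, external_ip: str) -> str:
    """Swap the internal address in a 227 reply for the device's external address."""
    if not reply.startswith("227"):
        return reply  # only PASV replies carry an address to rewrite

    def substitute(match):
        # Keep the two port bytes, replace the four address octets.
        port_bytes = [match.group(5), match.group(6)]
        return "(" + ",".join(external_ip.split(".") + port_bytes) + ")"

    return PASV_ADDR_RE.sub(substitute, reply, count=1)


print(rewrite_pasv_reply("227 Entering Passive Mode (172,16,3,4,204,173)", "17.254.0.91"))
# prints: 227 Entering Passive Mode (17,254,0,91,204,173)
```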

Why PASV Poses Problems for FTP Servers behind Load-Balancing Routers

Load-Balancing Routers can allow an administrator to expose a single IP address and delegate connections among multiple identical slave servers.  This is similar to Redundant Arrays of Inexpensive Disks (RAID), only instead of disks the array is of TCP/IP servers.

Load Balancing provides two challenges for FTP.  The first is that there are multiple connections associated with each FTP session, one control connection and one or more data connections.  For PASV data connections to work, the load balancer must be able to send the connection from the client to the same slave server that is handling the control connection.

The second problem, which is related to the first, is that when a slave server replies with the PASV response, the PASV response's IP address must be accessible to the remote client.

Solution 1:  The network administrator of the server network can give each slave server a valid externally accessible IP address.  The external IP address of the load balancer could be used as the preferred address, but having each slave server have its own external IP address would allow PASV data connections to connect directly to the slave server without requiring traffic from slaves to pass through the load balancer.  It also means that the load balancer does not need to do any special automatic handling of FTP.

Solution 2:  The network administrator of the server network can consult the load balancing router vendor's documentation to see if FTP connections can be handled automatically so that the PASV reply is dynamically rewritten to contain the external IP address of the load balancer.

Solution 3:  If the routing device isn't intelligent enough to take special care of FTP sessions, but has the ability to always forward traffic from the same remote client IP address to the same internal server IP address, then the network administrator of the server network may be able to configure the FTP server software to spoof the address it uses for PASV replies.

For example, NcFTPd Server has an option to let you specify an IP address to use for PASV replies rather than the real IP address of the machine.  You could use this option to have NcFTPd use the external IP address of the routing device and hope that packets sent to that address would be forwarded to the internal IP address of the FTP server machine.
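
Solution 3 depends on the routing device always sending a given client IP address to the same internal server.  One common way to get that property is to hash the client address, sketched below in Python.  The function name pick_backend and the backend addresses are assumptions for illustration; actual load balancers may use different persistence schemes.

```python
import hashlib


def pick_backend(client_ip: str, backends: list[str]) -> str:
    """Map a client IP address to one backend, the same one every time."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]


# Hypothetical internal FTP servers behind the load balancer.
backends = ["172.16.3.4", "172.16.3.5", "172.16.3.6"]

# The control connection and a later PASV data connection from the same
# client address land on the same internal server.
control_target = pick_backend("192.0.2.10", backends)
data_target = pick_backend("192.0.2.10", backends)
print(control_target == data_target)  # prints: True
```

Note that the mapping changes for some clients whenever the list of backends changes, so sessions in progress at that moment could be split across servers.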

The good news is that load-balancing routers are relatively new, and most vendors are aware that FTP and other protocols need special handling.  So, it is highly likely that FTP traffic can be distributed correctly once the load balancer has been configured for it.

Deadlock - When there are Restrictive Firewalls on Both Sides

Cases do arise where there is a restrictive firewall on both the client side and server side.  Again, a restrictive firewall is one whose policy is to deny everything except for traffic between a few well-known ports.  When this happens, a client user cannot use PORT because the FTP server cannot connect back to the client program listening on an ephemeral port number, and a client user cannot use PASV because the client program cannot connect to the FTP server software listening on an ephemeral port number.
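
The deadlock can be stated as a small truth table.  The Python sketch below assumes neither firewall has any FTP-aware handling; the function name usable_modes is ours.

```python
def usable_modes(client_blocks_inbound: bool, server_blocks_inbound: bool) -> list[str]:
    """List the data transfer modes that can work, given each side's firewall policy."""
    modes = []
    if not client_blocks_inbound:
        modes.append("PORT")  # the server can connect back to the client
    if not server_blocks_inbound:
        modes.append("PASV")  # the client can connect to the server's ephemeral port
    return modes


print(usable_modes(client_blocks_inbound=True, server_blocks_inbound=False))  # ['PASV']
print(usable_modes(client_blocks_inbound=False, server_blocks_inbound=True))  # ['PORT']
print(usable_modes(client_blocks_inbound=True, server_blocks_inbound=True))   # []
```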

Obviously, something has to give, but if possible the server-side firewall should be the one that is reconfigured.  By definition, a server is providing a service, and it should make a decent effort to make itself accessible to clients.  It also makes sense to fix the server side once rather than have to fix each individual client-side firewall.

Someday, all devices doing network address translation will have built-in special handling of FTP sessions so PORT can be used, and someday all firewalls will have built-in special handling of FTP sessions so PASV can be used too.

Problems when the FTP Server is Listening on a Non-Standard Port Number

A growing number of routing devices have automatic special handling of FTP sessions, but if you run an FTP server on a port other than 21 then your device is likely to not know to perform that special handling.  Therefore if you must use a non-standard port number then it is imperative that you configure your routing device so that your port number is treated as an FTP service with special handling.

Even if the server-side network's routing device is configured for that special handling, problems could still arise on the client side!  Some firewalls require that FTP data connections from the server originate from port 20, which is the standard port number for FTP data connections.  If your FTP server is running on non-standard port N, it is required by the FTP specification that its data connections originate from port N - 1.  Client-side firewalls may deny these connections, so beware.

Problems Caused by the Firewall Prematurely Timing Out a Valid FTP Session

Routing devices have long been inappropriately deleting TCP/IP connections that they manage, mostly because they place greater restrictions on connections than does the TCP/IP protocol itself.  For example, a TCP/IP connection with no outstanding acknowledgments pending is allowed to be idle indefinitely, unless one or both ends of the connection agree to use TCP/IP "Keep Alive" probes.  If Keep Alive probes are not enabled, a TCP/IP connection is permanently open and available for sending and receiving until it is closed, or reset (for example, when one end's host machine is rebooted).

Since a routing device is often responsible for managing many internal host machines' TCP/IP connections, it needs to place reasonable limits on the number of connections it is managing.  Therefore, it will try to reclaim connections when it can, and a common way to do this is to put an activity timer on the connection and delete connections when the timer shows that the connection has been idle for a "long" period of time.  Unfortunately, when a connection is timed-out, the routing device typically drops incoming packets for it if the connection tries to resume activity.  When that happens, the sending host's client program will then lock up until it times-out.  If the routing device were kind enough to send back a reply to the sending host with an error message, rather than dropping the packets and ignoring the sending host, the client program could immediately err-out rather than time-out after a considerable delay.

Since the FTP protocol uses two connections, a control connection for communicating with the client, and another connection to transfer data, there is twice the probability of getting timed-out by an impatient firewall.  The most common instance of this problem occurs during a long file transfer.  When a transfer is initiated (on the control connection), the control connection is idle until the transfer (on the data connection) finishes.  If the routing device does not special-case the FTP protocol and the data transfer takes longer than the routing device's idle timeout, then the control connection will be timed out.  This is a significant problem since the client program may wish to continue using the FTP session, such as downloading additional files.

Even if the client program is planning on ending the session, the FTP requires that the client program send a message ("QUIT") to the server indicating that the connection should be closed, and the server is then required to reply with another message indicating that the session is officially closed. The ramifications are that the client program could then lock up waiting for a reply to a "QUIT" message that the server will not receive since the firewall timed-out the session, unbeknownst to both client and server.  The solution for this specific case, which some, but not all, FTP client programs do, is to either place a very short time-out on the reply to the "QUIT" message, or to simply close its end of the FTP session (which violates the FTP protocol, but is de facto behavior and is generally accepted).
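
The short time-out on the "QUIT" reply could look roughly like the following Python sketch.  The function name quit_politely and the 2-second default are assumptions; the demonstration uses a local socket pair whose other end never answers, standing in for a control connection that a firewall has silently dropped.

```python
import socket


def quit_politely(control: socket.socket, timeout: float = 2.0):
    """Send QUIT, wait briefly for the reply, then close our end regardless."""
    reply = None
    try:
        control.settimeout(timeout)
        control.sendall(b"QUIT\r\n")
        data = control.recv(1024)
        reply = data.decode("ascii", errors="replace").strip() or None
    except OSError:
        # Covers the time-out as well as a connection that is already dead.
        pass
    finally:
        control.close()
    return reply


client_end, silent_peer = socket.socketpair()
print(quit_politely(client_end, timeout=0.2))  # prints None: the peer never answers
silent_peer.close()
```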

The general solution for this problem is that the routing device needs to special-case the FTP protocol, and when there is activity on an FTP session's data connection, it must mark the FTP session's control connection as active, in addition to marking the data connection as active.  Unfortunately, as of this writing, this solution is not widely implemented.
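
The bookkeeping for that general solution can be sketched in Python as below.  This is a simplified model with explicit timestamps in seconds; the class name FtpSessionTracker and the connection labels are hypothetical.

```python
class FtpSessionTracker:
    """Idle tracking where data activity also refreshes the control connection."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.last_activity = {}  # connection id -> time of last activity
        self.control_for = {}    # data connection id -> its control connection id

    def open_control(self, control, now):
        self.last_activity[control] = now

    def open_data(self, data, control, now):
        self.control_for[data] = control
        self.touch(data, now)

    def touch(self, connection, now):
        self.last_activity[connection] = now
        control = self.control_for.get(connection)
        if control is not None:
            # Activity on the data connection keeps the control connection alive.
            self.last_activity[control] = now

    def expired(self, now):
        return [c for c, t in self.last_activity.items() if now - t > self.idle_timeout]


tracker = FtpSessionTracker(idle_timeout=900)  # 15 minutes
tracker.open_control("ctrl", now=0)
tracker.open_data("data", "ctrl", now=10)
tracker.touch("data", now=800)
tracker.touch("data", now=1600)
print(tracker.expired(now=1700))  # prints: []
print(tracker.expired(now=2600))  # prints: ['ctrl', 'data']
```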

Another solution is to enable the TCP/IP "Keep Alive" feature on the control connection.  When this is enabled by an application program such as an FTP client or server program, if the connection has been idle for a preset time, the TCP/IP stack will automatically send a heartbeat message to the other end's TCP/IP stack, and if no reply is received, the connection will be properly timed-out at the TCP stack on the host machine, rather than at the firewall.  The problem with this is that due to legacy behavior of the TCP/IP protocol (which is decades old, remember!) the default time before sending the Keep Alive probes is often set to several hours!  So, under default conditions, the connection would have to be idle for several hours before the TCP stack would send out its heartbeat message, which is not realistic considering the default idle timeout for routing devices is often under 15 minutes.

For the Keep Alive feature to work under realistic conditions then, it must be configured to start sending the probes before the routing device's idle time out kicks in.  For example, if a firewall is configured to discard idle connections after 15 minutes, you would want your Keep Alive probes to be sent after 10 minutes of inactivity.  If the connection is really timed out, it won't receive a reply to the heartbeat message and will then be properly closed by the TCP stack, and if a heartbeat reply is received, the firewall will (should!) mark the connection as no longer idle.

As long as one side of the FTP session enables Keep Alive and has the heartbeat timer configured to a sensible value, this problem should be solved.  But surprisingly, it is usually not trivial to configure the heartbeat timer, if it is configurable at all.  Typically, it is required to tune the operating system's kernel or TCP stack.  Portably, application programs can only enable Keep Alive mode, not specify when it should be triggered, although some operating systems offer per-socket options for the timer.  Therefore, unless the operating system provides a mechanism to configure the timer and the timer has actually been configured, a program enabling "Keep Alive" mode is unlikely to solve the problem.
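
Where the operating system does provide a per-socket mechanism, an application can set the timer itself.  The Python sketch below enables Keep Alive and, if the platform's socket module defines the Linux-style options TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT, asks for the first probe after 10 minutes of idleness.  The function name enable_keepalive and the default values are assumptions; on platforms without these options only the system-wide default timer applies.

```python
import socket


def enable_keepalive(sock, idle_seconds=600, interval_seconds=60, probes=5):
    """Enable TCP Keep Alive; tune the timers where the platform allows it.

    Returns True if the per-socket timers were set, False if only the
    system-wide defaults are in effect.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    names = ("TCP_KEEPIDLE", "TCP_KEEPINTVL", "TCP_KEEPCNT")
    if not all(hasattr(socket, name) for name in names):
        return False  # platform does not expose these options through Python
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return True


control = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # not connected yet
tuned = enable_keepalive(control)
print("per-socket timers set:", tuned)
control.close()
```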

Final Words

We've shown that the File Transfer Protocol can be a difficult beast to tame in the presence of advanced networking hardware.  We will leave you with a brief list of guidelines, but realize that for FTP to work smoothly you will most likely need to expend considerable effort on configuration or considerable cash on hardware that is FTP aware.


© Copyright 2001, 2002, 2005 by Mike Gleason, NcFTP Software.
All Rights Reserved.

This is revision 1.1.1 (April 18, 2005) of this document.