Theodore is a simple FTP server based on RFC 959. It can:
In the RFC you can read the following snippet:
5. DECLARATIVE SPECIFICATIONS

5.1. MINIMUM IMPLEMENTATION

In order to make FTP workable without needless error messages, the following minimum implementation is required for all servers:

   TYPE - ASCII Non-print
   MODE - Stream
   STRUCTURE - File, Record
   COMMAND - USER, QUIT, PORT,
             TYPE, MODE, STRU,
               for the default values
             RETR, STOR,
             NOOP.

The default values for transfer parameters are:

   TYPE - ASCII Non-print
   MODE - Stream
   STRU - File

All hosts must accept the above as the standard defaults.
The Far FTP client can now log in and list the current directory on my server! Opera can too. Mozilla and IE ask for too much!
#ifndef DC_H
#define DC_H

#include "theodore.h"

class TDC
{
public:
    enum Mode { Stream, Block, Compressed };
    enum Stru { File, Record, Page };
    enum Type { Ascii, Ebcdic, Image, Local };

    TDC();
    ~TDC();

    bool setPassive( bool p=true );
    bool isPassive( ) const { return m_isPassive; }
    void setMode( const TDC::Mode m ) { m_mode = m; }
    void setStru( const TDC::Stru s ) { m_stru = s; }
    bool setAddr( t::addr adr );
    string getNextPasvAddr( );
    bool open();
    void close();
    bool isOpened();
    bool snd( const char *buf );

private:
    Mode m_mode;
    Stru m_stru;
    Type m_type;
    bool m_isPassive;
    t::addr m_remoteAddr;
    t::addr m_localAddr;
    bool m_opened;
    SOCKET m_sockListen;
    SOCKET m_sock;

    bool accpt();
    bool cnnct();
};

#endif // DC_H
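The class declares getNextPasvAddr() without showing its body. As a rough sketch only, the address text that goes into a 227 reply could be built like this. The helper name formatPasvAddr and its signature are hypothetical and are not part of Theodore.

```cpp
#include <string>

// Hypothetical helper: builds the "h1,h2,h3,h4,p1,p2" text used in a 227 reply.
// ip holds four octets (0-255); port is a 16-bit TCP port.
std::string formatPasvAddr(const int ip[4], unsigned port)
{
    std::string out;
    for (int i = 0; i < 4; ++i) {
        out += std::to_string(ip[i]);
        out += ',';
    }
    out += std::to_string((port >> 8) & 0xFF);  // p1: high-order byte of the port
    out += ',';
    out += std::to_string(port & 0xFF);         // p2: low-order byte of the port
    return out;
}
```

For 127.0.0.1 and port 5001 this yields "127,0,0,1,19,137", since 19 * 256 + 137 = 5001.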
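It seems that directories have _finddata_t::attrib == 16 and files have 32. These are the _A_SUBDIR (0x10) and _A_ARCH (0x20) attribute bits, so it is safer to test the bit than to compare for equality. So, now let's make formatted LIST output.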
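A minimal sketch of that check, assuming the attribute value comes from _finddata_t::attrib. The constant is redefined under a hypothetical name so the snippet compiles without <io.h>.

```cpp
// Same value as _A_SUBDIR from <io.h> (decimal 16); the name is hypothetical.
const unsigned kAttrSubdir = 0x10;

// True when the directory bit is set, whatever other attribute bits are present.
bool isDirectory(unsigned attrib)
{
    return (attrib & kAttrSubdir) != 0;
}
```

A read-only directory (attrib 0x11) is still reported as a directory, which a plain comparison with 16 would miss.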
struct addr
{
    int m_ipOne, m_ipTwo, m_ipThree, m_ipFour;
    int m_pOne, m_pTwo, m_port;
    string m_ip;

    addr():
        m_ipOne( 0 ), m_ipTwo( 0 ), m_ipThree( 0 ), m_ipFour( 0 ),
        m_pOne( 0 ), m_pTwo( 0 ), m_port( 0 ), m_ip( "0.0.0.0" ) { }

    bool setAddr( );
    bool setPort( );
    static bool parseRawAddr( string s, t::addr *adr );
};
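The body of parseRawAddr() is not shown. The sketch below is one possible way to parse a PORT argument of the form h1,h2,h3,h4,p1,p2. The names HostPort and parseHostPort are hypothetical, and the argument is assumed to arrive without its trailing <CRLF>.

```cpp
#include <cstdio>
#include <string>

struct HostPort {
    std::string ip;  // dotted form, e.g. "192.168.0.10"
    int port;        // p1 * 256 + p2
};

// Parses "h1,h2,h3,h4,p1,p2"; returns false on malformed input.
bool parseHostPort(const std::string& raw, HostPort* out)
{
    int f[6];
    char extra;
    // The trailing %c only matches if junk follows the sixth field.
    if (std::sscanf(raw.c_str(), "%d,%d,%d,%d,%d,%d%c",
                    &f[0], &f[1], &f[2], &f[3], &f[4], &f[5], &extra) != 6)
        return false;
    for (int i = 0; i < 6; ++i)
        if (f[i] < 0 || f[i] > 255)  // every field is an 8-bit value
            return false;
    out->ip = std::to_string(f[0]) + "." + std::to_string(f[1]) + "." +
              std::to_string(f[2]) + "." + std::to_string(f[3]);
    out->port = f[4] * 256 + f[5];
    return true;
}
```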
Related stuff for LIST command:
errno_t _access_s( const char *path, int mode );
intptr_t _findfirst( const char *filespec, struct _finddata_t *fileinfo );
int remove( const char *path );
int _stat( const char *path, struct _stat *buffer );
int _fstat( int fd, struct _stat *buffer );
long _filelength( int fd );

The _stat structure, defined in SYS\STAT.H, includes the following fields.

st_gid
    Numeric identifier of group that owns the file (UNIX-specific). This field will always be zero on Windows systems. A redirected file is classified as a Windows file.
st_atime
    Time of last access of file. Valid on NTFS but not on FAT formatted disk drives.
st_ctime
    Time of creation of file. Valid on NTFS but not on FAT formatted disk drives.
st_dev
    Drive number of the disk containing the file (same as st_rdev).
st_ino
    Number of the information node (the inode) for the file (UNIX-specific). On UNIX file systems, the inode describes the file date and time stamps, permissions, and content. When files are hard-linked to one another, they share the same inode. The inode, and therefore st_ino, has no meaning in the FAT, HPFS, or NTFS file systems.
st_mode
    Bit mask for file-mode information. The _S_IFDIR bit is set if path specifies a directory; the _S_IFREG bit is set if path specifies an ordinary file or a device. User read/write bits are set according to the file's permission mode; user execute bits are set according to the filename extension.
st_mtime
    Time of last modification of file.
st_nlink
    Always 1 on non-NTFS file systems.
st_rdev
    Drive number of the disk containing the file (same as st_dev).
st_size
    Size of the file in bytes; a 64-bit integer for variations with the i64 suffix.
st_uid
    Numeric identifier of user who owns file (UNIX-specific). This field will always be zero on Windows systems. A redirected file is classified as a Windows file.

If path refers to a device, the st_size, various time fields, st_dev, and st_rdev fields in the _stat structure are meaningless.
Because STAT.H uses the _dev_t type that is defined in TYPES.H, you must include TYPES.H before STAT.H in your code.

#include <io.h>

_CRTIMP int __cdecl _findfirst (const char*, struct _finddata_t*);
_CRTIMP int __cdecl _findnext (int, struct _finddata_t*);
_CRTIMP int __cdecl _findclose (int);
_CRTIMP int __cdecl _chdir (const char*);
_CRTIMP char* __cdecl _getcwd (char*, int);
_CRTIMP int __cdecl _mkdir (const char*);
_CRTIMP char* __cdecl _mktemp (char*);
_CRTIMP int __cdecl _rmdir (const char*);
_CRTIMP int __cdecl _chmod (const char*, int);

struct _finddata_t
{
    unsigned attrib;        /* Attributes, see constants above. */
    time_t time_create;
    time_t time_access;     /* always midnight local time */
    time_t time_write;
    _fsize_t size;
    char name[FILENAME_MAX];    /* may include spaces. */
};

struct tm
{
    int tm_sec;     /* Seconds: 0-59 (K&R says 0-61?) */
    int tm_min;     /* Minutes: 0-59 */
    int tm_hour;    /* Hours since midnight: 0-23 */
    int tm_mday;    /* Day of the month: 1-31 */
    int tm_mon;     /* Months *since* january: 0-11 */
    int tm_year;    /* Years since 1900 */
    int tm_wday;    /* Days since Sunday (0-6) */
    int tm_yday;    /* Days since Jan. 1: 0-365 */
    int tm_isdst;   /* +1 Daylight Savings Time, 0 No DST,
                     * -1 don't know */
};

Requirements
    Routine: _stat, _stat32, _stat64, _stati64, _stat32i64, _stat64i32
    Required header: followed by
    Compatibility: Windows 95, Windows 98, Windows 98 Second Edition, Windows Millennium Edition, Windows NT 4.0, Windows 2000, Windows XP Home Edition, Windows XP Professional, Windows Server 2003

Requirements
    Function: _filelength
    Compatibility: Windows 95, Windows 98, Windows 98 Second Edition, Windows Millennium Edition, Windows NT 4.0, Windows 2000, Windows XP Home Edition, Windows XP Professional, Windows Server 2003
And now, what FTP servers (FileZilla, for example) actually output on LIST or LIST -la:
-rw-r--r-- 1 ftp ftp 0 Mmm dd hh:mm file name with spaces.ext<CRLF>
drwxr-xr-x 1 ftp ftp 0 Mmm dd hh:mm directory<CRLF>
I.e. a file has rights -rw-r--r-- 1 ftp ftp and a directory has drwxr-xr-x 1 ftp ftp.
Then comes a gap of at least 8 <SP>, followed by the file size, right-aligned (directories have a size of 0 bytes).
Then creation time: Mmm dd hh:mm
Then filename or dir.
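The layout described above could be produced roughly as follows. This is a sketch under assumptions: the function name formatListLine is hypothetical, and the size field width of 14 is a guess that keeps a gap of at least 8 spaces for sizes up to six digits.

```cpp
#include <cstdio>
#include <string>

// Formats one LIST line. month is 0-11, as in struct tm.
std::string formatListLine(bool isDir, unsigned long long size,
                           unsigned month, int day, int hour, int minute,
                           const std::string& name)
{
    static const char* const kMonths[12] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
    char head[64];
    // %14llu right-aligns the size; the width 14 is an assumption.
    std::snprintf(head, sizeof head, "%s 1 ftp ftp %14llu %s %02d %02d:%02d ",
                  isDir ? "drwxr-xr-x" : "-rw-r--r--",
                  isDir ? 0ULL : size,  // directories are listed with size 0
                  kMonths[month % 12], day, hour, minute);
    return std::string(head) + name + "\r\n";
}
```

The name is appended outside the fixed-size buffer so that long file names, including ones with spaces, are never truncated.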
The question is: how to distinguish files from directories?
What does each command mean? They all configure data connection parameters: for example, which sequence of bytes (and, by the way, what counts as a byte?) represents a new line, and how a file is split into pieces (records, pages).
TYPE selects the representation type: ASCII (default), EBCDIC, IMAGE, LOCAL.
ASCII (NVT) treats <CRLF> as the end of a line. EBCDIC treats <NL> as the end of a line. IMAGE data can include zero-padding at the end (of the file or of each record). LOCAL must specify the size (in bits) of logical bytes in a second parameter.
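For the default ASCII type, a server that stores text with bare '\n' line ends has to emit <CRLF> on the wire. A minimal sketch of that conversion, with a hypothetical function name toNetAscii:

```cpp
#include <string>

// Converts bare '\n' line ends to <CRLF>; existing "\r\n" pairs are left alone.
std::string toNetAscii(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] == '\n' && (i == 0 || text[i - 1] != '\r'))
            out += '\r';  // insert the missing carriage return
        out += text[i];
    }
    return out;
}
```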
STRU selects the file structure: file structure (default; no extra symbols are added to impose a structure :)), record structure (for text files, i.e. ASCII and EBCDIC), and page structure (the most interesting one). Each page is sent with a page header made of logical bytes (sized by the TYPE command) with four mandatory fields: Header Length (>= 4), Page Index, Data Length (>= 0), and Page Type (0 = last page, 1 = simple page, 2 = descriptor page, 3 = access-controlled page). A page is a contiguous set of 512 words of 36 bits each.
MODE selects the transmission mode: Stream (default), Block, or Compressed. In Stream mode with file structure, closing the data connection marks the end of the file being transmitted. With record structure in Stream mode, EOR and EOF are each marked by a two-byte control code:
bit index: 15      8 7      0
bit value: 11111111 00000010  - EOF
bit index: 15      8 7      0
bit value: 11111111 00000001  - EOR
If a byte of all ones (11111111) is meant to be transmitted as data (not as part of EOF or EOR), it must be sent twice!
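The escaping rule can be sketched as below. The names RecordEnd and encodeStreamRecord are hypothetical; the sketch assumes one record per call and 8-bit bytes.

```cpp
#include <string>

enum RecordEnd { EndOfRecord = 1, EndOfFile = 2 };

// Encodes one record for Stream mode with record structure.
std::string encodeStreamRecord(const std::string& data, RecordEnd end)
{
    const char kEscape = static_cast<char>(0xFF);
    std::string out;
    for (std::string::size_type i = 0; i < data.size(); ++i) {
        out += data[i];
        if (data[i] == kEscape)
            out += kEscape;  // a data byte of all ones is sent twice
    }
    out += kEscape;                 // first byte of the control code
    out += static_cast<char>(end);  // second byte: 1 = EOR, 2 = EOF
    return out;
}
```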
In Block mode the file is transmitted as a series of data blocks, each preceded by a header laid out as follows:
                         Block Header

            +----------------+----------------+----------------+
            | Descriptor     |    Byte Count                   |
            |         8 bits |                      16 bits    |
            +----------------+----------------+----------------+

            Code     Meaning

             128     End of data block is EOR
              64     End of data block is EOF
              32     Suspected errors in data block
              16     Data block is a restart marker

With this encoding, more than one descriptor coded condition may exist for a particular block. As many bits as necessary may be flagged.
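A sketch of building that 3-byte header. The names BlockFlag and makeBlockHeader are hypothetical, and sending the byte count high byte first is an assumption based on the usual network byte order.

```cpp
#include <string>

// Descriptor codes from the table above; they may be OR-ed together.
enum BlockFlag {
    BlockEor = 128, BlockEof = 64, BlockSuspect = 32, BlockRestart = 16
};

// Builds the header: descriptor byte, then the 16-bit byte count,
// high byte first (assumed network byte order).
std::string makeBlockHeader(unsigned descriptor, unsigned byteCount)
{
    std::string h;
    h += static_cast<char>(descriptor & 0xFF);
    h += static_cast<char>((byteCount >> 8) & 0xFF);
    h += static_cast<char>(byteCount & 0xFF);
    return h;
}
```

A final block that ends both a record and the file would use makeBlockHeader(BlockEor | BlockEof, n).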
If a restart marker is found in a block, things get messy. Send 110 MARK yyyy = mmmm, where yyyy is the user-process data stream marker and mmmm is the server's equivalent marker (note the spaces between the markers and "="). For a first version, though, just respond with a transient error (4xx, try later).
Today:
#ifndef FS_H
#define FS_H

#include <direct.h>
#include <errno.h>
#include <string>
#include <list>

using namespace std;

class TFS
{
public:
    enum Type { File, Dir };

    struct Path
    {
        bool m_dValid;
        bool m_fValid;
        string m_path;
        string m_fName;
        list<string> m_dirs;

        Path(): m_dValid( true ), m_fValid( false ) { }
        ~Path() { }

        void setDValid( bool v ) { m_dValid = v; }
        void setFValid( bool v ) { m_fValid = v; }
        void setFullPath( const string fp ) { m_path = fp; }
        void setFileName( const string fn ) { m_fName = fn; }
        void setDirList( const list<string> dl ) { m_dirs = dl; }
        list<string> getDirList( ) const { return m_dirs; }
        string getFileName( ) const { return m_fName; }
        string getFullPath() const { return m_path; }
        bool isDValid() const { return m_dValid; }
        bool isFValid() const { return m_fValid; }
    };

    static Path parseRawPath( string str );
    static int level( const Path path );
    static void printPath( const Path path );

private:
    static string cutMilk( const string str );
    static bool syntaxOk( const string path, TFS::Type type );
    static string getFileName( const string fullPath );
    static list<string> getDirList( const string mulDir );
};

#endif // FS_H
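The bodies of parseRawPath(), getFileName() and getDirList() are not shown. Purely as an illustration of the idea of splitting a path into a directory list and a file name, here is a sketch with hypothetical names (SplitPath, splitPath) that assumes '/' as the only separator.

```cpp
#include <list>
#include <string>

struct SplitPath {
    std::list<std::string> dirs;
    std::string fileName;  // empty when the path ends with '/'
};

// Splits on '/'; the last component is treated as the file name.
SplitPath splitPath(const std::string& path)
{
    SplitPath result;
    std::string part;
    for (std::string::size_type i = 0; i < path.size(); ++i) {
        if (path[i] == '/') {
            if (!part.empty())
                result.dirs.push_back(part);  // skip empty components
            part.clear();
        } else {
            part += path[i];
        }
    }
    result.fileName = part;
    return result;
}
```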
Today:
Today:
int TConnection::work()
{
    string s;
    <...>
    if( readCRLF( s ) )
    {
        std::string sOut;
        // length() returns size_t, so cast it to match the %d format
        printf( "\n%d bytes received: %s\n", static_cast<int>( s.length() ), s.c_str() );
        sOut = m_interpreter.performRawCmd( s );
        send( m_socket, sOut.c_str(), static_cast<int>( sOut.length() ), 0 );
    }
    Sleep(1);
    return 0;
}
This is the current implementation of TConnection::work(). The first thing to discuss is how to handle the ABOR operation, given that TInterpreter contains all the operation handlers and can perform only one command at a time. OK, we can save every received command to a list: each time we enter performRawCmd(), we add the new command to the list.
So, all commands received by TConnection are in the list. How do we manage them? We need to be able to abort the operation currently in progress if ABOR is received. And in general, some operations are supposed to be controlled by other ones. So it is absolutely necessary for the command currently being performed to be able to hear newly arrived ones.
However, if we receive an operation that is not related to the one currently being performed, it should be rejected. We are simple, right? Want to download two files simultaneously? Open a new connection, sir.
Each operation given to TInterpreter is atomic. This means TConnection must not accept any operation except ABOR or QUIT while one is running. TConnection takes a new operation only if TInterpreter::Completed is set. If TInterpreter::Pending is set, only ABOR and QUIT can be applied. When TInterpreter::SemiCompleted is set, TConnection is supposed to build the response from TInterpreter::code and TInterpreter::ParString.
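These rules can be written down as a small gate function. This is a sketch only: the enum mirrors the three state names from the text, while mayAccept is a hypothetical name, and rejecting new commands in the SemiCompleted state until the reply has been sent is an assumption.

```cpp
#include <string>

enum InterpreterState { Completed, Pending, SemiCompleted };

// Returns true when a newly received command verb may be handed to the interpreter.
bool mayAccept(InterpreterState state, const std::string& verb)
{
    if (state == Completed)
        return true;  // idle: any command is welcome
    if (state == Pending)
        return verb == "ABOR" || verb == "QUIT";  // busy: only these two get through
    return false;     // SemiCompleted: assumed to wait until the reply is sent
}
```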
Now make this:
What was actually built today:
Here is a typical FTP scenario according to the RFC:
ftp (host) multics<CR>          Connect to host S, port L,
                                establishing control connections.
                                <---- 220 Service ready <CRLF>.
username Doe <CR>               USER Doe<CRLF>---->
                                <---- 331 User name ok,
                                      need password<CRLF>.
password mumble <CR>            PASS mumble<CRLF>---->
                                <---- 230 User logged in<CRLF>.
retrieve (local type) ASCII<CR>
(local pathname) test 1 <CR>    User-FTP opens local file in ASCII.
(for. pathname) test.pl1<CR>    RETR test.pl1<CRLF> ---->
                                <---- 150 File status okay;
                                      about to open data
                                      connection<CRLF>.
                                Server makes data connection
                                to port U.
                                <---- 226 Closing data connection,
                                      file transfer successful<CRLF>.
type Image<CR>                  TYPE I<CRLF> ---->
                                <---- 200 Command OK<CRLF>
store (local type) image<CR>
(local pathname) file dump<CR>  User-FTP opens local file in Image.
(for.pathname) >udd>cn>fd<CR>   STOR >udd>cn>fd<CRLF> ---->
                                <---- 550 Access denied<CRLF>
terminate                       QUIT <CRLF> ---->
                                Server closes all
                                connections.
Login state diagram
The most complicated diagram is for the Login sequence:

                            1
      +---+   USER    +---+------------->+---+
      | B |---------->| W | 2       ---->| E |
      +---+           +---+------  |  -->+---+
                       | |       | | |
                     3 | | 4,5   | | |
         --------------   -----  | | |
        |                      | | | |
        |                      | | | |
        |                 ---------  |
        |               1|     | |   |
        V                |     | |   |
      +---+   PASS    +---+ 2  |  ------>+---+
      |   |---------->| W |------------->| S |
      +---+           +---+   ---------->+---+
                       | |   | |     |
                     3 | |4,5| |     |
         --------------   --------   |
        |                    | |  |  |
        |                    | |  |  |
        |                 -----------
        |             1,3|   | |  |
        V                |  2| |  |
      +---+   ACCT    +---+--  |   ----->+---+
      |   |---------->| W | 4,5 -------->| F |
      +---+           +---+------------->+---+
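The transitions in the diagram depend only on which command was sent last and on the first digit of the reply. A sketch of that table in code, with hypothetical names (LoginStep, LoginOutcome, afterReply):

```cpp
enum LoginStep { SentUser, SentPass, SentAcct };
enum LoginOutcome { LoginSuccess, LoginFailure, LoginError, NeedPass, NeedAcct };

// Maps a reply code to the next node of the login diagram:
// S (success), F (failure), E (error), or the next command to send.
LoginOutcome afterReply(LoginStep step, int replyCode)
{
    const int cls = replyCode / 100;  // first digit of the reply code
    if (cls == 2) return LoginSuccess;
    if (cls == 4 || cls == 5) return LoginFailure;
    if (cls == 3) {
        if (step == SentUser) return NeedPass;  // 3yz after USER: send PASS
        if (step == SentPass) return NeedAcct;  // 3yz after PASS: send ACCT
    }
    return LoginError;  // 1yz at any step, or 3yz after ACCT
}
```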
Command-reply interchange
Finally, we present a generalized diagram that could be used to model the command and reply interchange:

               ------------------------------------
              |                                    |
      Begin   |                                    |
        |     V                                    |
        |   +---+  cmd   +---+ 2         +---+     |
         -->|   |------->|   |---------->|   |     |
            |   |        | W |           | S |-----|
         -->|   |     -->|   |-----      |   |     |
        |   +---+    |   +---+ 4,5 |     +---+     |
        |     |      |    | |      |               |
        |     |      |   1| |3     |     +---+     |
        |     |      |    | |      |     |   |     |
        |     |       ----  |       ---->| F |-----
        |     |             |            |   |
        |     |             |            +---+
         -------------------
              |
              |
              V
             End
Data connection
There are basic points:
DTP is the Data Transfer Process. PI is the Protocol Interpreter.
                                            -------------
                                            |/---------\|
                                            ||   User  ||    --------
                                            ||Interface|<--->| User |
                                            |\----^----/|    --------
                  ----------                |     |     |
                  |/------\|  FTP Commands  |/----V----\|
                  ||Server|<---------------->|   User  ||
                  ||  PI  ||   FTP Replies  ||    PI   ||
                  |\--^---/|                |\----^----/|
                  |   |    |                |     |     |
      --------    |/--V---\|      Data      |/----V----\|    --------
      | File |<--->|Server|<---------------->|  User   |<--->| File |
      |System|    || DTP  ||   Connection   ||   DTP   ||    |System|
      --------    |\------/|                |\---------/|    --------
                  ----------                -------------

                  Server-FTP                   USER-FTP

      NOTES: 1. The data connection may be used in either direction.
             2. The data connection need not exist all of the time.

                      Figure 1  Model for FTP Use
An FTP session can be divided into 3 phases:
USER <SP> <username> <CRLF>
(230, 530, 500, 501, 421, 331, 332)
The argument field is a Telnet string identifying the user. The user identification is that which is required by the server for access to its file system. This command will normally be the first command transmitted by the user after the control connections are made (some servers may require this). Additional identification information in the form of a password and/or an account command may also be required by some servers. Servers may allow a new USER command to be entered at any point in order to change the access control and/or accounting information. This has the effect of flushing any user, password, and account information already supplied and beginning the login sequence again. All transfer parameters are unchanged and any file transfer in progress is completed under the old access control parameters.

PASS <SP> <password> <CRLF>
(230, 202, 530, 500, 501, 503, 421, 332)
The argument field is a Telnet string specifying the user's password. This command must be immediately preceded by the user name command, and, for some sites, completes the user's identification for access control. Since password information is quite sensitive, it is desirable in general to "mask" it or suppress typeout. It appears that the server has no foolproof way to achieve this. It is therefore the responsibility of the user-FTP process to hide the sensitive password information.

SYST <CRLF>
(215, 500, 501, 502, 421)
This command is used to find out the type of operating system at the server. The reply shall have as its first word one of the system names listed in the current version of the Assigned Numbers document [4].
CWD <SP> <pathname> <CRLF>
(250, 500, 501, 502, 421, 530, 550)
This command allows the user to work with a different directory or dataset for file storage or retrieval without altering his login or accounting information. Transfer parameters are similarly unchanged. The argument is a pathname specifying a directory or other system dependent file group designator.

CDUP <CRLF>
(200, 500, 501, 502, 421, 530, 550)
This command is a special case of CWD, and is included to simplify the implementation of programs for transferring directory trees between operating systems having different syntaxes for naming the parent directory.

PORT <SP> <host-port> <CRLF>
(200, 500, 501, 421, 530)
The argument is a HOST-PORT specification for the data port to be used in data connection. There are defaults for both the user and server data ports, and under normal circumstances this command and its reply are not needed. If this command is used, the argument is the concatenation of a 32-bit internet host address and a 16-bit TCP port address. This address information is broken into 8-bit fields and the value of each field is transmitted as a decimal number (in character string representation). The fields are separated by commas. A port command would be:

    PORT h1,h2,h3,h4,p1,p2

where h1 is the high order 8 bits of the internet host address.

PASV <CRLF>
(227, 500, 501, 502, 421, 530)
This command requests the server-DTP to "listen" on a data port (which is not its default data port) and to wait for a connection rather than initiate one upon receipt of a transfer command. The response to this command includes the host and port address this server is listening on.

TYPE <SP> <type-code> <CRLF>
(200, 500, 501, 504, 421, 530)
The argument specifies the representation type as described in the Section on Data Representation and Storage. Several types take a second parameter.
The first parameter is denoted by a single Telnet character, as is the second Format parameter for ASCII and EBCDIC; the second parameter for local byte is a decimal integer to indicate Bytesize. The parameters are separated by a <SP> (Space, ASCII code 32). The following codes are assigned for type:

              \    /
    A - ASCII |    | N - Non-print
              |-><-| T - Telnet format effectors
    E - EBCDIC|    | C - Carriage Control (ASA)
              /    \
    I - Image

    L <byte size> - Local byte Byte size

The default representation type is ASCII Non-print. If the Format parameter is changed, and later just the first argument is changed, Format then returns to the Non-print default.

STRU <SP> <structure-code> <CRLF>
(200, 500, 501, 504, 421, 530)
The argument is a single Telnet character code specifying file structure described in the Section on Data Representation and Storage. The following codes are assigned for structure:

    F - File (no record structure)
    R - Record structure
    P - Page structure

The default structure is File.

MODE <SP> <mode-code> <CRLF>
(200, 500, 501, 504, 421, 530)
The argument is a single Telnet character code specifying the data transfer modes described in the Section on Transmission Modes. The following codes are assigned for transfer modes:

    S - Stream
    B - Block
    C - Compressed

The default transfer mode is Stream.

RETR <SP> <pathname> <CRLF>
(125, 150, (110), 226, 250, 425, 426, 451, 450, 550, 500, 501, 421, 530)
This command causes the server-DTP to transfer a copy of the file, specified in the pathname, to the server- or user-DTP at the other end of the data connection. The status and contents of the file at the server site shall be unaffected.

STOR <SP> <pathname> <CRLF>
(125, 150, (110), 226, 250, 425, 426, 451, 551, 552, 532, 450, 452, 553, 500, 501, 421, 530)
This command causes the server-DTP to accept the data transferred via the data connection and to store the data as a file at the server site.
If the file specified in the pathname exists at the server site, then its contents shall be replaced by the data being transferred. A new file is created at the server site if the file specified in the pathname does not already exist.

APPE <SP> <pathname> <CRLF>
(125, 150, (110), 226, 250, 425, 426, 451, 551, 552, 532, 450, 550, 452, 553, 500, 501, 502, 421, 530)
This command causes the server-DTP to accept the data transferred via the data connection and to store the data in a file at the server site. If the file specified in the pathname exists at the server site, then the data shall be appended to that file; otherwise the file specified in the pathname shall be created at the server site.

ABOR <CRLF>
(225, 226, 500, 501, 502, 421)
This command tells the server to abort the previous FTP service command and any associated transfer of data. The abort command may require "special action", as discussed in the Section on FTP Commands, to force recognition by the server. No action is to be taken if the previous command has been completed (including data transfer). The control connection is not to be closed by the server, but the data connection must be closed.

There are two cases for the server upon receipt of this command: (1) the FTP service command was already completed, or (2) the FTP service command is still in progress. In the first case, the server closes the data connection (if it is open) and responds with a 226 reply, indicating that the abort command was successfully processed. In the second case, the server aborts the FTP service in progress and closes the data connection, returning a 426 reply to indicate that the service request terminated abnormally. The server then sends a 226 reply, indicating that the abort command was successfully processed.
MKD <SP> <pathname> <CRLF>
(257, 500, 501, 502, 421, 530, 550)
This command causes the directory specified in the pathname to be created as a directory (if the pathname is absolute) or as a subdirectory of the current working directory (if the pathname is relative). See Appendix II.

PWD <CRLF>
(257, 500, 501, 502, 421, 550)
This command causes the name of the current working directory to be returned in the reply. See Appendix II.

LIST [<SP> <pathname>] <CRLF>
(125, 150, 226, 250, 425, 426, 451, 450, 500, 501, 502, 421, 530)
This command causes a list to be sent from the server to the passive DTP. If the pathname specifies a directory or other group of files, the server should transfer a list of files in the specified directory. If the pathname specifies a file then the server should send current information on the file. A null argument implies the user's current working or default directory. The data transfer is over the data connection in type ASCII or type EBCDIC. (The user must ensure that the TYPE is appropriately ASCII or EBCDIC). Since the information on a file may vary widely from system to system, this information may be hard to use automatically in a program, but may be quite useful to a human user.

NOOP <CRLF>
(200, 500, 421)
This command does not affect any parameters or previously entered commands. It specifies no action other than that the server send an OK reply.
QUIT <CRLF>
(221, 500)
This command terminates a USER and if file transfer is not in progress, the server closes the control connection. If file transfer is in progress, the connection will remain open for result response and the server will then close it. If the user-process is transferring files for several USERs but does not wish to close and then reopen connections for each, then the REIN command should be used instead of QUIT. An unexpected close on the control connection will cause the server to take the effective action of an abort (ABOR) and a logout (QUIT).
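Every command above has the same shape: a verb, an optional <SP> and argument, then <CRLF>. A minimal sketch of splitting a received line accordingly, with hypothetical names (Command, parseCommand). The verb is upper-cased because FTP command codes are not case sensitive.

```cpp
#include <cctype>
#include <string>

struct Command {
    std::string verb;      // upper-cased, e.g. "USER"
    std::string argument;  // empty when the command has none
};

Command parseCommand(std::string line)
{
    // Drop the trailing <CRLF> (or a lone LF) if present.
    while (!line.empty() &&
           (line[line.size() - 1] == '\n' || line[line.size() - 1] == '\r'))
        line.erase(line.size() - 1);

    Command cmd;
    const std::string::size_type sp = line.find(' ');
    cmd.verb = line.substr(0, sp);  // whole line when there is no space
    if (sp != std::string::npos)
        cmd.argument = line.substr(sp + 1);  // may itself contain spaces
    for (std::string::size_type i = 0; i < cmd.verb.size(); ++i)
        cmd.verb[i] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(cmd.verb[i])));
    return cmd;
}
```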
REPLY-CODES
Positive Preliminary Reply: The requested action is being initiated; expect another reply before proceeding with a new command. (The user-process sending another command before the completion reply would be in violation of protocol; but server-FTP processes should queue any commands that arrive while a preceding command is in progress.) This type of reply can be used to indicate that the command was accepted and the user-process may now pay attention to the data connections, for implementations where simultaneous monitoring is difficult. The server-FTP process may send at most one 1yz reply per command.
110 Restart marker reply. In this case, the text is exact and not left to the particular implementation; it must read: MARK yyyy = mmmm Where yyyy is User-process data stream marker, and mmmm server's equivalent marker (note the spaces between markers and "=").
120 Service ready in nnn minutes.
125 Data connection already open; transfer starting.
150 File status okay; about to open data connection.
Positive Completion Reply: The requested action has been successfully completed. A new request may be initiated.
200 Command okay.
202 Command not implemented, superfluous at this site.
211 System status, or system help reply.
212 Directory status.
213 File status.
214 Help message. On how to use the server or the meaning of a particular non-standard command. This reply is useful only to the human user.
215 NAME system type. Where NAME is an official system name from the list in the Assigned Numbers document.
220 Service ready for new user.
221 Service closing control connection. Logged out if appropriate.
225 Data connection open; no transfer in progress.
226 Closing data connection. Requested file action successful (for example, file transfer or file abort).
227 Entering Passive Mode (h1,h2,h3,h4,p1,p2).
230 User logged in, proceed.
250 Requested file action okay, completed.
257 "PATHNAME" created.
Positive Intermediate reply: The command has been accepted, but the requested action is being held in abeyance, pending receipt of further information. The user should send another command specifying this information. This reply is used in command sequence groups.
331 User name okay, need password.
332 Need account for login.
350 Requested file action pending further information.
Transient Negative Completion reply: The command was not accepted and the requested action did not take place, but the error condition is temporary and the action may be requested again. The user should return to the beginning of the command sequence, if any. It is difficult to assign a meaning to "transient", particularly when two distinct sites (Server- and User-processes) have to agree on the interpretation. Each reply in the 4yz category might have a slightly different time value, but the intent is that the user-process is encouraged to try again. A rule of thumb in determining if a reply fits into the 4yz or the 5yz (Permanent Negative) category is that replies are 4yz if the commands can be repeated without any change in command form or in properties of the User or Server (e.g., the command is spelled the same with the same arguments used; the user does not change his file access or user name; the server does not put up a new implementation.)
421 Service not available, closing control connection. This may be a reply to any command if the service knows it must shut down.
425 Can't open data connection.
426 Connection closed; transfer aborted.
450 Requested file action not taken. File unavailable (e.g., file busy).
451 Requested action aborted: local error in processing.
452 Requested action not taken. Insufficient storage space in system.
Permanent Negative Completion reply: The command was not accepted and the requested action did not take place. The User-process is discouraged from repeating the exact request (in the same sequence). Even some "permanent" error conditions can be corrected, so the human user may want to direct his User-process to reinitiate the command sequence by direct action at some point in the future (e.g., after the spelling has been changed, or the user has altered his directory status.)
500 Syntax error, command unrecognized. This may include errors such as command line too long.
501 Syntax error in parameters or arguments.
502 Command not implemented.
503 Bad sequence of commands.
504 Command not implemented for that parameter.
530 Not logged in.
532 Need account for storing files.
550 Requested action not taken. File unavailable (e.g., file not found, no access).
551 Requested action aborted: page type unknown.
552 Requested file action aborted. Exceeded storage allocation (for current directory or dataset).
553 Requested action not taken. File name not allowed.
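The five reply groups above are determined by the first digit of the code. A small sketch of classifying a code and formatting a single-line reply, with hypothetical names (ReplyClass, classify, makeReply). Multi-line replies are not covered here.

```cpp
#include <string>

enum ReplyClass {
    PositivePreliminary = 1, PositiveCompletion = 2, PositiveIntermediate = 3,
    TransientNegative = 4, PermanentNegative = 5
};

// The first digit of a three-digit reply code selects its class.
ReplyClass classify(int code)
{
    return static_cast<ReplyClass>(code / 100);
}

// Builds a single-line reply: code, <SP>, text, <CRLF>.
std::string makeReply(int code, const std::string& text)
{
    return std::to_string(code) + " " + text + "\r\n";
}
```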
Ivan Yurlagin,