How it works
This page gives an overview of how shfs works.
The source tree
Shfs sources are split across a few directories. The shfs/Linux-2.x directory contains the kernel module code for the given kernel version, while the shfsmount/ directory is where the user-space utilities live (shfsmount itself, plus the perl/shell server code).
Where the code came from
Since writing a new filesystem module from scratch is not much fun, shfs was partially based on Florin Malita's ftpfs, with some bugs fixed (locking, memory leaks, date handling). As time went by, the ftpfs code began to vanish and now there is hardly any of it left. Some portions of the smbfs/ncpfs code (mainly the directory caching through the page cache) were assimilated from the main Linux kernel tree.
Caches
Sending a shell command to the remote host on every request from the kernel VFS layer is not a wise idea, because of the high load it generates on both sides of the ssh channel. A much better way is to use caches for some operations, such as reading directories and reading and writing files.
Read-write cache
(fcache.c)
- on file open, n pages are allocated as a simple read-write buffer
- a file offset and size are associated with the buffer
- the entire buffer is either clean (read only) or dirty (data not yet written)
- on a read request, an attempt to read the full buffer is performed (dirty data are flushed first)
- subsequent requests read data from this buffer (a hit)
- on a write request, if there is enough space in the buffer, the data are written to the buffer
- if the buffer is full or the file is closed, the entire buffer is sent to the remote peer
This brings a great performance improvement, since calling dd (to store data on the remote side) for each page generates quite a high system load. With the read-write cache, dd is only called on every nth request. You can switch this cache off with the "nocache" option when mounting the filesystem.
Directory cache
(dcache.c)
- this cache is taken almost intact from smbfs/ncpfs; it uses the plain dentry cache and the page cache to prevent rm -rf from complaining
- the time-to-live of a dentry is 30 seconds by default and can be changed at mount time (ttl option)
Readline cache
(proc.c, function sock_readln) Lines are read all at once instead of character by character. This speeds up directory lookups.
How it all works together
The figure illustrates the whole process: when a user runs the shfsmount command (or mount -t shfs) to mount a remote share, basic checks are done, a process is forked, and the user command (ssh in most cases) is executed. This command has its stdin/stdout redirected so shfsmount can use them for the connection. After the connection is established, shfsmount initializes the remote side and transfers the "server" code to it.
Although shfs is a shell filesystem, there are two different server implementations available: shell and perl. Both have the same basic functionality, although the perl code is faster and more robust. Each type has a test phase which performs all necessary checks (for perl this means the perl version, available modules, etc.).
Then shfsmount calls the mount syscall and passes the file descriptor (of ssh's stdin/stdout) to the kernel module. Shfsmount can then either exit, or (in persistent mode) wait for ssh to die and restart the connection.