################################################################################
#                                                                              #
#                               NFS/RDMA README                                #
#                                                                              #
################################################################################

 Author: NetApp and Open Grid Computing
 Date: April 15, 2008

Table of Contents
~~~~~~~~~~~~~~~~~
 - Overview
 - Getting Help
 - Installation
 - Check RDMA and NFS Setup
 - NFS/RDMA Setup

Overview
~~~~~~~~

  This document describes how to install and set up the Linux NFS/RDMA client
  and server software.

  The NFS/RDMA client was first included in Linux 2.6.24. The NFS/RDMA server
  was first included in the following release, Linux 2.6.25.

  In our testing, we have obtained excellent performance results (full 10Gbit
  wire bandwidth with minimal client CPU utilization) under many workloads.
  The code passes the full Connectathon test suite and operates over both
  InfiniBand and iWARP RDMA adapters.

Getting Help
~~~~~~~~~~~~

  If you get stuck, you can ask questions on the

                nfs-rdma-devel@lists.sourceforge.net

  mailing list.

Installation
~~~~~~~~~~~~

  These instructions are a step-by-step guide to building a machine for
  use with NFS/RDMA.

  - Install an RDMA device

    Any device supported by the drivers in drivers/infiniband/hw is acceptable.

    Testing has been performed using several Mellanox-based IB cards, the
    Ammasso AMS1100 iWARP adapter, and the Chelsio cxgb3 iWARP adapter.

  - Install a Linux distribution and tools

    The first kernel release to contain both the NFS/RDMA client and server was
    Linux 2.6.25. Therefore, a distribution compatible with this and subsequent
    Linux kernel releases should be installed.

    The procedures described in this document have been tested with
    distributions from Red Hat's Fedora Project (http://fedora.redhat.com/).

  - Install nfs-utils-1.1.1 or greater on the client

    An NFS/RDMA mount point can only be obtained by using the mount.nfs
    command in nfs-utils-1.1.1 or greater. To see which version of mount.nfs
    you are using, type:

    > /sbin/mount.nfs -V

    If the version is older than 1.1.1 or the command does not exist,
    you will need to install the latest version of nfs-utils.

    Download the latest package from:

    http://www.kernel.org/pub/linux/utils/nfs

    Uncompress the package and follow the installation instructions.

    If you will not be using GSS and NFSv4, the installation process
    can be simplified by disabling these features when running configure:

    > ./configure --disable-gss --disable-nfsv4

    For more information on this see the package's README and INSTALL files.

    After building the nfs-utils package, there will be a mount.nfs binary in
    the utils/mount directory. This binary can be used to initiate NFS v2, v3,
    or v4 mounts. To initiate a v4 mount, the binary must be called mount.nfs4.
    The standard technique is to create a symlink called mount.nfs4 that points
    to mount.nfs.

    NOTE: mount.nfs, and therefore nfs-utils-1.1.1 or greater, is only needed
    on the NFS client machine. You do not need this specific version of
    nfs-utils on the server. Furthermore, only the mount.nfs command from
    nfs-utils-1.1.1 is needed on the client.
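    The version check described above can be scripted. The following is a
    minimal sketch, not part of nfs-utils: the function names extract_version,
    version_ge, and mount_nfs_version_ok are hypothetical, and the sketch
    assumes that "mount.nfs -V" prints a dotted version number somewhere in
    its output.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from nfs-utils.

# Print the first dotted numeric version (e.g. 1.1.1) found in string $1.
extract_version() {
    printf '%s\n' "$1" |
        sed -n 's/^[^0-9]*\([0-9][0-9]*\(\.[0-9][0-9]*\)*\).*/\1/p' |
        head -n 1
}

# Return 0 if dotted version $1 is greater than or equal to $2.
# Fields are compared numerically, left to right; missing fields count as 0.
version_ge() {
    _a=$1
    _b=$2
    while [ -n "$_a" ] || [ -n "$_b" ]; do
        _fa=${_a%%.*}
        _fb=${_b%%.*}
        if [ "$_a" = "$_fa" ]; then _a=; else _a=${_a#*.}; fi
        if [ "$_b" = "$_fb" ]; then _b=; else _b=${_b#*.}; fi
        [ "${_fa:-0}" -gt "${_fb:-0}" ] && return 0
        [ "${_fa:-0}" -lt "${_fb:-0}" ] && return 1
    done
    return 0
}

# Return 0 if the mount.nfs binary at $1 (default /sbin/mount.nfs) reports
# version 1.1.1 or later. Assumes "-V" prints a dotted version number.
mount_nfs_version_ok() {
    _bin=${1:-/sbin/mount.nfs}
    [ -x "$_bin" ] || return 1
    _ver=$(extract_version "$("$_bin" -V 2>/dev/null)")
    [ -n "$_ver" ] && version_ge "$_ver" 1.1.1
}
```

    After sourcing these functions, "mount_nfs_version_ok || echo upgrade"
    reports whether a newer nfs-utils is needed. Version strings with
    non-numeric fields are outside what this sketch handles.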

  - Install a Linux kernel with NFS/RDMA

    The NFS/RDMA client and server are both included in the mainline Linux
    kernel version 2.6.25 and later. This and other versions of the 2.6 Linux
    kernel can be found at:

    ftp://ftp.kernel.org/pub/linux/kernel/v2.6/

    Download the sources and place them in an appropriate location.

  - Configure the RDMA stack

    Make sure your kernel configuration has RDMA support enabled. Under
    Device Drivers -> InfiniBand support, update the kernel configuration
    to enable InfiniBand support [NOTE: the option name is misleading. Enabling
    InfiniBand support is required for all RDMA devices (IB, iWARP, etc.)].

    Enable the appropriate IB HCA support (mlx4, mthca, ehca, ipath, etc.) or
    iWARP adapter support (amso, cxgb3, etc.).

    If you are using InfiniBand, be sure to enable IP-over-InfiniBand support.

  - Configure the NFS client and server

    Your kernel configuration must also have NFS file system support and/or
    NFS server support enabled. These and other NFS related configuration
    options can be found under File Systems -> Network File Systems.

  - Build, install, reboot

    The NFS/RDMA code will be enabled automatically if NFS and RDMA
    are turned on. The NFS/RDMA client and server are configured via the hidden
    SUNRPC_XPRT_RDMA config option that depends on SUNRPC and INFINIBAND. The
    value of SUNRPC_XPRT_RDMA will be:

     - N if either SUNRPC or INFINIBAND is N; in this case the NFS/RDMA client
       and server will not be built
     - M if both SUNRPC and INFINIBAND are on (M or Y) and at least one is M;
       in this case the NFS/RDMA client and server will be built as modules
     - Y if both SUNRPC and INFINIBAND are Y; in this case the NFS/RDMA client
       and server will be built into the kernel

    Therefore, if you have followed the steps above and turned on NFS and RDMA,
    the NFS/RDMA client and server will be built.

    Build a new kernel, install it, boot it.
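    The three rules for SUNRPC_XPRT_RDMA can be checked against a kernel
    .config before building. This is only an illustrative sketch: the helper
    names config_value and xprt_rdma_value are hypothetical, and the kernel
    build system remains the authority on the final value.

```shell
#!/bin/sh
# Hypothetical helpers that mirror the three rules listed above.

# Print the value (y, m, or n) of CONFIG_<$2> in kernel config file $1.
# Options that are unset or commented out are reported as n.
config_value() {
    _val=$(sed -n "s/^CONFIG_$2=\([ym]\)\$/\1/p" "$1" | head -n 1)
    echo "${_val:-n}"
}

# Print the value SUNRPC_XPRT_RDMA takes for SUNRPC=$1 and INFINIBAND=$2.
xprt_rdma_value() {
    case "$1$2" in
        yy)       echo y ;;   # both built in
        ym|my|mm) echo m ;;   # both on, at least one modular
        *)        echo n ;;   # at least one is off
    esac
}
```

    For example, from the top of the kernel source tree:

    > xprt_rdma_value "$(config_value .config SUNRPC)" "$(config_value .config INFINIBAND)"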

Check RDMA and NFS Setup
~~~~~~~~~~~~~~~~~~~~~~~~

  Before configuring the NFS/RDMA software, it is a good idea to test
  your new kernel to ensure that the kernel is working correctly.
  In particular, it is a good idea to verify that the RDMA stack
  is functioning as expected and standard NFS over TCP/IP and/or UDP/IP
  is working properly.

  - Check RDMA Setup

    If you built the RDMA components as modules, load them at
    this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel
    card:

    > modprobe ib_mthca
    > modprobe ib_ipoib

    If you are using InfiniBand, make sure there is a Subnet Manager (SM)
    running on the network. If your IB switch has an embedded SM, you can
    use it. Otherwise, you will need to run an SM, such as OpenSM, on one
    of your end nodes.

    If an SM is running on your network, you should see the following:

    > cat /sys/class/infiniband/driverX/ports/1/state
    4: ACTIVE

    where driverX is mthca0, ipath5, ehca3, etc.
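    On hosts with several adapters or ports, the state check above can be
    looped over every port. The sketch below is one possible way to do this;
    the function name inactive_ib_ports is hypothetical, and the optional
    argument exists only so that the function can be pointed at a directory
    other than /sys/class/infiniband.

```shell
#!/bin/sh
# Hypothetical helper: list every port whose state is not "4: ACTIVE".
# $1 is the sysfs directory to scan (default /sys/class/infiniband).
# Prints nothing when all ports are active or when no device is present.
inactive_ib_ports() {
    _root=${1:-/sys/class/infiniband}
    for _f in "$_root"/*/ports/*/state; do
        [ -r "$_f" ] || continue
        _state=$(cat "$_f")
        [ "$_state" = "4: ACTIVE" ] || echo "$_f: $_state"
    done
}
```

    Empty output does not distinguish "all ports active" from "no RDMA device
    found", so confirm that the directory is populated before relying on it.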

    To further test the InfiniBand software stack, use IPoIB (this
    assumes you have two IB hosts named host1 and host2):

    host1> ifconfig ib0 a.b.c.x
    host2> ifconfig ib0 a.b.c.y
    host1> ping a.b.c.y
    host2> ping a.b.c.x

    For other device types, follow the appropriate procedures.

  - Check NFS Setup

    For the NFS components enabled above (client and/or server),
    test their functionality over standard Ethernet using TCP/IP or UDP/IP.

NFS/RDMA Setup
~~~~~~~~~~~~~~

  We recommend that you use two machines, one to act as the client and
  one to act as the server.

  One time configuration:

  - On the server system, configure the /etc/exports file and
    start the NFS/RDMA server.

    Exports entries with the following formats have been tested:

    /vol0   192.168.0.47(fsid=0,rw,async,insecure,no_root_squash)
    /vol0   192.168.0.0/255.255.255.0(fsid=0,rw,async,insecure,no_root_squash)

    Each IP address is the client's IPoIB address for an InfiniBand HCA or the
    client's iWARP address for an RNIC.

    NOTE: The "insecure" option must be used because the NFS/RDMA client does
    not use a reserved port.
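    Because a missing "insecure" option is easy to overlook, a quick scan of
    the exports file can help. The following sketch uses a hypothetical
    function name, exports_missing_insecure, and checks each line as a whole;
    it does not parse per-client option lists, so treat its output as a hint
    only.

```shell
#!/bin/sh
# Hypothetical helper: print export lines that lack the "insecure" option.
# $1 is the exports file to scan (default /etc/exports).
# Comment lines and blank lines are skipped.
exports_missing_insecure() {
    awk '!/^[ \t]*(#|$)/ && !/[(,]insecure[,)]/' "${1:-/etc/exports}"
}
```

    Any line printed by "exports_missing_insecure" is a candidate for adding
    the option before attempting an NFS/RDMA mount.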

  Each time a machine boots:

  - Load and configure the RDMA drivers

    For InfiniBand using a Mellanox adapter:

    > modprobe ib_mthca
    > modprobe ib_ipoib
    > ifconfig ib0 a.b.c.d

    NOTE: use unique addresses for the client and server

  - Start the NFS server

    If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
    kernel config), load the RDMA transport module:

    > modprobe svcrdma

    Regardless of how the server was built (module or built-in), start the
    server:

    > /etc/init.d/nfs start

    or

    > service nfs start

    Instruct the server to listen on the RDMA transport:

    > echo rdma 2050 > /proc/fs/nfsd/portlist
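    Since these server steps must be repeated on every boot, they can be
    collected in a small script. The sketch below is an assumption-laden
    example: the names run and start_nfs_rdma_server and the DRY_RUN variable
    are hypothetical, and it assumes the server transport was built as a
    module and that the distribution provides /etc/init.d/nfs.

```shell
#!/bin/sh
# Hypothetical wrapper around the server steps above.

# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Load the transport module, start the NFS server, and add an RDMA listener
# on port $1 (default 2050, the port used in this document).
# Stops at the first step that fails.
start_nfs_rdma_server() {
    _port=${1:-2050}
    run modprobe svcrdma &&
        run /etc/init.d/nfs start &&
        run sh -c "echo rdma $_port > /proc/fs/nfsd/portlist"
}
```

    Setting DRY_RUN=1 before calling start_nfs_rdma_server prints the three
    commands without executing them, which is a convenient way to review the
    sequence before running it as root.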

  - On the client system

    If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
    kernel config), load the RDMA client module:

    > modprobe xprtrdma

    Regardless of how the client was built (module or built-in), issue the
    mount.nfs command:

    > /path/to/your/mount.nfs <IPoIB-server-name-or-address>:/<export> /mnt -i -o rdma,port=2050

    To verify that the mount is using RDMA, run "cat /proc/mounts" and check the
    "proto" field for the given mount.
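    The "proto" check can also be done without reading /proc/mounts by eye.
    This is a minimal sketch with a hypothetical function name, mount_proto;
    it assumes the usual /proc/mounts layout, in which the second field is
    the mount point and the fourth field is the comma-separated option list.

```shell
#!/bin/sh
# Hypothetical helper: print the proto= value for mount point $1.
# $2 is a /proc/mounts-style file to read (default /proc/mounts).
# Prints nothing if the mount point is absent or has no proto= option.
mount_proto() {
    awk -v mp="$1" '$2 == mp {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^proto=/)
                print substr(opts[i], 7)
    }' "${2:-/proc/mounts}"
}
```

    For the mount command shown above, "mount_proto /mnt" should print "rdma"
    when the RDMA transport is in use.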

  Congratulations! You're using NFS/RDMA!