nfsv4 Provider
The nfsv4 provider provides probes for tracing NFS v4 server activity.
This is a place holder. Contribute to the DTrace community by writing this chapter.
Arguments
All NFS operation probes have the first argument in common:
args[0] conninfo_t * socket connection information
The conninfo_t structure is already used by the iSCSI target provider (iscsi), and is intended for use by all provider which are providing some higher level protocol (e.g. iscsi, nfs, http, ftp).
typedef struct conninfo { string ci_local; /* local host address */ string ci_remote; /* remote host address */ string ci_protocol; /* protocol (ipv4, ipv6, etc) */ } conninfo_t;
Operation probes have their second argument in common:
args[1] nfsv4opinfo_t * NFS v4 operation properties typedef struct nfsv4opinfo { string noi_curpath; /* current file handle path (if any) */ cred_t *noi_cred; /* credentials */ uint64_t noi_xid; /* transaction ID */ } nfsv4opinfo_t;
Callback operation probes have their second argument in common:
args[1] nfsv4cbinfo_t * NFS v4 callback properties typedef struct nfsv4cbinfo { string nci_curpath; /* file handle path (if any) */ } nfsv4cbinfo_t;
Probes
Below is a list of the top level operation probes along with the specific argument for each whose type is defined by the NFS v4 specification:
probe name | args[2] |
---|---|
nfsv4:::compound-op-start | COMPOUND4args * |
nfsv4:::compound-op-done | COMPOUND4res * |
Below is a list of operation probes along with the specific argument for each whose type is defined by the NFS v4 specification:
probe name | args[2] |
---|---|
nfsv4:::op-access-start | ACCESS4args * |
nfsv4:::op-access-done | ACCESS4res * |
nfsv4:::op-close-start | CLOSE4args * |
nfsv4:::op-close-done | CLOSE4res * |
nfsv4:::op-commit-start | COMMIT4args * |
nfsv4:::op-commit-done | COMMIT4res * |
nfsv4:::op-create-start | CREATE4args * |
nfsv4:::op-create-done | CREATE4res * |
nfsv4:::op-delegpurge-start | DELEGPURGE4args * |
nfsv4:::op-delegpurge-done | DELEGPURGE4res * |
nfsv4:::op-delegreturn-start | DELEGRETURN4args * |
nfsv4:::op-delegreturn-done | DELEGRETURN4res * |
nfsv4:::op-getattr-start | GETATTR4args * |
nfsv4:::op-getattr-done | GETATTR4res * |
nfsv4:::op-getfh-start | GETFH4args * |
nfsv4:::op-getfh-done | GETFH4res * |
nfsv4:::op-link-start | LINK4args * |
nfsv4:::op-link-done | LINK4res * |
nfsv4:::op-lock-start | LOCK4args * |
nfsv4:::op-lock-done | LOCK4res * |
nfsv4:::op-lockt-start | LOCKT4args * |
nfsv4:::op-lockt-done | LOCKT4res * |
nfsv4:::op-locku-start | LOCKU4args * |
nfsv4:::op-locku-done | LOCKU4res * |
nfsv4:::op-lookup-start | LOOKUP4args * |
nfsv4:::op-lookup-done | LOOKUP4res * |
nfsv4:::op-lookupp-start | LOOKUPP4args * |
nfsv4:::op-lookupp-done | LOOKUPP4res * |
nfsv4:::op-nverify-start | NVERIFY4args * |
nfsv4:::op-nverify-done | NVERIFY4res * |
nfsv4:::op-open-start | OPEN4args * |
nfsv4:::op-open-done | OPEN4res * |
nfsv4:::op-open-confirm-start | OPEN_CONFIRM4args * |
nfsv4:::op-open-confirm-done | OPEN_CONFIRM4res * |
nfsv4:::op-open-downgrade-start | OPEN_DOWNGRADE4args * |
nfsv4:::op-open-downgrade-done | OPEN_DOWNGRADE4res * |
nfsv4:::op-openattr-start | OPENATTR4args * |
nfsv4:::op-openattr-done | OPENATTR4res * |
nfsv4:::op-putfh-start | PUTFH4args * |
nfsv4:::op-putfh-done | PUTFH4res * |
nfsv4:::op-putpubfh-start | PUTPUBFH4args * |
nfsv4:::op-putpubfh-done | PUTPUBFH4res * |
nfsv4:::op-putrootfh-start | PUTROOTFH4args * |
nfsv4:::op-putrootfh-done | PUTROOTFH4res * |
nfsv4:::op-read-start | READ4args * |
nfsv4:::op-read-done | READ4res * |
nfsv4:::op-readdir-start | READDIR4args * |
nfsv4:::op-readdir-done | READDIR4res * |
nfsv4:::op-readlink-start | READLINK4args * |
nfsv4:::op-readlink-done | READLINK4res * |
nfsv4:::op-release-lockowner-start | RELEASE_LOCKOWNER4args * |
nfsv4:::op-release-lockowner-done | RELEASE_LOCKOWNER4res * |
nfsv4:::op-remove-start | REMOVE4args * |
nfsv4:::op-remove-done | REMOVE4res * |
nfsv4:::op-rename-start | RENAME4args * |
nfsv4:::op-rename-done | RENAME4res * |
nfsv4:::op-renew-start | RENEW4args * |
nfsv4:::op-renew-done | RENEW4res * |
nfsv4:::op-restorefh-start | <none> |
nfsv4:::op-restorefh-done | <none> |
nfsv4:::op-savefh-start | SAVEFH4args * |
nfsv4:::op-savefh-done | SAVEFH4res * |
nfsv4:::op-secinfo-start | SECINFO4args * |
nfsv4:::op-secinfo-done | SECINFO4res * |
nfsv4:::op-setattr-start | SETATTR4args * |
nfsv4:::op-setattr-done | SETATTR4res * |
nfsv4:::op-setclientid-start | SETCLIENTID4args * |
nfsv4:::op-setclientid-done | SETCLIENTID4res * |
nfsv4:::op-setclientid-confirm-start | SETCLIENTID_CONFIRM4args * |
nfsv4:::op-setclientid-confirm-done | SETCLIENTID_CONFIRM4res * |
nfsv4:::op-verify-start | VERIFY4args * |
nfsv4:::op-verify-done | VERIFY4res * |
nfsv4:::op-write-start | WRITE4args * |
nfsv4:::op-write-done | WRITE4res * |
Callback compound probes have an undefined second argument; this slot isreserved for future use.
Below is a list of the top level callback probes along with the specific argument for each whose type is defined by the NFS v4 specification:
probe name | args[2] |
---|---|
nfsv4:::compound-cb-start | CB_COMPOUND4args * |
nfsv4:::compound-cb-done | CB_COMPOUND4res * |
Below is a list of callback probes along with the specific argument for each whose type is defined by the NFS v4 specification:
probe name | args[2] |
---|---|
nfsv4:::cb-getattr-start | CB_GETATTR4args * |
nfsv4:::cb-getattr-done | CB_GETATTR4res * |
nfsv4:::cb-recall-start | CB_RECALL4args * |
nfsv4:::cb-recall-done | CB_RECALL4res * |
(Note that since the Solaris NFS v4 implementation does not yet use the 'getattr' callback, the probe will not be implemented; it is noted here in anticipation of a future implementation).
Examples
Some simple examples of nfsv4 provider usage follow.
nfsv4rwsnoop.d
This DTrace script traces NFSv4 reads and writes:
#!/usr/sbin/dtrace -s #pragma D option quiet #pragma D option switchrate=10hz dtrace:::BEGIN { printf("%-16s %-18s %2s %-8s %6s %s\n", "TIME(us)", "CLIENT", "OP", "OFFSET", "BYTES", "PATHNAME"); } nfsv4:::op-read-start { printf("%-16d %-18s %2s %-8d %6d %s\n", timestamp / 1000, args[0]->ci_remote, "R", args[2]->offset / 1024, args[2]->count, args[1]->noi_curpath); } nfsv4:::op-write-start { printf("%-16d %-18s %2s %-8d %6d %s\n", timestamp / 1000, args[0]->ci_remote, "W", args[2]->offset / 1024, args[2]->data_len, args[1]->noi_curpath); }
This output shows a few files were read, and one was written:
# ./nfsv4rwsnoop.d TIME(us) CLIENT OP OFFSET BYTES PATHNAME 156889725960 192.168.1.109 R 0 4096 /export/share/bin/nawk 156889735515 192.168.1.109 R 4 28672 /export/share/bin/nawk 156889736298 192.168.1.109 R 32 32768 /export/share/bin/nawk 156889736544 192.168.1.109 R 96 32768 /export/share/bin/nawk 156889736902 192.168.1.109 R 64 32768 /export/share/bin/nawk 156916061653 192.168.1.109 R 0 4096 /export/share/bin/ssh 156916069375 192.168.1.109 R 4 28672 /export/share/bin/ssh 156916070098 192.168.1.109 R 32 32768 /export/share/bin/ssh 156916070435 192.168.1.109 R 96 32768 /export/share/bin/ssh 156916070758 192.168.1.109 R 64 32768 /export/share/bin/ssh 156916071036 192.168.1.109 R 128 32768 /export/share/bin/ssh 156916071352 192.168.1.109 R 160 32768 /export/share/bin/ssh 156916071582 192.168.1.109 R 192 32768 /export/share/bin/ssh 156916071696 192.168.1.109 R 72 4096 /export/share/bin/ssh 156916080508 192.168.1.109 R 224 4096 /export/share/bin/ssh 156916080844 192.168.1.109 R 228 28672 /export/share/bin/ssh 156916081566 192.168.1.109 R 256 32768 /export/share/bin/ssh 156916081833 192.168.1.109 R 288 32768 /export/share/bin/ssh 156916082237 192.168.1.109 R 320 20480 /export/share/bin/ssh 156933373074 192.168.1.109 W 0 32768 /export/share/words 156933373351 192.168.1.109 W 32 32768 /export/share/words 156933373855 192.168.1.109 W 64 32768 /export/share/words 156933374185 192.168.1.109 W 96 32768 /export/share/words 156933375442 192.168.1.109 W 128 32768 /export/share/words 156933375864 192.168.1.109 W 160 32768 /export/share/words 156933376105 192.168.1.109 W 192 10055 /export/share/words
The fields printed are:
field | description |
---|---|
TIME(us) | Time of event in microseconds |
CLIENT | Remote client IP address |
OP | R == read, W == write |
OFFSET | File offset of I/O, in Kbytes |
BYTES | Bytes of I/O |
PATHNAME | Pathname of file, if known |
nfsv4ops.d
This DTrace script counts NFSv4 operations by client, printing a summary every 5 seconds:
#!/usr/sbin/dtrace -s #pragma D option quiet dtrace:::BEGIN { trace("Tracing... Interval 5 secs.\n"); } nfsv4:::op-* { @ops[args[0]->ci_remote, probename] = count(); } profile:::tick-5sec, dtrace:::END { printf("\n %-32s %-28s %8s\n", "Client", "Operation", "Count"); printa(" %-32s %-28s %@8d\n", @ops); trunc(@ops); }
The following output shows which client is sending which NFSv4 operations:
# ./nfsv4ops.d Tracing... Interval 5 secs. Client Operation Count 192.168.1.109 op-getattr-done 1 192.168.1.109 op-getattr-start 1 192.168.1.109 op-putfh-done 1 192.168.1.109 op-putfh-start 1 Client Operation Count 192.168.1.109 op-access-done 1 192.168.1.109 op-access-start 1 192.168.1.109 op-close-done 1 192.168.1.109 op-close-start 1 192.168.1.109 op-getfh-done 1 192.168.1.109 op-getfh-start 1 192.168.1.109 op-open-done 1 192.168.1.109 op-open-start 1 192.168.1.109 op-getattr-done 3 192.168.1.109 op-getattr-start 3 192.168.1.109 op-read-done 9 192.168.1.109 op-read-start 9 192.168.1.109 op-putfh-done 12 192.168.1.109 op-putfh-start 12 ^C Client Operation Count
The fields printed are:
field | description |
---|---|
Client | Remote client IP address |
Operation | NFSv4 operation, described using the nfsv4 provider probename |
Count | Operations during this interval |
nfsv4fileio.d
This DTrace script prints a summary of file read and write bytes:
#!/usr/sbin/dtrace -s #pragma D option quiet dtrace:::BEGIN { trace("Tracing... Hit Ctrl-C to end.\n"); } nfsv4:::op-read-done { @readbytes[args[1]->noi_curpath] = sum(args[2]->data_len); } nfsv4:::op-write-done { @writebytes[args[1]->noi_curpath] = sum(args[2]->count); } dtrace:::END { printf("\n%12s %12s %s\n", "Rbytes", "Wbytes", "Pathname"); printa("%@12d %@12d %s\n", @readbytes, @writebytes); }
This output shows a few files were read, and one was written:
# ./nfsv4fileio.d Tracing... Hit Ctrl-C to end. ^C Rbytes Wbytes Pathname 0 206663 /export/share/words1 24528 0 /export/share/bin/xargs 44864 0 /export/share/bin/ed 232476 0 /export/share/bin/vi
The fields printed are:
field | description |
---|---|
Rbytes | Bytes read for this pathname |
Wbytes | Bytes written to this pathname |
Pathname | Pathname of NFS file |
nfsv4rwtime.d
This DTrace script prints a summary NFSv4 read and write elapsed times (latencies), along with other details:
#!/usr/sbin/dtrace -s #pragma D option quiet inline int TOP_FILES = 10; dtrace:::BEGIN { printf("Tracing... Hit Ctrl-C to end.\n"); } nfsv4:::op-read-start, nfsv4:::op-write-start { start[args[1]->noi_xid] = timestamp; } nfsv4:::op-read-done, nfsv4:::op-write-done { this->elapsed = timestamp - start[args[1]->noi_xid]; @rw[probename == "op-read-done" ? "read" : "write"] = quantize(this->elapsed / 1000); @host[args[0]->ci_remote] = sum(this->elapsed); @file[args[1]->noi_curpath] = sum(this->elapsed); start[args[1]->noi_xid] = 0; } dtrace:::END { printf("NFSv4 read/write distributions (us):\n"); printa(@rw); printf("\nNFSv4 read/write by host (total us):\n"); normalize(@host, 1000); printa(@host); printf("\nNFSv4 read/write top %d files (total us):\n", TOP_FILES); normalize(@file, 1000); trunc(@file, TOP_FILES); printa(@file); }
This output below shows a peak in the read time distribution plot in the 64 to 127 microsecond range, and a second peak between 2 and 8 milliseconds:
# ./nfsv4rwtime.d Tracing... Hit Ctrl-C to end. ^C NFSv4 read/write distributions (us): write value ------------- Distribution ------------- count 32 | 0 64 |@@@@@@ 1 128 |@@@@@@@@@@@ 2 256 |@@@@@@@@@@@@@@@@@ 3 512 |@@@@@@ 1 1024 | 0 read value ------------- Distribution ------------- count 16 | 0 32 |@@@@ 6 64 |@@@@@@@@@@@@ 17 128 |@ 1 256 |@@ 3 512 |@ 1 1024 |@@ 3 2048 |@@@@@@@@ 12 4096 |@@@@@@@@@@ 15 8192 |@ 1 16384 | 0 NFSv4 read/write by host (total us): 192.168.1.109 148215 NFSv4 read/write top 10 files (total us): /export/share/bin/man 5020 /export/share/bin/makeuuid 5132 /export/share/bin/mc68030 5836 /export/share/bin/m4 6446 /export/share/bin/msgfmt 6669 /export/share/bin/mkmsgs 6674 /export/share/bin/mailstats 6935 /export/share/bin/mkdir 7009 /export/share/bin/mac 7693 /export/share/bin/make 27903
Other details are printed, such as total read/write latency by host, and the top 10 files by latency.
The first peak in the read distribution is likely to be NFS operations hitting the memory cache, and the second those that missed and read from disk. The writes were all fast as they are likely to written to the memory cache and returned asynchronously. Further use of DTrace can confirm these theories.
The fields from the distribution plot are:
field | description |
---|---|
value | Minimum elapsed time for this event in microseconds |
count | Number of events in this time range |
nfsv4io.d
This is a simple DTrace script to provide basic I/O details by host every 5 seconds:
#!/usr/sbin/dtrace -s #pragma D option quiet dtrace:::BEGIN { interval = 5; printf("Tracing... Interval %d secs.\n", interval); tick = interval; } nfsv4:::op-* { @ops[args[0]->ci_remote] = count(); } nfsv4:::op-read-done { @reads[args[0]->ci_remote] = count(); @readbytes[args[0]->ci_remote] = sum(args[2]->data_len); } nfsv4:::op-write-done { @writes[args[0]->ci_remote] = count(); @writebytes[args[0]->ci_remote] = sum(args[2]->count); } profile:::tick-1sec /tick-- == 0/ { normalize(@ops, interval); normalize(@reads, interval); normalize(@writes, interval); normalize(@writebytes, 1024 * interval); normalize(@readbytes, 1024 * interval); printf("\n %-32s %6s %6s %6s %6s %8s\n", "Client", "r/s", "w/s", "kr/s", "kw/s", "ops/s"); printa(" %-32s %@6d %@6d %@6d %@6d %@8d\n", @reads, @writes, @readbytes, @writebytes, @ops); trunc(@ops); trunc(@reads); trunc(@writes); trunc(@readbytes); trunc(@writebytes); tick = interval; }
This output shows 192.168.1.109 calling NFSv4 reads and writes:
# ./nfsv4io.d Tracing... Interval 5 secs. Client r/s w/s kr/s kw/s ops/s 192.168.1.109 17 1 331 40 290 Client r/s w/s kr/s kw/s ops/s 192.168.1.109 9 0 197 0 152 Client r/s w/s kr/s kw/s ops/s 192.168.1.109 16 0 269 0 363 Client r/s w/s kr/s kw/s ops/s ^C
Other details can be calculated from the output, such as average read and write size (eg, 331(kr/s) / 17(r/s) = 19.5 average kr). These could also be added to the script to be printed as columns.
The fields printed are:
field | description |
---|---|
Client | Remote client IP address |
r/s | reads per second |
w/s | writes per second |
kr/s | kilobytes read per second |
kw/s | kilobytes written per second |
ops/s | total NFSv4 operations per second (including the reads and writes) |