Actions and Subroutines
You can use D function calls such as trace and printf to invoke two different kinds of services provided by DTrace: actions that trace data or modify state external to DTrace, and subroutines that affect only internal DTrace state. This chapter defines the actions and subroutines and describes their syntax and semantics.
Actions
Actions enable your DTrace programs to interact with the system outside of DTrace. The most common actions record data to a DTrace buffer. Other actions are available, such as stopping the current process, raising a specific signal on the current process, or ceasing tracing altogether. Some of these actions are destructive in that they change the system, albeit in a well-defined way. These actions may only be used if destructive actions have been explicitly enabled. By default, data recording actions record data to the principal buffer. For more details on the principal buffer and buffer policies, see Chapter 11, Buffers and Buffering.
Default Action
A clause can contain any number of actions and variable manipulations. If a clause is left empty, the default action is taken. The default action is to trace the enabled probe identifier (EPID) to the principal buffer. The EPID identifies a particular enabling of a particular probe with a particular predicate and actions. From the EPID, DTrace consumers can determine the probe that induced the action. Indeed, whenever any data is traced, it must be accompanied by the EPID to enable the consumer to make sense of the data. Therefore, the default action is to trace the EPID and nothing else.
Using the default action allows for simple use of dtrace(1M). For example, the following command enables all probes in the TS timeshare scheduling module with the default action:
# dtrace -m TS
The preceding command might produce output similar to the following example:
# dtrace -m TS
dtrace: description 'TS' matched 80 probes
CPU     ID                    FUNCTION:NAME
  0  12077                 ts_trapret:entry
  0  12078                ts_trapret:return
  0  12069                   ts_sleep:entry
  0  12070                  ts_sleep:return
  0  12033                  ts_setrun:entry
  0  12034                 ts_setrun:return
  0  12081                  ts_wakeup:entry
  0  12082                 ts_wakeup:return
  0  12069                   ts_sleep:entry
  0  12070                  ts_sleep:return
  0  12033                  ts_setrun:entry
  0  12034                 ts_setrun:return
  0  12069                   ts_sleep:entry
  0  12070                  ts_sleep:return
  0  12033                  ts_setrun:entry
  0  12034                 ts_setrun:return
  0  12069                   ts_sleep:entry
  0  12070                  ts_sleep:return
  0  12023                  ts_update:entry
  0  12079             ts_update_list:entry
  0  12080            ts_update_list:return
  0  12079             ts_update_list:entry
...
Data Recording Actions
The data recording actions comprise the core DTrace actions. Each of these actions records data to the principal buffer by default, but each action may also be used to record data to speculative buffers. See Chapter 11, Buffers and Buffering for more details on the principal buffer. See Chapter 13, Speculative Tracing for more details on speculative buffers. The descriptions in this section refer only to the directed buffer, indicating that data is recorded either to the principal buffer or to a speculative buffer if the action follows a speculate.
trace
void trace(expression)
The most basic action is the trace action, which takes a D expression as its argument and traces the result to the directed buffer. The following statements are examples of trace actions:
trace(execname);
trace(curlwpsinfo->pr_pri);
trace(timestamp / 1000);
trace(`lbolt);
trace("somehow managed to get here");
tracemem
void tracemem(address, size_t nbytes)
The tracemem action takes a D expression as its first argument, address, and a constant as its second argument, nbytes. tracemem copies nbytes bytes of memory, starting at the address specified by address, into the directed buffer.
The output format depends on the data traced. When dtrace determines that the data looks like an ASCII string, it prints the data as text, and the output is terminated by the first '\0'. When dtrace determines that the data is binary, it prints the data in hexadecimal form:
  0    342                      write:entry
             0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
         0: c0 de 09 c2 4a e8 27 54 dc f8 9f f1 9a 20 4b d1  ....J.'T..... K.
        10: 9c 7a 7a 85 1b 03 0a fb 3a 81 8a 1b 25 35 b3 9a  .zz.....:...%5..
        20: f1 7d e6 2b 66 6d 1c 11 f8 eb 40 7f 65 9a 25 f8  .}.+fm....@.e.%.
        30: c8 68 87 b2 6f 48 a2 a5 f3 a2 1f 46 ab 3d f9 d2  .h..oH.....F.=..
        40: 3d b8 4c c0 41 3c f7 3c cd 18 ad 0d 0d d3 1a 90  =.L.A<.<........
You can force tracemem to always use the binary format by setting the rawbytes option.
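As a minimal sketch of how tracemem is typically combined with a user buffer, the following clause dumps the first 32 bytes of each buffer that a traced process passes to write(2). The 32-byte length and the predicate are illustrative assumptions; the clause relies on copyin, described later in this chapter, to bring the user memory into scratch space.

```d
/* Sketch: dump the start of every sufficiently large write(2) buffer. */
syscall::write:entry
/pid == $target && arg2 >= 32/
{
    /* arg1 is the user buffer address; nbytes must be a constant. */
    tracemem(copyin(arg1, 32), 32);
}
```

Because the clause uses $target, it is meant to be run against a specific process, for example with the -c option shown elsewhere in this chapter.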
printf
void printf(string format, ...)
Like trace, the printf action traces D expressions. However, printf allows for elaborate printf(3C)-style formatting. Like printf(3C), the parameters consist of a format string followed by a variable number of arguments. By default, the arguments are traced to the directed buffer. The arguments are later formatted for output by dtrace(1M) according to the specified format string. For example, the first two trace examples shown in the trace section could be combined in a single printf:
printf("execname is %s; priority is %d", execname, curlwpsinfo->pr_pri);
For more information on printf, see Chapter 12, Output Formatting.
printa
void printa(aggregation)
void printa(string format, aggregation)
The printa action enables you to display and format aggregations. See Chapter 9, Aggregations for more detail on aggregations. If a format is not provided, printa only traces a directive to the DTrace consumer that the specified aggregation should be processed and displayed using the default format. If a format is provided, the aggregation will be formatted as specified. See Chapter 12, Output Formatting for a more detailed description of the printa format string.
printa only traces a directive that the aggregation should be processed by the DTrace consumer. It does not process the aggregation in the kernel. Therefore, the time between the tracing of the printa directive and the actual processing of the directive depends on the factors that affect buffer processing. These factors include the aggregation rate, the buffering policy and, if the buffering policy is switching, the rate at which buffers are switched. See Chapter 9, Aggregations and Chapter 11, Buffers and Buffering for detailed descriptions of these factors.
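The following sketch illustrates the formatted variant of printa. The aggregation name @calls and the five-second interval are arbitrary choices made for this example, not part of any interface.

```d
/* Sketch: count system calls per executable, print every five seconds. */
syscall:::entry
{
    @calls[execname] = count();
}

tick-5sec
{
    /* %s formats the key; %@8u formats the aggregated value. */
    printa("%-20s %@8u\n", @calls);
}
```

As described above, the printa in the tick-5sec clause only records a directive; the consumer formats the aggregation when it processes the buffer.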
stack
void stack(int nframes)
void stack(void)
The stack action records a kernel stack trace to the directed buffer. The kernel stack will be nframes in depth. If nframes is not provided, the number of stack frames recorded is the number specified by the stackframes option. For example:
# dtrace -n uiomove:entry'{stack()}'
CPU     ID                    FUNCTION:NAME
  0   9153                    uiomove:entry
                genunix`fop_write+0x1b
                namefs`nm_write+0x1d
                genunix`fop_write+0x1b
                genunix`write+0x1f7

  0   9153                    uiomove:entry
                genunix`fop_read+0x1b
                genunix`read+0x1d4

  0   9153                    uiomove:entry
                genunix`strread+0x394
                specfs`spec_read+0x65
                genunix`fop_read+0x1b
                genunix`read+0x1d4
   ...
The stack action is a little different from other actions in that it may also be used as the key to an aggregation:
# dtrace -n kmem_alloc:entry'{@[stack()] = count()}'
dtrace: description 'kmem_alloc:entry' matched 1 probe
^C

                rpcmod`endpnt_get+0x47c
                rpcmod`clnt_clts_kcallit_addr+0x26f
                rpcmod`clnt_clts_kcallit+0x22
                nfs`rfscall+0x350
                nfs`rfs2call+0x60
                nfs`nfs_getattr_otw+0x9e
                nfs`nfsgetattr+0x26
                nfs`nfs_getattr+0xb8
                genunix`fop_getattr+0x18
                genunix`cstat64+0x30
                genunix`cstatat64+0x4a
                genunix`lstat64+0x1c
                  1

                genunix`vfs_rlock_wait+0xc
                genunix`lookuppnvp+0x19d
                genunix`lookuppnat+0xe7
                genunix`lookupnameat+0x87
                genunix`lookupname+0x19
                genunix`chdir+0x18
                  1

                rpcmod`endpnt_get+0x6b1
                rpcmod`clnt_clts_kcallit_addr+0x26f
                rpcmod`clnt_clts_kcallit+0x22
                nfs`rfscall+0x350
                nfs`rfs2call+0x60
                nfs`nfs_getattr_otw+0x9e
                nfs`nfsgetattr+0x26
                nfs`nfs_getattr+0xb8
                genunix`fop_getattr+0x18
                genunix`cstat64+0x30
                genunix`cstatat64+0x4a
                genunix`lstat64+0x1c
                  1
   ...
ustack
void ustack(int nframes, int strsize)
void ustack(int nframes)
void ustack(void)
The ustack action records a user stack trace to the directed buffer. The user stack will be nframes in depth. If nframes is not provided, the number of stack frames recorded is the number specified by the ustackframes option. While ustack is able to determine the address of the calling frames when the probe fires, the stack frames will not be translated into symbols until the ustack action is processed at user-level by the DTrace consumer. If strsize is specified and non-zero, ustack will allocate the specified amount of string space, and use it to perform address-to-symbol translation directly from the kernel. This direct user symbol translation is currently available only for Java virtual machines, version 1.5 and higher. Java address-to-symbol translation annotates user stacks that contain Java frames with the Java class and method name. If such frames cannot be translated, the frames will appear only as hexadecimal addresses.
The following example traces a stack with no string space, and therefore no Java address-to-symbol translation:
# dtrace -n syscall::write:entry'/pid == $target/{ustack(50, 0); exit(0)}' -c "java -version"
dtrace: description 'syscall::write:entry' matched 1 probe
java version "1.5.0-beta3"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-beta3-b58)
Java HotSpot(TM) Client VM (build 1.5.0-beta3-b58, mixed mode)
dtrace: pid 5312 has exited
CPU     ID                    FUNCTION:NAME
  0     35                      write:entry
              libc.so.1`_write+0x15
              libjvm.so`__1cDhpiFwrite6FipkvI_I_+0xa8
              libjvm.so`JVM_Write+0x2f
              d0c5c946
              libjava.so`Java_java_io_FileOutputStream_writeBytes+0x2c
              cb007fcd
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb000152
              libjvm.so`__1cJJavaCallsLcall_helper6FpnJJavaValue_
                  pnMmethodHandle_pnRJavaCallArguments_
                  pnGThread__v_+0x187
              libjvm.so`__1cCosUos_exception_wrapper6FpFpnJJavaValue_
                  pnMmethodHandle_pnRJavaCallArguments_
                  pnGThread__v2468_v_+0x14
              libjvm.so`__1cJJavaCallsEcall6FpnJJavaValue_nMmethodHandle_
                  pnRJavaCallArguments_pnGThread
                  __v_+0x28
              libjvm.so`__1cRjni_invoke_static6FpnHJNIEnv__pnJJavaValue_
                  pnI_jobject_nLJNICallType_pnK_jmethodID_pnSJNI_
                  ArgumentPusher_pnGThread__v_+0x180
              libjvm.so`jni_CallStaticVoidMethod+0x10f
              java`main+0x53d
Notice that the C and C++ stack frames from the Java virtual machine are presented symbolically using C++ “mangled” symbol names, and the Java stack frames are presented only as hexadecimal addresses. The following example shows a call to ustack with a non-zero string space:
# dtrace -n syscall::write:entry'/pid == $target/{ustack(50, 500); exit(0)}' -c "java -version"
dtrace: description 'syscall::write:entry' matched 1 probe
java version "1.5.0-beta3"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-beta3-b58)
Java HotSpot(TM) Client VM (build 1.5.0-beta3-b58, mixed mode)
dtrace: pid 5308 has exited
CPU     ID                    FUNCTION:NAME
  0     35                      write:entry
              libc.so.1`_write+0x15
              libjvm.so`__1cDhpiFwrite6FipkvI_I_+0xa8
              libjvm.so`JVM_Write+0x2f
              d0c5c946
              libjava.so`Java_java_io_FileOutputStream_writeBytes+0x2c
              java/io/FileOutputStream.writeBytes
              java/io/FileOutputStream.write
              java/io/BufferedOutputStream.flushBuffer
              java/io/BufferedOutputStream.flush
              java/io/PrintStream.write
              sun/nio/cs/StreamEncoder$CharsetSE.writeBytes
              sun/nio/cs/StreamEncoder$CharsetSE.implFlushBuffer
              sun/nio/cs/StreamEncoder.flushBuffer
              java/io/OutputStreamWriter.flushBuffer
              java/io/PrintStream.write
              java/io/PrintStream.print
              java/io/PrintStream.println
              sun/misc/Version.print
              sun/misc/Version.print
              StubRoutines (1)
              libjvm.so`__1cJJavaCallsLcall_helper6FpnJJavaValue_
                  pnMmethodHandle_pnRJavaCallArguments_pnGThread
                  __v_+0x187
              libjvm.so`__1cCosUos_exception_wrapper6FpFpnJJavaValue_
                  pnMmethodHandle_pnRJavaCallArguments_pnGThread
                  __v2468_v_+0x14
              libjvm.so`__1cJJavaCallsEcall6FpnJJavaValue_nMmethodHandle
                  _pnRJavaCallArguments_pnGThread__v_+0x28
              libjvm.so`__1cRjni_invoke_static6FpnHJNIEnv__pnJJavaValue_pnI
                  _jobject_nLJNICallType_pnK_jmethodID_pnSJNI
                  _ArgumentPusher_pnGThread__v_+0x180
              libjvm.so`jni_CallStaticVoidMethod+0x10f
              java`main+0x53d
              8051b9a
The above example output demonstrates symbolic stack frame information for Java stack frames. There are still some hexadecimal frames in this output because some functions are static and do not have entries in the application symbol table. Translation is not possible for these frames.
The ustack symbol translation for non-Java frames occurs after the stack data is recorded. Therefore, the corresponding user process might exit before symbol translation can be performed, making stack frame translation impossible. If the user process exits before symbol translation is performed, dtrace will emit a warning message, followed by the hexadecimal stack frames, as shown in the following example:
dtrace: failed to grab process 100941: no such process
              c7b834d4
              c7bca85d
              c7bca1a4
              c7bd4374
              c7bc2628
              8047efc
Techniques for mitigating this problem are described in Chapter 33, User Process Tracing.
Finally, because the postmortem DTrace debugger commands cannot perform the frame translation, using ustack with a ring buffer policy always results in raw ustack data.
The following D program shows an example of ustack that leaves strsize unspecified:
syscall::brk:entry
/execname == $$1/
{
    @[ustack(40)] = count();
}
To run this example for the Netscape web browser, .netscape.bin in default Solaris installations, use the following command:
# dtrace -s brk.d .netscape.bin
dtrace: description 'syscall::brk:entry' matched 1 probe
^C
              libc.so.1`_brk_unlocked+0xc
              88143f6
              88146cd
              .netscape.bin`unlocked_malloc+0x3e
              .netscape.bin`unlocked_calloc+0x22
              .netscape.bin`calloc+0x26
              .netscape.bin`_IMGCB_NewPixmap+0x149
              .netscape.bin`il_size+0x2f7
              .netscape.bin`il_jpeg_write+0xde
              8440c19
              .netscape.bin`il_first_write+0x16b
              8394670
              83928e5
              .netscape.bin`NET_ProcessHTTP+0xa6
              .netscape.bin`NET_ProcessNet+0x49a
              827b323
              libXt.so.4`XtAppProcessEvent+0x38f
              .netscape.bin`fe_EventLoop+0x190
              .netscape.bin`main+0x1875
                1

              libc.so.1`_brk_unlocked+0xc
              libc.so.1`sbrk+0x29
              88143df
              88146cd
              .netscape.bin`unlocked_malloc+0x3e
              .netscape.bin`unlocked_calloc+0x22
              .netscape.bin`calloc+0x26
              .netscape.bin`_IMGCB_NewPixmap+0x149
              .netscape.bin`il_size+0x2f7
              .netscape.bin`il_jpeg_write+0xde
              8440c19
              .netscape.bin`il_first_write+0x16b
              8394670
              83928e5
              .netscape.bin`NET_ProcessHTTP+0xa6
              .netscape.bin`NET_ProcessNet+0x49a
              827b323
              libXt.so.4`XtAppProcessEvent+0x38f
              .netscape.bin`fe_EventLoop+0x190
              .netscape.bin`main+0x1875
                1
   ...
jstack
void jstack(int nframes, int strsize)
void jstack(int nframes)
void jstack(void)
jstack is an alias for ustack that uses, for the number of stack frames, the value specified by the jstackframes option and, for the string space size, the value specified by the jstackstrsize option. By default, jstackstrsize is set to a non-zero value. As a result, use of jstack will result in a stack with in situ Java frame translation.
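A minimal sketch of jstack in a script, mirroring the ustack examples above, might look like the following. It assumes the script is run against a Java process so that $target is set.

```d
/* Sketch: record one user stack with Java frame translation, then stop. */
syscall::write:entry
/pid == $target/
{
    /* Frame count and string space come from jstackframes and jstackstrsize. */
    jstack();
    exit(0);
}
```

Unlike the ustack(50, 500) example, no explicit string space is passed here; the translation depends on the default value of the jstackstrsize option.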
uaddr
_usymaddr uaddr(uintptr_t address)
uaddr prints the symbol for a specified address, including the hexadecimal offset. This allows for the same symbol resolution that ustack provides.
# dtrace -c date -n 'pid$target::main:entry{ uaddr(0x8062578); }'
dtrace: description 'pid$target::main:entry' matched 1 probe
Sun Feb 3 20:58:03 PST 2008
dtrace: pid 105537 has exited
CPU     ID                    FUNCTION:NAME
  0  59934                       main:entry   date`clock_val
In the above example, a call to uaddr(0x8062578) causes date`clock_val to be printed.
The example below shows the hexadecimal offsets being printed.
demo$ sudo dtrace -n "pid\$target::main:{uaddr(uregs[R_PC])}" -c nmap
dtrace: description 'pid$target::main:' matched 946 probes
[output cut]
dtrace: pid 2229 has exited
CPU     ID                    FUNCTION:NAME
  1  20165                       main:entry   nmap`main
  1  20166                           main:0   nmap`main
  1  20167                           main:1   nmap`main+0x1
  1  20168                           main:3   nmap`main+0x3
  1  20169                           main:4   nmap`main+0x4
  1  20170                           main:5   nmap`main+0x5
  1  20171                           main:6   nmap`main+0x6
  1  20172                           main:b   nmap`main+0xb
  1  20173                           main:c   nmap`main+0xc
  1  20174                          main:12   nmap`main+0x12
  1  20175                          main:15   nmap`main+0x15
  1  20176                          main:1c   nmap`main+0x1c
  1  20177                          main:23   nmap`main+0x23
  1  20178                          main:2b   nmap`main+0x2b
  1  20179                          main:2e   nmap`main+0x2e
  1  20180                          main:33   nmap`main+0x33
...
usym
_usymaddr usym(uintptr_t address)
usym prints the symbol for a specified address. This is analogous to how uaddr works, but without the hexadecimal offsets. For the same address, the two actions produce output such as the following:
uaddr: date`clock_val+0x1
usym:  date`clock_val
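Because usym drops the offset, it lends itself to grouping samples by function. The following is a sketch only: it assumes the profile provider's convention that arg1 holds the user-level program counter, and the 997 Hz sampling rate is an arbitrary choice.

```d
/* Sketch: sample the target process and count samples per user symbol. */
profile-997
/pid == $target && arg1 != 0/
{
    /* arg1 is assumed to be the user PC; usym() maps it to module`symbol. */
    @samples[usym(arg1)] = count();
}
```

Using uaddr as the key instead would split each function into one entry per sampled offset.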
Destructive Actions
Some DTrace actions are destructive in that they change the state of the system in some well-defined way. Destructive actions may not be used unless they have been explicitly enabled. When using dtrace(1M), you can enable destructive actions using the -w option. If an attempt is made to enable destructive actions in dtrace(1M) without explicitly enabling them, dtrace will fail with a message similar to the following example:
dtrace: failed to enable 'syscall': destructive actions not allowed
An administrator may choose to disable destructive actions system-wide by setting the kernel tunable dtrace_destructive_disallow to 1. This may be done in a number of ways including rebooting after adding the following line to /etc/system:
set dtrace:dtrace_destructive_disallow = 1
It may be set temporarily on a running system using mdb(1):
# echo "dtrace_destructive_disallow/W 1" | mdb -kw
dtrace_destructive_disallow:    0x0             =       0x1
Process Destructive Actions
Some destructive actions are destructive only to a particular process. These actions are available to users with the dtrace_proc or dtrace_user privileges. See Chapter 35, Security for details on DTrace security privileges.
stop
void stop(void)
The stop action forces the process that fires the enabled probe to stop when it next leaves the kernel, as if stopped by a proc(4) action. The prun(1) utility may be used to resume a process that has been stopped by the stop action. The stop action can be used to stop a process at any DTrace probe point. This action can be used to capture a program in a particular state that would be difficult to achieve with a simple breakpoint, and then attach a traditional debugger like mdb(1) to the process. You can also use the gcore(1) utility to save the state of a stopped process in a core file for later analysis.
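The following sketch shows one way the stop action might be used to freeze a process at a point of interest. The choice of the write(2) system call and the use of $$1 to name the executable are assumptions made for illustration.

```d
#pragma D option destructive
#pragma D option quiet

/* Sketch: stop the first process named $$1 that enters write(2). */
syscall::write:entry
/execname == $$1/
{
    printf("stopped pid %d (%s) in write(2)\n", pid, execname);
    stop();
    exit(0);
}
```

After the script reports the process ID, the stopped process can be examined with mdb(1), saved with gcore(1), or resumed with prun(1), as described above.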
raise
void raise(int signal)
The raise action sends the specified signal to the currently running process. This action is similar to using the kill(1) command to send a process a signal. The raise action can be used to send a signal at a precise point in a process's execution.
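As a hedged illustration, the clause below sends a signal to the target process the moment it issues an unusually large write(2). The 64-Kbyte threshold is arbitrary, and SIGUSR1 is assumed to be available as a named constant in the same way as the SIGINT constant used later in this chapter.

```d
#pragma D option destructive

/* Sketch: signal the target when it writes more than 64 Kbytes at once. */
syscall::write:entry
/pid == $target && arg2 > 65536/
{
    /* Delivered at this precise point in the process's execution. */
    raise(SIGUSR1);
}
```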
copyout
void copyout(void *buf, uintptr_t addr, size_t nbytes)
The copyout action copies nbytes from the buffer specified by buf to the address specified by addr in the address space of the process associated with the current thread. If the user-space address does not correspond to a valid, faulted-in page in the current address space, an error will be generated.
copyoutstr
void copyoutstr(string str, uintptr_t addr, size_t maxlen)
The copyoutstr action copies the string specified by str to the address specified by addr in the address space of the process associated with the current thread. If the user-space address does not correspond to a valid, faulted-in page in the current address space, an error will be generated. The string length is limited to the value set by the strsize option. See Chapter 16, Options and Tunables for details.
system
void system(string program, ...)
The system action causes the program specified by program to be executed as if it were given to the shell as input. The program string may contain any of the printf/printa format conversions. Arguments must be specified that match the format conversions. Refer to Chapter 12, Output Formatting for details on valid format conversions.
The following example runs the date(1) command once per second:
# dtrace -wqn tick-1sec'{system("date")}'
Tue Jul 20 11:56:26 CDT 2004
Tue Jul 20 11:56:27 CDT 2004
Tue Jul 20 11:56:28 CDT 2004
Tue Jul 20 11:56:29 CDT 2004
Tue Jul 20 11:56:30 CDT 2004
The following example shows a more elaborate use of the action, using printf conversions in the program string along with traditional filtering tools like pipes:
#pragma D option destructive
#pragma D option quiet

proc:::signal-send
/args[2] == SIGINT/
{
    printf("SIGINT sent to %s by ", args[1]->pr_fname);
    system("getent passwd %d | cut -d: -f5", uid);
}
Running the above script results in output similar to the following example:
# ./whosend.d
SIGINT sent to MozillaFirebird- by Bryan Cantrill
SIGINT sent to run-mozilla.sh by Bryan Cantrill
^C
SIGINT sent to dtrace by Bryan Cantrill
The execution of the specified command does not occur in the context of the firing probe; it occurs when the buffer containing the details of the system action is processed at user-level. How and when this processing occurs depends on the buffering policy, described in Chapter 11, Buffers and Buffering. With the default buffering policy, the buffer processing rate is specified by the switchrate option. You can see the delay inherent in system if you explicitly set switchrate to an interval longer than its one-second default, as shown in the following example:
#pragma D option quiet
#pragma D option destructive
#pragma D option switchrate=5sec

tick-1sec
/n++ < 5/
{
    printf("walltime : %Y\n", walltimestamp);
    printf("date : ");
    system("date");
    printf("\n");
}

tick-1sec
/n == 5/
{
    exit(0);
}
Running the above script results in output similar to the following example:
# dtrace -s ./time.d
walltime : 2004 Jul 20 13:26:30
date : Tue Jul 20 13:26:35 CDT 2004

walltime : 2004 Jul 20 13:26:31
date : Tue Jul 20 13:26:35 CDT 2004

walltime : 2004 Jul 20 13:26:32
date : Tue Jul 20 13:26:35 CDT 2004

walltime : 2004 Jul 20 13:26:33
date : Tue Jul 20 13:26:35 CDT 2004

walltime : 2004 Jul 20 13:26:34
date : Tue Jul 20 13:26:35 CDT 2004
Notice that the walltime values differ, but the date values are identical. This result reflects the fact that the execution of the date(1) command occurred only when the buffer was processed, not when the system action was recorded.
Kernel Destructive Actions
Some destructive actions are destructive to the entire system. These actions must obviously be used extremely carefully, as they will affect every process on the system and any other system implicitly or explicitly depending upon the affected system's network services.
breakpoint
void breakpoint(void)
The breakpoint action induces a kernel breakpoint, causing the system to stop and transfer control to the kernel debugger. The kernel debugger will emit a string denoting the DTrace probe that triggered the action. For example, if one were to do the following:
# dtrace -w -n clock:entry'{breakpoint()}'
dtrace: allowing destructive actions
dtrace: description 'clock:entry' matched 1 probe
On Solaris running on SPARC, the following message might appear on the console:
dtrace: breakpoint action at probe fbt:genunix:clock:entry (ecb 30002765700)
Type 'go' to resume
ok
On Solaris running on x86, the following message might appear on the console:
dtrace: breakpoint action at probe fbt:genunix:clock:entry (ecb d2b97060)
stopped at      int20+0xb:      ret
kmdb[0]:
The address following the probe description is the address of the enabling control block (ECB) within DTrace. You can use this address to determine more details about the probe enabling that induced the breakpoint action.
A mistake with the breakpoint action may cause it to be called far more often than intended. This behavior might in turn prevent you from even terminating the DTrace consumer that is triggering the breakpoint actions. In this situation, set the kernel tunable dtrace_destructive_disallow to 1. This setting will disallow all destructive actions on the machine.
The exact method for setting dtrace_destructive_disallow will depend on the kernel debugger that you are using. If using the OpenBoot PROM on a SPARC system, use w!:
ok 1 dtrace_destructive_disallow w!
ok
Confirm that the variable has been set using w?:
ok dtrace_destructive_disallow w?
1
ok
Continue by typing go:
ok go
If using kmdb(1) on x86 or SPARC systems, use the 4-byte write modifier (W) with the / formatting dcmd:
kmdb[0]: dtrace_destructive_disallow/W 1
dtrace_destructive_disallow:    0x0             =       0x1
kmdb[0]:
Continue using :c:
kmdb[0]: :c
To re-enable destructive actions after continuing, you will need to explicitly reset dtrace_destructive_disallow back to 0 using mdb(1):
# echo "dtrace_destructive_disallow/W 0" | mdb -kw
dtrace_destructive_disallow:    0x1             =       0x0
#
panic
void panic(void)
The panic action causes a kernel panic when triggered. This action should be used to force a system crash dump at a time of interest. You can use this action together with ring buffering and postmortem analysis to understand a problem. For more information, see Chapter 11, Buffers and Buffering and Chapter 37, Postmortem Tracing respectively. When the panic action is used, a panic message appears that denotes the probe causing the panic. For example:
panic[cpu0]/thread=30001830b80: dtrace: panic action at probe syscall::mmap:entry (ecb 300000acfc8)

000002a10050b840 dtrace:dtrace_probe+518 (fffe, 0, 1830f88, 1830f88, 30002fb8040, 300000acfc8)
  %l0-3: 0000000000000000 00000300030e4d80 0000030003418000 00000300018c0800
  %l4-7: 000002a10050b980 0000000000000500 0000000000000000 0000000000000502
000002a10050ba30 genunix:dtrace_systrace_syscall32+44 (0, 2000, 5, 80000002, 3, 1898400)
  %l0-3: 00000300030de730 0000000002200008 00000000000000e0 000000000184d928
  %l4-7: 00000300030de000 0000000000000730 0000000000000073 0000000000000010

syncing file systems... 2 done
dumping to /dev/dsk/c0t0d0s1, offset 214827008, content: kernel
100% done: 11837 pages dumped, compression ratio 4.66, dump succeeded
rebooting...
syslogd(1M) will also emit a message upon reboot:
Jun 10 16:56:31 machine1 savecore: [ID 570001 auth.error] reboot after panic: dtrace: panic action at probe syscall::mmap:entry (ecb 300000acfc8)
The message buffer of the crash dump also contains the probe and ECB responsible for the panic action.
chill
void chill(int nanoseconds)
The chill action causes DTrace to spin for the specified number of nanoseconds. chill is primarily useful for exploring problems that might be timing related. For example, you can use this action to open race condition windows, or to bring periodic events into or out of phase with one another. Because interrupts are disabled while in DTrace probe context, any use of chill will induce interrupt latency, scheduling latency, and dispatch latency. Therefore, chill can cause unexpected systemic effects and should not be used indiscriminately. Because system activity relies on periodic interrupt handling, DTrace will refuse to execute the chill action for more than 500 milliseconds out of each one-second interval on any given CPU. If the maximum chill interval is exceeded, DTrace will report an illegal operation error, as shown in the following example:
# dtrace -w -n syscall::openat:entry'{chill(500000001)}'
dtrace: allowing destructive actions
dtrace: description 'syscall::openat:entry' matched 1 probe
dtrace: 57 errors
CPU     ID                    FUNCTION:NAME
dtrace: error on enabled probe ID 1 (ID 14: syscall::openat:entry): \
    illegal operation in action #1
This limit is enforced even if the time is spread across multiple calls to chill, or multiple DTrace consumers of a single probe. For example, the same error would be generated by the following command:
# dtrace -w -n syscall::openat:entry'{chill(250000000); chill(250000001);}'
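By contrast, a small delay stays within the limit as long as the probe does not fire too often. The following sketch adds one millisecond to each read(2) call made by the target process; the probe and the delay are illustrative assumptions chosen to widen a suspected timing window.

```d
#pragma D option destructive

/* Sketch: delay every read(2) in the target process by one millisecond. */
syscall::read:entry
/pid == $target/
{
    /* 1,000,000 ns per firing; the total must stay under 500 ms per second per CPU. */
    chill(1000000);
}
```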
Special Actions
This section describes actions that are neither data recording actions nor destructive actions.
Speculative Actions
The actions associated with speculative tracing are speculate, commit, and discard. These actions are discussed in Chapter 13, Speculative Tracing.
exit
void exit(int status)
The exit action is used to immediately stop tracing, and to inform the DTrace consumer that it should cease tracing, perform any final processing, and call exit(3C) with the status specified. Because exit returns a status to user-level, it is a data recording action. However, unlike other data recording actions, exit cannot be speculatively traced. exit will cause the DTrace consumer to exit regardless of buffer policy. Because exit is a data recording action, it can be dropped.
When exit is called, only DTrace actions already in progress on other CPUs will be completed. No new actions will occur on any CPU. The only exception to this rule is the processing of the END probe, which will be called after the DTrace consumer has processed the exit action and indicated that tracing should stop.
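The interaction between exit and the END probe can be sketched as follows. The ten-second interval and the aggregation name @total are arbitrary choices for this example.

```d
#pragma D option quiet

/* Sketch: count system calls for ten seconds, then report from END. */
syscall:::entry
{
    @total = count();
}

tick-10sec
{
    /* Stops tracing; the consumer then fires END before exiting with status 0. */
    exit(0);
}

END
{
    printa("system calls in 10 seconds: %@d\n", @total);
}
```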
Subroutines
Subroutines differ from actions because they generally only affect internal DTrace state. Therefore, there are no destructive subroutines, and subroutines never trace data into buffers. Many subroutines have analogs in the Section 9F or Section 3C interfaces. See Intro(9F) and Intro(3) for more information on the corresponding subroutines.
alloca
void *alloca(size_t size)
alloca allocates size bytes out of scratch space, and returns a pointer to the allocated memory. The returned pointer is guaranteed to have 8-byte alignment. Scratch space is only valid for the duration of a clause. Memory allocated with alloca will be deallocated when the clause completes. If insufficient scratch space is available, no memory is allocated and an error is generated.
basename
string basename(char *str)
basename is a D analogue for basename(1). This subroutine creates a string that consists of a copy of the specified string, but without any prefix that ends in /. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, basename does not execute and an error is generated.
bcopy
void bcopy(void *src, void *dest, size_t size)
bcopy copies size bytes from the memory pointed to by src to the memory pointed to by dest. All of the source memory must lie outside of scratch memory and all of the destination memory must lie within it. If these conditions are not met, no copying takes place and an error is generated.
cleanpath
string cleanpath(char *str)
cleanpath creates a string that consists of a copy of the path indicated by str, but with certain redundant elements eliminated. In particular “/./” elements in the path are removed, and “/../” elements are collapsed. The collapsing of /../ elements in the path occurs without regard to symbolic links. Therefore, it is possible that cleanpath could take a valid path and return a shorter, invalid one.
For example, if str were “/foo/../bar” and /foo were a symbolic link to /net/foo/export, cleanpath would return the string “/bar” even though bar might exist only in /net/foo and not in /. This limitation is due to the fact that cleanpath is called in the context of a firing probe, where full symbolic link resolution of arbitrary names is not possible. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, cleanpath does not execute and an error is generated.
copyin
void *copyin(uintptr_t addr, size_t size)
copyin copies the specified size in bytes from the specified user address into a DTrace scratch buffer, and returns the address of this buffer. The user address is interpreted as an address in the space of the process associated with the current thread. The resulting buffer pointer is guaranteed to have 8-byte alignment. The address in question must correspond to a faulted-in page in the current process. If the address does not correspond to a faulted-in page, or if insufficient scratch space is available, NULL is returned, and an error is generated. See Chapter 33, User Process Tracing for techniques to reduce the likelihood of copyin errors.
copyinstr
string copyinstr(uintptr_t addr)
string copyinstr(uintptr_t addr, size_t maxlength)
copyinstr copies a null-terminated C string from the specified user address into a DTrace scratch buffer, and returns the address of this buffer. The user address is interpreted as an address in the space of the process associated with the current thread. The maxlength parameter, if specified, sets a limit on the number of bytes past addr which will be examined (the resulting string will always be null-terminated). The resulting string's length is limited to the value set by the strsize option; see Chapter 16, Options and Tunables for details. As with copyin, the specified address must correspond to a faulted-in page in the current process. If the address does not correspond to a faulted-in page, or if insufficient scratch space is available, NULL is returned, and an error is generated. See Chapter 33, User Process Tracing for techniques to reduce the likelihood of copyinstr errors.
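The two-argument form is useful when the user data is not guaranteed to be null-terminated. The following sketch echoes what the target process writes to file descriptor 1; treating descriptor 1 as standard output is an assumption of the example.

```d
#pragma D option quiet

/* Sketch: print the data the target process writes to file descriptor 1. */
syscall::write:entry
/pid == $target && arg0 == 1/
{
    /* arg1 is the user buffer; arg2 bounds the copy because the buffer
       passed to write(2) need not end in a null byte. */
    printf("%s", copyinstr(arg1, arg2));
}
```

As noted above, the resulting string is still limited by the strsize option, so long writes appear truncated.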
copyinto
void copyinto(uintptr_t addr, size_t size, void *dest)
copyinto copies the specified size in bytes from the specified user address into the DTrace scratch buffer specified by dest. The user address is interpreted as an address in the space of the process associated with the current thread. The address in question must correspond to a faulted-in page in the current process. If the address does not correspond to a faulted-in page, or if any of the destination memory lies outside scratch space, no copying takes place, and an error is generated. See Chapter 33, User Process Tracing for techniques to reduce the likelihood of copyinto errors.
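The following sketch illustrates one way to use copyinto with a destination allocated by alloca. The 16-byte size is an arbitrary value chosen for illustration, and the clause assumes that $target has been set with the -p or -c option and that at least 16 bytes are readable at the buffer address:

```d
syscall::write:entry
/pid == $target/
{
	/* Reserve 16 bytes of scratch space for the destination. */
	this->buf = alloca(16);

	/* Copy the first 16 bytes at the user buffer address (arg1). */
	copyinto(arg1, 16, this->buf);

	/* Record the copied bytes in the principal buffer. */
	tracemem(this->buf, 16);
}
```

Because the destination must lie within scratch space, it is obtained from a subroutine such as alloca rather than from a global or thread-local variable.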
dirname
string dirname(char *str)
dirname is a D analogue for dirname(1). This subroutine creates a string that consists of all but the last level of the pathname specified by str. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, dirname does not execute and an error is generated.
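As a brief illustration, the following clause applies dirname to a fixed pathname. The pathname is only an example value:

```d
BEGIN
{
	/* All but the last level of the pathname is printed. */
	printf("%s\n", dirname("/usr/lib/libc.so.1"));
	exit(0);
}
```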
inet_ntoa
string inet_ntoa(ipaddr_t *addr)
inet_ntoa takes a pointer to an IPv4 address and returns it as a dotted quad decimal string. This is similar to inet_ntoa() from libnsl as described in inet(3SOCKET); however, this D version takes a pointer to the IPv4 address rather than the address itself. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, inet_ntoa does not execute and an error is generated.
inet_ntoa6
string inet_ntoa6(in6_addr_t *addr)
inet_ntoa6 takes a pointer to an IPv6 address and returns it as an RFC 1884 convention 2 string, with lower case hexadecimal digits. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, inet_ntoa6 does not execute and an error is generated.
inet_ntop
string inet_ntop(int af, void *addr)
inet_ntop takes a pointer to an IP address and returns a string version depending on the provided address family. This is similar to inet_ntop() from libnsl as described in inet(3SOCKET). Supported address families are AF_INET and AF_INET6, both of which have been defined for use in D programs. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, inet_ntop does not execute and an error is generated.
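Because inet_ntoa and inet_ntop take a pointer rather than an address value, a test address must first be placed in memory. The following sketch builds an IPv4 address in scratch space and formats it with both subroutines. It assumes that the htonl subroutine is available in the DTrace implementation in use; the address value is arbitrary.

```d
BEGIN
{
	/* Scratch storage for one IPv4 address. */
	this->ip4 = (ipaddr_t *)alloca(sizeof (ipaddr_t));

	/* 0xc0a80117 is 192.168.1.23, stored in network byte order. */
	*this->ip4 = htonl(0xc0a80117);

	printf("%s\n", inet_ntoa(this->ip4));
	printf("%s\n", inet_ntop(AF_INET, this->ip4));
	exit(0);
}
```

In practice the pointer usually comes from a kernel data structure reached through a probe argument rather than from alloca.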
msgdsize
size_t msgdsize(mblk_t *mp)
msgdsize returns the number of bytes in the data message pointed to by mp. See msgdsize(9F) for details. msgdsize only includes data blocks of type M_DATA in the count.
msgsize
size_t msgsize(mblk_t *mp)
msgsize returns the number of bytes in the message pointed to by mp. Unlike msgdsize, which returns only the number of data bytes, msgsize returns the total number of bytes in the message.
mutex_owned
int mutex_owned(kmutex_t *mutex)
mutex_owned is an implementation of mutex_owned(9F). mutex_owned returns non-zero if the calling thread currently holds the specified kernel mutex, or zero if the specified adaptive mutex is currently unowned.
mutex_owner
kthread_t *mutex_owner(kmutex_t *mutex)
mutex_owner returns the thread pointer of the current owner of the specified adaptive kernel mutex. mutex_owner returns NULL if the specified adaptive mutex is currently unowned, or if the specified mutex is a spin mutex. See mutex_owned(9F).
mutex_type_adaptive
int mutex_type_adaptive(kmutex_t *mutex)
mutex_type_adaptive returns non-zero if the specified kernel mutex is of type MUTEX_ADAPTIVE, or zero if it is not. Mutexes are adaptive if they meet one or more of the following conditions:
- The mutex is declared statically
- The mutex is created with an interrupt block cookie of NULL
- The mutex is created with an interrupt block cookie that does not correspond to a high-level interrupt
See mutex_init(9F) for more details on mutexes. The majority of mutexes in the Solaris kernel are adaptive.
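As a rough sketch of how the mutex subroutines might be applied, the following clause counts adaptive lock acquisitions according to whether the current thread is the owner when the probe fires. It assumes that the lockstat provider's adaptive-acquire probe is available and that its arg0 is a pointer to the kmutex_t being acquired:

```d
lockstat:::adaptive-acquire
{
	/* Assumption: arg0 is the address of the acquired kmutex_t. */
	this->mp = (kmutex_t *)arg0;

	@acquires[mutex_type_adaptive(this->mp) ? "adaptive" : "spin",
	    mutex_owner(this->mp) == curthread ? "owner is curthread" :
	    "owner differs"] = count();
}
```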
progenyof
int progenyof(pid_t pid)
progenyof returns non-zero if the calling process (the process associated with the thread that is currently triggering the matched probe) is among the progeny of the specified process ID.
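progenyof is most often used in a predicate to restrict tracing to one process tree. The following sketch counts system calls made by the descendants of the process whose ID is given as the first script argument ($1); the choice of aggregation keys is illustrative only:

```d
syscall:::entry
/progenyof($1)/
{
	/* Count system calls by program name and system call name. */
	@calls[execname, probefunc] = count();
}
```

For example, if the clause is saved in a hypothetical file named progeny.d, running dtrace -s progeny.d with the process ID of a login shell as the argument restricts the counts to processes started from that shell.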
rand
int rand(void)
rand returns a pseudo-random integer. The number returned is a weak pseudo-random number, and should not be used for any cryptographic application.
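One possible use of rand is to sample a high-frequency probe so that only a fraction of firings record data. The following sketch records roughly one in every hundred system call entries; the divisor 100 is an arbitrary choice:

```d
syscall:::entry
/rand() % 100 == 0/
{
	/* Only the sampled subset of firings reaches this clause. */
	@sampled[probefunc] = count();
}
```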
rw_iswriter
int rw_iswriter(krwlock_t *rwlock)
rw_iswriter returns non-zero if the specified reader-writer lock is either held or desired by a writer. If the lock is held only by readers and no writer is blocked, or if the lock is not held at all, rw_iswriter returns zero. See rw_init(9F).
rw_write_held
int rw_write_held(krwlock_t *rwlock)
rw_write_held returns non-zero if the specified reader-writer lock is currently held by a writer. If the lock is held only by readers or not held at all, rw_write_held returns zero. See rw_init(9F).
speculation
int speculation(void)
speculation reserves a speculative trace buffer for use with speculate and returns an identifier for this buffer. See Chapter 13, Speculative Tracing for details.
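The following sketch shows the usual pattern for speculation: data is traced speculatively on entry and is committed only if the call fails. It assumes an open(2) probe in the syscall provider, and the thread-local variable name self->spec is illustrative:

```d
syscall::open:entry
{
	/* Reserve a speculative buffer and direct this clause's output to it. */
	self->spec = speculation();
	speculate(self->spec);
	printf("%s", copyinstr(arg0));
}

syscall::open:return
/self->spec && errno != 0/
{
	/* The call failed: keep the speculatively traced pathname. */
	commit(self->spec);
	self->spec = 0;
}

syscall::open:return
/self->spec && errno == 0/
{
	/* The call succeeded: throw the speculative data away. */
	discard(self->spec);
	self->spec = 0;
}
```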
strjoin
string strjoin(char *str1, char *str2)
strjoin creates a string that consists of str1 concatenated with str2. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, strjoin does not execute and an error is generated.
strlen
size_t strlen(string str)
strlen returns the length of the specified string in bytes, excluding the terminating null byte.
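As a short illustration of strjoin and strlen together, the following clause joins two literal strings and prints the result along with its length. The string values are arbitrary:

```d
BEGIN
{
	/* The joined string lives in scratch memory for this clause only. */
	this->s = strjoin("foo", "bar");
	printf("%s has length %d\n", this->s, strlen(this->s));
	exit(0);
}
```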