Actions and Subroutines

Skip to end of metadata

Page restrictions apply
Added by jon_haslam, last edited by kumann123 on Mar 15, 2011 (view change)

Go to start of metadata

Actions and Subroutines

You can use D function calls such as trace and printf to invoke two different kinds of services provided by DTrace: actions that trace data or modify state external to DTrace, and subroutines that affect only internal DTrace state. This chapter defines the actions and subroutines and describes their syntax and semantics.

Destructive Actions

Process Destructive Actions

Kernel Destructive Actions

Special Actions

Subroutines

Actions

Actions enable your DTrace programs to interact with the system outside of DTrace. The most common actions record data to a DTrace buffer. Other actions are available, such as stopping the current process, raising a specific signal on the current process, or ceasing tracing altogether. Some of these actions are destructive in that they change the system, albeit in a well-defined way. These actions may only be used if destructive actions have been explicitly enabled. By default, data recording actions record data to the principal buffer. For more details on the principal buffer and buffer policies, see Chapter 11, Buffers and Buffering.

Default Action

A clause can contain any number of actions and variable manipulations. If a clause is left empty, the default action is taken. The default action is to trace the enabled probe identifier (EPID) to the principal buffer. The EPID identifies a particular enabling of a particular probe with a particular predicate and actions. From the EPID, DTrace consumers can determine the probe that induced the action. Indeed, whenever any data is traced, it must be accompanied by the EPID to enable the consumer to make sense of the data. Therefore, the default action is to trace the EPID and nothing else.

Using the default action allows for simple use of dtrace(1M). For example, the following example command enables all probes in the TS timeshare scheduling module with the default action:

# dtrace -m TS

The preceding command might produce output similar to the following example:

# dtrace -m TS
dtrace: description 'TS' matched 80 probes
CPU     ID                    FUNCTION:NAME
  0  12077                 ts_trapret:entry 
  0  12078                ts_trapret:return 
  0  12069                   ts_sleep:entry 
  0  12070                  ts_sleep:return 
  0  12033                  ts_setrun:entry 
  0  12034                 ts_setrun:return 
  0  12081                  ts_wakeup:entry 
  0  12082                 ts_wakeup:return 
  0  12069                   ts_sleep:entry 
  0  12070                  ts_sleep:return 
  0  12033                  ts_setrun:entry 
  0  12034                 ts_setrun:return 
  0  12069                   ts_sleep:entry 
  0  12070                  ts_sleep:return 
  0  12033                  ts_setrun:entry 
  0  12034                 ts_setrun:return 
  0  12069                   ts_sleep:entry 
  0  12070                  ts_sleep:return 
  0  12023                  ts_update:entry 
  0  12079             ts_update_list:entry 
  0  12080            ts_update_list:return 
  0  12079             ts_update_list:entry 
...

Data Recording Actions

The data recording actions comprise the core DTrace actions. Each of these actions records data to the principal buffer by default, but each action may also be used to record data to speculative buffers. See Chapter 11, Buffers and Buffering for more details on the principal buffer. See Chapter 13, Speculative Tracing for more details on speculative buffers. The descriptions in this section refer only to the directed buffer, indicating that data is recorded either to the principal buffer or to a speculative buffer if the action follows a speculate.

`trace`

void trace(expression)

The most basic action is the trace action, which takes a D expression as its argument and traces the result to the directed buffer. The following statements are examples of trace actions:

trace(execname);
trace(curlwpsinfo->pr_pri);
trace(timestamp / 1000);
trace(`lbolt);
trace("somehow managed to get here");

`tracemem`

void tracemem(address, size_t nbytes)

The tracemem action takes a D expression as its first argument, address, and a constant as its second argument, nbytes. tracemem copies the memory from the address specified by addr into the directed buffer for the length specified by nbytes.

The output format depends on the data printed. When dtrace decides that the data looks like ascii string, it prints them as text, and output is terminated by first '\0'. When dtrace decides that the data is binary, it prints them in hex form

  0    342                      write:entry
             0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
         0: c0 de 09 c2 4a e8 27 54 dc f8 9f f1 9a 20 4b d1  ....J.'T..... K.
        10: 9c 7a 7a 85 1b 03 0a fb 3a 81 8a 1b 25 35 b3 9a  .zz.....:...%5..
        20: f1 7d e6 2b 66 6d 1c 11 f8 eb 40 7f 65 9a 25 f8  .}.+fm....@.e.%.
        30: c8 68 87 b2 6f 48 a2 a5 f3 a2 1f 46 ab 3d f9 d2  .h..oH.....F.=..
        40: 3d b8 4c c0 41 3c f7 3c cd 18 ad 0d 0d d3 1a 90  =.L.A<.<........

You can force tracemem to use always binary format by using rawbytes option.

`printf`

void printf(string format, ...)

Like trace, the printf action traces D expressions. However, printf allows for elaborate printf(3C)-style formatting. Like printf(3C), the parameters consists of a format string followed by a variable number of arguments. By default, the arguments are traced to the directed buffer. The arguments are later formatted for output by dtrace(1M) according to the specified format string. For example, the first two examples of trace from trace() could be combined in a single printf:

printf("execname is %s; priority is %d", execname, curlwpsinfo->pr_pri);

For more information on printf, see Chapter 12, Output Formatting.

`printa`

void printa(aggregation)
void printa(string format, aggregation)

The printa action enables you to display and format aggregations. See Chapter 9, Aggregations for more detail on aggregations. If a format is not provided, printa only traces a directive to the DTrace consumer that the specified aggregation should be processed and displayed using the default format. If a format is provided, the aggregation will be formatted as specified. See Chapter 12, Output Formatting for a more detailed description of the printa format string.

printa only traces a directive that the aggregation should be processed by the DTrace consumer. It does not process the aggregation in the kernel. Therefore, the time between the tracing of the printa directive and the actual processing of the directive depends on the factors that affect buffer processing. These factors include the aggregation rate, the buffering policy and, if the buffering policy is switching, the rate at which buffers are switched. See Chapter 9, Aggregations and Chapter 11, Buffers and Buffering for detailed descriptions of these factors.

`stack`

void stack(int nframes)
void stack(void)

The stack action records a kernel stack trace to the directed buffer. The kernel stack will be nframes in depth. If nframes is not provided, the number of stack frames recorded is the number specified by the stackframes option. For example:

# dtrace -n uiomove:entry'{stack()}'
  CPU     ID                    FUNCTION:NAME
    0   9153                    uiomove:entry 
                genunix`fop_write+0x1b
                namefs`nm_write+0x1d
                genunix`fop_write+0x1b
                genunix`write+0x1f7

    0   9153                    uiomove:entry 
                genunix`fop_read+0x1b
                genunix`read+0x1d4

    0   9153                    uiomove:entry 
                genunix`strread+0x394
                specfs`spec_read+0x65
                genunix`fop_read+0x1b
                genunix`read+0x1d4
   ...

The stack action is a little different from other actions in that it may also be used as the key to an aggregation:

# dtrace -n kmem_alloc:entry'{@[stack()] = count()}'
dtrace: description 'kmem_alloc:entry' matched 1 probe
^C
                rpcmod`endpnt_get+0x47c
                rpcmod`clnt_clts_kcallit_addr+0x26f
                rpcmod`clnt_clts_kcallit+0x22
                nfs`rfscall+0x350
                nfs`rfs2call+0x60
                nfs`nfs_getattr_otw+0x9e
                nfs`nfsgetattr+0x26
                nfs`nfs_getattr+0xb8
                genunix`fop_getattr+0x18
                genunix`cstat64+0x30
                genunix`cstatat64+0x4a
                genunix`lstat64+0x1c
                  1

                genunix`vfs_rlock_wait+0xc
                genunix`lookuppnvp+0x19d
                genunix`lookuppnat+0xe7
                genunix`lookupnameat+0x87
                genunix`lookupname+0x19
                genunix`chdir+0x18
                  1

                rpcmod`endpnt_get+0x6b1
                rpcmod`clnt_clts_kcallit_addr+0x26f
                rpcmod`clnt_clts_kcallit+0x22
                nfs`rfscall+0x350
                nfs`rfs2call+0x60
                nfs`nfs_getattr_otw+0x9e
                nfs`nfsgetattr+0x26
                nfs`nfs_getattr+0xb8
                genunix`fop_getattr+0x18
                genunix`cstat64+0x30
                genunix`cstatat64+0x4a
                genunix`lstat64+0x1c
                  1

    ...

`ustack`

void ustack(int nframes, int strsize)
void ustack(int nframes)
void ustack(void)

The ustack action records a user stack trace to the directed buffer. The user stack will be nframes in depth. If nframes is not provided, the number of stack frames recorded is the number specified by the ustackframes option. While ustack is able to determine the address of the calling frames when the probe fires, the stack frames will not be translated into symbols until the ustack action is processed at user-level by the DTrace consumer. If strsize is specified and non-zero, ustack will allocate the specified amount of string space, and use it to perform address-to-symbol translation directly from the kernel. This direct user symbol translation is currently available only for Java virtual machines, version 1.5 and higher. Java address-to-symbol translation annotates user stacks that contain Java frames with the Java class and method name. If such frames cannot be translated, the frames will appear only as hexadecimal addresses.

The following example traces a stack with no string space, and therefore no Java address-to-symbol translation:

# dtrace -n syscall::write:entry'/pid == $target/{ustack(50, 0); 
    exit(0)}' -c "java -version"
dtrace: description 'syscall::write:entry' matched 1 probe
java version "1.5.0-beta3"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-beta3-b58)
Java HotSpot(TM) Client VM (build 1.5.0-beta3-b58, mixed mode)
dtrace: pid 5312 has exited
CPU     ID                    FUNCTION:NAME
  0     35                      write:entry 
              libc.so.1`_write+0x15
              libjvm.so`__1cDhpiFwrite6FipkvI_I_+0xa8
              libjvm.so`JVM_Write+0x2f
              d0c5c946
              libjava.so`Java_java_io_FileOutputStream_writeBytes+0x2c
              cb007fcd
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb002a7b
              cb000152
              libjvm.so`__1cJJavaCallsLcall_helper6FpnJJavaValue_
                          pnMmethodHandle_pnRJavaCallArguments_
                          pnGThread__v_+0x187
              libjvm.so`__1cCosUos_exception_wrapper6FpFpnJJavaValue_
                          pnMmethodHandle_pnRJavaCallArguments_
                          pnGThread__v2468_v_+0x14
              libjvm.so`__1cJJavaCallsEcall6FpnJJavaValue_nMmethodHandle_
                          pnRJavaCallArguments_pnGThread __v_+0x28
              libjvm.so`__1cRjni_invoke_static6FpnHJNIEnv__pnJJavaValue_
                          pnI_jobject_nLJNICallType_pnK_jmethodID_pnSJNI_
                          ArgumentPusher_pnGThread__v_+0x180
              libjvm.so`jni_CallStaticVoidMethod+0x10f
              java`main+0x53d

Notice that the C and C++ stack frames from the Java virtual machine are presented symbolically using C++ “mangled” symbol names, and the Java stack frames are presented only as hexadecimal addresses. The following example shows a call to ustack with a non-zero string space:

# dtrace -n syscall::write:entry'/pid == $target/{ustack(50, 500); exit(0)}'
      -c "java -version"
dtrace: description 'syscall::write:entry' matched 1 probe
java version "1.5.0-beta3"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-beta3-b58)
Java HotSpot(TM) Client VM (build 1.5.0-beta3-b58, mixed mode)
dtrace: pid 5308 has exited
CPU     ID                    FUNCTION:NAME
  0     35                      write:entry 
              libc.so.1`_write+0x15
              libjvm.so`__1cDhpiFwrite6FipkvI_I_+0xa8
              libjvm.so`JVM_Write+0x2f
              d0c5c946
              libjava.so`Java_java_io_FileOutputStream_writeBytes+0x2c
              java/io/FileOutputStream.writeBytes
              java/io/FileOutputStream.write
              java/io/BufferedOutputStream.flushBuffer
              java/io/BufferedOutputStream.flush
              java/io/PrintStream.write
              sun/nio/cs/StreamEncoder$CharsetSE.writeBytes
              sun/nio/cs/StreamEncoder$CharsetSE.implFlushBuffer
              sun/nio/cs/StreamEncoder.flushBuffer
              java/io/OutputStreamWriter.flushBuffer
              java/io/PrintStream.write
              java/io/PrintStream.print
              java/io/PrintStream.println
              sun/misc/Version.print
              sun/misc/Version.print
              StubRoutines (1)
              libjvm.so`__1cJJavaCallsLcall_helper6FpnJJavaValue_
                          pnMmethodHandle_pnRJavaCallArguments_pnGThread
                          __v_+0x187
              libjvm.so`__1cCosUos_exception_wrapper6FpFpnJJavaValue_
                          pnMmethodHandle_pnRJavaCallArguments_pnGThread
                          __v2468_v_+0x14
              libjvm.so`__1cJJavaCallsEcall6FpnJJavaValue_nMmethodHandle
                          _pnRJavaCallArguments_pnGThread__v_+0x28
              libjvm.so`__1cRjni_invoke_static6FpnHJNIEnv__pnJJavaValue_pnI
                          _jobject_nLJNICallType_pnK_jmethodID_pnSJNI
                          _ArgumentPusher_pnGThread__v_+0x180
              libjvm.so`jni_CallStaticVoidMethod+0x10f
              java`main+0x53d
              8051b9a

The above example output demonstrates symbolic stack frame information for Java stack frames. There are still some hexadecimal frames in this output because some functions are static and do not have entries in the application symbol table. Translation is not possible for these frames.

The ustack symbol translation for non-Java frames occurs after the stack data is recorded. Therefore, the corresponding user process might exit before symbol translation can be performed, making stack frame translation impossible. If the user process exits before symbol translation is performed, dtrace will emit a warning message, followed by the hexadecimal stack frames, as shown in the following example:

  dtrace: failed to grab process 100941: no such process
                c7b834d4
                c7bca85d
                c7bca1a4
                c7bd4374
                c7bc2628
                8047efc

Techniques for mitigating this problem are described in Chapter 33, User Process Tracing.

Finally, because the postmortem DTrace debugger commands cannot perform the frame translation, using ustack with a ring buffer policy always results in raw ustack data.

The following D program shows an example of ustack that leaves strsize unspecified:

syscall::brk:entry
/execname == $$1/
{
        @[ustack(40)] = count();
}

To run this example for the Netscape web browser, .netscape.bin in default Solaris installations, use the following command:

# dtrace -s brk.d .netscape.bin
dtrace: description 'syscall::brk:entry' matched 1 probe
^C
                libc.so.1`_brk_unlocked+0xc
                88143f6
                88146cd
                .netscape.bin`unlocked_malloc+0x3e
                .netscape.bin`unlocked_calloc+0x22
                .netscape.bin`calloc+0x26
                .netscape.bin`_IMGCB_NewPixmap+0x149
                .netscape.bin`il_size+0x2f7
                .netscape.bin`il_jpeg_write+0xde
                8440c19
                .netscape.bin`il_first_write+0x16b
                8394670
                83928e5
                .netscape.bin`NET_ProcessHTTP+0xa6
                .netscape.bin`NET_ProcessNet+0x49a
                827b323
                libXt.so.4`XtAppProcessEvent+0x38f
                .netscape.bin`fe_EventLoop+0x190
                .netscape.bin`main+0x1875
                   1

                libc.so.1`_brk_unlocked+0xc
                libc.so.1`sbrk+0x29
                88143df
                88146cd
                .netscape.bin`unlocked_malloc+0x3e
                .netscape.bin`unlocked_calloc+0x22
                .netscape.bin`calloc+0x26
                .netscape.bin`_IMGCB_NewPixmap+0x149
                .netscape.bin`il_size+0x2f7
                .netscape.bin`il_jpeg_write+0xde
                8440c19
                .netscape.bin`il_first_write+0x16b
                8394670
                83928e5
                .netscape.bin`NET_ProcessHTTP+0xa6
                .netscape.bin`NET_ProcessNet+0x49a
                827b323
                libXt.so.4`XtAppProcessEvent+0x38f
                .netscape.bin`fe_EventLoop+0x190
                .netscape.bin`main+0x1875
                  1

    ...

`jstack`

void jstack(int nframes, int strsize)
void jstack(int nframes)
void jstack(void)

jstack is an alias for ustack that uses the jstackframes option for the number of stack frames the value specified by , and for the string space size the value specified by the jstackstrsize option. By default, jstacksize defaults to a non-zero value. As a result, use of jstack will result in a stack with in situ Java frame translation.

`uaddr`

_usymaddr uaddr(uintptr_t address)

uaddr will prints the symbol for a specified address, including hexadecimal offset. This allows for the same symbol resolution that ustack provides.

# dtrace -c date -n 'pid$target::main:entry{ uaddr(0x8062578); }'
dtrace: description 'pid$target::main:entry' matched 1 probe
Sun Feb  3 20:58:03 PST 2008
dtrace: pid 105537 has exited
CPU     ID                    FUNCTION:NAME
  0  59934                       main:entry   date`clock_val

In the above example, a call to uaddr(0x8062578) causes date`clock_val to be printed.

The example below shows the hexadecimal offsets being printed.

demo$ sudo dtrace -n "pid\$target::main:{uaddr(uregs[R_PC])}" -c nmap
dtrace: description 'pid$target::main:' matched 946 probes

[outout cut]

dtrace: pid 2229 has exited
CPU     ID                    FUNCTION:NAME
  1  20165                       main:entry   nmap`main                                         
  1  20166                           main:0   nmap`main                                         
  1  20167                           main:1   nmap`main+0x1                                     
  1  20168                           main:3   nmap`main+0x3                                     
  1  20169                           main:4   nmap`main+0x4                                     
  1  20170                           main:5   nmap`main+0x5                                     
  1  20171                           main:6   nmap`main+0x6                                     
  1  20172                           main:b   nmap`main+0xb                                     
  1  20173                           main:c   nmap`main+0xc                                     
  1  20174                          main:12   nmap`main+0x12                                    
  1  20175                          main:15   nmap`main+0x15                                    
  1  20176                          main:1c   nmap`main+0x1c                                    
  1  20177                          main:23   nmap`main+0x23                                    
  1  20178                          main:2b   nmap`main+0x2b                                    
  1  20179                          main:2e   nmap`main+0x2e                                    
  1  20180                          main:33   nmap`main+0x33                  
...
...
...

`usym`

_usymaddr usym(uintptr_t address)

usym will print the symbol for a specified address. This is analogous to how uaddr works, but without the hexadecimal offsets.

        uaddr:          date`clock_val+0x1
        usym:           date`clock_val

Destructive Actions

Some DTrace actions are destructive in that they change the state of the system in some well-defined way. Destructive actions may not be used unless they have been explicitly enabled. When using dtrace(1M), you can enable destructive actions using the -w option. If an attempt is made to enable destructive actions in dtrace(1M) without explicitly enabling them, dtrace will fail with a message similar to the following example:

dtrace: failed to enable 'syscall': destructive actions not allowed

An administrator may choose to disable destructive actions system-wide by setting the kernel tunable dtrace_destructive_disallow to 1. This may be done in a number of ways including rebooting after adding the following line to /etc/system:

set dtrace:dtrace_destructive_disallow = 1

It may be set temporarily on a running system using mdb(1):

# echo "dtrace_destructive_disallow/W 1" | mdb -kw
dtrace_destructive_disallow:    0x0             =       0x1

Process Destructive Actions

Some destructive actions are destructive only to a particular process. These actions are available to users with the dtrace_proc or dtrace_user privileges. See Chapter 35, Security for details on DTrace security privileges.

`stop`

void stop(void)

The stop action forces the process that fires the enabled probe to stop when it next leaves the kernel, as if stopped by a proc(4) action. The prun(1) utility may be used to resume a process that has been stopped by the stop action. The stop action can be used to stop a process at any DTrace probe point. This action can be used to capture a program in a particular state that would be difficult to achieve with a simple breakpoint, and then attach a traditional debugger like mdb(1) to the process. You can also use the gcore(1) utility to save the state of a stopped process in a core file for later analysis.

`raise`

void raise(int signal)

The raise action sends the specified signal to the currently running process. This action is similar to using the kill(1) command to send a process a signal. The raise action can be used to send a signal at a precise point in a process's execution.

`copyout`

void copyout(void *buf, uintptr_t addr, size_t nbytes)

The copyout action copies nbytes from the buffer specified by buf to the address specified by addr in the address space of the process associated with the current thread. If the user-space address does not correspond to a valid, faulted-in page in the current address space, an error will be generated.

`copyoutstr`

void copyoutstr(string str, uintptr_t addr, size_t maxlen)

The copyoutstr action copies the string specified by str to the address specified by addr in the address space of the process associated with the current thread. If the user-space address does not correspond to a valid, faulted-in page in the current address space, an error will be generated. The string length is limited to the value set by the strsize option. See Chapter 16, Options and Tunables for details.

`system`

void system(string program, ...)

The system action causes the program specified by program to be executed as if it were given to the shell as input. The program string may contain any of the printf/printa format conversions. Arguments must be specified that match the format conversions. Refer to Chapter 12, Output Formatting for details on valid format conversions.

The following example runs the date(1) command once per second:

# dtrace -wqn tick-1sec'{system("date")}'
 Tue Jul 20 11:56:26 CDT 2004
 Tue Jul 20 11:56:27 CDT 2004
 Tue Jul 20 11:56:28 CDT 2004
 Tue Jul 20 11:56:29 CDT 2004
 Tue Jul 20 11:56:30 CDT 2004

The following example shows a more elaborate use of the action, using printf conversions in the program string along with traditional filtering tools like pipes:

#pragma D option destructive
#pragma D option quiet

proc:::signal-send
/args[2] == SIGINT/
{
        printf("SIGINT sent to %s by ", args[1]->pr_fname);
        system("getent passwd %d | cut -d: -f5", uid);
}

Running the above script results in output similar to the following example:

# ./whosend.d
SIGINT sent to MozillaFirebird- by Bryan Cantrill
SIGINT sent to run-mozilla.sh by Bryan Cantrill
^C
SIGINT sent to dtrace by Bryan Cantrill

The execution of the specified command does not occur in the context of the firing probe – it occurs when the buffer containing the details of the system action are processed at user-level. How and when this processing occurs depends on the buffering policy, described in Chapter 11, Buffers and Buffering. With the default buffering policy, the buffer processing rate is specified by the switchrate option. You can see the delay inherent in system if you explicitly tune the switchrate higher than its one-second default, as shown in the following example:

#pragma D option quiet
#pragma D option destructive
#pragma D option switchrate=5sec

tick-1sec
/n++ < 5/
{
        printf("walltime  : %Y\n", walltimestamp);
        printf("date      : ");
        system("date");
        printf("\n");
}

tick-1sec
/n == 5/
{
        exit(0);
}

Running the above script results in output similar to the following example:

# dtrace -s ./time.d
 walltime  : 2004 Jul 20 13:26:30
date      : Tue Jul 20 13:26:35 CDT 2004

walltime  : 2004 Jul 20 13:26:31
date      : Tue Jul 20 13:26:35 CDT 2004

walltime  : 2004 Jul 20 13:26:32
date      : Tue Jul 20 13:26:35 CDT 2004

walltime  : 2004 Jul 20 13:26:33
date      : Tue Jul 20 13:26:35 CDT 2004

walltime  : 2004 Jul 20 13:26:34
date      : Tue Jul 20 13:26:35 CDT 2004

Notice that the walltime values differ, but the date values are identical. This result reflects the fact that the execution of the date(1) command occured only when the buffer was processed, not when the system action was recorded.

Kernel Destructive Actions

Some destructive actions are destructive to the entire system. These actions must obviously be used extremely carefully, as they will affect every process on the system and any other system implicitly or explicitly depending upon the affected system's network services.

`breakpoint`

void breakpoint(void)

The breakpoint action induces a kernel breakpoint, causing the system to stop and transfer control to the kernel debugger. The kernel debugger will emit a string denoting the DTrace probe that triggered the action. For example, if one were to do the following:

# dtrace -w -n clock:entry'{breakpoint()}'
dtrace: allowing destructive actions
dtrace: description 'clock:entry' matched 1 probe

On Solaris running on SPARC, the following message might appear on the console:

dtrace: breakpoint action at probe fbt:genunix:clock:entry (ecb 30002765700)
Type  'go' to resume
ok

On Solaris running on x86, the following message might appear on the console:

dtrace: breakpoint action at probe fbt:genunix:clock:entry (ecb d2b97060)
stopped at      int20+0xb:      ret
kmdb[0]:

The address following the probe description is the address of the enabling control block (ECB) within DTrace. You can use this address to determine more details about the probe enabling that induced the breakpoint action.

A mistake with the breakpoint action may cause it to be called far more often than intended. This behavior might in turn prevent you from even terminating the DTrace consumer that is triggering the breakpoint actions. In this situation, set the kernel tunable dtrace_destructive_disallow to 1. This setting will disallow all destructive actions on the machine.

The exact method for setting dtrace_destructive_disallow will depend on the kernel debugger that you are using. If using the OpenBoot PROM on a SPARC system, use w!:

ok 1 dtrace_destructive_disallow w!
ok

Confirm that the variable has been set using w?:

ok dtrace_destructive_disallow w?
1
ok

Continue by typing go:

ok go

If using kmdb(1) on x86 or SPARC systems, use the 4–byte write modifier (W) with the / formatting dcmd:

kmdb[0]: dtrace_destructive_disallow/W 1
dtrace_destructive_disallow:    0x0             =       0x1
kmdb[0]:

Continue using :c:

kadb[0]: :c

To re-enable destructive actions after continuing, you will need to explicitly reset dtrace_destructive_disallow back to 0 using mdb(1):

# echo "dtrace_destructive_disallow/W 0" | mdb -kw
dtrace_destructive_disallow:    0x1             =       0x0
#

`panic`

void panic(void)

The panic action causes a kernel panic when triggered. This action should be used to force a system crash dump at a time of interest. You can use this action together with ring buffering and postmortem analysis to understand a problem. For more information, see Chapter 11, Buffers and Buffering and Chapter 37, Postmortem Tracing respectively. When the panic action is used, a panic message appears that denotes the probe causing the panic. For example:

  panic[cpu0]/thread=30001830b80: dtrace: panic action at probe
  syscall::mmap:entry (ecb 300000acfc8)

  000002a10050b840 dtrace:dtrace_probe+518 (fffe, 0, 1830f88, 1830f88,
    30002fb8040, 300000acfc8)
    %l0-3: 0000000000000000 00000300030e4d80 0000030003418000 00000300018c0800
    %l4-7: 000002a10050b980 0000000000000500 0000000000000000 0000000000000502
  000002a10050ba30 genunix:dtrace_systrace_syscall32+44 (0, 2000, 5,
    80000002, 3, 1898400)
    %l0-3: 00000300030de730 0000000002200008 00000000000000e0 000000000184d928
    %l4-7: 00000300030de000 0000000000000730 0000000000000073 0000000000000010

  syncing file systems... 2 done
  dumping to /dev/dsk/c0t0d0s1, offset 214827008, content: kernel
  100% done: 11837 pages dumped, compression ratio 4.66, dump
  succeeded
  rebooting...

syslogd(1M) will also emit a message upon reboot:

  Jun 10 16:56:31 machine1 savecore: [ID 570001 auth.error] reboot after panic:
  dtrace: panic action at probe syscall::mmap:entry (ecb 300000acfc8)

The message buffer of the crash dump also contains the probe and ECB responsible for the panic action.

`chill`

void chill(int nanoseconds)

The chill action causes DTrace to spin for the specified number of nanoseconds. chill is primarily useful for exploring problems that might be timing related. For example, you can use this action to open race condition windows, or to bring periodic events into or out of phase with one another. Because interrupts are disabled while in DTrace probe context, any use of chill will induce interrupt latency, scheduling latency, dispatch latency. Therefore, chill can cause unexpected systemic effects and it should not used indiscriminately. Because system activity relies on periodic interrupt handling, DTrace will refuse to execute the chill action for more than 500 milliseconds out of each one-second interval on any given CPU. If the maximum chill interval is exceeded, DTrace will report an illegal operation error, as shown in the following example:

# dtrace -w -n syscall::openat:entry'{chill(500000001)}'
dtrace: allowing destructive actions
dtrace: description 'syscall::openat:entry' matched 1 probe
dtrace: 57 errors
CPU     ID                    FUNCTION:NAME
dtrace: error on enabled probe ID 1 (ID 14: syscall::openat:entry): \
  illegal operation in action #1

This limit is enforced even if the time is spread across multiple calls to chill, or multiple DTrace consumers of a single probe. For example, the same error would be generated by the following command:

# dtrace -w -n syscall::openat:entry'{chill(250000000); chill(250000001);}'

Special Actions

This section describes actions that are neither data recording actions nor destructive actions.

Speculative Actions

The actions associated with speculative tracing are speculate, commit, and discard. These actions are discussed in Chapter 13, Speculative Tracing.

`exit`

void exit(int status)

The exit action is used to immediately stop tracing, and to inform the DTrace consumer that it should cease tracing, perform any final processing, and call exit(3C) with the status specified. Because exit returns a status to user-level, it is a data recording action, However, unlike other data storing actions, exit cannot be speculatively traced. exit will cause the DTrace consumer to exit regardless of buffer policy. Because exit is a data recording action, it can be dropped.

When exit is called, only DTrace actions already in progress on other CPUs will be completed. No new actions will occur on any CPU. The only exception to this rule is the processing of the END probe, which will be called after the DTrace consumer has processed the exit action and indicated that tracing should stop.

Subroutines

Subroutines differ from actions because they generally only affect internal DTrace state. Therefore, there are no destructive subroutines, and subroutines never trace data into buffers. Many subroutines have analogs in the Section 9F or Section 3C interfaces. See Intro(9F) and Intro(3) for more information on the corresponding subroutines.

`alloca`

void *alloca(size_t size)

alloca allocates size bytes out of scratch space, and returns a pointer to the allocated memory. The returned pointer is guaranteed to have 8–byte alignment. Scratch space is only valid for the duration of a clause. Memory allocated with alloca will be deallocated when the clause completes. If insufficient scratch space is available, no memory is allocated and an error is generated.

`basename`

string basename(char *str)

basename is a D analogue for basename(1). This subroutine creates a string that consists of a copy of the specified string, but without any prefix that ends in /. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, basename does not execute and an error is generated.

`bcopy`

void bcopy(void *src, void *dest, size_t size)

bcopy copies size bytes from the memory pointed to by src to the memory pointed to by dest. All of the source memory must lie outside of scratch memory and all of the destination memory must lie within it. If these conditions are not met, no copying takes place and an error is generated.

`cleanpath`

string cleanpath(char *str)

cleanpath creates a string that consists of a copy of the path indicated by str, but with certain redundant elements eliminated. In particular “/./” elements in the path are removed, and “/../” elements are collapsed. The collapsing of /../ elements in the path occurs without regard to symbolic links. Therefore, it is possible that cleanpath could take a valid path and return a shorter, invalid one.

For example, if str were “/foo/../bar” and /foo were a symbolic link to /net/foo/export, cleanpath would return the string “/bar” even though bar might only be in /net/foo not /. This limitation is due to the fact that cleanpath is called in the context of a firing probe, where full symbolic link resolution or arbitrary names is not possible. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, cleanpath does not execute and an error is generated.

`copyin`

void *copyin(uintptr_t addr, size_t size)

copyin copies the specified size in bytes from the specified user address into a DTrace scratch buffer, and returns the address of this buffer. The user address is interpreted as an address in the space of the process associated with the current thread. The resulting buffer pointer is guaranteed to have 8-byte alignment. The address in question must correspond to a faulted-in page in the current process. If the address does not correspond to a faulted-in page, or if insufficient scratch space is available, NULL is returned, and an error is generated. See Chapter 33, User Process Tracing for techniques to reduce the likelihood of copyin errors.

`copyinstr`

string copyinstr(uintptr_t addr)
string copyinstr(uintptr_t addr, size_t maxlength)

copyinstr copies a null-terminated C string from the specified user address into a DTrace scratch buffer, and returns the address of this buffer. The user address is interpreted as an address in the space of the process associated with the current thread. The maxlength parameter, if specified, sets a limit on the number of bytes past addr which will be examined (the resulting string will always be null-terminated). The resulting string's length is limited to the value set by the strsize option; see Chapter 16, Options and Tunables for details. As with copyin, the specified address must correspond to a faulted-in page in the current process. If the address does not correspond to a faulted-in page, or if insufficient scratch space is available, NULL is returned, and an error is generated. See Chapter 33, User Process Tracing for techniques to reduce the likelihood of copyinstr errors.

`copyinto`

void copyinto(uintptr_t addr, size_t size, void *dest)

copyinto copies the specified size in bytes from the specified user address into the DTrace scratch buffer specified by dest. The user address is interpreted as an address in the space of the process associated with the current thread. The address in question must correspond to a faulted-in page in the current process. If the address does not correspond to a faulted-in page, or if any of the destination memory lies outside scratch space, no copying takes place, and an error is generated. See Chapter 33, User Process Tracing for techniques to reduce the likelihood of copyinto errors.

`dirname`

string dirname(char *str)

dirname is a D analogue for dirname(1). This subroutine creates a string that consists of all but the last level of the pathname specified by str. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, dirname does not execute and an error is generated.

`inet_ntoa`

string inet_ntoa(ipaddr_t *addr)

inet_ntoa takes a pointer to an IPv4 address and returns it as a dotted quad decimal string. This is similar to inet_ntoa() from libnsl as described in inet(3SOCKET), however this D version takes a pointer to the IPv4 address rather than the address itself. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, inet_ntoa does not execute and an error is generated.

`inet_ntoa6`

string inet_ntoa6(in6_addr_t *addr)

inet_ntoa6 takes a pointer to an IPv6 address and returns it as an RFC 1884 convention 2 string, with lower case hexadecimal digits. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, inet_ntoa6 does not execute and an error is generated.

`inet_ntop`

string inet_ntop(int af, void *addr)

inet_ntop takes a pointer to an IP address and returns a string version depending on the provided address family. This is similar to inet_ntop() from libnsl as described in inet(3SOCKET). Supported address families are AF_INET and AF_INET6, both of which have been defined for use in D programs. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, inet_ntop does not execute and an error is generated.

`msgdsize`

size_t msgdsize(mblk_t *mp)

msgdsize returns the number of bytes in the data message pointed to by mp. See msgdsize(9F) for details. msgdsize only includes data blocks of type M_DATA in the count.

`msgsize`

size_t msgsize(mblk_t *mp)

msgsize returns the number of bytes in the message pointed to by mp. Unlike msgdsize, which returns only the number of data bytes, msgsize returns the total number of bytes in the message.

`mutex_owned`

int mutex_owned(kmutex_t *mutex)

mutex_owned is an implementation of mutex_owned(9F). mutex_owned returns non-zero if the calling thread currently holds the specified kernel mutex, or zero if the specified adaptive mutex is currently unowned.

`mutex_owner`

kthread_t *mutex_owner(kmutex_t *mutex)

mutex_owner returns the thread pointer of the current owner of the specified adaptive kernel mutex. mutex_owner returns NULL if the specified adaptive mutex is currently unowned, or if the specified mutex is a spin mutex. See mutex_owned(9F).

`mutex_type_adaptive`

int mutex_type_adaptive(kmutex_t *mutex)

mutex_type_adaptive returns non-zero if the specified kernel mutex is of type MUTEX_ADAPTIVE, or zero if it is not. Mutexes are adaptive if they meet one or more of the following conditions:

The mutex is declared statically
The mutex is created with an interrupt block cookie of NULL
The mutex is created with an interrupt block cookie that does not correspond to a high-level interrupt

See mutex_init(9F) for more details on mutexes. The majority of mutexes in the Solaris kernel are adaptive.

`progenyof`

int progenyof(pid_t pid)

progenyof returns non-zero if the calling process (the process associated with the thread that is currently triggering the matched probe) is among the progeny of the specified process ID.

`rand`

int rand(void)

rand returns a pseudo-random integer. The number returned is a weak pseudo-random number, and should not be used for any cryptographic application.

`rw_iswriter`

int rw_iswriter(krwlock_t *rwlock)

rw_iswriter returns non-zero if the specified reader-writer lock is either held or desired by a writer. If the lock is held only by readers and no writer is blocked, or if the lock is not held at all, rw_iswriter returns zero. See rw_init(9F).

`rw_write_held`

int rw_write_held(krwlock_t *rwlock)

rw_write_held returns non-zero if the specified reader-writer lock is currently held by a writer. If the lock is held only by readers or not held at all, rw_write_held returns zero. See rw_init(9F).

`speculation`

int speculation(void)

speculation reserves a speculative trace buffer for use with speculate and returns an identifier for this buffer. See Chapter 13, Speculative Tracing for details.

`strjoin`

string strjoin(char *str1, char *str2)

strjoin creates a string that consists of str1 concatenated with str2. The returned string is allocated out of scratch memory, and is therefore valid only for the duration of the clause. If insufficient scratch space is available, strjoin does not execute and an error is generated.

`strlen`

size_t strlen(string str)

strlen returns the length of the specified string in bytes, excluding the terminating null byte.

Labels:

None

Sign up or Log in to add a comment or watch this page.

The individuals who post here are part of the extended Oracle community and they might not be employed or in any way formally affiliated with Oracle. The opinions expressed here are their own, are not necessarily reviewed in advance by anyone but the individual authors, and neither Oracle nor any other party necessarily agrees with them.

Terms of Use | Your Privacy Rights |