Perf -- a system performance tuning tool for Linux

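On Ubuntu 11.04 the perf commands are only available after installing the two packages below. I installed them from the Software Center; apt-get install should presumably work just as well.
linux-tools
linux-tools-common
It is a handy tool: it can report many hardware-related statistics, such as CPU cache hits and misses, branch prediction, and instruction and cycle counts. It can also attach to a given process and show which functions account for the most samples.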

Desktop$ perf list

List of pre-defined events (to be used in -e):

  cpu-cycles OR cycles                       [Hardware event]
  instructions                               [Hardware event]
  cache-references                           [Hardware event]
  cache-misses                               [Hardware event]
  branch-instructions OR branches            [Hardware event]
  branch-misses                              [Hardware event]
  bus-cycles                                 [Hardware event]

  cpu-clock                                  [Software event]
  task-clock                                 [Software event]
  page-faults OR faults                      [Software event]
  minor-faults                               [Software event]
  major-faults                               [Software event]
  context-switches OR cs                     [Software event]
  cpu-migrations OR migrations               [Software event]
  alignment-faults                           [Software event]
  emulation-faults                           [Software event]

  L1-dcache-loads                            [Hardware cache event]
  L1-dcache-load-misses                      [Hardware cache event]
  L1-dcache-stores                           [Hardware cache event]
  L1-dcache-store-misses                     [Hardware cache event]
  L1-dcache-prefetches                       [Hardware cache event]
  L1-dcache-prefetch-misses                  [Hardware cache event]
  L1-icache-loads                            [Hardware cache event]
  L1-icache-load-misses                      [Hardware cache event]
  L1-icache-prefetches                       [Hardware cache event]
  L1-icache-prefetch-misses                  [Hardware cache event]
  LLC-loads                                  [Hardware cache event]
  LLC-load-misses                            [Hardware cache event]
  LLC-stores                                 [Hardware cache event]
  LLC-store-misses                           [Hardware cache event]
  LLC-prefetches                             [Hardware cache event]
  LLC-prefetch-misses                        [Hardware cache event]
  dTLB-loads                                 [Hardware cache event]
  dTLB-load-misses                           [Hardware cache event]
  dTLB-stores                                [Hardware cache event]
  dTLB-store-misses                          [Hardware cache event]
  dTLB-prefetches                            [Hardware cache event]
  dTLB-prefetch-misses                       [Hardware cache event]
  iTLB-loads                                 [Hardware cache event]
  iTLB-load-misses                           [Hardware cache event]
  branch-loads                               [Hardware cache event]
  branch-load-misses                         [Hardware cache event]

  rNNN (see 'perf list --help' on how to encode it) [Raw hardware event descript

  mem:<addr>[:access]                        [Hardware breakpoint]
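
The rNNN entry above takes a raw hexadecimal event descriptor built from an event select and a unit mask. Here is a rough sketch of how such a descriptor could be put together, assuming the usual x86 layout (unit mask in bits 8-15, event select in bits 0-7); other architectures encode events differently. The helper name raw_event is hypothetical and not part of perf, and 0xA8 / 0x01 are just illustrative values.

```python
def raw_event(eventsel: int, umask: int) -> str:
    """Build an rNNN string from an event select and a unit mask.

    Assumes the x86 layout: umask in bits 8-15, eventsel in bits 0-7.
    """
    if not 0 <= eventsel <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("eventsel and umask must each fit in one byte")
    return f"r{(umask << 8) | eventsel:x}"


if __name__ == "__main__":
    # Print a command line that could be tried by hand with perf stat.
    print("perf stat -e", raw_event(0xA8, 0x01), "./a.out")
```

Whether a particular code means anything depends on the CPU model, so the vendor's event tables remain the real reference.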

perf stat ./a.out

^C
 Performance counter stats for './a.out':

             9,044 cache-misses             #      0.003 M/sec  (scaled from 66.87%)
           523,191 cache-references         #      0.172 M/sec  (scaled from 66.96%)
        21,838,315 branch-misses            #      6.678 %      (scaled from 33.13%)
       327,014,993 branches                 #    107.285 M/sec  (scaled from 33.04%)
     2,355,587,681 instructions             #      0.349 IPC    (scaled from 49.41%)
     6,740,540,287 cycles                   #   2211.403 M/sec  (scaled from 67.03%)
               100 page-faults              #      0.000 M/sec
                30 CPU-migrations           #      0.000 M/sec
               482 context-switches         #      0.000 M/sec
       3048.082970 task-clock-msecs         #      0.596 CPUs

        5.118230246  seconds time elapsed
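
The ratios in the right-hand column can be checked by hand from the raw counts. The sketch below copies the numbers from the run above and recomputes the IPC, the branch miss percentage and the CPU usage. The cache miss percentage is an extra that perf did not print in this run. All variable names are my own.

```python
def ratio(numerator: float, denominator: float) -> float:
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


# Counts copied from the perf stat output above.
counts = {
    "cache-misses": 9_044,
    "cache-references": 523_191,
    "branch-misses": 21_838_315,
    "branches": 327_014_993,
    "instructions": 2_355_587_681,
    "cycles": 6_740_540_287,
}
task_clock_msecs = 3048.082970
elapsed_seconds = 5.118230246

# Instructions retired per cycle (perf prints this as "IPC").
ipc = ratio(counts["instructions"], counts["cycles"])
# Share of branches that were mispredicted, in percent.
branch_miss_pct = 100 * ratio(counts["branch-misses"], counts["branches"])
# Share of cache references that missed, in percent (not shown by perf here).
cache_miss_pct = 100 * ratio(counts["cache-misses"], counts["cache-references"])
# CPU time used divided by wall-clock time (perf prints this as "CPUs").
cpus_used = ratio(task_clock_msecs / 1000, elapsed_seconds)

print(f"IPC            {ipc:.3f}")
print(f"branch misses  {branch_miss_pct:.3f} %")
print(f"cache misses   {cache_miss_pct:.3f} %")
print(f"CPUs used      {cpus_used:.3f}")
```

Several of the counts are marked "scaled from", so they are estimates and the ratios inherit that uncertainty.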

Desktop$ ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 08:21 ?        00:00:00 /sbin/init
root         2     0  0 08:21 ?        00:00:00 [kthreadd]
root         3     2  0 08:21 ?        00:00:00 [ksoftirqd/0]
root         5     2  0 08:21 ?        00:00:00 [kworker/u:0]
root         6     2  0 08:21 ?        00:00:00 [migration/0]
root         7     2  0 08:21 ?        00:00:00 [migration/1]
root         9     2  0 08:21 ?        00:00:00 [ksoftirqd/1]
root        10     2  0 08:21 ?        00:00:01 [kworker/0:1]

Desktop$ perf top -c 1000 -p 5
  Fatal: Permission error - are you root?
     Consider tweaking /proc/sys/kernel/perf_event_paranoid.
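
The error message points at /proc/sys/kernel/perf_event_paranoid. Here is a small sketch for reading that setting and printing what it roughly means. The level descriptions paraphrase the kernel's sysctl documentation as I understand it and may differ between kernel versions and distributions; the function names are hypothetical.

```python
from pathlib import Path
from typing import Optional

PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def read_paranoid(path: Path = PARANOID_PATH) -> Optional[int]:
    """Return the current level, or None if it cannot be read or parsed."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def describe(level: Optional[int]) -> str:
    """Rough meaning of a level for unprivileged users (assumed semantics)."""
    if level is None:
        return "setting not readable on this system"
    if level <= -1:
        return "almost no restrictions"
    if level == 0:
        return "no raw tracepoint access; CPU-wide events allowed"
    if level == 1:
        return "no CPU-wide events; kernel profiling of own tasks allowed"
    return "no kernel profiling; user-space measurements only"


if __name__ == "__main__":
    level = read_paranoid()
    print(f"perf_event_paranoid = {level}: {describe(level)}")
```

Lowering the value requires root as well, so for a one-off measurement running perf under sudo is the simpler route.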

Desktop$ sudo perf top -c 1000 -p 5
[sudo] password for widebright:

-------------------------------------------------------------------------------
   PerfTop:     302 irqs/sec  kernel:100.0%  exact:  0.0% [1000 cycles],  (target_pid: 5)
-------------------------------------------------------------------------------

             samples  pcnt function                         DSO
             _______ _____ ________________________________ ________

              603.00 19.7% i915_gem_retire_requests_ring    [i915]
              352.00 11.5% kref_put                         [kernel]
              250.00  8.2% __ticket_spin_lock               [kernel]
              184.00  6.0% i915_gem_object_move_to_inactive [i915] 
              143.00  4.7% kfree                            [kernel]
              113.00  3.7% __ticket_spin_unlock             [kernel]
              102.00  3.3% find_busiest_group               [kernel]
               69.00  2.3% i915_gem_retire_work_handler     [i915] 
               65.00  2.1% __slab_free                      [kernel]
               64.00  2.1% mod_timer                        [kernel]
               61.00  2.0% i915_gem_object_move_to_active   [i915] 
               53.00  1.7% update_cfs_load                  [kernel]
               52.00  1.7% process_one_work                 [kernel]

=========================

PERF-STAT(1)                      perf Manual                     PERF-STAT(1)

NAME

       perf-stat - Run a command and gather performance counter statistics

SYNOPSIS

       perf stat [-e <EVENT> | --event=EVENT] [-a] <command>
       perf stat [-e <EVENT> | --event=EVENT] [-a] -- <command> [<options>]

DESCRIPTION

       This command runs a command and gathers performance counter statistics
       from it.

OPTIONS

       <command>...
           Any command you can specify in a shell.

       -e, --event=

           Select the PMU event. Selection can be a symbolic event name (use
           perf list to list all events) or a raw PMU event (eventsel+umask)
           in the form of rNNN where NNN is a hexadecimal event descriptor.

       -i, --no-inherit

           child tasks do not inherit counters

       -p, --pid=<pid>

           stat events on existing process id

       -t, --tid=<tid>

           stat events on existing thread id

       -a, --all-cpus

           system-wide collection from all CPUs

       -c, --scale

           scale/normalize counter values

       -r, --repeat=<n>

           repeat command and print average + stddev (max: 100)

       -B, --big-num

           print large numbers with thousands' separators according to locale

       -C, --cpu=

           Count only on the list of CPUs provided. Multiple CPUs can be
           provided as a comma-separated list with no space: 0,1. Ranges of
           CPUs are specified with -: 0-2. In per-thread mode, this option is
           ignored. The -a option is still necessary to activate system-wide
           monitoring. Default is to count on all CPUs.

       -A, --no-aggr

           Do not aggregate counts across all monitored CPUs in system-wide
           mode (-a). This option is only valid in system-wide mode.

       -n, --null

           null run - don’t start any counters

       -v, --verbose

           be more verbose (show counter open errors, etc)

       -x SEP, --field-separator SEP

           print counts using a CSV-style output to make it easy to import
           directly into spreadsheets. Columns are separated by the string
           specified in SEP.
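
To make the -C syntax described above concrete (comma-separated CPUs such as 0,1 and ranges such as 0-2), here is a minimal sketch of a parser for that notation. It only illustrates the format and is not how perf itself parses the option; parse_cpu_list is a made-up name.

```python
from typing import List


def parse_cpu_list(spec: str) -> List[int]:
    """Expand a list such as "0,2-3" into [0, 2, 3]."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            # A range "first-last" includes both endpoints.
            first, last = part.split("-", 1)
            low, high = int(first), int(last)
            if low > high:
                raise ValueError(f"descending range: {part!r}")
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == "__main__":
    for spec in ("0,1", "0-2", "0,2-3"):
        print(spec, "->", parse_cpu_list(spec))
```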

EXAMPLES

       $ perf stat -- make -j

           Performance counter stats for 'make -j':

           8117.370256  task clock ticks     #      11.281 CPU utilization factor

                   678  context switches     #       0.000 M/sec
                   133  CPU migrations       #       0.000 M/sec
                235724  pagefaults           #       0.029 M/sec
           24821162526  CPU cycles           #    3057.784 M/sec
           18687303457  instructions         #    2302.138 M/sec
             172158895  cache references     #      21.209 M/sec
              27075259  cache misses         #       3.335 M/sec

           Wall-clock time elapsed:   719.554352 msecs
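
The M/sec figures in this example appear to be each count divided by the task clock time rather than by the wall-clock time. The sketch below recomputes two of them from the numbers above under that assumption; they agree with the printed values to within rounding. The helper name is my own.

```python
# Values copied from the 'make -j' example above.
task_clock_msecs = 8117.370256
cpu_cycles = 24_821_162_526
instructions = 18_687_303_457


def millions_per_second(count: int, busy_msecs: float) -> float:
    """Events per second of task clock time, expressed in millions."""
    return count / (busy_msecs / 1000.0) / 1e6


cycles_rate = millions_per_second(cpu_cycles, task_clock_msecs)
instructions_rate = millions_per_second(instructions, task_clock_msecs)

print(f"CPU cycles    {cycles_rate:.3f} M/sec")
print(f"instructions  {instructions_rate:.3f} M/sec")
```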

SEE ALSO

       perf-top(1), perf-list(1)