monitor/cherrypy/lib/cpstats.py

   1 """CPStats, a package for collecting and reporting on program statistics.
   2
   3 Overview
   4 ========
   5
   6 Statistics about program operation are an invaluable monitoring and debugging
   7 tool. Unfortunately, the gathering and reporting of these critical values is
   8 usually ad-hoc. This package aims to add a centralized place for gathering
   9 statistical performance data, a structure for recording that data which
  10 provides for extrapolation of that data into more useful information,
  11 and a method of serving that data to both human investigators and
  12 monitoring software. Let's examine each of those in more detail.
  13
  14 Data Gathering
  15 --------------
  16
  17 Just as Python's `logging` module provides a common importable for gathering
  18 and sending messages, performance statistics would benefit from a similar
  19 common mechanism, and one that does *not* require each package which wishes
  20 to collect stats to import a third-party module. Therefore, we choose to
  21 re-use the `logging` module by adding a `statistics` object to it.
  22
  23 That `logging.statistics` object is a nested dict. It is not a custom class,
  24 because that would 1) require libraries and applications to import a third-
  25 party module in order to participate, 2) inhibit innovation in extrapolation
  26 approaches and in reporting tools, and 3) be slow. There are, however, some
  27 specifications regarding the structure of the dict.
  28
  29     {
  30    +----"SQLAlchemy": {
  31    |        "Inserts": 4389745,
  32    |        "Inserts per Second":
  33    |            lambda s: s["Inserts"] / (time() - s["Start"]),
  34    |  C +---"Table Statistics": {
  35    |  o |        "widgets": {-----------+
  36  N |  l |            "Rows": 1.3M,      | Record
  37  a |  l |            "Inserts": 400,    |
  38  m |  e |        },---------------------+
  39  e |  c |        "froobles": {
  40  s |  t |            "Rows": 7845,
  41  p |  i |            "Inserts": 0,
  42  a |  o |        },
  43  c |  n +---},
  44  e |        "Slow Queries":
  45    |            [{"Query": "SELECT * FROM widgets;",
  46    |              "Processing Time": 47.840923343,
  47    |              },
  48    |             ],
  49    +----},
  50     }
  51
  52 The `logging.statistics` dict has four levels. The topmost level is nothing
  53 more than a set of names to introduce modularity, usually along the lines of
  54 package names. If the SQLAlchemy project wanted to participate, for example,
  55 it might populate the item `logging.statistics['SQLAlchemy']`, whose value
  56 would be a second-layer dict we call a "namespace". Namespaces help multiple
  57 packages to avoid collisions over key names, and make reports easier to read,
  58 to boot. The maintainers of SQLAlchemy should feel free to use more than one
  59 namespace if needed (such as 'SQLAlchemy ORM'). Note that there are no case
  60 or other syntax constraints on the namespace names; they should be chosen
  61 to be maximally readable by humans (neither too short nor too long).
  62
  63 Each namespace, then, is a dict of named statistical values, such as
  64 'Requests/sec' or 'Uptime'. You should choose names which will look
  65 good on a report: spaces and capitalization are just fine.
  66
  67 In addition to scalars, values in a namespace MAY be a (third-layer)
  68 dict, or a list, called a "collection". For example, the CherryPy StatsTool
  69 keeps track of what each request is doing (or has most recently done)
  70 in a 'Requests' collection, where each key is a thread ID; each
  71 value in the subdict MUST be a fourth dict (whew!) of statistical data about
  72 each thread. We call each subdict in the collection a "record". Similarly,
  73 the StatsTool also keeps a list of slow queries, where each record contains
  74 data about each slow query, in order.
  75
  76 Values in a namespace or record may also be functions, which brings us to:
  77
  78 Extrapolation
  79 -------------
  80
  81 The collection of statistical data needs to be fast, as close to unnoticeable
  82 as possible to the host program. That requires us to minimize I/O, for example,
  83 but in Python it also means we need to minimize function calls. So when you
  84 are designing your namespace and record values, try to insert the most basic
  85 scalar values you already have on hand.
  86
  87 When it comes time to report on the gathered data, however, we usually have
  88 much more freedom in what we can calculate. Therefore, whenever reporting
  89 tools (like the provided StatsPage CherryPy class) fetch the contents of
  90 `logging.statistics` for reporting, they first call `extrapolate_statistics`
  91 (passing the whole `statistics` dict as the only argument). This makes a
  92 deep copy of the statistics dict so that the reporting tool can both iterate
  93 over it and even change it without harming the original. But it also expands
  94 any functions in the dict by calling them. For example, you might have a
  95 'Current Time' entry in the namespace with the value "lambda scope: time.time()".
  96 The "scope" parameter is the current namespace dict (or record, if we're
  97 currently expanding one of those instead), allowing you access to existing
  98 static entries. If you're truly evil, you can even modify more than one entry
  99 at a time.
 100
 101 However, don't try to calculate an entry and then use its value in further
 102 extrapolations; the order in which the functions are called is not guaranteed.
 103 This can lead to a certain amount of duplicated work (or a redesign of your
 104 schema), but that's better than complicating the spec.
 105
 106 After the whole thing has been extrapolated, it's time for:
 107
 108 Reporting
 109 ---------
 110
 111 The StatsPage class grabs the `logging.statistics` dict, extrapolates it all,
 112 and then transforms it to HTML for easy viewing. Each namespace gets its own
 113 header and attribute table, plus an extra table for each collection. This is
 114 NOT part of the statistics specification; other tools can format how they like.
 115
 116 You can control which columns are output and how they are formatted by updating
 117 StatsPage.formatting, which is a dict that mirrors the keys and nesting of
 118 `logging.statistics`. The difference is that, instead of data values, it has
 119 formatting values. Use None for a given key to indicate to the StatsPage that a
 120 given column should not be output. Use a string with formatting (such as '%.3f')
 121 to interpolate the value(s), or use a callable (such as lambda v: v.isoformat())
 122 for more advanced formatting. Any entry which is not mentioned in the formatting
 123 dict is output unchanged.
 124
 125 Monitoring
 126 ----------
 127
 128 Although the HTML output takes pains to assign unique id's to each <td> with
 129 statistical data, you're probably better off fetching /cpstats/data, which
 130 outputs the whole (extrapolated) `logging.statistics` dict in JSON format.
 131 That is probably easier to parse, and doesn't have any formatting controls,
 132 so you get the "original" data in a consistently-serialized format.
 133 Note: there's no treatment yet for datetime objects. Try time.time() instead
 134 for now if you can. Nagios will probably thank you.
 135
 136 Turning Collection Off
 137 ----------------------
 138
 139 It is recommended each namespace have an "Enabled" item which, if False,
 140 stops collection (but not reporting) of statistical data. Applications
 141 SHOULD provide controls to pause and resume collection by setting these
 142 entries to False or True, if present.
 143
 144
 145 Usage
 146 =====
 147
 148 To collect statistics on CherryPy applications:
 149
 150     from cherrypy.lib import cpstats
 151     appconfig['/']['tools.cpstats.on'] = True
 152
 153 To collect statistics on your own code:
 154
 155     import logging
 156     # Initialize the repository
 157     if not hasattr(logging, 'statistics'): logging.statistics = {}
 158     # Initialize my namespace
 159     mystats = logging.statistics.setdefault('My Stuff', {})
 160     # Initialize my namespace's scalars and collections
 161     mystats.update({
 162         'Enabled': True,
 163         'Start Time': time.time(),
 164         'Important Events': 0,
 165         'Events/Second': lambda s: (
 166             (s['Important Events'] / (time.time() - s['Start Time']))),
 167         })
 168     ...
 169     for event in events:
 170         ...
 171         # Collect stats
 172         if mystats.get('Enabled', False):
 173             mystats['Important Events'] += 1
 174
 175 To report statistics:
 176
 177     root.cpstats = cpstats.StatsPage()
 178
 179 To format statistics reports:
 180
 181     See 'Reporting', above.
 182
 183 """
 184
 185 # -------------------------------- Statistics -------------------------------- #
 186
 187 import logging
 188 if not hasattr(logging, 'statistics'): logging.statistics = {}
 189
 190 def extrapolate_statistics(scope):
 191     """Return an extrapolated copy of the given scope."""
 192     c = {}
 193     for k, v in list(scope.items()):
 194         if isinstance(v, dict):
 195             v = extrapolate_statistics(v)
 196         elif isinstance(v, (list, tuple)):
 197             v = [extrapolate_statistics(record) for record in v]
 198         elif hasattr(v, '__call__'):
 199             v = v(scope)
 200         c[k] = v
 201     return c
 202
 203
 204 # --------------------- CherryPy Applications Statistics --------------------- #
 205
 206 import threading
 207 import time
 208
 209 import cherrypy
 210
 211 appstats = logging.statistics.setdefault('CherryPy Applications', {})
 212 appstats.update({
 213     'Enabled': True,
 214     'Bytes Read/Request': lambda s: (s['Total Requests'] and
 215         (s['Total Bytes Read'] / float(s['Total Requests'])) or 0.0),
 216     'Bytes Read/Second': lambda s: s['Total Bytes Read'] / s['Uptime'](s),
 217     'Bytes Written/Request': lambda s: (s['Total Requests'] and
 218         (s['Total Bytes Written'] / float(s['Total Requests'])) or 0.0),
 219     'Bytes Written/Second': lambda s: s['Total Bytes Written'] / s['Uptime'](s),
 220     'Current Time': lambda s: time.time(),
 221     'Current Requests': 0,
 222     'Requests/Second': lambda s: float(s['Total Requests']) / s['Uptime'](s),
 223     'Server Version': cherrypy.__version__,
 224     'Start Time': time.time(),
 225     'Total Bytes Read': 0,
 226     'Total Bytes Written': 0,
 227     'Total Requests': 0,
 228     'Total Time': 0,
 229     'Uptime': lambda s: time.time() - s['Start Time'],
 230     'Requests': {},
 231     })
 232
 233 proc_time = lambda s: time.time() - s['Start Time']
 234
 235
 236 class ByteCountWrapper(object):
 237     """Wraps a file-like object, counting the number of bytes read."""
 238
 239     def __init__(self, rfile):
 240         self.rfile = rfile
 241         self.bytes_read = 0
 242
 243     def read(self, size=-1):
 244         data = self.rfile.read(size)
 245         self.bytes_read += len(data)
 246         return data
 247
 248     def readline(self, size=-1):
 249         data = self.rfile.readline(size)
 250         self.bytes_read += len(data)
 251         return data
 252
 253     def readlines(self, sizehint=0):
 254         # Shamelessly stolen from StringIO
 255         total = 0
 256         lines = []
 257         line = self.readline()
 258         while line:
 259             lines.append(line)
 260             total += len(line)
 261             if 0 < sizehint <= total:
 262                 break
 263             line = self.readline()
 264         return lines
 265
 266     def close(self):
 267         self.rfile.close()
 268
 269     def __iter__(self):
 270         return self
 271
 272     def next(self):
 273         data = self.rfile.next()
 274         self.bytes_read += len(data)
 275         return data
 276
 277
 278 average_uriset_time = lambda s: s['Count'] and (s['Sum'] / s['Count']) or 0
 279
 280
 281 class StatsTool(cherrypy.Tool):
 282     """Record various information about the current request."""
 283
 284     def __init__(self):
 285         cherrypy.Tool.__init__(self, 'on_end_request', self.record_stop)
 286
 287     def _setup(self):
 288         """Hook this tool into cherrypy.request.
 289
 290         The standard CherryPy request object will automatically call this
 291         method when the tool is "turned on" in config.
 292         """
 293         if appstats.get('Enabled', False):
 294             cherrypy.Tool._setup(self)
 295             self.record_start()
 296
 297     def record_start(self):
 298         """Record the beginning of a request."""
 299         request = cherrypy.serving.request
 300         if not hasattr(request.rfile, 'bytes_read'):
 301             request.rfile = ByteCountWrapper(request.rfile)
 302             request.body.fp = request.rfile
 303
 304         r = request.remote
 305
 306         appstats['Current Requests'] += 1
 307         appstats['Total Requests'] += 1
 308         appstats['Requests'][threading._get_ident()] = {
 309             'Bytes Read': None,
 310             'Bytes Written': None,
 311             # Use a lambda so the ip gets updated by tools.proxy later
 312             'Client': lambda s: '%s:%s' % (r.ip, r.port),
 313             'End Time': None,
 314             'Processing Time': proc_time,
 315             'Request-Line': request.request_line,
 316             'Response Status': None,
 317             'Start Time': time.time(),
 318             }
 319
 320     def record_stop(self, uriset=None, slow_queries=1.0, slow_queries_count=100,
 321                     debug=False, **kwargs):
 322         """Record the end of a request."""
 323         w = appstats['Requests'][threading._get_ident()]
 324
 325         r = cherrypy.request.rfile.bytes_read
 326         w['Bytes Read'] = r
 327         appstats['Total Bytes Read'] += r
 328
 329         if cherrypy.response.stream:
 330             w['Bytes Written'] = 'chunked'
 331         else:
 332             cl = int(cherrypy.response.headers.get('Content-Length', 0))
 333             w['Bytes Written'] = cl
 334             appstats['Total Bytes Written'] += cl
 335
 336         w['Response Status'] = cherrypy.response.status
 337
 338         w['End Time'] = time.time()
 339         p = w['End Time'] - w['Start Time']
 340         w['Processing Time'] = p
 341         appstats['Total Time'] += p
 342
 343         appstats['Current Requests'] -= 1
 344
 345         if debug:
 346             cherrypy.log('Stats recorded: %s' % repr(w), 'TOOLS.CPSTATS')
 347
 348         if uriset:
 349             rs = appstats.setdefault('URI Set Tracking', {})
 350             r = rs.setdefault(uriset, {
 351                 'Min': None, 'Max': None, 'Count': 0, 'Sum': 0,
 352                 'Avg': average_uriset_time})
 353             if r['Min'] is None or p < r['Min']:
 354                 r['Min'] = p
 355             if r['Max'] is None or p > r['Max']:
 356                 r['Max'] = p
 357             r['Count'] += 1
 358             r['Sum'] += p
 359
 360         if slow_queries and p > slow_queries:
 361             sq = appstats.setdefault('Slow Queries', [])
 362             sq.append(w.copy())
 363             if len(sq) > slow_queries_count:
 364                 sq.pop(0)
 365
 366
 367 import cherrypy
 368 cherrypy.tools.cpstats = StatsTool()
 369
 370
 371 # ---------------------- CherryPy Statistics Reporting ---------------------- #
 372
 373 import os
 374 thisdir = os.path.abspath(os.path.dirname(__file__))
 375
 376 try:
 377     import json
 378 except ImportError:
 379     try:
 380         import simplejson as json
 381     except ImportError:
 382         json = None
 383
 384
 385 missing = object()
 386
 387 locale_date = lambda v: time.strftime('%c', time.gmtime(v))
 388 iso_format = lambda v: time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(v))
 389
 390 def pause_resume(ns):
 391     def _pause_resume(enabled):
 392         pause_disabled = ''
 393         resume_disabled = ''
 394         if enabled:
 395             resume_disabled = 'disabled="disabled" '
 396         else:
 397             pause_disabled = 'disabled="disabled" '
 398         return """
 399             <form action="pause" method="POST" style="display:inline">
 400             <input type="hidden" name="namespace" value="%s" />
 401             <input type="submit" value="Pause" %s/>
 402             </form>
 403             <form action="resume" method="POST" style="display:inline">
 404             <input type="hidden" name="namespace" value="%s" />
 405             <input type="submit" value="Resume" %s/>
 406             </form>
 407             """ % (ns, pause_disabled, ns, resume_disabled)
 408     return _pause_resume
 409
 410
 411 class StatsPage(object):
 412
 413     formatting = {
 414         'CherryPy Applications': {
 415             'Enabled': pause_resume('CherryPy Applications'),
 416             'Bytes Read/Request': '%.3f',
 417             'Bytes Read/Second': '%.3f',
 418             'Bytes Written/Request': '%.3f',
 419             'Bytes Written/Second': '%.3f',
 420             'Current Time': iso_format,
 421             'Requests/Second': '%.3f',
 422             'Start Time': iso_format,
 423             'Total Time': '%.3f',
 424             'Uptime': '%.3f',
 425             'Slow Queries': {
 426                 'End Time': None,
 427                 'Processing Time': '%.3f',
 428                 'Start Time': iso_format,
 429                 },
 430             'URI Set Tracking': {
 431                 'Avg': '%.3f',
 432                 'Max': '%.3f',
 433                 'Min': '%.3f',
 434                 'Sum': '%.3f',
 435                 },
 436             'Requests': {
 437                 'Bytes Read': '%s',
 438                 'Bytes Written': '%s',
 439                 'End Time': None,
 440                 'Processing Time': '%.3f',
 441                 'Start Time': None,
 442                 },
 443         },
 444         'CherryPy WSGIServer': {
 445             'Enabled': pause_resume('CherryPy WSGIServer'),
 446             'Connections/second': '%.3f',
 447             'Start time': iso_format,
 448         },
 449     }
 450
 451
 452     def index(self):
 453         # Transform the raw data into pretty output for HTML
 454         yield """
 455 <html>
 456 <head>
 457     <title>Statistics</title>
 458 <style>
 459
 460 th, td {
 461     padding: 0.25em 0.5em;
 462     border: 1px solid #666699;
 463 }
 464
 465 table {
 466     border-collapse: collapse;
 467 }
 468
 469 table.stats1 {
 470     width: 100%;
 471 }
 472
 473 table.stats1 th {
 474     font-weight: bold;
 475     text-align: right;
 476     background-color: #CCD5DD;
 477 }
 478
 479 table.stats2, h2 {
 480     margin-left: 50px;
 481 }
 482
 483 table.stats2 th {
 484     font-weight: bold;
 485     text-align: center;
 486     background-color: #CCD5DD;
 487 }
 488
 489 </style>
 490 </head>
 491 <body>
 492 """
 493         for title, scalars, collections in self.get_namespaces():
 494             yield """
 495 <h1>%s</h1>
 496
 497 <table class='stats1'>
 498     <tbody>
 499 """ % title
 500             for i, (key, value) in enumerate(scalars):
 501                 colnum = i % 3
 502                 if colnum == 0: yield """
 503         <tr>"""
 504                 yield """
 505             <th>%(key)s</th><td id='%(title)s-%(key)s'>%(value)s</td>""" % vars()
 506                 if colnum == 2: yield """
 507         </tr>"""
 508
 509             if colnum == 0: yield """
 510             <th></th><td></td>
 511             <th></th><td></td>
 512         </tr>"""
 513             elif colnum == 1: yield """
 514             <th></th><td></td>
 515         </tr>"""
 516             yield """
 517     </tbody>
 518 </table>"""
 519
 520             for subtitle, headers, subrows in collections:
 521                 yield """
 522 <h2>%s</h2>
 523 <table class='stats2'>
 524     <thead>
 525         <tr>""" % subtitle
 526                 for key in headers:
 527                     yield """
 528             <th>%s</th>""" % key
 529                 yield """
 530         </tr>
 531     </thead>
 532     <tbody>"""
 533                 for subrow in subrows:
 534                     yield """
 535         <tr>"""
 536                     for value in subrow:
 537                         yield """
 538             <td>%s</td>""" % value
 539                     yield """
 540         </tr>"""
 541                 yield """
 542     </tbody>
 543 </table>"""
 544         yield """
 545 </body>
 546 </html>
 547 """
 548     index.exposed = True
 549
 550     def get_namespaces(self):
 551         """Yield (title, scalars, collections) for each namespace."""
 552         s = extrapolate_statistics(logging.statistics)
 553         for title, ns in sorted(s.items()):
 554             scalars = []
 555             collections = []
 556             ns_fmt = self.formatting.get(title, {})
 557             for k, v in sorted(ns.items()):
 558                 fmt = ns_fmt.get(k, {})
 559                 if isinstance(v, dict):
 560                     headers, subrows = self.get_dict_collection(v, fmt)
 561                     collections.append((k, ['ID'] + headers, subrows))
 562                 elif isinstance(v, (list, tuple)):
 563                     headers, subrows = self.get_list_collection(v, fmt)
 564                     collections.append((k, headers, subrows))
 565                 else:
 566                     format = ns_fmt.get(k, missing)
 567                     if format is None:
 568                         # Don't output this column.
 569                         continue
 570                     if hasattr(format, '__call__'):
 571                         v = format(v)
 572                     elif format is not missing:
 573                         v = format % v
 574                     scalars.append((k, v))
 575             yield title, scalars, collections
 576
 577     def get_dict_collection(self, v, formatting):
 578         """Return ([headers], [rows]) for the given collection."""
 579         # E.g., the 'Requests' dict.
 580         headers = []
 581         for record in v.itervalues():
 582             for k3 in record:
 583                 format = formatting.get(k3, missing)
 584                 if format is None:
 585                     # Don't output this column.
 586                     continue
 587                 if k3 not in headers:
 588                     headers.append(k3)
 589         headers.sort()
 590
 591         subrows = []
 592         for k2, record in sorted(v.items()):
 593             subrow = [k2]
 594             for k3 in headers:
 595                 v3 = record.get(k3, '')
 596                 format = formatting.get(k3, missing)
 597                 if format is None:
 598                     # Don't output this column.
 599                     continue
 600                 if hasattr(format, '__call__'):
 601                     v3 = format(v3)
 602                 elif format is not missing:
 603                     v3 = format % v3
 604                 subrow.append(v3)
 605             subrows.append(subrow)
 606
 607         return headers, subrows
 608
 609     def get_list_collection(self, v, formatting):
 610         """Return ([headers], [subrows]) for the given collection."""
 611         # E.g., the 'Slow Queries' list.
 612         headers = []
 613         for record in v:
 614             for k3 in record:
 615                 format = formatting.get(k3, missing)
 616                 if format is None:
 617                     # Don't output this column.
 618                     continue
 619                 if k3 not in headers:
 620                     headers.append(k3)
 621         headers.sort()
 622
 623         subrows = []
 624         for record in v:
 625             subrow = []
 626             for k3 in headers:
 627                 v3 = record.get(k3, '')
 628                 format = formatting.get(k3, missing)
 629                 if format is None:
 630                     # Don't output this column.
 631                     continue
 632                 if hasattr(format, '__call__'):
 633                     v3 = format(v3)
 634                 elif format is not missing:
 635                     v3 = format % v3
 636                 subrow.append(v3)
 637             subrows.append(subrow)
 638
 639         return headers, subrows
 640
 641     if json is not None:
 642         def data(self):
 643             s = extrapolate_statistics(logging.statistics)
 644             cherrypy.response.headers['Content-Type'] = 'application/json'
 645             return json.dumps(s, sort_keys=True, indent=4)
 646         data.exposed = True
 647
 648     def pause(self, namespace):
 649         logging.statistics.get(namespace, {})['Enabled'] = False
 650         raise cherrypy.HTTPRedirect('./')
 651     pause.exposed = True
 652     pause.cp_config = {'tools.allow.on': True,
 653                        'tools.allow.methods': ['POST']}
 654
 655     def resume(self, namespace):
 656         logging.statistics.get(namespace, {})['Enabled'] = True
 657         raise cherrypy.HTTPRedirect('./')
 658     resume.exposed = True
 659     resume.cp_config = {'tools.allow.on': True,
 660                         'tools.allow.methods': ['POST']}
 661