"""Request body processing for CherryPy.

.. versionadded:: 3.2

Application authors have complete control over the parsing of HTTP request
entities. In short, :attr:`cherrypy.request.body<cherrypy._cprequest.Request.body>`
is now always set to an instance of :class:`RequestBody<cherrypy._cpreqbody.RequestBody>`,
and *that* class is a subclass of :class:`Entity<cherrypy._cpreqbody.Entity>`.

When an HTTP request includes an entity body, it is often desirable to
provide that information to applications in a form other than the raw bytes.
Different content types demand different approaches. Examples:

 * For a GIF file, we want the raw bytes in a stream.
 * An HTML form is better parsed into its component fields, and each text field
   decoded from bytes to unicode.
 * A JSON body should be deserialized into a Python dict or list.

When the request contains a Content-Type header, the media type is used as a
key to look up a value in the
:attr:`request.body.processors<cherrypy._cpreqbody.Entity.processors>` dict.
If the full media type is not found, then the major type is tried; for example,
if no processor is found for the 'image/jpeg' type, then we look for a
processor for the 'image' types altogether. If neither the full type nor the
major type has a matching processor, then a default processor is used
(:func:`default_proc<cherrypy._cpreqbody.Entity.default_proc>`). For most
types, this means no processing is done, and the body is left unread as a
raw byte stream. Processors are configurable in an 'on_start_resource' hook.
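The lookup order described above (full media type, then major type, then the
default) can be sketched in a few lines. This is only an illustration, not the
module's own code: ``find_processor`` and the three stand-in processor
functions are hypothetical names.

```python
def find_processor(content_type, processors, default):
    # Hypothetical helper: try the full media type first, e.g. 'image/jpeg'.
    if content_type in processors:
        return processors[content_type]
    # Fall back to the major type, e.g. 'image', and finally to the default.
    major_type = content_type.split('/', 1)[0]
    return processors.get(major_type, default)


# Stand-in processors, for illustration only.
def read_form(entity):
    return 'form'


def read_multipart(entity):
    return 'multipart'


def leave_unread(entity):
    return 'unread'


processors = {
    'application/x-www-form-urlencoded': read_form,
    'multipart': read_multipart,
}

chosen = find_processor('multipart/mixed', processors, leave_unread)
```

Here ``chosen`` is ``read_multipart``, because 'multipart/mixed' has no entry of
its own but its major type does; an 'image/jpeg' body would fall through to
``leave_unread``.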

Some processors, especially those for the 'text' types, attempt to decode bytes
to unicode. If the Content-Type request header includes a 'charset' parameter,
this is used to decode the entity. Otherwise, one or more default charsets may
be attempted, although this decision is up to each processor. If a processor
successfully decodes an Entity or Part, it should set the
:attr:`charset<cherrypy._cpreqbody.Entity.charset>` attribute
on the Entity or Part to the name of the successful charset, so that
applications can easily re-encode or transcode the value if they wish.
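As a rough sketch of that decoding strategy: try the charset named in the
header first, then each default in turn, and report which one succeeded. The
function name ``decode_with_fallback`` and its parameters are assumptions made
for this example; real processors decide for themselves which charsets to try.

```python
def decode_with_fallback(raw, header_charset=None, defaults=('utf-8',)):
    # Put the charset from the Content-Type header (if any) first,
    # without listing it twice.
    attempts = []
    if header_charset:
        attempts.append(header_charset)
    attempts.extend(c for c in defaults if c != header_charset)

    for charset in attempts:
        try:
            # Return the text together with the charset that worked, so the
            # caller can record it (as a processor would on entity.charset).
            return raw.decode(charset), charset
        except UnicodeDecodeError:
            continue
    raise ValueError('could not decode using any of: %r' % (attempts,))
```

Returning the successful charset alongside the text mirrors the advice above:
the caller can store it and later re-encode or transcode the value.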

If the Content-Type of the request entity is of major type 'multipart', then
the above parsing process, and possibly a decoding process, is performed for
each part.

For both the full entity and multipart parts, a Content-Disposition header may
be used to fill :attr:`name<cherrypy._cpreqbody.Entity.name>` and
:attr:`filename<cherrypy._cpreqbody.Entity.filename>` attributes on the
request.body or the Part.
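To show what those two attributes hold, here is a small sketch that pulls the
"name" and "filename" parameters out of a Content-Disposition value. It uses
the standard library's ``email.message.Message`` rather than CherryPy's own
header parsing, and ``disposition_names`` is a hypothetical helper.

```python
from email.message import Message


def disposition_names(header_value):
    # Let the standard library's MIME parser split out the parameters.
    msg = Message()
    msg['Content-Disposition'] = header_value
    # get_param() strips the surrounding quotes from the value.
    name = msg.get_param('name', header='content-disposition')
    # get_filename() returns None when no filename parameter is present.
    filename = msg.get_filename()
    return name, filename


upload = disposition_names('form-data; name="upload"; filename="report.txt"')
field = disposition_names('form-data; name="title"')
```

A part with a filename is treated as a file upload; a part with only a name is
a regular form field.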

.. _custombodyprocessors:

Custom Processors
=================

You can add your own processors for any specific or major MIME type. Simply add
it to the :attr:`processors<cherrypy._cpreqbody.Entity.processors>` dict in a
hook/tool that runs at ``on_start_resource`` or ``before_request_body``.
Here's the built-in JSON tool as an example::

    def json_in(force=True, debug=False):
        request = cherrypy.serving.request
        def json_processor(entity):
            \"""Read application/json data into request.json.\"""
            if not entity.headers.get("Content-Length", ""):
                raise cherrypy.HTTPError(411)

            body = entity.fp.read()
            try:
                request.json = json_decode(body)
            except ValueError:
                raise cherrypy.HTTPError(400, 'Invalid JSON document')
        if force:
            request.body.processors.clear()
            request.body.default_proc = cherrypy.HTTPError(
                415, 'Expected an application/json content type')
        request.body.processors['application/json'] = json_processor

We begin by defining a new ``json_processor`` function to stick in the ``processors``
dictionary. All processor functions take a single argument, the ``Entity`` instance
they are to process. It will be called whenever a request is received (for those
URIs where the tool is turned on) which has a ``Content-Type`` of
"application/json".

First, it checks that a ``Content-Length`` header is present (raising 411 if it
is missing or empty), then reads the remaining bytes on the socket. The ``fp``
object knows its own length, so it won't hang waiting for data that never
arrives. It will return when all data has been read. Then, we decode those bytes
using Python's built-in ``json`` module, and stick the decoded result onto
``request.json``. If it cannot be decoded, we raise 400.

If the "force" argument is True (the default), the ``Tool`` clears the ``processors``
dict so that request entities of other ``Content-Types`` aren't parsed at all. Since
there's no entry for those invalid MIME types, the ``default_proc`` method of
``cherrypy.request.body`` is called. By default this method does nothing, which
usually leaves the page handler free to handle the body itself. In our case,
however, we want to raise 415, so we replace ``request.body.default_proc``
with the error (``HTTPError`` instances, when called, raise themselves).

When defining a custom processor, you can do so without writing a ``Tool``;
just add the config entry::

    request.body.processors = {'application/json': json_processor}

Note that you can only replace the ``processors`` dict wholesale this way, not update the existing one.
"""

import re
import sys
import tempfile
from urllib import unquote_plus

import cherrypy
from cherrypy._cpcompat import basestring, ntob, ntou
from cherrypy.lib import httputil


# -------------------------------- Processors -------------------------------- #

def process_urlencoded(entity):
    """Read application/x-www-form-urlencoded data into entity.params."""
    qs = entity.fp.read()
    for charset in entity.attempt_charsets:
        try:
            params = {}
            for aparam in qs.split(ntob('&')):
                for pair in aparam.split(ntob(';')):
                    if not pair:
                        continue

                    atoms = pair.split(ntob('='), 1)
                    if len(atoms) == 1:
                        atoms.append(ntob(''))

                    key = unquote_plus(atoms[0]).decode(charset)
                    value = unquote_plus(atoms[1]).decode(charset)

                    if key in params:
                        if not isinstance(params[key], list):
                            params[key] = [params[key]]
                        params[key].append(value)
                    else:
                        params[key] = value
        except UnicodeDecodeError:
            pass
        else:
            entity.charset = charset
            break
    else:
        raise cherrypy.HTTPError(
            400, "The request entity could not be decoded. The following "
            "charsets were attempted: %s" % repr(entity.attempt_charsets))

    # Now that all values have been successfully parsed and decoded,
    # apply them to the entity.params dict.
    for key, value in params.items():
        if key in entity.params:
            if not isinstance(entity.params[key], list):
                entity.params[key] = [entity.params[key]]
            entity.params[key].append(value)
        else:
            entity.params[key] = value


def process_multipart(entity):
    """Read all multipart parts into entity.parts."""
    ib = ""
    if 'boundary' in entity.content_type.params:
        # http://tools.ietf.org/html/rfc2046#section-5.1.1
        # "The grammar for parameters on the Content-type field is such that it
        # is often necessary to enclose the boundary parameter values in quotes
        # on the Content-type line"
        ib = entity.content_type.params['boundary'].strip('"')

    if not re.match("^[ -~]{0,200}[!-~]$", ib):
        raise ValueError('Invalid boundary in multipart form: %r' % (ib,))

    ib = ('--' + ib).encode('ascii')

    # Find the first marker
    while True:
        b = entity.readline()
        if not b:
            return

        b = b.strip()
        if b == ib:
            break

    # Read all parts
    while True:
        part = entity.part_class.from_fp(entity.fp, ib)
        entity.parts.append(part)
        part.process()
        if part.fp.done:
            break

def process_multipart_form_data(entity):
    """Read all multipart/form-data parts into entity.parts or entity.params."""
    process_multipart(entity)

    kept_parts = []
    for part in entity.parts:
        if part.name is None:
            kept_parts.append(part)
        else:
            if part.filename is None:
                # It's a regular field
                value = part.fullvalue()
            else:
                # It's a file upload. Retain the whole part so consumer code
                # has access to its .file and .filename attributes.
                value = part

            if part.name in entity.params:
                if not isinstance(entity.params[part.name], list):
                    entity.params[part.name] = [entity.params[part.name]]
                entity.params[part.name].append(value)
            else:
                entity.params[part.name] = value

    entity.parts = kept_parts

def _old_process_multipart(entity):
    """The behavior of 3.2 and lower. Deprecated and will be changed in 3.3."""
    process_multipart(entity)

    params = entity.params

    for part in entity.parts:
        if part.name is None:
            key = ntou('parts')
        else:
            key = part.name

        if part.filename is None:
            # It's a regular field
            value = part.fullvalue()
        else:
            # It's a file upload. Retain the whole part so consumer code
            # has access to its .file and .filename attributes.
            value = part

        if key in params:
            if not isinstance(params[key], list):
                params[key] = [params[key]]
            params[key].append(value)
        else:
            params[key] = value



# --------------------------------- Entities --------------------------------- #


class Entity(object):
    """An HTTP request body, or MIME multipart body.

    This class collects information about the HTTP request entity. When a
    given entity is of MIME type "multipart", each part is parsed into its own
    Entity instance, and the set of parts stored in
    :attr:`entity.parts<cherrypy._cpreqbody.Entity.parts>`.

    Between the ``before_request_body`` and ``before_handler`` tools, CherryPy
    tries to process the request body (if any) by calling
    :func:`request.body.process<cherrypy._cpreqbody.RequestBody.process>`.
    This uses the ``content_type`` of the Entity to look up a suitable processor
    in :attr:`Entity.processors<cherrypy._cpreqbody.Entity.processors>`, a dict.
    If a matching processor cannot be found for the complete Content-Type,
    it tries again using the major type. For example, if a request with an
    entity of type "image/jpeg" arrives, but no processor can be found for
    that complete type, then one is sought for the major type "image". If a
    processor is still not found, then the
    :func:`default_proc<cherrypy._cpreqbody.Entity.default_proc>` method of the
    Entity is called (which does nothing by default; you can override this too).

    CherryPy includes processors for the "application/x-www-form-urlencoded"
    type, the "multipart/form-data" type, and the "multipart" major type.
    CherryPy 3.2 processes these types almost exactly as older versions did.
    Parts are passed as arguments to the page handler using their
    ``Content-Disposition.name`` if given, otherwise in a generic "parts"
    argument. Each such part is either a string, or the
    :class:`Part<cherrypy._cpreqbody.Part>` itself if it's a file. (In this
    case it will have ``file`` and ``filename`` attributes, or possibly a
    ``value`` attribute). Each Part is itself a subclass of
    Entity, and has its own ``process`` method and ``processors`` dict.

    There is a separate processor for the "multipart" major type which is more
    flexible, and simply stores all multipart parts in
    :attr:`request.body.parts<cherrypy._cpreqbody.Entity.parts>`. You can
    enable it with::

        cherrypy.request.body.processors['multipart'] = _cpreqbody.process_multipart

    in an ``on_start_resource`` tool.
    """

    # http://tools.ietf.org/html/rfc2046#section-4.1.2:
    # "The default character set, which must be assumed in the
    # absence of a charset parameter, is US-ASCII."
    # However, many browsers send data in utf-8 with no charset.
    attempt_charsets = ['utf-8']
    """A list of strings, each of which should be a known encoding.

    When the Content-Type of the request body warrants it, each of the given
    encodings will be tried in order. The first one to successfully decode the
    entity without raising an error is stored as
    :attr:`entity.charset<cherrypy._cpreqbody.Entity.charset>`. This defaults
    to ``['utf-8']`` (plus 'ISO-8859-1' for "text/\*" types, as required by
    `HTTP/1.1 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1>`_),
    but ``['us-ascii', 'utf-8']`` for multipart parts.
    """

    charset = None
    """The successful decoding; see "attempt_charsets" above."""

    content_type = None
    """The value of the Content-Type request header.

    If the Entity is part of a multipart payload, this will be the Content-Type
    given in the MIME headers for this part.
    """

    default_content_type = 'application/x-www-form-urlencoded'
    """This defines a default ``Content-Type`` to use if no Content-Type header
    is given. The empty string is used for RequestBody, which results in the
    request body not being read or parsed at all. This is by design; a missing
    ``Content-Type`` header in the HTTP request entity is an error at best,
    and a security hole at worst. For multipart parts, however, the MIME spec
    declares that a part with no Content-Type defaults to "text/plain"
    (see :class:`Part<cherrypy._cpreqbody.Part>`).
    """

    filename = None
    """The ``Content-Disposition.filename`` header, if available."""

    fp = None
    """The readable socket file object."""

    headers = None
    """A dict of request/multipart header names and values.

    This is a copy of the ``request.headers`` for the ``request.body``;
    for multipart parts, it is the set of headers for that part.
    """

    length = None
    """The value of the ``Content-Length`` header, if provided."""

    name = None
    """The "name" parameter of the ``Content-Disposition`` header, if any."""

    params = None
    """
    If the request Content-Type is 'application/x-www-form-urlencoded' or
    multipart, this will be a dict of the params pulled from the entity
    body; that is, it will be the portion of request.params that come
    from the message body (sometimes called "POST params", although they
    can be sent with various HTTP method verbs). This value is set between
    the 'before_request_body' and 'before_handler' hooks (assuming that
    process_request_body is True)."""

    processors = {'application/x-www-form-urlencoded': process_urlencoded,
                  'multipart/form-data': process_multipart_form_data,
                  'multipart': process_multipart,
                  }
    """A dict of Content-Type names to processor methods."""

    parts = None
    """A list of Part instances if ``Content-Type`` is of major type "multipart"."""

    part_class = None
    """The class used for multipart parts.

    You can replace this with custom subclasses to alter the processing of
    multipart parts.
    """

    def __init__(self, fp, headers, params=None, parts=None):
        # Make an instance-specific copy of the class processors
        # so Tools, etc. can replace them per-request.
        self.processors = self.processors.copy()

        self.fp = fp
        self.headers = headers

        if params is None:
            params = {}
        self.params = params

        if parts is None:
            parts = []
        self.parts = parts

        # Content-Type
        self.content_type = headers.elements('Content-Type')
        if self.content_type:
            self.content_type = self.content_type[0]
        else:
            self.content_type = httputil.HeaderElement.from_str(
                self.default_content_type)

        # Copy the class 'attempt_charsets', prepending any Content-Type charset
        dec = self.content_type.params.get("charset", None)
        if dec:
            #dec = dec.decode('ISO-8859-1')
            self.attempt_charsets = [dec] + [c for c in self.attempt_charsets
                                             if c != dec]
        else:
            self.attempt_charsets = self.attempt_charsets[:]

        # Length
        self.length = None
        clen = headers.get('Content-Length', None)
        # If Transfer-Encoding is 'chunked', ignore any Content-Length.
        if clen is not None and 'chunked' not in headers.get('Transfer-Encoding', ''):
            try:
                self.length = int(clen)
            except ValueError:
                pass

        # Content-Disposition
        self.name = None
        self.filename = None
        disp = headers.elements('Content-Disposition')
        if disp:
            disp = disp[0]
            if 'name' in disp.params:
                self.name = disp.params['name']
                if self.name.startswith('"') and self.name.endswith('"'):
                    self.name = self.name[1:-1]
            if 'filename' in disp.params:
                self.filename = disp.params['filename']
                if self.filename.startswith('"') and self.filename.endswith('"'):
                    self.filename = self.filename[1:-1]

    # The 'type' attribute is deprecated in 3.2; remove it in 3.3.
    type = property(lambda self: self.content_type,
        doc="""A deprecated alias for :attr:`content_type<cherrypy._cpreqbody.Entity.content_type>`.""")

    def read(self, size=None, fp_out=None):
        return self.fp.read(size, fp_out)

    def readline(self, size=None):
        return self.fp.readline(size)

    def readlines(self, sizehint=None):
        return self.fp.readlines(sizehint)

    def __iter__(self):
        return self

    def next(self):
        line = self.readline()
        if not line:
            raise StopIteration
        return line

    def read_into_file(self, fp_out=None):
        """Read the request body into fp_out (or make_file() if None). Return fp_out."""
        if fp_out is None:
            fp_out = self.make_file()
        self.read(fp_out=fp_out)
        return fp_out

    def make_file(self):
        """Return a file-like object into which the request body will be read.

        By default, this will return a TemporaryFile. Override as needed.
        See also :attr:`cherrypy._cpreqbody.Part.maxrambytes`."""
        return tempfile.TemporaryFile()

    def fullvalue(self):
        """Return this entity as a string, whether stored in a file or not."""
        if self.file:
            # It was stored in a tempfile. Read it.
            self.file.seek(0)
            value = self.file.read()
            self.file.seek(0)
        else:
            value = self.value
        return value

    def process(self):
        """Execute the best-match processor for the given media type."""
        proc = None
        ct = self.content_type.value
        try:
            proc = self.processors[ct]
        except KeyError:
            toptype = ct.split('/', 1)[0]
            try:
                proc = self.processors[toptype]
            except KeyError:
                pass
        if proc is None:
            self.default_proc()
        else:
            proc(self)

    def default_proc(self):
        """Called if a more-specific processor is not found for the ``Content-Type``."""
        # Leave the fp alone for someone else to read. This works fine
        # for request.body, but the Part subclasses need to override this
        # so they can move on to the next part.
        pass


class Part(Entity):
    """A MIME part entity, part of a multipart entity."""

    # "The default character set, which must be assumed in the absence of a
    # charset parameter, is US-ASCII."
    attempt_charsets = ['us-ascii', 'utf-8']
    """A list of strings, each of which should be a known encoding.

    When the Content-Type of the request body warrants it, each of the given
    encodings will be tried in order. The first one to successfully decode the
    entity without raising an error is stored as
    :attr:`entity.charset<cherrypy._cpreqbody.Entity.charset>`. This defaults
    to ``['utf-8']`` (plus 'ISO-8859-1' for "text/\*" types, as required by
    `HTTP/1.1 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1>`_),
    but ``['us-ascii', 'utf-8']`` for multipart parts.
    """

    boundary = None
    """The MIME multipart boundary."""

    default_content_type = 'text/plain'
    """This defines a default ``Content-Type`` to use if no Content-Type header
    is given. The empty string is used for RequestBody, which results in the
    request body not being read or parsed at all. This is by design; a missing
    ``Content-Type`` header in the HTTP request entity is an error at best,
    and a security hole at worst. For multipart parts, however (this class),
    the MIME spec declares that a part with no Content-Type defaults to
    "text/plain".
    """

    # This is the default in stdlib cgi. We may want to increase it.
    maxrambytes = 1000
    """The threshold of bytes after which point the ``Part`` will store its data
    in a file (generated by :func:`make_file<cherrypy._cpreqbody.Entity.make_file>`)
    instead of a string. Defaults to 1000, just like the :mod:`cgi` module in
    Python's standard library.
    """

    def __init__(self, fp, headers, boundary):
        Entity.__init__(self, fp, headers)
        self.boundary = boundary
        self.file = None
        self.value = None

    def from_fp(cls, fp, boundary):
        headers = cls.read_headers(fp)
        return cls(fp, headers, boundary)
    from_fp = classmethod(from_fp)

    def read_headers(cls, fp):
        headers = httputil.HeaderMap()
        while True:
            line = fp.readline()
            if not line:
                # No more data--illegal end of headers
                raise EOFError("Illegal end of headers.")

            if line == ntob('\r\n'):
                # Normal end of headers
                break
            if not line.endswith(ntob('\r\n')):
                raise ValueError("MIME requires CRLF terminators: %r" % line)

            if line[0] in ntob(' \t'):
                # It's a continuation line.
                v = line.strip().decode('ISO-8859-1')
            else:
                k, v = line.split(ntob(":"), 1)
                k = k.strip().decode('ISO-8859-1')
                v = v.strip().decode('ISO-8859-1')

            existing = headers.get(k)
            if existing:
                v = ", ".join((existing, v))
            headers[k] = v

        return headers
    read_headers = classmethod(read_headers)

    def read_lines_to_boundary(self, fp_out=None):
        """Read bytes from self.fp and return or write them to a file.

        If the 'fp_out' argument is None (the default), all bytes read are
        returned in a single byte string.

        If the 'fp_out' argument is not None, it must be a file-like object that
        supports the 'write' method; all bytes read will be written to the fp,
        and that fp is returned.
        """
        endmarker = self.boundary + ntob("--")
        delim = ntob("")
        prev_lf = True
        lines = []
        seen = 0
        while True:
            line = self.fp.readline(1<<16)
            if not line:
                raise EOFError("Illegal end of multipart body.")
            if line.startswith(ntob("--")) and prev_lf:
                strippedline = line.strip()
                if strippedline == self.boundary:
                    break
                if strippedline == endmarker:
                    self.fp.finish()
                    break

            line = delim + line

            if line.endswith(ntob("\r\n")):
                delim = ntob("\r\n")
                line = line[:-2]
                prev_lf = True
            elif line.endswith(ntob("\n")):
                delim = ntob("\n")
                line = line[:-1]
                prev_lf = True
            else:
                delim = ntob("")
                prev_lf = False

            if fp_out is None:
                lines.append(line)
                seen += len(line)
                if seen > self.maxrambytes:
                    fp_out = self.make_file()
                    for line in lines:
                        fp_out.write(line)
            else:
                fp_out.write(line)

        if fp_out is None:
            result = ntob('').join(lines)
            for charset in self.attempt_charsets:
                try:
                    result = result.decode(charset)
                except UnicodeDecodeError:
                    pass
                else:
                    self.charset = charset
                    return result
            else:
                raise cherrypy.HTTPError(
                    400, "The request entity could not be decoded. The following "
                    "charsets were attempted: %s" % repr(self.attempt_charsets))
        else:
            fp_out.seek(0)
            return fp_out

    def default_proc(self):
        """Called if a more-specific processor is not found for the ``Content-Type``."""
        if self.filename:
            # Always read into a file if a .filename was given.
            self.file = self.read_into_file()
        else:
            result = self.read_lines_to_boundary()
            if isinstance(result, basestring):
                self.value = result
            else:
                self.file = result

    def read_into_file(self, fp_out=None):
        """Read the request body into fp_out (or make_file() if None). Return fp_out."""
        if fp_out is None:
            fp_out = self.make_file()
        self.read_lines_to_boundary(fp_out=fp_out)
        return fp_out

Entity.part_class = Part


class Infinity(object):
    def __cmp__(self, other):
        return 1
    def __sub__(self, other):
        return self
inf = Infinity()


comma_separated_headers = ['Accept', 'Accept-Charset', 'Accept-Encoding',
    'Accept-Language', 'Accept-Ranges', 'Allow', 'Cache-Control', 'Connection',
    'Content-Encoding', 'Content-Language', 'Expect', 'If-Match',
    'If-None-Match', 'Pragma', 'Proxy-Authenticate', 'Te', 'Trailer',
    'Transfer-Encoding', 'Upgrade', 'Vary', 'Via', 'Warning', 'Www-Authenticate']


class SizedReader:

    def __init__(self, fp, length, maxbytes, bufsize=8192, has_trailers=False):
        # Wrap our fp in a buffer so peek() works
        self.fp = fp
        self.length = length
        self.maxbytes = maxbytes
        self.buffer = ntob('')
        self.bufsize = bufsize
        self.bytes_read = 0
        self.done = False
        self.has_trailers = has_trailers

    def read(self, size=None, fp_out=None):
        """Read bytes from the request body and return or write them to a file.

        A number of bytes less than or equal to the 'size' argument are read
        off the socket. The actual number of bytes read are tracked in
        self.bytes_read. The number may be smaller than 'size' when 1) the
        client sends fewer bytes, 2) the 'Content-Length' request header
        specifies fewer bytes than requested, or 3) the number of bytes read
        exceeds self.maxbytes (in which case, 413 is raised).

        If the 'fp_out' argument is None (the default), all bytes read are
        returned in a single byte string.

        If the 'fp_out' argument is not None, it must be a file-like object that
        supports the 'write' method; all bytes read will be written to the fp,
        and None is returned.
        """

        if self.length is None:
            if size is None:
                remaining = inf
            else:
                remaining = size
        else:
            remaining = self.length - self.bytes_read
            if size and size < remaining:
                remaining = size
        if remaining == 0:
            self.finish()
            if fp_out is None:
                return ntob('')
            else:
                return None

        chunks = []

        # Read bytes from the buffer.
        if self.buffer:
            if remaining is inf:
                data = self.buffer
                self.buffer = ntob('')
            else:
                data = self.buffer[:remaining]
                self.buffer = self.buffer[remaining:]
            datalen = len(data)
            remaining -= datalen

            # Check lengths.
            self.bytes_read += datalen
            if self.maxbytes and self.bytes_read > self.maxbytes:
                raise cherrypy.HTTPError(413)

            # Store the data.
            if fp_out is None:
                chunks.append(data)
            else:
                fp_out.write(data)

        # Read bytes from the socket.
        while remaining > 0:
            chunksize = min(remaining, self.bufsize)
            try:
                data = self.fp.read(chunksize)
            except Exception:
                e = sys.exc_info()[1]
                if e.__class__.__name__ == 'MaxSizeExceeded':
                    # Post data is too big
                    raise cherrypy.HTTPError(
                        413, "Maximum request length: %r" % e.args[1])
                else:
                    raise
            if not data:
                self.finish()
                break
            datalen = len(data)
            remaining -= datalen

            # Check lengths.
            self.bytes_read += datalen
            if self.maxbytes and self.bytes_read > self.maxbytes:
                raise cherrypy.HTTPError(413)

            # Store the data.
            if fp_out is None:
                chunks.append(data)
            else:
                fp_out.write(data)

        if fp_out is None:
            return ntob('').join(chunks)

    def readline(self, size=None):
        """Read a line from the request body and return it."""
        chunks = []
        while size is None or size > 0:
            chunksize = self.bufsize
            if size is not None and size < self.bufsize:
                chunksize = size
            data = self.read(chunksize)
            if not data:
                break
            pos = data.find(ntob('\n')) + 1
            if pos:
                chunks.append(data[:pos])
                remainder = data[pos:]
                self.buffer += remainder
                self.bytes_read -= len(remainder)
                break
            else:
                chunks.append(data)
        return ntob('').join(chunks)

    def readlines(self, sizehint=None):
        """Read lines from the request body and return them."""
        if self.length is not None:
            if sizehint is None:
                sizehint = self.length - self.bytes_read
            else:
                sizehint = min(sizehint, self.length - self.bytes_read)

        lines = []
        seen = 0
        while True:
            line = self.readline()
            if not line:
                break
            lines.append(line)
            seen += len(line)
            if seen >= sizehint:
                break
        return lines

    def finish(self):
        self.done = True
        if self.has_trailers and hasattr(self.fp, 'read_trailer_lines'):
            self.trailers = {}

            try:
                for line in self.fp.read_trailer_lines():
                    if line[0] in ntob(' \t'):
                        # It's a continuation line.
                        v = line.strip()
                    else:
                        try:
                            k, v = line.split(ntob(":"), 1)
                        except ValueError:
                            raise ValueError("Illegal header line.")
                        k = k.strip().title()
                        v = v.strip()

                    if k in comma_separated_headers:
                        # Look up the trailer by its own name, k.
                        existing = self.trailers.get(k)
                        if existing:
                            v = ntob(", ").join((existing, v))
                    self.trailers[k] = v
            except Exception:
                e = sys.exc_info()[1]
                if e.__class__.__name__ == 'MaxSizeExceeded':
                    # Post data is too big
                    raise cherrypy.HTTPError(
                        413, "Maximum request length: %r" % e.args[1])
                else:
                    raise


class RequestBody(Entity):
    """The entity of the HTTP request."""

    bufsize = 8 * 1024
    """The buffer size used when reading the socket."""

    # Don't parse the request body at all if the client didn't provide
    # a Content-Type header. See http://www.cherrypy.org/ticket/790
    default_content_type = ''
    """This defines a default ``Content-Type`` to use if no Content-Type header
    is given. The empty string is used for RequestBody, which results in the
    request body not being read or parsed at all. This is by design; a missing
    ``Content-Type`` header in the HTTP request entity is an error at best,
    and a security hole at worst. For multipart parts, however, the MIME spec
    declares that a part with no Content-Type defaults to "text/plain"
    (see :class:`Part<cherrypy._cpreqbody.Part>`).
    """

    maxbytes = None
    """Raise ``MaxSizeExceeded`` if more bytes than this are read from the socket."""

    def __init__(self, fp, headers, params=None, request_params=None):
        Entity.__init__(self, fp, headers, params)

        # http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1
        # When no explicit charset parameter is provided by the
        # sender, media subtypes of the "text" type are defined
        # to have a default charset value of "ISO-8859-1" when
        # received via HTTP.
        if self.content_type.value.startswith('text/'):
            for c in ('ISO-8859-1', 'iso-8859-1', 'Latin-1', 'latin-1'):
                if c in self.attempt_charsets:
                    break
            else:
                self.attempt_charsets.append('ISO-8859-1')

        # Temporary fix while deprecating passing .parts as .params.
        self.processors['multipart'] = _old_process_multipart

        if request_params is None:
            request_params = {}
        self.request_params = request_params

    def process(self):
        """Process the request entity based on its Content-Type."""
        # "The presence of a message-body in a request is signaled by the
        # inclusion of a Content-Length or Transfer-Encoding header field in
        # the request's message-headers."
        # It is possible to send a POST request with no body, for example;
        # however, app developers are responsible in that case to set
        # cherrypy.request.process_request_body to False so this method
        # isn't called.
        h = cherrypy.serving.request.headers
        if 'Content-Length' not in h and 'Transfer-Encoding' not in h:
            raise cherrypy.HTTPError(411)

        self.fp = SizedReader(self.fp, self.length,
                              self.maxbytes, bufsize=self.bufsize,
                              has_trailers='Trailer' in h)
        super(RequestBody, self).process()

        # Body params should also be a part of the request_params
        # add them in here.
        request_params = self.request_params
        for key, value in self.params.items():
            # Python 2 only: keyword arguments must be byte strings (type 'str').
            if isinstance(key, unicode):
                key = key.encode('ISO-8859-1')

            if key in request_params:
                if not isinstance(request_params[key], list):
                    request_params[key] = [request_params[key]]
                request_params[key].append(value)
            else:
                request_params[key] = value