symbian-qemu-0.9.1-12/python-2.6.1/Doc/library/string.rst
changeset 1 2fb8b9db1c86
0:ffa851df0825 1:2fb8b9db1c86
       
     1 :mod:`string` --- Common string operations
       
     2 ==========================================
       
     3 
       
     4 .. module:: string
       
     5    :synopsis: Common string operations.
       
     6 
       
     7 
       
     8 .. index:: module: re
       
     9 
       
    10 The :mod:`string` module contains a number of useful constants and
       
    11 classes, as well as some deprecated legacy functions that are also
       
    12 available as methods on strings. In addition, Python's built-in string
       
    13 classes support the sequence type methods described in the
       
    14 :ref:`typesseq` section, and also the string-specific methods described
       
    15 in the :ref:`string-methods` section. To output formatted strings use
       
    16 template strings or the ``%`` operator described in the
       
    17 :ref:`string-formatting` section. Also, see the :mod:`re` module for
       
    18 string functions based on regular expressions.
       
    19 
       
    20 
       
    21 String constants
       
    22 ----------------
       
    23 
       
    24 The constants defined in this module are:
       
    25 
       
    26 
       
    27 .. data:: ascii_letters
       
    28 
       
    29    The concatenation of the :const:`ascii_lowercase` and :const:`ascii_uppercase`
       
    30    constants described below.  This value is not locale-dependent.
       
    31 
       
    32 
       
    33 .. data:: ascii_lowercase
       
    34 
       
    35    The lowercase letters ``'abcdefghijklmnopqrstuvwxyz'``.  This value is not
       
    36    locale-dependent and will not change.
       
    37 
       
    38 
       
    39 .. data:: ascii_uppercase
       
    40 
       
    41    The uppercase letters ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.  This value is not
       
    42    locale-dependent and will not change.
       
    43 
       
    44 
       
    45 .. data:: digits
       
    46 
       
    47    The string ``'0123456789'``.
       
    48 
       
    49 
       
    50 .. data:: hexdigits
       
    51 
       
    52    The string ``'0123456789abcdefABCDEF'``.
       
    53 
       
    54 
       
    55 .. data:: letters
       
    56 
       
    57    The concatenation of the strings :const:`lowercase` and :const:`uppercase`
       
    58    described below.  The specific value is locale-dependent, and will be updated
       
    59    when :func:`locale.setlocale` is called.
       
    60 
       
    61 
       
    62 .. data:: lowercase
       
    63 
       
    64    A string containing all the characters that are considered lowercase letters.
       
    65    On most systems this is the string ``'abcdefghijklmnopqrstuvwxyz'``.  Do not
       
    66    change its definition --- the effect on the routines :func:`upper` and
       
    67    :func:`swapcase` is undefined.  The specific value is locale-dependent, and will
       
    68    be updated when :func:`locale.setlocale` is called.
       
    69 
       
    70 
       
    71 .. data:: octdigits
       
    72 
       
    73    The string ``'01234567'``.
       
    74 
       
    75 
       
    76 .. data:: punctuation
       
    77 
       
    78    String of ASCII characters which are considered punctuation characters in the
       
    79    ``C`` locale.
       
    80 
       
    81 
       
    82 .. data:: printable
       
    83 
       
    84    String of characters which are considered printable.  This is a combination of
       
    85    :const:`digits`, :const:`letters`, :const:`punctuation`, and
       
    86    :const:`whitespace`.
       
    87 
       
    88 
       
    89 .. data:: uppercase
       
    90 
       
    91    A string containing all the characters that are considered uppercase letters.
       
    92    On most systems this is the string ``'ABCDEFGHIJKLMNOPQRSTUVWXYZ'``.  Do not
       
    93    change its definition --- the effect on the routines :func:`lower` and
       
    94    :func:`swapcase` is undefined.  The specific value is locale-dependent, and will
       
    95    be updated when :func:`locale.setlocale` is called.
       
    96 
       
    97 
       
    98 .. data:: whitespace
       
    99 
       
   100    A string containing all characters that are considered whitespace. On most
       
   101    systems this includes the characters space, tab, linefeed, return, formfeed, and
       
   102    vertical tab.  Do not change its definition --- the effect on the routines
       
   103    :func:`strip` and :func:`split` is undefined.
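As a small illustration of the locale-independent constants above, the following sketch builds an alphabet and validates input against it. The names ``ALNUM`` and ``is_hex`` are made up for this example and are not part of the module.

```python
import string

# Locale-independent alphabet: 52 ASCII letters plus 10 digits.
ALNUM = string.ascii_letters + string.digits


def is_hex(text):
    """Return True if text is non-empty and contains only hexadecimal digits."""
    return bool(text) and all(ch in string.hexdigits for ch in text)
```

Using ``ascii_letters`` rather than ``letters`` keeps the result stable even if the locale changes at run time.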
       
   104 
       
   105 
       
   106 .. _new-string-formatting:
       
   107 
       
   108 String Formatting
       
   109 -----------------
       
   110 
       
   111 Starting in Python 2.6, the built-in :class:`str` and :class:`unicode` classes provide the ability
       
   112 to do complex variable substitutions and value formatting via the
       
   113 :meth:`str.format` method described in :pep:`3101`.  The :class:`Formatter`
       
   114 class in the :mod:`string` module allows you to create and customize your own
       
   115 string formatting behaviors using the same implementation as the built-in
       
   116 :meth:`str.format` method.
       
   117 
       
   118 .. class:: Formatter
       
   119 
       
   120    The :class:`Formatter` class has the following public methods:
       
   121 
       
   122    .. method:: format(format_string, *args, **kwargs)
       
   123 
       
   124       :meth:`format` is the primary API method.  It takes a format template
       
   125       string, and an arbitrary set of positional and keyword arguments.
       
   126       :meth:`format` is just a wrapper that calls :meth:`vformat`.
       
   127 
       
   128    .. method:: vformat(format_string, args, kwargs)
       
   129    
       
   130       This function does the actual work of formatting.  It is exposed as a
       
   131       separate function for cases where you want to pass in a predefined
       
   132       dictionary of arguments, rather than unpacking and repacking the
       
   133       dictionary as individual arguments using the ``*args`` and ``**kwds``
       
   134       syntax.  :meth:`vformat` does the work of breaking up the format template
       
   135       string into character data and replacement fields.  It calls the various
       
   136       methods described below.
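A minimal sketch of the difference between the two entry points, assuming the arguments already exist as a tuple and a dictionary (the variable names are illustrative only):

```python
from string import Formatter

fmt = Formatter()

# format() takes the arguments directly ...
direct = fmt.format("{0} likes {what}", "tim", what="kung pao")

# ... while vformat() takes an existing sequence and mapping as they are,
# without unpacking and repacking them.
args = ("tim",)
kwargs = {"what": "kung pao"}
prebuilt = fmt.vformat("{0} likes {what}", args, kwargs)
```

Both calls produce the same string; ``vformat`` simply avoids the intermediate argument unpacking.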
       
   137 
       
   138    In addition, the :class:`Formatter` defines a number of methods that are
       
   139    intended to be replaced by subclasses:
       
   140 
       
   141    .. method:: parse(format_string)
       
   142    
       
   143       Loop over the format_string and return an iterable of tuples
       
   144       (*literal_text*, *field_name*, *format_spec*, *conversion*).  This is used
       
   145       by :meth:`vformat` to break the string into either literal text or
       
   146       replacement fields.
       
   147       
       
   148       The values in the tuple conceptually represent a span of literal text
       
   149       followed by a single replacement field.  If there is no literal text
       
   150       (which can happen if two replacement fields occur consecutively), then
       
   151       *literal_text* will be a zero-length string.  If there is no replacement
       
   152       field, then the values of *field_name*, *format_spec* and *conversion*
       
   153       will be ``None``.
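The tuples described above can be inspected directly. This is only a sketch; the format string is an arbitrary example.

```python
from string import Formatter

# One replacement field followed by trailing literal text.
pieces = list(Formatter().parse("x {0!r:>5} y"))

# pieces[0] describes the literal "x " plus the field {0!r:>5}.
# pieces[1] describes the trailing literal " y", which has no field.
first, last = pieces
```

The second tuple carries ``None`` for the three field-related values, since no replacement field follows the final literal text.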
       
   154 
       
   155    .. method:: get_field(field_name, args, kwargs)
       
   156 
       
   157       Given *field_name* as returned by :meth:`parse` (see above), convert it to
       
   158       an object to be formatted.  Returns a tuple (obj, used_key).  The default
       
   159       version takes strings of the form defined in :pep:`3101`, such as
       
   160       "0[name]" or "label.title".  *args* and *kwargs* are as passed in to
       
   161       :meth:`vformat`.  The return value *used_key* has the same meaning as the
       
   162       *key* parameter to :meth:`get_value`.
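As a hedged illustration, :meth:`get_field` can be called on its own to resolve a compound field name. The ``record`` dictionary is a made-up argument for this sketch.

```python
from string import Formatter

record = {"name": "tim"}

# "0[name]" means: take positional argument 0, then index it with "name".
obj, used_key = Formatter().get_field("0[name]", (record,), {})
```

Here ``obj`` is the value that would be formatted, and ``used_key`` is the key of the top-level argument that was consulted.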
       
   163 
       
   164    .. method:: get_value(key, args, kwargs)
       
   165    
       
   166       Retrieve a given field value.  The *key* argument will be either an
       
   167       integer or a string.  If it is an integer, it represents the index of the
       
   168       positional argument in *args*; if it is a string, then it represents a
       
   169       named argument in *kwargs*.
       
   170 
       
   171       The *args* parameter is set to the list of positional arguments to
       
   172       :meth:`vformat`, and the *kwargs* parameter is set to the dictionary of
       
   173       keyword arguments.
       
   174 
       
   175       For compound field names, these functions are only called for the first
       
   176       component of the field name; subsequent components are handled through
       
   177       normal attribute and indexing operations.
       
   178 
       
   179       So for example, the field expression '0.name' would cause
       
   180       :meth:`get_value` to be called with a *key* argument of 0.  The ``name``
       
   181       attribute will be looked up after :meth:`get_value` returns by calling the
       
   182       built-in :func:`getattr` function.
       
   183 
       
   184       If the index or keyword refers to an item that does not exist, then an
       
   185       :exc:`IndexError` or :exc:`KeyError` should be raised.
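One possible use of this hook is to substitute a default for missing arguments. The sketch below is an assumption-laden example, not part of the module: ``DefaultFormatter`` and its ``default`` parameter are hypothetical names.

```python
from string import Formatter


class DefaultFormatter(Formatter):
    """Hypothetical subclass that falls back to a default for missing keys."""

    def __init__(self, default="?"):
        Formatter.__init__(self)
        self.default = default

    def get_value(self, key, args, kwargs):
        try:
            # The base class looks up integers in args and strings in kwargs.
            return Formatter.get_value(self, key, args, kwargs)
        except (IndexError, KeyError):
            return self.default


result = DefaultFormatter("n/a").format("{0} {missing}", "a")
```

Because only the first component of a compound field name reaches :meth:`get_value`, a field such as ``{missing.attr}`` would then look up ``attr`` on the default object.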
       
   186 
       
   187    .. method:: check_unused_args(used_args, args, kwargs)
       
   188 
       
   189       Implement checking for unused arguments if desired.  The arguments to this
       
   190       function are the set of all argument keys that were actually referred to in
       
   191       the format string (integers for positional arguments, and strings for
       
   192       named arguments), and a reference to the *args* and *kwargs* that were
       
   193       passed to :meth:`vformat`.  The set of unused args can be calculated from these
       
   194       parameters.  :meth:`check_unused_args` is expected to raise an exception if
       
   195       the check fails.
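A sketch of one way to implement such a check follows. ``StrictFormatter`` is a hypothetical name, and raising :exc:`ValueError` is a choice made for this example rather than a requirement.

```python
from string import Formatter


class StrictFormatter(Formatter):
    """Hypothetical subclass that rejects arguments the format string ignores."""

    def check_unused_args(self, used_args, args, kwargs):
        # used_args holds integers for positional and strings for named arguments.
        unused = [i for i in range(len(args)) if i not in used_args]
        unused += [k for k in kwargs if k not in used_args]
        if unused:
            raise ValueError("unused format arguments: %r" % (unused,))


ok = StrictFormatter().format("{0} {name}", "a", name="b")
```

Calling ``StrictFormatter().format("{0}", "a", "b")`` would raise the error, because positional argument 1 is never referenced.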
       
   196 
       
   197    .. method:: format_field(value, format_spec)
       
   198 
       
   199       :meth:`format_field` simply calls the global :func:`format` built-in.  The
       
   200       method is provided so that subclasses can override it.
       
   201 
       
   202    .. method:: convert_field(value, conversion)
       
   203    
       
   204       Converts the value (returned by :meth:`get_field`) given a conversion type
       
   205       (as in the tuple returned by the :meth:`parse` method).  The default
       
   206       version understands 'r' (repr) and 's' (str) conversion types.
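Subclasses can extend the set of conversions. The sketch below assumes that an unrecognized conversion character is passed through to :meth:`convert_field`; the ``'u'`` conversion and the ``UpperFormatter`` name are invented for this example.

```python
from string import Formatter


class UpperFormatter(Formatter):
    """Hypothetical subclass adding a 'u' conversion that upper-cases the value."""

    def convert_field(self, value, conversion):
        if conversion == "u":
            return str(value).upper()
        # Defer to the default handling of 'r', 's' and no conversion.
        return Formatter.convert_field(self, value, conversion)


shout = UpperFormatter().format("{0!u} and {0!r}", "spam")
```

Note that the custom conversion is only understood by this subclass; the built-in :meth:`str.format` would reject it.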
       
   207 
       
   208 
       
   209 .. _formatstrings:
       
   210 
       
   211 Format String Syntax
       
   212 --------------------
       
   213 
       
   214 The :meth:`str.format` method and the :class:`Formatter` class share the same
       
   215 syntax for format strings (although in the case of :class:`Formatter`,
       
   216 subclasses can define their own format string syntax.)
       
   217 
       
   218 Format strings contain "replacement fields" surrounded by curly braces ``{}``.
       
   219 Anything that is not contained in braces is considered literal text, which is
       
   220 copied unchanged to the output.  If you need to include a brace character in the
       
   221 literal text, it can be escaped by doubling: ``{{`` and ``}}``.
       
   222 
       
   223 The grammar for a replacement field is as follows:
       
   224 
       
   225    .. productionlist:: sf
       
   226       replacement_field: "{" `field_name` ["!" `conversion`] [":" `format_spec`] "}"
       
   227       field_name: (`identifier` | `integer`) ("." `attribute_name` | "[" element_index "]")*
       
   228       attribute_name: `identifier`
       
   229       element_index: `integer`
       
   230       conversion: "r" | "s"
       
   231       format_spec: <described in the next section>
       
   232       
       
   233 In less formal terms, the replacement field starts with a *field_name*, which
       
   234 can either be a number (for a positional argument), or an identifier (for
       
   235 keyword arguments).  Following this is an optional *conversion* field, which is
       
   236 preceded by an exclamation point ``'!'``, and a *format_spec*, which is preceded
       
   237 by a colon ``':'``.
       
   238 
       
   239 The *field_name* itself begins with either a number or a keyword.  If it's a
       
   240 number, it refers to a positional argument, and if it's a keyword it refers to a
       
   241 named keyword argument.  This can be followed by any number of index or
       
   242 attribute expressions. An expression of the form ``'.name'`` selects the named
       
   243 attribute using :func:`getattr`, while an expression of the form ``'[index]'``
       
   244 does an index lookup using :meth:`__getitem__`.
       
   245 
       
   246 Some simple format string examples::
       
   247 
       
   248    "First, thou shalt count to {0}" # References first positional argument
       
   249    "My quest is {name}"             # References keyword argument 'name'
       
   250    "Weight in tons {0.weight}"      # 'weight' attribute of first positional arg
       
   251    "Units destroyed: {players[0]}"  # First element of keyword argument 'players'.
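The format strings above could be exercised as follows. The ``Cargo`` class and all argument values are placeholders chosen for this sketch.

```python
class Cargo(object):
    """Placeholder object with a 'weight' attribute."""

    def __init__(self, weight):
        self.weight = weight


count = "First, thou shalt count to {0}".format(3)
quest = "My quest is {name}".format(name="to seek the Holy Grail")
weight = "Weight in tons {0.weight}".format(Cargo(5))      # getattr lookup
units = "Units destroyed: {players[0]}".format(players=[7, 2])  # index lookup
```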
       
   252    
       
   253 The *conversion* field causes a type coercion before formatting.  Normally, the
       
   254 job of formatting a value is done by the :meth:`__format__` method of the value
       
   255 itself.  However, in some cases it is desirable to force a type to be formatted
       
   256 as a string, overriding its own definition of formatting.  By converting the
       
   257 value to a string before calling :meth:`__format__`, the normal formatting logic
       
   258 is bypassed.
       
   259 
       
   260 Two conversion flags are currently supported: ``'!s'`` which calls :func:`str`
       
   261 on the value, and ``'!r'`` which calls :func:`repr`.
       
   262 
       
   263 Some examples::
       
   264 
       
   265    "Harold's a clever {0!s}"        # Calls str() on the argument first
       
   266    "Bring out the holy {name!r}"    # Calls repr() on the argument first
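Applied to plain string arguments (chosen arbitrarily here), the two conversions behave as sketched below:

```python
# !s formats the result of str(); for a string the text is unchanged.
plain = "Harold's a clever {0!s}".format("sheep")

# !r formats the result of repr(); for a string this adds quotes.
quoted = "Bring out the holy {name!r}".format(name="hand grenade")
```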
       
   267 
       
   268 The *format_spec* field contains a specification of how the value should be
       
   269 presented, including such details as field width, alignment, padding, decimal
       
   270 precision and so on.  Each value type can define its own "formatting
       
   271 mini-language" or interpretation of the *format_spec*.
       
   272 
       
   273 Most built-in types support a common formatting mini-language, which is
       
   274 described in the next section.
       
   275 
       
   276 A *format_spec* field can also include nested replacement fields within it.
       
   277 These nested replacement fields can contain only a field name; conversion flags
       
   278 and format specifications are not allowed.  The replacement fields within the
       
   279 format_spec are substituted before the *format_spec* string is interpreted.
       
   280 This allows the formatting of a value to be dynamically specified.
       
   281 
       
   282 For example, suppose you wanted to have a replacement field whose field width is
       
   283 determined by another variable::
       
   284 
       
   285    "A man with two {0:{1}}".format("noses", 10)
       
   286 
       
   287 This would first evaluate the inner replacement field, making the format string
       
   288 effectively::
       
   289 
       
   290    "A man with two {0:10}"
       
   291 
       
   292 Then the outer replacement field would be evaluated, producing::
       
   293 
       
   294    "noses     "
       
   295    
       
   296 Which is substituted into the string, yielding::
       
   297    
       
   298    "A man with two noses     "
       
   299    
       
   300 (The extra space is because we specified a field width of 10, and because left
       
   301 alignment is the default for strings.)
       
   302 
       
   303 
       
   304 .. _formatspec:
       
   305 
       
   306 Format Specification Mini-Language
       
   307 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       
   308 
       
   309 "Format specifications" are used within replacement fields contained within a
       
   310 format string to define how individual values are presented (see
       
   311 :ref:`formatstrings`.)  They can also be passed directly to the builtin
       
   312 :func:`format` function.  Each formattable type may define how the format
       
   313 specification is to be interpreted.
       
   314 
       
   315 Most built-in types implement the following options for format specifications,
       
   316 although some of the formatting options are only supported by the numeric types.
       
   317 
       
   318 A general convention is that an empty format string (``""``) produces the same
       
   319 result as if you had called :func:`str` on the value.
       
   320 
       
   321 The general form of a *standard format specifier* is:
       
   322 
       
   323 .. productionlist:: sf
       
   324    format_spec: [[`fill`]`align`][`sign`][#][0][`width`][.`precision`][`type`]
       
   325    fill: <a character other than '}'>
       
   326    align: "<" | ">" | "=" | "^"
       
   327    sign: "+" | "-" | " "
       
   328    width: `integer`
       
   329    precision: `integer`
       
   330    type: "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "x" | "X" | "%"
       
   331    
       
   332 The *fill* character can be any character other than '}' (which signifies the
       
   333 end of the field).  The presence of a fill character is signaled by the *next*
       
   334 character, which must be one of the alignment options. If the second character
       
   335 of *format_spec* is not a valid alignment option, then it is assumed that both
       
   336 the fill character and the alignment option are absent.
       
   337 
       
   338 The meaning of the various alignment options is as follows:
       
   339 
       
   340    +---------+----------------------------------------------------------+
       
   341    | Option  | Meaning                                                  |
       
   342    +=========+==========================================================+
       
   343    | ``'<'`` | Forces the field to be left-aligned within the available |
       
   344    |         | space (this is the default for most objects).            |
       
   345    +---------+----------------------------------------------------------+
       
   346    | ``'>'`` | Forces the field to be right-aligned within the          |
       
   347    |         | available space (this is the default for numbers).       |
       
   348    +---------+----------------------------------------------------------+
       
   349    | ``'='`` | Forces the padding to be placed after the sign (if any)  |
       
   350    |         | but before the digits.  This is used for printing fields |
       
   351    |         | in the form '+000000120'. This alignment option is only  |
       
   352    |         | valid for numeric types.                                 |
       
   353    +---------+----------------------------------------------------------+
       
   354    | ``'^'`` | Forces the field to be centered within the available     |
       
   355    |         | space.                                                   |
       
   356    +---------+----------------------------------------------------------+
       
   357 
       
   358 Note that unless a minimum field width is defined, the field width will always
       
   359 be the same size as the data to fill it, so that the alignment option has no
       
   360 meaning in this case.
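A brief sketch of the alignment options, using the built-in :func:`format` function with arbitrary sample values:

```python
left = format("abc", "<7")        # padding goes on the right
right = format("abc", ">7")       # padding goes on the left
centered = format("abc", "*^7")   # '*' is the fill character
signed = format(120, "0=+10")     # fill goes between the sign and the digits
```

Each spec sets a minimum width; without one, none of the alignment options would have a visible effect.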
       
   361 
       
   362 The *sign* option is only valid for number types, and can be one of the
       
   363 following:
       
   364 
       
   365    +---------+----------------------------------------------------------+
       
   366    | Option  | Meaning                                                  |
       
   367    +=========+==========================================================+
       
   368    | ``'+'`` | indicates that a sign should be used for both            |
       
   369    |         | positive as well as negative numbers.                    |
       
   370    +---------+----------------------------------------------------------+
       
   371    | ``'-'`` | indicates that a sign should be used only for negative   |
       
   372    |         | numbers (this is the default behavior).                  |
       
   373    +---------+----------------------------------------------------------+
       
   374    | space   | indicates that a leading space should be used on         |
       
   375    |         | positive numbers, and a minus sign on negative numbers.  |
       
   376    +---------+----------------------------------------------------------+
       
   377 
       
   378 The ``'#'`` option is only valid for integers, and only for binary, octal, or
       
   379 hexadecimal output.  If present, it specifies that the output will be prefixed
       
   380 by ``'0b'``, ``'0o'``, or ``'0x'``, respectively.
       
   381 
       
   382 *width* is a decimal integer defining the minimum field width.  If not
       
   383 specified, then the field width will be determined by the content.
       
   384 
       
   385 If the *width* field is preceded by a zero (``'0'``) character, this enables
       
   386 zero-padding.  This is equivalent to an *alignment* type of ``'='`` and a *fill*
       
   387 character of ``'0'``.
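The sign, ``'#'`` and zero-padding options can be tried out with :func:`format`; the values below are arbitrary examples.

```python
plus = format(5, "+d")       # explicit sign on a positive number
space = format(5, " d")      # leading space instead of a plus sign
minus = format(-5, " d")     # negative numbers still get a minus sign
hexed = format(255, "#x")    # '#' adds the base prefix
binary = format(5, "#b")
padded = format(-42, "05d")  # zeros are placed after the sign
```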
       
   388 
       
   389 The *precision* is a decimal number indicating how many digits should be
       
   390 displayed after the decimal point for a floating point value formatted with
       
   391 ``'f'`` and ``'F'``, or before and after the decimal point for a floating point
       
   392 value formatted with ``'g'`` or ``'G'``.  For non-number types the field
       
   393 indicates the maximum field size - in other words, how many characters will be
       
   394 used from the field content. The *precision* is ignored for integer values.
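To make the three meanings of *precision* concrete, here is a short sketch with arbitrary values:

```python
fixed = format(3.14159, ".2f")      # two digits after the decimal point
general = format(1234.5678, ".3g")  # three significant digits in total
clipped = format("abcdef", ".3")    # at most three characters of the string
```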
       
   395 
       
   396 Finally, the *type* determines how the data should be presented.
       
   397 
       
   398 The available integer presentation types are:
       
   399 
       
   400    +---------+----------------------------------------------------------+
       
   401    | Type    | Meaning                                                  |
       
   402    +=========+==========================================================+
       
   403    | ``'b'`` | Binary format. Outputs the number in base 2.             |
       
   404    +---------+----------------------------------------------------------+
       
   405    | ``'c'`` | Character. Converts the integer to the corresponding     |
       
   406    |         | unicode character before printing.                       |
       
   407    +---------+----------------------------------------------------------+
       
   408    | ``'d'`` | Decimal Integer. Outputs the number in base 10.          |
       
   409    +---------+----------------------------------------------------------+
       
   410    | ``'o'`` | Octal format. Outputs the number in base 8.              |
       
   411    +---------+----------------------------------------------------------+
       
   412    | ``'x'`` | Hex format. Outputs the number in base 16, using lower-  |
       
   413    |         | case letters for the digits above 9.                     |
       
   414    +---------+----------------------------------------------------------+
       
   415    | ``'X'`` | Hex format. Outputs the number in base 16, using upper-  |
       
   416    |         | case letters for the digits above 9.                     |
       
   417    +---------+----------------------------------------------------------+
       
   418    | ``'n'`` | Number. This is the same as ``'d'``, except that it uses |
       
   419    |         | the current locale setting to insert the appropriate     |
       
   420    |         | number separator characters.                             |
       
   421    +---------+----------------------------------------------------------+
       
   422    | None    | The same as ``'d'``.                                     |
       
   423    +---------+----------------------------------------------------------+
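The integer presentation types in the table can be compared side by side. The sample values are arbitrary, and ``'n'`` is left out because its output depends on the current locale.

```python
as_binary = format(10, "b")
as_char = format(65, "c")
as_decimal = format(42, "d")
as_octal = format(8, "o")
as_hex_lower = format(255, "x")
as_hex_upper = format(255, "X")
as_default = format(42, "")   # no type given: same as 'd'
```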
       
   424                                                                          
       
   425 The available presentation types for floating point and decimal values are:
       
   426                                                                          
       
   427    +---------+----------------------------------------------------------+
       
   428    | Type    | Meaning                                                  |
       
   429    +=========+==========================================================+
       
   430    | ``'e'`` | Exponent notation. Prints the number in scientific       |
       
   431    |         | notation using the letter 'e' to indicate the exponent.  |
       
   432    +---------+----------------------------------------------------------+
       
   433    | ``'E'`` | Exponent notation. Same as ``'e'`` except it uses an     |
       
   434    |         | upper case 'E' as the separator character.               |
       
   435    +---------+----------------------------------------------------------+
       
   436    | ``'f'`` | Fixed point. Displays the number as a fixed-point        |
       
   437    |         | number.                                                  |
       
   438    +---------+----------------------------------------------------------+
       
   439    | ``'F'`` | Fixed point. Same as ``'f'``.                            |
       
   440    +---------+----------------------------------------------------------+
       
   441    | ``'g'`` | General format. This prints the number as a fixed-point  |
       
   442    |         | number, unless the number is too large, in which case    |
       
   443    |         | it switches to ``'e'`` exponent notation. Infinity and   |
       
   444    |         | NaN values are formatted as ``inf``, ``-inf`` and        |
       
   445    |         | ``nan``, respectively.                                   |
       
   446    +---------+----------------------------------------------------------+
       
   447    | ``'G'`` | General format. Same as ``'g'`` except switches to       |
       
   448    |         | ``'E'`` if the number gets too large. The                |
       
   449    |         | representations of infinity and NaN are uppercased, too. |
       
   450    +---------+----------------------------------------------------------+
       
   451    | ``'n'`` | Number. This is the same as ``'g'``, except that it uses |
       
   452    |         | the current locale setting to insert the appropriate     |
       
   453    |         | number separator characters.                             |
       
   454    +---------+----------------------------------------------------------+
       
   455    | ``'%'`` | Percentage. Multiplies the number by 100 and displays    |
       
   456    |         | in fixed (``'f'``) format, followed by a percent sign.   |
       
   457    +---------+----------------------------------------------------------+
       
   458    | None    | The same as ``'g'``.                                     |
       
   459    +---------+----------------------------------------------------------+
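A sketch of the floating point presentation types, using arbitrary values. ``'n'`` is again omitted because it is locale-dependent.

```python
exp_lower = format(1234.5, "e")
exp_upper = format(1234.5, "E")
fixed = format(2.5, "f")           # six digits after the point by default
percent = format(0.25, "%")        # multiplied by 100, then 'f' format
short_percent = format(0.5, ".1%")
inf_lower = format(float("inf"), "g")
inf_upper = format(float("inf"), "G")
```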
       
   460 
       
   461 
       
   462 Template strings
       
   463 ----------------
       
   464 
       
   465 Templates provide simpler string substitutions as described in :pep:`292`.
       
   466 Instead of the normal ``%``\ -based substitutions, Templates support ``$``\
       
   467 -based substitutions, using the following rules:
       
   468 
       
   469 * ``$$`` is an escape; it is replaced with a single ``$``.
       
   470 
       
   471 * ``$identifier`` names a substitution placeholder matching a mapping key of
       
   472   ``"identifier"``.  By default, ``"identifier"`` must spell a Python
       
   473   identifier.  The first non-identifier character after the ``$`` character
       
   474   terminates this placeholder specification.
       
   475 
       
   476 * ``${identifier}`` is equivalent to ``$identifier``.  It is required when valid
       
   477   identifier characters follow the placeholder but are not part of the
       
   478   placeholder, such as ``"${noun}ification"``.
       
   479 
       
   480 Any other appearance of ``$`` in the string will result in a :exc:`ValueError`
       
   481 being raised.
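The three rules can be seen together in one template. This is a minimal sketch; the template text and values are made up.

```python
from string import Template

# $$ is an escaped dollar sign, ${noun} is a braced placeholder that is
# directly followed by identifier characters, and $who is a plain placeholder.
t = Template("$$5 for the ${noun}ification of $who")
text = t.substitute(noun="class", who="tim")
```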
       
   482 
       
   483 .. versionadded:: 2.4
       
   484 
       
   485 The :mod:`string` module provides a :class:`Template` class that implements
       
   486 these rules.  The methods of :class:`Template` are:
       
   487 
       
   488 
       
   489 .. class:: Template(template)
       
   490 
       
   491    The constructor takes a single argument which is the template string.
       
   492 
       
   493 
       
   494    .. method:: substitute(mapping[, **kws])
       
   495 
       
   496       Performs the template substitution, returning a new string.  *mapping* is
       
   497       any dictionary-like object with keys that match the placeholders in the
       
   498       template.  Alternatively, you can provide keyword arguments, where the
       
   499       keywords are the placeholders.  When both *mapping* and *kws* are given
       
   500       and there are duplicates, the placeholders from *kws* take precedence.
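A short sketch of the precedence rule, assuming a dictionary of defaults that a keyword argument overrides (all names and values are illustrative):

```python
from string import Template

defaults = {"who": "tim", "what": "spam"}

# 'what' appears in both the mapping and the keywords; the keyword wins.
result = Template("$who likes $what").substitute(defaults, what="kung pao")
```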
       
   501 
       
   502 
       
   503    .. method:: safe_substitute(mapping[, **kws])
       
   504 
       
   505       Like :meth:`substitute`, except that if placeholders are missing from
       
   506       *mapping* and *kws*, instead of raising a :exc:`KeyError` exception, the
       
   507       original placeholder will appear in the resulting string intact.  Also,
       
   508       unlike with :meth:`substitute`, any other appearances of the ``$`` will
       
   509       simply return ``$`` instead of raising :exc:`ValueError`.
       
   510 
       
   511       While other exceptions may still occur, this method is called "safe"
       
   512       because substitution always tries to return a usable string instead of
       
   513       raising an exception.  In another sense, :meth:`safe_substitute` may be
       
   514       anything other than safe, since it will silently ignore malformed
       
   515       templates containing dangling delimiters, unmatched braces, or
       
   516       placeholders that are not valid Python identifiers.
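The following sketch shows both behaviours at once: a missing placeholder and a dangling ``$`` are each left untouched. The template text is an arbitrary example.

```python
from string import Template

t = Template("Give $who $100 for $what")

# '$what' has no value and '$100' is not a valid placeholder;
# safe_substitute() leaves both as they are.
text = t.safe_substitute(who="tim")
```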
       
   517 
       
   518 :class:`Template` instances also provide one public data attribute:
       
   519 
       
   520 
       
   521 .. attribute:: string.template
       
   522 
       
   523    This is the object passed to the constructor's *template* argument.  In general,
       
   524    you shouldn't change it, but read-only access is not enforced.
       
   525 
       
   526 Here is an example of how to use a Template:
       
   527 
       
   528    >>> from string import Template
       
   529    >>> s = Template('$who likes $what')
       
   530    >>> s.substitute(who='tim', what='kung pao')
       
   531    'tim likes kung pao'
       
   532    >>> d = dict(who='tim')
       
   533    >>> Template('Give $who $100').substitute(d)
       
   534    Traceback (most recent call last):
       
   535    [...]
       
   536    ValueError: Invalid placeholder in string: line 1, col 10
       
   537    >>> Template('$who likes $what').substitute(d)
       
   538    Traceback (most recent call last):
       
   539    [...]
       
   540    KeyError: 'what'
       
   541    >>> Template('$who likes $what').safe_substitute(d)
       
   542    'tim likes $what'
       
   543 
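The precedence rule for *mapping* versus keyword arguments, and the way
``safe_substitute`` treats a stray ``$``, can be seen in a short sketch.  The
variable names below are illustrative only and do not come from the module.

```python
from string import Template

t = Template('$who likes $what')
defaults = {'who': 'tim', 'what': 'kung pao'}  # hypothetical data

# A keyword argument wins over a duplicate key in the mapping.
overridden = t.substitute(defaults, what='noodles')

# safe_substitute() leaves the invalid "$100" and the unknown "$item" as is.
partial = Template('Give $who $100 for $item').safe_substitute(who='tim')
```

Here ``overridden`` uses ``'noodles'`` rather than the mapping's value, and
``partial`` still contains ``$100`` and ``$item``.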
       
   544 Advanced usage: you can derive subclasses of :class:`Template` to customize the
       
   545 placeholder syntax, delimiter character, or the entire regular expression used
       
   546 to parse template strings.  To do this, you can override these class attributes:
       
   547 
       
   548 * *delimiter* -- This is the literal string describing a placeholder introducing
       
   549   delimiter.  The default value is ``$``.  Note that this should *not* be a regular
       
   550   expression, as the implementation will call :func:`re.escape` on this string as
       
   551   needed.
       
   552 
       
   553 * *idpattern* -- This is the regular expression describing the pattern for
       
   554   non-braced placeholders (the braces will be added automatically as
       
   555   appropriate).  The default value is the regular expression
       
   556   ``[_a-z][_a-z0-9]*``.
       
   557 
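As a minimal sketch of overriding these two attributes, the hypothetical
subclasses below change the delimiter to ``%`` and allow hyphens inside
placeholder names.  The class names and sample data are assumptions made for
illustration.

```python
from string import Template

class PercentTemplate(Template):
    # A literal string, not a regular expression.
    delimiter = '%'

class DashTemplate(Template):
    # Placeholder names may contain hyphens after the first letter.
    idpattern = r'[a-z][a-z0-9-]*'

percent = PercentTemplate('%who owes %{amount}USD, 100%% sure')
percent_result = percent.substitute(who='tim', amount=5)

dash_result = DashTemplate('Hello $first-name').substitute({'first-name': 'tim'})
```

Note that doubling the custom delimiter (``%%``) produces a single literal
``%``, just as ``$$`` does with the default delimiter.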
       
   558 Alternatively, you can provide the entire regular expression pattern by
       
   559 overriding the class attribute *pattern*.  If you do this, the value must be a
       
   560 regular expression object with four named capturing groups.  The capturing
       
   561 groups correspond to the rules given above, along with the invalid placeholder
       
   562 rule:
       
   563 
       
   564 * *escaped* -- This group matches the escape sequence, e.g. ``$$``, in the
       
   565   default pattern.
       
   566 
       
   567 * *named* -- This group matches the unbraced placeholder name; it should not
       
   568   include the delimiter in the capturing group.
       
   569 
       
   570 * *braced* -- This group matches the brace enclosed placeholder name; it should
       
   571   not include either the delimiter or braces in the capturing group.
       
   572 
       
   573 * *invalid* -- This group matches any other delimiter pattern (usually a single
       
   574   delimiter), and it should appear last in the regular expression.
       
   575 
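The four groups can be seen together in a sketch of a complete pattern
override.  The class name ``BracketTemplate`` and its ``@[name]`` syntax are
hypothetical.  This sketch assigns a pattern *string* rather than a compiled
object, on the assumption that the class machinery compiles the value itself
in verbose mode; treat that as an assumption about the CPython implementation
rather than documented behavior.

```python
from string import Template

class BracketTemplate(Template):
    delimiter = '@'
    # Assumption: the value is compiled by Template's class machinery
    # with re.VERBOSE, so whitespace and comments are ignored.
    pattern = r'''
    @(?:
      (?P<escaped>@)                     |  # doubled delimiter
      (?P<named>[_a-z][_a-z0-9]*)        |  # bare placeholder name
      \[(?P<braced>[_a-z][_a-z0-9]*)\]   |  # bracketed placeholder name
      (?P<invalid>)                         # any other delimiter use
    )
    '''

bracket_result = BracketTemplate('@[who] owes @@5 to @whom').substitute(
    who='a', whom='b')
```

The *invalid* group is listed last so that it only matches when none of the
other alternatives apply.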
       
   576 
       
   577 String functions
       
   578 ----------------
       
   579 
       
   580 The following functions are available to operate on string and Unicode objects.
       
   581 They are not available as string methods.
       
   582 
       
   583 
       
   584 .. function:: capwords(s)
       
   585 
       
   586    Split the argument into words using :func:`split`, capitalize each word using
       
   587    :func:`capitalize`, and join the capitalized words using :func:`join`.  Note
       
   588    that this replaces runs of whitespace characters by a single space, and removes
       
   589    leading and trailing whitespace.
       
   590 
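A short sketch of the whitespace normalization described for
:func:`capwords`; the sample text is made up for illustration.

```python
import string

messy = '  hello   wide\tworld '

# Runs of whitespace collapse to one space; the ends are trimmed.
tidy = string.capwords(messy)
```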
       
   591 
       
   592 .. function:: maketrans(from, to)
       
   593 
       
   594    Return a translation table suitable for passing to :func:`translate`, that will
       
   595    map each character in *from* into the character at the same position in *to*;
       
   596    *from* and *to* must have the same length.
       
   597 
       
   598    .. warning::
       
   599 
       
   600       Don't use strings derived from :const:`lowercase` and :const:`uppercase` as
       
   601       arguments; in some locales, these don't have the same length.  For case
       
   602       conversions, always use :func:`lower` and :func:`upper`.
       
   603 
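The following sketch builds a table with equal-length *from* and *to* strings
and applies it through the ``translate`` string method.  The ``try``/``except``
fallback is an assumption added so that the snippet also runs on interpreters
where :func:`maketrans` is no longer in this module; it is not part of the
API described here.

```python
try:
    from string import maketrans   # the function documented above
except ImportError:
    maketrans = str.maketrans      # assumed fallback on newer interpreters

# 'a' -> 'x', 'b' -> 'y', 'c' -> 'z'; both strings have length 3.
table = maketrans('abc', 'xyz')
translated = 'a cab'.translate(table)
```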
       
   604 
       
   605 Deprecated string functions
       
   606 ---------------------------
       
   607 
       
   608 The following functions are also defined as methods of string and
       
   609 Unicode objects; see section :ref:`string-methods` for more information on
       
   610 those.  You should consider these functions as deprecated, although they will
       
   611 not be removed until Python 3.0.  The functions defined in this module are:
       
   612 
       
   613 
       
   614 .. function:: atof(s)
       
   615 
       
   616    .. deprecated:: 2.0
       
   617       Use the :func:`float` built-in function.
       
   618 
       
   619    .. index:: builtin: float
       
   620 
       
   621    Convert a string to a floating point number.  The string must have the standard
       
   622    syntax for a floating point literal in Python, optionally preceded by a sign
       
   623    (``+`` or ``-``).  Note that this behaves identically to the built-in function
       
   624    :func:`float` when passed a string.
       
   625 
       
   626    .. note::
       
   627 
       
   628       .. index::
       
   629          single: NaN
       
   630          single: Infinity
       
   631 
       
   632       When passing in a string, values for NaN and Infinity may be returned, depending
       
   633       on the underlying C library.  The specific set of strings accepted which cause
       
   634       these values to be returned depends entirely on the C library and is known to
       
   635       vary.
       
   636 
       
   637 
       
   638 .. function:: atoi(s[, base])
       
   639 
       
   640    .. deprecated:: 2.0
       
   641       Use the :func:`int` built-in function.
       
   642 
       
   643    .. index:: builtin: eval
       
   644 
       
   645    Convert string *s* to an integer in the given *base*.  The string must consist
       
   646    of one or more digits, optionally preceded by a sign (``+`` or ``-``).  The
       
   647    *base* defaults to 10.  If it is 0, a default base is chosen depending on the
       
   648    leading characters of the string (after stripping the sign): ``0x`` or ``0X``
       
   649    means 16, ``0`` means 8, anything else means 10.  If *base* is 16, a leading
       
   650    ``0x`` or ``0X`` is always accepted, though not required.  This behaves
       
   651    identically to the built-in function :func:`int` when passed a string.  (Also
       
   652    note: for a more flexible interpretation of numeric literals, use the built-in
       
   653    function :func:`eval`.)
       
   654 
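Since :func:`atof` and :func:`atoi` are described as behaving like
:func:`float` and :func:`int` on strings, the base rules can be sketched with
the built-ins directly.  The variable names are illustrative.

```python
# float() in place of atof()
ratio = float('3.25')

# int() in place of atoi(): base 10 by default, sign allowed.
negative = int('-42')

# With base 16 the "0x" prefix is accepted but not required.
hex_prefixed = int('0x1A', 16)
hex_plain = int('1a', 16)

# Base 0 picks the base from the prefix of the string.
auto_base = int('0x1A', 0)
```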
       
   655 
       
   656 .. function:: atol(s[, base])
       
   657 
       
   658    .. deprecated:: 2.0
       
   659       Use the :func:`long` built-in function.
       
   660 
       
   661    .. index:: builtin: long
       
   662 
       
   663    Convert string *s* to a long integer in the given *base*. The string must
       
   664    consist of one or more digits, optionally preceded by a sign (``+`` or ``-``).
       
   665    The *base* argument has the same meaning as for :func:`atoi`.  A trailing ``l``
       
   666    or ``L`` is not allowed, except if the base is 0.  Note that when invoked
       
   667    without *base* or with *base* set to 10, this behaves identically to the built-in
       
   668    function :func:`long` when passed a string.
       
   669 
       
   670 
       
   671 .. function:: capitalize(word)
       
   672 
       
   673    Return a copy of *word* with only its first character capitalized.
       
   674 
       
   675 
       
   676 .. function:: expandtabs(s[, tabsize])
       
   677 
       
   678    Expand tabs in a string replacing them by one or more spaces, depending on the
       
   679    current column and the given tab size.  The column number is reset to zero after
       
   680    each newline occurring in the string. This doesn't understand other non-printing
       
   681    characters or escape sequences.  The tab size defaults to 8.
       
   682 
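A brief sketch of the column tracking described above, using the equivalent
string method; the sample strings are made up.

```python
# Default tab size is 8: "a" occupies column 0, so 7 spaces follow.
default_tabs = 'a\tb'.expandtabs()

# The column count restarts after the newline.
two_lines = 'ab\tc\nd\te'.expandtabs(4)
```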
       
   683 
       
   684 .. function:: find(s, sub[, start[, end]])
       
   685 
       
   686    Return the lowest index in *s* where the substring *sub* is found such that
       
   687    *sub* is wholly contained in ``s[start:end]``.  Return ``-1`` on failure.
       
   688    Defaults for *start* and *end* and interpretation of negative values are the same
       
   689    as for slices.
       
   690 
       
   691 
       
   692 .. function:: rfind(s, sub[, start[, end]])
       
   693 
       
   694    Like :func:`find` but find the highest index.
       
   695 
       
   696 
       
   697 .. function:: index(s, sub[, start[, end]])
       
   698 
       
   699    Like :func:`find` but raise :exc:`ValueError` when the substring is not found.
       
   700 
       
   701 
       
   702 .. function:: rindex(s, sub[, start[, end]])
       
   703 
       
   704    Like :func:`rfind` but raise :exc:`ValueError` when the substring is not found.
       
   705 
       
   706 
       
   707 .. function:: count(s, sub[, start[, end]])
       
   708 
       
   709    Return the number of (non-overlapping) occurrences of substring *sub* in string
       
   710    ``s[start:end]``. Defaults for *start* and *end* and interpretation of negative
       
   711    values are the same as for slices.
       
   712 
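The search functions above differ mainly in direction and in how they report
failure.  A small sketch using the equivalent string methods (the sample text
is illustrative):

```python
s = 'spam and eggs and spam'

first = s.find('spam')        # lowest index
last = s.rfind('spam')        # highest index
later = s.find('and', 6)      # search restricted to s[6:]
missing = s.find('ham')       # -1 on failure
occurrences = s.count('and')  # non-overlapping matches

# index() reports failure with an exception instead of -1.
try:
    s.index('ham')
    index_failed = False
except ValueError:
    index_failed = True
```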
       
   713 
       
   714 .. function:: lower(s)
       
   715 
       
   716    Return a copy of *s*, but with upper case letters converted to lower case.
       
   717 
       
   718 
       
   719 .. function:: split(s[, sep[, maxsplit]])
       
   720 
       
   721    Return a list of the words of the string *s*.  If the optional second argument
       
   722    *sep* is absent or ``None``, the words are separated by arbitrary strings of
       
   723    whitespace characters (space, tab,  newline, return, formfeed).  If the second
       
   724    argument *sep* is present and not ``None``, it specifies a string to be used as
       
   725    the  word separator.  The returned list will then have one more item than the
       
   726    number of non-overlapping occurrences of the separator in the string.  The
       
   727    optional third argument *maxsplit* defaults to 0.  If it is nonzero, at most
       
   728    *maxsplit* number of splits occur, and the remainder of the string is returned
       
   729    as the final element of the list (thus, the list will have at most
       
   730    ``maxsplit+1`` elements).
       
   731 
       
   732    The behavior of split on an empty string depends on the value of *sep*. If *sep*
       
   733    is not specified, or specified as ``None``, the result will be an empty list.
       
   734    If *sep* is specified as any string, the result will be a list containing one
       
   735    element which is an empty string.
       
   736 
       
   737 
       
   738 .. function:: rsplit(s[, sep[, maxsplit]])
       
   739 
       
   740    Return a list of the words of the string *s*, scanning *s* from the end.  To all
       
   741    intents and purposes, the resulting list of words is the same as returned by
       
   742    :func:`split`, except when the optional third argument *maxsplit* is explicitly
       
   743    specified and nonzero.  When *maxsplit* is nonzero, at most *maxsplit* number of
       
   744    splits -- the *rightmost* ones -- occur, and the remainder of the string is
       
   745    returned as the first element of the list (thus, the list will have at most
       
   746    ``maxsplit+1`` elements).
       
   747 
       
   748    .. versionadded:: 2.4
       
   749 
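The difference between :func:`split` and :func:`rsplit` only shows up when
*maxsplit* limits the number of splits.  A sketch using the equivalent string
methods, with made-up sample data:

```python
s = 'a,b,c'

everything = s.split(',')       # no limit
from_left = s.split(',', 1)     # remainder is the last element
from_right = s.rsplit(',', 1)   # remainder is the first element

# Without a separator, runs of whitespace separate the words.
words = '  a  b '.split()

# Empty-string behavior depends on whether a separator is given.
empty_default = ''.split()
empty_with_sep = ''.split(',')
```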
       
   750 
       
   751 .. function:: splitfields(s[, sep[, maxsplit]])
       
   752 
       
   753    This function behaves identically to :func:`split`.  (In the past, :func:`split`
       
   754    was only used with one argument, while :func:`splitfields` was only used with
       
   755    two arguments.)
       
   756 
       
   757 
       
   758 .. function:: join(words[, sep])
       
   759 
       
   760    Concatenate a list or tuple of words with intervening occurrences of  *sep*.
       
   761    The default value for *sep* is a single space character.  It is always true that
       
   762    ``string.join(string.split(s, sep), sep)`` equals *s*.
       
   763 
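The round-trip property can be checked with the equivalent string methods.
This sketch uses an explicit separator; the second half shows, as a contrast,
that splitting on whitespace with no separator does not preserve the original
spacing.

```python
s = 'a,b,,c'

# Splitting and rejoining with the same explicit separator restores s.
parts = s.split(',')
rebuilt = ','.join(parts)

# Whitespace splitting collapses runs, so the original is not restored.
spaced = 'a  b'
collapsed = ' '.join(spaced.split())
```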
       
   764 
       
   765 .. function:: joinfields(words[, sep])
       
   766 
       
   767    This function behaves identically to :func:`join`.  (In the past,  :func:`join`
       
   768    was only used with one argument, while :func:`joinfields` was only used with two
       
   769    arguments.) Note that there is no :meth:`joinfields` method on string objects;
       
   770    use the :meth:`join` method instead.
       
   771 
       
   772 
       
   773 .. function:: lstrip(s[, chars])
       
   774 
       
   775    Return a copy of the string with leading characters removed.  If *chars* is
       
   776    omitted or ``None``, whitespace characters are removed.  If given and not
       
   777    ``None``, *chars* must be a string; the characters in the string will be
       
   778    stripped from the beginning of *s*.
       
   779 
       
   780    .. versionchanged:: 2.2.3
       
   781       The *chars* parameter was added.  The *chars* parameter cannot be passed in
       
   782       earlier 2.2 versions.
       
   783 
       
   784 
       
   785 .. function:: rstrip(s[, chars])
       
   786 
       
   787    Return a copy of the string with trailing characters removed.  If *chars* is
       
   788    omitted or ``None``, whitespace characters are removed.  If given and not
       
   789    ``None``, *chars* must be a string; the characters in the string will be
       
   790    stripped from the end of *s*.
       
   791 
       
   792    .. versionchanged:: 2.2.3
       
   793       The *chars* parameter was added.  The *chars* parameter cannot be passed in
       
   794       earlier 2.2 versions.
       
   795 
       
   796 
       
   797 .. function:: strip(s[, chars])
       
   798 
       
   799    Return a copy of the string with leading and trailing characters removed.  If
       
   800    *chars* is omitted or ``None``, whitespace characters are removed.  If given and
       
   801    not ``None``, *chars* must be a string; the characters in the string will be
       
   802    stripped from both ends of *s*.
       
   803 
       
   804    .. versionchanged:: 2.2.3
       
   805       The *chars* parameter was added.  The *chars* parameter cannot be passed in
       
   806       earlier 2.2 versions.
       
   807 
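The *chars* argument is treated as a set of characters, not as a prefix or
suffix.  A sketch using the equivalent string methods, with illustrative
sample strings:

```python
title = '--title--'

left = title.lstrip('-')
right = title.rstrip('-')
both = title.strip('-')

# Any combination of the given characters is removed from the ends.
core = 'abcXcba'.strip('abc')

# With no argument, whitespace is removed.
trimmed = '  hi \n'.strip()
```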
       
   808 
       
   809 .. function:: swapcase(s)
       
   810 
       
   811    Return a copy of *s*, but with lower case letters converted to upper case and
       
   812    vice versa.
       
   813 
       
   814 
       
   815 .. function:: translate(s, table[, deletechars])
       
   816 
       
   817    Delete all characters from *s* that are in *deletechars* (if  present), and then
       
   818    translate the characters using *table*, which  must be a 256-character string
       
   819    giving the translation for each character value, indexed by its ordinal.  If
       
   820    *table* is ``None``, then only the character deletion step is performed.
       
   821 
       
   822 
       
   823 .. function:: upper(s)
       
   824 
       
   825    Return a copy of *s*, but with lower case letters converted to upper case.
       
   826 
       
   827 
       
   828 .. function:: ljust(s, width)
       
   829               rjust(s, width)
       
   830               center(s, width)
       
   831 
       
   832    These functions respectively left-justify, right-justify and center a string in
       
   833    a field of given width.  They return a string that is at least *width*
       
   834    characters wide, created by padding the string *s* with spaces until the given
       
   835    width on the right, left or both sides.  The string is never truncated.
       
   836 
       
   837 
       
   838 .. function:: zfill(s, width)
       
   839 
       
   840    Pad a numeric string on the left with zero digits until the given width is
       
   841    reached.  Strings starting with a sign are handled correctly.
       
   842 
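The padding functions above can be sketched with the equivalent string
methods; the sample values are made up.

```python
left = 'ab'.ljust(5)       # padded on the right
right = 'ab'.rjust(5)      # padded on the left
middle = 'ab'.center(6)    # padded on both sides

# A string longer than the width is returned unchanged.
untouched = 'abcdef'.ljust(3)

# zfill() keeps a leading sign in front of the zeros.
padded = '42'.zfill(5)
signed = '-42'.zfill(6)
```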
       
   843 
       
   844 .. function:: replace(str, old, new[, maxreplace])
       
   845 
       
   846    Return a copy of string *str* with all occurrences of substring *old* replaced
       
   847    by *new*.  If the optional argument *maxreplace* is given, only the first
       
   848    *maxreplace* occurrences are replaced.
       
   849
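A short sketch of the effect of *maxreplace*, using the equivalent string
method on an illustrative string:

```python
s = 'a-b-c-d'

# Every occurrence is replaced when no limit is given.
all_replaced = s.replace('-', '+')

# Only the first two occurrences are replaced.
first_two = s.replace('-', '+', 2)
```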