:mod:`xdoctest.parser`
======================

.. py:module:: xdoctest.parser

.. autoapi-nested-parse::

   The XDoctest Parser
   -------------------
   This parses a docstring into one or more "doctest parts" *after* the docstrings
   have been extracted from the source code by either static or dynamic means.

   Terms and definitions:

       logical block: a snippet of code that can be executed by itself if given
           the correct global / local variable context.

       PS1 : The original meaning is "Prompt String 1". In the context of
           xdoctest, instead of referring to the prompt prefix, we use PS1 to refer
           to a line that starts a "logical block" of code. In the original
           doctest module these all had to be prefixed with ">>>". In xdoctest the
           prefix is used to simply denote the code is part of a doctest. It does
           not necessarily mean a new "logical block" is starting.

       PS2 : The original meaning is "Prompt String 2". In the context of
           xdoctest, instead of referring to the prompt prefix, we use PS2 to refer
           to a line that continues a "logical block" of code. In the original
           doctest module these all had to be prefixed with "...". However,
           xdoctest uses parsing to automatically determine this.

       want statement: Lines directly after a logical block of code in a doctest
           indicating the desired result of executing the previous block.

   While I do believe this AST-based code is a significant improvement over the
   RE-based builtin doctest parser, I acknowledge that I'm not an AST expert and
   there is room for improvement here.



Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   xdoctest.parser.DoctestParser



Functions
~~~~~~~~~

.. autoapisummary::

   xdoctest.parser._min_indentation
   xdoctest.parser._complete_source
   xdoctest.parser._iterthree
   xdoctest.parser._hasprefix


.. data:: DEBUG
   

   

.. data:: INDENT_RE
   

   

.. py:class:: DoctestParser(simulate_repl=False)

   Bases: :class:`object`

   Breaks docstrings into parts using the `parse` method.

   .. rubric:: Example

   >>> parser = DoctestParser()
   >>> doctest_parts = parser.parse(
   >>>     '''
   >>>     >>> j = 0
   >>>     >>> for i in range(10):
   >>>     >>>     j += 1
   >>>     >>> print(j)
   >>>     10
   >>>     '''.lstrip('\n'))
   >>> print('\n'.join(list(map(str, doctest_parts))))
   <DoctestPart(ln 0, src="j = 0...", want=None)>
   <DoctestPart(ln 3, src="print(j)...", want="10...")>

   .. rubric:: Example

   >>> # Having multiline strings in doctests can be nice
   >>> string = utils.codeblock(
           '''
           >>> name = 'name'
           'anything'
           ''')
   >>> self = DoctestParser()
   >>> doctest_parts = self.parse(string)
   >>> print('\n'.join(list(map(str, doctest_parts))))

   .. method:: parse(self, string, info=None)


      Divide the given string into examples and interleaving text.

      :Parameters: * **string** (*str*) -- string representing the doctest
                   * **info** (*dict*) -- info about where the string came from in case of an
                     error

      :returns: a list of `DoctestPart` objects
      :rtype: list

      CommandLine:
          python -m xdoctest.parser DoctestParser.parse

      .. rubric:: Example

      >>> s = 'I am a dummy example with two parts'
      >>> x = 10
      >>> print(s)
      I am a dummy example with two parts
      >>> s = 'My purpose is to demonstrate how wants work here'
      >>> print('The new want applies ONLY to stdout')
      >>> print('given before the last want')
      >>> '''
          this wont hurt the test at all
          even though its multiline '''
      >>> y = 20
      The new want applies ONLY to stdout
      given before the last want
      >>> # Parts from previous examples are executed in the same context
      >>> print(x + y)
      30

      This is simply text and doesn't apply to the previous doctest; the
      <BLANKLINE> directive is still in effect.

      .. rubric:: Example

      >>> from xdoctest import parser
      >>> from xdoctest.docstr import docscrape_google
      >>> from xdoctest import core
      >>> self = parser.DoctestParser()
      >>> docstr = self.parse.__doc__
      >>> blocks = docscrape_google.split_google_docblocks(docstr)
      >>> doclineno = self.parse.__func__.__code__.co_firstlineno
      >>> key, (string, offset) = blocks[-2]
      >>> self._label_docsrc_lines(string)
      >>> doctest_parts = self.parse(string)
      >>> # each part with a want-string needs to be broken in two
      >>> assert len(doctest_parts) == 6


   .. method:: _package_groups(self, grouped_lines)



   .. method:: _package_chunk(self, raw_source_lines, raw_want_lines, lineno=0)


      If `self.simulate_repl` is True, then each statement is broken into its
      own part. Otherwise, statements are grouped with the closest `want`
      statement.

      TODO: EXCEPT IN CASES OF EXPLICIT CONTINUATION

      .. rubric:: Example

      >>> from xdoctest.parser import *
      >>> raw_source_lines = ['>>> "string"']
      >>> raw_want_lines = ['string']
      >>> self = DoctestParser()
      >>> part, = self._package_chunk(raw_source_lines, raw_want_lines)
      >>> part.source
      '"string"'
      >>> part.want
      'string'
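
      The two grouping modes described above can be sketched without
      xdoctest itself. This is a minimal illustration, not the library's
      implementation: `package_chunk` and its `(statements, want)` tuples are
      hypothetical stand-ins for the real `DoctestPart` objects, and the
      choice to split the final statement away from its setup code is
      inferred from the `DoctestParser` example output.

      ```python
      def package_chunk(statements, want=None, simulate_repl=False):
          """Group already-separated statements into (statements, want) parts."""
          statements = list(statements)
          if not statements:
              return []
          if simulate_repl:
              # REPL style: one part per statement; only the last carries the want.
              parts = [([stmt], None) for stmt in statements[:-1]]
              parts.append(([statements[-1]], want))
              return parts
          if want is not None and len(statements) > 1:
              # Grouped style: setup statements run together and the final
              # statement is the one checked against the want.
              return [(statements[:-1], None), ([statements[-1]], want)]
          return [(statements, want)]


      statements = ['j = 0', 'for i in range(10):\n    j += 1', 'print(j)']
      grouped = package_chunk(statements, want='10')
      repl_style = package_chunk(statements, want='10', simulate_repl=True)
      ```

      Here `grouped` holds two parts and `repl_style` holds three; in both
      cases only the part containing `print(j)` has a want.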


   .. method:: _group_labeled_lines(self, labeled_lines)


      Group labeled lines into logical parts to be executed together

      :returns:     A list of parts. Text parts are just returned as a list of
                    lines.  Executable parts are returned as a tuple of source
                    lines and an optional "want" statement.
      :rtype: List[List[str] | Tuple[List[str], str]]
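
      The return shape can be illustrated with a small standalone sketch.
      It assumes the `'text'`, `'dsrc'`, `'dcnt'`, and `'want'` labels shown
      in the `_label_docsrc_lines` example; `group_labeled_lines` is a
      hypothetical helper and may differ from what the method actually does
      in edge cases.

      ```python
      from itertools import groupby


      def group_labeled_lines(labeled_lines):
          """Turn (label, line) pairs into text parts and (source, want) parts."""
          def kind(item):
              # Treat source lines and continuation lines as one kind of run.
              label = item[0]
              return 'src' if label in ('dsrc', 'dcnt') else label

          runs = [(k, [line for _, line in grp])
                  for k, grp in groupby(labeled_lines, key=kind)]

          parts = []
          i = 0
          while i < len(runs):
              run_kind, lines = runs[i]
              if run_kind == 'src':
                  want = None
                  # A want run directly after source belongs to that source.
                  if i + 1 < len(runs) and runs[i + 1][0] == 'want':
                      want = '\n'.join(runs[i + 1][1])
                      i += 1
                  parts.append((lines, want))
              else:
                  parts.append(lines)
              i += 1
          return parts


      labeled = [
          ('text', 'intro'),
          ('dsrc', '>>> x = 1'),
          ('dsrc', '>>> print(x)'),
          ('want', '1'),
          ('text', ''),
          ('text', 'outro'),
      ]
      parts = group_labeled_lines(labeled)
      ```

      With this input, `parts` contains a text part, one executable part
      whose want is `'1'`, and a trailing text part.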


   .. method:: _locate_ps1_linenos(self, source_lines)


      Determines which lines in the source begin a "logical block" of code.

      :Parameters: **source_lines** (*List[str]*) -- lines belonging only to the doctest src.
                   These will be unindented, prefixed, and without any want.

      :returns:

                a list of indices indicating which lines
                    are considered "PS1" and a flag indicating if the final line
                    should be considered for a got/want assertion.
      :rtype: Tuple[List[int], bool]

      .. rubric:: Example

      >>> self = DoctestParser()
      >>> source_lines = ['>>> def foo():', '>>>     return 0', '>>> 3']
      >>> linenos, eval_final = self._locate_ps1_linenos(source_lines)
      >>> assert linenos == [0, 2]
      >>> assert eval_final is True

      .. rubric:: Example

      >>> self = DoctestParser()
      >>> source_lines = ['>>> x = [1, 2, ', '>>> 3, 4]', '>>> print(len(x))']
      >>> linenos, eval_final = self._locate_ps1_linenos(source_lines)
      >>> assert linenos == [0, 2]
      >>> assert eval_final is True
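
      One way to find the lines that begin a logical block is to let the
      `ast` module parse the unprefixed source and read the line number of
      each top-level statement. The sketch below assumes every line carries
      a four-character prefix such as `>>> ` and uses a hypothetical
      function name; it is an illustration of the idea rather than the
      method's actual code.

      ```python
      import ast


      def locate_ps1_linenos(source_lines):
          """Return 0-based indices of statement starts and an eval flag."""
          # Drop the 4-character ">>> " or "... " prefix to recover plain Python.
          exec_lines = [line[4:] for line in source_lines]
          tree = ast.parse('\n'.join(exec_lines))
          # ast line numbers are 1-based; convert them to list indices.
          linenos = [node.lineno - 1 for node in tree.body]
          # Only a bare expression produces a value to compare against a want.
          eval_final = bool(tree.body) and isinstance(tree.body[-1], ast.Expr)
          return linenos, eval_final


      first = locate_ps1_linenos(['>>> def foo():', '>>>     return 0', '>>> 3'])
      second = locate_ps1_linenos(['>>> x = [1, 2, ', '>>> 3, 4]', '>>> y = len(x)'])
      ```

      In the second call the final statement is an assignment, so the flag
      is False: there is no expression value to compare.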


   .. method:: _workaround_16806(ps1_linenos, exec_source_lines)
      :staticmethod:


      Workaround for Python issue 16806 (https://bugs.python.org/issue16806).

      This issue causes the AST to report line numbers for multiline strings
      as the line they end on. The correct behavior is to report the line
      they start on. Given a list of line numbers and the original source
      code, this workaround fixes any line number that points from the end of
      a multiline string to point to the start of it instead.

      :Parameters: * **ps1_linenos** (*List[int]*) -- AST-provided line numbers that begin
                     statements and may be affected by Python Issue #16806.
                   * **exec_source_lines** (*List[str]*) -- code referenced by ps1_linenos

      :returns:

                new_ps1_lines: Fixed `ps1_linenos` where multiline
                    strings now point to the line where they begin.
      :rtype: List[int]

      .. rubric:: Notes

      A patch for this issue exists
      `https://github.com/python/cpython/pull/1800`. This workaround is
      idempotent (i.e. a no-op when line numbers are already correct), so
      nothing should break when this bug is fixed.

      Starting from the end, look at consecutive pairs of indices to
      inspect the statement each pair corresponds to. (The first statement
      inspected goes from ps1_linenos[-1] to the end of the line list.)

      .. rubric:: Example

      >>> ps1_linenos = [0, 2, 3]
      >>> exec_source_lines = ["x = 1", "y = '''foo", " bar'''", "pass"]
      >>> DoctestParser._workaround_16806(ps1_linenos, exec_source_lines)
      [0, 1, 3]
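
      The backwards walk described in the notes can be sketched as follows.
      The completeness test here is a stand-in that simply asks whether a
      slice of lines parses on its own with `ast.parse`; both function names
      are hypothetical, and the real method may use a different check.

      ```python
      import ast


      def is_complete(lines):
          """Stand-in check: do these lines parse as Python on their own?"""
          try:
              ast.parse('\n'.join(lines))
          except SyntaxError:
              return False
          return True


      def fix_ps1_linenos(ps1_linenos, exec_source_lines):
          """Move any start index that points into a multiline string upwards."""
          fixed = []
          end = len(exec_source_lines)
          # Walk backwards so the end of each statement is already known.
          for start in reversed(ps1_linenos):
              # An incomplete slice means `start` is too late; step it up.
              while start > 0 and not is_complete(exec_source_lines[start:end]):
                  start -= 1
              fixed.append(start)
              end = start
          return fixed[::-1]


      lines = ["x = 1", "y = '''foo", " bar'''", "pass"]
      repaired = fix_ps1_linenos([0, 2, 3], lines)
      unchanged = fix_ps1_linenos([0, 1, 3], lines)
      ```

      Both calls yield the same list, which illustrates why applying the
      workaround to already-correct line numbers changes nothing.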


   .. method:: _label_docsrc_lines(self, string)


      Give each line in the docstring a label so we can distinguish
      what parts are text, what parts are code, and what parts are "want"
      string.

      :Parameters: **string** (*str*) -- doctest source

      :returns:

                labeled_lines - the above source broken
                    up by lines, each with a label indicating its type for later
                    use in parsing.
      :rtype: List[Tuple[str, str]]

      .. rubric:: Example

      >>> from xdoctest.parser import *
      >>> # Having multiline strings in doctests can be nice
      >>> string = utils.codeblock(
              '''
              text
              >>> items = ['also', 'nice', 'to', 'not', 'worry',
              >>>          'about', '...', 'vs', '>>>']
              ... print('but its still allowed')
              but its still allowed

              more text
              ''')
      >>> self = DoctestParser()
      >>> labeled = self._label_docsrc_lines(string)
      >>> expected = [
      >>>     ('text', 'text'),
      >>>     ('dsrc', ">>> items = ['also', 'nice', 'to', 'not', 'worry',"),
      >>>     ('dsrc', ">>>          'about', '...', 'vs', '>>>']"),
      >>>     ('dcnt', "... print('but its still allowed')"),
      >>>     ('want', 'but its still allowed'),
      >>>     ('text', ''),
      >>>     ('text', 'more text')
      >>> ]
      >>> assert labeled == expected
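
      A much-simplified labeler shows the idea behind these labels. It is a
      sketch under stated assumptions: `label_lines` is a hypothetical
      function, it decides purely on prefixes and the previous line's label,
      and it does not handle cases such as unprefixed multiline strings
      inside the source.

      ```python
      def label_lines(docstring):
          """Label each line as 'text', 'dsrc', 'dcnt', or 'want'."""
          labeled = []
          state = 'text'
          for line in docstring.splitlines():
              stripped = line.strip()
              if stripped == '>>>' or stripped.startswith('>>> '):
                  state = 'dsrc'
              elif state in ('dsrc', 'dcnt') and (
                      stripped == '...' or stripped.startswith('... ')):
                  # "..." only counts as a continuation directly after source.
                  state = 'dcnt'
              elif stripped and state in ('dsrc', 'dcnt', 'want'):
                  # Non-blank lines after source are the expected output.
                  state = 'want'
              else:
                  # A blank line ends the want; everything else is prose.
                  state = 'text'
              labeled.append((state, line))
          return labeled


      sample = "\n".join([
          "text",
          ">>> items = ['also', 'nice',",
          ">>>          'to']",
          "... print('ok')",
          "ok",
          "",
          "more text",
      ])
      labels = [label for label, _ in label_lines(sample)]
      ```

      For `sample`, the labels come out in the same order as in the example
      above: text, two source lines, a continuation, a want, then text.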



.. function:: _min_indentation(s)

   Return the minimum indentation of any non-blank line in `s`
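
   A minimal sketch of such a helper, assuming indentation is made of
   spaces and that an input with no non-blank lines should give 0. The
   name `min_indentation` and the regular expression are illustrative and
   not taken from the module.

   ```python
   import re

   # Leading spaces of every line that contains a non-whitespace character.
   INDENT = re.compile(r'^([ ]*)(?=\S)', re.MULTILINE)


   def min_indentation(text):
       """Return the smallest indentation among non-blank lines, or 0."""
       indents = [len(match) for match in INDENT.findall(text)]
       return min(indents) if indents else 0


   smallest = min_indentation("    a\n  b\n\n      c")
   ```

   Blank lines never match the pattern, so they do not pull the result
   down to zero.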


.. function:: _complete_source(line, state_indent, line_iter)

   Helper that consumes lines from the iterator when they are needed to
   complete the source started by `line`.
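
   The general idea of pulling extra lines until a statement is complete
   can be sketched with `ast.parse` as the completeness test. This is an
   assumption made for illustration: `take_complete_source` is a
   hypothetical function with a simpler signature, and it stops at the
   first point where the collected lines parse, which may be earlier than
   the end of an indented block.

   ```python
   import ast


   def take_complete_source(first_line, line_iter):
       """Collect lines from `line_iter` until they parse as Python."""
       lines = [first_line]
       while True:
           try:
               ast.parse('\n'.join(lines))
           except SyntaxError:
               next_line = next(line_iter, None)
               if next_line is None:
                   # No lines left, so the source is invalid rather than
                   # merely incomplete.
                   raise
               lines.append(next_line)
           else:
               return lines


   remaining = iter(["         'about']", "print(items)"])
   block = take_complete_source("items = ['also',", remaining)
   ```

   After the call, `remaining` still holds the `print(items)` line because
   the list literal was already complete without it.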


.. function:: _iterthree(items, pad_value=None)

   Iterate over a sliding window of size 3, padded with `pad_value`
   (None by default) on both sides.

   .. rubric:: Example

   >>> from xdoctest.parser import *
   >>> print(list(_iterthree([])))
   >>> print(list(_iterthree(range(1))))
   >>> print(list(_iterthree([1, 2])))
   >>> print(list(_iterthree([1, 2, 3])))
   >>> print(list(_iterthree(range(4))))
   >>> print(list(_iterthree(range(7))))
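
   A padded window of this kind can be sketched with `zip`. The function
   below is an independent illustration named `iter_three`, not the
   module's code, so its output is only what this sketch produces.

   ```python
   def iter_three(items, pad_value=None):
       """Yield (previous, current, next) for every item, padding the ends."""
       padded = [pad_value, *items, pad_value]
       return zip(padded, padded[1:], padded[2:])


   windows = list(iter_three([1, 2]))
   # windows == [(None, 1, 2), (1, 2, None)]
   ```

   Each item appears exactly once in the middle position, so an empty
   input yields no windows at all.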


.. function:: _hasprefix(line, prefixes)

   Helper that tests whether `line` begins with one of the given `prefixes`.


