g_docformatter.collectors¶
Module with classes to collect tokens.
- type g_docformatter.collectors.Paragraph = list[DocstringToken]¶
A list of DocstringToken objects representing a paragraph of docstring tokens.
- class g_docformatter.collectors.CollectorContext(state: str | None = None, paragraphs: list[tuple[str, FrozenParagraph]]=<factory>, current_paragraph: Paragraph = <factory>, base_indent: int = 0)¶
Bases: object
Context for token collectors.
- state: str | None = None¶
The current paragraph type being collected, or None if no paragraph is currently being collected.
- paragraphs: list[tuple[str, FrozenParagraph]]¶
List of finalized paragraphs, each represented as a tuple of paragraph type and frozen paragraph.
- base_indent: int = 0¶
The base indentation amount of the docstring being collected in number of spaces.
- finalize_current(*, new_state: str | None = None) None¶
Finalize the current paragraph by appending it to the paragraphs list, clear the current paragraph, and optionally update the state.
- Keyword Arguments:
new_state – The new state to set after finalizing the paragraph. Default is None.
- clear_state() None¶
Clear the current state.
Sets the state property to an empty string and clears the current_paragraph list.
- exception g_docformatter.collectors.CollectorMethodContextError¶
Bases: RuntimeError
Exception raised if a collector method is called directly, instead of through the collect_tokens method of a TokenCollectorABC subclass.
- class g_docformatter.collectors.TokenCollectorMeta(name, bases, namespace, /, **kwargs)¶
Bases: ABCMeta
Metaclass for TokenCollectorABC that initializes the collection_methods class variable.
- class g_docformatter.collectors.TokenCollectorABC(settings: FormatterSettings, context: CollectorContext | None = None)¶
Bases: object
Abstract base class for token collectors.
Loops through an iterable of DocstringToken and collects tokens according to some logic, updating the CollectorContext. Collection methods should be decorated with the @collector_method decorator. Collection methods should return True if they successfully collected the token, and False otherwise. If no collection method collects a token, a warning is emitted. Collection methods are called in the order they are defined in the class.
- class CollectorMethod(method: Callable[[T, int, DocstringToken], bool])¶
Bases: object
Descriptor class for collector methods.
- collect_tokens(docstring_tokens: ParagraphArgument, base_indent: int) list[tuple[str, FrozenParagraph]]¶
Collect tokens from an iterable of DocstringToken and update the collector context.
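The dispatch contract described for TokenCollectorABC (methods tried in definition order, the first one returning True consumes the token, a warning otherwise) can be sketched as follows. The real class uses a metaclass and a descriptor; this simplified version uses a tagging decorator and __init_subclass__ instead, and every name in it (sketch_collector_method, SketchCollector, UpperThenRest) is hypothetical.

```python
import warnings
from typing import Callable


def sketch_collector_method(func: Callable) -> Callable:
    """Hypothetical stand-in for @collector_method: only tags the function."""
    func._is_collector = True
    return func


class SketchCollector:
    collection_methods: list[str] = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # A class namespace preserves definition order, so this list is ordered.
        cls.collection_methods = [
            name
            for name, value in vars(cls).items()
            if getattr(value, "_is_collector", False)
        ]

    def collect_tokens(self, tokens):
        for token in tokens:
            for name in self.collection_methods:
                if getattr(self, name)(token):
                    break  # token consumed; continue with the next token
            else:
                # No method returned True for this token.
                warnings.warn(f"token not collected: {token!r}")


class UpperThenRest(SketchCollector):
    """Toy collector: upper-case strings first, other non-empty strings next."""

    def __init__(self):
        self.upper, self.rest = [], []

    @sketch_collector_method
    def collect_upper(self, token: str) -> bool:
        if token.isupper():
            self.upper.append(token)
            return True
        return False

    @sketch_collector_method
    def collect_rest(self, token: str) -> bool:
        if token:
            self.rest.append(token)
            return True
        return False
```

Because a method that returns False leaves the token available to the methods defined after it, a method can finalize a paragraph and still let a later method start the next one with the same token.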
- get_token_indent_diff(token: DocstringToken, base_indent: int | None = None) int¶
Helper method to get the indent difference, in number of spaces, of a token from the base indent.
- Parameters:
token – The token to get the indent difference for.
- Keyword Arguments:
base_indent – The base indent to calculate the difference from. If None, the base_indent from the collector context is used.
- Returns:
The indent difference in number of spaces, calculated as the length of the token’s text minus the length of the token’s text with leading spaces removed, minus the base indent of the collector context or the provided base_indent.
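As a worked illustration of the calculation described above, here is a standalone sketch that operates on a plain string rather than a DocstringToken. The function name indent_diff and the use of raw text are assumptions for illustration only.

```python
def indent_diff(text: str, base_indent: int) -> int:
    """Leading spaces of ``text`` minus ``base_indent`` (hypothetical helper)."""
    # Length of the text minus its length with leading spaces removed.
    leading_spaces = len(text) - len(text.lstrip(" "))
    return leading_spaces - base_indent
```

A token indented deeper than the docstring body gives a positive value, one at the base indent gives 0, and one to the left of it gives a negative value.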
- spaces_to_indent_level(spaces: int) float¶
Helper method to convert a number of spaces to an indent level based on the settings.
- Parameters:
spaces – The number of spaces to convert to an indent level.
- Returns:
The indent level, calculated as the number of spaces divided by the indent size specified in the settings.
It is possible for the indent level to be a float if the number of spaces is not a multiple of the indent size. This allows for more precise handling of indentation in cases where the indentation is not consistent.
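The conversion can be sketched as a plain division. The parameter name indent_size and its default of 4 are assumptions; the actual setting name in FormatterSettings is not shown on this page.

```python
def spaces_to_indent_level(spaces: int, indent_size: int = 4) -> float:
    """Convert a space count to an indent level (hypothetical standalone form)."""
    # True division keeps fractional levels for inconsistent indentation.
    return spaces / indent_size
```

With an indent size of 4, eight spaces map to level 2.0, while six spaces map to level 1.5 rather than being rounded to a whole level.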
- get_token_indent_level_diff(token: DocstringToken) float¶
Helper method to get the indent level difference of a token from the base indent.
- Parameters:
token – The token to get the indent level difference for.
- Returns:
The indent level difference, calculated by getting the indent difference in spaces using get_token_indent_diff and then converting that to an indent level using spaces_to_indent_level.
- get_token_indent(token: DocstringToken, *, relative_to: DocstringToken | None = None) int¶
Helper method to get the indent of a token in number of spaces.
- Parameters:
token – The token to get the indent of.
- Keyword Arguments:
relative_to – If provided, calculate the indent relative to this token.
- Returns:
The indent in number of spaces.
Note
The indent is calculated as the length of the token’s text minus the length of the token’s text with leading spaces removed.
- g_docformatter.collectors.collector_method¶
alias of CollectorMethod
- class g_docformatter.collectors.RootCollector(settings: FormatterSettings, context: CollectorContext | None = None)¶
Bases: TokenCollectorABC
Token collector for the root level of a docstring.
- SUMMARY_ONLY = 'SUMMARY_ONLY'¶
Paragraph type for summary-only docstrings.
- SUMMARY = 'SUMMARY'¶
Paragraph type for the summary section.
- EOD = 'EOD'¶
Paragraph type for the end-of-docstring token (triple quotes).
- DESCRIPTION = 'DESCRIPTION'¶
Paragraph type for description sections.
- STD_SECTION = 'STD_SECTION'¶
Paragraph type for standard sections (e.g. ‘Note:’, ‘Warning:’).
- LIST_SECTION = 'LIST_SECTION'¶
Paragraph type for list sections (e.g. ‘Parameters:’, ‘Attributes:’).
- GENERAL_TOKEN_TYPES = ('REPL_START', 'REPL_CONTINUE', 'CODE_BLOCK_START', 'SPHINX_OPTION', 'STRING', 'LIST_ITEM')¶
Token types that are a part of paragraphs but do not define paragraph boundaries.
- process_summary_only_docstring(token: DocstringToken) bool¶
Collect summary-only docstrings.
- finalize_current_paragraph(token: DocstringToken) bool¶
Finalize the current paragraph if the token should start a new paragraph.
- process_eod_token(token: DocstringToken) bool¶
When encountering the EOD token, add an EOD paragraph.
- start_new_paragraph(token: DocstringToken) bool¶
Start a new paragraph if the token indicates the start of a new paragraph.
- process_current_paragraph_token(token: DocstringToken) bool¶
Collect tokens for the current paragraph.
- class g_docformatter.collectors.BodyTokenCollector(settings: FormatterSettings, context: CollectorContext | None = None)¶
Bases: TokenCollectorABC
Token collector for the body of a docstring section.
Collects tokens into one of three paragraph types:
- TEXT for plain text
- REPL_CODE for REPL-style code blocks (lines starting with >>>)
- SPHINX_CODE for Sphinx .. code-block:: directives
A new paragraph begins when a REPL_START or CODE_BLOCK_START token is encountered (starting a REPL_CODE or SPHINX_CODE paragraph respectively), or defaults to TEXT for any other token.
Finalization rules differ by paragraph type:
- TEXT paragraphs are finalized by a blank line (NL token), a REPL_START token, or a CODE_BLOCK_START token.
- REPL_CODE paragraphs are finalized by a blank line (NL token) or a CODE_BLOCK_START token (but not by a REPL_START token).
- SPHINX_CODE paragraphs are finalized when a dedented non-blank token is seen (indent_diff <= 0); blank lines within the block are kept.
Any in-progress paragraph is also finalized after all tokens have been collected.
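The start and finalization rules above can be sketched as a small state machine. This is a simplified model, not the collector's code: Tok and split_body are hypothetical names, indent_diff is supplied directly on each token, and blank lines seen while no paragraph is in progress are assumed to be skipped.

```python
from typing import NamedTuple


class Tok(NamedTuple):
    """Hypothetical stand-in for DocstringToken."""

    type: str  # e.g. "STRING", "NL", "REPL_START", "CODE_BLOCK_START"
    text: str
    indent_diff: int = 0  # spaces relative to the base indent


def split_body(tokens: list[Tok]) -> list[tuple[str, tuple[Tok, ...]]]:
    paragraphs: list[tuple[str, tuple[Tok, ...]]] = []
    state, current = None, []

    def finalize() -> None:
        nonlocal state, current
        if current:
            paragraphs.append((state, tuple(current)))
        state, current = None, []

    starts = {"REPL_START": "REPL_CODE", "CODE_BLOCK_START": "SPHINX_CODE"}
    for tok in tokens:
        # 1. Does the incoming token terminate the paragraph in progress?
        if state == "TEXT" and tok.type in ("NL", "REPL_START", "CODE_BLOCK_START"):
            finalize()
        elif state == "REPL_CODE" and tok.type in ("NL", "CODE_BLOCK_START"):
            finalize()
        elif state == "SPHINX_CODE" and tok.type != "NL" and tok.indent_diff <= 0:
            finalize()
        # 2. A blank line outside any paragraph is consumed (assumption for
        #    the case where no paragraph was in progress to begin with).
        if tok.type == "NL" and state is None:
            continue
        # 3. Start a paragraph if needed, then append the token to it.
        if state is None:
            state = starts.get(tok.type, "TEXT")
        current.append(tok)
    finalize()  # any in-progress paragraph is finalized at the end
    return paragraphs
```

Note that a dedented token that closes a SPHINX_CODE paragraph is not dropped: it falls through to step 3 and opens the next paragraph.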
- TEXT = 'TEXT'¶
Paragraph type for text in body paragraphs.
- REPL_CODE = 'REPL_CODE'¶
Paragraph type for REPL code blocks in body paragraphs.
- SPHINX_CODE = 'SPHINX_CODE'¶
Paragraph type for Sphinx code blocks in body paragraphs.
- finalize_current_paragraph(token: DocstringToken) bool¶
Finalize the current paragraph when the incoming token terminates it.
For TEXT and REPL_CODE paragraphs a blank line (NL token) always ends the paragraph, and the method returns True so that the caller knows the token has been consumed and does not reprocess it. A new code-block start token also finalizes the current paragraph, but the return value remains False to allow the start token to be handled by subsequent collectors.
SPHINX_CODE paragraphs close when a non-blank token dedented below the level of the opening directive is seen. In this case the return value is always False so that the token triggering the break can be re-evaluated.
- start_new_paragraph(token: DocstringToken) bool¶
Start a new paragraph if the token indicates the start of a new paragraph.
- collect_current_paragraph_token(token: DocstringToken) bool¶
Collect tokens for the current paragraph.
- class g_docformatter.collectors.StdSectionTokenCollector(settings: FormatterSettings, context: CollectorContext | None = None)¶
Bases: TokenCollectorABC
Token collector for standard sections (e.g. Note:, Warning:) of a docstring.
Splits the section into exactly two paragraph types:
- HEADER: the first token (the section keyword line, e.g. Note:), immediately finalized as a single-token paragraph.
- BODY: all remaining tokens collected together into a single paragraph.
The two-paragraph split is intentional. HEADER should be formatted directly by the formatter which uses this collector, while BODY should be passed to a formatter that uses a BodyTokenCollector.
- HEADER = 'HEADER'¶
Paragraph type for the header of a standard section (e.g. ‘Note:’, ‘Warning:’).
- BODY = 'BODY'¶
Paragraph type for the body of a standard section.
- collect_and_finalize_header(token: DocstringToken) bool¶
Collect the header token of the standard section and finalize it as a paragraph.
- collect_remaining_tokens(token: DocstringToken) bool¶
Collect the tokens that occur after the header.
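The HEADER/BODY split performed by this collector amounts to the following sketch. It works on plain strings instead of DocstringToken objects, the function name split_std_section is hypothetical, and omitting the BODY paragraph when nothing follows the header is an assumption.

```python
def split_std_section(tokens: list[str]) -> list[tuple[str, tuple[str, ...]]]:
    """Split a standard section into HEADER and BODY (illustrative only)."""
    if not tokens:
        return []
    header, *rest = tokens
    # The header is finalized immediately as a single-token paragraph.
    paragraphs = [("HEADER", (header,))]
    if rest:
        # Everything after the header goes into one BODY paragraph, which
        # a body-level collector can split further.
        paragraphs.append(("BODY", tuple(rest)))
    return paragraphs
```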
- class g_docformatter.collectors.ListSectionTokenCollector(settings: FormatterSettings, context: CollectorContext | None = None)¶
Bases: TokenCollectorABC
Token collector for list sections (e.g. Parameters:, Attributes:) of a docstring.
Splits the section into up to three paragraph types:
- HEADER: the first token (the section keyword line, e.g. Parameters:), immediately finalized as a single-token paragraph.
- LIST_ITEM: the opening line of each individual list entry (e.g. param_name (type): description). Each LIST_ITEM paragraph contains exactly one item.
- LIST_ITEM_BODY: any continuation lines that follow a LIST_ITEM within the same logical entry. These are collected into a single paragraph per entry and are intended to be passed to a formatter that uses a BodyTokenCollector.
- HEADER = 'HEADER'¶
Paragraph type for the header of a list section (e.g. ‘Parameters:’, ‘Attributes:’).
- LIST_ITEM = 'LIST_ITEM'¶
Paragraph type for list items in a list section.
- LIST_ITEM_BODY = 'LIST_ITEM_BODY'¶
Paragraph type for any additional paragraphs that occur within a list item, after the initial line.
- collect_and_finalize_header(token: DocstringToken) bool¶
Collect the header token of the list section and finalize it as a paragraph.
- finalize_current_paragraph(token: DocstringToken) bool¶
Finalize the current paragraph if the token should start a new paragraph.
Paragraph boundaries within a list section are driven by indentation and token type. The logic is as follows:
1. A LIST_ITEM token that is dedented to level 0 or 1, or that matches the indent level of the previous list item, always terminates the current paragraph. This covers the start of a new top-level list item.
2. When the collector is currently in state LIST_ITEM and an NL token is seen, the paragraph is finalized. Blank lines within a list item body are not handled here; they remain inside LIST_ITEM_BODY paragraphs so that downstream formatters can split them according to normal body rules.
3. While in LIST_ITEM_BODY state, blank lines and other non-LIST_ITEM tokens do not trigger finalization. Only a LIST_ITEM token that is dedented to level 0 or 1, or that matches the indent level of the previous list item, will end the body paragraph (via rule 1 above).
The method returns True only when the triggering token has been consumed (currently only the NL case), allowing the caller to avoid reprocessing it.
- start_new_paragraph(token: DocstringToken) bool¶
Start a new paragraph if the token indicates a paragraph boundary.
This method is only called when no paragraph is currently in progress (context.state is None). It uses the token’s type and its indentation level, computed relative to context.base_indent, to decide whether a new paragraph should begin, and which kind of list section paragraph it should be.
The header line itself (HEADER state) is handled by collect_and_finalize_header() and therefore is not included in the logic below.
Two paragraph states are possible:
- LIST_ITEM: begins when the token is a LIST_ITEM and any of these conditions hold:
  - the token’s indent level is 0 or 1;
  - or there are no previous paragraphs (first item in section);
  - or the last paragraph was not a LIST_ITEM (for example, the header or an item body), giving precedence to a new top-level entry;
  - or the token’s indent exactly matches the indent level of the previous list item (after subtracting base_indent).
  The last two bullets ensure that a deeply-nested item will still start a new LIST_ITEM if it is indented the same as the prior item, and that isolated dedented items are treated as new list entries.
- LIST_ITEM_BODY: any token whose indent level is greater than 1 and that does not satisfy the LIST_ITEM criteria above will open a body paragraph. This category includes both non-LIST_ITEM continuation lines and LIST_ITEM tokens that are so deeply indented that they don’t align with the previous item. Paragraphs in this state are intentionally not closed by blank lines; they persist until a new list item or header is encountered.
The method always returns False so that the caller will forward the same token to collect_current_paragraph_token(), ensuring the triggering token is appended to whatever paragraph has just started.
- collect_current_paragraph_token(token: DocstringToken) bool¶
Collect tokens for the current paragraph.
- class g_docformatter.collectors.SphinxCodeBlockTokenCollector(settings: FormatterSettings, context: CollectorContext | None = None)¶
Bases: TokenCollectorABC
Token collector for Sphinx code blocks in a docstring.
Splits the code block into up to three paragraph types:
- HEADER: the first token (the directive line, e.g. .. code-block:: python), immediately finalized as a single-token paragraph.
- OPTION: each option line following the header (e.g. :caption: My caption, :linenos:), finalized individually as single-token paragraphs. Zero or more OPTION paragraphs may appear.
- CODE: all remaining tokens collected together into a single paragraph. Blank lines between the header or options and the first code line are not included in the paragraph.
- HEADER = 'HEADER'¶
Paragraph type for the .. code-block:: directive line that starts a Sphinx code block.
- OPTION = 'OPTION'¶
Paragraph type for an option line in a Sphinx code block (the lines that specify options for the code block, e.g. :caption: This is a caption).
- CODE = 'CODE'¶
Paragraph type for the lines of code in a Sphinx code block.
- collect_and_finalize_header(token: DocstringToken) bool¶
Collect the header token of the Sphinx code block and finalize it as a paragraph.
- collect_and_finalize_options(token: DocstringToken) bool¶
Collect and finalize each option line of the Sphinx code block.
Option lines appear after the header or other option lines.
- collect_code(token: DocstringToken) bool¶
Collect the code lines of the Sphinx code block.
Blank lines (NL tokens) that appear before the first code line are skipped: they are marked as collected to suppress uncollected-token warnings but are not added to the CODE paragraph. Once the first non-blank token is seen, the state transitions to CODE and all subsequent tokens (including blank lines) are appended to the current paragraph.
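The HEADER/OPTION/CODE split, including the skipping of leading blank lines, can be sketched as follows. Tokens are modelled as (type, text) pairs, split_code_block is a hypothetical name, and the reading that an option-like line after a blank line counts as code is an assumption based on the statement that options follow the header or other options.

```python
def split_code_block(
    tokens: list[tuple[str, str]],
) -> list[tuple[str, tuple[str, ...]]]:
    """Split a Sphinx code block into paragraphs (illustrative only)."""
    paragraphs: list[tuple[str, tuple[str, ...]]] = []
    code: list[str] = []
    options_allowed = False
    for index, (tok_type, text) in enumerate(tokens):
        if index == 0:
            # The directive line is a single-token HEADER paragraph.
            paragraphs.append(("HEADER", (text,)))
            options_allowed = True
        elif options_allowed and tok_type == "SPHINX_OPTION":
            # Each option line is its own single-token paragraph.
            paragraphs.append(("OPTION", (text,)))
        elif not code and tok_type == "NL":
            # Leading blank line: consumed, but not stored in CODE.
            options_allowed = False
        else:
            # From the first code line on, everything (blank lines
            # included) belongs to the CODE paragraph.
            options_allowed = False
            code.append(text)
    if code:
        paragraphs.append(("CODE", tuple(code)))
    return paragraphs
```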