IndexError in asttokens.ASTTokens #105

Open
DavidKorczynski opened this issue Mar 14, 2023 · 2 comments

@DavidKorczynski

The following program raises an uncaught exception:

import sys
import asttokens, ast
import atheris

def TestOneInput(data):
  fdp = atheris.FuzzedDataProvider(data)
  source_to_parse = fdp.ConsumeUnicodeNoSurrogates(4196)
  try:
    ast.parse(source_to_parse)
  except:
    # Avoid anything that throws any issues in ast.parse.
    return
  try:
    atok = asttokens.ASTTokens(source_to_parse, parse=True)
  except SyntaxError:
    pass

data = (b"\x79\x0a\x79\x0a\x79\x0d\x79\x0a\x0a\x79\x0a\x79\x0a\x79\x0a\x79\x0a\x79\x0a\x79\x79\x0a\x0a\x79\x0a\x79\x0a\x79\x0a\x79\x0a\x79\xae\x79\x0a\x78\x0a\x79\x0a\x79\x0a\x79\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\xc5\x0a")
TestOneInput(data)

Here, the atheris module is the one published at https://pypi.org/project/atheris/

The program is derived from the fuzzer at https://github.com/google/oss-fuzz/blob/master/projects/asttokens/fuzz_asttokens.py

The following program is a shortened version of the above, without the fuzzing-related logic:

import asttokens, ast

def TestOneInput():
  source_to_parse = "\x0a\x79\x0a\x79\x0d\x79\x0a\x0a\x79\x0a\x79\x0a\x79\x0a\x79\x0a\x79\x0a\x79\x79\x0a\x0a\x79\x0a\x79\x0a\x79\x0a\x79\x0a\x79\x2e\x79\x0a\x78\x0a\x79\x0a\x79\x0a\x79\x45\x45\x45\x45\x45\x45\x45\x45\x45\x45\x45\x45\x45\x45\x45\x45\x45\x45\x45\x45\x45\x0a"

  try:
    ast.parse(source_to_parse)
  except:
    # Avoid anything that throws any issues in ast.parse.
    return
  try:
    atok = asttokens.ASTTokens(source_to_parse, parse=True)
  except SyntaxError:
    pass

TestOneInput()

This produces the stack trace:

# python3 ./reproducer.py 
Traceback (most recent call last):
  File "./reproducer.py", line 29, in <module>
    TestOneInput()
  File "./reproducer.py", line 26, in TestOneInput
    atok = asttokens.ASTTokens(source_to_parse, parse=True)
  File "/usr/local/lib/python3.8/site-packages/asttokens/asttokens.py", line 127, in __init__
    self.mark_tokens(self._tree)
  File "/usr/local/lib/python3.8/site-packages/asttokens/asttokens.py", line 139, in mark_tokens
    MarkTokens(self).visit_tree(root_node)
  File "/usr/local/lib/python3.8/site-packages/asttokens/mark_tokens.py", line 61, in visit_tree
    util.visit_tree(node, self._visit_before_children, self._visit_after_children)
  File "/usr/local/lib/python3.8/site-packages/asttokens/util.py", line 273, in visit_tree
    ret = postvisit(current, par_value, cast(Optional[Token], value))
  File "/usr/local/lib/python3.8/site-packages/asttokens/mark_tokens.py", line 109, in _visit_after_children
    nfirst, nlast = self._methods.get(self, node.__class__)(node, first, last)
  File "/usr/local/lib/python3.8/site-packages/asttokens/mark_tokens.py", line 220, in handle_attr
    name = self._code.next_token(dot)
  File "/usr/local/lib/python3.8/site-packages/asttokens/asttokens.py", line 210, in next_token
    while is_non_coding_token(self._tokens[i].type):
IndexError: list index out of range

This was found by OSS-Fuzz, using the setup here: https://github.com/google/oss-fuzz/tree/master/projects/asttokens If you find this issue helpful, it would be great to add maintainer emails to the project.yaml so that maintainers receive notifications of bug reports. Those reports contain the same details as I posted above: the stack trace, the crashing input, and the identification of the fuzzer.

@PeterJCLaw
Collaborator

Here's a smaller cut-down that appears to fail in the same way: '\ry.y\n'. Cutting it down further (removing the attribute access to leave just y, or changing the leading carriage return to a newline) makes the error disappear. The mix of line-ending styles seems to be part of the issue, though it is less clear why the attribute access is needed.
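
To look at the line-ending angle without asttokens installed, here is a minimal standard-library sketch. It compares where ast.parse places the attribute access in '\ry.y\n' with what the tokenize module reports for the same text. The helper names attribute_positions and token_summary are made up for this illustration, and feeding the text through io.StringIO is an assumption about how a tokenizer might be driven, not a statement about asttokens internals. It is a diagnostic aid, not a confirmed diagnosis.

```python
import ast
import io
import tokenize

# Cut-down input from the comment above: a lone carriage return, then "y.y".
SOURCE = "\ry.y\n"


def attribute_positions(source):
    """Return (lineno, col_offset) for every ast.Attribute node in source."""
    tree = ast.parse(source)
    return [
        (node.lineno, node.col_offset)
        for node in ast.walk(tree)
        if isinstance(node, ast.Attribute)
    ]


def token_summary(source):
    """Return (token name, start) pairs from tokenize, or a string describing the error.

    How tokenize treats a stray carriage return may differ between Python
    versions, so any failure is reported rather than raised.
    """
    readline = io.StringIO(source).readline
    try:
        return [
            (tokenize.tok_name[tok.type], tok.start)
            for tok in tokenize.generate_tokens(readline)
        ]
    except Exception as exc:  # deliberately broad: this is only a diagnostic
        return "tokenize failed: %r" % (exc,)


print("ast attribute positions:", attribute_positions(SOURCE))
print("count of '\\n' in text: ", SOURCE.count("\n"))
print("tokenize output:        ", token_summary(SOURCE))
```

ast.parse accepts a lone carriage return as a line break, so it reports the attribute on line 2, even though the text contains only one '\n'. If the tokenize output does not contain NAME tokens on that same line, the two views of the source disagree, which would be consistent with running out of tokens while looking for the name after the dot.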

@alexmojaki
Contributor

Thanks, I was gonna say something similar. '\ry' also produces an error, but a different one. Probably the same underlying cause.
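
Since both '\ry.y\n' and '\ry' start with a lone carriage return, one way to test the shared-cause idea is to normalize line endings first and check that the tokenizer and the parser then agree on positions. The sketch below does that with the standard library only. normalize_newlines, name_token_starts and name_node_starts are hypothetical helpers written for this comment; they are not part of asttokens, and this is a way to probe the hypothesis, not a proposed fix.

```python
import ast
import io
import tokenize


def normalize_newlines(source):
    """Rewrite CRLF pairs and lone CR characters as LF (hypothetical helper)."""
    return source.replace("\r\n", "\n").replace("\r", "\n")


def name_token_starts(source):
    """Return the (line, column) start of every NAME token that tokenize reports."""
    readline = io.StringIO(source).readline
    return [
        tok.start
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.NAME
    ]


def name_node_starts(source):
    """Return the sorted (line, column) start of every ast.Name node."""
    tree = ast.parse(source)
    return sorted(
        (node.lineno, node.col_offset)
        for node in ast.walk(tree)
        if isinstance(node, ast.Name)
    )


for raw in ("\ry.y\n", "\ry"):
    clean = normalize_newlines(raw)
    print(repr(raw), "->", repr(clean))
    print("  NAME tokens:", name_token_starts(clean))
    print("  Name nodes: ", name_node_starts(clean))
```

After normalization, every ast.Name position should also appear among the NAME token positions. If asttokens handles the normalized strings without error, that would support the idea that the lone carriage return is the common trigger for both failures.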
