10.04.2018       Выпуск 225 (09.04.2018 - 15.04.2018)       Статьи

PyPy Development: улучшено исключение SyntaxError

Читать>>



Экспериментальная функция:

Ниже вы видите текст статьи по ссылке. По нему можно быстро понять ссылка достойна прочтения или нет

Просим обратить внимание, что текст по ссылке и здесь может не совпадать.

For the last year, my halftime job has been to teach non-CS uni students to program in Python. While doing that, I have been trying to see what common stumbling blocks exist for novice programmers. There are many things that could be said here, but a common theme that emerges is hard-to-understand error messages. One source of such error messages, particularly when starting out, isSyntaxErrors.

PyPy's parser (mostly following the architecture of CPython) uses a regular-expression-based tokenizer with some cleverness to deal with indentation, and a simple LR(1) parser. Both of these components obviously produce errors for invalid syntax, but the messages are not very helpful. Often, the message is just "invalid syntax", without any hint of what exactly is wrong. In the last couple of weeks I have invested a little bit of effort to make them a tiny bit better. They will be part of the upcoming PyPy 6.0 release. Here are some examples of what changed.

Missing Characters

The first class of errors occurs when a token is missing, often there is only one valid token that the parser expects. This happens most commonly by leaving out the ':' after control flow statements (which is the syntax error I personally still make at least a few times a day). In such situations, the parser will now tell you which character it expected:

>>>> # before
>>>> if 1
  File "<stdin>", line 1
    if 1
       ^
SyntaxError: invalid syntax
>>>>

>>>> # after
>>>> if 1
  File "<stdin>", line 1
    if 1
       ^
SyntaxError: invalid syntax (expected ':')
>>>>

Another example of this feature:

>>>> # before
>>>> def f:
  File "<stdin>", line 1
    def f:
        ^
SyntaxError: invalid syntax
>>>>

>>>> # after
>>>> def f:
  File "<stdin>", line 1
    def f:
         ^
SyntaxError: invalid syntax (expected '(')
>>>>

Parentheses

Another source of errors are unmatched parentheses. Here, PyPy has always had slightly better error messages than CPython:

>>> # CPython
>>> )
  File "<stdin>", line 1
    )
    ^
SyntaxError: invalid syntax
>>>

>>>> # PyPy
>>> )
  File "<stdin>", line 1
    )
    ^
SyntaxError: unmatched ')'
>>>>

The same is true for parentheses that are never closed (the call toevalis needed to get the error, otherwise the repl will just wait for more input):

>>> # CPython
>>> eval('(')
  File "<string>", line 1
    (
    ^
SyntaxError: unexpected EOF while parsing
>>>

>>>> # PyPy
>>>> eval('(')
  File "<string>", line 1
    (
    ^
SyntaxError: parenthesis is never closed
>>>>

What I have now improved is the case of parentheses that are matched wrongly:

>>>> # before
>>>> (1,
.... 2,
.... ]
  File "<stdin>", line 3
    ]
    ^
SyntaxError: invalid syntax
>>>>

>>>> # after
>>>> (1,
.... 2,
.... ]
  File "<stdin>", line 3
    ]
    ^
SyntaxError: closing parenthesis ']' does not match opening parenthesis '(' on line 1
>>>>

Conclusion

Obviously these are just some very simple cases, and there is still a lot of room for improvement (one huge problem is that only a singleSyntaxErroris ever shown per parse attempt, but fixing that is rather hard).

If you have a favorite unhelpful SyntaxError message you love to hate, please tell us in the comments and we might try to improve it. Other kinds of non-informative error messages are also always welcome!



Лучшая Python рассылка



Разместим вашу рекламу

Пиши: mail@pythondigest.ru

Нашли опечатку?

Выделите фрагмент и отправьте нажатием Ctrl+Enter.

Система Orphus