Unterminated string literal Python error

How to fix SyntaxError: unterminated string literal (detected at line 8) in Python. In this scenario, I forgot the closing quotation mark ( " ) of an f-string in the last print line of the script, which is why Python raises this error. It is one of the most common errors in Python. If you face it, find the line where the closing quotation mark ( " ) is missing, add it, and the code will run without errors. See the code below.

Wrong Code: unterminated string literal python

# Just create age input variable
a = input("What is Your Current age?\n")

Y = 101 - int(a)
M = Y * 12
W = M * 4
D = W * 7
print(f"You have {D} Days {W} Weeks, {M} Months And {Y} Years Left In Your Life)
print("Hello World")

Error Message

  File "/home/kali/python/webproject/error/main.py", line 8
    print(f"You have {D} Days {W} Weeks, {M} Months And {Y} Years Left In Your Life)
          ^
SyntaxError: unterminated string literal (detected at line 8)

Wrong code line

Missing closing quotation mark ( " ).

print(f"You have {D} Days {W} Weeks, {M} Months And {Y} Years Left In Your Life)

Correct code line

print(f"You have {D} Days {W} Weeks, {M} Months And {Y} Years Left In Your Life")

The string passed to print() needs both an opening and a closing quotation mark: print(" ").

Entire corrected code

# Just create age input variable
a = input("What is Your Current age?\n")

Y = 101 - int(a)
M = Y * 12
W = M * 4
D = W * 7
print(f"You have {D} Days {W} Weeks, {M} Months And {Y} Years Left In Your Life")
print("Hello World")

What is unterminated string literal in Python?

Python's syntax requires every string literal to be delimited by matching opening and closing quotation marks ( " " ). Whenever the closing quotation mark is missing, for example at the end of an f-string, Python raises SyntaxError: unterminated string literal. See the example above.

How to fix unterminated string literal in Python?

Go to the line that the error message reports, find the string that is missing its closing quotation mark ( " ), and add it. See the example above.
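If you want to check a piece of code for this error without running it, the built-in compile() function can help. The following is only a minimal sketch: the helper name find_syntax_error and the shortened sample strings are made up for illustration, and the exact error message depends on your Python version.

```python
def find_syntax_error(source, filename="<example>"):
    """Return (line_number, message) for the first syntax error, or None."""
    try:
        # compile() parses the source without executing it
        compile(source, filename, "exec")
    except SyntaxError as err:
        return err.lineno, err.msg
    return None


# A shortened version of the script above: the f-string is never closed
broken = 'D = 7\nprint(f"You have {D} Days Left)\n'
fixed = 'D = 7\nprint(f"You have {D} Days Left")\n'

print(find_syntax_error(broken))  # a (line, message) tuple
print(find_syntax_error(fixed))   # None
```

The line number in the result is the same one Python prints in "detected at line N", so it tells you where to start looking for the missing quote.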

For more information visit Amol Blog Code YouTube Channel.

How to fix "SyntaxError: unterminated string literal" in Python

Reza Lavarian

Update: This post was originally published on my blog decodingweb.dev, where you can read the latest version for a 💯 user experience. ~reza

Python raises “SyntaxError: unterminated string literal” when a string value doesn’t have a closing quotation mark.

This syntax error usually occurs owing to a missing quotation mark or an invalid multi-line string. Here’s what the error looks like on Python version 3.11:

File /dwd/sandbox/test.py, line 1
    book_title = 'Head First Python
                 ^
SyntaxError: unterminated string literal (detected at line 1)

In other words, the error “SyntaxError: unterminated string literal” means Python was expecting a closing quotation mark, but it didn’t encounter any:

# 🚫 SyntaxError: unterminated string literal (detected at line 1)
book_title = 'Head First Python

Adding the missing quotation mark fixes the problem instantly:

# ✅ Correct
book_title = 'Head First Python'

How to fix «SyntaxError: unterminated string literal»

The error «SyntaxError: unterminated string literal» occurs under various scenarios:

  1. Missing the closing quotation mark
  2. When a string value ends with a backslash (\)
  3. Opening and closing quotation marks mismatch
  4. A multi-line string enclosed with " or '
  5. A missing backslash!
  6. Multi-line strings enclosed with quadruple quotes!

Missing the closing quotation mark: The most common reason behind this error is forgetting to close your string with a quotation mark, whether it’s a single, double, or triple quotation mark.

# 🚫 SyntaxError: unterminated string literal (detected at line 1)
book_title = 'Head First Python

Needless to say, adding the missing end fixes the problem:

# ✅ Correct
book_title = 'Head First Python'

When a string value ends with a backslash (\): Based on Python semantics, a pair of quotation marks works as a boundary for a string literal.

Putting a backslash before a quotation mark neutralizes it and makes it an ordinary character. In programming terminology, the backslash is called an escape character.

This is helpful when you want to include a quotation mark in your string, but you don’t want it to interfere with its surrounding quotation marks:

message = 'I\'m good'

That said, if you use the backslash before the ending quotation mark, it won’t be a boundary character anymore.

Imagine you need to define a string that ends with a \, like a file path on Windows:

# 🚫 SyntaxError: unterminated string literal (detected at line 1)
file_dir = 'C:\files\'

In the above code, the last \ escapes the quotation mark, leaving our string unterminated. As a result, Python raises “SyntaxError: unterminated string literal”.

To fix it, we use a double backslash (\\) instead of one. It’s like using its effect against itself; the first \ escapes the backslash that follows it. As a result, we’ll have our backslash in the string (as an ordinary character), and the quoting effect of ' remains intact.

# ✅ Escaping a backslash in a string literal
file_dir = 'C:\\files\\'
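A short sketch of a few equivalent ways to write such a path. The value C:\files is just the illustrative path from above, and reaching for pathlib is a suggestion of this sketch rather than something the original post covers.

```python
from pathlib import PureWindowsPath

# Each \\ in the source produces one literal backslash in the string
escaped = 'C:\\files\\'

# A raw string keeps backslashes as-is, but it still cannot END with a
# single backslash, so the trailing one is appended separately
raw_plus = r'C:\files' + '\\'

# pathlib builds the path for you; forward slashes are accepted as input
path = PureWindowsPath('C:/files')

print(escaped)
print(raw_plus)
print(path)
```

Both string variants hold the same nine characters, so which one you choose is mostly a matter of readability.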

Opening and closing quotation marks mismatch: The opening and closing quotation marks must be identical, meaning if the opening quotation mark is ", the closing part must be " too.

The following code raises the error:

# 🚫 SyntaxError: unterminated string literal (detected at line 1)
book_title = "Python Head First'

No matter which one you choose, they need to be identical:

# ✅ Opening and closing quotation marks match
book_title = 'Python Head First'

A multi-line string enclosed with " or ': If you use ' or " to quote a string literal, Python will look for the closing quotation mark on the same line.

Python is an interpreted language that executes lines one after another. So every statement is assumed to be on one line. A new line marks the end of a Python statement and the beginning of another.

So if you define a string literal in multiple lines, you’ll get the syntax error:

# 🚫 SyntaxError: unterminated string literal (detected at line 1)
message = 'Python is a high-level, 
general-purpose 
programming language'

In the above code, the first line doesn’t end with a quotation mark.

If you want a string literal to span across several lines, you should use the triple quotes (""" or ''') instead:

# ✅ The correct way of defining multi-line strings
message = '''Python is a high-level, 
general-purpose 
programming language'''
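Triple quotes keep the line breaks as part of the string. If you only want to split a long string across lines in the source, without embedding newlines, adjacent string literals inside parentheses are one option. This is a small sketch of that alternative, not something the original post prescribes:

```python
# Adjacent literals inside parentheses are joined into one string,
# and no newline characters are added
message = (
    'Python is a high-level, '
    'general-purpose '
    'programming language'
)

# The triple-quoted version keeps the two line breaks in the value
triple = '''Python is a high-level,
general-purpose
programming language'''

print(message)
print(triple.count('\n'))
```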

Multi-line strings enclosed with quadruple quotes: As mentioned earlier, to create multi-line strings, we use triple quotes. But what happens if you use quadruple quotes?

The opening part would be ok, as the fourth quote would be considered a part of the string. However, the closing part will cause a SyntaxError.

Since Python expects the closing part to be triple quotes, the fourth quotation mark would be considered a separate opening without a closing part, and it raises the «SyntaxError: unterminated string literal» error.

# 🚫 SyntaxError: unterminated string literal (detected at line 3)
message = ''''Python is a high-level, 
general-purpose 
programming language''''

Always make sure you’re not using quadruple quotes by mistake.
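If the extra quotation mark was intentional, meaning you really want the text itself to start and end with a quote character, here is a hedged sketch of two ways to write it. The sample sentence is shortened for illustration.

```python
# Option 1: use the other quote style for the triple quotes
quoted_a = """'Python is a high-level language'"""

# Option 2: escape the inner quotes so they are not read as delimiters
quoted_b = '''\'Python is a high-level language\''''

print(quoted_a)
print(quoted_a == quoted_b)
```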

A missing backslash: Another way to span a Python statement across multiple lines is to end each line with a backslash. This backslash (\) escapes the hidden newline character (\n) and makes Python continue parsing the statement on the next line, until it reaches a newline character that isn't escaped.

You can use this technique as an alternative to the triple quotes:

# ✅ The correct way of defining multi-line strings with '\'
message = 'Python is a high-level, \
general-purpose \
programming language'

Now, if you forget the \ in the second line, Python will expect to see the closing quotation mark on that line:

# 🚫 SyntaxError: unterminated string literal (detected at line 2)
message = 'Python is a high-level, \
general-purpose 
programming language'

If you’re taking this approach, all lines should end with \, except for the last line, which ends with the closing quotation mark.

✋ Please note: Since the \ is supposed to escape the newline character (the hidden last character), no space should follow the \. If you leave a space after the backslash, the newline character isn't escaped, and Python will expect the string to end on the current line.
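The trailing-space pitfall is invisible in most editors, so here is a small sketch that makes it visible by building the source as a string and compiling it. The helper name compiles and the sample text are assumptions made for this illustration.

```python
def compiles(source):
    """Return True if the source parses, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Backslash immediately followed by the newline: the string continues
good = "message = 'Python is a high-level, \\\ngeneral-purpose language'\n"

# Backslash, then a space, then the newline: the newline is not escaped
bad = "message = 'Python is a high-level, \\ \ngeneral-purpose language'\n"

print(compiles(good))
print(compiles(bad))

# Running the good version shows that the line break is not part of the value
namespace = {}
exec(good, namespace)
print(namespace["message"])
```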

Alright, I think it does it. I hope this quick guide helped you solve your problem.

Thanks for reading.



Python is known for its simple syntax. However, when you're learning Python for the first time, or when you come to Python with a solid background in another programming language, you may run into some things that Python doesn't allow. If you've ever received a SyntaxError when trying to run your Python code, then this guide can help you. Throughout this tutorial, you'll see common examples of invalid syntax in Python and learn how to resolve the issue.

Invalid Syntax in Python

When you run your Python code, the interpreter first parses it to convert it into Python bytecode, which it then executes. The interpreter finds any invalid syntax in Python during this first stage of program execution, also known as the parsing stage. If the interpreter can't parse your Python code successfully, then you've used invalid syntax somewhere in your code. The interpreter will attempt to show you where that error occurred.

When you're learning Python for the first time, it can be frustrating to get a SyntaxError. Python will attempt to help you determine where the invalid syntax is in your code, but the traceback it provides can be a little confusing. Sometimes, the code it points to is perfectly fine.

Note: If your code is syntactically correct, then you may get other exceptions that are not a SyntaxError. To learn more about Python's other exceptions and how to handle them, check out Python Exceptions: An Introduction (https://realpython.com/python-exceptions/).

You can't handle invalid syntax in Python like other exceptions. Even if you wrap a try and except block around code with invalid syntax, you'll still see the interpreter raise a SyntaxError, because the whole file is parsed before any of it, including the try block, gets a chance to run.
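The one situation where a SyntaxError can be caught is when the invalid code is parsed at runtime, for example from a string passed to compile(). The sketch below illustrates that distinction; the one-line sample source is made up for this example.

```python
# The invalid code lives in a string, so THIS file still parses fine
source = "ages = {'pam': 24 'jim': 24}\n"

try:
    # Parsing happens here, at runtime, inside the try block
    code = compile(source, "<example>", "exec")
except SyntaxError as err:
    outcome = f"caught at runtime: {type(err).__name__}"
else:
    outcome = "compiled"

print(outcome)
```

If the same broken dictionary were written directly in the file instead of inside a string, the interpreter would stop at the parsing stage and the try block would never execute.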

SyntaxError Exception and Traceback

When the interpreter encounters invalid syntax in Python code, it raises a SyntaxError exception and provides a traceback with some helpful information to help you debug the error. Here's some code that contains invalid syntax in Python:

 1 # theofficefacts.py
 2 ages = {
 3     'pam': 24,
 4     'jim': 24
 5     'michael': 43
 6 }
 7 print(f'Michael is {ages["michael"]} years old.')

You can see the invalid syntax in the dictionary literal on line 4. The second entry, 'jim', is missing a comma. If you tried to run this code as-is, then you'd get the following traceback:

$ python theofficefacts.py
File "theofficefacts.py", line 5
    'michael': 43
            ^
SyntaxError: invalid syntax

Note that the traceback message locates the error in line 5, not line 4. The Python interpreter is attempting to point out where the invalid syntax is. However, it can only really point to where it first noticed a problem. When you get a SyntaxError traceback and the code that the traceback is pointing to looks fine, then you'll want to start moving backward through the code until you can determine what's wrong.

In the example above, there isn't a problem with leaving out a comma, depending on what comes after it. For example, there's no problem with a missing comma after 'michael' in line 5. But once the interpreter encounters something that doesn't make sense, it can only point you to the first thing it found that it couldn't understand.

Note: This tutorial assumes that you know the basics of Python's tracebacks. To learn more about the Python traceback and how to read it, check out Understanding the Python Traceback (https://realpython.com/python-traceback/).

There are a few elements of a SyntaxError traceback that can help you determine where the invalid syntax is in your code:

  • The file name where the invalid syntax was encountered

  • The line number and reproduced line of code where the issue was encountered

  • A caret (^) on the line below the reproduced code, which shows you the point in the code that has a problem

  • The error message that comes after the exception type SyntaxError, which can provide information to help you determine the problem

In the example above, the file name given was theofficefacts.py, the line number was 5, and the caret pointed to the closing quote of the dictionary key michael. The SyntaxError traceback might not point to the real problem, but it will point to the first place where the interpreter couldn't make sense of the syntax.
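The same pieces of information are also available as attributes on the SyntaxError object. The sketch below compiles the dictionary example from a string to show them; note that the reported line number and the message wording differ between Python versions, so the values you see may not match the traceback above exactly.

```python
# The dictionary example, without the leading comment line
source = (
    "ages = {\n"
    "    'pam': 24,\n"
    "    'jim': 24\n"
    "    'michael': 43\n"
    "}\n"
)

details = None
try:
    compile(source, "theofficefacts.py", "exec")
except SyntaxError as err:
    details = {
        "filename": err.filename,  # the file name
        "lineno": err.lineno,      # the line number
        "text": err.text,          # the reproduced line of code
        "offset": err.offset,      # the column the caret points to
        "msg": err.msg,            # the error message
    }

print(details)
```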

There are two other exceptions that you might see Python raise. These are equivalent to SyntaxError but have different names:

  1. IndentationError

  2. TabError

These exceptions both inherit from the SyntaxError class, but they're special cases where indentation is concerned. An IndentationError is raised when the indentation levels of your code don't match up. A TabError is raised when your code uses both tabs and spaces in the same file. You'll take a closer look at these exceptions in a later section.
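The inheritance relationship can be checked directly with built-in tools. This short sketch also shows one practical consequence: an except SyntaxError clause catches both subclasses as well.

```python
# Walk the class hierarchy of TabError from most to least specific
hierarchy = [cls.__name__ for cls in TabError.__mro__]
print(hierarchy)

print(issubclass(IndentationError, SyntaxError))
print(issubclass(TabError, IndentationError))
```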

Common Syntax Problems

When you encounter a SyntaxError for the first time, it's helpful to know why there was a problem and what you might do to fix the invalid syntax in your Python code. In the sections below, you'll see some of the more common reasons that a SyntaxError might be raised and how you can fix them.

Misusing the Assignment Operator (=)

There are several cases in Python where you're not able to make assignments to objects. Some examples are assigning to literals and function calls. In the code block below, you can see a few examples that attempt to do this and the resulting SyntaxError tracebacks:

>>> len('hello') = 5
  File "<stdin>", line 1
SyntaxError: can't assign to function call

>>> 'foo' = 1
  File "<stdin>", line 1
SyntaxError: can't assign to literal

>>> 1 = 'foo'
  File "<stdin>", line 1
SyntaxError: can't assign to literal

The first example tries to assign the value 5 to the len() call. The SyntaxError message is very helpful in this case. It tells you that you can't assign a value to a function call.

The second and third examples try to assign a string and an integer to literals. The same rule is true for other literal values. Once again, the traceback messages indicate that the problem occurs when you attempt to assign a value to a literal.

Note: The examples above are missing the repeated code line and the caret (^) pointing to the problem in the traceback. The exception and traceback you see will be different when you're in the REPL vs trying to execute this code from a file. If this code were in a file, then you'd get the repeated code line and the caret pointing to the problem, as you saw in other cases throughout this tutorial.

It's likely that your intent isn't to assign a value to a literal or a function call. For instance, this can occur if you accidentally leave off the extra equals sign (=), which would turn the assignment into a comparison. A comparison, as you can see below, would be valid:

>>> len('hello') == 5
True

Most of the time, when Python tells you that you're making an assignment to something that can't be assigned to, you first might want to check that the statement shouldn't be a Boolean expression instead. You may also run into this issue when you're trying to assign a value to a Python keyword, which is covered in the next section.

Misspelling, Missing, or Misusing Python Keywords

Python keywords are a set of protected words that have special meaning in Python. These are words you can't use as identifiers, variables, or function names in your code. They're a part of the language and can only be used in the context that Python allows.

There are three common ways that you can mistakenly use keywords:

  1. Misspelling a keyword

  2. Missing a keyword

  3. Misusing a keyword

If you misspell a keyword in your Python code, then you'll get a SyntaxError. For example, here's what happens if you spell the keyword for incorrectly:

>>> fro i in range(10):
  File "<stdin>", line 1
    fro i in range(10):
        ^
SyntaxError: invalid syntax

The message reads SyntaxError: invalid syntax, but that's not very helpful. The traceback points to the first place where Python could detect that something was wrong. To fix this sort of error, make sure that all of your Python keywords are spelled correctly.

Another common issue with keywords is when you miss them altogether:

>>> for i range(10):
  File "<stdin>", line 1
    for i range(10):
              ^
SyntaxError: invalid syntax

Once again, the exception message isn't that helpful, but the traceback does attempt to point you in the right direction. If you move back from the caret, then you can see that the in keyword is missing from the for loop syntax.

You can also misuse a protected Python keyword. Remember, keywords are only allowed to be used in specific situations. If you use them incorrectly, then you'll have invalid syntax in your Python code. A common example of this is the use of continue or break (https://realpython.com/python-for-loop/#the-break-and-continue-statements) outside of a loop. This can easily happen during development when you're implementing things and happen to move logic outside of a loop:

>>> names = ['pam', 'jim', 'michael']
>>> if 'jim' in names:
...     print('jim found')
...     break
...
  File "<stdin>", line 3
SyntaxError: 'break' outside loop

>>> if 'jim' in names:
...     print('jim found')
...     continue
...
  File "<stdin>", line 3
SyntaxError: 'continue' not properly in loop

Here, Python does a great job of telling you exactly what's wrong. The messages "'break' outside loop" and "'continue' not properly in loop" help you figure out exactly what to do. If this code were in a file, then Python would also have the caret pointing right to the misused keyword.

Another example is if you attempt to assign a value to a Python keyword or use a keyword to define a function:

>>> pass = True
  File "<stdin>", line 1
    pass = True
         ^
SyntaxError: invalid syntax

>>> def pass():
  File "<stdin>", line 1
    def pass():
           ^
SyntaxError: invalid syntax

When you attempt to assign a value to pass, or when you attempt to define a new function called pass, you'll get a SyntaxError and see the "invalid syntax" message again.

It might be a little harder to solve this type of invalid syntax in Python code because the code looks fine from the outside. If your code looks good, but you're still getting a SyntaxError, then you might consider checking the variable name or function name you want to use against the keyword list for the version of Python that you're using.

The list of protected keywords has changed with each new version of Python. For example, in Python 3.6 you could use await as a variable name or function name, but as of Python 3.7, that word has been added to the keyword list. Now, if you try to use await as a variable or function name, this will cause a SyntaxError if your code is for Python 3.7 or later.

Another example of this is print, which differs in Python 2 vs Python 3:

Version     print Type           Takes A Value
Python 2    keyword              no
Python 3    built-in function    yes

print is a keyword in Python 2, so you can't assign a value to it. In Python 3, however, it's a built-in function that can be assigned values.

You can run the following code to see the list of keywords in whatever version of Python you're running:

import keyword
print(keyword.kwlist)

The keyword module also provides the useful keyword.iskeyword(). If you just need a quick way to check the pass variable, then you can use the following one-liner:

>>> import keyword; keyword.iskeyword('pass')
True

This code will tell you quickly if the identifier that you're trying to use is a keyword or not.
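Building on that one-liner, here is a small sketch of a helper that checks several candidate names at once. The function name is_usable_name and the sample names are hypothetical; the helper combines keyword.iskeyword() with str.isidentifier(), which also rejects names such as those starting with a digit.

```python
import keyword


def is_usable_name(name):
    """Return True if `name` can be used as a variable or function name."""
    return name.isidentifier() and not keyword.iskeyword(name)


# The result for 'await' depends on the Python version you run this with
for candidate in ["pass", "await", "ages", "2fast"]:
    print(candidate, is_usable_name(candidate))
```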

Missing Parentheses, Brackets, and Quotes

Often, the cause of invalid syntax in Python code is a missed or mismatched closing parenthesis, bracket, or quote. These can be hard to spot in very long lines of nested parentheses or longer multi-line blocks. You can spot mismatched or missing quotes with the help of Python's tracebacks:

>>> message = 'don't'
  File "<stdin>", line 1
    message = 'don't'
                   ^
SyntaxError: invalid syntax

Here, the traceback points to the invalid code where there's a t' after a closing single quote. To fix this, you can make one of two changes:

  1. Escape the single quote with a backslash ('don\'t')

  2. Surround the entire string in double quotes ("don't")
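Both fixes produce exactly the same string, as this short sketch shows. The variable names are only for illustration.

```python
# Fix 1: escape the inner single quote
escaped = 'don\'t'

# Fix 2: use double quotes as the delimiters instead
double_quoted = "don't"

print(escaped)
print(escaped == double_quoted)
```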

Another common mistake is to forget to close a string. With both double-quoted and single-quoted strings, the situation and traceback are the same:

>>> message = "This is an unclosed string
  File "<stdin>", line 1
    message = "This is an unclosed string
                                        ^
SyntaxError: EOL while scanning string literal

This time, the caret in the traceback points right to the problem code. The SyntaxError message, "EOL while scanning string literal", is a little more specific and helpful in determining the problem. This means that the Python interpreter got to the end of a line (EOL) before an open string was closed. (Python 3.10 and later report the same mistake as "unterminated string literal".) To fix this, close the string with a quote that matches the one you used to start it. In this case, that would be a double quote (").

Unclosed quotes in statements inside an f-string can also lead to invalid syntax in Python:

 1 # theofficefacts.py
 2 ages = {
 3     'pam': 24,
 4     'jim': 24,
 5     'michael': 43
 6 }
 7 print(f'Michael is {ages["michael]} years old.')

Here, the reference to the ages dictionary inside the printed f-string is missing the closing double quote from the key reference. The resulting traceback is as follows:

$ python theofficefacts.py
  File "theofficefacts.py", line 7
    print(f'Michael is {ages["michael]} years old.')
         ^
SyntaxError: f-string: unterminated string

Python identifies the problem and tells you that it exists inside the f-string. The message "unterminated string" also indicates what the problem is. The caret in this case only points to the beginning of the f-string.

This might not be as helpful as when the caret points to the problem area of the f-string, but it does narrow down where you need to look. There's an unterminated string somewhere inside that f-string. You just have to find out where. To fix this problem, make sure that all internal f-string quotes and brackets are present.

The situation is mostly the same for missing parentheses and brackets. If you leave out the closing square bracket from a list, for example, then Python will spot that and point it out. There are a few variations of this, however. The first is to leave the closing bracket off of the list:

# missing.py
def foo():
    return [1, 2, 3

print(foo())

When you run this code, you'll be told that there's a problem with the call to print():

$ python missing.py
  File "missing.py", line 5
    print(foo())
        ^
SyntaxError: invalid syntax

What's happening here is that Python thinks the list contains three elements: 1, 2, and 3 print(foo()). Python uses whitespace to group things logically, and because there's no comma or bracket separating 3 from print(foo()), Python lumps them together as the third element of the list.

Another variation is to add a trailing comma after the last element in the list while still leaving off the closing square bracket:

# missing.py
def foo():
    return [1, 2, 3,

print(foo())

Now you get a different traceback:

$ python missing.py
  File "missing.py", line 6

                ^
SyntaxError: unexpected EOF while parsing

In the previous example, 3 and print(foo()) were lumped together as one element, but here you see a comma separating the two. Now, the call to print(foo()) gets added as the fourth element of the list, and Python reaches the end of the file without the closing bracket. The traceback tells you that Python got to the end of the file (EOF), but it was expecting something else.

In this example, Python was expecting a closing bracket (]), but the repeated line and caret are not very helpful. Missing parentheses and brackets are tough for Python to identify. Sometimes the only thing you can do is start from the caret and move backward until you can identify what's missing or wrong.
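When moving backward by eye gets tedious, a small script can do the search. The following is only a rough sketch built on the standard tokenize module; the helper name unclosed_brackets is made up, and it reports openers that are never closed without trying to diagnose mismatched pairs.

```python
import io
import tokenize

# Maps each closing bracket to the opener it must match
PAIRS = {")": "(", "]": "[", "}": "{"}


def unclosed_brackets(source):
    """Return [(bracket, line_number), ...] for openers that are never closed."""
    stack = []
    try:
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type != tokenize.OP:
                continue
            if tok.string in PAIRS.values():
                stack.append((tok.string, tok.start[0]))
            elif tok.string in PAIRS and stack and stack[-1][0] == PAIRS[tok.string]:
                stack.pop()
    except (tokenize.TokenError, SyntaxError):
        # The tokenizer gives up at the end of the file when a bracket is
        # still open; whatever is left on the stack is what we report
        pass
    return stack


# missing.py from above: the list on line 2 is never closed
source = "def foo():\n    return [1, 2, 3\n\nprint(foo())\n"
print(unclosed_brackets(source))
```

Because tokenize understands strings and comments, brackets that appear inside string literals are not counted, which a naive character-by-character scan would get wrong.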

Mistaking Dictionary Syntax

You saw earlier that you could get a SyntaxError if you leave the comma off of a dictionary element. Another form of invalid syntax with Python dictionaries is the use of the equals sign (=) to separate keys and values, instead of the colon:

>>> ages = {'pam'=24}
  File "<stdin>", line 1
    ages = {'pam'=24}
                 ^
SyntaxError: invalid syntax

Once again, this error message is not very helpful. The repeated line and caret, however, are very helpful! They're pointing right to the problem character.

This type of issue is common if you confuse Python syntax with that of other programming languages. You'll also see this if you confuse the act of defining a dictionary with a dict() call. To fix this, you could replace the equals sign with a colon. You can also switch to using dict():

>>> ages = dict(pam=24)
>>> ages
{'pam': 24}

You can use dict() to define the dictionary if that syntax is more helpful.
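As a quick illustration, the forms below all build the same dictionary. The second entry for jim is added here only to make the comparison a little less trivial.

```python
# Dictionary literal: keys and values are separated by a colon
literal = {'pam': 24, 'jim': 24}

# dict() with keyword arguments: here the equals sign is correct
from_kwargs = dict(pam=24, jim=24)

# dict() with a list of (key, value) pairs
from_pairs = dict([('pam', 24), ('jim', 24)])

print(literal == from_kwargs == from_pairs)
```

Note that the keyword-argument form only works when the keys are valid Python identifiers.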

Using the Wrong Indentation

There are two sub-classes of SyntaxError that deal with indentation issues specifically:

  1. IndentationError

  2. TabError

While other programming languages use curly braces to denote blocks of code, Python uses whitespace. That means that Python expects the whitespace in your code to behave predictably. It will raise an IndentationError if there's a line in a code block that has the wrong number of spaces:

 1 # indentation.py
 2 def foo():
 3     for i in range(10):
 4         print(i)
 5   print('done')
 6
 7 foo()

This might be tough to see, but line 5 is only indented 2 spaces. It should be in line with the for loop statement, which is 4 spaces over. Thankfully, Python can spot this easily and will quickly tell you what the issue is.

There's also a bit of ambiguity here, though. Is the print('done') line intended to be after the for loop or inside the for loop block? When you run the above code, you'll see the following error:

$ python indentation.py
  File "indentation.py", line 5
    print('done')
                ^
IndentationError: unindent does not match any outer indentation level

Even though the traceback looks a lot like the SyntaxError traceback, it's actually an IndentationError. The error message is also very helpful. It tells you that the indentation level of the line doesn't match any other indentation level. In other words, print('done') is indented 2 spaces, but Python can't find any other line of code that matches this level of indentation. You can fix this quickly by making sure the code lines up with the expected indentation level.
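The same error can be reproduced from a string, which makes it easy to confirm the exception type and the reported line. In this sketch the file's leading comment line is left out, so the bad line is line 4 rather than line 5.

```python
# indentation.py without its first comment line; the last line has 2 spaces
source = (
    "def foo():\n"
    "    for i in range(10):\n"
    "        print(i)\n"
    "  print('done')\n"
)

report = None
try:
    compile(source, "indentation.py", "exec")
except IndentationError as err:
    report = (type(err).__name__, err.lineno)

print(report)
```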

The other type of SyntaxError is the TabError, which you'll see whenever there's a line that contains either tabs or spaces for its indentation, while the rest of the file contains the other. This might go hidden until Python points it out to you!

If your tab size is the same width as the number of spaces in each indentation level, then it might look like all the lines are at the same level. However, if one line is indented using spaces and the other is indented with tabs, then Python will point this out as a problem:

 1 # indentation.py
 2 def foo():
 3     for i in range(10):
 4         print(i)
 5     print('done')
 6
 7 foo()

Here, line 5 is indented with a tab instead of 4 spaces. This code block could look perfectly fine to you, or it could look completely wrong, depending on your system settings.

Python, however, will notice the issue immediately. But before you run the code to see what Python will tell you is wrong, it might be helpful for you to see an example of what the code looks like under different tab width settings:

$ tabs 4 # Sets the shell tab width to 4 spaces
$ cat -n indentation.py
     1   # indentation.py
     2   def foo():
     3       for i in range(10):
     4           print(i)
     5       print('done')
     6
     7   foo()

$ tabs 8 # Sets the shell tab width to 8 spaces (standard)
$ cat -n indentation.py
     1   # indentation.py
     2   def foo():
     3       for i in range(10):
     4           print(i)
     5           print('done')
     6
     7   foo()

$ tabs 3 # Sets the shell tab width to 3 spaces
$ cat -n indentation.py
     1   # indentation.py
     2   def foo():
     3       for i in range(10):
     4           print(i)
     5      print('done')
     6
     7   foo()

Notice the difference in display between the three examples above. Most of the code uses 4 spaces for each indentation level, but line 5 uses a single tab in all three examples. The width of the tab changes, based on the tab width setting:

  • If the tab width is 4, then the print statement will look like it's outside the for loop. The console will print 'done' at the end of the loop.

  • If the tab width is 8, which is standard for a lot of systems, then the print statement will look like it's inside the for loop. The console will print 'done' after each number.

  • If the tab width is 3, then the print statement looks out of place. In this case, line 5 doesn't match up with any indentation level.

When you run the code, you'll get the following error and traceback:

$ python indentation.py
  File "indentation.py", line 5
    print('done')
                ^
TabError: inconsistent use of tabs and spaces in indentation

Notice the TabError instead of the usual SyntaxError. Python points out the problem line and gives you a helpful error message. It tells you clearly that a mixture of tabs and spaces is used for indentation in the same file.

The solution is to make all lines in the same Python code file use either tabs or spaces, but not both. For the code blocks above, the fix is to remove the tab and replace it with 4 spaces, so that 'done' is printed after the for loop has finished.
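As a rough sketch of that fix, the snippet below reproduces the mixed indentation in a string, checks that Python rejects it, and then replaces the tab with 4 spaces. The names source, fixed, and compiles are made up for this illustration; only foo() comes from indentation.py.

```python
# indentation.py reproduced as text: the last line is indented with a tab
# ("\t"), while every other line uses 4 spaces per indentation level.
source = (
    "def foo():\n"
    "    for i in range(10):\n"
    "        print(i)\n"
    "\tprint('done')\n"
)

def compiles(code):
    """Return None if the code compiles, else the name of the error raised."""
    try:
        compile(code, "indentation.py", "exec")
    except SyntaxError as err:  # TabError is a subclass of SyntaxError
        return type(err).__name__
    return None

print(compiles(source))  # the mix of a tab and spaces is rejected

# The fix: replace the tab with 4 spaces so print('done') sits after the loop.
fixed = source.replace("\t", "    ")
print(compiles(fixed))
```

Compiling from a string keeps the demonstration script itself free of mixed indentation.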

Defining and Calling Functions

You might run into invalid syntax in Python when you're defining or calling functions. For example, you'll see a SyntaxError if you use a semicolon instead of a colon at the end of a function definition:

>>> def fun();
  File "<stdin>", line 1
    def fun();
             ^
SyntaxError: invalid syntax

The traceback here is very helpful, with the caret pointing right at the problem character. You can clear up this invalid syntax by replacing the semicolon with a colon.

In addition, keyword arguments in both function definitions and function calls need to be in the right order. Keyword arguments always come after positional arguments. Failing to use this ordering leads to a SyntaxError:

>>> def fun(a, b):
...     print(a, b)
...
>>> fun(a=1, 2)
  File "<stdin>", line 1
SyntaxError: positional argument follows keyword argument

Here, once again, the error message is very helpful in telling you exactly what is wrong with the line.
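A minimal sketch of the ordering rule: the valid call puts the positional argument first, while the invalid call is compiled from a string so that this script itself stays valid. The variable message and the return value of fun() are assumptions added for the illustration.

```python
def fun(a, b):
    return (a, b)

# Positional arguments first, keyword arguments after: this is valid.
print(fun(1, b=2))

# The invalid ordering is rejected when the code is compiled, before anything
# runs, so it is compiled from a string here.
try:
    compile("fun(a=1, 2)", "<example>", "eval")
except SyntaxError as err:
    message = err.msg
    print("SyntaxError:", message)
```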

Changing Python Versions

Sometimes, code that works perfectly fine in one version of Python breaks in a newer version. This is due to official changes in the language syntax. The best-known example is the print statement, which went from a keyword in Python 2 to a built-in function in Python 3:

>>> # Valid Python 2 syntax that fails in Python 3
>>> print 'hello'
  File "<stdin>", line 1
    print 'hello'
                ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print('hello')?

This is one of the cases where the error message that accompanies the SyntaxError shines! Not only does it tell you that parentheses are missing in the print call, but it also provides the correct code to help you fix the statement.

Another problem you might run into is reading or learning about syntax that is valid in a newer version of Python but isn't valid in the version you're writing in. An example is f-string syntax, which doesn't exist in Python versions before 3.6:

>>> # Any version of python before 3.6 including 2.7
>>> w ='world'
>>> print(f'hello, {w}')
  File "<stdin>", line 1
    print(f'hello, {w}')
                      ^
SyntaxError: invalid syntax

In versions of Python before 3.6, the interpreter doesn't know anything about f-string syntax and just gives a generic "invalid syntax" message. The problem, in this case, is that the code looks perfectly fine, but it was run with an older version of Python. When in doubt, double-check which version of Python you're running!
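One way to double-check the version from inside a script is sys.version_info. The sketch below is only an illustration; MIN_VERSION and supports_f_strings are hypothetical names, and the script deliberately avoids f-strings so that it can also run on an old interpreter.

```python
import sys

# f-strings require Python 3.6 or newer.
MIN_VERSION = (3, 6)

def supports_f_strings(version_info=sys.version_info):
    """Return True if the given interpreter version understands f-strings."""
    return tuple(version_info[:2]) >= MIN_VERSION

print("Running Python", sys.version.split()[0])
if not supports_f_strings():
    print("This interpreter is too old for f-strings")
```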

Python syntax continues to evolve, and Python 3.8 introduced some cool new features:

  • The walrus operator (assignment expressions)

  • F-string syntax for debugging

  • Positional-only arguments (https://docs.python.org/3.8/whatsnew/3.8.html#positional-only-parameters)

If you want to try out some of these new features, then you need to make sure you're working in a Python 3.8 environment. Otherwise, you'll get a SyntaxError.
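To give a feel for the three features, here is a short sketch that uses each of them once. The names data, long_message, debug_text, and power are invented for the example, and the file as a whole only parses on Python 3.8 or newer.

```python
# Requires Python 3.8 or newer; older interpreters raise SyntaxError here.

# Walrus operator: assign and test in a single expression.
data = [1, 2, 3, 4]
if (n := len(data)) > 3:
    long_message = f"list has {n} items"

# f-string "=" specifier: shows the expression together with its value.
debug_text = f"{n=}"

# Positional-only parameters: names before "/" cannot be passed by keyword.
def power(base, exponent, /):
    return base ** exponent

print(long_message)
print(debug_text)
print(power(2, 5))
```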

Python 3.8 also emits SyntaxWarning in new situations. You'll see this warning when the syntax is valid but still looks suspicious. An example is a missing comma between two tuples in a list. This is valid syntax in Python versions before 3.8, but the code raises a TypeError because a tuple is not callable:

>>> [(1,2)(2,3)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object is not callable

This TypeError means that you can't call a tuple like a function, which is what the Python interpreter thinks you're doing.

In Python 3.8, this code still raises the TypeError, but now you'll also see a SyntaxWarning that indicates how you can fix the problem:

>>> [(1,2)(2,3)]
<stdin>:1: SyntaxWarning: 'tuple' object is not callable; perhaps you missed a comma?
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object is not callable

The helpful message that accompanies this SyntaxWarning even provides a hint ("perhaps you missed a comma?") to point you in the right direction!
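If you want to look at the warning without running the faulty expression, one option is to compile it from a string and record the warnings. This is a sketch that assumes Python 3.8 or newer; caught and syntax_warnings are names chosen for the example.

```python
import warnings

# Compile the suspicious expression from a string so the warning can be
# recorded. Nothing is executed, so no TypeError is raised here.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    compile("[(1, 2)(2, 3)]", "<example>", "eval")

syntax_warnings = [w for w in caught if issubclass(w.category, SyntaxWarning)]
for w in syntax_warnings:
    print("SyntaxWarning:", w.message)
```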

Conclusion

In this tutorial, you've seen what information the SyntaxError traceback gives you. You've also seen many common examples of invalid syntax in Python and the solutions to those problems. Not only will this speed up your workflow, but it will also make you a more helpful code reviewer!

When you're writing code, try to use an IDE that understands Python syntax and provides feedback. If you put many of the invalid Python code examples from this tutorial into a good IDE, it should highlight the problem lines before you even get to run your code.

Getting a SyntaxError while you're learning Python can be frustrating, but now you know how to read traceback messages and which forms of invalid syntax you might come up against. The next time you get a SyntaxError, you'll be better equipped to fix the problem quickly!

I am trying to use a shutil script I found, but it raises SyntaxError: unterminated string literal (detected at line 4). Any help fixing this, or a new script, would be appreciated.

import shutil
import os

source = r"C:Users[username]Downloads"
dest1 = r" C:Users[username]DesktopReports14"
dest2 = r" C:Users[username]DesktopReports52"
dest3 = r" C:Users[username]DesktopReports59"

files = os.listdir(source)

for f in files:
   
 if (f.startswith("Log 14")):
        shutil.move(f, dest1)
    elif (f.startswith("Log 54")):
        shutil.move(f, dest2)

asked Dec 9, 2021 at 20:42

Ric

Watch out for smart quotes (“ and ”). They need to be straight double quotes (").

You had smart quotes instead of normal ones. Indenting is also not correct.

Here is the fixed code:

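import shutil
import os

# Raw strings (r"...") keep the backslashes literal. Without the r prefix,
# "\U" starts a Unicode escape and a trailing "\" escapes the closing quote,
# which produces another unterminated string literal.
source = r"C:\Users\[username]\Downloads"
dest1 = r"C:\Users\[username]\Desktop\Reports\14"
dest2 = r"C:\Users\[username]\Desktop\Reports\52"
dest3 = r"C:\Users\[username]\Desktop\Reports\59"

files = os.listdir(source)

for f in files:
    # os.path.join() adds the separator, so source needs no trailing backslash
    if f.startswith("Log 14"):
        shutil.move(os.path.join(source, f), dest1)
    elif f.startswith("Log 54"):
        shutil.move(os.path.join(source, f), dest2)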

wjandrea

answered Dec 9, 2021 at 21:48

norbot

Smart Quotation Marks strike again.

Using BabelStone, you can identify the Unicode code point of each character in your code.

The opening and closing quotes you’re used to are U+0022. However, the path on dest2 ends with a different character, U+201D. The easiest way to fix this is to retype the quotation marks in your IDE.

Input: "”

U+0022 : QUOTATION MARK {double quote}
U+201D : RIGHT DOUBLE QUOTATION MARK {double comma quotation mark}
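The same check can be done locally with the standard unicodedata module instead of an online tool. This is a sketch only: find_smart_quotes, SMART_QUOTES, and the sample text are hypothetical and are not part of the question's script.

```python
import unicodedata

# Characters that look like quotes but do not delimit Python strings.
SMART_QUOTES = {"\u2018", "\u2019", "\u201c", "\u201d"}

def find_smart_quotes(text):
    """Return (line, column, code point, name) for every smart quote found."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, char in enumerate(line, start=1):
            if char in SMART_QUOTES:
                code_point = f"U+{ord(char):04X}"
                hits.append((line_no, col, code_point, unicodedata.name(char)))
    return hits

# Hypothetical sample: the second line ends with U+201D instead of U+0022.
sample = 'dest1 = "Reports"\ndest2 = "Reports\u201d\n'
for hit in find_smart_quotes(sample):
    print(hit)
```

Running a check like this over the script's text before executing it points at the exact line and column to retype.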

answered Dec 9, 2021 at 21:43

blackbrandt

import os

if os.name == 'nt':    # check if Windows
    a = '\\'           # a lone backslash must be escaped, or it escapes the closing quote
else:
    a = '/'

source = "C:" + a + "Users" + a + "[username]" + a + "Downloads" + a

answered Jul 7, 2022 at 8:21

RITESH

When working with strings in Python, you may sometimes get the SyntaxError: unterminated triple-quoted string literal. Don’t panic: this section explains why it happens and how to solve it.

Why does the “SyntaxError: unterminated triple-quoted string literal” occur?

Python has two kinds of string literal syntax: single-quoted and triple-quoted. A single-quoted string starts and ends with one single quote ('), or starts and ends with one double quote ("). A triple-quoted string starts and ends with three single quotes (''') or three double quotes (""") instead. A triple-quoted string can span multiple lines, which is why it is commonly used for docstrings. For example:

# Single-quoted string:
single = 'LearnShareIT'
double = "LearnShareIT"
 
# Triple-quoted string:
single = '''Learn
Share
IT'''
double = """Learn
Share
IT"""

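In the example above, the two single-quoted strings are the same string, and the two triple-quoted strings are the same string. The triple-quoted versions also contain the line breaks between Learn, Share, and IT, so they differ from the single-quoted ones. Now that you know what a triple-quoted string literal is, the reason for the error is easier to see: you opened a triple-quoted string literal but never closed it, so the string literal never terminates. For example: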

# Triple-quoted string with opening single quotes only:
single = '''Learn
Share
IT

Or another example:

# Triple-quoted string with opening double quotes:
double = """LearnShareIT

The error also happens when you open a string literal with a triple of single quotes but end with double quotes and vice versa:

# Triple-quoted string with opening double quotes and ending single quotes:
double = """LearnShareIT'''
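To confirm that each of the three cases above is rejected, you can compile them from text. This is a sketch under the assumption that the snippets are stored as ordinary strings; broken_snippets and syntax_error_message are names invented here, and the exact wording of the message depends on the Python version.

```python
# Each broken literal from above, held as ordinary text.
broken_snippets = {
    "no closing single quotes": "single = '''Learn\nShare\nIT\n",
    "no closing double quotes": 'double = """LearnShareIT\n',
    "mismatched closing quotes": 'double = """LearnShareIT\'\'\'\n',
}

def syntax_error_message(code):
    """Return the SyntaxError message for code, or None if it compiles."""
    try:
        compile(code, "<example>", "exec")
    except SyntaxError as err:
        return err.msg
    return None

for label, code in broken_snippets.items():
    print(label, "->", syntax_error_message(code))

# Closing with the same three quotes that opened the string compiles cleanly.
print(syntax_error_message('double = """LearnShareIT"""\n'))
```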

In the following section, we will provide you with many solutions to solve this error.

How to solve the error?

Close the triple-quoted string literal

As the error indicates, the triple-quoted string literal is malformed. To fix it, close the string with three quotes of the same type as the opening quotes:

# Triple-quoted string using single quotes:
single = '''LearnShareIT'''
print(single)

# Triple-quoted string using double quotes:
double = """LearnShareIT"""
print(double)

Output

LearnShareIT
LearnShareIT

Using the single-quoted string literal

If the string fits on one line, another way to avoid the problem is to use a single-quoted string literal instead of a triple-quoted one:

# Single-quoted string using single quotes:
single = 'LearnShareIT'
print(single)

# Single-quoted string using double quotes:
double = "LearnShareIT"
print(double)

Output

LearnShareIT
LearnShareIT

Make sure the opening and closing quotes are the same type

We recommend you recheck the quotes at the beginning and ending of the string literals and make sure they are both single quotes or both double quotes, and not to mix them up. For example, if you start the string literals with three single quotes, then you must end it with three single quotes, not double quotes, and vice versa. Also check the number of quotes you have used, because you cannot open a string with three quotes but end with one quote (or two).

Summary

We have learned how to deal with the SyntaxError: unterminated triple-quoted string literal in Python. By following the rules of the string literal syntax in Python as we have explained and finding out the reasons causing this problem in our tutorial, you can easily solve it.

I’m Edward Anderson. I currently work as a programmer. I’m majoring in information technology and have 5 years of programming experience. Python, C, C++, Javascript, Java, HTML, CSS, and R are my strong suits. Let me know if you have any questions about these programming languages.


Name of the university: HCMUT
Major: CS
Programming Languages: Python, C, C++, Javascript, Java, HTML, CSS, R

Python 3.10 is out! Volunteers have been working on the new version since May 2020 to bring you a better, faster, and more secure Python. As of October 4, 2021, the first official version is available.

Each new version of Python brings a host of changes. You can read about all of them in the documentation. Here, you’ll get to learn about the coolest new features.

In this tutorial, you’ll learn about:

  • Debugging with more helpful and precise error messages
  • Using structural pattern matching to work with data structures
  • Adding more readable and more specific type hints
  • Checking the length of sequences when using zip()
  • Calculating multivariable statistics

To try out the new features yourself, you need to run Python 3.10. You can get it from the Python homepage. Alternatively, you can use Docker with the latest Python image.

Better Error Messages

Python is often lauded for being a user-friendly programming language. While this is true, there are certain parts of Python that could be friendlier. Python 3.10 comes with a host of more precise and constructive error messages. In this section, you’ll see some of the newest improvements. The full list is available in the documentation.

Think back to writing your first Hello World program in Python:

# hello.py

print("Hello, World!)

Maybe you created a file, added the famous call to print(), and saved it as hello.py. You then ran the program, eager to call yourself a proper Pythonista. However, something went wrong:

$ python hello.py
  File "/home/rp/hello.py", line 3
    print("Hello, World!)
                        ^
SyntaxError: EOL while scanning string literal

There was a SyntaxError in the code. EOL, what does that even mean? You went back to your code, and after a bit of staring and searching, you realized that there was a missing quotation mark at the end of your string.

One of the more impactful improvements in Python 3.10 is better and more precise error messages for many common issues. If you run your buggy Hello World in Python 3.10, you’ll get a bit more help than in earlier versions of Python:

$ python hello.py
  File "/home/rp/hello.py", line 3
    print("Hello, World!)
          ^
SyntaxError: unterminated string literal (detected at line 3)

The error message is still a bit technical, but gone is the mysterious EOL. Instead, the message tells you that you need to terminate your string! There are similar improvements to many different error messages, as you’ll see below.

A SyntaxError is an error raised when your code is parsed, before it even starts to execute. Syntax errors can be tricky to debug because the interpreter provides imprecise or sometimes even misleading error messages. The following code is missing a curly brace to terminate the dictionary:

 1 # unterminated_dict.py
 2
 3 months = {
 4     10: "October",
 5     11: "November",
 6     12: "December"
 7
 8 print(f"{months[10]} is the tenth month")

The missing closing curly brace that should have been on line 7 is an error. If you run this code with Python 3.9 or earlier, you’ll see the following error message:

  File "/home/rp/unterminated_dict.py", line 8
    print(f"{months[10]} is the tenth month")
    ^
SyntaxError: invalid syntax

The error message highlights line 8, but there are no syntactical problems in line 8! If you’ve experienced your share of syntax errors in Python, you might already know that the trick is to look at the lines before the one Python complains about. In this case, you’re looking for the missing closing brace on line 7.

In Python 3.10, the same code shows a much more helpful and precise error message:

  File "/home/rp/unterminated_dict.py", line 3
    months = {
             ^
SyntaxError: '{' was never closed

This points you straight to the offending dictionary and allows you to fix the issue in no time.
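If you’d like to see which line your own interpreter blames, a small sketch is to compile the file’s text and read the attributes of the SyntaxError. The variables source and reported_line are assumptions for this example, and the line and message you get depend on the Python version you run.

```python
# unterminated_dict.py held as a string, with the same line numbering as above.
source = (
    "# unterminated_dict.py\n"
    "\n"
    "months = {\n"
    '    10: "October",\n'
    '    11: "November",\n'
    '    12: "December"\n'
    "\n"
    'print(f"{months[10]} is the tenth month")\n'
)

try:
    compile(source, "unterminated_dict.py", "exec")
except SyntaxError as err:
    reported_line = err.lineno
    print(f"line {err.lineno}: {err.msg}")
```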

There are a few other ways to mess up dictionary syntax. A typical one is forgetting a comma after one of the items:

 1 # missing_comma.py
 2
 3 months = {
 4     10: "October"
 5     11: "November",
 6     12: "December",
 7 }

In this code, a comma is missing at the end of line 4. Python 3.10 gives you a clear suggestion on how to fix your code:

  File "/home/real_python/missing_comma.py", line 4
    10: "October"
        ^^^^^^^^^
SyntaxError: invalid syntax. Perhaps you forgot a comma?

You can add the missing comma and have your code back up and running in no time.

Another common mistake is using the assignment operator (=) instead of the equality comparison operator (==) when you’re comparing values. Previously, this would just cause another invalid syntax message. In Python 3.10, you get some more advice:

>>> if month = "October":
  File "<stdin>", line 1
    if month = "October":
       ^^^^^^^^^^^^^^^^^
SyntaxError: invalid syntax. Maybe you meant '==' or ':=' instead of '='?

The parser suggests that you maybe meant to use a comparison operator or an assignment expression operator instead.

Take note of another nifty improvement in Python 3.10 error messages. The last two examples show how carets (^^^) highlight the whole offending expression. Previously, a single caret symbol (^) indicated just an approximate location.
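The range that the carets cover is also available on the exception object. The following sketch reads it with getattr() because the end_offset attribute only exists on Python 3.10 and later; start and end are names chosen for the example.

```python
try:
    compile('if month = "October":\n    pass\n', "<example>", "exec")
except SyntaxError as err:
    start = err.offset
    # end_offset was added in Python 3.10; fall back to None on older versions.
    end = getattr(err, "end_offset", None)
    print((err.text or "").rstrip())
    print("offset:", start, "end_offset:", end)
```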

The final error message improvement that you’ll play with for now is that attribute and name errors can now offer suggestions if you misspell an attribute or a name:

>>> import math
>>> math.py
AttributeError: module 'math' has no attribute 'py'. Did you mean: 'pi'?

>>> pint
NameError: name 'pint' is not defined. Did you mean: 'print'?

>>> release = "3.10"
>>> relaese
NameError: name 'relaese' is not defined. Did you mean: 'release'?

Note that the suggestions work for both built-in names and names that you define yourself, although they may not be available in all environments. If you like these kinds of suggestions, check out BetterErrorMessages, which offers similar suggestions in even more contexts.
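To get a feel for how a "Did you mean" suggestion can be produced, here is a sketch that uses difflib from the standard library as a stand-in. It is not how the interpreter computes its own suggestions, and suggest() is a hypothetical helper.

```python
import builtins
import difflib

def suggest(name, candidates):
    """Return the candidate closest to name, or None if nothing is similar."""
    matches = difflib.get_close_matches(name, candidates, n=1)
    return matches[0] if matches else None

print(suggest("pint", dir(builtins)))
print(suggest("relaese", ["release", "version"]))
```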

The improvements you’ve seen in this section are just some of the many error messages that have gotten a face-lift. The new Python will be even more user-friendly than before, and hopefully, the new error messages will save you both time and frustration going forward.

Structural Pattern Matching

The biggest new feature in Python 3.10, probably both in terms of controversy and potential impact, is structural pattern matching. Its introduction has sometimes been referred to as switch ... case coming to Python, but you’ll see that structural pattern matching is much more powerful than that.

You’ll see three different examples that together highlight why this feature is called structural pattern matching and show you how you can use this new feature:

  1. Detecting and deconstructing different structures in your data
  2. Using different kinds of patterns
  3. Matching literal patterns

Structural pattern matching is a comprehensive addition to the Python language. To give you a taste of how you can take advantage of it in your own projects, the next three subsections will dive into some of the details. You’ll also see some links that can help you explore in even more depth if you want.

Deconstructing Data Structures

At its core, structural pattern matching is about defining patterns to which your data structures can be matched. In this section, you’ll study a practical example where you’ll work with data that are structured differently, even though the meaning is the same. You’ll define several patterns, and depending on which pattern matches your data, you’ll process your data appropriately.

This section will be a bit light on explanations of the possible patterns. Instead, it will try to give you an impression of the possibilities. The next section will step back and explain the patterns in more detail.

Time to match your first pattern! The following example uses a match ... case block to find the first name of a user by extracting it from a user data structure:

>>> user = {
...     "name": {"first": "Pablo", "last": "Galindo Salgado"},
...     "title": "Python 3.10 release manager",
... }

>>> match user:
...     case {"name": {"first": first_name}}:
...         pass
...

>>> first_name
'Pablo'

You can see structural pattern matching at work in the match ... case block. user is a small dictionary with user information. The case line specifies a pattern that user is matched against. In this case, you’re looking for a dictionary with a "name" key whose value is a new dictionary. This nested dictionary has a key called "first". The corresponding value is bound to the variable first_name.

For a practical example, say that you’re processing user data where the underlying data model changes over time. Therefore, you need to be able to process different versions of the same data.

In the next example, you’ll use data from randomuser.me. This is a great API for generating random user data that you can use during testing and development. The API is also an example of an API that has changed over time. You can still access the old versions of the API.

You can get a random user from the API using requests as follows:

# random_user.py

import requests

def get_user(version="1.3"):
    """Get random users"""
    url = f"https://randomuser.me/api/{version}/?results=1"
    response = requests.get(url)
    if response:
        return response.json()["results"][0]

get_user() gets one random user in JSON format. Note the version parameter. The structure of the returned data has changed quite a bit between earlier versions like "1.1" and the current version "1.3", but in each case, the actual user data are contained in a list inside the "results" array. The function returns the first—and only—user in this list.

At the time of writing, the latest version of the API is 1.3 and the data has the following structure:

{
    "gender": "female",
    "name": {
        "title": "Miss",
        "first": "Ilona",
        "last": "Jokela"
    },
    "location": {
        "street": {
            "number": 4473,
            "name": "Mannerheimintie"
        },
        "city": "Harjavalta",
        "state": "Ostrobothnia",
        "country": "Finland",
        "postcode": 44879,
        "coordinates": {
            "latitude": "-6.0321",
            "longitude": "123.2213"
        },
        "timezone": {
            "offset": "+5:30",
            "description": "Bombay, Calcutta, Madras, New Delhi"
        }
    },
    "email": "ilona.jokela@example.com",
    "login": {
        "uuid": "632b7617-6312-4edf-9c24-d6334a6af52d",
        "username": "brownsnake482",
        "password": "biatch",
        "salt": "ofk518ZW",
        "md5": "6d589615ca44f6e583c85d45bf431c54",
        "sha1": "cd87c931d579bdff77af96c09e0eea82d1edfc19",
        "sha256": "6038ede83d4ce74116faa67fb3b1b2e6f6898e5749b57b5a0312bd46a539214a"
    },
    "dob": {
        "date": "1957-05-20T08:36:09.083Z",
        "age": 64
    },
    "registered": {
        "date": "2006-07-30T18:39:20.050Z",
        "age": 15
    },
    "phone": "07-369-318",
    "cell": "048-284-01-59",
    "id": {
        "name": "HETU",
        "value": "NaNNA204undefined"
    },
    "picture": {
        "large": "https://randomuser.me/api/portraits/women/28.jpg",
        "medium": "https://randomuser.me/api/portraits/med/women/28.jpg",
        "thumbnail": "https://randomuser.me/api/portraits/thumb/women/28.jpg"
    },
    "nat": "FI"
}

One of the members that changed between different versions is "dob", the date of birth. Note that in version 1.3, this is a JSON object with two members, "date" and "age".

Compare the result above with a version 1.1 random user:

{
    "gender": "female",
    "name": {
        "title": "miss",
        "first": "ilona",
        "last": "jokela"
    },
    "location": {
        "street": "7336 myllypuronkatu",
        "city": "kurikka",
        "state": "central ostrobothnia",
        "postcode": 53740
    },
    "email": "ilona.jokela@example.com",
    "login": {
        "username": "blackelephant837",
        "password": "sand",
        "salt": "yofk518Z",
        "md5": "b26367ea967600d679ee3e0b9bda012f",
        "sha1": "87d2910595acba5b8e8aa8b00a841bab08580e2f",
        "sha256": "73bd0d205d0dc83ae184ae222ff2e9de5ea4039119a962c4f97fabd5bbfa7aca"
    },
    "dob": "1966-04-17 11:57:01",
    "registered": "2005-08-10 10:15:01",
    "phone": "04-636-931",
    "cell": "048-828-40-15",
    "id": {
        "name": "HETU",
        "value": "366-9204"
    },
    "picture": {
        "large": "https://randomuser.me/api/portraits/women/24.jpg",
        "medium": "https://randomuser.me/api/portraits/med/women/24.jpg",
        "thumbnail": "https://randomuser.me/api/portraits/thumb/women/24.jpg"
    },
    "nat": "FI"
}

Observe that in this older format, the value of the "dob" member is a plain string.

In this example, you’ll work with the information about the date of birth (dob) for each user. The structure of these data has changed between different versions of the Random User API:

# Version 1.1
"dob": "1966-04-17 11:57:01"

# Version 1.3
"dob": {"date": "1957-05-20T08:36:09.083Z", "age": 64}

Note that in version 1.1, the date of birth is represented as a simple string, while in version 1.3, it’s a JSON object with two members: "date" and "age". Say that you want to find the age of a user. Depending on the structure of your data, you’d either need to calculate the age based on the date of birth or look up the age if it’s already available.

Traditionally, you would detect the structure of the data with an if test, maybe based on the type of the "dob" field. You can approach this differently in Python 3.10. Now, you can use structural pattern matching instead:

 1 # random_user.py (continued)
 2
 3 from datetime import datetime
 4
 5 def get_age(user):
 6     """Get the age of a user"""
 7     match user:
 8         case {"dob": {"age": int(age)}}:
 9             return age
10         case {"dob": dob}:
11             now = datetime.now()
12             dob_date = datetime.strptime(dob, "%Y-%m-%d %H:%M:%S")
13             return now.year - dob_date.year

The match ... case construct is new in Python 3.10 and is how you perform structural pattern matching. You start with a match statement that specifies what you want to match. In this example, that’s the user data structure.

One or several case statements follow match. Each case describes one pattern, and the indented block beneath it says what should happen if there’s a match. In this example:

  • Line 8 matches a dictionary with a "dob" key whose value is another dictionary with an integer (int) item named "age". The name age captures its value.

  • Line 10 matches any dictionary with a "dob" key. The name dob captures its value.

One important feature of pattern matching is that at most one pattern will be matched. Since the pattern on line 10 matches any dictionary with "dob", it’s important that the more specific pattern on line 8 comes first.
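For comparison, the traditional approach based on if tests might look roughly like the sketch below. get_age_with_if() is a hypothetical name, and the function assumes that user is a dictionary. Unlike get_age(), it returns None when "dob" holds something other than a dictionary or a string.

```python
from datetime import datetime

def get_age_with_if(user):
    """Get the age of a user without structural pattern matching."""
    dob = user.get("dob")
    # The more specific structure is checked first, as in the match version.
    if isinstance(dob, dict) and isinstance(dob.get("age"), int):
        return dob["age"]
    if isinstance(dob, str):
        dob_date = datetime.strptime(dob, "%Y-%m-%d %H:%M:%S")
        return datetime.now().year - dob_date.year
    return None

print(get_age_with_if({"dob": {"date": "1957-05-20T08:36:09.083Z", "age": 64}}))
print(get_age_with_if({"dob": "1966-04-17 11:57:01"}))
```

The type checks that the pattern expresses in a single line have to be spelled out one by one here, which is the main difference in readability.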

Before looking closer at the details of the patterns and how they work, try calling get_age() with different data structures to see the result:

>>> import random_user

>>> users11 = random_user.get_user(version="1.1")
>>> random_user.get_age(users11)
55

>>> users13 = random_user.get_user(version="1.3")
>>> random_user.get_age(users13)
64

Your code can calculate the age correctly for both versions of the user data, which have different dates of birth.

Look closer at those patterns. The first pattern, {"dob": {"age": int(age)}}, matches version 1.3 of the user data:

{
    ...
    "dob": {"date": "1957-05-20T08:36:09.083Z", "age": 64},
    ...
}

The first pattern is a nested pattern. The outer curly braces say that a dictionary with the key "dob" is required. The corresponding value should be a dictionary. This nested dictionary must match the subpattern {"age": int(age)}. In other words, it needs to have an "age" key with an integer value. That value is bound to the name age.

The second pattern, {"dob": dob}, matches the older version 1.1 of the user data:

{
    ...
    "dob": "1966-04-17 11:57:01",
    ...
}

This second pattern is a simpler pattern than the first one. Again, the curly braces indicate that it will match a dictionary. However, any dictionary with a "dob" key is matched because there are no other restrictions specified. The value of that key is bound to the name dob.

The main takeaway is that you can describe the structure of your data using mostly familiar notation. One striking change, though, is that you can use names like dob and age, which aren’t yet defined. Instead, values from your data are bound to these names when a pattern matches.

You’ve explored some of the power of structural pattern matching in this example. In the next section, you’ll dive a bit more into the details.

Using Different Kinds of Patterns

You’ve seen an example of how you can use patterns to effectively unravel complicated data structures. Now, you’ll take a step back and look at the building blocks that make up this new feature. Many things come together to make it work. In fact, there are three Python Enhancement Proposals (PEPs) that describe structural pattern matching:

  1. PEP 634: Specification
  2. PEP 635: Motivation and Rationale
  3. PEP 636: Tutorial

These documents give you a lot of background and detail if you’re interested in a deeper dive than what follows.

Patterns are at the center of structural pattern matching. In this section, you’ll learn about some of the different kinds of patterns that exist:

  • Mapping patterns match mapping structures like dictionaries.
  • Sequence patterns match sequence structures like tuples and lists.
  • Capture patterns bind values to names.
  • AS patterns bind the value of subpatterns to names.
  • OR patterns match one of several different subpatterns.
  • Wildcard patterns match anything.
  • Class patterns match class structures.
  • Value patterns match values stored in attributes.
  • Literal patterns match literal values.

You already used several of them in the example in the previous section. In particular, you used mapping patterns to unravel data stored in dictionaries. In this section, you’ll learn more about how some of these work. All the details are available in the PEPs mentioned above.

A capture pattern is used to capture a match to a pattern and bind it to a name. Consider the following recursive function that sums a list of numbers:

 1 def sum_list(numbers):
 2     match numbers:
 3         case []:
 4             return 0
 5         case [first, *rest]:
 6             return first + sum_list(rest)

The first case on line 3 matches the empty list and returns 0 as its sum. The second case on line 5 uses a sequence pattern with two capture patterns to match lists with one or more elements. The first element in the list is captured and bound to the name first. The second capture pattern, *rest, uses unpacking syntax to match any number of elements. rest will bind to a list containing all elements of numbers except the first one.

sum_list() calculates the sum of a list of numbers by recursively adding the first number in the list and the sum of the rest of the numbers. You can use it as follows:

>>> sum_list([4, 5, 9, 4])
22

The sum of 4 + 5 + 9 + 4 is correctly calculated to be 22. As an exercise for yourself, you can try to trace the recursive calls to sum_list() to make sure you understand how the code sums the whole list.

sum_list() handles summing up a list of numbers. Observe what happens if you try to sum anything that isn’t a list:

>>> print(sum_list("4594"))
None

>>> print(sum_list(4594))
None

Passing a string or a number to sum_list() returns None. This occurs because none of the patterns match, and the execution continues after the match block. That happens to be the end of the function, so sum_list() implicitly returns None.

Often, though, you want to be alerted about failed matches. You can add a catchall pattern as the final case that handles this by raising an error, for example. You can use the underscore (_) as a wildcard pattern that matches anything without binding it to a name. You can add some error handling to sum_list() as follows:

def sum_list(numbers):
    match numbers:
        case []:
            return 0
        case [first, *rest]:
            return first + sum_list(rest)
        case _:
            wrong_type = numbers.__class__.__name__
            raise ValueError(f"Can only sum lists, not {wrong_type!r}")

The final case will match anything that doesn’t match the first two patterns. This will raise a descriptive error, for instance, if you try to calculate sum_list(4594). This is useful when you need to alert your users that some input was not matched as expected.

Your patterns are still not foolproof, though. Consider what happens if you try to sum a list of strings:

>>> sum_list(["45", "94"])
TypeError: can only concatenate str (not "int") to str

The base case returns 0, so the summing only works for types that you can add to numbers. Python doesn’t know how to add numbers and text strings together. You can restrict your pattern to only match integers using a class pattern:

def sum_list(numbers):
    match numbers:
        case []:
            return 0
        case [int(first), *rest]:
            return first + sum_list(rest)
        case _:
            raise ValueError("Can only sum lists of numbers")

Adding int() around first makes sure that the pattern only matches if the value is an integer. This might be too restrictive, though. Your function should be able to sum both integers and floating-point numbers, so how can you allow this in your pattern?

To check whether at least one out of several subpatterns match, you can use an OR pattern. OR patterns consist of two or more subpatterns, and the pattern matches if at least one of the subpatterns does. You can use this to match when the first element is either of type int or type float:

def sum_list(numbers):
    match numbers:
        case []:
            return 0
        case [int(first) | float(first), *rest]:
            return first + sum_list(rest)
        case _:
            raise ValueError("Can only sum lists of numbers")

You use the pipe symbol (|) to separate the subpatterns in an OR pattern. Your function now allows summing a list of floating-point numbers:

>>> sum_list([45.94, 46.17, 46.72])
138.82999999999998

There’s a lot of power and flexibility within structural pattern matching, even more than what you’ve seen so far. Some things that aren’t covered in this overview are:

  • Using guards to restrict patterns
  • Using AS patterns to capture the value of subpatterns
  • Using class patterns to match custom enums and data classes

If you’re interested, have a look in the documentation to learn more about these features as well. In the next section, you’ll learn about literal patterns and value patterns.

Matching Literal Patterns

A literal pattern is a pattern that matches a literal object like an explicit string or number. In a sense, this is the most basic kind of pattern and allows you to emulate switch ... case statements seen in other languages. The following example matches a specific name:

def greet(name):
    match name:
        case "Guido":
            print("Hi, Guido!")
        case _:
            print("Howdy, stranger!")

The first case matches the literal string "Guido". In this case, you use _ as a wildcard to print a generic greeting whenever name is not "Guido". Such literal patterns can sometimes take the place of if ... elif ... else constructs and can play the same role that switch ... case does in some other languages.

One limitation with structural pattern matching is that you can’t directly match values stored in variables. Say that you’ve defined bdfl = "Guido". A pattern like case bdfl: will not match "Guido". Instead, this will be interpreted as a capture pattern that matches anything and binds that value to bdfl, effectively overwriting the old value.

You can, however, use a value pattern to match stored values. A value pattern looks a bit like a capture pattern but uses a previously defined dotted name that holds the value that will be matched against.

You can, for example, use an enumeration to create such dotted names:

import enum

class Pythonista(str, enum.Enum):
    BDFL = "Guido"
    FLUFL = "Barry"

def greet(name):
    match name:
        case Pythonista.BDFL:
            print("Hi, Guido!")
        case _:
            print("Howdy, stranger!")

The first case now uses a value pattern to match Pythonista.BDFL, which is "Guido". Note that you can use any dotted name in a value pattern. You could, for example, have used a regular class or a module instead of the enumeration.

To see a bigger example of how to use literal patterns, consider the game of FizzBuzz. This is a counting game where you should replace some numbers with words according to the following rules:

  • You replace numbers divisible by 3 with fizz.
  • You replace numbers divisible by 5 with buzz.
  • You replace numbers divisible by both 3 and 5 with fizzbuzz.

FizzBuzz is sometimes used to introduce conditionals in programming education and as a screening problem in interviews. Even though a solution is quite straightforward, Joel Grus has written a full book about different ways to program the game.

A typical solution in Python will use if ... elif ... else as follows:

def fizzbuzz(number):
    mod_3 = number % 3
    mod_5 = number % 5

    if mod_3 == 0 and mod_5 == 0:
        return "fizzbuzz"
    elif mod_3 == 0:
        return "fizz"
    elif mod_5 == 0:
        return "buzz"
    else:
        return str(number)

The % operator calculates the modulus, which you can use to test divisibility. Namely, if a modulus b is 0 for two numbers a and b, then a is divisible by b.

In fizzbuzz(), you calculate number % 3 and number % 5, which you then use to test for divisibility with 3 and 5. Note that you must do the test for divisibility with both 3 and 5 first. If not, numbers that are divisible by both 3 and 5 will be covered by either the "fizz" or the "buzz" cases instead.

You can check that your implementation gives the expected result:

>>> fizzbuzz(3)
'fizz'

>>> fizzbuzz(14)
'14'

>>> fizzbuzz(15)
'fizzbuzz'

>>> fizzbuzz(92)
'92'

>>> fizzbuzz(65)
'buzz'

You can confirm for yourself that 3 is divisible by 3, 65 is divisible by 5, and 15 is divisible by both 3 and 5, while 14 and 92 aren’t divisible by either 3 or 5.

An if ... elif ... else structure where you’re comparing one or a few variables several times over is quite straightforward to rewrite using pattern matching instead. For example, you can do the following:

def fizzbuzz(number):
    mod_3 = number % 3
    mod_5 = number % 5

    match (mod_3, mod_5):
        case (0, 0):
            return "fizzbuzz"
        case (0, _):
            return "fizz"
        case (_, 0):
            return "buzz"
        case _:
            return str(number)

You match on both mod_3 and mod_5. Each case pattern then matches either the literal number 0 or the wildcard _ on the corresponding values.

Compare and contrast this version with the previous one. Note how the pattern (0, 0) corresponds to the test mod_3 == 0 and mod_5 == 0, while (0, _) corresponds to mod_3 == 0.

As you saw earlier, you can use an OR pattern to match on several different patterns. For example, since mod_3 can only take the values 0, 1, and 2, you can replace case (_, 0) with case (1, 0) | (2, 0). Remember that (0, 0) has already been covered.

The Python core developers have consciously chosen not to include switch ... case statements in the language earlier. However, there are some third-party packages that do, like switchlang, which adds a switch command that also works on earlier versions of Python.

Type Unions, Aliases, and Guards

Reliably, each new Python release brings some improvements to the static typing system. Python 3.10 is no exception. In fact, four different PEPs about typing accompany this new release:

  1. PEP 604: Allow writing union types as X | Y
  2. PEP 613: Explicit Type Aliases
  3. PEP 647: User-Defined Type Guards
  4. PEP 612: Parameter Specification Variables

PEP 604 will probably be the most widely used of these changes going forward, but you’ll get a brief overview of each of the features in this section.

You can use union types to declare that a variable can have one of several different types. For example, you’ve been able to type hint a function calculating the mean of a list of numbers, either floats or integers, as follows:

from typing import List, Union

def mean(numbers: List[Union[float, int]]) -> float:
    return sum(numbers) / len(numbers)

The annotation List[Union[float, int]] means that numbers should be a list where each element is either a floating-point number or an integer. This works well, but the notation is a bit verbose. Also, you need to import both List and Union from typing.

In Python 3.10, you can replace Union[float, int] with the more succinct float | int. Combine this with the ability to use list instead of typing.List in type hints, which Python 3.9 introduced. You can then simplify your code while keeping all the type information:

def mean(numbers: list[float | int]) -> float:
    return sum(numbers) / len(numbers)

The annotation of numbers is easier to read now, and as an added bonus, you didn’t need to import anything from typing.

A special case of union types is when a variable can have either a specific type or be None. You can annotate such optional types either as Union[None, T] or, equivalently, Optional[T] for some type T. There is no new, special syntax for optional types, but you can use the new union syntax to avoid importing typing.Optional:

In this example, address is allowed to be either None or a string.

You can also use the new union syntax at runtime in isinstance() or issubclass() tests:

>>> isinstance("mypy", str | int)
True

>>> issubclass(str, int | float | bytes)
False

Traditionally, you’ve used tuples to test for several types at once—for example, (str, int) instead of str | int. This old syntax will still work.

Type aliases allow you to quickly define new aliases that can stand in for more complicated type declarations. For example, say that you’re representing a playing card using a tuple of suit and rank strings and a deck of cards by a list of such playing card tuples. A deck of cards is then type hinted as list[tuple[str, str]].

To simplify type annotation, you define type aliases as follows:

Card = tuple[str, str]
Deck = list[Card]

This usually works okay. However, it’s often not possible for the type checker to know whether such a statement is a type alias or just the definition of a regular global variable. To help the type checker—or really, help the type checker help you—you can now explicitly annotate type aliases:

from typing import TypeAlias

Card: TypeAlias = tuple[str, str]
Deck: TypeAlias = list[Card]

Adding the TypeAlias annotation clarifies the intention, both to a type checker and to anyone reading your code.

Type guards are used to narrow down union types. The following function takes in either a string or None but always returns a tuple of strings representing a playing card:

def get_ace(suit: str | None) -> tuple[str, str]:
    if suit is None:
        suit = "♠"
    return (suit, "A")

The if suit is None test works as a type guard, and static type checkers are able to realize that suit is necessarily a string when it’s returned.

Currently, the type checkers can only use a few different constructs to narrow down union types in this way. With the new typing.TypeGuard, you can annotate custom functions that can be used to narrow down union types:

from typing import Any, TypeAlias, TypeGuard

Card: TypeAlias = tuple[str, str]
Deck: TypeAlias = list[Card]

def is_deck_of_cards(obj: Any) -> TypeGuard[Deck]:
    # Return True if obj is a deck of cards, otherwise False
    ...

is_deck_of_cards() should return True or False depending on whether obj represents a Deck object or not. You can then use your guard function, and the type checker will be able to narrow down the types correctly:

def get_score(card_or_deck: Card | Deck) -> int:
    if is_deck_of_cards(card_or_deck):
        # Calculate score of a deck of cards
        ...
    ...

Inside of the if block, the type checker knows that card_or_deck is, in fact, of the type Deck. See PEP 647 for more details.

The final new typing feature is Parameter Specification Variables, which is related to type variables. Consider the definition of a decorator. In general, it looks something like the following:

import functools
from typing import Any, Callable, TypeVar

R = TypeVar("R")

def decorator(func: Callable[..., R]) -> Callable[..., R]:
    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> R:
        ...
    return wrapper

The annotations mean that the function returned by the decorator is a callable with some parameters and the same return type, R, as the function passed into the decorator. The ellipsis (...) in the function header correctly allows any number of parameters, and each of those parameters can be of any type. However, there’s no validation that the returned callable has the same parameters as the function that was passed in. In practice, this means that type checkers aren’t able to check decorated functions properly.

Unfortunately, you can’t use TypeVar for the parameters because you don’t know how many parameters the function will have. In Python 3.10, you’ll have access to ParamSpec in order to type hint these kinds of callables properly. ParamSpec works similarly to TypeVar but stands in for several parameters at once. You can rewrite your decorator as follows to take advantage of ParamSpec:

import functools
from typing import Callable, ParamSpec, TypeVar

P = ParamSpec("P")
R = TypeVar("R")

def decorator(func: Callable[P, R]) -> Callable[P, R]:
    @functools.wraps(func)
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        ...
    return wrapper

Note that you also use P when you annotate wrapper(). You can also use the new typing.Concatenate to add types to ParamSpec. See the documentation and PEP 612 for details and examples.

Stricter Zipping of Sequences

zip() is a built-in function in Python that can combine elements from several sequences. Python 3.10 introduces the new strict parameter, which adds a runtime test to check that all sequences being zipped have the same length.

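As an example, consider the following table of Lego sets:

Name                  Set number   Pieces
Louvre                21024        695
Diagon Alley          75978        5544
Saturn V              92176        1969
Millennium Falcon     75192        7541
New York City (NYC)   21028        598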

One way to represent these data in plain Python would be with each column as a list. It could look something like this:

>>> names = ["Louvre", "Diagon Alley", "Saturn V", "Millennium Falcon", "NYC"]
>>> set_numbers = ["21024", "75978", "92176", "75192", "21028"]
>>> num_pieces = [695, 5544, 1969, 7541, 598]

Note that you have three independent lists, but there’s an implicit correspondence between their elements. The first name ("Louvre"), the first set number ("21024"), and the first number of pieces (695) all describe the first Lego set.

zip() can be used to iterate over these three lists in parallel:

>>> for name, num, pieces in zip(names, set_numbers, num_pieces):
...     print(f"{name} ({num}): {pieces} pieces")
...
Louvre (21024): 695 pieces
Diagon Alley (75978): 5544 pieces
Saturn V (92176): 1969 pieces
Millennium Falcon (75192): 7541 pieces
NYC (21028): 598 pieces

Note how each line collects information from all three lists and shows information about one particular set. This is a very common pattern that’s used in a lot of different Python code, including in the standard library.

You can also add list() to collect the contents of all three lists in a single, nested list of tuples:

>>> list(zip(names, set_numbers, num_pieces))
[('Louvre', '21024', 695),
 ('Diagon Alley', '75978', 5544),
 ('Saturn V', '92176', 1969),
 ('Millennium Falcon', '75192', 7541),
 ('NYC', '21028', 598)]

Note how the nested list closely resembles the original table.

The dark side of using zip() is that it’s quite easy to introduce a subtle bug that can be hard to discover. Note what happens if there’s a missing item in one of your lists:

>>> set_numbers = ["21024", "75978", "75192", "21028"]  # Saturn V missing

>>> list(zip(names, set_numbers, num_pieces))
[('Louvre', '21024', 695),
 ('Diagon Alley', '75978', 5544),
 ('Saturn V', '75192', 1969),
 ('Millennium Falcon', '21028', 7541)]

All the information about the New York City set disappeared! Additionally, the set numbers for Saturn V and Millennium Falcon are wrong. If your datasets are bigger, these kinds of errors can be very hard to discover. And even when you observe that something’s wrong, it’s not always easy to diagnose and fix.

The issue is that you assumed that the three lists have the same number of elements and that the information is in the same order in each list. After set_numbers gets corrupted, this assumption is no longer true.

PEP 618 introduces a new strict keyword parameter to zip() that you can use to confirm all sequences have the same length. In your example, it would raise an error alerting you to the corrupted list:

>>> list(zip(names, set_numbers, num_pieces, strict=True))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: zip() argument 2 is shorter than argument 1

When the iteration reaches the New York City Lego set, the second argument set_numbers is already exhausted, while there are still elements left in the first argument names. Instead of silently giving the wrong result, your code fails with an error, and you can take action to find and fix the mistake.

There are use cases when you want to combine sequences of unequal length. The following examples show how zip() and itertools.zip_longest() handle these.

The following idiom divides the Lego sets into pairs:

>>> num_per_group = 2
>>> list(zip(*[iter(names)] * num_per_group))
[('Louvre', 'Diagon Alley'), ('Saturn V', 'Millennium Falcon')]

There are five sets, a number that doesn’t divide evenly into pairs. In this case, the default behavior of zip(), where the last element is dropped, might make sense. You could use strict=True here as well, but that would raise an error when your list can’t be split into pairs. A third option, which could be the best in this case, is to use zip_longest() from the itertools module in the standard library.

As the name suggests, zip_longest() combines sequences until the longest sequence is exhausted. If you use zip_longest() to divide the Lego sets, it becomes more explicit that New York City doesn’t have any pairing:

>>> from itertools import zip_longest

>>> list(zip_longest(*[iter(names)] * num_per_group, fillvalue=""))
[('Louvre', 'Diagon Alley'),
 ('Saturn V', 'Millennium Falcon'),
 ('NYC', '')]

Note that 'NYC' shows up in the last tuple together with an empty string. You can control what’s filled in for missing values with the fillvalue parameter.

While strict is not really adding any new functionality to zip(), it can help you avoid those hard-to-find bugs.

New Functions in the statistics Module

The statistics module was added to the standard library all the way back in 2014 with the release of Python 3.4. The intent of statistics is to make statistical calculations at the level of graphing calculators available in Python.

Python 3.10 adds a few multivariable functions to statistics:

  • correlation() to calculate Pearson’s correlation coefficient for two variables
  • covariance() to calculate sample covariance for two variables
  • linear_regression() to calculate the slope and intercept in a linear regression

You can use each function to describe a certain aspect of the relationship between two variables. As an example, say that you have data from a set of blog posts—the number of words in each blog post and the number of views each post has had over some time period:

>>> words = [7742, 11539, 16898, 13447, 4608, 6628, 2683, 6156, 2623, 6948]
>>> views = [8368, 5901, 3978, 3329, 2611, 2096, 1515, 1177, 814, 467]

You now want to investigate whether there’s any (linear) relationship between the number of words and number of views. In Python 3.10, you can calculate the correlation between words and views with the new correlation() function:

>>> import statistics

>>> statistics.correlation(words, views)
0.454180067865917

The correlation between two variables is always a number between -1 and 1. If it’s close to 0, then there’s little correspondence between them, while a correlation close to -1 or 1 indicates that the behaviors of the two variables tend to follow each other. In this example, a correlation of 0.45 indicates that there’s a tendency for posts with more words to have more views, although it’s not a strong connection.

You can also calculate the covariance between words and views. The covariance is another measure of the joint variability between two variables. You can calculate it with covariance():

>>> import statistics

>>> statistics.covariance(words, views)
5292289.977777777

In contrast to correlation, covariance is an absolute measure. It should be interpreted in the context of the variability within the variables themselves. In fact, you can normalize the covariance by the standard deviation of each variable to recover Pearson’s correlation coefficient:

>>> import statistics

>>> cov = statistics.covariance(words, views)
>>> σ_words, σ_views = statistics.stdev(words), statistics.stdev(views)
>>> cov / (σ_words * σ_views)
0.454180067865917

Note that this matches your earlier correlation coefficient exactly.

A third way of looking at the linear correspondence between the two variables is through simple linear regression. You do the linear regression by calculating two numbers, slope and intercept, so that the (squared) error is minimized in the approximation number of views = slope × number of words + intercept.

In Python 3.10, you can use linear_regression():

>>> import statistics

>>> statistics.linear_regression(words, views)
LinearRegression(slope=0.2424443064354672, intercept=1103.6954940247645)

Based on this regression, a post with 10,074 words could expect about 0.2424 × 10074 + 1104 = 3546 views. However, as you saw earlier, the correlation between the number of words and the number of views is quite weak. Therefore, you shouldn’t expect this prediction to be very accurate.

The LinearRegression object is a named tuple. This means that you can unpack the slope and intercept directly:

>>> import statistics

>>> slope, intercept = statistics.linear_regression(words, views)
>>> slope * 10074 + intercept
3546.0794370556605

Here, you use slope and intercept to predict the number of views on a blog post with 10,074 words.

You still want to use some of the more advanced packages like pandas and statsmodels if you do a lot of statistical analysis. With the new additions to statistics in Python 3.10, however, you have the chance to do basic analysis more easily without bringing in third-party dependencies.

Other Pretty Cool Features

So far, you’ve seen the biggest and most impactful new features in Python 3.10. In this section, you’ll get a glimpse of a few of the other changes that the new version brings along. If you’re curious about all the changes made for this new version, check out the documentation.

Default Text Encodings

When you open a text file, the default encoding used to interpret the characters is system dependent. In particular, locale.getpreferredencoding() is used. On Mac and Linux, this usually returns "UTF-8", while the result on Windows is more varied.

You should therefore always specify an encoding when you attempt to open a text file:

with open("some_file.txt", mode="r", encoding="utf-8") as file:
    ...  # Do something with file

If you don’t explicitly specify an encoding, the preferred locale encoding is used, and a file that can be read on one computer may fail to open on another.

Python 3.7 introduced UTF-8 mode, which allows you to force your programs to use UTF-8 encoding independent of the locale encoding. You can enable UTF-8 mode by giving the -X utf8 command-line option to the python executable or by setting the PYTHONUTF8 environment variable.
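
To check how your own interpreter is configured, you can inspect sys.flags and the preferred encoding. This is a small sketch, and the variable names are arbitrary:

```python
import locale
import sys

# True when Python was started with -X utf8 or PYTHONUTF8=1
utf8_mode = bool(sys.flags.utf8_mode)

# The encoding open() falls back to when you don't specify one
preferred = locale.getpreferredencoding(False)

print(f"UTF-8 mode enabled: {utf8_mode}")
print(f"Preferred encoding: {preferred}")
```

The output depends on your operating system and settings, which is exactly why relying on the default is fragile.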

In Python 3.10, you can activate a warning that will tell you when a text file is opened without a specified encoding. Consider the following script, which doesn’t specify an encoding:

# mirror.py

import pathlib
import sys

def mirror_file(filename):
    for line in pathlib.Path(filename).open(mode="r"):
        print(f"{line.rstrip()[::-1]:>72}")

if __name__ == "__main__":
    for filename in sys.argv[1:]:
        mirror_file(filename)

The program will echo one or more text files back to the console, but with each line reversed. Run the program on itself with the encoding warning enabled:

$ python -X warn_default_encoding mirror.py mirror.py
/home/rp/mirror.py:7: EncodingWarning: 'encoding' argument not specified
  for line in pathlib.Path(filename).open(mode="r"):
                                                             yp.rorrim #

                                                          bilhtap tropmi
                                                              sys tropmi

                                              :)emanelif(elif_rorrim fed
                  :)"r"=edom(nepo.)emanelif(htaP.bilhtap ni enil rof
                             )"}27>:]1-::[)(pirtsr.enil{"f(tnirp

                                              :"__niam__" == __eman__ fi
                                       :]:1[vgra.sys ni emanelif rof
                                           )emanelif(elif_rorrim

Note the EncodingWarning printed to the console. The command-line option -X warn_default_encoding activates it. The warning will disappear if you specify an encoding—for example, encoding="utf-8"—when you open the file.
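
A version of the script that no longer triggers the warning could look like the following sketch. Splitting the work into mirrored_lines() and using a temporary sample file are choices made here to keep the example self-contained:

```python
import pathlib
import tempfile


def mirrored_lines(filename):
    """Return each line of the file reversed and right-aligned to 72 characters"""
    # Passing encoding explicitly keeps EncodingWarning from being triggered
    with pathlib.Path(filename).open(mode="r", encoding="utf-8") as file:
        return [f"{line.rstrip()[::-1]:>72}" for line in file]


def mirror_file(filename):
    for line in mirrored_lines(filename):
        print(line)


# Try it out on a small temporary file
with tempfile.TemporaryDirectory() as folder:
    sample = pathlib.Path(folder) / "sample.txt"
    sample.write_text("import sys\n", encoding="utf-8")
    lines = mirrored_lines(sample)

print(lines[0].strip())  # sys tropmi
```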

There are times when you want to use the user-defined locale encoding. You can still do so by explicitly using encoding="locale". However, it’s recommended to use UTF-8 whenever possible. You can check out PEP 597 for more information.

Asynchronous Iteration

Asynchronous programming is a powerful programming paradigm that’s been available in Python since version 3.5. You can recognize an asynchronous program by its use of the async keyword or special methods that start with .__a like .__aiter__() or .__aenter__().

In Python 3.10, two new asynchronous built-in functions are added: aiter() and anext(). In practice, these functions call the .__aiter__() and .__anext__() special methods—analogous to the regular iter() and next()—so no new functionality is added. These are convenience functions that make your code more readable.

In other words, in the newest version of Python, the following statements—where things is an asynchronous iterable—are equivalent:

>>> it = things.__aiter__()
>>> it = aiter(things)

In either case, it ends up as an asynchronous iterator. The following is a complete example using aiter() and anext().

The following program counts the number of lines in several files. In practice, you use Python’s ability to iterate over files to count the number of lines. The script uses asynchronous iteration in order to handle several files concurrently.

Note that you need to install the third-party aiofiles package with pip before running this code:

# line_count.py

import asyncio
import sys
import aiofiles

async def count_lines(filename):
    """Count the number of lines in the given file"""
    num_lines = 0

    async with aiofiles.open(filename, mode="r") as file:
        lines = aiter(file)
        while True:
            try:
                await anext(lines)
                num_lines += 1
            except StopAsyncIteration:
                break

    print(f"{filename}: {num_lines}")

async def count_all_files(filenames):
    """Asynchronously count lines in all files"""
    tasks = [asyncio.create_task(count_lines(f)) for f in filenames]
    await asyncio.gather(*tasks)

if __name__ == "__main__":
    asyncio.run(count_all_files(filenames=sys.argv[1:]))

asyncio is used to create and run one asynchronous task per filename. count_lines() opens one file asynchronously and iterates through it using aiter() and anext() in order to count the number of lines.

See PEP 525 to learn more about asynchronous iteration.

Context Manager Syntax

Context managers are great for managing resources in your programs. Until recently, though, their syntax has included an uncommon wart. You haven’t been allowed to use parentheses to break long with statements like this:

with (
    read_path.open(mode="r", encoding="utf-8") as read_file,
    write_path.open(mode="w", encoding="utf-8") as write_file,
):
    ...

In earlier versions of Python, this causes an invalid syntax error message. Instead, you need to use a backslash (\) if you want to control where you break your lines:

with read_path.open(mode="r", encoding="utf-8") as read_file, \
     write_path.open(mode="w", encoding="utf-8") as write_file:
    ...

While explicit line continuation with backslashes is possible in Python, PEP 8 discourages it. The Black formatting tool avoids backslashes completely.

In Python 3.10, you’re now allowed to add parentheses around with statements to your heart’s content. Especially if you’re employing several context managers at once, like in the example above, this can help improve the readability of your code. Python’s documentation shows a few other possibilities with this new syntax.
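
Here’s a self-contained sketch that you can run on Python 3.10 to try out the new syntax. The file names and the uppercasing step are made up for the example, and a temporary directory is used so that nothing is left behind:

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as folder:
    read_path = pathlib.Path(folder) / "source.txt"
    write_path = pathlib.Path(folder) / "copy.txt"
    read_path.write_text("Hello, Python 3.10!\n", encoding="utf-8")

    # Parenthesized context managers, one per line, with a trailing comma
    with (
        read_path.open(mode="r", encoding="utf-8") as read_file,
        write_path.open(mode="w", encoding="utf-8") as write_file,
    ):
        for line in read_file:
            write_file.write(line.upper())

    copied = write_path.read_text(encoding="utf-8")

print(copied)
```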

One small fun fact: parenthesized with statements actually work in version 3.9 of CPython. Their implementation came almost for free with the introduction of the PEG parser in Python 3.9. The reason that this is called a Python 3.10 feature is that using the PEG parser is voluntary in Python 3.9, while Python 3.9, with the old LL(1) parser, doesn’t support parenthesized with statements.

Modern and Secure SSL

Security can be challenging! A good rule of thumb is to avoid rolling your own security algorithms and instead rely on established packages.

Python uses OpenSSL for different cryptographic features that are exposed in the hashlib, hmac, and ssl standard library modules. Your system can manage OpenSSL, or a Python installer can include OpenSSL.
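
To see which OpenSSL version your own interpreter is linked against, you can ask the ssl module. This is a quick sketch, and the output depends on your installation:

```python
import ssl

# Human-readable version string of the linked OpenSSL library
print(ssl.OPENSSL_VERSION)

# The same information as a tuple of integers
print(ssl.OPENSSL_VERSION_INFO)
```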

Python 3.9 supports using any of the OpenSSL versions 1.0.2 LTS, 1.1.0, and 1.1.1 LTS. Both OpenSSL 1.0.2 LTS and OpenSSL 1.1.0 are past their lifetime, so Python 3.10 will only support OpenSSL 1.1.1 LTS, as described in the following table:

OpenSSL version   Python 3.9   Python 3.10   End-of-life
1.0.2 LTS         Yes          No            December 20, 2019
1.1.0             Yes          No            September 10, 2019
1.1.1 LTS         Yes          Yes           September 11, 2023

This end of support for older versions will only affect you if you need to upgrade the system Python on an older operating system. If you use macOS or Windows, or if you install Python from python.org or use (Ana)Conda, you’ll see no change.

However, Ubuntu 18.04 LTS uses OpenSSL 1.1.0, while Red Hat Enterprise Linux (RHEL) 7 and CentOS 7 both use OpenSSL 1.0.2 LTS. If you need to run Python 3.10 on these systems, you should look at installing it yourself using either the python.org or Conda installer.

Dropping support for older versions of OpenSSL will make Python more secure. It’ll also help the Python developers in that code will be easier to maintain. Ultimately, this helps you because your Python experience will be more robust. See PEP 644 for more details.

More Information About Your Python Interpreter

The sys module contains a lot of information about your system, the current Python runtime, and the script currently being executed. You can, for example, inquire about the paths where Python looks for modules with sys.path and see all modules that have been imported in the current session with sys.modules.

In Python 3.10, sys has two new attributes. First, you can now get a list of the names of all modules in the standard library:

>>> import sys

>>> len(sys.stdlib_module_names)
302

>>> sorted(sys.stdlib_module_names)[-5:]
['zipapp', 'zipfile', 'zipimport', 'zlib', 'zoneinfo']

Here, you can see that there are around 300 modules in the standard library, several of which start with the letter z. Note that only top-level modules and packages are listed. Subpackages like importlib.metadata don’t get a separate entry.

You will probably not be using sys.stdlib_module_names all that often. Still, the list ties in nicely with similar introspection features like keyword.kwlist and sys.builtin_module_names.

One possible use case for the new attribute is to identify which of the currently imported modules are third-party dependencies:

>>> import pandas as pd
>>> import sys

>>> {m for m in sys.modules if "." not in m} - sys.stdlib_module_names
{'__main__', 'numpy', '_cython_0_29_24', 'dateutil', 'pytz',
 'six', 'pandas', 'cython_runtime'}

You find the imported top-level modules by looking at names in sys.modules that don’t have a dot in their name. By comparing them to the standard library module names, you find that numpy, dateutil, and pandas are some of the imported third-party modules in this example.

The other new attribute is sys.orig_argv. This is related to sys.argv, which holds the command-line arguments given to your program when it was started. In contrast, sys.orig_argv lists the command-line arguments passed to the python executable itself. Consider the following example:

# argvs.py

import sys

print(f"argv: {sys.argv}")
print(f"orig_argv: {sys.orig_argv}")

This script echoes back the orig_argv and argv lists. Run it to see how the information is captured:

$ python -X utf8 -O argvs.py 3.10 --upgrade
argv: ['argvs.py', '3.10', '--upgrade']
orig_argv: ['python', '-X', 'utf8', '-O', 'argvs.py', '3.10', '--upgrade']

Essentially, all arguments—including the name of the Python executable—end up in orig_argv. This is in contrast to argv, which only contains the arguments that aren’t handled by python itself.

Again, this is not a feature that you’ll use a lot. If your program needs to concern itself with how it’s being run, you’re usually better off relying on information that’s already exposed instead of trying to parse this list. For example, you can choose to use the strict zip() mode only when your script is not running with the optimized flag, -O, like this:

list(zip(names, set_numbers, num_pieces, strict=__debug__))

The __debug__ flag is set when the interpreter starts. It’ll be False if you’re running python with -O or -OO specified, and True otherwise. Using __debug__ is usually preferable to "-O" not in sys.orig_argv or some similar construct.

One of the motivating use cases for sys.orig_argv is that you can use it to spawn a new Python process with the same or modified command-line arguments as your current process.

Future Annotations

Annotations were introduced in Python 3 to give you a way to attach metadata to variables, function parameters, and return values. They are most commonly used to add type hints to your code.

One challenge with annotations is that they must be valid Python code. For one thing, this makes it hard to type hint recursive classes. PEP 563 introduced postponed evaluation of annotations, making it possible to annotate with names that haven’t yet been defined. Since Python 3.7, you can activate postponed evaluation of annotations with a __future__ import:

from __future__ import annotations

The intention was that postponed evaluation would become the default at some point in the future. After the 2020 Python Language Summit, it was decided to make this happen in Python 3.10.

However, after more testing, it became clear that postponed evaluation didn’t work well for projects that use annotations at runtime. Key people in the FastAPI and the Pydantic projects voiced their concerns. At the last minute, it was decided to reschedule these changes for Python 3.11.

To ease the transition into future behavior, a few changes have been made in Python 3.10 as well. Most importantly, a new inspect.get_annotations() function has been added. You should call this to access annotations at runtime:

>>> import inspect

>>> def mean(numbers: list[int | float]) -> float:
...     return sum(numbers) / len(numbers)
...

>>> inspect.get_annotations(mean)
{'numbers': list[int | float], 'return': <class 'float'>}

Check out Annotations Best Practices for details.

How to Detect Python 3.10 at Runtime

Python 3.10 is the first version of Python with a two-digit minor version number. While this is mostly an interesting fun fact and an indication that Python 3 has been around for quite some time, it does also have some practical consequences.

When your code needs to do something specific based on the version of Python at runtime, you’ve gotten away with doing a lexicographical comparison of version strings until now. While it’s never been good practice, it’s been possible to do the following:

# bad_version_check.py

import sys

# Don't do the following
if sys.version < "3.6":
    raise SystemExit("Only Python 3.6 and above is supported")

In Python 3.10, this code will raise SystemExit and stop your program. This happens because, as strings, "3.10" is less than "3.6".

The correct way to compare version numbers is to use tuples of numbers:

# good_version_check.py

import sys

if sys.version_info < (3, 6):
    raise SystemExit("Only Python 3.6 and above is supported")

sys.version_info is a tuple object you can use for comparisons.
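You can see the difference between the two kinds of comparison without touching sys.version at all. In the following sketch, parse_version() is a hypothetical helper that only handles purely numeric version strings:

```python
import sys

# Strings are compared character by character, so "1" < "6" decides the result
print("3.10" < "3.6")    # True, which is the bug

# Tuples are compared element by element as integers
print((3, 10) < (3, 6))  # False, as expected


def parse_version(text):
    """Turn a dotted version string like '3.10' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


print(parse_version("3.10") >= parse_version("3.6"))  # True

# sys.version_info also exposes its fields by name
print(sys.version_info.major, sys.version_info.minor)
```

For checks against the running interpreter, you don't need any parsing, since sys.version_info already gives you the numbers.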

If you’re doing these kinds of comparisons in your code, you should check your code with flake8-2020 to make sure you’re handling versions correctly:

$ python -m pip install flake8-2020

$ flake8 bad_version_check.py good_version_check.py
bad_version_check.py:3:4: YTT103 `sys.version` compared to string
                          (python3.10), use `sys.version_info`

With the flake8-2020 extension activated, you’ll get a recommendation about replacing sys.version with sys.version_info.

So, Should You Upgrade to Python 3.10?

You’ve now seen the coolest features of the latest version of Python. The question now is whether you should upgrade to Python 3.10, and if so, when. There are two different aspects to consider when thinking about upgrading to Python 3.10:

  1. Should you upgrade your environment so that you run your code with the Python 3.10 interpreter?
  2. Should you write your code using the new Python 3.10 features?

Clearly, if you want to test out structural pattern matching or any of the other cool new features you’ve read about here, you need Python 3.10. It’s possible to install the latest version side by side with your current Python version. A straightforward way to do this is to use an environment manager like pyenv or Conda. You can also use Docker to run Python 3.10 without installing it locally.

Python 3.10 has been through about five months of beta testing, so there shouldn’t be any big issues with starting to use it for your own development. You may find that some of your dependencies don’t immediately have wheels for Python 3.10 available, which makes them more cumbersome to install. But in general, using the newest Python for local development is fairly safe.

As always, you should be careful before upgrading your production environment. Be vigilant about testing that your code runs well on the new version. In particular, you want to be on the lookout for features that are deprecated or removed.

Whether you can start using the new features in your code or not depends on your user base and the environment where your code is running. If you can guarantee that Python 3.10 is available, then there’s no danger in using the new union type syntax or any other new feature.

If you’re distributing an app or a library that’s used by others instead, you may want to be a bit more conservative. Currently, Python 3.6 is the oldest officially supported Python version. It reaches end-of-life in December 2021, after which Python 3.7 will be the minimum supported version.

The documentation includes a useful guide about porting your code to Python 3.10. Check it out for more details!

Conclusion

The release of a new Python version is always worth celebrating. Even if you can’t start using the new features right away, they’ll become broadly available and part of your daily life within a few years.

In this tutorial, you’ve seen new features like:

  • Friendlier error messages
  • Powerful structural pattern matching
  • Type hint improvements
  • Safer combination of sequences
  • New statistics functions

For more Python 3.10 tips and a discussion with members of the Real Python team, check out Real Python Podcast Episode #81.

Have fun trying out the new features! Share your experiences in the comments below.
