Syntax error unicode error python

In this post, you can find several solutions for: SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated UXXXXXXXX escape While this error can appear in different situations the reason for the error is one and the same: * there are special characters( escape sequence - characters starting with backslash - '' ). * From the error above you can recognize that the culprit is 'U' - which is considered as unicode character. * another possible

In this post, you can find several solutions for:

SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape

While this error can appear in different situations the reason for the error is one and the same:

  • there are special characters( escape sequence — characters starting with backslash — » ).
  • From the error above you can recognize that the culprit is ‘U’ — which is considered as unicode character.
    • another possible errors for SyntaxError: (unicode error) ‘unicodeescape’ will be raised for ‘x’, ‘u’
      • codec can’t decode bytes in position 2-3: truncated xXX escape
      • codec can’t decode bytes in position 2-3: truncated uXXXX escape

Step #1: How to solve SyntaxError: (unicode error) ‘unicodeescape’ — Double slashes for escape characters

Let’s start with one of the most frequent examples — windows paths. In this case there is a bad character sequence in the string:

import json

json_data=open("C:Userstest.txt").read()
json_obj = json.loads(json_data)

The problem is that U is considered as a special escape sequence for Python string. In order to resolved you need to add second escape character like:

import json

json_data=open("C:\Users\test.txt").read()
json_obj = json.loads(json_data)

Step #2: Use raw strings to prevent SyntaxError: (unicode error) ‘unicodeescape’

If the first option is not good enough or working then raw strings are the next option. Simply by adding r (for raw string literals) to resolve the error. This is an example of raw strings:

import json

json_data=open(r"C:Userstest.txt").read()
json_obj = json.loads(json_data)

If you like to find more information about Python strings, literals

2.4.1 String literals

In the same link we can find:

When an r' or R’ prefix is present, backslashes are still used to quote the following character, but all backslashes are left in the string. For example, the string literal r»n» consists of two characters: a backslash and a lowercase `n’.

Step #3: Slashes for file paths -SyntaxError: (unicode error) ‘unicodeescape’

Another possible solution is to replace the backslash with slash for paths of files and folders. For example:

«C:Userstest.txt»

will be changed to:

«C:/Users/test.txt»

Since python can recognize both I prefer to use only the second way in order to avoid such nasty traps. Another reason for using slashes is your code to be uniform and homogeneous.

The picture below demonstrates how the error will look like in PyCharm. In order to understand what happens you will need to investigate the error log.

SyntaxError_unicode_error_unicodeescape

The error log will have information for the program flow as:

/home/vanx/Software/Tensorflow/environments/venv36/bin/python3 /home/vanx/PycharmProjects/python/test/Other/temp.py
  File "/home/vanx/PycharmProjects/python/test/Other/temp.py", line 3
    json_data=open("C:Userstest.txt").read()
                  ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated UXXXXXXXX escape

You can see the latest call which produces the error and click on it. Once the reason is identified then you can test what could solve the problem.

Introduction

The following error message is a common Python error, the «SyntaxError» represents a Python syntax error and the «unicodeescape» means that we made a mistake in using unicode escape character.

SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-5: truncated UXXXXXXXX escape

Simply to put, the «SyntaxError» can be occurred accidentally in Python and it is often happens because the (ESCAPE CHARACTER) is misused in a python string, it caused the unicodeescape error.

(Note: «escape character» can convert your other character to be a normal character.)


SyntaxError Example

Let’s look at an example:

print('It's a nice day.')

Looks like we want to print «It’s a nice day.«, right? But the program will report an error message.

File "<stdin>", line 1
    print('It's a nice day.')
              ^
SyntaxError: invalid syntax

The reason is very easy-to-know. In Python, we can use print('xxx') to print out xxx. But in our code, if we had used the ' character, the Python interpreter will misinterpret the range of our characters so it will report an error.

To solve this problem, we need to add an escape character «» to convert our ‘ character to be a normal character, not a superscript of string.

print('It's a nice day.')

Output:

It's a nice day.

We print it successfully!

So how did the syntax error happen? Let me talk about my example:

One day, I run my program for experiment, I saved some data in a csv file. In order for this file can be viewed on the Windows OS, I add a new code uFFEF in the beginning of file.

This is «BOM» (Byte Order Mark), Explain to the system that the file format is «Big-Ending«.

I got the error message.

SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-5: truncated UXXXXXXXX escape

As mentioned at the beginning of this article, this is an escape character error in Python.


Solutions

There are three solutions you can try:

  • Add a «r» character in the right of string
  • Change to be /
  • Change to be \

Solution 1: Add a «r» character in the beginning of string.

title = r'uFFEF'

After we adding a r character at right side of python string, it means a complete string not anything else.

Solution 2: Change to be /.

open("C:UsersClayDesktoptest.txt")

Change to:

open("C:/Users/Clay/Desktop/test.txt")

This way is avoid to use escape character.

Solution 3: Change to be \.

open("C:UsersClayDesktoptest.txt")

Change the code to:

open("C:\Users\Clay\Desktop\test.txt")

It is similar to the solution 2 that it also avoids the use of escape characters.


The above are three common solutions. We can run normally on Windows.


Reference

  • https://stackoverflow.com/questions/37400974/unicode-error-unicodeescape-codec-cant-decode-bytes-in-position-2-3-trunca
  • https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Error-unicodeescape-codec-can-t-decode-bytes/td-p/427540

Read More

  • [Solved][Python] ModuleNotFoundError: No module named ‘cStringIO’
Table of Contents
Hide
  1. What is SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape?
  2. How to fix SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape?
    1. Solution 1 – Using Double backslash (\)
    2. Solution 2 – Using raw string by prefixing ‘r’
    3. Solution 3 – Using forward slash 
  3. Conclusion

The SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape occurs if you are trying to access a file path with a regular string.

In this tutorial, we will take a look at what exactly (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape means and how to fix it with examples.

The Python String literals can be enclosed in matching single quotes (‘) or double quotes (“). 

String literals can also be prefixed with a letter ‘r‘ or ‘R‘; such strings are called raw strings and use different rules for backslash escape sequences.

They can also be enclosed in matching groups of three single or double quotes (these are generally referred to as triple-quoted strings). 

The backslash () character is used to escape characters that otherwise have a special meaning, such as newline, backslash itself, or the quote character. 

Now that we have understood the string literals. Let us take an example to demonstrate the issue.

import pandas

# read the file
pandas.read_csv("C:UsersitsmycodeDesktoptest.csv")

Output

  File "c:PersonalIJSCodeprogram.py", line 4
    pandas.read_csv("C:UsersitsmycodeDesktoptest.csv")                                                                                     ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated UXXXXXXXX escape

We are using the single backslash in the above code while providing the file path. Since the backslash is present in the file path, it is interpreted as a special character or escape character (any sequence starting with ‘’). In particular, “U” introduces a 32-bit Unicode character.

How to fix SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape?

Solution 1 – Using Double backslash (\)

In Python, the single backslash in the string is interpreted as a special character, and the character U(in users) will be treated as the Unicode code point.

We can fix the issue by escaping the backslash, and we can do that by adding an additional backslash, as shown below.

import pandas

# read the file
pandas.read_csv("C:\Users\itsmycode\Desktop\test.csv")

Solution 2 – Using raw string by prefixing ‘r’

We can also escape the Unicode by prefixing r in front of the string. The r stands for “raw” and indicates that backslashes need to be escaped, and they should be treated as a regular backslash.

import pandas

# read the file
pandas.read_csv("C:\Users\itsmycode\Desktop\test.csv")

Solution 3 – Using forward slash 

Another easier way is to avoid the backslash and instead replace it with the forward-slash character(/), as shown below.

import pandas

# read the file
pandas.read_csv("C:/Users/itsmycode/Desktop/test.csv")

Conclusion

The SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape occurs if you are trying to access a file path and provide the path as a regular string.

We can solve the issue by escaping the single backslash with a double backslash or prefixing the string with ‘r,’ which converts it into a raw string. Alternatively, we can replace the backslash with a forward slash.

Ezoic

Avatar Of Srinivas Ramakrishna

Srinivas Ramakrishna is a Solution Architect and has 14+ Years of Experience in the Software Industry. He has published many articles on Medium, Hackernoon, dev.to and solved many problems in StackOverflow. He has core expertise in various technologies such as Microsoft .NET Core, Python, Node.JS, JavaScript, Cloud (Azure), RDBMS (MSSQL), React, Powershell, etc.

Sign Up for Our Newsletters

Subscribe to get notified of the latest articles. We will never spam you. Be a part of our ever-growing community.

By checking this box, you confirm that you have read and are agreeing to our terms of use regarding the storage of the data submitted through this form.

Содержание

  1. Python SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape
  2. Step #1: How to solve SyntaxError: (unicode error) ‘unicodeescape’ — Double slashes for escape characters
  3. Step #2: Use raw strings to prevent SyntaxError: (unicode error) ‘unicodeescape’
  4. Step #3: Slashes for file paths -SyntaxError: (unicode error) ‘unicodeescape’
  5. Step #4: PyCharm — SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape
  6. Syntax error unicode error python
  7. SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape #
  8. SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape
  9. What is SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape?
  10. How to fix SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape?
  11. Solution 1 – Using Double backslash ()
  12. Solution 2 – Using raw string by prefixing ‘r’
  13. Solution 3 – Using forward slash
  14. Conclusion
  15. [Solved] Python SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 0-5: truncated UXXXXXXXX escape
  16. Introduction
  17. SyntaxError Example
  18. Solutions
  19. Solution 1: Add a «r» character in the beginning of string.
  20. Solution 2: Change to be / .
  21. Solution 3: Change to be .
  22. gornostal / Unicode.md

Python SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape

In this post, you can find several solutions for:

SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape

While this error can appear in different situations the reason for the error is one and the same:

  • there are special characters( escape sequence — characters starting with backslash — » ).
  • From the error above you can recognize that the culprit is ‘U’ — which is considered as unicode character.
    • another possible errors for SyntaxError: (unicode error) ‘unicodeescape’ will be raised for ‘x’, ‘u’
      • codec can’t decode bytes in position 2-3: truncated xXX escape
      • codec can’t decode bytes in position 2-3: truncated uXXXX escape

Step #1: How to solve SyntaxError: (unicode error) ‘unicodeescape’ — Double slashes for escape characters

Let’s start with one of the most frequent examples — windows paths. In this case there is a bad character sequence in the string:

The problem is that U is considered as a special escape sequence for Python string. In order to resolved you need to add second escape character like:

Step #2: Use raw strings to prevent SyntaxError: (unicode error) ‘unicodeescape’

If the first option is not good enough or working then raw strings are the next option. Simply by adding r (for raw string literals) to resolve the error. This is an example of raw strings:

If you like to find more information about Python strings, literals

In the same link we can find:

When an r’ or R’ prefix is present, backslashes are still used to quote the following character, but all backslashes are left in the string. For example, the string literal r»n» consists of two characters: a backslash and a lowercase `n’.

Step #3: Slashes for file paths -SyntaxError: (unicode error) ‘unicodeescape’

Another possible solution is to replace the backslash with slash for paths of files and folders. For example:

will be changed to:

Since python can recognize both I prefer to use only the second way in order to avoid such nasty traps. Another reason for using slashes is your code to be uniform and homogeneous.

Step #4: PyCharm — SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape

The picture below demonstrates how the error will look like in PyCharm. In order to understand what happens you will need to investigate the error log.

The error log will have information for the program flow as:

You can see the latest call which produces the error and click on it. Once the reason is identified then you can test what could solve the problem.

Источник

Syntax error unicode error python

Reading time В· 2 min

SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape #

The Python «SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position» occurs when we have an unescaped backslash character in a path. To solve the error, prefix the path with r to mark it as a raw string, e.g. r’C:UsersBobDesktopexample.txt’ .

Here is an example of how the error occurs.

The path contains backslash characters which is the cause of the error.

The backslash character has a special meaning in Python — it is used as an escape character (e.g. n or t ).

One way to solve the error is to prefix the string with the letter r to mark it as a raw string.

An alternative way to treat a backslash as a literal character is to escape it with a second backslash \ .

We escaped each backslash character to treat them as literal backslashes.

Here is a string that shows how 2 backslashes only get translated into 1.

Similarly, if you need to have 2 backslashes next to one another, you would have to use 4 backslashes.

An alternative solution to the «unicodeescape codec can’t decode bytes in position» error is to use forward slashes in the path instead of backslashes.

A forward slash can be used in place of a backslash when you need to specify a path.

This solves the error because we no longer have any unescaped backslash characters in the path.

Since backslash characters have a special meaning in Python, we need to treat them as a literal character by:

  • prefixing the string with r to mark it as a raw string
  • escaping each backslash with a second backslash
  • using forward slashes in place of backslashes in the path

Источник

SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape

Table of Contents Hide

The SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape occurs if you are trying to access a file path with a regular string.

In this tutorial, we will take a look at what exactly (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape means and how to fix it with examples.

What is SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape?

The Python String literals can be enclosed in matching single quotes (‘) or double quotes (“).

String literals can also be prefixed with a letter ‘r‘ or ‘R‘; such strings are called raw strings and use different rules for backslash escape sequences.

They can also be enclosed in matching groups of three single or double quotes (these are generally referred to as triple-quoted strings).

The backslash () character is used to escape characters that otherwise have a special meaning, such as newline, backslash itself, or the quote character.

Now that we have understood the string literals. Let us take an example to demonstrate the issue.

Output

We are using the single backslash in the above code while providing the file path. Since the backslash is present in the file path, it is interpreted as a special character or escape character (any sequence starting with ‘’). In particular, “U” introduces a 32-bit Unicode character.

How to fix SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape?

Solution 1 – Using Double backslash (\)

In Python, the single backslash in the string is interpreted as a special character, and the character U(in users) will be treated as the Unicode code point.

We can fix the issue by escaping the backslash, and we can do that by adding an additional backslash, as shown below.

Solution 2 – Using raw string by prefixing ‘r’

We can also escape the Unicode by prefixing r in front of the string. The r stands for “raw” and indicates that backslashes need to be escaped, and they should be treated as a regular backslash.

Solution 3 – Using forward slash

Another easier way is to avoid the backslash and instead replace it with the forward-slash character(/), as shown below.

Conclusion

The SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 2-3: truncated UXXXXXXXX escape occurs if you are trying to access a file path and provide the path as a regular string.

We can solve the issue by escaping the single backslash with a double backslash or prefixing the string with ‘r,’ which converts it into a raw string. Alternatively, we can replace the backslash with a forward slash.

Источник

[Solved] Python SyntaxError: (unicode error) ‘unicodeescape’ codec can’t decode bytes in position 0-5: truncated UXXXXXXXX escape

Introduction

The following error message is a common Python error, the «SyntaxError» represents a Python syntax error and the «unicodeescape» means that we made a mistake in using unicode escape character.

Simply to put, the «SyntaxError» can be occurred accidentally in Python and it is often happens because the (ESCAPE CHARACTER) is misused in a python string, it caused the unicodeescape error.

(Note: «escape character» can convert your other character to be a normal character.)

SyntaxError Example

Let’s look at an example:

Looks like we want to print «It’s a nice day.«, right? But the program will report an error message.

The reason is very easy-to-know. In Python, we can use print(‘xxx’) to print out xxx. But in our code, if we had used the ‘ character, the Python interpreter will misinterpret the range of our characters so it will report an error.

To solve this problem, we need to add an escape character «» to convert our ‘ character to be a normal character, not a superscript of string.

We print it successfully!

So how did the syntax error happen? Let me talk about my example:

One day, I run my program for experiment, I saved some data in a csv file. In order for this file can be viewed on the Windows OS, I add a new code uFFEF in the beginning of file.

This is «BOM» (Byte Order Mark), Explain to the system that the file format is «Big-Ending«.

I got the error message.

As mentioned at the beginning of this article, this is an escape character error in Python.

Solutions

There are three solutions you can try:

  • Add a «r» character in the right of string
  • Change to be /
  • Change to be \

Solution 1: Add a «r» character in the beginning of string.

After we adding a r character at right side of python string, it means a complete string not anything else.

Solution 2: Change to be / .

This way is avoid to use escape character.

Solution 3: Change to be \ .

Change the code to:

It is similar to the solution 2 that it also avoids the use of escape characters.

The above are three common solutions. We can run normally on Windows.

Источник

gornostal / Unicode.md

Python 2.7. Unicode Errors Simply Explained

I know I’m late with this article for about 5 years or so, but people are still using Python 2.x, so this subject is relevant I think.

Some facts first:

  • Unicode is an international encoding standard for use with different languages and scripts
  • In python-2.x, there are two types that deal with text.
    1. str is an 8-bit string.
    2. unicode is for strings of unicode code points.
      A code point is a number that maps to a particular abstract character. It is written using the notation U+12ca to mean the character with value 0x12ca (4810 decimal)
  • Encoding (noun) is a map of Unicode code points to a sequence of bytes. (Synonyms: character encoding, character set, codeset). Popular encodings: UTF-8, ASCII, Latin-1, etc.
  • Encoding (verb) is a process of converting unicode to bytes of str , and decoding is the reverce operation.
  • Python 2.x uses ASCII as a default encoding. (More about this later)

SyntaxError: Non-ASCII character

When you sees something like this

you just need to define encoding in the first or second line of your file. All you need is to have string coding=utf8 or coding: utf8 somewhere in your comments. Python doesn’t care what goes before or after those string, so the following will work fine too:

Notice the dash in utf-8. Python has many aliases for UTF-8 encoding, so you should not worry about dashes or case sensitivity.

str() function encodes a string. We passed a unicode string, and it tried to encode it using a default encoding, which is ASCII. Now the error makes sence because ASCII is 7-bit encoding which doesn’t know how to represent characters outside of range 0..128.
Here we called str() explicitly, but something in your code may call it implicitly and you will also get UnicodeEncodeError .

How to fix: encode unicode string manually using .encode(‘utf8’) before passing to str()

Let’s say we somehow obtained a byte string byte_string which contains encoded UTF-8 characters. We could get this by simply using a library that returns str type.
Then we passed the string to a function that converts it to unicode . In this example we explicitly call unicode() , but some functions may call it implicitly and you’ll get the same error.
Now again, Python uses ASCII encoding by default, so it tries to convert bytes to a default encoding ASCII. Since there is no ASCII symbol that converts to 0xc3 (195 decimal) it fails with UnicodeDecodeError .

How to fix: decode str manually using .decode(‘utf8’) before passing to your function.

Make sure your code works only with Unicode strings internally, converting to a particular encoding on output, and decoding str on input. Learn the libraries you are using, and find places where they return str . Decode str before return value is passed further in your code.

Источник

In Python, This SyntaxError occurred when you are trying to access a path with normal String. As you know ‘/’ is escape character in Python that’s having different meaning by adding with different characters for example ‘n’ is use for ne line , ‘t’ use for tab.

This error is considered as SyntaxError because unicode forward slash () is not allow in path.

In further section of topic you will learn how to handle this problem in Python while writing path of file to access it.

Example of SyntaxError of Unicode Error

Lets take below example to read CSV file in Windows operating system.

import csv
with open('C:Userssaurabh.guptaDesktopPython Exampleinput.csv','r') as csvfile:
    reader=csv.reader(csvfile)
    for record in reader:
        print(record) 

If you notice the above code is having path for windows file system to access input.csv. In windows path mentioned by using forward slash () while in Python programming forward slash() is use for handling unicode characters. That’s why when you execute the above program will throw below exception.

Output

File "C:/Users/saurabh.gupta14/Desktop/Python Example/ReadingCSV.py", line 2
    with open('C:Userssaurabh.guptaDesktopPython Exampleinput.csv','r') as csvfile:

Solution

The above error is occurred because of handling forward slash() as normal string. To handle such problem in Python , there are couple of solutions:

1: Just put r in path before your normal string it converts normal string to raw string:

with open(r'C:Userssaurabh.guptaDesktopPython Exampleinput.csv','r') 

2: Use back slash (/) instated of forward slash()

with open('C:/Users/saurabh.gupta/Desktop/Python Example/input.csv','r') 

3: Use double forward slash (\) instead of forward slash() because in Python the unicode double forward value convert in string as forward slash()’

with open('C:\Users\saurabh.gupta\Desktop\Python Example\input.csv','r') 

If this solution help you , Please like and write in comment section or any other way you know to handle this issue write in comment so that help others.

“Learn From Others Experience»

Python 2.7. Unicode Errors Simply Explained

I know I’m late with this article for about 5 years or so, but people are still using Python 2.x, so this subject is relevant I think.

Some facts first:

  • Unicode is an international encoding standard for use with different languages and scripts
  • In python-2.x, there are two types that deal with text.
    1. str is an 8-bit string.
    2. unicode is for strings of unicode code points.
      A code point is a number that maps to a particular abstract character. It is written using the notation U+12ca to mean the character with value 0x12ca (4810 decimal)
  • Encoding (noun) is a map of Unicode code points to a sequence of bytes. (Synonyms: character encoding, character set, codeset). Popular encodings: UTF-8, ASCII, Latin-1, etc.
  • Encoding (verb) is a process of converting unicode to bytes of str, and decoding is the reverce operation.
  • Python 2.x uses ASCII as a default encoding. (More about this later)

SyntaxError: Non-ASCII character

When you sees something like this

SyntaxError: Non-ASCII character 'xd0' in file /tmp/p.py on line 2, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

you just need to define encoding in the first or second line of your file.
All you need is to have string coding=utf8 or coding: utf8 somewhere in your comments.
Python doesn’t care what goes before or after those string, so the following will work fine too:

# -*- encoding: utf-8 -*-

Notice the dash in utf-8. Python has many aliases for UTF-8 encoding, so you should not worry about dashes or case sensitivity.

UnicodeEncodeError Explained

>>> str(u'café')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'xe9' in position 3: ordinal not in range(128)

str() function encodes a string. We passed a unicode string, and it tried to encode it using a default encoding, which is ASCII. Now the error makes sence because ASCII is 7-bit encoding which doesn’t know how to represent characters outside of range 0..128.
Here we called str() explicitly, but something in your code may call it implicitly and you will also get UnicodeEncodeError.

How to fix: encode unicode string manually using .encode('utf8') before passing to str()

UnicodeDecodeError Explained

>>> utf_string = u'café'
>>> byte_string = utf_string.encode('utf8')
>>> unicode(byte_string)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)

Let’s say we somehow obtained a byte string byte_string which contains encoded UTF-8 characters. We could get this by simply using a library that returns str type.
Then we passed the string to a function that converts it to unicode. In this example we explicitly call unicode(), but some functions may call it implicitly and you’ll get the same error.
Now again, Python uses ASCII encoding by default, so it tries to convert bytes to a default encoding ASCII. Since there is no ASCII symbol that converts to 0xc3 (195 decimal) it fails with UnicodeDecodeError.

How to fix: decode str manually using .decode('utf8') before passing to your function.

Rule of Thumb

Make sure your code works only with Unicode strings internally, converting to a particular encoding on output, and decoding str on input.
Learn the libraries you are using, and find places where they return str. Decode str before return value is passed further in your code.

I use this helper function in my code:

def force_to_unicode(text):
    "If text is unicode, it is returned as is. If it's str, convert it to Unicode using UTF-8 encoding"
    return text if isinstance(text, unicode) else text.decode('utf8')

Source: https://docs.python.org/2/howto/unicode.html

Понравилась статья? Поделить с друзьями:
  • Syntax error unexpected token public
  • Sysprep exe неустранимая ошибка
  • Sysprep error 0x80073cf2
  • Sysprep error 0x0f0070
  • Syslinux edd load error boot error