I’m trying to read a JSON file into a Pandas dataframe with the following function:
import sys
import pandas as pd

def read_JSON_into_dataframe( file_name ):
    with sys.stdin if file_name is None else open( file_name, "r", encoding='utf8', errors='ignore' ) as reader:
        df = pd.read_json( reader )
        print( df.describe(), file = sys.stderr )
        return df
However, I’m getting an error, for which the bottom stack frame is:
C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\json\json.py in _parse_no_numpy(self)
    869         if orient == "columns":
    870             self.obj = DataFrame(
--> 871                 loads(json, precise_float=self.precise_float), dtype=None)
    872         elif orient == "split":
    873             decoded = {str(k): v for k, v in compat.iteritems(
ValueError: Trailing data
What does "trailing data" refer to? If it refers to some point in the JSON file, is there something I can do to figure out where that is and what’s wrong with it?
asked Oct 11, 2019 at 17:46
df = pd.read_json("filename.json", lines=True)
answered Aug 7, 2020 at 11:39
J_V
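Whether lines=True is the right fix depends on the file actually holding one JSON value per line. A minimal sketch of a check for that, using only the standard library; the helper name looks_like_json_lines and the sample strings are hypothetical, not part of the answer above:

```python
import json


def looks_like_json_lines(text):
    """Return True if text has two or more non-empty lines and each one
    parses as a complete JSON value on its own."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        return False  # a single line is just an ordinary JSON document
    for line in lines:
        try:
            json.loads(line)
        except json.JSONDecodeError:
            return False
    return True


# Hypothetical samples: one record per line vs. one pretty-printed object.
print(looks_like_json_lines('{"a": 1}\n{"a": 2}\n'))
print(looks_like_json_lines('{\n  "a": 1\n}'))
```

If the check returns True, reading with lines=True is likely what the file needs; if it returns False, the file is more likely damaged or in some other layout.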
I ran the following experiment:
- Took a properly formatted JSON file.
- Opened it in a text editor and added " xxxx" after the final "}".
- Tried to read it by calling data = json.load(…).
The full error message was:
JSONDecodeError: Extra data: line 112 column 3 (char 6124)
As you can see, the message indicates precisely the line and column where the extra text was found.
Look at that place in your input file. It is probably corrupted in some way, e.g. a "{" character was deleted.
To find the source of the problem you can even use Notepad++. If you place the cursor just before or after a "{", that character and its matching "}" are displayed in red. The same applies to "[" and "]". This way you can locate matching opening and closing braces or brackets and find out what is missing.
Of course, json.load will not read your file into a DataFrame, but at least it indicates precisely where the problem occurred.
Once you have found and corrected the error, run your program again.
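The experiment above can be sketched in a few lines. This is an illustration under stated assumptions rather than the answerer's own code: the helper name locate_json_error and the sample string are hypothetical, while msg, lineno, colno and pos are attributes of the standard library's json.JSONDecodeError.

```python
import json


def locate_json_error(text):
    """Return (message, line, column, char offset) for the first JSON
    parse error in text, or None if the text parses cleanly."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.msg, err.lineno, err.colno, err.pos
    return None


# Hypothetical input: a valid object followed by stray text.
broken = '{"a": 1} xxxx'
print(locate_json_error(broken))
```

For a file, read its contents first (for example with open(path, encoding="utf8").read()) and pass the resulting string to the helper.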
answered Oct 11, 2019 at 18:35
Valdi_Bo
In Python, ValueError: Trailing data occurs when you try to load JSON data or a JSON file into a pandas DataFrame and the data is written as separate lines, one JSON object per line, separated by newline characters ('\n'). The parser reads the first object and treats everything after it as unexpected trailing data.
We typically import data from JSON files, and files written one record per line are common, so this error comes up often.
Let’s take a simple example to reproduce this error. We have a JSON file of employees with one record per line, and the address property in each record also contains '\n'.
# import pandas library
import pandas as pd
# create pandas DataFrame
df = pd.read_json('employee.json')
# print names of employee
print(df)
Output
ValueError: Trailing data
Note: If the JSON data is malformed, you will instead get ValueError: Expected object or value. Older pandas versions raise the same error for an invalid file path; newer ones raise FileNotFoundError.
The simplest way to fix this error is to pass the lines=True argument to the read_json() method when importing the JSON file.
The lines=True parameter tells pandas to read the file as one JSON object per line.
Now when we import the JSON file into a pandas DataFrame, it loads and prints the data without any issue.
# import pandas library
import pandas as pd
# create pandas DataFrame, reading one JSON object per line
df = pd.read_json('employee.json', lines=True)
# print the DataFrame
print(df)
Output
ID name age address
0 123 Jack 25 #3, 5th Main \nIndia
1 124 Chandler 25 #5, 2nd Main \nUS
2 123 Jack 25 #3/2, 6th Main \nCanada
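To try this without a file on disk, here is a minimal sketch that feeds the same kind of data to read_json() from memory. The raw string is an assumed stand-in for employee.json (the real file is not shown), and io.StringIO is used because recent pandas versions discourage passing literal JSON strings directly.

```python
import io

import pandas as pd

# Assumed stand-in for employee.json: one JSON object per line.
raw = "\n".join([
    r'{"ID": 123, "name": "Jack", "age": 25, "address": "#3, 5th Main \nIndia"}',
    r'{"ID": 124, "name": "Chandler", "age": 25, "address": "#5, 2nd Main \nUS"}',
    r'{"ID": 123, "name": "Jack", "age": 25, "address": "#3/2, 6th Main \nCanada"}',
])

# Without lines=True the parser reads the first object and rejects the rest.
try:
    pd.read_json(io.StringIO(raw))
    error = None
except ValueError as exc:
    error = exc
print("without lines=True:", error)

# With lines=True each line is parsed as its own record.
df = pd.read_json(io.StringIO(raw), lines=True)
print(df.shape)
```

The failing call and the working call receive identical text, which suggests that the one-object-per-line layout, not the \n escapes inside the address strings, is what triggers the error.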
Once the data is loaded, you may also want to remove the \n character from the address column. We can simply replace the \n character with a space ' ', as shown below.
# import pandas library
import pandas as pd
# create pandas DataFrame, reading one JSON object per line
df = pd.read_json('employee.json', lines=True)
# replace the newline character with a space in the address column
df['address'] = df['address'].str.replace('\n', ' ')
# print the DataFrame
print(df)
Output
ID name age address
0 123 Jack 25 #3, 5th Main India
1 124 Chandler 25 #5, 2nd Main US
2 123 Jack 25 #3/2, 6th Main Canada
Srinivas Ramakrishna is a Solution Architect and has 14+ Years of Experience in the Software Industry. He has published many articles on Medium, Hackernoon, dev.to and solved many problems in StackOverflow. He has core expertise in various technologies such as Microsoft .NET Core, Python, Node.JS, JavaScript, Cloud (Azure), RDBMS (MSSQL), React, Powershell, etc.
One error you may encounter when using Python is:
ValueError: Trailing data
This error usually occurs when you attempt to import a JSON file into a pandas DataFrame, yet the data is written as one JSON object per line, with the lines separated by newline characters ('\n').
The easiest way to fix this error is to simply specify lines=True when importing the data:
df = pd.read_json('my_data.json', lines=True)
The following example shows how to fix this error in practice.
How to Reproduce the Error
Suppose we have the following JSON file:
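The file itself is not reproduced here. As a stand-in, the sketch below writes a file consistent with the output shown later in this example; the record values are taken from that output, while the one-object-per-line layout and the temporary path are assumptions.

```python
import json
import os
import tempfile

# Values taken from the output shown later; the layout is an assumption.
records = [
    {"ID": "A", "Rating": 8, "Review": "Great movie.\nI would recommend it."},
    {"ID": "B", "Rating": 5, "Review": "Mediocre movie.\nWould not recommend it."},
    {"ID": "C", "Rating": 3, "Review": "Bad movie.\nI would not recommend."},
    {"ID": "D", "Rating": 7, "Review": "Decent movie.\nI might recommend it."},
]

# Hypothetical location; the article uses Documents/DataFiles/my_data.json.
path = os.path.join(tempfile.mkdtemp(), "my_data.json")

# json.dumps escapes the newline inside each review as \n, so every record
# stays on a single physical line of the file.
with open(path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

print(path)
```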
Now suppose we attempt to import this JSON file into a pandas DataFrame:
#attempt to import JSON file into pandas DataFrame
df = pd.read_json('Documents/DataFiles/my_data.json')
ValueError: Trailing data
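We receive an error because the file holds one JSON object per line: after parsing the first object, pandas finds more data that it did not expect. The \n escapes inside the “Review” strings are valid JSON and are not themselves the cause.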
How to Fix the Error
The easiest way to fix this error is to simply specify lines=True when importing the data:
#import JSON file into pandas DataFrame
df = pd.read_json('Documents/DataFiles/my_data.json', lines=True)
#view DataFrame
df
ID Rating Review
0 A 8 Great movie.\nI would recommend it.
1 B 5 Mediocre movie.\nWould not recommend it.
2 C 3 Bad movie.\nI would not recommend.
3 D 7 Decent movie.\nI might recommend it.
Notice that we’re able to successfully import the JSON file into a pandas DataFrame without any errors.
If we’d like to remove the \n newlines from the “Review” column, we can use the following syntax:
#replace \n with a space in 'Review' column
df['Review'] = df['Review'].str.replace('\n', ' ')
#view updated DataFrame
df
ID Rating Review
0 A 8 Great movie. I would recommend it.
1 B 5 Mediocre movie. Would not recommend it.
2 C 3 Bad movie. I would not recommend.
3 D 7 Decent movie. I might recommend it.
The \n values are now removed from the “Review” column.
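A self-contained sketch of the same cleanup, building a small DataFrame in memory instead of reading the file (the two rows are copied from the example output). Passing regex=False is an addition not found in the original snippet; it makes the pattern a literal string regardless of the pandas version's default.

```python
import pandas as pd

# Two rows copied from the example output, with real newline characters.
df = pd.DataFrame({
    "ID": ["A", "B"],
    "Rating": [8, 5],
    "Review": [
        "Great movie.\nI would recommend it.",
        "Mediocre movie.\nWould not recommend it.",
    ],
})

# Replace each newline character with a single space.
# The pattern must be '\n' (backslash included); a bare 'n' would
# replace every letter n in the text.
df["Review"] = df["Review"].str.replace("\n", " ", regex=False)
print(df["Review"].tolist())
```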