Trailing data error - Fixing errors and finding optimal solutions to problems

I’m trying to read a JSON file into a pandas DataFrame with the following function:

def read_JSON_into_dataframe( file_name ):
    with sys.stdin if file_name is None else open( file_name, "r", encoding='utf8', errors='ignore' ) as reader:
        df = pd.read_json( reader )
        print( df.describe(), file = sys.stderr )
        return df

However, I’m getting an error, for which the bottom stack frame is:

C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\json\json.py in _parse_no_numpy(self)
    869         if orient == "columns":
    870             self.obj = DataFrame(
--> 871                 loads(json, precise_float=self.precise_float), dtype=None)
    872         elif orient == "split":
    873             decoded = {str(k): v for k, v in compat.iteritems(

ValueError: Trailing data

What does "trailing data" refer to? If it refers to some point in the JSON file, how can I figure out where that is and what’s wrong with it?

asked Oct 11, 2019 at 17:46

df = pd.read_json("filename.json", lines=True)

answered Aug 7, 2020 at 11:39

J_V


I ran the following experiment:

Took a properly formatted JSON file.
Opened it in a text editor and added " xxxx" after the final "}".
Tried to read it by calling data = json.load(…).

The full error message was:

JSONDecodeError: Extra data: line 112 column 3 (char 6124)

As you can see, the message states precisely the line and column where the extra text was found.

Look at that position in your input file. The file is probably corrupted in some way, e.g. a "{" character was deleted.

To track down the problem you can even use Notepad++. If you place the cursor just before or after a "{", that character and its matching "}" are highlighted in red. The same applies to "[" and "]".

This way you can locate matching opening and closing braces or brackets and find out what is missing.

Of course, json.load will not read your file into a DataFrame, but at least it tells you exactly where the problem occurred. Once you have found and corrected the error, run your program again.
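
The same check can be done in code. This is only a minimal sketch, assuming the file is small enough to hold in memory as a string. The helper name find_extra_data is hypothetical; json.JSONDecodeError and its msg, lineno, colno and pos attributes come from the standard json module.

```python
import json


def find_extra_data(text):
    """Return None if text is one valid JSON document, else a dict describing the error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # lineno and colno are 1-based; pos is the 0-based character offset.
        return {"msg": exc.msg, "line": exc.lineno, "column": exc.colno, "char": exc.pos}
    return None


# A valid object followed by stray text, as in the experiment above.
sample = '{"a": 1}\n xxxx'
print(find_extra_data(sample))
```

For a real file you would pass the result of reading it, for example open(file_name, encoding="utf8").read(), and then look at the reported line and column in an editor.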

answered Oct 11, 2019 at 18:35

Valdi_Bo


Source

August 17, 2022

One error you may encounter when using Python is:

ValueError: Trailing data

This error usually occurs when you attempt to import a JSON file into a pandas DataFrame, but the file contains several JSON objects, one per line, separated by newline characters ("\n"), rather than a single JSON document.

The easiest way to fix this error is to specify lines=True when importing the data:

df = pd.read_json('my_data.json', lines=True)

The following example shows how to fix this error in practice.

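How to Reproduce the Error

Suppose we have a JSON file, my_data.json, with one record per line, each holding an ID, a Rating, and a Review.

Now suppose we attempt to import this JSON file into a pandas DataFrame:

#attempt to import JSON file into pandas DataFrame
df = pd.read_json('Documents/DataFiles/my_data.json')

ValueError: Trailing data

We receive the error because the file holds one JSON object per line. By default, read_json() expects a single JSON document, so everything after the first object counts as trailing data. The "\n" sequences inside the "Review" values are valid JSON escapes and are not the cause.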
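
To see the cause in isolation, here is a small sketch that feeds pandas an in-memory file. The two records are an assumption, reconstructed from the DataFrame printed in this article, and io.StringIO stands in for my_data.json.

```python
import io

import pandas as pd

# Assumed file contents: one JSON object per line (JSON Lines).
# The backslash-n inside each Review value is an ordinary JSON escape.
json_lines = (
    '{"ID": "A", "Rating": 8, "Review": "Great movie.\\nI would recommend it."}\n'
    '{"ID": "B", "Rating": 5, "Review": "Mediocre movie.\\nWould not recommend it."}\n'
)

# Default mode expects a single JSON document, so the second object is rejected.
try:
    pd.read_json(io.StringIO(json_lines))
    default_mode_failed = False
except ValueError as exc:
    default_mode_failed = True
    print("default mode:", exc)

# lines=True parses each line as its own record.
df = pd.read_json(io.StringIO(json_lines), lines=True)
print(df.shape)
```

The same text fails in default mode and loads with lines=True, which suggests that the layout of the file, not the content of any one field, is what triggers the error.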

How to Fix the Error

The easiest way to fix this error is to specify lines=True when importing the data:

#import JSON file into pandas DataFrame
df = pd.read_json('Documents/DataFiles/my_data.json', lines=True)

#view DataFrame
df

   ID  Rating                                    Review
0   A       8       Great movie.\nI would recommend it.
1   B       5  Mediocre movie.\nWould not recommend it.
2   C       3        Bad movie.\nI would not recommend.
3   D       7      Decent movie.\nI might recommend it.

Notice that we’re able to import the JSON file into a pandas DataFrame without any errors.

If we’d like to remove the \n newlines from the "Review" column, we can use the following syntax:

#replace \n with a space in 'Review' column
df['Review'] = df['Review'].str.replace('\n', ' ')

#view updated DataFrame
df

   ID  Rating                                   Review
0   A       8       Great movie. I would recommend it.
1   B       5  Mediocre movie. Would not recommend it.
2   C       3        Bad movie. I would not recommend.
3   D       7      Decent movie. I might recommend it.

The \n characters in the "Review" column have now been replaced with spaces.


Source


In Python, ValueError: Trailing data occurs when you try to load JSON data or a JSON file into a pandas DataFrame and the input contains several JSON objects, one per line, separated by newline characters ('\n'), instead of a single JSON document.

JSON data is typically imported from files, and such files often store one record per line.

Let’s take a simple example to reproduce this error. We have a JSON file of employees, employee.json, with one employee record per line; the address property in each record also contains '\n'.

JSON File

# import pandas library
import pandas as pd

# create pandas DataFrame
df = pd.read_json('employee.json')

# print the DataFrame
print(df)

Output

ValueError: Trailing data

Note: If the JSON data is malformed or the file path is invalid, you may get a different error, such as ValueError: Expected object or value.

The simplest way to fix this error is to pass the lines=True argument to read_json() when importing the JSON file.

With lines=True, pandas reads the file as one JSON object per line.

Now when we import the JSON file into a pandas DataFrame, it loads and prints the data without any issue.

# import pandas library
import pandas as pd

# create pandas DataFrame, reading one JSON object per line
df = pd.read_json('employee.json', lines=True)

# print the DataFrame
print(df)

Output

    ID      name  age                  address
0  123      Jack   25     #3, 5th Main \nIndia
1  124  Chandler   25        #5, 2nd Main \nUS
2  123      Jack   25  #3/2, 6th Main \nCanada
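
To make "one JSON object per line" concrete, here is a rough standard-library sketch of the same idea. It is not how pandas implements lines=True; the helper name parse_json_lines is hypothetical, and the sample records are assumptions modeled on the output above.

```python
import json


def parse_json_lines(text):
    """Parse text holding one JSON object per line into a list of dicts."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # Report which line of the input could not be parsed.
            raise ValueError(f"line {line_number}: {exc.msg}") from exc
    return records


sample = (
    '{"ID": 123, "name": "Jack", "age": 25, "address": "#3, 5th Main \\nIndia"}\n'
    '{"ID": 124, "name": "Chandler", "age": 25, "address": "#5, 2nd Main \\nUS"}\n'
)
records = parse_json_lines(sample)
print(len(records), records[1]["name"])
```

Each line is a complete JSON document on its own, which is why parsing line by line succeeds where parsing the whole text at once does not. The escaped \n inside the address strings never splits a record, because it is not a real line break in the file.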

We can also remove the \n character from the address column by replacing it with a space, as shown below.

# import pandas library
import pandas as pd

# create pandas DataFrame, reading one JSON object per line
df = pd.read_json('employee.json', lines=True)

# replace each newline in the address column with a space
df['address'] = df['address'].str.replace('\n', ' ')

# print the DataFrame
print(df)

Output

    ID      name  age                 address
0  123      Jack   25     #3, 5th Main  India
1  124  Chandler   25        #5, 2nd Main  US
2  123      Jack   25  #3/2, 6th Main  Canada


Srinivas Ramakrishna is a Solution Architect and has 14+ Years of Experience in the Software Industry. He has published many articles on Medium, Hackernoon, dev.to and solved many problems in StackOverflow. He has core expertise in various technologies such as Microsoft .NET Core, Python, Node.JS, JavaScript, Cloud (Azure), RDBMS (MSSQL), React, Powershell, etc.


Source
