I am new to Python and am having trouble creating a DataFrame from key/value data in this format:
data = [{'key':'[GlobalProgramSizeInThousands]','value':'1000'},]
Here is my code:
columnsss = ['key','value'];
query = "select * from bparst_tags where tag_type = 1 ";
result = database.cursor(db.cursors.DictCursor);
result.execute(query);
result_set = result.fetchall();
data = "[";
for row in result_set:
    data += "{'value': %s , 'key': %s }," % ( `row["tag_expression"]`, `row["tag_name"]` )
data += "]" ;
df = DataFrame(data , columns=columnsss);
But when I pass the data to DataFrame, it shows me
pandas.core.common.PandasError: DataFrame constructor not properly called!
However, if I print the data and assign the printed value to the data variable, it works.
Syscall
asked Sep 1, 2014 at 10:47
You are providing a string representation of a dict to the DataFrame constructor, and not a dict itself. So this is the reason you get that error.
So if you want to use your code, you could do:
df = DataFrame(eval(data))
But it would be better not to create the string in the first place and to build a list of dicts directly. Something roughly like:
data = []
for row in result_set:
    data.append({'value': row["tag_expression"], 'key': row["tag_name"]})
But probably even this is not needed; depending on what exactly is in your result_set, you could probably:
- provide this directly to a DataFrame: DataFrame(result_set)
- or use the pandas read_sql_query function to do this for you (see docs on this)
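As a rough sketch of both shortcuts, assuming an in-memory SQLite database stands in for the real connection (the row inserted here is made up for illustration; only the table and column names come from the question):

```python
import sqlite3

import pandas as pd

# Hypothetical stand-in for the real database: an in-memory SQLite table
# that uses the table and column names from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bparst_tags (tag_name TEXT, tag_expression TEXT, tag_type INTEGER)"
)
conn.execute(
    "INSERT INTO bparst_tags VALUES ('[GlobalProgramSizeInThousands]', '1000', 1)"
)

query = (
    'SELECT tag_name AS "key", tag_expression AS "value" '
    "FROM bparst_tags WHERE tag_type = 1"
)

# Option 1: fetch the rows as mappings and hand a list of dicts to DataFrame.
conn.row_factory = sqlite3.Row
rows = conn.execute(query).fetchall()
df_from_rows = pd.DataFrame([dict(row) for row in rows], columns=["key", "value"])

# Option 2: let pandas run the query and build the DataFrame itself.
df_from_sql = pd.read_sql_query(query, conn)

print(df_from_rows)
print(df_from_sql)
```

Either way, no string is ever assembled by hand, so there is nothing to eval.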
answered Sep 1, 2014 at 11:24
joris
Just ran into the same error, but the above answer could not help me.
My code worked fine on my computer which was like this:
test_dict = {'x': '123', 'y': '456', 'z': '456'}
df=pd.DataFrame(test_dict.items(),columns=['col1','col2'])
However, it did not work on another platform; it gave me the same error as in the original question. Simply wrapping the dictionary items in list() made it work:
df=pd.DataFrame(list(test_dict.items()),columns=['col1','col2'])
Hopefully this answer helps anyone who runs into a similar situation.
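A likely explanation, offered as an assumption rather than something checked on that platform: in Python 3, dict.items() returns a view object instead of a list, and not every pandas version accepts a view. A minimal sketch of the portable form:

```python
import pandas as pd

test_dict = {"x": "123", "y": "456", "z": "456"}

# dict.items() is a view object in Python 3; list() turns it into a
# plain list of (key, value) tuples, which pandas accepts as rows.
pairs = list(test_dict.items())
df = pd.DataFrame(pairs, columns=["col1", "col2"])

print(df)
```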
answered Mar 29, 2021 at 17:00
dayaoyao
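import json
import pandas as pd
# Open the JSON file and parse it into a Python dictionary
with open('data.json') as f:
    data1 = json.load(f)
# Convert the dictionary into a DataFrame (one row per top-level key).
# pd.read_json expects a path or file-like object, not an already parsed dict.
df = pd.DataFrame.from_dict(data1, orient='index')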
answered Aug 1, 2022 at 4:56
The error message "ValueError: DataFrame constructor not properly called!" appears when you use a data analysis tool like pandas or PySpark and hand the DataFrame constructor input that it cannot turn into a table. This article explains what causes the error and how you can fix it.
Contents
- Why Is the Dataframe Constructor Not Properly Called?
- – You Provided a String Representation to the Dataframe Constructor
- – You Misused the Input Types to Pandas Dataframe
- – You Used the Wrong Parameter to Pandas Dataframe
- – There Is a Mismatch Between Python and Azure-ML Libraries
- How To Fix Dataframe Constructor Not Called
- – Use a Dictionary for Pandas.dataframe
- – Provide the Right Input to the Dataframe
- – Use the Right Parameter for the DataFrame
- – Switch Python Version in Azure
- Useful Information About the Dataframe Error
- – What Is a Value Error?
- – How To Convert Json to a Dataframe?
- – How To Convert a List to a Dataframe?
- – How To Make Research About Python and Dataframe?
- – How Do You Create a Dataframe in Python?
- – How To Create a Dataframe From Another Dataframe in Pandas?
- Conclusion
The DataFrame Constructor is not called properly because: you provided a string representation to the pandas.DataFrame Constructor, you misused the input types to Pandas Dataframe, you used the wrong parameter to Pandas DataFrame, or there is a mismatch between Python and azure-ml libraries.
Let’s take a closer look at these possible reasons.
– You Provided a String Representation to the Dataframe Constructor
The DataFrame constructor requires its input to be an ndarray, an iterable such as a list, a dictionary, or another DataFrame. A plain string does not count, even if its contents look like a list or a dictionary, so passing one leads to a ValueError.
In the sample code below, myData holds such a string, so the call raises a ValueError.
dataframe_test = DataFrame(index = idx, data=(myData))
– You Misused the Input Types to Pandas Dataframe
pandas clearly defines the input types that you must use, but it is easy to assume that it will accept anything. What if you supply a number instead? You will get the ValueError saying that the constructor was not called properly.
For example, in the code below, we've supplied a number to pandas, so running the code leads to a ValueError.
import pandas as p_d
p_d.DataFrame(5000)
Furthermore, in the example below, we passed a plain string to the DataFrame, so we get the same ValueError.
data_frame = p_d.DataFrame('Email')
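To see which inputs pandas rejects, a small helper can report the outcome instead of crashing. This is only a sketch; try_build is a hypothetical name introduced here, not a pandas function.

```python
import pandas as pd


def try_build(data):
    """Return the DataFrame, or the error text if pandas rejects the input."""
    try:
        return pd.DataFrame(data)
    except ValueError as exc:
        return f"ValueError: {exc}"


print(try_build(5000))      # a bare number is rejected
print(try_build("Email"))   # a bare string is rejected
print(try_build([5000]))    # a one-element list is accepted
print(try_build({"Email": ["a@example.com"]}))  # a dict of lists is accepted
```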
– You Used the Wrong Parameter to Pandas Dataframe
You can also see this ValueError when the value you pass to the constructor is the scalar result of a calculation, for example in image-processing code. In a long function, it can be hard to spot where that happens. We present an example in the next code.
In the code below, the area computed for each circle is a single number, and that number is passed straight to pandas.DataFrame. As a result, the code raises the error.
def detect_Image(self):
    img = self.aux4
    (_, contours, _) = cv.findContours(img, cv.RETR_EXTERNAL,
                                       cv.CHAIN_APPROX_SIMPLE)
    c = [cv.minEnclosingCircle(cnt) for cnt in contours]
    print('Circles: ', len(c), 'Type: ', type(c))
    for i in range(len(c)):
        # minEnclosingCircle returns ((x, y), radius); take the radius
        r = c[i][1]
        a = pi * (r * r)
        a = a * self.Scal
        # a is a single number here, so this call raises the ValueError
        self.res = pd.DataFrame(a)
        print(self.res)
– There Is a Mismatch Between Python and Azure-ML Libraries
You can run into an error with pandas in a Python script when doing machine learning with Microsoft Azure. This error will happen if you are running Python 2.7.7 (sklearn v.0.15.1). As a result, it’ll return a non-zero exit code.
How To Fix Dataframe Constructor Not Called
You can fix the "DataFrame constructor not properly called" error by passing a dictionary to pandas.DataFrame, providing the right input to the DataFrame, using the right parameter for the DataFrame, or switching the Python version in Azure.
In this section, we’ll be going more in-depth regarding the steps you can take to solve this error.
– Use a Dictionary for Pandas.dataframe
Passing a real dictionary (or list) to pandas.DataFrame prevents the "constructor not properly called" error. If your data arrives as a string, first convert it to a Python object, for example with ast.literal_eval, and pass the result to pandas.DataFrame. Another option is to build the dictionary or list directly instead of building a string.
In the example below, we’ve made corrections to the code. The corrections will prevent the ValueError.
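import ast
import pandas as pd
# The data arrives as the string representation of a list
myData = '["-35054", "2018-09-15T09:09:23.340Z", "2018-09-15T09:09:23.340Z"]'
# Parse the string into a real Python object
parsed = ast.literal_eval(myData)
# Use the parsed object as input to the DataFrame
# (pass index=idx as well if you have an index of matching length)
df_test2 = pd.DataFrame(data=parsed)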
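The same approach works when the string holds a list of dictionaries, as in the key/value case from the original question. The sketch below assumes a sample string invented for illustration; ast.literal_eval is used instead of eval because it only evaluates Python literals.

```python
import ast

import pandas as pd

# A string that merely looks like a list of dicts (hypothetical sample value).
raw = "[{'key': '[GlobalProgramSizeInThousands]', 'value': '1000'}]"

# Passing the string itself is rejected by pandas.
try:
    pd.DataFrame(raw)
except ValueError as exc:
    print("rejected:", exc)

# Parse the string into a real list of dicts, then build the DataFrame.
records = ast.literal_eval(raw)
df = pd.DataFrame(records, columns=["key", "value"])

print(df)
```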
– Provide the Right Input to the Dataframe
The right input to DataFrame will prevent the ValueError about the Constructor. In the code below, we modified a previous example. Only this time, the DataFrame has the correct input.
data_frame = p_d.DataFrame(columns=['Email'])
The same rule applies in libraries that build DataFrames for you: in PySpark and in Plotly Express, passing the right input type avoids the "ValueError: DataFrame constructor not properly called!" error. With scikit-learn, too, the traceback ends in the following line if a DataFrame is built from the wrong kind of value:
raise ValueError("DataFrame constructor not properly called!")
– Use the Right Parameter for the DataFrame
Using the right parameter for the DataFrame will eliminate the error that will occur with DataFrame. Earlier in this article, we showed you an example of a code that does some image processing. However, the code resulted in an error stating that the Constructor was not called properly.
In the code below, we’ve made corrections that will prevent the error.
def detect_Image(self):
    img = self.aux4
    (_, contours, _) = cv.findContours(img, cv.RETR_EXTERNAL,
                                       cv.CHAIN_APPROX_SIMPLE)
    circles = [cv.minEnclosingCircle(cnt) for cnt in contours]
    print(f"Circles: {len(circles)}, Type: {type(circles)}")
    # Compute one scaled area per circle and collect them in a list
    areas = [pi * (radius * radius) * self.Scal for _, radius in circles]
    # A dict that maps a column name to a list is valid input
    self.res = pd.DataFrame({"Area": areas})
    print(self.res)
In the code above, a Python dictionary whose value is a list eliminates the error. A related error to watch for is "ValueError: If using all scalar values, you must pass an index", which pandas raises when every value in the dictionary is a scalar and no index is given.
The following is an example of what we are talking about:
x = 2
y = 3
data_frame = pd.DataFrame({'X': x, 'Y': y}, index=[0])
However, the following line will produce a DataFrame error:
data_frame = pd.DataFrame({'X': x, 'Y': y})
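A short sketch of the two ways around the all-scalar case, with the failing call wrapped in try/except so the script keeps running (the variable names are chosen here for illustration):

```python
import pandas as pd

x, y = 2, 3

# Option 1: keep the scalars and pass an explicit one-element index.
with_index = pd.DataFrame({"X": x, "Y": y}, index=[0])

# Option 2: wrap each scalar in a list so pandas can infer the length.
with_lists = pd.DataFrame({"X": [x], "Y": [y]})

# Scalars only and no index: pandas cannot tell how many rows to build.
try:
    pd.DataFrame({"X": x, "Y": y})
except ValueError as exc:
    print("ValueError:", exc)

print(with_index)
print(with_lists)
```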
– Switch Python Version in Azure
You can get rid of the pandas error by switching the Python version in Azure. You can switch the Python version in an Execute Python Script module: select the Python Script, and a drop-down menu will appear on the right-hand side of the page. Choose another Python version.
Useful Information About the Dataframe Error
In this section, we’ll discuss additional relevant information that will help you understand why this error occurred in the first place. We’ll make it an interactive section, so we’ll pose some commonly-asked questions and we’ll give you the answers as well.
– What Is a Value Error?
A ValueError is an exception that Python raises when you supply a function with an argument that has the right type but an inappropriate value. pandas uses the same exception when the DataFrame constructor receives data it cannot interpret, such as a plain string where it expected a dictionary or a list.
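The difference between a ValueError and a TypeError can be illustrated with the built-in int() function. The helper name error_name is made up for this sketch.

```python
def error_name(func, arg):
    """Call func(arg) and return the name of the exception it raises, if any."""
    try:
        func(arg)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__
    return "no error"


# A string is an acceptable type for int(), but "abc" is not a valid number.
print(error_name(int, "abc"))

# None is not an acceptable type for int() at all.
print(error_name(int, None))

# A well-formed numeric string converts without any error.
print(error_name(int, "42"))
```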
– How To Convert Json to a Dataframe?
You can convert JSON to DataFrame by using the read_json() function available in Pandas. However, when you are working with a nested JSON, you can use the json_normalize() function.
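A minimal sketch of both functions, using sample records invented for illustration. The JSON text is wrapped in io.StringIO because recent pandas versions expect a path or file-like object rather than a raw JSON string.

```python
import io

import pandas as pd

# Flat JSON: read_json turns each object into a row.
flat_json = '[{"name": "Frank", "age": 34}, {"name": "Tony", "age": 28}]'
flat_df = pd.read_json(io.StringIO(flat_json))

# Nested JSON (already parsed into Python objects): json_normalize
# flattens inner dictionaries into dotted column names.
nested = [
    {"name": "Frank", "address": {"city": "Oslo", "zip": "0150"}},
    {"name": "Tony", "address": {"city": "Bergen", "zip": "5003"}},
]
nested_df = pd.json_normalize(nested)

print(flat_df)
print(nested_df)
```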
– How To Convert a List to a Dataframe?
You can convert a list to DataFrame by using a list with index and column names, using zip() function, creating from the multidimensional list, using a multidimensional list with column name, or using a list in the dictionary.
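The approaches listed above can be sketched as follows, reusing the names and ages from the example later in this article:

```python
import pandas as pd

names = ["Frank", "Michael", "Maddison", "Tony"]
ages = [34, 25, 29, 28]

# A single list with explicit index and column names.
with_index = pd.DataFrame(ages, index=names, columns=["Age"])

# Two lists paired row by row with zip().
from_zip = pd.DataFrame(list(zip(names, ages)), columns=["Name", "Age"])

# A multidimensional list (one inner list per row) with column names.
from_rows = pd.DataFrame([["Frank", 34], ["Michael", 25]], columns=["Name", "Age"])

# Lists used as the values of a dictionary (one list per column).
from_dict = pd.DataFrame({"Name": names, "Age": ages})

print(with_index, from_zip, from_rows, from_dict, sep="\n\n")
```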
– How To Make Research About Python and Dataframe?
You can make further research about Python and DataFrame by checking the web for questions tagged Python. An example of such a site is StackOverflow. The research we are talking about should be about how to prevent the ValueError in DataFrame. When you do this, you’ll be in a better position to know what to check when this error occurs.
– How Do You Create a Dataframe in Python?
You can create a DataFrame in Python by importing Pandas into your code, creating the data as lists, creating a DataFrame using Pandas.DataFrame(), and printing the results.
The code below is the implementation of these steps.
import pandas as pd
# assign data
data = {'Name': ['Frank', 'Michael', 'Maddison', 'Tony'], 'Age': [34, 25, 29, 28]}
# Create a DataFrame
data_frame = pd.DataFrame(data)
# Print the data
print(data_frame)
– How To Create a Dataframe From Another Dataframe in Pandas?
The DataFrame.assign() method allows you to create a new DataFrame from another DataFrame. It does this by assigning new columns to a DataFrame, so it returns a new object that contains the original columns added to the new ones.
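A brief sketch of assign(), with a column name (AgeInMonths) invented for illustration. The original DataFrame is left unchanged.

```python
import pandas as pd

original = pd.DataFrame({"Name": ["Frank", "Tony"], "Age": [34, 28]})

# assign() returns a new DataFrame; the lambda receives the original
# DataFrame and computes the values of the new column from it.
extended = original.assign(AgeInMonths=lambda d: d["Age"] * 12)

print(original.columns.tolist())
print(extended)
```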
Conclusion
This article explained how to fix the DataFrame error when using Pandas and related libraries. What’s more, we also answered some frequent questions about the error as a whole. The following are the main points that we discussed in this guide:
- A wrong input will cause a DataFrame ValueError.
- In Azure, mismatch in Python versions can cause a DataFrame ValueError.
- The Constructor in the DataFrame expects values like a dictionary or another DataFrame.
- The DataFrame.assign() method will create a DataFrame from another DataFrame.
At this stage, you are all set to fix the DataFrame ValueError in Pandas and related Python libraries.
Labels
bug
Something isn’t working
Comments
While running a simple linear regression model on kaggle using pycaret, I get the error
DataFrame constructor not properly called on running setup. The dataset contains only two columns.
df=pd.read_csv("/kaggle/input/salary-data-simple-linear-regression/Salary_Data.csv")
setup_data1 = setup(data = df, target = 'Salary', session_id=123)
Logs:
ValueError Traceback (most recent call last)
in
----> 1 setup_data1 = setup(data = df, target = 'Salary', session_id=123)
/opt/conda/lib/python3.6/site-packages/pycaret/regression.py in setup(data, target, train_size, sampling, sample_estimator, categorical_features, categorical_imputation, ordinal_features, high_cardinality_features, high_cardinality_method, numeric_features, numeric_imputation, date_features, ignore_features, normalize, normalize_method, transformation, transformation_method, handle_unknown_categorical, unknown_categorical_method, pca, pca_method, pca_components, ignore_low_variance, combine_rare_levels, rare_level_threshold, bin_numeric_features, remove_outliers, outliers_threshold, remove_multicollinearity, multicollinearity_threshold, create_clusters, cluster_iter, polynomial_features, polynomial_degree, trigonometry_features, polynomial_threshold, group_features, group_names, feature_selection, feature_selection_threshold, feature_interaction, feature_ratio, interaction_threshold, transform_target, transform_target_method, session_id, silent, profile)
955 target_transformation = transform_target, #new
956 target_transformation_method = transform_target_method_pass, #new
--> 957 random_state = seed)
958
959 progress.value += 1
/opt/conda/lib/python3.6/site-packages/pycaret/preprocess.py in Preprocess_Path_One(train_data, target_variable, ml_usecase, test_data, categorical_features, numerical_features, time_features, features_todrop, display_types, imputation_type, numeric_imputation_strategy, categorical_imputation_strategy, apply_zero_nearZero_variance, club_rare_levels, rara_level_threshold_percentage, apply_untrained_levels_treatment, untrained_levels_treatment_method, apply_ordinal_encoding, ordinal_columns_and_categories, apply_cardinality_reduction, cardinal_method, cardinal_features, apply_binning, features_to_binn, apply_grouping, group_name, features_to_group_ListofList, apply_polynomial_trigonometry_features, max_polynomial, trigonometry_calculations, top_poly_trig_features_to_select_percentage, scale_data, scaling_method, Power_transform_data, Power_transform_method, target_transformation, target_transformation_method, remove_outliers, outlier_contamination_percentage, outlier_methods, apply_feature_selection, feature_selection_top_features_percentage, remove_multicollinearity, maximum_correlation_between_features, remove_perfect_collinearity, apply_feature_interactions, feature_interactions_to_apply, feature_interactions_top_features_to_select_percentage, cluster_entire_data, range_of_clusters_to_try, apply_pca, pca_method, pca_variance_retained_or_number_of_components, random_state)
2538 return(pipe.fit_transform(train_data),pipe.transform(test_data))
2539 else:
-> 2540 return(pipe.fit_transform(train_data))
2541
2542
/opt/conda/lib/python3.6/site-packages/sklearn/pipeline.py in fit_transform(self, X, y, **fit_params)
381 """
382 last_step = self._final_estimator
--> 383 Xt, fit_params = self._fit(X, y, **fit_params)
384 with _print_elapsed_time('Pipeline',
385 self._log_message(len(self.steps) - 1)):
/opt/conda/lib/python3.6/site-packages/sklearn/pipeline.py in _fit(self, X, y, **fit_params)
311 message_clsname='Pipeline',
312 message=self._log_message(step_idx),
--> 313 **fit_params_steps[name])
314 # Replace the transformer of the step with the fitted
315 # transformer. This is necessary when loading the transformer
/opt/conda/lib/python3.6/site-packages/joblib/memory.py in __call__(self, *args, **kwargs)
353
354 def __call__(self, *args, **kwargs):
--> 355 return self.func(*args, **kwargs)
356
357 def call_and_shelve(self, *args, **kwargs):
/opt/conda/lib/python3.6/site-packages/sklearn/pipeline.py in _fit_transform_one(transformer, X, y, weight, message_clsname, message, **fit_params)
724 with _print_elapsed_time(message_clsname, message):
725 if hasattr(transformer, 'fit_transform'):
--> 726 res = transformer.fit_transform(X, y, **fit_params)
727 else:
728 res = transformer.fit(X, y, **fit_params).transform(X)
/opt/conda/lib/python3.6/site-packages/pycaret/preprocess.py in fit_transform(self, dataset, y)
1957 def fit_transform(self,dataset,y=None):
1958 data = dataset.copy()
-> 1959 corr = pd.DataFrame(np.corrcoef(data.drop(self.target,axis=1).T))
1960 corr.columns = data.drop(self.target,axis=1).columns
1961 corr.index = data.drop(self.target,axis=1).columns
/opt/conda/lib/python3.6/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
483 )
484 else:
--> 485 raise ValueError("DataFrame constructor not properly called!")
486
487 NDFrame.__init__(self, mgr, fastpath=True)
ValueError: DataFrame constructor not properly called!
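One plausible reading of this traceback, offered as an assumption because the thread does not state the root cause: with only two columns, dropping the target leaves a single feature, and np.corrcoef on a single variable returns a scalar instead of a matrix. The sketch below reproduces that with a made-up two-column dataset (the YearsExperience column name and all values are hypothetical) and shows np.atleast_2d as one possible workaround.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column dataset standing in for Salary_Data.csv.
df = pd.DataFrame(
    {
        "YearsExperience": [1.1, 2.0, 3.2, 4.5],
        "Salary": [39000.0, 43500.0, 54400.0, 61100.0],
    }
)

features = df.drop("Salary", axis=1)   # only one feature column remains
corr_values = np.corrcoef(features.T)  # collapses to a scalar for one variable
print("dimensions:", np.ndim(corr_values))

# Passing that scalar to DataFrame reproduces the error from the traceback.
try:
    pd.DataFrame(corr_values)
except ValueError as exc:
    print("ValueError:", exc)

# Possible workaround: restore the 1x1 matrix shape before building the frame.
corr = pd.DataFrame(
    np.atleast_2d(corr_values), index=features.columns, columns=features.columns
)
print(corr)
```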
Hi,
Thanks for logging the issue. I don’t see if you have imported regression module in above code. Can you please confirm if you have run the below code before running setup function:
from pycaret.regression import *
Thanks.
Yes, I have made all the necessary imports; otherwise the error would be module not found. I didn't mention that in the code above since I thought that's implied.
Thanks. Can you try this on a local Jupyter notebook or Google colab and see if you can reproduce the error?
Can you also share the csv file?
@DaGuT Thank you so so much. I will start working on PR’s in weeks time.
@pycaret issue solved and can be closed?
I have the same issue: one numeric, one label data type
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
1 from pycaret.regression import *
----> 2 exp1 = setup(df, target = 'y')
~\Anaconda3\lib\site-packages\pycaret\regression.py in setup(data, target, train_size, sampling, sample_estimator, categorical_features, categorical_imputation, ordinal_features, high_cardinality_features, high_cardinality_method, numeric_features, numeric_imputation, date_features, ignore_features, normalize, normalize_method, transformation, transformation_method, handle_unknown_categorical, unknown_categorical_method, pca, pca_method, pca_components, ignore_low_variance, combine_rare_levels, rare_level_threshold, bin_numeric_features, remove_outliers, outliers_threshold, remove_multicollinearity, multicollinearity_threshold, create_clusters, cluster_iter, polynomial_features, polynomial_degree, trigonometry_features, polynomial_threshold, group_features, group_names, feature_selection, feature_selection_threshold, feature_interaction, feature_ratio, interaction_threshold, transform_target, transform_target_method, session_id, silent, profile)
955 target_transformation = transform_target, #new
956 target_transformation_method = transform_target_method_pass, #new
--> 957 random_state = seed)
958
959 progress.value += 1
~\Anaconda3\lib\site-packages\pycaret\preprocess.py in Preprocess_Path_One(train_data, target_variable, ml_usecase, test_data, categorical_features, numerical_features, time_features, features_todrop, display_types, imputation_type, numeric_imputation_strategy, categorical_imputation_strategy, apply_zero_nearZero_variance, club_rare_levels, rara_level_threshold_percentage, apply_untrained_levels_treatment, untrained_levels_treatment_method, apply_ordinal_encoding, ordinal_columns_and_categories, apply_cardinality_reduction, cardinal_method, cardinal_features, apply_binning, features_to_binn, apply_grouping, group_name, features_to_group_ListofList, apply_polynomial_trigonometry_features, max_polynomial, trigonometry_calculations, top_poly_trig_features_to_select_percentage, scale_data, scaling_method, Power_transform_data, Power_transform_method, target_transformation, target_transformation_method, remove_outliers, outlier_contamination_percentage, outlier_methods, apply_feature_selection, feature_selection_top_features_percentage, remove_multicollinearity, maximum_correlation_between_features, remove_perfect_collinearity, apply_feature_interactions, feature_interactions_to_apply, feature_interactions_top_features_to_select_percentage, cluster_entire_data, range_of_clusters_to_try, apply_pca, pca_method, pca_variance_retained_or_number_of_components, random_state)
2538 return(pipe.fit_transform(train_data),pipe.transform(test_data))
2539 else:
-> 2540 return(pipe.fit_transform(train_data))
2541
2542
~\Anaconda3\lib\site-packages\sklearn\pipeline.py in fit_transform(self, X, y, **fit_params)
381 """
382 last_step = self._final_estimator
--> 383 Xt, fit_params = self._fit(X, y, **fit_params)
384 with _print_elapsed_time('Pipeline',
385 self._log_message(len(self.steps) - 1)):
~\Anaconda3\lib\site-packages\sklearn\pipeline.py in _fit(self, X, y, **fit_params)
311 message_clsname='Pipeline',
312 message=self._log_message(step_idx),
--> 313 **fit_params_steps[name])
314 # Replace the transformer of the step with the fitted
315 # transformer. This is necessary when loading the transformer
~\Anaconda3\lib\site-packages\joblib\memory.py in __call__(self, *args, **kwargs)
353
354 def __call__(self, *args, **kwargs):
--> 355 return self.func(*args, **kwargs)
356
357 def call_and_shelve(self, *args, **kwargs):
~\Anaconda3\lib\site-packages\sklearn\pipeline.py in _fit_transform_one(transformer, X, y, weight, message_clsname, message, **fit_params)
724 with _print_elapsed_time(message_clsname, message):
725 if hasattr(transformer, 'fit_transform'):
--> 726 res = transformer.fit_transform(X, y, **fit_params)
727 else:
728 res = transformer.fit(X, y, **fit_params).transform(X)
~\Anaconda3\lib\site-packages\pycaret\preprocess.py in fit_transform(self, dataset, y)
1957 def fit_transform(self,dataset,y=None):
1958 data = dataset.copy()
-> 1959 corr = pd.DataFrame(np.corrcoef(data.drop(self.target,axis=1).T))
1960 corr.columns = data.drop(self.target,axis=1).columns
1961 corr.index = data.drop(self.target,axis=1).columns
~\Anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
466 dtype=values.dtype, copy=False)
467 else:
--> 468 raise ValueError('DataFrame constructor not properly called!')
469
470 NDFrame.__init__(self, mgr, fastpath=True)
ValueError: DataFrame constructor not properly called!
github-actions
bot
locked as resolved and limited conversation to collaborators
May 18, 2022