Contents
- SQL error 53000: workfile per query size limit exceeded
- Appendix A. PostgreSQL Error Codes
- Monitoring a Greenplum System
SQL error 53000: workfile per query size limit exceeded
Greenplum Database creates spill files, also known as workfiles, on disk if it does not have sufficient memory to run an SQL query in memory.
The maximum number of spill files for a given query is governed by the gp_workfile_limit_files_per_query server configuration parameter setting. The default value of 100,000 spill files is sufficient for the majority of queries.
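To inspect or change the limit for the current session (a sketch; the value shown is illustrative):

```sql
SHOW gp_workfile_limit_files_per_query;
SET gp_workfile_limit_files_per_query = 200000;  -- illustrative value
```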
If a query creates more than the configured number of spill files, Greenplum Database returns this error:
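```
ERROR: number of workfiles per query limit exceeded
```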
Greenplum Database may generate a large number of spill files when:
- Data skew is present in the queried data. To check for data skew, see Checking for Data Distribution Skew.
- The amount of memory allocated for the query is too low. You control the maximum amount of memory that can be used by a query with the Greenplum Database server configuration parameters max_statement_mem and statement_mem, or through resource group or resource queue configuration.
You might be able to run the query successfully by changing the query, changing the data distribution, or changing the system memory configuration. The gp_toolkit gp_workfile_* views display spill file usage information. You can use this information to troubleshoot and tune queries. The gp_workfile_* views are described in Checking Query Disk Spill Space Usage.
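For example, a quick look at current spill usage per query (a sketch using the gp_workfile_usage_per_query view described below):

```sql
SELECT datname, sess_id, segid, size, numfiles
FROM gp_toolkit.gp_workfile_usage_per_query
ORDER BY size DESC;
```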
Additional documentation resources:
- Memory Consumption Parameters identifies the memory-related spill file server configuration parameters.
- Using Resource Groups describes memory and spill considerations when resource group-based resource management is active.
- Using Resource Queues describes memory and spill considerations when resource queue-based resource management is active.
Parent topic: Querying Data
Source
Appendix A. PostgreSQL Error Codes
All messages emitted by the PostgreSQL server are assigned five-character error codes that follow the SQL standard's conventions for "SQLSTATE" codes. Applications that need to know which error condition has occurred should usually test the error code rather than the textual error message. The error codes are unlikely to change across PostgreSQL releases, and they are not subject to change due to localization of error messages. Note that some, but not all, of the error codes produced by PostgreSQL are defined by the SQL standard; some additional error codes for conditions not defined by the standard have been invented or borrowed from other databases.
According to the standard, the first two characters of an error code denote a class of errors, while the last three characters indicate a specific condition within that class. Thus, an application that does not recognize the specific error code can still infer what to do from the error class.
Table A-1 lists all the error codes defined in PostgreSQL 9.4.1. (Some are not actually used at present, although they are defined by the SQL standard.) The error classes are also shown. For each error class there is a "standard" error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have a more specific code assigned.
The symbol shown in the "Condition Name" column is the condition name to use in PL/pgSQL. Condition names can be written in either upper or lower case. (Note that PL/pgSQL does not recognize warning, as opposed to error, condition names; those are classes 00, 01, and 02.)
For some types of errors, the server reports the name of a database object (a table, table column, data type, or constraint) associated with the error; for example, the name of the unique constraint that caused a unique_violation error. Such names are supplied in separate fields of the error report message so that applications need not try to extract them from the possibly localized human-readable text of the message. As of PostgreSQL 9.3, complete coverage exists only for errors in SQLSTATE class 23 (integrity constraint violations), but this is likely to be extended in the future.
Source
SQL error 53000: workfile per query size limit exceeded
All messages emitted by the PostgreSQL server are assigned five-character error codes that follow the SQL standard's conventions for "SQLSTATE" codes. Applications that need to know which error condition has occurred should usually test the error code, rather than looking at the textual error message. The error codes are less likely to change across PostgreSQL releases, and also are not subject to change due to localization of error messages. Note that some, but not all, of the error codes produced by PostgreSQL are defined by the SQL standard; some additional error codes for conditions not defined by the standard have been invented or borrowed from other databases.
According to the standard, the first two characters of an error code denote a class of errors, while the last three characters indicate a specific condition within that class. Thus, an application that does not recognize the specific error code might still be able to infer what to do from the error class.
Table A.1 lists all the error codes defined in PostgreSQL 15.1. (Some are not actually used at present, but are defined by the SQL standard.) The error classes are also shown. For each error class there is a "standard" error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have any more-specific code assigned.
The symbol shown in the column "Condition Name" is the condition name to use in PL/pgSQL. Condition names can be written in either upper or lower case. (Note that PL/pgSQL does not recognize warning, as opposed to error, condition names; those are classes 00, 01, and 02.)
For some types of errors, the server reports the name of a database object (a table, table column, data type, or constraint) associated with the error; for example, the name of the unique constraint that caused a unique_violation error. Such names are supplied in separate fields of the error report message so that applications need not try to extract them from the possibly-localized human-readable text of the message. As of PostgreSQL 9.3, complete coverage for this feature exists only for errors in SQLSTATE class 23 (integrity constraint violation), but this is likely to be expanded in future.
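As a brief illustration of condition names (a minimal sketch; division_by_zero corresponds to code 22012 in the table below):

```sql
DO $$
BEGIN
  PERFORM 1/0;                 -- raises SQLSTATE 22012
EXCEPTION
  WHEN division_by_zero THEN   -- PL/pgSQL condition name for 22012
    RAISE NOTICE 'caught SQLSTATE %', SQLSTATE;
END
$$;
```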
Table A.1. PostgreSQL Error Codes
Error Code | Condition Name |
---|---|
Class 00 — Successful Completion | |
00000 | successful_completion |
Class 01 — Warning | |
01000 | warning |
0100C | dynamic_result_sets_returned |
01008 | implicit_zero_bit_padding |
01003 | null_value_eliminated_in_set_function |
01007 | privilege_not_granted |
01006 | privilege_not_revoked |
01004 | string_data_right_truncation |
01P01 | deprecated_feature |
Class 02 — No Data (this is also a warning class per the SQL standard) | |
02000 | no_data |
02001 | no_additional_dynamic_result_sets_returned |
Class 03 — SQL Statement Not Yet Complete | |
03000 | sql_statement_not_yet_complete |
Class 08 — Connection Exception | |
08000 | connection_exception |
08003 | connection_does_not_exist |
08006 | connection_failure |
08001 | sqlclient_unable_to_establish_sqlconnection |
08004 | sqlserver_rejected_establishment_of_sqlconnection |
08007 | transaction_resolution_unknown |
08P01 | protocol_violation |
Class 09 — Triggered Action Exception | |
09000 | triggered_action_exception |
Class 0A — Feature Not Supported | |
0A000 | feature_not_supported |
Class 0B — Invalid Transaction Initiation | |
0B000 | invalid_transaction_initiation |
Class 0F — Locator Exception | |
0F000 | locator_exception |
0F001 | invalid_locator_specification |
Class 0L — Invalid Grantor | |
0L000 | invalid_grantor |
0LP01 | invalid_grant_operation |
Class 0P — Invalid Role Specification | |
0P000 | invalid_role_specification |
Class 0Z — Diagnostics Exception | |
0Z000 | diagnostics_exception |
0Z002 | stacked_diagnostics_accessed_without_active_handler |
Class 20 — Case Not Found | |
20000 | case_not_found |
Class 21 — Cardinality Violation | |
21000 | cardinality_violation |
Class 22 — Data Exception | |
22000 | data_exception |
2202E | array_subscript_error |
22021 | character_not_in_repertoire |
22008 | datetime_field_overflow |
22012 | division_by_zero |
22005 | error_in_assignment |
2200B | escape_character_conflict |
22022 | indicator_overflow |
22015 | interval_field_overflow |
2201E | invalid_argument_for_logarithm |
22014 | invalid_argument_for_ntile_function |
22016 | invalid_argument_for_nth_value_function |
2201F | invalid_argument_for_power_function |
2201G | invalid_argument_for_width_bucket_function |
22018 | invalid_character_value_for_cast |
22007 | invalid_datetime_format |
22019 | invalid_escape_character |
2200D | invalid_escape_octet |
22025 | invalid_escape_sequence |
22P06 | nonstandard_use_of_escape_character |
22010 | invalid_indicator_parameter_value |
22023 | invalid_parameter_value |
22013 | invalid_preceding_or_following_size |
2201B | invalid_regular_expression |
2201W | invalid_row_count_in_limit_clause |
2201X | invalid_row_count_in_result_offset_clause |
2202H | invalid_tablesample_argument |
2202G | invalid_tablesample_repeat |
22009 | invalid_time_zone_displacement_value |
2200C | invalid_use_of_escape_character |
2200G | most_specific_type_mismatch |
22004 | null_value_not_allowed |
22002 | null_value_no_indicator_parameter |
22003 | numeric_value_out_of_range |
2200H | sequence_generator_limit_exceeded |
22026 | string_data_length_mismatch |
22001 | string_data_right_truncation |
22011 | substring_error |
22027 | trim_error |
22024 | unterminated_c_string |
2200F | zero_length_character_string |
22P01 | floating_point_exception |
22P02 | invalid_text_representation |
22P03 | invalid_binary_representation |
22P04 | bad_copy_file_format |
22P05 | untranslatable_character |
2200L | not_an_xml_document |
2200M | invalid_xml_document |
2200N | invalid_xml_content |
2200S | invalid_xml_comment |
2200T | invalid_xml_processing_instruction |
22030 | duplicate_json_object_key_value |
22031 | invalid_argument_for_sql_json_datetime_function |
22032 | invalid_json_text |
22033 | invalid_sql_json_subscript |
22034 | more_than_one_sql_json_item |
22035 | no_sql_json_item |
22036 | non_numeric_sql_json_item |
22037 | non_unique_keys_in_a_json_object |
22038 | singleton_sql_json_item_required |
22039 | sql_json_array_not_found |
2203A | sql_json_member_not_found |
2203B | sql_json_number_not_found |
2203C | sql_json_object_not_found |
2203D | too_many_json_array_elements |
2203E | too_many_json_object_members |
2203F | sql_json_scalar_required |
2203G | sql_json_item_cannot_be_cast_to_target_type |
Class 23 — Integrity Constraint Violation | |
23000 | integrity_constraint_violation |
23001 | restrict_violation |
23502 | not_null_violation |
23503 | foreign_key_violation |
23505 | unique_violation |
23514 | check_violation |
23P01 | exclusion_violation |
Class 24 — Invalid Cursor State | |
24000 | invalid_cursor_state |
Class 25 — Invalid Transaction State | |
25000 | invalid_transaction_state |
25001 | active_sql_transaction |
25002 | branch_transaction_already_active |
25008 | held_cursor_requires_same_isolation_level |
25003 | inappropriate_access_mode_for_branch_transaction |
25004 | inappropriate_isolation_level_for_branch_transaction |
25005 | no_active_sql_transaction_for_branch_transaction |
25006 | read_only_sql_transaction |
25007 | schema_and_data_statement_mixing_not_supported |
25P01 | no_active_sql_transaction |
25P02 | in_failed_sql_transaction |
25P03 | idle_in_transaction_session_timeout |
Class 26 — Invalid SQL Statement Name | |
26000 | invalid_sql_statement_name |
Class 27 — Triggered Data Change Violation | |
27000 | triggered_data_change_violation |
Class 28 — Invalid Authorization Specification | |
28000 | invalid_authorization_specification |
28P01 | invalid_password |
Class 2B — Dependent Privilege Descriptors Still Exist | |
2B000 | dependent_privilege_descriptors_still_exist |
2BP01 | dependent_objects_still_exist |
Class 2D — Invalid Transaction Termination | |
2D000 | invalid_transaction_termination |
Class 2F — SQL Routine Exception | |
2F000 | sql_routine_exception |
2F005 | function_executed_no_return_statement |
2F002 | modifying_sql_data_not_permitted |
2F003 | prohibited_sql_statement_attempted |
2F004 | reading_sql_data_not_permitted |
Class 34 — Invalid Cursor Name | |
34000 | invalid_cursor_name |
Class 38 — External Routine Exception | |
38000 | external_routine_exception |
38001 | containing_sql_not_permitted |
38002 | modifying_sql_data_not_permitted |
38003 | prohibited_sql_statement_attempted |
38004 | reading_sql_data_not_permitted |
Class 39 — External Routine Invocation Exception | |
39000 | external_routine_invocation_exception |
39001 | invalid_sqlstate_returned |
39004 | null_value_not_allowed |
39P01 | trigger_protocol_violated |
39P02 | srf_protocol_violated |
39P03 | event_trigger_protocol_violated |
Class 3B — Savepoint Exception | |
3B000 | savepoint_exception |
3B001 | invalid_savepoint_specification |
Class 3D — Invalid Catalog Name | |
3D000 | invalid_catalog_name |
Class 3F — Invalid Schema Name | |
3F000 | invalid_schema_name |
Class 40 — Transaction Rollback | |
40000 | transaction_rollback |
40002 | transaction_integrity_constraint_violation |
40001 | serialization_failure |
40003 | statement_completion_unknown |
40P01 | deadlock_detected |
Class 42 — Syntax Error or Access Rule Violation | |
42000 | syntax_error_or_access_rule_violation |
42601 | syntax_error |
42501 | insufficient_privilege |
42846 | cannot_coerce |
42803 | grouping_error |
42P20 | windowing_error |
42P19 | invalid_recursion |
42830 | invalid_foreign_key |
42602 | invalid_name |
42622 | name_too_long |
42939 | reserved_name |
42804 | datatype_mismatch |
42P18 | indeterminate_datatype |
42P21 | collation_mismatch |
42P22 | indeterminate_collation |
42809 | wrong_object_type |
428C9 | generated_always |
42703 | undefined_column |
42883 | undefined_function |
42P01 | undefined_table |
42P02 | undefined_parameter |
42704 | undefined_object |
42701 | duplicate_column |
42P03 | duplicate_cursor |
42P04 | duplicate_database |
42723 | duplicate_function |
42P05 | duplicate_prepared_statement |
42P06 | duplicate_schema |
42P07 | duplicate_table |
42712 | duplicate_alias |
42710 | duplicate_object |
42702 | ambiguous_column |
42725 | ambiguous_function |
42P08 | ambiguous_parameter |
42P09 | ambiguous_alias |
42P10 | invalid_column_reference |
42611 | invalid_column_definition |
42P11 | invalid_cursor_definition |
42P12 | invalid_database_definition |
42P13 | invalid_function_definition |
42P14 | invalid_prepared_statement_definition |
42P15 | invalid_schema_definition |
42P16 | invalid_table_definition |
42P17 | invalid_object_definition |
Class 44 — WITH CHECK OPTION Violation | |
44000 | with_check_option_violation |
Class 53 — Insufficient Resources | |
53000 | insufficient_resources |
53100 | disk_full |
53200 | out_of_memory |
53300 | too_many_connections |
53400 | configuration_limit_exceeded |
Class 54 — Program Limit Exceeded | |
54000 | program_limit_exceeded |
54001 | statement_too_complex |
54011 | too_many_columns |
54023 | too_many_arguments |
Class 55 — Object Not In Prerequisite State | |
55000 | object_not_in_prerequisite_state |
55006 | object_in_use |
55P02 | cant_change_runtime_param |
55P03 | lock_not_available |
55P04 | unsafe_new_enum_value_usage |
Class 57 — Operator Intervention | |
57000 | operator_intervention |
57014 | query_canceled |
57P01 | admin_shutdown |
57P02 | crash_shutdown |
57P03 | cannot_connect_now |
57P04 | database_dropped |
57P05 | idle_session_timeout |
Class 58 — System Error (errors external to PostgreSQL itself) | |
58000 | system_error |
58030 | io_error |
58P01 | undefined_file |
58P02 | duplicate_file |
Class 72 — Snapshot Failure | |
72000 | snapshot_too_old |
Class F0 — Configuration File Error | |
F0000 | config_file_error |
F0001 | lock_file_exists |
Class HV — Foreign Data Wrapper Error (SQL/MED) | |
HV000 | fdw_error |
HV005 | fdw_column_name_not_found |
HV002 | fdw_dynamic_parameter_value_needed |
HV010 | fdw_function_sequence_error |
HV021 | fdw_inconsistent_descriptor_information |
HV024 | fdw_invalid_attribute_value |
HV007 | fdw_invalid_column_name |
HV008 | fdw_invalid_column_number |
HV004 | fdw_invalid_data_type |
HV006 | fdw_invalid_data_type_descriptors |
HV091 | fdw_invalid_descriptor_field_identifier |
HV00B | fdw_invalid_handle |
HV00C | fdw_invalid_option_index |
HV00D | fdw_invalid_option_name |
HV090 | fdw_invalid_string_length_or_buffer_length |
HV00A | fdw_invalid_string_format |
HV009 | fdw_invalid_use_of_null_pointer |
HV014 | fdw_too_many_handles |
HV001 | fdw_out_of_memory |
HV00P | fdw_no_schemas |
HV00J | fdw_option_name_not_found |
HV00K | fdw_reply_handle |
HV00Q | fdw_schema_not_found |
HV00R | fdw_table_not_found |
HV00L | fdw_unable_to_create_execution |
HV00M | fdw_unable_to_create_reply |
HV00N | fdw_unable_to_establish_connection |
Class P0 — PL/pgSQL Error | |
P0000 | plpgsql_error |
P0001 | raise_exception |
P0002 | no_data_found |
P0003 | too_many_rows |
P0004 | assert_failure |
Class XX — Internal Error | |
XX000 | internal_error |
XX001 | data_corrupted |
XX002 | index_corrupted |
Source
Greenplum Database creates spill files, also known as workfiles, on disk if it does not have sufficient memory to execute an SQL query in memory.
By default, the maximum number of spill files that can be created per query is 100,000, which is sufficient for the majority of queries. However, if a query creates more than the configured number of spill files, Greenplum Database returns this error:
ERROR: number of workfiles per query limit exceeded
Reasons a large number of spill files may be generated include:
- Data skew is present in the queried data.
- The amount of memory allocated for the query is too low.
When you get this error, there are a few ways to solve the problem:
- change the query or the data distribution, or
- change the system memory configuration.
Note: You can control the maximum amount of memory that can be used by a query with the Greenplum Database server configuration parameters max_statement_mem and statement_mem, or through resource queues.
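For example, to give a statement more memory in the current session before re-running it (the value is illustrative):

```sql
SET statement_mem = '256MB';
-- re-run the spilling query here
RESET statement_mem;
```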
You can monitor spill file usage by querying the gp_toolkit gp_workfile_* views:
```
sachi=# \dv gp_toolkit.gp_workfile_*
                      List of relations
   Schema   |             Name              | Type |  Owner  | Storage
------------+-------------------------------+------+---------+---------
 gp_toolkit | gp_workfile_entries           | view | gpadmin | none
 gp_toolkit | gp_workfile_usage_per_query   | view | gpadmin | none
 gp_toolkit | gp_workfile_usage_per_segment | view | gpadmin | none
(3 rows)
```
Let's look at the definitions of these views.
1. gp_workfile_entries: This view contains one row for each operator using disk space for workfiles on a segment at the current time. The view is accessible to all users; however, non-superusers can see information only for the databases they have permission to access.
2. gp_workfile_usage_per_query: This view contains one row for each query using disk space for workfiles on a segment at the current time. The same access rules apply.
3. gp_workfile_usage_per_segment: This view contains one row for each segment, showing the total amount of disk space used for workfiles on the segment at the current time. The same access rules apply.
The views are related as follows: gp_workfile_entries is the base view (one row per operator using workfile disk space on a segment at the current time); gp_workfile_usage_per_query aggregates it to one row per query, and gp_workfile_usage_per_segment aggregates it to one row per segment.
```
sachi=# \d gp_toolkit.gp_workfile_entries
       View "gp_toolkit.gp_workfile_entries"
    Column     |  Type   | Modifiers
---------------+---------+-----------
 datname       | name    |
 procpid       | integer |
 sess_id       | integer |
 command_cnt   | integer |
 usename       | name    |
 current_query | text    |
 segid         | integer |
 slice         | integer |
 optype        | text    |
 workmem       | integer |
 size          | bigint  |
 numfiles      | integer |
 directory     | text    |
 state         | text    |
 utility       | integer |
View definition:
 WITH all_entries AS (
         SELECT c.segid, c.path, c.hash, c.size, c.utility, c.state, c.workmem, c.optype, c.slice, c.sessionid, c.commandid, c.query_start, c.numfiles
           FROM ONLY gp_toolkit.__gp_localid, gp_toolkit.__gp_workfile_entries_f() c(segid integer, path text, hash integer, size bigint, utility integer, state integer, workmem integer, optype text, slice integer, sessionid integer, commandid integer, query_start timestamp with time zone, numfiles integer)
        UNION ALL
         SELECT c.segid, c.path, c.hash, c.size, c.utility, c.state, c.workmem, c.optype, c.slice, c.sessionid, c.commandid, c.query_start, c.numfiles
           FROM ONLY gp_toolkit.__gp_masterid, gp_toolkit.__gp_workfile_entries_f() c(segid integer, path text, hash integer, size bigint, utility integer, state integer, workmem integer, optype text, slice integer, sessionid integer, commandid integer, query_start timestamp with time zone, numfiles integer)
        )
 SELECT s.datname,
        CASE
            WHEN c.state = 1 THEN s.procpid
            ELSE NULL::integer
        END AS procpid, c.sessionid AS sess_id, c.commandid AS command_cnt, s.usename,
        CASE
            WHEN c.state = 1 THEN s.current_query
            ELSE NULL::text
        END AS current_query, c.segid, c.slice, c.optype, c.workmem, c.size, c.numfiles, c.path AS directory,
        CASE
            WHEN c.state = 1 THEN 'RUNNING'::text
            WHEN c.state = 2 THEN 'CACHED'::text
            WHEN c.state = 3 THEN 'DELETING'::text
            ELSE 'UNKNOWN'::text
        END AS state, c.utility
   FROM all_entries c
   LEFT JOIN pg_stat_activity s ON c.sessionid = s.sess_id;

sachi=# \d gp_toolkit.gp_workfile_usage_per_query
   View "gp_toolkit.gp_workfile_usage_per_query"
    Column     |  Type   | Modifiers
---------------+---------+-----------
 datname       | name    |
 procpid       | integer |
 sess_id       | integer |
 command_cnt   | integer |
 usename       | name    |
 current_query | text    |
 segid         | integer |
 state         | text    |
 size          | numeric |
 numfiles      | bigint  |
View definition:
 SELECT gp_workfile_entries.datname, gp_workfile_entries.procpid, gp_workfile_entries.sess_id, gp_workfile_entries.command_cnt, gp_workfile_entries.usename, gp_workfile_entries.current_query, gp_workfile_entries.segid, gp_workfile_entries.state, sum(gp_workfile_entries.size) AS size, sum(gp_workfile_entries.numfiles) AS numfiles
   FROM gp_toolkit.gp_workfile_entries
  GROUP BY gp_workfile_entries.datname, gp_workfile_entries.procpid, gp_workfile_entries.sess_id, gp_workfile_entries.command_cnt, gp_workfile_entries.usename, gp_workfile_entries.current_query, gp_workfile_entries.segid, gp_workfile_entries.state;

sachi=# \d gp_toolkit.gp_workfile_usage_per_segment
  View "gp_toolkit.gp_workfile_usage_per_segment"
  Column  |   Type   | Modifiers
----------+----------+-----------
 segid    | smallint |
 size     | numeric  |
 numfiles | bigint   |
View definition:
 SELECT gpseg.content AS segid, COALESCE(sum(wfe.size), 0::numeric) AS size, sum(wfe.numfiles) AS numfiles
   FROM ( SELECT gp_segment_configuration.content
           FROM gp_segment_configuration
          WHERE gp_segment_configuration.role = 'p'::"char") gpseg
   LEFT JOIN gp_toolkit.gp_workfile_entries wfe ON gpseg.content = wfe.segid
  GROUP BY gpseg.content;

sachi=#
```
Workfile Disk Spill Space (4.3.0.0, 4.3.1.0, 4.3.2.0)
Server configuration parameters for workfiles:
1. gp_workfile_compress_algorithm
2. gp_workfile_limit_files_per_query
3. gp_workfile_limit_per_query
4. gp_workfile_limit_per_segment
5. gp_workfile_checksumming
The following regression test exercises both workfile limits (expected output shown inline):
```sql
--
-- Test workfile limits
--
-- Ensure the queries below need to spill to disk.
set statement_mem='1 MB';
-- SRF materializes the result in a tuplestore. Check that
-- gp_workfile_limit_per_query is enforced.
select count(distinct g) from generate_series(1, 1000000) g;
  count
---------
 1000000
(1 row)

set gp_workfile_limit_per_query='5 MB';
select count(distinct g) from generate_series(1, 1000000) g;
ERROR:  workfile per query size limit exceeded
reset gp_workfile_limit_per_query;
-- Also test limit on number of files (gp_workfile_limit_files_per_query)
set gp_workfile_limit_files_per_query='4';
select count(g) from generate_series(1, 500000) g
union
select count(g) from generate_series(1, 500000) g
union
select count(g) from generate_series(1, 500000) g
order by 1;
 count
--------
 500000
(1 row)

set gp_workfile_limit_files_per_query='2';
select count(g) from generate_series(1, 500000) g
union
select count(g) from generate_series(1, 500000) g
union
select count(g) from generate_series(1, 500000) g
order by 1;
ERROR:  number of workfiles per query limit exceeded
-- We cannot test the per-segment limit, because changing it requires
-- a postmaster restart. It's enforced in the same way as the per-query
-- limit, though, and it's simpler, so if the per-query limit works,
-- the per-segment limit probably works too.
```
SQL error 53000: workfile per query size limit exceeded
All messages emitted by the Postgres Pro server are assigned five-character error codes that follow the SQL standard's conventions for "SQLSTATE" codes. Applications that need to know which error condition has occurred should usually test the error code rather than the textual error message. The error codes are unlikely to change across Postgres Pro releases, and they are not subject to change due to localization of error messages. Note that some, but not all, of the error codes produced by Postgres Pro are defined by the SQL standard; some additional error codes for conditions not defined by the standard have been invented or borrowed from other databases.
According to the standard, the first two characters of an error code denote a class of errors, while the last three characters indicate a specific condition within that class. Thus, an application that does not recognize the specific error code can still infer what to do from the error class.
Table A.1 lists all the error codes defined in Postgres Pro 9.5.20.1. (Some are not actually used at present, although they are defined by the SQL standard.) The error classes are also shown. For each error class there is a "standard" error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have a more specific code assigned.
The symbol shown in the "Condition Name" column is the condition name to use in PL/pgSQL. Condition names can be written in either upper or lower case. (Note that PL/pgSQL does not recognize warning, as opposed to error, condition names; those are classes 00, 01, and 02.)
For some types of errors, the server reports the name of a database object (a table, table column, data type, or constraint) associated with the error; for example, the name of the unique constraint that caused a unique_violation error. Such names are supplied in separate fields of the error report message so that applications need not try to extract them from the possibly localized human-readable text of the message. As of PostgreSQL 9.3, complete coverage exists only for errors in SQLSTATE class 23 (integrity constraint violations), but this is likely to be extended in the future.
Table A.1. Postgres Pro Error Codes
Source
Monitoring a Greenplum System
You can monitor a Greenplum Database system using a variety of tools included with the system or available as add-ons.
Observing the day-to-day performance of the Greenplum Database system helps administrators understand system behavior, plan workflow, and troubleshoot problems. This chapter discusses tools for monitoring database performance and activity.
Also, be sure to review Recommended Monitoring and Maintenance Tasks for monitoring activities you can script to quickly detect problems in the system.
Monitoring Database Activity and Performance
Greenplum Database includes an optional system monitoring and management database, gpperfmon , that administrators can enable. The gpperfmon_install command-line utility creates the gpperfmon database and enables data collection agents that collect and store query and system metrics in the database. Administrators can query metrics in the gpperfmon database. See the documentation for the gpperfmon database in the Greenplum Database Reference Guide .
Tanzu Greenplum Command Center, an optional web-based interface, provides cluster status information, graphical administrative tools, real-time query monitoring, and historical cluster and query data. Download the Greenplum Command Center package from VMware Tanzu Network and view the documentation at the Tanzu Greenplum Command Center Documentation web site.
Monitoring System State
As a Greenplum Database administrator, you must monitor the system for problem events such as a segment going down or running out of disk space on a segment host. The following topics describe how to monitor the health of a Greenplum Database system and examine certain state information for a Greenplum Database system.
Checking System State
A Greenplum Database system is composed of multiple PostgreSQL instances (the master and segments) spanning multiple machines. To monitor a Greenplum Database system, you need to know information about the system as a whole, as well as status information of the individual instances. The gpstate utility provides status information about a Greenplum Database system.
Viewing Master and Segment Status and Configuration
The default gpstate action is to check segment instances and show a brief status of the valid and failed segments. For example, to see a quick status of your Greenplum Database system:
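```shell
$ gpstate
```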
To see more detailed information about your Greenplum Database array configuration, use gpstate with the -s option:
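```shell
$ gpstate -s
```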
Viewing Your Mirroring Configuration and Status
If you are using mirroring for data redundancy, you may want to see the list of mirror segment instances in the system, their current synchronization status, and the mirror to primary mapping. For example, to see the mirror segments in the system and their status:
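```shell
$ gpstate -m   # standard gpstate option for listing mirror segments and their sync status
```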
To see the primary to mirror segment mappings:
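```shell
$ gpstate -c   # standard gpstate option showing primary-to-mirror mappings
```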
To see the status of the standby master mirror:
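```shell
$ gpstate -f   # standard gpstate option showing standby master details
```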
Checking Disk Space Usage
A database administrator’s most important monitoring task is to make sure the file systems where the master and segment data directories reside do not grow to more than 70 percent full. A filled data disk will not result in data corruption, but it may prevent normal database activity from continuing. If the disk grows too full, it can cause the database server to shut down.
You can use the gp_disk_free external table in the gp_toolkit administrative schema to check for remaining free space (in kilobytes) on the segment host file systems. For example:
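```sql
SELECT * FROM gp_toolkit.gp_disk_free;  -- free space per segment file system, in kilobytes
```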
Checking Sizing of Distributed Databases and Tables
The gp_toolkit administrative schema contains several views that you can use to determine the disk space usage for a distributed Greenplum Database database, schema, table, or index.
For a list of the available sizing views for checking database object sizes and disk space, see the Greenplum Database Reference Guide.
Viewing Disk Space Usage for a Database
To see the total size of a database (in bytes), use the gp_size_of_database view in the gp_toolkit administrative schema. For example:
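```sql
SELECT * FROM gp_toolkit.gp_size_of_database;  -- database sizes in bytes
```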
Viewing Disk Space Usage for a Table
The gp_toolkit administrative schema contains several views for checking the size of a table. The table sizing views list the table by object ID (not by name). To check the size of a table by name, you must look up the relation name ( relname ) in the pg_class table. For example:
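A sketch using the gp_size_of_table_disk sizing view ('sales' is a placeholder table name):

```sql
SELECT pg_class.relname, sotd.sotdsize
FROM gp_toolkit.gp_size_of_table_disk sotd
JOIN pg_class ON sotd.sotdoid = pg_class.oid
WHERE pg_class.relname = 'sales';
```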
For a list of the available table sizing views, see the Greenplum Database Reference Guide.
Viewing Disk Space Usage for Indexes
The gp_toolkit administrative schema contains a number of views for checking index sizes. To see the total size of all index(es) on a table, use the gp_size_of_all_table_indexes view. To see the size of a particular index, use the gp_size_of_index view. The index sizing views list tables and indexes by object ID (not by name). To check the size of an index by name, you must look up the relation name ( relname ) in the pg_class table. For example:
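A sketch using the gp_size_of_index view ('sales_idx' is a placeholder index name):

```sql
SELECT soi.soisize
FROM gp_toolkit.gp_size_of_index soi
JOIN pg_class ON soi.soioid = pg_class.oid
WHERE pg_class.relname = 'sales_idx';
```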
Checking for Data Distribution Skew
All tables in Greenplum Database are distributed, meaning their data is divided across all of the segments in the system. Unevenly distributed data may diminish query processing performance. A table’s distribution policy, set at table creation time, determines how the table’s rows are distributed. For information about choosing the table distribution policy, see the following topics:
The gp_toolkit administrative schema also contains a number of views for checking data distribution skew on a table. For information about how to check for uneven data distribution, see the Greenplum Database Reference Guide .
Viewing a Table’s Distribution Key
To see the columns used as the data distribution key for a table, you can use the \d+ meta-command in psql to examine the definition of a table. For example:
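```sql
\d+ sales   -- placeholder table; the "Distributed by:" line in the output shows the key
```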
When you create a replicated table, Greenplum Database stores all rows in the table on every segment. Replicated tables have no distribution key. Where the \d+ meta-command reports the distribution key for a normally distributed table, it shows Distributed Replicated for a replicated table.
Viewing Data Distribution
To see the data distribution of a table’s rows (the number of rows on each segment), you can run a query such as:
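```sql
SELECT gp_segment_id, count(*)
FROM sales   -- placeholder table name
GROUP BY gp_segment_id
ORDER BY gp_segment_id;
```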
A table is considered to have a balanced distribution if all segments have roughly the same number of rows.
Checking for Query Processing Skew
When a query is being processed, all segments should have equal workloads to ensure the best possible performance. If you identify a poorly-performing query, you may need to investigate further using the EXPLAIN command. For information about using the EXPLAIN command and query profiling, see Query Profiling.
Query processing workload can be skewed if the table’s data distribution policy and the query predicates are not well matched. To check for processing skew, you can run a query such as:
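```sql
SELECT gp_segment_id, count(*)
FROM sales                -- placeholder table name
WHERE status = 'open'     -- placeholder: use the predicate from the slow query
GROUP BY gp_segment_id;
```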
This will show the number of rows returned by segment for the given WHERE predicate.
As noted in Viewing Data Distribution, this query will fail if you run it on a replicated table because you cannot reference the gp_segment_id system column in a query on a replicated table.
Avoiding an Extreme Skew Warning
You may receive the following warning message while running a query that performs a hash join operation:
Extreme skew in the innerside of Hashjoin
Viewing Metadata Information about Database Objects
Greenplum Database tracks various metadata information in its system catalogs about the objects stored in a database, such as tables, views, indexes and so on, as well as global objects such as roles and tablespaces.
Viewing the Last Operation Performed
You can use the system views pg_stat_operations and pg_stat_partition_operations to look up actions performed on an object, such as a table. For example, to see the actions performed on a table, such as when it was created and when it was last vacuumed and analyzed:
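A sketch ('sales' is a placeholder table name):

```sql
SELECT schemaname, objname, usename, actionname, subtype, statime
FROM pg_stat_operations
WHERE objname = 'sales';
```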
Viewing the Definition of an Object
To see the definition of an object, such as a table or view, you can use the \d+ meta-command when working in psql. For example, to see the definition of a table:
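```sql
\d+ sales_by_region   -- placeholder relation name
```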
Viewing Session Memory Usage Information
You can create and use the session_level_memory_consumption view that provides information about the current memory utilization for sessions that are running queries on Greenplum Database. The view contains session information and information such as the database that the session is connected to, the query that the session is currently running, and memory consumed by the session processes.
Creating the session_level_memory_consumption View
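The Greenplum documentation creates the view by running a script shipped with the installation, roughly as follows (the exact path varies by version):

```shell
$ psql -d mydatabase -f $GPHOME/share/postgresql/contrib/gp_session_state.sql
```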
The session_level_memory_consumption View
The session_state.session_level_memory_consumption view provides information about memory consumption and idle time for sessions that are running SQL queries.
When resource queue-based resource management is active, the column is_runaway indicates whether Greenplum Database considers the session a runaway session based on the vmem memory consumption of the session’s queries. Under the resource queue-based resource management scheme, Greenplum Database considers the session a runaway when the queries consume an excessive amount of memory. The Greenplum Database server configuration parameter runaway_detector_activation_percent governs the conditions under which Greenplum Database considers a session a runaway session.
The is_runaway , runaway_vmem_mb , and runaway_command_cnt columns are not applicable when resource group-based resource management is active.
Table 1. session_state.session_level_memory_consumption
column | type | description |
---|---|---|
datname | name | Name of the database that the session is connected to. |
sess_id | integer | Session ID. |
usename | name | Name of the session user. |
query | text | Current SQL query that the session is running. |
segid | integer | Segment ID. |
vmem_mb | integer | Total vmem memory usage for the session in MB. |
is_runaway | boolean | Session is marked as runaway on the segment. |
qe_count | integer | Number of query processes for the session. |
active_qe_count | integer | Number of active query processes for the session. |
dirty_qe_count | integer | Number of query processes that have not yet released their memory. The value is -1 for sessions that are not running. |
runaway_vmem_mb | integer | Amount of vmem memory that the session was consuming when it was marked as a runaway session. |
runaway_command_cnt | integer | Command count for the session when it was marked as a runaway session. |
idle_start | timestamptz | The last time a query process in this session became idle. |
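For example, to find the sessions consuming the most memory (a sketch using the columns above):

```sql
SELECT datname, sess_id, segid, vmem_mb, is_runaway
FROM session_state.session_level_memory_consumption
ORDER BY vmem_mb DESC;
```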
Viewing Query Workfile Usage Information
The Greenplum Database administrative schema gp_toolkit contains views that display information about Greenplum Database workfiles. Greenplum Database creates workfiles on disk if it does not have sufficient memory to run the query in memory. This information can be used for troubleshooting and tuning queries. The information in the views can also be used to specify the values for the Greenplum Database configuration parameters gp_workfile_limit_per_query and gp_workfile_limit_per_segment .
These are the views in the schema gp_toolkit:
- The gp_workfile_entries view contains one row for each operator using disk space for workfiles on a segment at the current time.
- The gp_workfile_usage_per_query view contains one row for each query using disk space for workfiles on a segment at the current time.
- The gp_workfile_usage_per_segment view contains one row for each segment. Each row displays the total amount of disk space used for workfiles on the segment at the current time.
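For example, to see total current spill space per segment:

```sql
SELECT segid, size, numfiles
FROM gp_toolkit.gp_workfile_usage_per_segment
ORDER BY size DESC;
```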
For information about using gp_toolkit, see Using gp_toolkit.
Viewing the Database Server Log Files
Every database instance in Greenplum Database (master and segments) runs a PostgreSQL database server with its own server log file. Log files are created in the pg_log directory of the master and each segment data directory.
Log File Format
The server log files are written in comma-separated values (CSV) format. Some log entries will not have values for all log fields. For example, only log entries associated with a query worker process will have the slice_id populated. You can identify related log entries of a particular query by the query’s session identifier ( gp_session_id ) and command identifier ( gp_command_count ).
The following fields are written to the log:
Table 2. Greenplum Database Server Log Format
# | Field Name | Data Type | Description |
---|---|---|---|
1 | event_time | timestamp with time zone | Time that the log entry was written to the log |
2 | user_name | varchar(100) | The database user name |
3 | database_name | varchar(100) | The database name |
4 | process_id | varchar(10) | The system process ID (prefixed with "p") |
5 | thread_id | varchar(50) | The thread count (prefixed with "th") |
6 | remote_host | varchar(100) | On the master, the hostname/address of the client machine. On the segment, the hostname/address of the master. |
7 | remote_port | varchar(10) | The segment or master port number |
8 | session_start_time | timestamp with time zone | Time session connection was opened |
9 | transaction_id | int | Top-level transaction ID on the master. This ID is the parent of any subtransactions. |
10 | gp_session_id | text | Session identifier number (prefixed with "con") |
11 | gp_command_count | text | The command number within a session (prefixed with "cmd") |
12 | gp_segment | text | The segment content identifier (prefixed with "seg" for primaries or "mir" for mirrors). The master always has a content ID of -1. |
13 | slice_id | text | The slice ID (portion of the query plan being executed) |
14 | distr_tranx_id | text | Distributed transaction ID |
15 | local_tranx_id | text | Local transaction ID |
16 | sub_tranx_id | text | Subtransaction ID |
17 | event_severity | varchar(10) | Values include: LOG, ERROR, FATAL, PANIC, DEBUG1, DEBUG2 |
18 | sql_state_code | varchar(10) | SQL state code associated with the log message |
19 | event_message | text | Log or error message text |
20 | event_detail | text | Detail message text associated with an error or warning message |
21 | event_hint | text | Hint message text associated with an error or warning message |
22 | internal_query | text | The internally-generated query text |
23 | internal_query_pos | int | The cursor index into the internally-generated query text |
24 | event_context | text | The context in which this message gets generated |
25 | debug_query_string | text | User-supplied query string with full detail for debugging. This string can be modified for internal use. |
26 | error_cursor_pos | int | The cursor index into the query string |
27 | func_name | text | The function in which this message is generated |
28 | file_name | text | The internal code file where the message originated |
29 | file_line | int | The line of the code file where the message originated |
30 | stack_trace | text | Stack trace text associated with this message |
Searching the Greenplum Server Log Files
Greenplum Database provides a utility called gplogfilter that can search through a Greenplum Database log file for entries matching specified criteria. By default, this utility searches through the Greenplum Database master log file in the default logging location. For example, to display the last three lines of each of the log files under the master directory:
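```shell
$ gplogfilter -n 3   # -n N tails the last N lines of each log file
```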
To search through all segment log files simultaneously, run gplogfilter through the gpssh utility. For example, to display the last three lines of each segment log file:
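A sketch (the host file and data directory paths are placeholders for your layout):

```shell
$ gpssh -f seg_hosts_file -e 'source /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -n 3 /data/*/pg_log/gpdb*.csv'
```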
Using gp_toolkit
Use the Greenplum Database administrative schema gp_toolkit to query the system catalogs, log files, and operating environment for system status information. The gp_toolkit schema contains several views you can access using SQL commands. The gp_toolkit schema is accessible to all database users. Some objects require superuser permissions. Use a command similar to the following to add the gp_toolkit schema to your schema search path:
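```sql
ALTER ROLE myrole SET search_path TO myschema, gp_toolkit;  -- placeholder role and schema names
```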
For a description of the available administrative schema views and their usages, see the Greenplum Database Reference Guide .
SQL Standard Error Codes
The following table lists all the defined error codes. Some are not used, but are defined by the SQL standard. The error classes are also shown. For each error class there is a standard error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have any more-specific code assigned.
The PL/pgSQL condition name for each error code is the same as the phrase shown in the table, with underscores substituted for spaces. For example, code 22012, DIVISION BY ZERO, has condition name DIVISION_BY_ZERO. Condition names can be written in either upper or lower case.
Source