Contents
- SQL error 53000: workfile per query size limit exceeded
- Appendix A. PostgreSQL Error Codes
- Monitoring a Greenplum System
SQL error 53000: workfile per query size limit exceeded
Greenplum Database creates spill files, also known as workfiles, on disk if it does not have sufficient memory to run an SQL query in memory.
The maximum number of spill files for a given query is governed by the gp_workfile_limit_files_per_query server configuration parameter setting. The default value of 100,000 spill files is sufficient for the majority of queries.
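To inspect or change the limit for the current session (a sketch; the value shown is illustrative):

```sql
SHOW gp_workfile_limit_files_per_query;
SET gp_workfile_limit_files_per_query = 200000;  -- illustrative value
```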
If a query creates more than the configured number of spill files, Greenplum Database returns this error:
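```
ERROR: number of workfiles per query limit exceeded
```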
Greenplum Database may generate a large number of spill files when:
- Data skew is present in the queried data. To check for data skew, see Checking for Data Distribution Skew.
- The amount of memory allocated for the query is too low. You control the maximum amount of memory that can be used by a query with the Greenplum Database server configuration parameters max_statement_mem and statement_mem, or through resource group or resource queue configuration.
You might be able to run the query successfully by changing the query, changing the data distribution, or changing the system memory configuration. The gp_toolkit gp_workfile_* views display spill file usage information. You can use this information to troubleshoot and tune queries. The gp_workfile_* views are described in Checking Query Disk Spill Space Usage.
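For example, a quick look at current spill usage per query (a sketch using the gp_workfile_usage_per_query view described below):

```sql
SELECT datname, sess_id, segid, size, numfiles
FROM gp_toolkit.gp_workfile_usage_per_query
ORDER BY size DESC;
```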
Additional documentation resources:
- Memory Consumption Parameters identifies the memory-related spill file server configuration parameters.
- Using Resource Groups describes memory and spill considerations when resource group-based resource management is active.
- Using Resource Queues describes memory and spill considerations when resource queue-based resource management is active.
Parent topic: Querying Data
Source
Appendix A. PostgreSQL Error Codes
All messages emitted by the PostgreSQL server are assigned five-character error codes that follow the SQL standard's conventions for "SQLSTATE" codes. Applications that need to know which error condition has occurred should usually test the error code rather than the textual error message. The error codes are unlikely to change across PostgreSQL releases, and they are not subject to change due to localization of error messages. Note that some, but not all, of the error codes produced by PostgreSQL are defined by the SQL standard; some additional error codes for conditions not defined by the standard have been invented or borrowed from other databases.
According to the standard, the first two characters of an error code denote a class of errors, while the last three characters indicate a specific condition within that class. Thus, an application that does not recognize the specific error code can still infer what to do from the error class.
Table A-1 lists all the error codes defined in PostgreSQL 9.4.1. (Some are not actually used at present, although they are defined by the SQL standard.) The error classes are also shown. For each error class there is a "standard" error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have a more specific code assigned.
The symbol shown in the "Condition Name" column is the condition name to use in PL/pgSQL. Condition names can be written in either upper or lower case. (Note that PL/pgSQL does not recognize warning, as opposed to error, condition names; those are classes 00, 01, and 02.)
For some types of errors, the server reports the name of a database object (a table, table column, data type, or constraint) associated with the error; for example, the name of the unique constraint that caused a unique_violation error. Such names are supplied in separate fields of the error report message so that applications need not try to extract them from the possibly localized human-readable text of the message. As of PostgreSQL 9.3, complete coverage exists only for errors in SQLSTATE class 23 (integrity constraint violations), but this is likely to be extended in the future.
Source
SQL error 53000: workfile per query size limit exceeded
All messages emitted by the PostgreSQL server are assigned five-character error codes that follow the SQL standard's conventions for "SQLSTATE" codes. Applications that need to know which error condition has occurred should usually test the error code, rather than looking at the textual error message. The error codes are less likely to change across PostgreSQL releases, and also are not subject to change due to localization of error messages. Note that some, but not all, of the error codes produced by PostgreSQL are defined by the SQL standard; some additional error codes for conditions not defined by the standard have been invented or borrowed from other databases.
According to the standard, the first two characters of an error code denote a class of errors, while the last three characters indicate a specific condition within that class. Thus, an application that does not recognize the specific error code might still be able to infer what to do from the error class.
Table A.1 lists all the error codes defined in PostgreSQL 15.1. (Some are not actually used at present, but are defined by the SQL standard.) The error classes are also shown. For each error class there is a "standard" error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have any more-specific code assigned.
The symbol shown in the column "Condition Name" is the condition name to use in PL/pgSQL. Condition names can be written in either upper or lower case. (Note that PL/pgSQL does not recognize warning, as opposed to error, condition names; those are classes 00, 01, and 02.)
For some types of errors, the server reports the name of a database object (a table, table column, data type, or constraint) associated with the error; for example, the name of the unique constraint that caused a unique_violation error. Such names are supplied in separate fields of the error report message so that applications need not try to extract them from the possibly-localized human-readable text of the message. As of PostgreSQL 9.3, complete coverage for this feature exists only for errors in SQLSTATE class 23 (integrity constraint violation), but this is likely to be expanded in future.
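As a brief illustration of condition names (a minimal sketch; division_by_zero corresponds to code 22012 in the table below):

```sql
DO $$
BEGIN
  PERFORM 1/0;                 -- raises SQLSTATE 22012
EXCEPTION
  WHEN division_by_zero THEN   -- PL/pgSQL condition name for 22012
    RAISE NOTICE 'caught SQLSTATE %', SQLSTATE;
END
$$;
```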
Table A.1. PostgreSQL Error Codes
Error Code | Condition Name |
---|---|
Class 00 — Successful Completion | |
00000 | successful_completion |
Class 01 — Warning | |
01000 | warning |
0100C | dynamic_result_sets_returned |
01008 | implicit_zero_bit_padding |
01003 | null_value_eliminated_in_set_function |
01007 | privilege_not_granted |
01006 | privilege_not_revoked |
01004 | string_data_right_truncation |
01P01 | deprecated_feature |
Class 02 — No Data (this is also a warning class per the SQL standard) | |
02000 | no_data |
02001 | no_additional_dynamic_result_sets_returned |
Class 03 — SQL Statement Not Yet Complete | |
03000 | sql_statement_not_yet_complete |
Class 08 — Connection Exception | |
08000 | connection_exception |
08003 | connection_does_not_exist |
08006 | connection_failure |
08001 | sqlclient_unable_to_establish_sqlconnection |
08004 | sqlserver_rejected_establishment_of_sqlconnection |
08007 | transaction_resolution_unknown |
08P01 | protocol_violation |
Class 09 — Triggered Action Exception | |
09000 | triggered_action_exception |
Class 0A — Feature Not Supported | |
0A000 | feature_not_supported |
Class 0B — Invalid Transaction Initiation | |
0B000 | invalid_transaction_initiation |
Class 0F — Locator Exception | |
0F000 | locator_exception |
0F001 | invalid_locator_specification |
Class 0L — Invalid Grantor | |
0L000 | invalid_grantor |
0LP01 | invalid_grant_operation |
Class 0P — Invalid Role Specification | |
0P000 | invalid_role_specification |
Class 0Z — Diagnostics Exception | |
0Z000 | diagnostics_exception |
0Z002 | stacked_diagnostics_accessed_without_active_handler |
Class 20 — Case Not Found | |
20000 | case_not_found |
Class 21 — Cardinality Violation | |
21000 | cardinality_violation |
Class 22 — Data Exception | |
22000 | data_exception |
2202E | array_subscript_error |
22021 | character_not_in_repertoire |
22008 | datetime_field_overflow |
22012 | division_by_zero |
22005 | error_in_assignment |
2200B | escape_character_conflict |
22022 | indicator_overflow |
22015 | interval_field_overflow |
2201E | invalid_argument_for_logarithm |
22014 | invalid_argument_for_ntile_function |
22016 | invalid_argument_for_nth_value_function |
2201F | invalid_argument_for_power_function |
2201G | invalid_argument_for_width_bucket_function |
22018 | invalid_character_value_for_cast |
22007 | invalid_datetime_format |
22019 | invalid_escape_character |
2200D | invalid_escape_octet |
22025 | invalid_escape_sequence |
22P06 | nonstandard_use_of_escape_character |
22010 | invalid_indicator_parameter_value |
22023 | invalid_parameter_value |
22013 | invalid_preceding_or_following_size |
2201B | invalid_regular_expression |
2201W | invalid_row_count_in_limit_clause |
2201X | invalid_row_count_in_result_offset_clause |
2202H | invalid_tablesample_argument |
2202G | invalid_tablesample_repeat |
22009 | invalid_time_zone_displacement_value |
2200C | invalid_use_of_escape_character |
2200G | most_specific_type_mismatch |
22004 | null_value_not_allowed |
22002 | null_value_no_indicator_parameter |
22003 | numeric_value_out_of_range |
2200H | sequence_generator_limit_exceeded |
22026 | string_data_length_mismatch |
22001 | string_data_right_truncation |
22011 | substring_error |
22027 | trim_error |
22024 | unterminated_c_string |
2200F | zero_length_character_string |
22P01 | floating_point_exception |
22P02 | invalid_text_representation |
22P03 | invalid_binary_representation |
22P04 | bad_copy_file_format |
22P05 | untranslatable_character |
2200L | not_an_xml_document |
2200M | invalid_xml_document |
2200N | invalid_xml_content |
2200S | invalid_xml_comment |
2200T | invalid_xml_processing_instruction |
22030 | duplicate_json_object_key_value |
22031 | invalid_argument_for_sql_json_datetime_function |
22032 | invalid_json_text |
22033 | invalid_sql_json_subscript |
22034 | more_than_one_sql_json_item |
22035 | no_sql_json_item |
22036 | non_numeric_sql_json_item |
22037 | non_unique_keys_in_a_json_object |
22038 | singleton_sql_json_item_required |
22039 | sql_json_array_not_found |
2203A | sql_json_member_not_found |
2203B | sql_json_number_not_found |
2203C | sql_json_object_not_found |
2203D | too_many_json_array_elements |
2203E | too_many_json_object_members |
2203F | sql_json_scalar_required |
2203G | sql_json_item_cannot_be_cast_to_target_type |
Class 23 — Integrity Constraint Violation | |
23000 | integrity_constraint_violation |
23001 | restrict_violation |
23502 | not_null_violation |
23503 | foreign_key_violation |
23505 | unique_violation |
23514 | check_violation |
23P01 | exclusion_violation |
Class 24 — Invalid Cursor State | |
24000 | invalid_cursor_state |
Class 25 — Invalid Transaction State | |
25000 | invalid_transaction_state |
25001 | active_sql_transaction |
25002 | branch_transaction_already_active |
25008 | held_cursor_requires_same_isolation_level |
25003 | inappropriate_access_mode_for_branch_transaction |
25004 | inappropriate_isolation_level_for_branch_transaction |
25005 | no_active_sql_transaction_for_branch_transaction |
25006 | read_only_sql_transaction |
25007 | schema_and_data_statement_mixing_not_supported |
25P01 | no_active_sql_transaction |
25P02 | in_failed_sql_transaction |
25P03 | idle_in_transaction_session_timeout |
Class 26 — Invalid SQL Statement Name | |
26000 | invalid_sql_statement_name |
Class 27 — Triggered Data Change Violation | |
27000 | triggered_data_change_violation |
Class 28 — Invalid Authorization Specification | |
28000 | invalid_authorization_specification |
28P01 | invalid_password |
Class 2B — Dependent Privilege Descriptors Still Exist | |
2B000 | dependent_privilege_descriptors_still_exist |
2BP01 | dependent_objects_still_exist |
Class 2D — Invalid Transaction Termination | |
2D000 | invalid_transaction_termination |
Class 2F — SQL Routine Exception | |
2F000 | sql_routine_exception |
2F005 | function_executed_no_return_statement |
2F002 | modifying_sql_data_not_permitted |
2F003 | prohibited_sql_statement_attempted |
2F004 | reading_sql_data_not_permitted |
Class 34 — Invalid Cursor Name | |
34000 | invalid_cursor_name |
Class 38 — External Routine Exception | |
38000 | external_routine_exception |
38001 | containing_sql_not_permitted |
38002 | modifying_sql_data_not_permitted |
38003 | prohibited_sql_statement_attempted |
38004 | reading_sql_data_not_permitted |
Class 39 — External Routine Invocation Exception | |
39000 | external_routine_invocation_exception |
39001 | invalid_sqlstate_returned |
39004 | null_value_not_allowed |
39P01 | trigger_protocol_violated |
39P02 | srf_protocol_violated |
39P03 | event_trigger_protocol_violated |
Class 3B — Savepoint Exception | |
3B000 | savepoint_exception |
3B001 | invalid_savepoint_specification |
Class 3D — Invalid Catalog Name | |
3D000 | invalid_catalog_name |
Class 3F — Invalid Schema Name | |
3F000 | invalid_schema_name |
Class 40 — Transaction Rollback | |
40000 | transaction_rollback |
40002 | transaction_integrity_constraint_violation |
40001 | serialization_failure |
40003 | statement_completion_unknown |
40P01 | deadlock_detected |
Class 42 — Syntax Error or Access Rule Violation | |
42000 | syntax_error_or_access_rule_violation |
42601 | syntax_error |
42501 | insufficient_privilege |
42846 | cannot_coerce |
42803 | grouping_error |
42P20 | windowing_error |
42P19 | invalid_recursion |
42830 | invalid_foreign_key |
42602 | invalid_name |
42622 | name_too_long |
42939 | reserved_name |
42804 | datatype_mismatch |
42P18 | indeterminate_datatype |
42P21 | collation_mismatch |
42P22 | indeterminate_collation |
42809 | wrong_object_type |
428C9 | generated_always |
42703 | undefined_column |
42883 | undefined_function |
42P01 | undefined_table |
42P02 | undefined_parameter |
42704 | undefined_object |
42701 | duplicate_column |
42P03 | duplicate_cursor |
42P04 | duplicate_database |
42723 | duplicate_function |
42P05 | duplicate_prepared_statement |
42P06 | duplicate_schema |
42P07 | duplicate_table |
42712 | duplicate_alias |
42710 | duplicate_object |
42702 | ambiguous_column |
42725 | ambiguous_function |
42P08 | ambiguous_parameter |
42P09 | ambiguous_alias |
42P10 | invalid_column_reference |
42611 | invalid_column_definition |
42P11 | invalid_cursor_definition |
42P12 | invalid_database_definition |
42P13 | invalid_function_definition |
42P14 | invalid_prepared_statement_definition |
42P15 | invalid_schema_definition |
42P16 | invalid_table_definition |
42P17 | invalid_object_definition |
Class 44 — WITH CHECK OPTION Violation | |
44000 | with_check_option_violation |
Class 53 — Insufficient Resources | |
53000 | insufficient_resources |
53100 | disk_full |
53200 | out_of_memory |
53300 | too_many_connections |
53400 | configuration_limit_exceeded |
Class 54 — Program Limit Exceeded | |
54000 | program_limit_exceeded |
54001 | statement_too_complex |
54011 | too_many_columns |
54023 | too_many_arguments |
Class 55 — Object Not In Prerequisite State | |
55000 | object_not_in_prerequisite_state |
55006 | object_in_use |
55P02 | cant_change_runtime_param |
55P03 | lock_not_available |
55P04 | unsafe_new_enum_value_usage |
Class 57 — Operator Intervention | |
57000 | operator_intervention |
57014 | query_canceled |
57P01 | admin_shutdown |
57P02 | crash_shutdown |
57P03 | cannot_connect_now |
57P04 | database_dropped |
57P05 | idle_session_timeout |
Class 58 — System Error (errors external to PostgreSQL itself) | |
58000 | system_error |
58030 | io_error |
58P01 | undefined_file |
58P02 | duplicate_file |
Class 72 — Snapshot Failure | |
72000 | snapshot_too_old |
Class F0 — Configuration File Error | |
F0000 | config_file_error |
F0001 | lock_file_exists |
Class HV — Foreign Data Wrapper Error (SQL/MED) | |
HV000 | fdw_error |
HV005 | fdw_column_name_not_found |
HV002 | fdw_dynamic_parameter_value_needed |
HV010 | fdw_function_sequence_error |
HV021 | fdw_inconsistent_descriptor_information |
HV024 | fdw_invalid_attribute_value |
HV007 | fdw_invalid_column_name |
HV008 | fdw_invalid_column_number |
HV004 | fdw_invalid_data_type |
HV006 | fdw_invalid_data_type_descriptors |
HV091 | fdw_invalid_descriptor_field_identifier |
HV00B | fdw_invalid_handle |
HV00C | fdw_invalid_option_index |
HV00D | fdw_invalid_option_name |
HV090 | fdw_invalid_string_length_or_buffer_length |
HV00A | fdw_invalid_string_format |
HV009 | fdw_invalid_use_of_null_pointer |
HV014 | fdw_too_many_handles |
HV001 | fdw_out_of_memory |
HV00P | fdw_no_schemas |
HV00J | fdw_option_name_not_found |
HV00K | fdw_reply_handle |
HV00Q | fdw_schema_not_found |
HV00R | fdw_table_not_found |
HV00L | fdw_unable_to_create_execution |
HV00M | fdw_unable_to_create_reply |
HV00N | fdw_unable_to_establish_connection |
Class P0 — PL/pgSQL Error | |
P0000 | plpgsql_error |
P0001 | raise_exception |
P0002 | no_data_found |
P0003 | too_many_rows |
P0004 | assert_failure |
Class XX — Internal Error | |
XX000 | internal_error |
XX001 | data_corrupted |
XX002 | index_corrupted |
Source
Greenplum Database creates spill files, also known as workfiles, on disk if it does not have sufficient memory to execute an SQL query in memory.
By default, the maximum number of spill files that can be created per query is 100,000, which is sufficient for the majority of queries. However, if a query creates more than the configured number of spill files, Greenplum Database returns this error:
ERROR: number of workfiles per query limit exceeded
Reasons a large number of spill files may be generated include:
- Data skew is present in the queried data.
- The amount of memory allocated for the query is too low.
When you get this error, there are a few ways to solve the problem:
- change the query or the data distribution, or
- change the system memory configuration.
Note: You can control the maximum amount of memory that can be used by a query with the Greenplum Database server configuration parameters max_statement_mem and statement_mem, or through resource queues.
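For example, to give a statement more memory in the current session before re-running it (the value is illustrative):

```sql
SET statement_mem = '256MB';
-- re-run the spilling query here
RESET statement_mem;
```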
You can monitor spill file usage by querying the gp_toolkit gp_workfile_* views:
```
sachi=# \dv gp_toolkit.gp_workfile_*
                      List of relations
   Schema   |             Name              | Type |  Owner  | Storage
------------+-------------------------------+------+---------+---------
 gp_toolkit | gp_workfile_entries           | view | gpadmin | none
 gp_toolkit | gp_workfile_usage_per_query   | view | gpadmin | none
 gp_toolkit | gp_workfile_usage_per_segment | view | gpadmin | none
(3 rows)
```
Let's look at the definitions of these views.
1. gp_workfile_entries: This view contains one row for each operator using disk space for workfiles on a segment at the current time. The view is accessible to all users; however, non-superusers can see information only for the databases they have permission to access.
2. gp_workfile_usage_per_query: This view contains one row for each query using disk space for workfiles on a segment at the current time. The same access rules apply.
3. gp_workfile_usage_per_segment: This view contains one row for each segment, showing the total amount of disk space used for workfiles on the segment at the current time. The same access rules apply.
The views are related as follows: gp_workfile_entries is the base view (one row per operator using workfile disk space on a segment at the current time); gp_workfile_usage_per_query aggregates it to one row per query, and gp_workfile_usage_per_segment aggregates it to one row per segment.
```
sachi=# \d gp_toolkit.gp_workfile_entries
       View "gp_toolkit.gp_workfile_entries"
    Column     |  Type   | Modifiers
---------------+---------+-----------
 datname       | name    |
 procpid       | integer |
 sess_id       | integer |
 command_cnt   | integer |
 usename       | name    |
 current_query | text    |
 segid         | integer |
 slice         | integer |
 optype        | text    |
 workmem       | integer |
 size          | bigint  |
 numfiles      | integer |
 directory     | text    |
 state         | text    |
 utility       | integer |
View definition:
 WITH all_entries AS (
         SELECT c.segid, c.path, c.hash, c.size, c.utility, c.state, c.workmem, c.optype, c.slice, c.sessionid, c.commandid, c.query_start, c.numfiles
           FROM ONLY gp_toolkit.__gp_localid, gp_toolkit.__gp_workfile_entries_f() c(segid integer, path text, hash integer, size bigint, utility integer, state integer, workmem integer, optype text, slice integer, sessionid integer, commandid integer, query_start timestamp with time zone, numfiles integer)
        UNION ALL
         SELECT c.segid, c.path, c.hash, c.size, c.utility, c.state, c.workmem, c.optype, c.slice, c.sessionid, c.commandid, c.query_start, c.numfiles
           FROM ONLY gp_toolkit.__gp_masterid, gp_toolkit.__gp_workfile_entries_f() c(segid integer, path text, hash integer, size bigint, utility integer, state integer, workmem integer, optype text, slice integer, sessionid integer, commandid integer, query_start timestamp with time zone, numfiles integer)
        )
 SELECT s.datname,
        CASE
            WHEN c.state = 1 THEN s.procpid
            ELSE NULL::integer
        END AS procpid, c.sessionid AS sess_id, c.commandid AS command_cnt, s.usename,
        CASE
            WHEN c.state = 1 THEN s.current_query
            ELSE NULL::text
        END AS current_query, c.segid, c.slice, c.optype, c.workmem, c.size, c.numfiles, c.path AS directory,
        CASE
            WHEN c.state = 1 THEN 'RUNNING'::text
            WHEN c.state = 2 THEN 'CACHED'::text
            WHEN c.state = 3 THEN 'DELETING'::text
            ELSE 'UNKNOWN'::text
        END AS state, c.utility
   FROM all_entries c
   LEFT JOIN pg_stat_activity s ON c.sessionid = s.sess_id;

sachi=# \d gp_toolkit.gp_workfile_usage_per_query
   View "gp_toolkit.gp_workfile_usage_per_query"
    Column     |  Type   | Modifiers
---------------+---------+-----------
 datname       | name    |
 procpid       | integer |
 sess_id       | integer |
 command_cnt   | integer |
 usename       | name    |
 current_query | text    |
 segid         | integer |
 state         | text    |
 size          | numeric |
 numfiles      | bigint  |
View definition:
 SELECT gp_workfile_entries.datname, gp_workfile_entries.procpid, gp_workfile_entries.sess_id, gp_workfile_entries.command_cnt, gp_workfile_entries.usename, gp_workfile_entries.current_query, gp_workfile_entries.segid, gp_workfile_entries.state, sum(gp_workfile_entries.size) AS size, sum(gp_workfile_entries.numfiles) AS numfiles
   FROM gp_toolkit.gp_workfile_entries
  GROUP BY gp_workfile_entries.datname, gp_workfile_entries.procpid, gp_workfile_entries.sess_id, gp_workfile_entries.command_cnt, gp_workfile_entries.usename, gp_workfile_entries.current_query, gp_workfile_entries.segid, gp_workfile_entries.state;

sachi=# \d gp_toolkit.gp_workfile_usage_per_segment
  View "gp_toolkit.gp_workfile_usage_per_segment"
  Column  |   Type   | Modifiers
----------+----------+-----------
 segid    | smallint |
 size     | numeric  |
 numfiles | bigint   |
View definition:
 SELECT gpseg.content AS segid, COALESCE(sum(wfe.size), 0::numeric) AS size, sum(wfe.numfiles) AS numfiles
   FROM ( SELECT gp_segment_configuration.content
           FROM gp_segment_configuration
          WHERE gp_segment_configuration.role = 'p'::"char") gpseg
   LEFT JOIN gp_toolkit.gp_workfile_entries wfe ON gpseg.content = wfe.segid
  GROUP BY gpseg.content;

sachi=#
```
Workfile Disk Spill Space (4.3.0.0, 4.3.1.0, 4.3.2.0)
Server configuration parameters for workfiles:
1. gp_workfile_compress_algorithm
2. gp_workfile_limit_files_per_query
3. gp_workfile_limit_per_query
4. gp_workfile_limit_per_segment
5. gp_workfile_checksumming
The following regression test exercises both workfile limits (expected output shown inline):
```sql
--
-- Test workfile limits
--
-- Ensure the queries below need to spill to disk.
set statement_mem='1 MB';
-- SRF materializes the result in a tuplestore. Check that
-- gp_workfile_limit_per_query is enforced.
select count(distinct g) from generate_series(1, 1000000) g;
  count
---------
 1000000
(1 row)

set gp_workfile_limit_per_query='5 MB';
select count(distinct g) from generate_series(1, 1000000) g;
ERROR:  workfile per query size limit exceeded
reset gp_workfile_limit_per_query;
-- Also test limit on number of files (gp_workfile_limit_files_per_query)
set gp_workfile_limit_files_per_query='4';
select count(g) from generate_series(1, 500000) g
union
select count(g) from generate_series(1, 500000) g
union
select count(g) from generate_series(1, 500000) g
order by 1;
 count
--------
 500000
(1 row)

set gp_workfile_limit_files_per_query='2';
select count(g) from generate_series(1, 500000) g
union
select count(g) from generate_series(1, 500000) g
union
select count(g) from generate_series(1, 500000) g
order by 1;
ERROR:  number of workfiles per query limit exceeded
-- We cannot test the per-segment limit, because changing it requires
-- a postmaster restart. It's enforced in the same way as the per-query
-- limit, though, and it's simpler, so if the per-query limit works,
-- the per-segment limit probably works too.
```
SQL error 53000: workfile per query size limit exceeded
All messages emitted by the Postgres Pro server are assigned five-character error codes that follow the SQL standard's conventions for "SQLSTATE" codes. Applications that need to know which error condition has occurred should usually test the error code rather than the textual error message. The error codes are unlikely to change across Postgres Pro releases, and they are not subject to change due to localization of error messages. Note that some, but not all, of the error codes produced by Postgres Pro are defined by the SQL standard; some additional error codes for conditions not defined by the standard have been invented or borrowed from other databases.
According to the standard, the first two characters of an error code denote a class of errors, while the last three characters indicate a specific condition within that class. Thus, an application that does not recognize the specific error code can still infer what to do from the error class.
Table A.1 lists all the error codes defined in Postgres Pro 9.5.20.1. (Some are not actually used at present, although they are defined by the SQL standard.) The error classes are also shown. For each error class there is a "standard" error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have a more specific code assigned.
The symbol shown in the "Condition Name" column is the condition name to use in PL/pgSQL. Condition names can be written in either upper or lower case. (Note that PL/pgSQL does not recognize warning, as opposed to error, condition names; those are classes 00, 01, and 02.)
For some types of errors, the server reports the name of a database object (a table, table column, data type, or constraint) associated with the error; for example, the name of the unique constraint that caused a unique_violation error. Such names are supplied in separate fields of the error report message so that applications need not try to extract them from the possibly localized human-readable text of the message. As of PostgreSQL 9.3, complete coverage exists only for errors in SQLSTATE class 23 (integrity constraint violations), but this is likely to be extended in the future.
Table A.1. Postgres Pro Error Codes
Source
Monitoring a Greenplum System
You can monitor a Greenplum Database system using a variety of tools included with the system or available as add-ons.
Observing the day-to-day performance of the Greenplum Database system helps administrators understand system behavior, plan workflow, and troubleshoot problems. This chapter discusses tools for monitoring database performance and activity.
Also, be sure to review Recommended Monitoring and Maintenance Tasks for monitoring activities you can script to quickly detect problems in the system.
Monitoring Database Activity and Performance
Greenplum Database includes an optional system monitoring and management database, gpperfmon , that administrators can enable. The gpperfmon_install command-line utility creates the gpperfmon database and enables data collection agents that collect and store query and system metrics in the database. Administrators can query metrics in the gpperfmon database. See the documentation for the gpperfmon database in the Greenplum Database Reference Guide .
Tanzu Greenplum Command Center, an optional web-based interface, provides cluster status information, graphical administrative tools, real-time query monitoring, and historical cluster and query data. Download the Greenplum Command Center package from VMware Tanzu Network and view the documentation at the Tanzu Greenplum Command Center Documentation web site.
Monitoring System State
As a Greenplum Database administrator, you must monitor the system for problem events such as a segment going down or running out of disk space on a segment host. The following topics describe how to monitor the health of a Greenplum Database system and examine certain state information for a Greenplum Database system.
Checking System State
A Greenplum Database system is composed of multiple PostgreSQL instances (the master and segments) spanning multiple machines. To monitor a Greenplum Database system, you need to know information about the system as a whole, as well as status information of the individual instances. The gpstate utility provides status information about a Greenplum Database system.
Viewing Master and Segment Status and Configuration
The default gpstate action is to check segment instances and show a brief status of the valid and failed segments. For example, to see a quick status of your Greenplum Database system:
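```shell
$ gpstate
```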
To see more detailed information about your Greenplum Database array configuration, use gpstate with the -s option:
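```shell
$ gpstate -s
```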
Viewing Your Mirroring Configuration and Status
If you are using mirroring for data redundancy, you may want to see the list of mirror segment instances in the system, their current synchronization status, and the mirror to primary mapping. For example, to see the mirror segments in the system and their status:
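```shell
$ gpstate -m   # standard gpstate option for listing mirror segments and their sync status
```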
To see the primary to mirror segment mappings:
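```shell
$ gpstate -c   # standard gpstate option showing primary-to-mirror mappings
```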
To see the status of the standby master mirror:
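```shell
$ gpstate -f   # standard gpstate option showing standby master details
```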
Checking Disk Space Usage
A database administrator’s most important monitoring task is to make sure the file systems where the master and segment data directories reside do not grow to more than 70 percent full. A filled data disk will not result in data corruption, but it may prevent normal database activity from continuing. If the disk grows too full, it can cause the database server to shut down.
You can use the gp_disk_free external table in the gp_toolkit administrative schema to check for remaining free space (in kilobytes) on the segment host file systems. For example:
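```sql
SELECT * FROM gp_toolkit.gp_disk_free;  -- free space per segment file system, in kilobytes
```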
Checking Sizing of Distributed Databases and Tables
The gp_toolkit administrative schema contains several views that you can use to determine the disk space usage for a distributed Greenplum Database database, schema, table, or index.
For a list of the available sizing views for checking database object sizes and disk space, see the Greenplum Database Reference Guide.
Viewing Disk Space Usage for a Database
To see the total size of a database (in bytes), use the gp_size_of_database view in the gp_toolkit administrative schema. For example:
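```sql
SELECT * FROM gp_toolkit.gp_size_of_database;  -- database sizes in bytes
```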
Viewing Disk Space Usage for a Table
The gp_toolkit administrative schema contains several views for checking the size of a table. The table sizing views list the table by object ID (not by name). To check the size of a table by name, you must look up the relation name ( relname ) in the pg_class table. For example:
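A sketch using the gp_size_of_table_disk sizing view ('sales' is a placeholder table name):

```sql
SELECT pg_class.relname, sotd.sotdsize
FROM gp_toolkit.gp_size_of_table_disk sotd
JOIN pg_class ON sotd.sotdoid = pg_class.oid
WHERE pg_class.relname = 'sales';
```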
For a list of the available table sizing views, see the Greenplum Database Reference Guide.
Viewing Disk Space Usage for Indexes
The gp_toolkit administrative schema contains a number of views for checking index sizes. To see the total size of all index(es) on a table, use the gp_size_of_all_table_indexes view. To see the size of a particular index, use the gp_size_of_index view. The index sizing views list tables and indexes by object ID (not by name). To check the size of an index by name, you must look up the relation name ( relname ) in the pg_class table. For example:
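A sketch using the gp_size_of_index view ('sales_idx' is a placeholder index name):

```sql
SELECT soi.soisize
FROM gp_toolkit.gp_size_of_index soi
JOIN pg_class ON soi.soioid = pg_class.oid
WHERE pg_class.relname = 'sales_idx';
```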
Checking for Data Distribution Skew
All tables in Greenplum Database are distributed, meaning their data is divided across all of the segments in the system. Unevenly distributed data may diminish query processing performance. A table’s distribution policy, set at table creation time, determines how the table’s rows are distributed. For information about choosing the table distribution policy, see the following topics:
The gp_toolkit administrative schema also contains a number of views for checking data distribution skew on a table. For information about how to check for uneven data distribution, see the Greenplum Database Reference Guide .
Viewing a Table’s Distribution Key
To see the columns used as the data distribution key for a table, you can use the \d+ meta-command in psql to examine the definition of a table. For example:
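```sql
\d+ sales   -- placeholder table; the "Distributed by:" line in the output shows the key
```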
When you create a replicated table, Greenplum Database stores all rows in the table on every segment. Replicated tables have no distribution key. Where the \d+ meta-command reports the distribution key for a normally distributed table, it shows Distributed Replicated for a replicated table.
Viewing Data Distribution
To see the data distribution of a table’s rows (the number of rows on each segment), you can run a query such as:
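```sql
SELECT gp_segment_id, count(*)
FROM sales   -- placeholder table name
GROUP BY gp_segment_id
ORDER BY gp_segment_id;
```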
A table is considered to have a balanced distribution if all segments have roughly the same number of rows.
Checking for Query Processing Skew
When a query is being processed, all segments should have equal workloads to ensure the best possible performance. If you identify a poorly-performing query, you may need to investigate further using the EXPLAIN command. For information about using the EXPLAIN command and query profiling, see Query Profiling.
Query processing workload can be skewed if the table’s data distribution policy and the query predicates are not well matched. To check for processing skew, you can run a query such as:
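```sql
SELECT gp_segment_id, count(*)
FROM sales                -- placeholder table name
WHERE status = 'open'     -- placeholder: use the predicate from the slow query
GROUP BY gp_segment_id;
```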
This will show the number of rows returned by segment for the given WHERE predicate.
As noted in Viewing Data Distribution, this query will fail if you run it on a replicated table because you cannot reference the gp_segment_id system column in a query on a replicated table.
Avoiding an Extreme Skew Warning
You may receive the following warning message while running a query that performs a hash join operation:
Extreme skew in the innerside of Hashjoin
Viewing Metadata Information about Database Objects
Greenplum Database tracks various metadata information in its system catalogs about the objects stored in a database, such as tables, views, indexes and so on, as well as global objects such as roles and tablespaces.
Viewing the Last Operation Performed
You can use the system views pg_stat_operations and pg_stat_partition_operations to look up actions performed on an object, such as a table. For example, to see the actions performed on a table, such as when it was created and when it was last vacuumed and analyzed:
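A sketch ('sales' is a placeholder table name):

```sql
SELECT schemaname, objname, usename, actionname, subtype, statime
FROM pg_stat_operations
WHERE objname = 'sales';
```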
Viewing the Definition of an Object
To see the definition of an object, such as a table or view, you can use the \d+ meta-command when working in psql. For example, to see the definition of a table:
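```sql
\d+ sales_by_region   -- placeholder relation name
```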
Viewing Session Memory Usage Information
You can create and use the session_level_memory_consumption view that provides information about the current memory utilization for sessions that are running queries on Greenplum Database. The view contains session information and information such as the database that the session is connected to, the query that the session is currently running, and memory consumed by the session processes.
Creating the session_level_memory_consumption View
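The Greenplum documentation creates the view by running a script shipped with the installation, roughly as follows (the exact path varies by version):

```shell
$ psql -d mydatabase -f $GPHOME/share/postgresql/contrib/gp_session_state.sql
```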
The session_level_memory_consumption View
The session_state.session_level_memory_consumption view provides information about memory consumption and idle time for sessions that are running SQL queries.
When resource queue-based resource management is active, the column is_runaway indicates whether Greenplum Database considers the session a runaway session based on the vmem memory consumption of the session’s queries. Under the resource queue-based resource management scheme, Greenplum Database considers the session a runaway when the queries consume an excessive amount of memory. The Greenplum Database server configuration parameter runaway_detector_activation_percent governs the conditions under which Greenplum Database considers a session a runaway session.
The is_runaway , runaway_vmem_mb , and runaway_command_cnt columns are not applicable when resource group-based resource management is active.
Table 1. session_state.session_level_memory_consumption
column | type | description |
---|---|---|
datname | name | Name of the database that the session is connected to. |
sess_id | integer | Session ID. |
usename | name | Name of the session user. |
query | text | Current SQL query that the session is running. |
segid | integer | Segment ID. |
vmem_mb | integer | Total vmem memory usage for the session in MB. |
is_runaway | boolean | Session is marked as runaway on the segment. |
qe_count | integer | Number of query processes for the session. |
active_qe_count | integer | Number of active query processes for the session. |
dirty_qe_count | integer | Number of query processes that have not yet released their memory. The value is -1 for sessions that are not running. |
runaway_vmem_mb | integer | Amount of vmem memory that the session was consuming when it was marked as a runaway session. |
runaway_command_cnt | integer | Command count for the session when it was marked as a runaway session. |
idle_start | timestamptz | The last time a query process in this session became idle. |
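For example, to find the sessions consuming the most memory (a sketch using the columns above):

```sql
SELECT datname, sess_id, segid, vmem_mb, is_runaway
FROM session_state.session_level_memory_consumption
ORDER BY vmem_mb DESC;
```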
Viewing Query Workfile Usage Information
The Greenplum Database administrative schema gp_toolkit contains views that display information about Greenplum Database workfiles. Greenplum Database creates workfiles on disk if it does not have sufficient memory to run the query in memory. This information can be used for troubleshooting and tuning queries. The information in the views can also be used to specify the values for the Greenplum Database configuration parameters gp_workfile_limit_per_query and gp_workfile_limit_per_segment .
These are the views in the schema gp_toolkit:
- The gp_workfile_entries view contains one row for each operator using disk space for workfiles on a segment at the current time.
- The gp_workfile_usage_per_query view contains one row for each query using disk space for workfiles on a segment at the current time.
- The gp_workfile_usage_per_segment view contains one row for each segment. Each row displays the total amount of disk space used for workfiles on the segment at the current time.
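For example, to see total current spill space per segment:

```sql
SELECT segid, size, numfiles
FROM gp_toolkit.gp_workfile_usage_per_segment
ORDER BY size DESC;
```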
For information about using gp_toolkit, see Using gp_toolkit.
Viewing the Database Server Log Files
Every database instance in Greenplum Database (master and segments) runs a PostgreSQL database server with its own server log file. Log files are created in the pg_log directory of the master and each segment data directory.
Log File Format
The server log files are written in comma-separated values (CSV) format. Some log entries will not have values for all log fields. For example, only log entries associated with a query worker process will have the slice_id populated. You can identify related log entries of a particular query by the query’s session identifier ( gp_session_id ) and command identifier ( gp_command_count ).
The following fields are written to the log:
Table 2. Greenplum Database Server Log Format
# | Field Name | Data Type | Description |
---|---|---|---|
1 | event_time | timestamp with time zone | Time that the log entry was written to the log |
2 | user_name | varchar(100) | The database user name |
3 | database_name | varchar(100) | The database name |
4 | process_id | varchar(10) | The system process ID (prefixed with "p") |
5 | thread_id | varchar(50) | The thread count (prefixed with "th") |
6 | remote_host | varchar(100) | On the master, the hostname/address of the client machine. On the segment, the hostname/address of the master. |
7 | remote_port | varchar(10) | The segment or master port number |
8 | session_start_time | timestamp with time zone | Time session connection was opened |
9 | transaction_id | int | Top-level transaction ID on the master. This ID is the parent of any subtransactions. |
10 | gp_session_id | text | Session identifier number (prefixed with "con") |
11 | gp_command_count | text | The command number within a session (prefixed with "cmd") |
12 | gp_segment | text | The segment content identifier (prefixed with "seg" for primaries or "mir" for mirrors). The master always has a content ID of -1. |
13 | slice_id | text | The slice ID (portion of the query plan being executed) |
14 | distr_tranx_id | text | Distributed transaction ID |
15 | local_tranx_id | text | Local transaction ID |
16 | sub_tranx_id | text | Subtransaction ID |
17 | event_severity | varchar(10) | Values include: LOG, ERROR, FATAL, PANIC, DEBUG1, DEBUG2 |
18 | sql_state_code | varchar(10) | SQL state code associated with the log message |
19 | event_message | text | Log or error message text |
20 | event_detail | text | Detail message text associated with an error or warning message |
21 | event_hint | text | Hint message text associated with an error or warning message |
22 | internal_query | text | The internally-generated query text |
23 | internal_query_pos | int | The cursor index into the internally-generated query text |
24 | event_context | text | The context in which this message gets generated |
25 | debug_query_string | text | User-supplied query string with full detail for debugging. This string can be modified for internal use. |
26 | error_cursor_pos | int | The cursor index into the query string |
27 | func_name | text | The function in which this message is generated |
28 | file_name | text | The internal code file where the message originated |
29 | file_line | int | The line of the code file where the message originated |
30 | stack_trace | text | Stack trace text associated with this message |
Searching the Greenplum Server Log Files
Greenplum Database provides a utility called gplogfilter that can search through a Greenplum Database log file for entries matching specified criteria. By default, this utility searches through the Greenplum Database master log file in the default logging location. For example, to display the last three lines of each of the log files under the master directory:
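```shell
$ gplogfilter -n 3   # -n N tails the last N lines of each log file
```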
To search through all segment log files simultaneously, run gplogfilter through the gpssh utility. For example, to display the last three lines of each segment log file:
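A sketch (the host file and data directory paths are placeholders for your layout):

```shell
$ gpssh -f seg_hosts_file -e 'source /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -n 3 /data/*/pg_log/gpdb*.csv'
```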
Using gp_toolkit
Use the Greenplum Database administrative schema gp_toolkit to query the system catalogs, log files, and operating environment for system status information. The gp_toolkit schema contains several views you can access using SQL commands. The gp_toolkit schema is accessible to all database users. Some objects require superuser permissions. Use a command similar to the following to add the gp_toolkit schema to your schema search path:
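```sql
ALTER ROLE myrole SET search_path TO myschema, gp_toolkit;  -- placeholder role and schema names
```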
For a description of the available administrative schema views and their usages, see the Greenplum Database Reference Guide .
SQL Standard Error Codes
The following table lists all the defined error codes. Some are not used, but are defined by the SQL standard. The error classes are also shown. For each error class there is a standard error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have any more-specific code assigned.
The PL/pgSQL condition name for each error code is the same as the phrase shown in the table, with underscores substituted for spaces. For example, code 22012, DIVISION BY ZERO, has condition name DIVISION_BY_ZERO. Condition names can be written in either upper or lower case.
Source