Sql error 53000 error number of workfiles per query limit exceeded


Contents

  1. Sql error 53000 error workfile per query size limit exceeded
  2. Appendix A. PostgreSQL Error Codes
  3. Sql error 53000 error workfile per query size limit exceeded
  4. Submit correction

Sql error 53000 error workfile per query size limit exceeded


Greenplum Database creates spill files, also known as workfiles, on disk if it does not have sufficient memory to run an SQL query in memory.

The maximum number of spill files for a given query is governed by the gp_workfile_limit_files_per_query server configuration parameter setting. The default value of 100,000 spill files is sufficient for the majority of queries.

If a query creates more than the configured number of spill files, Greenplum Database returns this error:

ERROR: number of workfiles per query limit exceeded

Greenplum Database may generate a large number of spill files when:

  • Data skew is present in the queried data. To check for data skew, see Checking for Data Distribution Skew.
  • The amount of memory allocated for the query is too low. You control the maximum amount of memory that can be used by a query with the Greenplum Database server configuration parameters max_statement_mem and statement_mem, or through resource group or resource queue configuration.

You might be able to run the query successfully by changing the query, changing the data distribution, or changing the system memory configuration. The gp_toolkit gp_workfile_* views display spill file usage information. You can use this information to troubleshoot and tune queries. The gp_workfile_* views are described in Checking Query Disk Spill Space Usage.

Additional documentation resources:

  • Memory Consumption Parameters identifies the memory-related spill file server configuration parameters.
  • Using Resource Groups describes memory and spill considerations when resource group-based resource management is active.
  • Using Resource Queues describes memory and spill considerations when resource queue-based resource management is active.


Appendix A. PostgreSQL Error Codes

All messages emitted by the PostgreSQL server are assigned five-character error codes that correspond to the "SQLSTATE" codes described in the SQL standard. Applications that need to know which error condition occurred usually check the error code and only then consult the textual error message. The error codes are unlikely to change from one PostgreSQL release to the next, and they do not change when error messages are localized. Note that some, but not all, of the error codes emitted by PostgreSQL are defined by the SQL standard; some additional codes for conditions not described by the standard were added independently or borrowed from other databases.

According to the standard, the first two characters of an error code denote the error class, and the last three characters denote a specific condition within that class. Thus, an application that does not know the meaning of a specific error code can still infer what to do from the error class.

Table A-1 lists all the error codes defined in PostgreSQL 9.4.1. (Some codes are not currently used, although they are defined in the SQL standard.) The error classes are also shown. Each error class has a "standard" error code whose last three characters are 000. This code is emitted only for error conditions that belong to the class but have no more specific code.

The symbol shown in the "Condition Name" column is the condition name to use in PL/pgSQL. Condition names can be written in upper or lower case. (Note that PL/pgSQL recognizes error conditions but not warning conditions, that is, classes 00, 01, and 02.)

For some types of errors, the server reports the name of a database object (a table, table column, data type, or constraint) associated with the error; for example, the name of the unique constraint that caused a unique_violation error. Such names are supplied in separate fields of the error message so that applications do not have to extract them from the possibly localized human-readable text. As of PostgreSQL 9.3, only errors in SQLSTATE class 23 (integrity constraint violations) were fully covered, but other classes are expected to be covered in the future.


Sql error 53000 error workfile per query size limit exceeded

All messages emitted by the PostgreSQL server are assigned five-character error codes that follow the SQL standard’s conventions for “SQLSTATE” codes. Applications that need to know which error condition has occurred should usually test the error code, rather than looking at the textual error message. The error codes are less likely to change across PostgreSQL releases, and also are not subject to change due to localization of error messages. Note that some, but not all, of the error codes produced by PostgreSQL are defined by the SQL standard; some additional error codes for conditions not defined by the standard have been invented or borrowed from other databases.

According to the standard, the first two characters of an error code denote a class of errors, while the last three characters indicate a specific condition within that class. Thus, an application that does not recognize the specific error code might still be able to infer what to do from the error class.
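The class/condition split described above can be sketched in a few lines of Python. This is a minimal illustration, not part of PostgreSQL or any driver: the function names `split_sqlstate` and `describe_sqlstate` are hypothetical, and the `KNOWN_CLASSES` dictionary holds only a handful of class names copied from Table A.1.

```python
# Hypothetical helpers; only the two-character/three-character split
# and the class names come from the appendix text and Table A.1.
KNOWN_CLASSES = {
    "00": "Successful Completion",
    "23": "Integrity Constraint Violation",
    "53": "Insufficient Resources",
    "54": "Program Limit Exceeded",
    "57": "Operator Intervention",
}


def split_sqlstate(code):
    """Split a five-character SQLSTATE into (class, condition)."""
    if len(code) != 5:
        raise ValueError("SQLSTATE codes are exactly five characters long")
    return code[:2], code[2:]


def describe_sqlstate(code):
    """Describe a code by its class, even if the exact code is unknown."""
    error_class, condition = split_sqlstate(code)
    return {
        "class": error_class,
        "class_name": KNOWN_CLASSES.get(error_class, "unknown class"),
        # A condition of 000 is the generic code for the whole class.
        "is_class_default": condition == "000",
    }
```

An application that receives 53000 could use such a helper to learn that the error belongs to the "Insufficient Resources" class and carries no more specific condition, which is enough to decide whether retrying with different resource settings makes sense.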

Table A.1 lists all the error codes defined in PostgreSQL 15.1. (Some are not actually used at present, but are defined by the SQL standard.) The error classes are also shown. For each error class there is a “standard” error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have any more-specific code assigned.

The symbol shown in the column “Condition Name” is the condition name to use in PL/pgSQL. Condition names can be written in either upper or lower case. (Note that PL/pgSQL does not recognize warning, as opposed to error, condition names; those are classes 00, 01, and 02.)

For some types of errors, the server reports the name of a database object (a table, table column, data type, or constraint) associated with the error; for example, the name of the unique constraint that caused a unique_violation error. Such names are supplied in separate fields of the error report message so that applications need not try to extract them from the possibly-localized human-readable text of the message. As of PostgreSQL 9.3, complete coverage for this feature exists only for errors in SQLSTATE class 23 (integrity constraint violation), but this is likely to be expanded in future.

Table A.1. PostgreSQL Error Codes

Error Code Condition Name
Class 00 — Successful Completion
00000 successful_completion
Class 01 — Warning
01000 warning
0100C dynamic_result_sets_returned
01008 implicit_zero_bit_padding
01003 null_value_eliminated_in_set_function
01007 privilege_not_granted
01006 privilege_not_revoked
01004 string_data_right_truncation
01P01 deprecated_feature
Class 02 — No Data (this is also a warning class per the SQL standard)
02000 no_data
02001 no_additional_dynamic_result_sets_returned
Class 03 — SQL Statement Not Yet Complete
03000 sql_statement_not_yet_complete
Class 08 — Connection Exception
08000 connection_exception
08003 connection_does_not_exist
08006 connection_failure
08001 sqlclient_unable_to_establish_sqlconnection
08004 sqlserver_rejected_establishment_of_sqlconnection
08007 transaction_resolution_unknown
08P01 protocol_violation
Class 09 — Triggered Action Exception
09000 triggered_action_exception
Class 0A — Feature Not Supported
0A000 feature_not_supported
Class 0B — Invalid Transaction Initiation
0B000 invalid_transaction_initiation
Class 0F — Locator Exception
0F000 locator_exception
0F001 invalid_locator_specification
Class 0L — Invalid Grantor
0L000 invalid_grantor
0LP01 invalid_grant_operation
Class 0P — Invalid Role Specification
0P000 invalid_role_specification
Class 0Z — Diagnostics Exception
0Z000 diagnostics_exception
0Z002 stacked_diagnostics_accessed_without_active_handler
Class 20 — Case Not Found
20000 case_not_found
Class 21 — Cardinality Violation
21000 cardinality_violation
Class 22 — Data Exception
22000 data_exception
2202E array_subscript_error
22021 character_not_in_repertoire
22008 datetime_field_overflow
22012 division_by_zero
22005 error_in_assignment
2200B escape_character_conflict
22022 indicator_overflow
22015 interval_field_overflow
2201E invalid_argument_for_logarithm
22014 invalid_argument_for_ntile_function
22016 invalid_argument_for_nth_value_function
2201F invalid_argument_for_power_function
2201G invalid_argument_for_width_bucket_function
22018 invalid_character_value_for_cast
22007 invalid_datetime_format
22019 invalid_escape_character
2200D invalid_escape_octet
22025 invalid_escape_sequence
22P06 nonstandard_use_of_escape_character
22010 invalid_indicator_parameter_value
22023 invalid_parameter_value
22013 invalid_preceding_or_following_size
2201B invalid_regular_expression
2201W invalid_row_count_in_limit_clause
2201X invalid_row_count_in_result_offset_clause
2202H invalid_tablesample_argument
2202G invalid_tablesample_repeat
22009 invalid_time_zone_displacement_value
2200C invalid_use_of_escape_character
2200G most_specific_type_mismatch
22004 null_value_not_allowed
22002 null_value_no_indicator_parameter
22003 numeric_value_out_of_range
2200H sequence_generator_limit_exceeded
22026 string_data_length_mismatch
22001 string_data_right_truncation
22011 substring_error
22027 trim_error
22024 unterminated_c_string
2200F zero_length_character_string
22P01 floating_point_exception
22P02 invalid_text_representation
22P03 invalid_binary_representation
22P04 bad_copy_file_format
22P05 untranslatable_character
2200L not_an_xml_document
2200M invalid_xml_document
2200N invalid_xml_content
2200S invalid_xml_comment
2200T invalid_xml_processing_instruction
22030 duplicate_json_object_key_value
22031 invalid_argument_for_sql_json_datetime_function
22032 invalid_json_text
22033 invalid_sql_json_subscript
22034 more_than_one_sql_json_item
22035 no_sql_json_item
22036 non_numeric_sql_json_item
22037 non_unique_keys_in_a_json_object
22038 singleton_sql_json_item_required
22039 sql_json_array_not_found
2203A sql_json_member_not_found
2203B sql_json_number_not_found
2203C sql_json_object_not_found
2203D too_many_json_array_elements
2203E too_many_json_object_members
2203F sql_json_scalar_required
2203G sql_json_item_cannot_be_cast_to_target_type
Class 23 — Integrity Constraint Violation
23000 integrity_constraint_violation
23001 restrict_violation
23502 not_null_violation
23503 foreign_key_violation
23505 unique_violation
23514 check_violation
23P01 exclusion_violation
Class 24 — Invalid Cursor State
24000 invalid_cursor_state
Class 25 — Invalid Transaction State
25000 invalid_transaction_state
25001 active_sql_transaction
25002 branch_transaction_already_active
25008 held_cursor_requires_same_isolation_level
25003 inappropriate_access_mode_for_branch_transaction
25004 inappropriate_isolation_level_for_branch_transaction
25005 no_active_sql_transaction_for_branch_transaction
25006 read_only_sql_transaction
25007 schema_and_data_statement_mixing_not_supported
25P01 no_active_sql_transaction
25P02 in_failed_sql_transaction
25P03 idle_in_transaction_session_timeout
Class 26 — Invalid SQL Statement Name
26000 invalid_sql_statement_name
Class 27 — Triggered Data Change Violation
27000 triggered_data_change_violation
Class 28 — Invalid Authorization Specification
28000 invalid_authorization_specification
28P01 invalid_password
Class 2B — Dependent Privilege Descriptors Still Exist
2B000 dependent_privilege_descriptors_still_exist
2BP01 dependent_objects_still_exist
Class 2D — Invalid Transaction Termination
2D000 invalid_transaction_termination
Class 2F — SQL Routine Exception
2F000 sql_routine_exception
2F005 function_executed_no_return_statement
2F002 modifying_sql_data_not_permitted
2F003 prohibited_sql_statement_attempted
2F004 reading_sql_data_not_permitted
Class 34 — Invalid Cursor Name
34000 invalid_cursor_name
Class 38 — External Routine Exception
38000 external_routine_exception
38001 containing_sql_not_permitted
38002 modifying_sql_data_not_permitted
38003 prohibited_sql_statement_attempted
38004 reading_sql_data_not_permitted
Class 39 — External Routine Invocation Exception
39000 external_routine_invocation_exception
39001 invalid_sqlstate_returned
39004 null_value_not_allowed
39P01 trigger_protocol_violated
39P02 srf_protocol_violated
39P03 event_trigger_protocol_violated
Class 3B — Savepoint Exception
3B000 savepoint_exception
3B001 invalid_savepoint_specification
Class 3D — Invalid Catalog Name
3D000 invalid_catalog_name
Class 3F — Invalid Schema Name
3F000 invalid_schema_name
Class 40 — Transaction Rollback
40000 transaction_rollback
40002 transaction_integrity_constraint_violation
40001 serialization_failure
40003 statement_completion_unknown
40P01 deadlock_detected
Class 42 — Syntax Error or Access Rule Violation
42000 syntax_error_or_access_rule_violation
42601 syntax_error
42501 insufficient_privilege
42846 cannot_coerce
42803 grouping_error
42P20 windowing_error
42P19 invalid_recursion
42830 invalid_foreign_key
42602 invalid_name
42622 name_too_long
42939 reserved_name
42804 datatype_mismatch
42P18 indeterminate_datatype
42P21 collation_mismatch
42P22 indeterminate_collation
42809 wrong_object_type
428C9 generated_always
42703 undefined_column
42883 undefined_function
42P01 undefined_table
42P02 undefined_parameter
42704 undefined_object
42701 duplicate_column
42P03 duplicate_cursor
42P04 duplicate_database
42723 duplicate_function
42P05 duplicate_prepared_statement
42P06 duplicate_schema
42P07 duplicate_table
42712 duplicate_alias
42710 duplicate_object
42702 ambiguous_column
42725 ambiguous_function
42P08 ambiguous_parameter
42P09 ambiguous_alias
42P10 invalid_column_reference
42611 invalid_column_definition
42P11 invalid_cursor_definition
42P12 invalid_database_definition
42P13 invalid_function_definition
42P14 invalid_prepared_statement_definition
42P15 invalid_schema_definition
42P16 invalid_table_definition
42P17 invalid_object_definition
Class 44 — WITH CHECK OPTION Violation
44000 with_check_option_violation
Class 53 — Insufficient Resources
53000 insufficient_resources
53100 disk_full
53200 out_of_memory
53300 too_many_connections
53400 configuration_limit_exceeded
Class 54 — Program Limit Exceeded
54000 program_limit_exceeded
54001 statement_too_complex
54011 too_many_columns
54023 too_many_arguments
Class 55 — Object Not In Prerequisite State
55000 object_not_in_prerequisite_state
55006 object_in_use
55P02 cant_change_runtime_param
55P03 lock_not_available
55P04 unsafe_new_enum_value_usage
Class 57 — Operator Intervention
57000 operator_intervention
57014 query_canceled
57P01 admin_shutdown
57P02 crash_shutdown
57P03 cannot_connect_now
57P04 database_dropped
57P05 idle_session_timeout
Class 58 — System Error (errors external to PostgreSQL itself)
58000 system_error
58030 io_error
58P01 undefined_file
58P02 duplicate_file
Class 72 — Snapshot Failure
72000 snapshot_too_old
Class F0 — Configuration File Error
F0000 config_file_error
F0001 lock_file_exists
Class HV — Foreign Data Wrapper Error (SQL/MED)
HV000 fdw_error
HV005 fdw_column_name_not_found
HV002 fdw_dynamic_parameter_value_needed
HV010 fdw_function_sequence_error
HV021 fdw_inconsistent_descriptor_information
HV024 fdw_invalid_attribute_value
HV007 fdw_invalid_column_name
HV008 fdw_invalid_column_number
HV004 fdw_invalid_data_type
HV006 fdw_invalid_data_type_descriptors
HV091 fdw_invalid_descriptor_field_identifier
HV00B fdw_invalid_handle
HV00C fdw_invalid_option_index
HV00D fdw_invalid_option_name
HV090 fdw_invalid_string_length_or_buffer_length
HV00A fdw_invalid_string_format
HV009 fdw_invalid_use_of_null_pointer
HV014 fdw_too_many_handles
HV001 fdw_out_of_memory
HV00P fdw_no_schemas
HV00J fdw_option_name_not_found
HV00K fdw_reply_handle
HV00Q fdw_schema_not_found
HV00R fdw_table_not_found
HV00L fdw_unable_to_create_execution
HV00M fdw_unable_to_create_reply
HV00N fdw_unable_to_establish_connection
Class P0 — PL/pgSQL Error
P0000 plpgsql_error
P0001 raise_exception
P0002 no_data_found
P0003 too_many_rows
P0004 assert_failure
Class XX — Internal Error
XX000 internal_error
XX001 data_corrupted
XX002 index_corrupted

Greenplum Database creates spill files, also known as workfiles, on disk if it does not have sufficient memory to execute an SQL query in memory.

By default, a query can create at most 100,000 spill files, which is sufficient for the majority of queries. If a query creates more than the configured number of spill files, Greenplum Database returns this error:

ERROR: number of workfiles per query limit exceeded

A query may generate a large number of spill files when:

  1. Data skew is present in the queried data.
  2. The amount of memory allocated for the query is too low.

When you get this error, you can address it in several ways:

  1. By changing the query or changing the data distribution.
  2. By changing the system memory configuration.

Note: You can control the maximum amount of memory that can be used by a query with the Greenplum Database server configuration parameters max_statement_mem and statement_mem, or through resource queues.

You can monitor spill file usage through the gp_workfile_* views in the gp_toolkit schema:

sachi=# \dv gp_toolkit.gp_workfile_*
List of relations
Schema | Name | Type | Owner | Storage
-----------+-------------------------------+------+---------+---------
gp_toolkit | gp_workfile_entries | view | gpadmin | none
gp_toolkit | gp_workfile_usage_per_query | view | gpadmin | none
gp_toolkit | gp_workfile_usage_per_segment | view | gpadmin | none
(3 rows)

Let's look at the definitions of these views.

1. gp_workfile_entries: This view contains one row for each operator using disk space for workfiles on a segment at the current time. The view is accessible to all users; however, non-superusers can see information only for the databases that they have permission to access.

2. gp_workfile_usage_per_query: This view contains one row for each query using disk space for workfiles on a segment at the current time. The view is accessible to all users; however, non-superusers can see information only for the databases that they have permission to access.

3. gp_workfile_usage_per_segment: This view contains one row for each segment. Each row displays the total amount of disk space used for workfiles on the segment at the current time. The view is accessible to all users; however, non-superusers can see information only for the databases that they have permission to access.

gp_workfile_usage_per_segment (one row for each segment)

||
V

gp_workfile_usage_per_query (one row for each query using disk space for workfiles on a segment at the current time)

gp_workfile_entries (one row for each operator using disk space for workfiles on a segment at the current time)

sachi=# \d gp_toolkit.gp_workfile_entries

View "gp_toolkit.gp_workfile_entries"
Column | Type | Modifiers
---------------+---------+-----------
datname | name |
procpid | integer |
sess_id | integer |
command_cnt | integer |
usename | name |
current_query | text |
segid | integer |
slice | integer |
optype | text |
workmem | integer |
size | bigint |
numfiles | integer |
directory | text |
state | text |
utility | integer |
View definition:
WITH all_entries AS (
SELECT c.segid, c.path, c.hash, c.size, c.utility, c.state, c.workmem, c.optype, c.slice, c.sessionid, c.commandid, c.query_start, c.numfiles
FROM ONLY gp_toolkit.__gp_localid, gp_toolkit.__gp_workfile_entries_f() c(segid integer, path text, hash integer, size bigint, utility integer, state integer, workmem integer, optype text, slice integer, sessionid integer, commandid integer, query_start timestamp with time zone, numfiles integer)
UNION ALL
SELECT c.segid, c.path, c.hash, c.size, c.utility, c.state, c.workmem, c.optype, c.slice, c.sessionid, c.commandid, c.query_start, c.numfiles
FROM ONLY gp_toolkit.__gp_masterid, gp_toolkit.__gp_workfile_entries_f() c(segid integer, path text, hash integer, size bigint, utility integer, state integer, workmem integer, optype text, slice integer, sessionid integer, commandid integer, query_start timestamp with time zone, numfiles integer)
)
SELECT s.datname,
CASE
WHEN c.state = 1 THEN s.procpid
ELSE NULL::integer
END AS procpid, c.sessionid AS sess_id, c.commandid AS command_cnt, s.usename,
CASE
WHEN c.state = 1 THEN s.current_query
ELSE NULL::text
END AS current_query, c.segid, c.slice, c.optype, c.workmem, c.size, c.numfiles, c.path AS directory,
CASE
WHEN c.state = 1 THEN 'RUNNING'::text
WHEN c.state = 2 THEN 'CACHED'::text
WHEN c.state = 3 THEN 'DELETING'::text
ELSE 'UNKNOWN'::text
END AS state, c.utility
FROM all_entries c
LEFT JOIN pg_stat_activity s ON c.sessionid = s.sess_id;

sachi=# \d gp_toolkit.gp_workfile_usage_per_query

View "gp_toolkit.gp_workfile_usage_per_query"
Column | Type | Modifiers
---------------+---------+-----------
datname | name |
procpid | integer |
sess_id | integer |
command_cnt | integer |
usename | name |
current_query | text |
segid | integer |
state | text |
size | numeric |
numfiles | bigint |
View definition:
SELECT gp_workfile_entries.datname, gp_workfile_entries.procpid, gp_workfile_entries.sess_id, gp_workfile_entries.command_cnt, gp_workfile_entries.usename, gp_workfile_entries.current_query, gp_workfile_entries.segid, gp_workfile_entries.state, sum(gp_workfile_entries.size) AS size, sum(gp_workfile_entries.numfiles) AS numfiles
FROM gp_toolkit.gp_workfile_entries
GROUP BY gp_workfile_entries.datname, gp_workfile_entries.procpid, gp_workfile_entries.sess_id, gp_workfile_entries.command_cnt, gp_workfile_entries.usename, gp_workfile_entries.current_query, gp_workfile_entries.segid, gp_workfile_entries.state;

sachi=# \d gp_toolkit.gp_workfile_usage_per_segment
View "gp_toolkit.gp_workfile_usage_per_segment"
Column | Type | Modifiers
----------+----------+-----------
segid | smallint |
size | numeric |
numfiles | bigint |
View definition:
SELECT gpseg.content AS segid, COALESCE(sum(wfe.size), 0::numeric) AS size, sum(wfe.numfiles) AS numfiles
FROM ( SELECT gp_segment_configuration.content
FROM gp_segment_configuration
WHERE gp_segment_configuration.role = 'p'::"char") gpseg
LEFT JOIN gp_toolkit.gp_workfile_entries wfe ON gpseg.content = wfe.segid
GROUP BY gpseg.content;
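The two aggregate views above are sums over gp_workfile_entries. The following Python sketch mimics that roll-up on in-memory rows so the relationship is easier to see. It is an illustration under stated assumptions, not Greenplum code: the function names are hypothetical, rows are plain dictionaries, and the per-query grouping key is simplified to (sess_id, command_cnt, segid), whereas the real view also groups by datname, procpid, usename, current_query, and state.

```python
from collections import defaultdict


def usage_per_query(entries):
    """Sum size and numfiles per (sess_id, command_cnt, segid).

    Simplified grouping key (an assumption); the real view groups
    by more columns.
    """
    totals = defaultdict(lambda: {"size": 0, "numfiles": 0})
    for entry in entries:
        key = (entry["sess_id"], entry["command_cnt"], entry["segid"])
        totals[key]["size"] += entry["size"]
        totals[key]["numfiles"] += entry["numfiles"]
    return dict(totals)


def usage_per_segment(entries, segids):
    """Sum size and numfiles per segment, keeping idle segments.

    Like the LEFT JOIN in the view, every segment in segids gets a row.
    A segment with no entries reports size 0 (the COALESCE) and
    numfiles None (an un-coalesced sum over no rows).
    """
    totals = {segid: {"size": 0, "numfiles": None} for segid in segids}
    for entry in entries:
        row = totals.get(entry["segid"])
        if row is None:
            continue  # entry belongs to a segment outside segids
        row["size"] += entry["size"]
        row["numfiles"] = (row["numfiles"] or 0) + entry["numfiles"]
    return totals
```

Comparing the per-segment totals is one way to spot skew: if one segment's numfiles is far higher than the others for the same query, the data distribution is a likely cause of the spill.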

sachi=#

Workfile Disk Spill Space (4.3.0.0, 4.3.1.0, 4.3.2.0)

Server Configuration Parameters for gp_workfiles

1. gp_workfile_compress_algorithm

2. gp_workfile_limit_files_per_query

3. gp_workfile_limit_per_query

4. gp_workfile_checksumming


-- Test workfile limits
-- Ensure the queries below need to spill to disk.
set statement_mem='1 MB';
-- SRF materializes the result in a tuplestore. Check that
-- gp_workfile_limit_per_query is enforced.
select count(distinct g) from generate_series(1, 1000000) g;
  count
---------
 1000000
(1 row)

set gp_workfile_limit_per_query='5 MB';
select count(distinct g) from generate_series(1, 1000000) g;
ERROR: workfile per query size limit exceeded
reset gp_workfile_limit_per_query;
-- Also test limit on number of files (gp_workfile_limit_files_per_query)
set gp_workfile_limit_files_per_query='4';
select count(g) from generate_series(1, 500000) g
union
select count(g) from generate_series(1, 500000) g
union
select count(g) from generate_series(1, 500000) g
order by 1;
 count
--------
 500000
(1 row)

set gp_workfile_limit_files_per_query='2';
select count(g) from generate_series(1, 500000) g
union
select count(g) from generate_series(1, 500000) g
union
select count(g) from generate_series(1, 500000) g
order by 1;
ERROR: number of workfiles per query limit exceeded
-- We cannot test the per-segment limit, because changing it requires
-- a postmaster restart. It's enforced in the same way as the per-query
-- limit, though, and it's simpler, so if the per-query limit works,
-- the per-segment limit probably works too.
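The test above shows two distinct error messages, each tied to a different parameter. A client-side script could use that pairing to tell which limit a failed query hit. The sketch below is a hypothetical helper, not part of Greenplum or any driver: the name `governing_parameter` is made up, and the mapping contains only the two message/parameter pairs that appear in the test.

```python
# Message text and parameter names are taken from the test output above;
# the helper itself is a hypothetical illustration.
LIMIT_PARAMETERS = {
    "number of workfiles per query limit exceeded":
        "gp_workfile_limit_files_per_query",
    "workfile per query size limit exceeded":
        "gp_workfile_limit_per_query",
}


def governing_parameter(error_text):
    """Return the parameter whose limit was hit, or None if unrecognized."""
    message = error_text.strip()
    # Drop the severity prefix that psql prints before the message.
    if message.upper().startswith("ERROR:"):
        message = message[len("ERROR:"):].strip()
    return LIMIT_PARAMETERS.get(message.lower())
```

Knowing which parameter applies matters because the remedies differ: a file-count limit points toward skew or too little statement_mem, while a size limit caps the total bytes spilled regardless of how many files hold them.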


Sql error 53000 error workfile per query size limit exceeded

All messages emitted by the Postgres Pro server are assigned five-character error codes that correspond to the "SQLSTATE" codes described in the SQL standard. Applications that need to know which error condition occurred usually check the error code and only then consult the textual error message. The error codes are unlikely to change from one Postgres Pro release to the next, and they do not change when error messages are localized. Note that some, but not all, of the error codes emitted by Postgres Pro are defined by the SQL standard; some additional codes for conditions not described by the standard were added independently or borrowed from other databases.

According to the standard, the first two characters of an error code denote the error class, and the last three characters denote a specific condition within that class. Thus, an application that does not know the meaning of a specific error code can still infer what to do from the error class.

Table A.1 lists all the error codes defined in Postgres Pro 9.5.20.1. (Some codes are not currently used, although they are defined in the SQL standard.) The error classes are also shown. Each error class has a "standard" error code whose last three characters are 000. This code is emitted only for error conditions that belong to the class but have no more specific code.

The symbol shown in the "Condition Name" column is the condition name to use in PL/pgSQL. Condition names can be written in upper or lower case. (Note that PL/pgSQL recognizes error conditions but not warning conditions, that is, classes 00, 01, and 02.)

For some types of errors, the server reports the name of a database object (a table, table column, data type, or constraint) associated with the error; for example, the name of the unique constraint that caused a unique_violation error. Such names are supplied in separate fields of the error message so that applications do not have to extract them from the possibly localized human-readable text. As of PostgreSQL 9.3, only errors in SQLSTATE class 23 (integrity constraint violations) were fully covered, but other classes are expected to be covered in the future.

Table A.1. Postgres Pro Error Codes


Monitoring a Greenplum System

You can monitor a Greenplum Database system using a variety of tools included with the system or available as add-ons.

Observing the Greenplum Database system day-to-day performance helps administrators understand the system behavior, plan workflow, and troubleshoot problems. This chapter discusses tools for monitoring database performance and activity.

Also, be sure to review Recommended Monitoring and Maintenance Tasks for monitoring activities you can script to quickly detect problems in the system.

Monitoring Database Activity and Performance

Greenplum Database includes an optional system monitoring and management database, gpperfmon, that administrators can enable. The gpperfmon_install command-line utility creates the gpperfmon database and enables data collection agents that collect and store query and system metrics in the database. Administrators can query metrics in the gpperfmon database. See the documentation for the gpperfmon database in the Greenplum Database Reference Guide.

Tanzu Greenplum Command Center, an optional web-based interface, provides cluster status information, graphical administrative tools, real-time query monitoring, and historical cluster and query data. Download the Greenplum Command Center package from VMware Tanzu Network and view the documentation at the Tanzu Greenplum Command Center Documentation web site.

Monitoring System State

As a Greenplum Database administrator, you must monitor the system for problem events such as a segment going down or running out of disk space on a segment host. The following topics describe how to monitor the health of a Greenplum Database system and examine certain state information for a Greenplum Database system.

Checking System State

A Greenplum Database system comprises multiple PostgreSQL instances (the master and segments) spanning multiple machines. To monitor a Greenplum Database system, you need information about the system as a whole as well as the status of the individual instances. The gpstate utility provides status information about a Greenplum Database system.

Viewing Master and Segment Status and Configuration

The default gpstate action is to check segment instances and show a brief status of the valid and failed segments. For example, to see a quick status of your Greenplum Database system, run gpstate with no options.

To see more detailed information about your Greenplum Database array configuration, run gpstate with the -s option (gpstate -s).

Viewing Your Mirroring Configuration and Status

If you are using mirroring for data redundancy, you may want to see the list of mirror segment instances in the system, their current synchronization status, and the mirror to primary mapping. For example, to see the mirror segments in the system and their status:

To see the primary to mirror segment mappings:

To see the status of the standby master mirror:

Checking Disk Space Usage

A database administrator’s most important monitoring task is to make sure the file systems where the master and segment data directories reside do not grow to more than 70 percent full. A filled data disk will not result in data corruption, but it may prevent normal database activity from continuing. If the disk grows too full, it can cause the database server to shut down.
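
As a rough illustration of the 70 percent guideline, the following Python sketch computes how full a file system is and flags it when it crosses the threshold. The function names are hypothetical, and the sketch assumes you already know the total and free space for a file system (any unit, as long as both values use the same one); the local variant uses the standard library's shutil.disk_usage and only inspects the host it runs on.

```python
import shutil


def percent_full(total: float, free: float) -> float:
    """Return the percentage of a file system that is in use."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (total - free) / total


def over_threshold(total: float, free: float, limit_percent: float = 70.0) -> bool:
    """True when usage exceeds the limit (70 percent by default, per the text)."""
    return percent_full(total, free) > limit_percent


def local_percent_full(path: str) -> float:
    """Percent full for the file system holding `path` on the local host."""
    usage = shutil.disk_usage(path)
    return percent_full(usage.total, usage.free)


# Example with made-up numbers: 1,000,000 KB total, 250,000 KB free.
print(percent_full(1_000_000, 250_000))
print(over_threshold(1_000_000, 250_000))
```

A check like this would have to run against every master and segment data directory, since each host has its own file systems.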

You can use the gp_disk_free external table in the gp_toolkit administrative schema to check for remaining free space (in kilobytes) on the segment host file systems. For example:

Checking Sizing of Distributed Databases and Tables

The gp_toolkit administrative schema contains several views that you can use to determine the disk space usage for a distributed Greenplum Database database, schema, table, or index.

For a list of the available sizing views for checking database object sizes and disk space, see the Greenplum Database Reference Guide.

Viewing Disk Space Usage for a Database

To see the total size of a database (in bytes), use the gp_size_of_database view in the gp_toolkit administrative schema. For example:

Viewing Disk Space Usage for a Table

The gp_toolkit administrative schema contains several views for checking the size of a table. The table sizing views list the table by object ID (not by name). To check the size of a table by name, you must look up the relation name (relname) in the pg_class table. For example:

For a list of the available table sizing views, see the Greenplum Database Reference Guide.

Viewing Disk Space Usage for Indexes

The gp_toolkit administrative schema contains a number of views for checking index sizes. To see the total size of all indexes on a table, use the gp_size_of_all_table_indexes view. To see the size of a particular index, use the gp_size_of_index view. The index sizing views list tables and indexes by object ID (not by name). To check the size of an index by name, you must look up the relation name (relname) in the pg_class table. For example:

Checking for Data Distribution Skew

All tables in Greenplum Database are distributed, meaning their data is divided across all of the segments in the system. Unevenly distributed data may diminish query processing performance. A table’s distribution policy, set at table creation time, determines how the table’s rows are distributed. For information about choosing the table distribution policy, see the following topics:

The gp_toolkit administrative schema also contains a number of views for checking data distribution skew on a table. For information about how to check for uneven data distribution, see the Greenplum Database Reference Guide.

Viewing a Table’s Distribution Key

To see the columns used as the data distribution key for a table, you can use the \d+ meta-command in psql to examine the definition of a table. For example:

When you create a replicated table, Greenplum Database stores all rows in the table on every segment. Replicated tables have no distribution key. Where the \d+ meta-command reports the distribution key for a normally distributed table, it shows Distributed Replicated for a replicated table.

Viewing Data Distribution

To see the data distribution of a table’s rows (the number of rows on each segment), you can run a query such as:

A table is considered to have a balanced distribution if all segments have roughly the same number of rows.
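
What "roughly the same number of rows" means is a judgment call. The Python sketch below shows one possible way to quantify it from per-segment row counts (for instance, counts grouped by the gp_segment_id system column). The function names and the 10 percent tolerance are assumptions made for illustration, not Greenplum definitions.

```python
from statistics import mean
from typing import Dict


def skew_ratio(rows_per_segment: Dict[int, int]) -> float:
    """Return how far the fullest segment sits above the average,
    as a fraction of the average (0.0 means perfectly even)."""
    if not rows_per_segment:
        raise ValueError("need at least one segment")
    counts = list(rows_per_segment.values())
    average = mean(counts)
    if average == 0:
        return 0.0  # an empty table is trivially balanced
    return (max(counts) - average) / average


def is_balanced(rows_per_segment: Dict[int, int], tolerance: float = 0.10) -> bool:
    """Treat the table as balanced when the fullest segment is within
    `tolerance` of the average (the 10 percent default is an assumption)."""
    return skew_ratio(rows_per_segment) <= tolerance


# Made-up counts: segment 3 holds three times as many rows as the others.
counts = {0: 100, 1: 100, 2: 100, 3: 300}
print(skew_ratio(counts))
print(is_balanced(counts))
```

The fullest segment matters most because a query finishes only when its slowest segment does.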

Checking for Query Processing Skew

When a query is being processed, all segments should have equal workloads to ensure the best possible performance. If you identify a poorly performing query, you may need to investigate further using the EXPLAIN command. For information about using the EXPLAIN command and query profiling, see Query Profiling.

Query processing workload can be skewed if the table’s data distribution policy and the query predicates are not well matched. To check for processing skew, you can run a query such as:

This will show the number of rows returned by segment for the given WHERE predicate.

As noted in Viewing Data Distribution, this query will fail if you run it on a replicated table because you cannot reference the gp_segment_id system column in a query on a replicated table.

Avoiding an Extreme Skew Warning

You may receive the following warning message while running a query that performs a hash join operation:

Extreme skew in the innerside of Hashjoin

Viewing Metadata Information about Database Objects

Greenplum Database tracks various metadata information in its system catalogs about the objects stored in a database, such as tables, views, indexes and so on, as well as global objects such as roles and tablespaces.

Viewing the Last Operation Performed

You can use the system views pg_stat_operations and pg_stat_partition_operations to look up actions performed on an object, such as a table. For example, to see the actions performed on a table, such as when it was created and when it was last vacuumed and analyzed:

Viewing the Definition of an Object

To see the definition of an object, such as a table or view, you can use the \d+ meta-command when working in psql. For example, to see the definition of a table:

Viewing Session Memory Usage Information

You can create and use the session_level_memory_consumption view, which provides information about the current memory utilization of sessions that are running queries on Greenplum Database. The view contains session information such as the database that the session is connected to, the query that the session is currently running, and the memory consumed by the session processes.

Creating the session_level_memory_consumption View

The session_level_memory_consumption View

The session_state.session_level_memory_consumption view provides information about memory consumption and idle time for sessions that are running SQL queries.

When resource queue-based resource management is active, the column is_runaway indicates whether Greenplum Database considers the session a runaway session based on the vmem memory consumption of the session’s queries. Under the resource queue-based resource management scheme, Greenplum Database considers the session a runaway when the queries consume an excessive amount of memory. The Greenplum Database server configuration parameter runaway_detector_activation_percent governs the conditions under which Greenplum Database considers a session a runaway session.

The is_runaway, runaway_vmem_mb, and runaway_command_cnt columns are not applicable when resource group-based resource management is active.

Table 1. session_state.session_level_memory_consumption

column | type | references | description
datname | name | | Name of the database that the session is connected to.
sess_id | integer | | Session ID.
usename | name | | Name of the session user.
query | text | | Current SQL query that the session is running.
segid | integer | | Segment ID.
vmem_mb | integer | | Total vmem memory usage for the session in MB.
is_runaway | boolean | | Session is marked as runaway on the segment.
qe_count | integer | | Number of query processes for the session.
active_qe_count | integer | | Number of active query processes for the session.
dirty_qe_count | integer | | Number of query processes that have not yet released their memory. The value is -1 for sessions that are not running.
runaway_vmem_mb | integer | | Amount of vmem memory that the session was consuming when it was marked as a runaway session.
runaway_command_cnt | integer | | Command count for the session when it was marked as a runaway session.
idle_start | timestamptz | | The last time a query process in this session became idle.

Viewing Query Workfile Usage Information

The Greenplum Database administrative schema gp_toolkit contains views that display information about Greenplum Database workfiles. Greenplum Database creates workfiles on disk if it does not have sufficient memory to run the query in memory. This information can be used for troubleshooting and tuning queries. The information in the views can also be used to specify the values for the Greenplum Database configuration parameters gp_workfile_limit_per_query and gp_workfile_limit_per_segment.

These are the views in the schema gp_toolkit:

  • The gp_workfile_entries view contains one row for each operator using disk space for workfiles on a segment at the current time.
  • The gp_workfile_usage_per_query view contains one row for each query using disk space for workfiles on a segment at the current time.
  • The gp_workfile_usage_per_segment view contains one row for each segment. Each row displays the total amount of disk space used for workfiles on the segment at the current time.

For information about using gp_toolkit, see Using gp_toolkit.

Viewing the Database Server Log Files

Every database instance in Greenplum Database (master and segments) runs a PostgreSQL database server with its own server log file. Log files are created in the pg_log directory of the master and each segment data directory.

Log File Format

The server log files are written in comma-separated values (CSV) format. Some log entries will not have values for all log fields. For example, only log entries associated with a query worker process will have the slice_id populated. You can identify related log entries of a particular query by the query's session identifier (gp_session_id) and command identifier (gp_command_count).

The following fields are written to the log:

Table 2. Greenplum Database Server Log Format

# | Field Name | Data Type | Description
1 | event_time | timestamp with time zone | Time that the log entry was written to the log
2 | user_name | varchar(100) | The database user name
3 | database_name | varchar(100) | The database name
4 | process_id | varchar(10) | The system process ID (prefixed with "p")
5 | thread_id | varchar(50) | The thread count (prefixed with "th")
6 | remote_host | varchar(100) | On the master, the hostname/address of the client machine. On the segment, the hostname/address of the master.
7 | remote_port | varchar(10) | The segment or master port number
8 | session_start_time | timestamp with time zone | Time session connection was opened
9 | transaction_id | int | Top-level transaction ID on the master. This ID is the parent of any subtransactions.
10 | gp_session_id | text | Session identifier number (prefixed with "con")
11 | gp_command_count | text | The command number within a session (prefixed with "cmd")
12 | gp_segment | text | The segment content identifier (prefixed with "seg" for primaries or "mir" for mirrors). The master always has a content ID of -1.
13 | slice_id | text | The slice ID (portion of the query plan being executed)
14 | distr_tranx_id | text | Distributed transaction ID
15 | local_tranx_id | text | Local transaction ID
16 | sub_tranx_id | text | Subtransaction ID
17 | event_severity | varchar(10) | Values include: LOG, ERROR, FATAL, PANIC, DEBUG1, DEBUG2
18 | sql_state_code | varchar(10) | SQL state code associated with the log message
19 | event_message | text | Log or error message text
20 | event_detail | text | Detail message text associated with an error or warning message
21 | event_hint | text | Hint message text associated with an error or warning message
22 | internal_query | text | The internally-generated query text
23 | internal_query_pos | int | The cursor index into the internally-generated query text
24 | event_context | text | The context in which this message gets generated
25 | debug_query_string | text | User-supplied query string with full detail for debugging. This string can be modified for internal use.
26 | error_cursor_pos | int | The cursor index into the query string
27 | func_name | text | The function in which this message is generated
28 | file_name | text | The internal code file where the message originated
29 | file_line | int | The line of the code file where the message originated
30 | stack_trace | text | Stack trace text associated with this message
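
Because the log is plain CSV with a fixed field order, it can be read with a general-purpose CSV parser. The following Python sketch names the fields as in the table above and groups entries by session and command identifier. The function names are hypothetical, the sample entry is invented for the example rather than copied from a real log, and real log files may need extra handling that this sketch does not attempt.

```python
import csv
import io
from collections import defaultdict

# Field names in log order, taken from the log format table.
LOG_FIELDS = [
    "event_time", "user_name", "database_name", "process_id", "thread_id",
    "remote_host", "remote_port", "session_start_time", "transaction_id",
    "gp_session_id", "gp_command_count", "gp_segment", "slice_id",
    "distr_tranx_id", "local_tranx_id", "sub_tranx_id", "event_severity",
    "sql_state_code", "event_message", "event_detail", "event_hint",
    "internal_query", "internal_query_pos", "event_context",
    "debug_query_string", "error_cursor_pos", "func_name", "file_name",
    "file_line", "stack_trace",
]


def parse_log(text):
    """Yield one dict per CSV log entry; missing trailing fields become ''."""
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        padded = row + [""] * (len(LOG_FIELDS) - len(row))
        yield dict(zip(LOG_FIELDS, padded))


def group_by_query(entries):
    """Group entries by (gp_session_id, gp_command_count)."""
    groups = defaultdict(list)
    for entry in entries:
        groups[(entry["gp_session_id"], entry["gp_command_count"])].append(entry)
    return dict(groups)


# Build an invented sample entry (not real log output) and parse it back.
sample = {name: "" for name in LOG_FIELDS}
sample.update(
    gp_session_id="con5",
    gp_command_count="cmd2",
    event_severity="ERROR",
    sql_state_code="53000",
    event_message="workfile per query size limit exceeded",
)
buffer = io.StringIO()
csv.writer(buffer).writerow([sample[name] for name in LOG_FIELDS])
sample_text = buffer.getvalue()

for key, entries in group_by_query(parse_log(sample_text)).items():
    print(key, [entry["sql_state_code"] for entry in entries])
```

Grouping on the session and command identifiers follows the advice above for finding all entries that belong to one query.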

Searching the Greenplum Server Log Files

Greenplum Database provides a utility called gplogfilter that can search through a Greenplum Database log file for entries matching the specified criteria. By default, this utility searches through the Greenplum Database master log file in the default logging location. For example, to display the last three lines of each of the log files under the master directory:

To search through all segment log files simultaneously, run gplogfilter through the gpssh utility. For example, to display the last three lines of each segment log file:

Using gp_toolkit

Use the Greenplum Database administrative schema gp_toolkit to query the system catalogs, log files, and operating environment for system status information. The gp_toolkit schema contains several views you can access using SQL commands. The gp_toolkit schema is accessible to all database users. Some objects require superuser permissions. Use a command similar to the following to add the gp_toolkit schema to your schema search path:

For a description of the available administrative schema views and their usages, see the Greenplum Database Reference Guide.

SQL Standard Error Codes

The following table lists all the defined error codes. Some are not used, but are defined by the SQL standard. The error classes are also shown. For each error class there is a standard error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have any more-specific code assigned.

The PL/pgSQL condition name for each error code is the same as the phrase shown in the table, with underscores substituted for spaces. For example, code 22012, DIVISION BY ZERO, has condition name DIVISION_BY_ZERO. Condition names can be written in either upper or lower case.
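
The naming rule is mechanical, as this small Python sketch shows. The helper names are hypothetical and the sketch only mirrors the rule stated above; it does not query the database.

```python
def condition_name(phrase: str) -> str:
    """Derive a condition name by replacing spaces with underscores."""
    return "_".join(phrase.split())


def same_condition(name_a: str, name_b: str) -> bool:
    """Compare condition names without regard to case."""
    return name_a.lower() == name_b.lower()


print(condition_name("DIVISION BY ZERO"))
print(same_condition("DIVISION_BY_ZERO", "division_by_zero"))
```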
