Error full join is only supported with merge joinable or hash joinable join conditions

Combining two spatial tables in PostGIS I have two PostGIS tables with geometries. One with number attributes and one with letter attributes. Each table has two features (rows). I would like to apply a operation on both stables so that I create the following result: The resulting table has 7 rows and contains all […]

Содержание

  1. Combining two spatial tables in PostGIS
  2. 2 Answers 2
  3. Use NATURAL FULL JOIN to compare two tables in SQL
  4. Using NATURAL FULL JOIN
  5. Pros and cons
  6. When there are NULL values in the data
  7. Row value expression NULL predicate
  8. Обсуждение: [HACKERS] PoC: full merge join on comparison clause
  9. [HACKERS] PoC: full merge join on comparison clause
  10. Re: [HACKERS] PoC: full merge join on comparison clause
  11. Re: [HACKERS] PoC: full merge join on comparison clause
  12. Re: [HACKERS] PoC: full merge join on comparison clause
  13. Re: [HACKERS] PoC: full merge join on comparison clause
  14. Re: [HACKERS] PoC: full merge join on comparison clause
  15. Re: [HACKERS] PoC: full merge join on comparison clause
  16. Re: FULL JOIN is only supported with merge-joinable join conditions
  17. Обсуждение: small table left outer join big table
  18. small table left outer join big table
  19. Re: small table left outer join big table
  20. Re: small table left outer join big table
  21. Re: small table left outer join big table
  22. Re: small table left outer join big table
  23. Re: small table left outer join big table
  24. Re: small table left outer join big table
  25. Re: small table left outer join big table
  26. Re: small table left outer join big table
  27. Re: small table left outer join big table
  28. Re: small table left outer join big table
  29. Re: small table left outer join big table

Combining two spatial tables in PostGIS

I have two PostGIS tables with geometries. One with number attributes and one with letter attributes. Each table has two features (rows).

I would like to apply a operation on both stables so that I create the following result:

The resulting table has 7 rows and contains all attributes from the input tables. Some features have both the letter attribute and number attributes.

I struggle to obtain this result using PostGIS / SQL. So far I’ve only been able to create the features that have both letter and number attributes.

Is there a way to include the remaining features in the same query? I could also create multiple tables and perform a join but this is a non-parsimonious solution. I’ve also tried ST_Union to no avail.

In ArcMap and Geopandas this is often referred to as «Union» without dissolving however as far as I can tell ST_Union in PostGIS does something different. http://geopandas.org/set_operations.html https://desktop.arcgis.com/en/arcmap/10.3/tools/analysis-toolbox/how-union-analysis-works.htm

2 Answers 2

You need the union of three sets:

1) the intersection of the two tables — has two values, one from each table.

2) The ST_DIFFERENCE Of the first table and the intersection set above — one value from 1st

3) The ST_DIFFERENCE of the second table and the intersection set above -one value from 2nd

I would turn the first set into its own table, then select the pairs of values from the two tables that show up in that table of intersections, and make a table of the ST_SYMDIFFERENCE of the two polygons. I believe that would give you groups 2 and 3 in a single pass. However, that only works if any given member of table 1 is intersected only by one member of table 2, and vice-versa. If so:

Then make a union of the intersections and the non-intersections. I’m sure there’s one really big SQL statement that would do that with no temporary tables, but I treat SQL databases like a calculator.

That code also assumes that the «Table1 value» and Table2 value are also primary, unique indexes. If not, you’ll need to pull those in.

If the two tables have members where there are multiple intersections — if Table 2 has two or more polygons intersecting with one polygon from table 1 — then THIS WILL NOT WORK RIGHT. You’d get two entries — one would be the Table 1 polygon minus the intersection with the first thing from Table2, another would be the same Table 1 polygon minus the other intersection.

That’s avoidable with a bulkier bit of code: you have to make all the members of the INTERSECTION_POLY table that contain the same value from Table 2 into a «multipolygon» that is the union of all of them, and find the ST_SYMDIFFERENCE From that multipolygon.

We can go there, but I don’t want to get into it if you don’t need it.

Источник

Use NATURAL FULL JOIN to compare two tables in SQL

There are a few ways to compare two similar tables in SQL. Assuming PostgreSQL syntax, we might have this schema:

It is now possible to use UNION and EXCEPT as suggested by Chris Saxon:

You can compare two tables with

t1 minus t2 union t2 minus t1

There’s a better way where you query each table once… but you need to list out all the columns

Luckily @StewAshton has built a neat package to generate the #SQL for you :)https://t.co/7OroPV6JdY

In PostgreSQL, we can write:

Notice how TABLE x is just standard SQL, and PostgreSQL, syntax sugar for SELECT * FROM x .

Unfortunately, this requires two accesses to each table. Can we do it with a single access?

Using NATURAL FULL JOIN

Assuming there are no NULL values, we can write this:

Why? Because a NATURAL JOIN is syntax sugar for joining using all the shared column names of the two tables, and the FULL JOIN makes sure we can retrieve also the columns that are not matched by the join predicate. Another way to write this is:

Unfortunately, as of PostgreSQL 12, this produces an error:

ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions

Pros and cons

Pros and cons compared to the set operator solution using UNION and EXCEPT :

Pros

  • Each table is accessed only once
  • Comparison is now name based, not column index based, i.e. it can still work if only parts of the columns are the shared

Cons

  • If index based column comparison was desired (because the tables are the same structurally, but do not share the exact same column names), then we’d have to rename each individual column to a common column name.
  • If there’s duplicate data, there’s going to be a cartesian product, which might make this solution quite slower
  • UNION and EXCEPT treat NULL values as “not distinct”. This isn’t the case with NATURAL JOIN . See workaround below

When there are NULL values in the data

In the presence of NULL values, we can no longer use NATURAL JOIN or JOIN .. USING . We could use the DISTINCT predicate :

Row value expression NULL predicate

Observe the usage of the esoteric NULL predicate for row value expressions, which uses the following truth table:

Yes. R IS NULL and NOT R IS NOT NULL are not the same thing in SQL…

Источник

Обсуждение: [HACKERS] PoC: full merge join on comparison clause

[HACKERS] PoC: full merge join on comparison clause

Вложения

Re: [HACKERS] PoC: full merge join on comparison clause

Re: [HACKERS] PoC: full merge join on comparison clause

Re: [HACKERS] PoC: full merge join on comparison clause

Вложения

Re: [HACKERS] PoC: full merge join on comparison clause

Re: [HACKERS] PoC: full merge join on comparison clause

Re: [HACKERS] PoC: full merge join on comparison clause

Thanks for the review.

Jeff, I’m copying you because this is relevant to our discussion about what to do with mergeopfamilies when adding new merge join types.

For mergeopfamilies, I’m not sure what is the best thing to do. I’ll try to explain my understanding of the situation, please correct me if I’m wrong.

Before the patch, mergeopfamilies was used for two things: creating equivalence classes and performing merge joins.

For equivalence classes: we look at the restriction clauses, and if they have mergeopfamilies set, it means that these clause are based on an equality operator, and the left and right variables must be equal. To record this fact, we create an equivalence class. The variables might be equal for one equality operator and not equal for another, so we record the particular operator families to which our equality operator belongs.

For merge join: we look at the join clauses, and if they have mergeopfamilies set, it means that these clauses are based on an equality operator, and we can try performing this particular join as merge join. These opfamilies are also used beforehand to create the equivalence classes for left and right variables. The equivalence classes are used to match the join clauses to pathkeys describing the ordering of join inputs.

So, if we want to start doing merge joins for operators other than equality, we still need to record their opfamilies, but only use them for the second case and not the first. I chose to put these opfamilies to different variables, and
name the one used for equivalence classes ‘equivopfamilies’ and the one used for merge joins ‘mergeopfamilies’. The equality operators are used for both cases, so we put their opfamilies into both of these variables.

I agree this might look confusing. Indeed, we could keep a single variable for opfamilies, and add separate flags that show how they can be used, be that for equivalence classes, merge joins, range joins or some combination of them. This is similar to what Jeff did in his range merge join patch [1]. I will think more about this and try to produce an updated patch.

I changed mergejoinscansel() slightly to reflect the fact that the inner relation is scanned from the beginning if we have an inequality merge clause.

I extended the comment in final_cost_mergejoin(). Not sure if that approximation makes any sense, but this is the best I could think of.

Style problems are fixed.

Attached please find the new version of the patch that addresses all the review comments except mergeopfamilies.

The current commitfest is ending, but I’d like to continue working on this patch, so I am moving it to the next one.

Источник

Re: FULL JOIN is only supported with merge-joinable join conditions

From: Tom Lane
To: «Andrus»
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: FULL JOIN is only supported with merge-joinable join conditions
Date: 2007-05-20 01:42:24
Message-ID: 20388.1179625344@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

«Andrus» writes:
> I try to port application to PostgreSQL 8.1+
> The following query runs OK in VFP but causes error in Postgres
> FULL JOIN is only supported with merge-joinable join conditions

> SELECT
> .
> FROM iandmed
> FULL JOIN koosseis ON iandmed.ametikoht=koosseis.ametikoht
> AND iandmed.Kuluobj= koosseis.objekt1
> AND iandmed.AmetiKoht is not null

Uh, can’t you just drop the «iandmed.AmetiKoht is not null» condition?
It seems redundant considering that «iandmed.ametikoht=koosseis.ametikoht»
isn’t going to succeed when ametikoht is null.

In the long run we should teach hash join to support full-join behavior,
which would allow cases like this one to work; but it seems not very
high priority, since I’ve yet to see a real-world case where a
non-merge-joinable full-join condition was really needed. (FULL JOIN
being inherently symmetric, the join condition should usually be
symmetric as well. )

Источник

Обсуждение: small table left outer join big table

small table left outer join big table

Re: small table left outer join big table

Please see the following plan:

postgres=# explain select * from small_table left outer join big_table using (id);
QUERY PLAN
—————————————————————————-
Hash Left Join (cost=126408.00..142436.98 rows=371 width=12)
Hash Cond: (small_table.id = big_table.id)
-> Seq Scan on small_table (cost=0.00..1.09 rows=9 width=8)
-> Hash (cost=59142.00..59142.00 rows=4100000 width=8)
-> Seq Scan on big_table (cost=0.00..59142.00 rows=4100000 width=8)
(5 rows)

Here I have a puzzle, why not choose the small table to build hash table? It can avoid multiple batches thus save significant I/O cost, isn’t it?

We can perform this query in two phases:
1) inner join, using the small table to build hash table.
2) check whether each tuple in the hash table has matches before, which can be done with another flag bit

The only compromise is the output order, due to the two separate phases. Not sure whether the SQL standard requires it.

SQL standard does not require the result to be in any particular order unless an ORDER BY is used.


gurjeet.singh
@ EnterpriseDB — The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurjeet@< gmail | yahoo >.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device

Re: small table left outer join big table

Re: small table left outer join big table

Re: small table left outer join big table

Re: small table left outer join big table

Re: small table left outer join big table

Re: small table left outer join big table

Re: small table left outer join big table

Re: small table left outer join big table

Re: small table left outer join big table

Re: small table left outer join big table

On Wed, 2010-12-29 at 09:59 -0500, Tom Lane wrote:
> Robert Haas writes:
> > On Wed, Dec 29, 2010 at 7:34 AM, Simon Riggs wrote:
> >> It’s not a bug, that’s the way it currently works. We don’t need a test
> >> case for that.
>
> > Oh, you’re right. I missed the fact that it’s a left join.
>
> The only thing that struck me as curious about it was that the OP didn’t
> get a nestloop-with-inner-indexscan plan. That would be explainable if
> there was no index on the large table’s «id» column . but columns
> named like that usually have indexes.
>
> I can’t get all *that* excited about complicating hash joins as
> proposed. The query is still fundamentally going to be slow because
> you won’t get out of having to seqscan the large table. The only way
> to make it really fast is to not read all of the large table, and
> nestloop-with-inner-indexscan is the only plan type with a hope of
> doing that.

Seq scanning the big table isn’t bad. we’ve gone to a lot of trouble
to make it easy to do this, especially with many users.

Maintaining many large indexes is definitely bad, all that random I/O is
going to suck badly.

Seems like an interesting and relatively optimisation to me. Not sure if
this is a request for feature, or a proposal to write the optimisation.
I hope its the latter.

Thanks for your comments. Yeah I’m excited to write code for PostgreSQL, but I’m new here
and not familiar with the code routine or patch submission. I will try to learn in near future. So
for the moment, it is a request for feature, and I’m looking forward to any pgsql-hackers working
on this.

Источник

Попытка объединить 2 таблицы с таким условием ИЛИ:

FULL JOIN table1
     ON (replace(split_part(table1.contract_award_number::text, ' '::text, 2), '-'::text, ''::text)) = table2.contract_award_id 

     OR (btrim(replace(table1.solicitation_number::text, '-'::text, ''::text))) = table2.solicitation_id

Но Postgresql лает на меня:

FULL JOIN is only supported with merge-joinable or hash-joinable join conditions

Что дает? По какой-то причине, если я добавлю условие:

WHERE table1.solicitation_number::text ~~ '%%'::text 

Ошибка не возникает, но я подозреваю, что это портит результат FULL JOIN.

Спасибо за любую помощь.

2 ответа

Лучший ответ

Должна быть возможность эмулировать любое полное внешнее соединение между двумя таблицами, используя следующий запрос:

SELECT * FROM t1
LEFT JOIN t2 ON t1.id = t2.id
UNION ALL
SELECT * FROM t1
RIGHT JOIN t2 ON t1.id = t2.id
WHERE t1.id IS NULL

Первая половина объединения получает записи, уникальные для первой таблицы, вместе со всеми перекрывающимися записями. Вторая половина объединения получает записи, относящиеся только ко второй таблице. Применение этого шаблона к вашему запросу дает:

SELECT column1, column2, column3
FROM fpds_opportunities fpds
LEFT JOIN fbo_all_opportunity_detail fbo
    ON replace(split_part(fbo.contract_award_number::text, ' '::text, 2), 
               '-'::text, ''::text) = fpds.contract_award_id OR
       btrim(replace(fbo.solicitation_number::text, '-'::text, ''::text)) = fpds.solicitation_id
UNION ALL
SELECT column1, column2, column3
FROM fpds_opportunities fpds
RIGHT JOIN fbo_all_opportunity_detail fbo
    ON replace(split_part(fbo.contract_award_number::text, ' '::text, 2), 
               '-'::text, ''::text) = fpds.contract_award_id OR
       btrim(replace(fbo.solicitation_number::text, '-'::text, ''::text)) = fpds.solicitation_id
WHERE
    fpds.contract_award_id IS NULL AND fdps.solicitation_id IS NULL;


8

Tim Biegeleisen
21 Ноя 2017 в 06:08

Вы можете предварительно вычислить уродливые строки в подзапросе (или CTE) и JOIN с этим. (это также кажется удобным для построения и тестирования запроса; вы никогда не поймете эти строковые вещи правильно с первого раза …)


SELECT ...
FROM table2
FULL JOIN (
        SELECT *
        , replace(split_part(table1.contract_award_number::text, ' '::text, 2), '-'::text, ''::text) AS xca
        , btrim(replace(table1.solicitation_number::text, '-'::text, ''::text)) AS xsa
        FROM table1
        ) AS t1
                ON table2.contract_award_id = t1.xca
                OR table2.solicitation_id = t1.xsa
        ;


1

joop
21 Ноя 2017 в 10:38

There are a few ways to compare two similar tables in SQL. Assuming PostgreSQL syntax, we might have this schema:

CREATE TABLE t1 (a INT, b INT, c INT);
CREATE TABLE t2 (a INT, b INT, c INT);
INSERT INTO t1 VALUES (1, 2, 3), (4, 5, 6), (7, 8, 9);
INSERT INTO t2 VALUES            (4, 5, 6), (7, 8, 9), (10, 11, 12);

It is now possible to use UNION and EXCEPT as suggested by Chris Saxon:

You can compare two tables with

t1 minus t2 union t2 minus t1

There’s a better way where you query each table once… but you need to list out all the columns

Luckily @StewAshton has built a neat package to generate the #SQL for you :)https://t.co/7OroPV6JdY

— Chris Saxon (@ChrisRSaxon) August 5, 2020

In PostgreSQL, we can write:

(TABLE t1 EXCEPT TABLE t2) 
UNION
(TABLE t2 EXCEPT TABLE t1)
ORDER BY a, b, c

Notice how TABLE x is just standard SQL, and PostgreSQL, syntax sugar for SELECT * FROM x.

And we’ll get:

a |b |c |
--|--|--|
 1| 2| 3|
10|11|12|

Unfortunately, this requires two accesses to each table. Can we do it with a single access?

Using NATURAL FULL JOIN

Yes! Using NATURAL FULL JOIN, another rare use-case for this esoteric operator.

Assuming there are no NULL values, we can write this:

SELECT *
FROM (
  SELECT 't1' AS t1, t1.* FROM t1
) t1 NATURAL FULL JOIN (
  SELECT 't2' AS t2, t2.* FROM t2
) t2 
WHERE NOT (t1, t2) IS NOT NULL;

This produces:

a |b |c |t1|t2|
--|--|--|--|--|
 1| 2| 3|t1|  |
10|11|12|  |t2|

Why? Because a NATURAL JOIN is syntax sugar for joining using all the shared column names of the two tables, and the FULL JOIN makes sure we can retrieve also the columns that are not matched by the join predicate. Another way to write this is:

-- Use JOIN .. USING, instead of NATURAL JOIN
SELECT *
FROM (
  SELECT 't1' AS t1, t1.* FROM t1
) t1 FULL JOIN (
  SELECT 't2' AS t2, t2.* FROM t2
) t2 USING (a, b, c)
WHERE NOT (t1, t2) IS NOT NULL;

Or:

-- Use JOIN .. ON, instead of JOIN .. USING
SELECT 
  coalesce(t1.a, t2.a) AS a,
  coalesce(t1.b, t2.b) AS b,
  coalesce(t1.c, t2.c) AS c,
  t1.t1,
  t2.t2
FROM (
  SELECT 't1' AS t1, t1.* FROM t1
) t1 FULL JOIN (
  SELECT 't2' AS t2, t2.* FROM t2
) t2 ON (t1.a, t1.b, t1.c) = (t2.a, t2.b, t2.c) 
WHERE NOT (t1, t2) IS NOT NULL;

Unfortunately, as of PostgreSQL 12, this produces an error:

ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions

Pros and cons

Pros and cons compared to the set operator solution using UNION and EXCEPT:

Pros

  • Each table is accessed only once
  • Comparison is now name based, not column index based, i.e. it can still work if only parts of the columns are the shared

Cons

  • If index based column comparison was desired (because the tables are the same structurally, but do not share the exact same column names), then we’d have to rename each individual column to a common column name.
  • If there’s duplicate data, there’s going to be a cartesian product, which might make this solution quite slower
  • UNION and EXCEPT treat NULL values as “not distinct”. This isn’t the case with NATURAL JOIN. See workaround below

When there are NULL values in the data

In the presence of NULL values, we can no longer use NATURAL JOIN or JOIN .. USING. We could use the DISTINCT predicate:

SELECT 
  coalesce(t1.a, t2.a) AS a,
  coalesce(t1.b, t2.b) AS b,
  coalesce(t1.c, t2.c) AS c,
  t1.t1,
  t2.t2
FROM (
  SELECT 't1' AS t1, t1.* FROM t1
) t1 FULL JOIN (
  SELECT 't2' AS t2, t2.* FROM t2
) t2 ON (t1.a, t1.b, t1.c) IS NOT DISTINCT FROM (t2.a, t2.b, t2.c) 
WHERE NOT (t1, t2) IS NOT NULL;

Row value expression NULL predicate

Observe the usage of the esoteric NULL predicate for row value expressions, which uses the following truth table:

+-----------------------+-----------+---------------+---------------+-------------------+
| Expression            | R IS NULL | R IS NOT NULL | NOT R IS NULL | NOT R IS NOT NULL |
+-----------------------+-----------+---------------+---------------+-------------------+
| degree 1: null        | true      | false         | false         |  true             |
| degree 1: not null    | false     | true          | true          |  false            |
| degree > 1: all null  | true      | false         | false         |  true             |
| degree > 1: some null | false     | false         | true          |  true             |
| degree > 1: none null | false     | true          | true          |  false            |
+-----------------------+-----------+---------------+---------------+-------------------+

Yes. R IS NULL and NOT R IS NOT NULL are not the same thing in SQL…

It’s just another way of writing:

SELECT *
FROM (
  SELECT 't1' AS t1, t1.* FROM t1
) t1 NATURAL FULL JOIN (
  SELECT 't2' AS t2, t2.* FROM t2
) t2 
WHERE t1 IS NULL
OR t2 IS NULL;

Понравилась статья? Поделить с друзьями:
  • Error ftp login failed
  • Error ftdiun2k ini not found press finish to exit
  • Error fsyncing closing dev sdb input output error
  • Error fsyncing closing dev sda1 input output error
  • Error fsyncing closing dev sda input output error