Commit 1a8b9fb5 authored by Tom Lane

Extend the unknowns-are-same-as-known-inputs type resolution heuristic.

For a very long time, one of the parser's heuristics for resolving
ambiguous operator calls has been to assume that unknown-type literals are
of the same type as the other input (if it's known).  However, this was
only used in the first step of quickly checking for an exact-types match,
and thus did not help in resolving matches that require coercion, such as
matches to polymorphic operators.  As we add more polymorphic operators,
this becomes more of a problem.  This patch adds another use of the same
heuristic as a last-ditch check before failing to resolve an ambiguous
operator or function call.  In particular this will let us define the range
inclusion operator in a less limited way (to come in a follow-on patch).
parent bf4f96b5
@@ -304,13 +304,18 @@ without more clues. Now discard
 candidates that do not accept the selected type category. Furthermore,
 if any candidate accepts a preferred type in that category,
 discard candidates that accept non-preferred types for that argument.
+Keep all candidates if none survive these tests.
+If only one candidate remains, use it; else continue to the next step.
 </para>
 </step>
 <step performance="required">
 <para>
-If only one candidate remains, use it. If no candidate or more than one
-candidate remains,
-then fail.
+If there are both <type>unknown</type> and known-type arguments, and all
+the known-type arguments have the same type, assume that the
+<type>unknown</type> arguments are also of that type, and check which
+candidates can accept that type at the <type>unknown</type>-argument
+positions. If exactly one candidate passes this test, use it.
+Otherwise, fail.
 </para>
 </step>
 </substeps>
@@ -376,7 +381,7 @@ be interpreted as type <type>text</type>.
 </para>
 <para>
-Here is a concatenation on unspecified types:
+Here is a concatenation of two values of unspecified types:
 <screen>
 SELECT 'abc' || 'def' AS "unspecified";
@@ -394,7 +399,7 @@ and finds that there are candidates accepting both string-category and
 bit-string-category inputs. Since string category is preferred when available,
 that category is selected, and then the
 preferred type for strings, <type>text</type>, is used as the specific
-type to resolve the unknown literals as.
+type to resolve the unknown-type literals as.
 </para>
 </example>
@@ -450,6 +455,36 @@ SELECT ~ CAST('20' AS int8) AS "negation";
 </para>
 </example>
+<example>
+<title>Array Inclusion Operator Type Resolution</title>
+<para>
+Here is another example of resolving an operator with one known and one
+unknown input:
+<screen>
+SELECT array[1,2] &lt;@ '{1,2,3}' as "is subset";
+is subset
+-----------
+t
+(1 row)
+</screen>
+The <productname>PostgreSQL</productname> operator catalog has several
+entries for the infix operator <literal>&lt;@</>, but the only two that
+could possibly accept an integer array on the left-hand side are
+array inclusion (<type>anyarray</> <literal>&lt;@</> <type>anyarray</>)
+and range inclusion (<type>anyelement</> <literal>&lt;@</> <type>anyrange</>).
+Since none of these polymorphic pseudo-types (see <xref
+linkend="datatype-pseudo">) are considered preferred, the parser cannot
+resolve the ambiguity on that basis. However, the last resolution rule tells
+it to assume that the unknown-type literal is of the same type as the other
+input, that is, integer array. Now only one of the two operators can match,
+so array inclusion is selected. (Had range inclusion been selected, we would
+have gotten an error, because the string does not have the right format to be
+a range literal.)
+</para>
+</example>
 </sect1>
 <sect1 id="typeconv-func">
@@ -594,13 +629,18 @@ the correct choice cannot be deduced without more clues.
 Now discard candidates that do not accept the selected type category.
 Furthermore, if any candidate accepts a preferred type in that category,
 discard candidates that accept non-preferred types for that argument.
+Keep all candidates if none survive these tests.
+If only one candidate remains, use it; else continue to the next step.
 </para>
 </step>
 <step performance="required">
 <para>
-If only one candidate remains, use it. If no candidate or more than one
-candidate remains,
-then fail.
+If there are both <type>unknown</type> and known-type arguments, and all
+the known-type arguments have the same type, assume that the
+<type>unknown</type> arguments are also of that type, and check which
+candidates can accept that type at the <type>unknown</type>-argument
+positions. If exactly one candidate passes this test, use it.
+Otherwise, fail.
 </para>
 </step>
 </substeps>
......
@@ -618,14 +618,16 @@ func_select_candidate(int nargs,
                       Oid *input_typeids,
                       FuncCandidateList candidates)
 {
-    FuncCandidateList current_candidate;
-    FuncCandidateList last_candidate;
+    FuncCandidateList current_candidate,
+                first_candidate,
+                last_candidate;
     Oid        *current_typeids;
     Oid         current_type;
     int         i;
     int         ncandidates;
     int         nbestMatch,
-                nmatch;
+                nmatch,
+                nunknowns;
     Oid         input_base_typeids[FUNC_MAX_ARGS];
     TYPCATEGORY slot_category[FUNC_MAX_ARGS],
                 current_category;
@@ -651,9 +653,22 @@ func_select_candidate(int nargs,
      * take a domain as an input datatype. Such a function will be selected
      * over the base-type function only if it is an exact match at all
      * argument positions, and so was already chosen by our caller.
+     *
+     * While we're at it, count the number of unknown-type arguments for use
+     * later.
      */
+    nunknowns = 0;
     for (i = 0; i < nargs; i++)
+    {
+        if (input_typeids[i] != UNKNOWNOID)
             input_base_typeids[i] = getBaseType(input_typeids[i]);
+        else
+        {
+            /* no need to call getBaseType on UNKNOWNOID */
+            input_base_typeids[i] = UNKNOWNOID;
+            nunknowns++;
+        }
+    }

     /*
      * Run through all candidates and keep those with the most matches on
@@ -749,14 +764,16 @@ func_select_candidate(int nargs,
         return candidates;

     /*
-     * Still too many candidates? Try assigning types for the unknown columns.
+     * Still too many candidates? Try assigning types for the unknown inputs.
      *
-     * NOTE: for a binary operator with one unknown and one non-unknown input,
-     * we already tried the heuristic of looking for a candidate with the
-     * known input type on both sides (see binary_oper_exact()). That's
-     * essentially a special case of the general algorithm we try next.
-     *
-     * We do this by examining each unknown argument position to see if we can
+     * If there are no unknown inputs, we have no more heuristics that apply,
+     * and must fail.
+     */
+    if (nunknowns == 0)
+        return NULL;            /* failed to select a best candidate */
+
+    /*
+     * The next step examines each unknown argument position to see if we can
      * determine a "type category" for it. If any candidate has an input
      * datatype of STRING category, use STRING category (this bias towards
      * STRING is appropriate since unknown-type literals look like strings).
@@ -770,9 +787,9 @@ func_select_candidate(int nargs,
      * Having completed this examination, remove candidates that accept the
      * wrong category at any unknown position. Also, if at least one
      * candidate accepted a preferred type at a position, remove candidates
-     * that accept non-preferred types.
-     *
-     * If we are down to one candidate at the end, we win.
+     * that accept non-preferred types. If just one candidate remains,
+     * return that one. However, if this rule turns out to reject all
+     * candidates, keep them all instead.
      */
     resolved_unknowns = false;
     for (i = 0; i < nargs; i++)
@@ -835,6 +852,7 @@ func_select_candidate(int nargs,
     {
         /* Strip non-matching candidates */
         ncandidates = 0;
+        first_candidate = candidates;
         last_candidate = NULL;
         for (current_candidate = candidates;
              current_candidate != NULL;
@@ -874,15 +892,78 @@ func_select_candidate(int nargs,
                 if (last_candidate)
                     last_candidate->next = current_candidate->next;
                 else
-                    candidates = current_candidate->next;
+                    first_candidate = current_candidate->next;
             }
         }
-        if (last_candidate) /* terminate rebuilt list */
+
+        /* if we found any matches, restrict our attention to those */
+        if (last_candidate)
+        {
+            candidates = first_candidate;
+            /* terminate rebuilt list */
             last_candidate->next = NULL;
         }

         if (ncandidates == 1)
             return candidates;
+    }
+
+    /*
+     * Last gasp: if there are both known- and unknown-type inputs, and all
+     * the known types are the same, assume the unknown inputs are also that
+     * type, and see if that gives us a unique match. If so, use that match.
+     *
+     * NOTE: for a binary operator with one unknown and one non-unknown input,
+     * we already tried this heuristic in binary_oper_exact(). However, that
+     * code only finds exact matches, whereas here we will handle matches that
+     * involve coercion, polymorphic type resolution, etc.
+     */
+    if (nunknowns < nargs)
+    {
+        Oid         known_type = UNKNOWNOID;
+
+        for (i = 0; i < nargs; i++)
+        {
+            if (input_base_typeids[i] == UNKNOWNOID)
+                continue;
+            if (known_type == UNKNOWNOID)   /* first known arg? */
+                known_type = input_base_typeids[i];
+            else if (known_type != input_base_typeids[i])
+            {
+                /* oops, not all match */
+                known_type = UNKNOWNOID;
+                break;
+            }
+        }
+
+        if (known_type != UNKNOWNOID)
+        {
+            /* okay, just one known type, apply the heuristic */
+            for (i = 0; i < nargs; i++)
+                input_base_typeids[i] = known_type;
+            ncandidates = 0;
+            last_candidate = NULL;
+            for (current_candidate = candidates;
+                 current_candidate != NULL;
+                 current_candidate = current_candidate->next)
+            {
+                current_typeids = current_candidate->args;
+                if (can_coerce_type(nargs, input_base_typeids, current_typeids,
+                                    COERCION_IMPLICIT))
+                {
+                    if (++ncandidates > 1)
+                        break;  /* not unique, give up */
+                    last_candidate = current_candidate;
+                }
+            }
+            if (ncandidates == 1)
+            {
+                /* successfully identified a unique match */
+                last_candidate->next = NULL;
+                return last_candidate;
+            }
+        }
+    }

     return NULL;                /* failed to select a best candidate */
 }   /* func_select_candidate() */
......
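The "last gasp" rule added above can be tried out in isolation with a small standalone sketch. This is not the PostgreSQL code: every name in it (`ToyType`, `ToySlot`, `ToyCandidate`, `toy_slot_accepts`, `toy_last_gasp`, `toy_inclusion_ops`) is hypothetical, and `toy_slot_accepts` is a deliberately crude stand-in for `can_coerce_type()` that knows nothing about casts or about consistency between polymorphic arguments. It only illustrates the control flow: find the single known input type, pretend every input has that type, and accept the result only if exactly one candidate matches.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ARGS 4          /* callers must pass nargs <= TOY_MAX_ARGS */

/* Hypothetical stand-ins for input types; these are not PostgreSQL OIDs. */
typedef enum { TY_UNKNOWN, TY_INT4, TY_INT4_ARRAY, TY_INT4_RANGE } ToyType;

/* Hypothetical declared-argument kinds, loosely modelled on pseudo-types. */
typedef enum { SLOT_ANYELEMENT, SLOT_ANYARRAY, SLOT_ANYRANGE } ToySlot;

typedef struct
{
    const char *name;
    ToySlot     args[TOY_MAX_ARGS];
} ToyCandidate;

/* The two <@ candidates discussed in the documentation example. */
const ToyCandidate toy_inclusion_ops[] = {
    {"anyarray <@ anyarray", {SLOT_ANYARRAY, SLOT_ANYARRAY}},
    {"anyelement <@ anyrange", {SLOT_ANYELEMENT, SLOT_ANYRANGE}},
};

/* Toy replacement for can_coerce_type(): no casts, no consistency checks. */
bool
toy_slot_accepts(ToySlot slot, ToyType type)
{
    switch (slot)
    {
        case SLOT_ANYARRAY:
            return type == TY_INT4_ARRAY;
        case SLOT_ANYRANGE:
            return type == TY_INT4_RANGE;
        case SLOT_ANYELEMENT:
            return type != TY_UNKNOWN;
    }
    return false;
}

/*
 * Return the index of the unique candidate that accepts the single known
 * input type at every argument position, or -1 if the heuristic does not
 * apply or the match is not unique.
 */
int
toy_last_gasp(const ToyType *inputs, int nargs,
              const ToyCandidate *cands, int ncands)
{
    ToyType     known_type = TY_UNKNOWN;
    int         nunknowns = 0;
    int         match = -1;
    int         i, c;

    for (i = 0; i < nargs; i++)
    {
        if (inputs[i] == TY_UNKNOWN)
            nunknowns++;
        else if (known_type == TY_UNKNOWN)
            known_type = inputs[i];     /* first known argument */
        else if (known_type != inputs[i])
            return -1;                  /* known types disagree */
    }

    /* need at least one unknown and at least one known input */
    if (nunknowns == 0 || known_type == TY_UNKNOWN)
        return -1;

    for (c = 0; c < ncands; c++)
    {
        bool        ok = true;

        /* treat every input, known or unknown, as known_type */
        for (i = 0; i < nargs && ok; i++)
            ok = toy_slot_accepts(cands[c].args[i], known_type);
        if (ok)
        {
            if (match >= 0)
                return -1;              /* not unique, give up */
            match = c;
        }
    }
    return match;
}
```

Under these toy rules, calling `toy_last_gasp` with inputs `{TY_INT4_ARRAY, TY_UNKNOWN}` and `toy_inclusion_ops` selects the array-inclusion entry, mirroring the `array[1,2] <@ '{1,2,3}'` example from the documentation hunk. With two unknown inputs, or with no unknown input at all, the function reports failure, as the patched code does.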