Commit 1a8b9fb5 authored by Tom Lane

Extend the unknowns-are-same-as-known-inputs type resolution heuristic.

For a very long time, one of the parser's heuristics for resolving
ambiguous operator calls has been to assume that unknown-type literals are
of the same type as the other input (if it's known).  However, this was
only used in the first step of quickly checking for an exact-types match,
and thus did not help in resolving matches that require coercion, such as
matches to polymorphic operators.  As we add more polymorphic operators,
this becomes more of a problem.  This patch adds another use of the same
heuristic as a last-ditch check before failing to resolve an ambiguous
operator or function call.  In particular this will let us define the range
inclusion operator in a less limited way (to come in a follow-on patch).
parent bf4f96b5
@@ -304,13 +304,18 @@ without more clues. Now discard
candidates that do not accept the selected type category. Furthermore,
if any candidate accepts a preferred type in that category,
discard candidates that accept non-preferred types for that argument.
Keep all candidates if none survive these tests.
If only one candidate remains, use it; else continue to the next step.
</para>
</step>
<step performance="required">
<para>
If only one candidate remains, use it. If no candidate or more than one
candidate remains,
then fail.
If there are both <type>unknown</type> and known-type arguments, and all
the known-type arguments have the same type, assume that the
<type>unknown</type> arguments are also of that type, and check which
candidates can accept that type at the <type>unknown</type>-argument
positions. If exactly one candidate passes this test, use it.
Otherwise, fail.
</para>
</step>
</substeps>
@@ -376,7 +381,7 @@ be interpreted as type <type>text</type>.
</para>
<para>
Here is a concatenation on unspecified types:
Here is a concatenation of two values of unspecified types:
<screen>
SELECT 'abc' || 'def' AS "unspecified";
@@ -394,7 +399,7 @@ and finds that there are candidates accepting both string-category and
bit-string-category inputs. Since string category is preferred when available,
that category is selected, and then the
preferred type for strings, <type>text</type>, is used as the specific
type to resolve the unknown literals as.
type to resolve the unknown-type literals as.
</para>
</example>
@@ -450,6 +455,36 @@ SELECT ~ CAST('20' AS int8) AS "negation";
</para>
</example>
<example>
<title>Array Inclusion Operator Type Resolution</title>
<para>
Here is another example of resolving an operator with one known and one
unknown input:
<screen>
SELECT array[1,2] &lt;@ '{1,2,3}' as "is subset";
is subset
-----------
t
(1 row)
</screen>
The <productname>PostgreSQL</productname> operator catalog has several
entries for the infix operator <literal>&lt;@</>, but the only two that
could possibly accept an integer array on the left-hand side are
array inclusion (<type>anyarray</> <literal>&lt;@</> <type>anyarray</>)
and range inclusion (<type>anyelement</> <literal>&lt;@</> <type>anyrange</>).
Since none of these polymorphic pseudo-types (see <xref
linkend="datatype-pseudo">) are considered preferred, the parser cannot
resolve the ambiguity on that basis. However, the last resolution rule tells
it to assume that the unknown-type literal is of the same type as the other
input, that is, integer array. Now only one of the two operators can match,
so array inclusion is selected. (Had range inclusion been selected, we would
have gotten an error, because the string does not have the right format to be
a range literal.)
</para>
</example>
</sect1>
<sect1 id="typeconv-func">
@@ -594,13 +629,18 @@ the correct choice cannot be deduced without more clues.
Now discard candidates that do not accept the selected type category.
Furthermore, if any candidate accepts a preferred type in that category,
discard candidates that accept non-preferred types for that argument.
Keep all candidates if none survive these tests.
If only one candidate remains, use it; else continue to the next step.
</para>
</step>
<step performance="required">
<para>
If only one candidate remains, use it. If no candidate or more than one
candidate remains,
then fail.
If there are both <type>unknown</type> and known-type arguments, and all
the known-type arguments have the same type, assume that the
<type>unknown</type> arguments are also of that type, and check which
candidates can accept that type at the <type>unknown</type>-argument
positions. If exactly one candidate passes this test, use it.
Otherwise, fail.
</para>
</step>
</substeps>
@@ -618,14 +618,16 @@ func_select_candidate(int nargs,
Oid *input_typeids,
FuncCandidateList candidates)
{
FuncCandidateList current_candidate;
FuncCandidateList last_candidate;
FuncCandidateList current_candidate,
first_candidate,
last_candidate;
Oid *current_typeids;
Oid current_type;
int i;
int ncandidates;
int nbestMatch,
nmatch;
nmatch,
nunknowns;
Oid input_base_typeids[FUNC_MAX_ARGS];
TYPCATEGORY slot_category[FUNC_MAX_ARGS],
current_category;
@@ -651,9 +653,22 @@ func_select_candidate(int nargs,
* take a domain as an input datatype. Such a function will be selected
* over the base-type function only if it is an exact match at all
* argument positions, and so was already chosen by our caller.
*
* While we're at it, count the number of unknown-type arguments for use
* later.
*/
nunknowns = 0;
for (i = 0; i < nargs; i++)
{
if (input_typeids[i] != UNKNOWNOID)
input_base_typeids[i] = getBaseType(input_typeids[i]);
else
{
/* no need to call getBaseType on UNKNOWNOID */
input_base_typeids[i] = UNKNOWNOID;
nunknowns++;
}
}
/*
* Run through all candidates and keep those with the most matches on
@@ -749,14 +764,16 @@ func_select_candidate(int nargs,
return candidates;
/*
* Still too many candidates? Try assigning types for the unknown columns.
* Still too many candidates? Try assigning types for the unknown inputs.
*
* NOTE: for a binary operator with one unknown and one non-unknown input,
* we already tried the heuristic of looking for a candidate with the
* known input type on both sides (see binary_oper_exact()). That's
* essentially a special case of the general algorithm we try next.
*
* We do this by examining each unknown argument position to see if we can
* If there are no unknown inputs, we have no more heuristics that apply,
* and must fail.
*/
if (nunknowns == 0)
return NULL; /* failed to select a best candidate */
/*
* The next step examines each unknown argument position to see if we can
* determine a "type category" for it. If any candidate has an input
* datatype of STRING category, use STRING category (this bias towards
* STRING is appropriate since unknown-type literals look like strings).
@@ -770,9 +787,9 @@ func_select_candidate(int nargs,
* Having completed this examination, remove candidates that accept the
* wrong category at any unknown position. Also, if at least one
* candidate accepted a preferred type at a position, remove candidates
* that accept non-preferred types.
*
* If we are down to one candidate at the end, we win.
* that accept non-preferred types. If just one candidate remains,
* return that one. However, if this rule turns out to reject all
* candidates, keep them all instead.
*/
resolved_unknowns = false;
for (i = 0; i < nargs; i++)
@@ -835,6 +852,7 @@ func_select_candidate(int nargs,
{
/* Strip non-matching candidates */
ncandidates = 0;
first_candidate = candidates;
last_candidate = NULL;
for (current_candidate = candidates;
current_candidate != NULL;
@@ -874,15 +892,78 @@ func_select_candidate(int nargs,
if (last_candidate)
last_candidate->next = current_candidate->next;
else
candidates = current_candidate->next;
first_candidate = current_candidate->next;
}
}
if (last_candidate) /* terminate rebuilt list */
/* if we found any matches, restrict our attention to those */
if (last_candidate)
{
candidates = first_candidate;
/* terminate rebuilt list */
last_candidate->next = NULL;
}
if (ncandidates == 1)
return candidates;
}
/*
* Last gasp: if there are both known- and unknown-type inputs, and all
* the known types are the same, assume the unknown inputs are also that
* type, and see if that gives us a unique match. If so, use that match.
*
* NOTE: for a binary operator with one unknown and one non-unknown input,
* we already tried this heuristic in binary_oper_exact(). However, that
* code only finds exact matches, whereas here we will handle matches that
* involve coercion, polymorphic type resolution, etc.
*/
if (nunknowns < nargs)
{
Oid known_type = UNKNOWNOID;
for (i = 0; i < nargs; i++)
{
if (input_base_typeids[i] == UNKNOWNOID)
continue;
if (known_type == UNKNOWNOID) /* first known arg? */
known_type = input_base_typeids[i];
else if (known_type != input_base_typeids[i])
{
/* oops, not all match */
known_type = UNKNOWNOID;
break;
}
}
if (known_type != UNKNOWNOID)
{
/* okay, just one known type, apply the heuristic */
for (i = 0; i < nargs; i++)
input_base_typeids[i] = known_type;
ncandidates = 0;
last_candidate = NULL;
for (current_candidate = candidates;
current_candidate != NULL;
current_candidate = current_candidate->next)
{
current_typeids = current_candidate->args;
if (can_coerce_type(nargs, input_base_typeids, current_typeids,
COERCION_IMPLICIT))
{
if (++ncandidates > 1)
break; /* not unique, give up */
last_candidate = current_candidate;
}
}
if (ncandidates == 1)
{
/* successfully identified a unique match */
last_candidate->next = NULL;
return last_candidate;
}
}
}
return NULL; /* failed to select a best candidate */
} /* func_select_candidate() */
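
The "last gasp" rule added above can be tried in isolation with a small standalone model. The following is only a sketch under stated assumptions, not the parser's code: ToyType, ToyParam, ToyCandidate, toy_accepts() and last_gasp() are hypothetical names invented for illustration, and toy_accepts() stands in for can_coerce_type() by allowing only exact and simplified polymorphic matches (no implicit casts, no cross-argument polymorphic consistency checks).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ARGS 4

/* Toy type identifiers (hypothetical; these are NOT PostgreSQL OIDs). */
typedef enum { T_UNKNOWN = 0, T_INT4, T_INT4_ARRAY, T_INT4_RANGE, T_TEXT } ToyType;

/* Toy declared-parameter kinds (hypothetical). */
typedef enum { P_INT4, P_TEXT, P_ANYARRAY, P_ANYELEMENT, P_ANYRANGE } ToyParam;

typedef struct
{
    const char *name;
    ToyParam    args[MAX_ARGS];
} ToyCandidate;

/*
 * Simplified stand-in for can_coerce_type(): a parameter accepts a type
 * only on an exact match or a (simplified) polymorphic match.
 */
bool
toy_accepts(ToyParam param, ToyType type)
{
    switch (param)
    {
        case P_INT4:       return type == T_INT4;
        case P_TEXT:       return type == T_TEXT;
        case P_ANYARRAY:   return type == T_INT4_ARRAY;
        case P_ANYRANGE:   return type == T_INT4_RANGE;
        case P_ANYELEMENT: return type != T_UNKNOWN;
    }
    return false;
}

/*
 * If there are both known and unknown inputs and all known inputs share
 * one type, assume the unknown inputs have that type too. Return the
 * candidate that accepts the result if it is unique, else NULL.
 */
const ToyCandidate *
last_gasp(const ToyCandidate *cands, size_t ncands,
          const ToyType *inputs, int nargs)
{
    ToyType     known = T_UNKNOWN;
    int         nunknowns = 0;
    const ToyCandidate *match = NULL;

    assert(nargs > 0 && nargs <= MAX_ARGS);

    for (int i = 0; i < nargs; i++)
    {
        if (inputs[i] == T_UNKNOWN)
            nunknowns++;
        else if (known == T_UNKNOWN)
            known = inputs[i];      /* first known argument */
        else if (known != inputs[i])
            return NULL;            /* known types disagree */
    }

    /* The heuristic needs at least one unknown and one known input. */
    if (nunknowns == 0 || nunknowns == nargs)
        return NULL;

    for (size_t c = 0; c < ncands; c++)
    {
        bool        ok = true;

        /* Every position is now treated as having the known type. */
        for (int i = 0; i < nargs && ok; i++)
            ok = toy_accepts(cands[c].args[i], known);

        if (ok)
        {
            if (match != NULL)
                return NULL;        /* not unique, give up */
            match = &cands[c];
        }
    }
    return match;
}
```

With two toy candidates modeled on the documentation example, (anyarray, anyarray) and (anyelement, anyrange), and inputs {T_INT4_ARRAY, T_UNKNOWN}, the sketch selects the array candidate, because the range candidate cannot accept an integer array in its second position. Unlike the patch, the sketch returns a pointer into the caller's array instead of truncating a linked candidate list.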