26

Microsoft SQL Server has what I consider a remarkably sensible function, try_cast() which returns a null if the cast is unsuccessful, rather than raising an error.

This makes it possible to then use a CASE expression or a coalesce to fall back on. For example:

SELECT coalesce(try_cast(data as int),0);

The question is, does PostgreSQL have something similar?

The question is asked to fill in some gaps in my knowledge, but there’s also the general principle that some prefer a less dramatic reaction to some user errors. Returning a null is more easily taken in one's stride in SQL than an error. For example SELECT * FROM data WHERE try_cast(value) IS NOT NULL;. In my experience, user errors are sometimes better handled if there is a plan B.

0

6 Answers 6

21

If casting from one specific type to one other specific type is enough, you can do this with a PL/pgSQL function:

create function try_cast_int(p_in text, p_default int default null)
   returns int
as
$$
begin
  begin
    return $1::int;
  exception 
    when others then
       return p_default;
  end;
end;
$$
language plpgsql;

Then

select try_cast_int('42'), try_cast_int('foo', -1), try_cast_int('bar')

Returns

try_cast_int | try_cast_int | try_cast_int
-------------+--------------+-------------
          42 |           -1 |             

If this is only for numbers, another approach would be to use a regular expression to check if the input string is a valid number. That would probably be faster than catching exceptions when you expect many incorrect values.

5
  • 1
    I certainly wouldn't do this unless I absolutely needed to. This is far off the path of "best practices". Commented Apr 14, 2018 at 7:28
  • My exception handling experience with PostgreSQL is pretty poor. How would you handle a second possibility that the string would successfully cast to a numeric, which could then be trucated, such as '23.67'? The condition invalid_character_value_for_cast might catch it, but I don’t know how to go on from there?
    – Manngo
    Commented Apr 14, 2018 at 7:33
  • OK, the following doesn’t require any exception handling: return floor(cast(string as numeric));. Thanks again.
    – Manngo
    Commented Apr 14, 2018 at 7:43
  • 1
    @EvanCarroll I’m not disagreeing with you, but why do you say that?
    – Manngo
    Commented Apr 14, 2018 at 7:44
  • @Manngo updated below with benchmarks Commented Apr 14, 2018 at 8:03
35

Rationale

It's hard to wrap something like SQL Server's TRY_CAST into a generic PostgreSQL function. Input and output can be any data type, but SQL is strictly typed and Postgres functions demand that parameter and return types are declared at creation time.

Postgres has the concept of polymorphic types, but function declarations accept at most one polymorphic type. The manual:

Polymorphic arguments and results are tied to each other and are resolved to a specific data type when a query calling a polymorphic function is parsed. Each position (either argument or return value) declared as anyelement is allowed to have any specific actual data type, but in any given call they must all be the same actual type.

CAST ( expression AS type ) would seem like an exception to this rule, taking any type and returning any (other) type. But cast() only looks like a function while it's an SQL syntax element under the hood. The manual:

[...] When one of the two standard cast syntaxes is used to do a run-time conversion, it will internally invoke a registered function to perform the conversion.

There is a separate function for each combination of input and output type. (You can create your own with CREATE CAST ...)

Function

My compromise is to use text as input since any type can be cast to text. The extra cast to text means extra cost (though not much). Polymorphism also adds a bit of overhead. But the moderately expensive parts are the dynamic SQL we need, the involved string concatenation and, most of all, exception handling.

That said, this little function can be used for any combination of types including array types. (But type modifiers like in varchar(20) are lost):

CREATE OR REPLACE FUNCTION try_cast(_in text, INOUT _out ANYELEMENT)
  LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE format('SELECT %L::%s', $1, pg_typeof(_out))
   INTO  _out;
EXCEPTION WHEN others THEN
   -- do nothing: _out already carries default
END
$func$;

The INOUT parameter _out serves two purposes:

  1. declares the polymorphic type
  2. also carries the default value for error cases

You wouldn't call it like in your example:

SELECT coalesce(try_cast(data as int),0);

.. where COALESCE also eliminates genuine NULL values from the source (!!), probably not as intended. But simply:

SELECT try_cast(data, 0);

.. which returns NULL on NULL input, or 0 on invalid input.

The short syntax works while data is a character type (like text or varchar) and because 0 is a numeric literal that is implicitly typed as integer. In other cases, you may have to be more explicit:

Example calls

Untyped string literals work out of the box:

SELECT try_cast('foo', NULL::varchar);
SELECT try_cast('2018-01-41', NULL::date);   -- returns NULL
SELECT try_cast('2018-01-41', CURRENT_DATE); -- returns current date

Typed values that have a registered implicit cast to text work out of the box, too:

SELECT try_cast(name 'foobar', 'foo'::varchar);
SELECT try_cast(my_varchar_column, NULL::numeric);

Comprehensive list of data types with registered implicit cast to text:

SELECT castsource::regtype
FROM   pg_cast
WHERE  casttarget = 'text'::regtype
AND    castcontext = 'i';

All other input types require an explicit cast to text:

SELECT try_cast((inet '192.168.100.128/20')::text, NULL::cidr);
SELECT try_cast(my_text_array_column::text, NULL::int[]));

We could easily make the function body work for any type, but function type resolution fails. Related:

1
  • 2
    Stellar approach. Commented Oct 3, 2021 at 18:27
1

Here's a generic try-cast, probably very slow.

CREATE OR REPLACE FUNCTION try_cast(p_in text, type regtype, out result text )
RETURNS text AS $$
  BEGIN
    EXECUTE FORMAT('SELECT %L::%s;', $1, $2)
      INTO result;
exception 
    WHEN others THEN result = null;
  END;
$$ LANGUAGE plpgsql;

 SELECT try_cast('2.2','int')::int as "2.2"
   ,try_cast('today','int')::int as "today"
   ,try_cast('222','int')::int as "222";

 SELECT try_cast('2.2','date')::date as "2.2"
   ,try_cast('today','date')::date as "today"
   ,try_cast('222','date')::date as "222";

 SELECT try_cast('2.2','float')::float as "2.2"
   ,try_cast('today','float')::float as "today"
   ,try_cast('222','float')::float as "222";

This won't accept types like varchar(20) (though we could add another parameter to accept "typemod" like 20.

this fuctions returns text bascause postgreqsl functions must have a fixed return type. so you may need an explicit cast outside of the function to coerce the result to the type you want.

6
  • dba.stackexchange.com/q/203975/2639 Commented Apr 14, 2018 at 19:35
  • meh, varchar(20) probably doesn't need try-cast: I can't think of any input that would fail. numeric(5,3) OTOH...
    – Jasen
    Commented Apr 14, 2018 at 20:48
  • 3
    The function is a useful test, but it's not a type cast. It takes text and returns text. Your examples actually only test string literals as input. Function type resolution would fail for typed values that have no implicit cast to text. format_type() is dead freight while no type modifiers are added. The cast to regclass is the useful part to enforce valid data types. Commented Apr 15, 2018 at 0:03
  • I don't think it's possible to have a single postgresql function that returns many different types,it returns text because all type can be cast to from text.
    – Jasen
    Commented Apr 17, 2018 at 2:01
  • 1
    @Json: But it is possible. I added an answer doing just that. Commented Apr 17, 2018 at 15:02
0

Slightly different generic version - this returns true or false, depending on whether or not the value could be cast. Works with user-defined domains too. This works on PostgreSQL 12.

CREATE OR REPLACE FUNCTION appcode.try_cast (p_in TEXT, p_type VARCHAR(128))
RETURNS BOOLEAN
AS
$$
       /*
       Accepts a text value, and a data type; returns true if the
       text value can be cast to the data type, false otherwise
       */
       DECLARE
               valid_cast BOOLEAN = TRUE;
               tmp_val TEXT;
               
       BEGIN 
               -- clean up the input
               p_type = TRIM(LOWER(p_type));
               
               EXECUTE FORMAT('SELECT CAST (%s AS %s)', p_in, p_type)
               INTO tmp_val;
               -- Check for same length, because casts can sometimes truncate 
               -- values; e.g. cast 'ABC' to VARCHAR(1) would result in 'A'
               IF LENGTH(tmp_val) <> LENGTH(p_in)
               THEN
                       valid_cast = FALSE;
               END IF;
               RETURN valid_cast;
                          
               EXCEPTION
                       WHEN OTHERS THEN
                               valid_cast = FALSE;
                               
END;
$$
LANGUAGE plpgsql;  

1
  • Worth noting that this not safe against SQL injection, only works for text input, and only returns true/false - unlike try_cast() which return the cast value if possible. Commented May 8, 2021 at 22:59
0

I have accepted an answer, but I thought I might add this for posterity. Here is the version I use in my work:

DROP FUNCTION IF EXISTS cast_int;
CREATE FUNCTION cast_int(string varchar, planB int default null) RETURNS INT AS $$
    BEGIN
        RETURN floor(cast(string as numeric));
    EXCEPTION
        WHEN OTHERS THEN return planB;
    END
$$ LANGUAGE plpgsql;

This is basically derived from a_horse_with_no_name’s answer above, but just a little more consolidated.

MSSQL’s TRY_CAST is more generic when it comes to data types, just like the CAST function itself. This version requires different functions for different data types, such as cast_date. You could expand it to check the data type yourself, but that might be going too far.

-2

With PostgreSQL, internally when a coercion fails it generates a fatal exception using ereport. That's unrecoverable in coercions.

Sample Data

Let's assume an error rate of 1/5

CREATE TABLE foo AS
  SELECT CASE WHEN x%5=0 THEN 'a' ELSE x::text END
  FROM generate_series(0,1e6) AS gs(x);

Black-listing bad data.

The normal solution to this problem is to accept liberally when you create types. That's pretty much how things work now. If you need to protect against some kind of bogus input, rather than catching in the case of failure just set it to null before it fails.

SELECT NULLIF(x, 'a')::int
FROM ( VALUES ('6'),('a'),('7') ) AS t(x);

If you want you can put that into an IMMUTABLE SQL statement too.

CREATE FUNCTION safer_but_not_totally_safe_coercion( i text )
RETURNS int AS $$
  SELECT NULLIF(i, 'a')::int;
$$ LANGUAGE sql
IMMUTABLE;

-- Inlined and fast.
SELECT safer_but_not_totally_safe_coercion(x, 'a')::int
FROM ( VALUES ('6'),('a'),('7') ) AS t(x);

You can also use regexes and whatever else you want as far as verification is concerned.

EXPLAIN ANALYZE SELECT safer_but_not_totally_safe_coercion(x) FROM foo;
                                                  QUERY PLAN                                                   
---------------------------------------------------------------------------------------------------------------
 Seq Scan on foo  (cost=0.00..21925.02 rows=1000001 width=5) (actual time=0.025..210.685 rows=1000001 loops=1)
 Planning time: 0.173 ms
 Execution time: 240.462 ms
(3 rows)

Try/Catch

This method is way slow.

EXPLAIN ANALYZE SELECT try_cast_int(x) FROM foo;
                                                   QUERY PLAN                                                    
-----------------------------------------------------------------------------------------------------------------
 Seq Scan on foo  (cost=0.00..264425.26 rows=1000001 width=5) (actual time=0.104..7069.281 rows=1000001 loops=1)
 Planning time: 0.056 ms
 Execution time: 7151.917 ms
(3 rows)

If you need it then by all means you need it, however it wouldn't be the first tool I grab for.

2
  • 2
    I don't see how this handles general cases. You only check if the value is 'a' if some code inputs 'x' this still fails. Sometimes you do get strange data e.g. from external sources. Your solution would require to write any possible wrong value into the nullif() function - hardly a practical way
    – user1822
    Commented Apr 14, 2018 at 7:59
  • @a_horse_with_no_name fully agree, just depends on what on what you need. If you need the sledge hammer, by all means go for it. I've just never needed it. I often have to do this kind of thing though. Commented Apr 14, 2018 at 8:04

Not the answer you're looking for? Browse other questions tagged or ask your own question.