2
\$\begingroup\$

I'm using Postgres 14 and I've written a function (for use with Hasura GraphQL Computed Fields) that transforms a TEXT field, replacing all instances of spoiler-annotated text (e.g. ||spoiler||) with unicode full-block characters (ala classic redaction).

My self-imposed criteria for the function are:

  • Redacted sections must retain the length of the original text. (i.e. ||quite a long spoiler|| may not be reduced to any fewer characters).
  • Must not error if the field contains zero spoilers.
  • Must redact all instances of spoilers in the field.

My methodology was to find all instances of spoiler text via regex, then loop through them, replacing occurrences of the match in the original text. It feels heavy handed, but this was the best working solution I could come up with in a few hours.

-- Set up the test table
CREATE TABLE actor (
  id INT,
  backstory_raw TEXT
);
INSERT INTO actor (id, backstory_raw) VALUES (1, '||Lorem|| ipsum ||dolar sit amet||');
INSERT INTO actor (id, backstory_raw) VALUES (2, 'I have nothing to hide.');
-- End set up

-- This is the function to be critiqued:
CREATE OR REPLACE FUNCTION actor_backstory(actor_row "actor")
RETURNS TEXT
LANGUAGE plpgsql
AS $$
DECLARE
   matches TEXT[];
   match TEXT;
   res TEXT := actor_row.backstory_raw;
BEGIN
    SELECT array_agg(m) FROM regexp_matches(res, '\|\|.+?\|\|', 'g') m INTO matches;
    IF matches IS NOT NULL THEN
        FOREACH match IN ARRAY matches
        LOOP
            res := replace(res, match, repeat('█', char_length(match)));
        END LOOP;
    END IF;
    RETURN res;
END;
$$ STABLE;

-- Invoke the function to test it
SELECT actor_backstory(actor) FROM actor;

db<>fiddle here

\$\endgroup\$

0

Browse other questions tagged or ask your own question.