Skip to main content
PostgreSQL has a rich set of string operators and functions for text manipulation. These are essential for data cleaning, formatting, and search queries.

Sample data

CREATE TABLE users (
  id         SERIAL PRIMARY KEY,
  first_name VARCHAR(50),
  last_name  VARCHAR(50),
  email      VARCHAR(100),
  city       VARCHAR(50),
  bio        TEXT
);

INSERT INTO users (first_name, last_name, email, city, bio)
VALUES
  ('John',   'Doe',     'JOHN.DOE@gmail.com',     '  New York  ', 'Full-stack developer with 5 years exp.'),
  ('jane',   'SMITH',   'jane.smith@Yahoo.COM',    'san francisco', 'Data scientist and ML engineer.'),
  ('Bob',    'Wilson',  'bob_wilson@outlook.com',  'Chicago',      'DevOps engineer. AWS certified.'),
  ('Alice',  'O''Brien', 'alice.ob@company.io',    'Los Angeles',  NULL),
  ('tom',    'lee',     'TOM.LEE@HOTMAIL.COM',     ' seattle ',    'Product manager at startup.');

String operators

Concatenation (||)

The || operator joins strings together.
SELECT first_name || ' ' || last_name AS full_name
FROM users;
   full_name
--------------
 John Doe
 jane SMITH
 Bob Wilson
 Alice O'Brien
 tom lee
NULL in concatenation
If any operand is NULL, the || operator returns NULL. Use COALESCE to handle this: first_name || ' ' || COALESCE(last_name, '').

CONCAT and CONCAT_WS

Unlike ||, CONCAT treats NULL as an empty string.
-- CONCAT — joins arguments, ignores NULLs
SELECT CONCAT(first_name, ' ', last_name) AS full_name
FROM users;

-- CONCAT_WS — concat with separator (first arg is the separator)
SELECT CONCAT_WS(', ', city, 'USA') AS location
FROM users;
-- "New York, USA"

-- CONCAT_WS also skips NULLs
SELECT CONCAT_WS(' | ', first_name, bio) FROM users;
-- Alice's NULL bio is skipped: "Alice"

Case conversion

FunctionDescriptionExampleResult
UPPER(str)Convert to uppercaseUPPER('hello')HELLO
LOWER(str)Convert to lowercaseLOWER('HELLO')hello
INITCAP(str)Capitalize first letter of each wordINITCAP('hello world')Hello World
SELECT
  UPPER(first_name) AS upper_name,
  LOWER(email) AS normalized_email,
  INITCAP(first_name || ' ' || last_name) AS proper_name
FROM users;
 upper_name | normalized_email          | proper_name
------------+---------------------------+--------------
 JOHN       | john.doe@gmail.com        | John Doe
 JANE       | jane.smith@yahoo.com      | Jane Smith
 BOB        | bob_wilson@outlook.com    | Bob Wilson
 ALICE      | alice.ob@company.io       | Alice O'Brien
 TOM        | tom.lee@hotmail.com       | Tom Lee
💡 Interview Tip
Always normalize data with LOWER() or UPPER() before comparisons to avoid case-sensitivity bugs: WHERE LOWER(email) = LOWER('John@Example.com'). Better yet, use CITEXT (case-insensitive text type) or ILIKE for pattern matching.

Length and position

FunctionDescriptionExampleResult
LENGTH(str)Number of charactersLENGTH('Hello')5
CHAR_LENGTH(str)Same as LENGTHCHAR_LENGTH('Hello')5
OCTET_LENGTH(str)Number of bytesOCTET_LENGTH('cafe')4
POSITION(sub IN str)Position of first occurrence (1-based)POSITION('lo' IN 'Hello')4
STRPOS(str, sub)Same as POSITION (PG syntax)STRPOS('Hello', 'lo')4
SELECT
  email,
  LENGTH(email) AS email_length,
  POSITION('@' IN email) AS at_position
FROM users;
         email           | email_length | at_position
-------------------------+--------------+------------
 JOHN.DOE@gmail.com      |           18 |           9
 jane.smith@Yahoo.COM    |           20 |          11
 bob_wilson@outlook.com  |           22 |          11
 alice.ob@company.io     |           19 |           9
 TOM.LEE@HOTMAIL.COM     |           19 |           8

Substring extraction

FunctionDescription
SUBSTRING(str FROM start FOR length)Extract substring (1-based index)
SUBSTR(str, start, length)Shorthand (same result)
LEFT(str, n)First n characters
RIGHT(str, n)Last n characters
SPLIT_PART(str, delimiter, field)Split string and get nth part
SELECT
  email,
  SUBSTRING(email FROM 1 FOR POSITION('@' IN email) - 1) AS username,
  SPLIT_PART(email, '@', 1) AS username_v2,
  SPLIT_PART(email, '@', 2) AS domain
FROM users;
         email           | username   | username_v2 | domain
-------------------------+------------+-------------+-----------
 JOHN.DOE@gmail.com      | JOHN.DOE   | JOHN.DOE    | gmail.com
 jane.smith@Yahoo.COM    | jane.smith | jane.smith  | Yahoo.COM
 bob_wilson@outlook.com  | bob_wilson | bob_wilson  | outlook.com
SELECT
  LEFT('PostgreSQL', 8) AS left_result,    -- 'PostgreS'
  RIGHT('PostgreSQL', 3) AS right_result;  -- 'SQL'

Substring with regex

-- Extract first number from a string
SELECT SUBSTRING('Order #12345 confirmed' FROM '#([0-9]+)');
-- Result: '12345'

Trimming and padding

TRIM, LTRIM, RTRIM

SELECT
  city,
  TRIM(city) AS trimmed,               -- remove leading/trailing spaces
  LTRIM(city) AS left_trimmed,          -- remove leading spaces only
  RTRIM(city) AS right_trimmed,         -- remove trailing spaces only
  TRIM(BOTH ' ' FROM city) AS explicit  -- same as TRIM(city)
FROM users;
      city      | trimmed       | left_trimmed    | right_trimmed
----------------+---------------+-----------------+---------------
   New York     | New York      | New York        |    New York
 san francisco  | san francisco | san francisco   |  san francisco
  seattle       | seattle       | seattle         |  seattle
-- Trim specific characters (not just spaces)
SELECT TRIM(LEADING '0' FROM '000123');   -- '123'
SELECT TRIM(BOTH '.' FROM '...hello...');  -- 'hello'

LPAD, RPAD

Pad a string to a given length.
SELECT
  LPAD('42', 5, '0') AS left_padded,    -- '00042'
  RPAD('hi', 10, '.') AS right_padded;  -- 'hi........'

-- Useful for formatting order numbers
SELECT LPAD(id::TEXT, 6, '0') AS order_number
FROM orders;
-- '000001', '000042', '001234'

Replace and translate

REPLACE

Replace all occurrences of a substring.
SELECT REPLACE('Hello World', 'World', 'PostgreSQL');
-- 'Hello PostgreSQL'

-- Remove all spaces
SELECT REPLACE('H e l l o', ' ', '');
-- 'Hello'

-- Normalize data
SELECT REPLACE(LOWER(email), ' ', '') AS clean_email
FROM users;

TRANSLATE

Replace characters one-to-one (character mapping).
-- Replace a→@, e→3, o→0
SELECT TRANSLATE('hello world', 'aeo', '@30');
-- 'h3ll0 w0rld'

-- Remove all vowels
SELECT TRANSLATE('PostgreSQL', 'aeiouAEIOU', '');
-- 'PstgrSQL'

-- Clean phone numbers (remove dashes, parentheses, spaces)
SELECT TRANSLATE('(555) 123-4567', '()- ', '');
-- '5551234567'
REPLACE vs TRANSLATE
REPLACE swaps substrings — it looks for multi-character matches. TRANSLATE swaps individual characters — each character in the second argument is mapped to the corresponding character in the third argument. Use REPLACE for word substitution, TRANSLATE for character-level transformations.

Pattern matching

LIKE / ILIKE

-- LIKE is case-sensitive
SELECT * FROM users WHERE email LIKE '%gmail%';

-- ILIKE is case-insensitive (PostgreSQL-specific)
SELECT * FROM users WHERE email ILIKE '%gmail%';

-- Names starting with 'j' (case-insensitive)
SELECT * FROM users WHERE first_name ILIKE 'j%';
PatternMatches
'J%'Starts with J
'%son'Ends with son
'%an%'Contains an
'_o%'Second character is o
'___'Exactly 3 characters

SIMILAR TO (SQL-standard regex)

-- Email from gmail or yahoo
SELECT * FROM users
WHERE email SIMILAR TO '%(gmail|yahoo)%';

~ (POSIX regex)

-- Case-sensitive regex match
SELECT * FROM users WHERE email ~ '^[a-z]+\.[a-z]+@';

-- Case-insensitive regex match
SELECT * FROM users WHERE email ~* '^[a-z]+\.[a-z]+@';

-- NOT match
SELECT * FROM users WHERE email !~ 'hotmail';
OperatorDescription
~Case-sensitive regex match
~*Case-insensitive regex match
!~Case-sensitive NOT match
!~*Case-insensitive NOT match

Regex functions

REGEXP_MATCH / REGEXP_MATCHES

-- Extract domain from email (returns array)
SELECT
  email,
  (REGEXP_MATCH(email, '@(.+)$'))[1] AS domain
FROM users;
         email           | domain
-------------------------+-----------
 JOHN.DOE@gmail.com      | gmail.com
 jane.smith@Yahoo.COM    | Yahoo.COM
 bob_wilson@outlook.com  | outlook.com

REGEXP_REPLACE

-- Remove all non-alphanumeric characters
SELECT REGEXP_REPLACE('Hello, World! 123', '[^a-zA-Z0-9]', '', 'g');
-- 'HelloWorld123'

-- Mask email addresses
SELECT REGEXP_REPLACE(email, '(.).*@', '\1***@') FROM users;
-- 'J***@gmail.com', 'j***@Yahoo.COM'

REGEXP_SPLIT_TO_TABLE / REGEXP_SPLIT_TO_ARRAY

-- Split bio into words (returns rows)
SELECT REGEXP_SPLIT_TO_TABLE('Full-stack developer with 5 years', '\s+');
-- Full-stack
-- developer
-- with
-- 5
-- years

-- Split into array
SELECT REGEXP_SPLIT_TO_ARRAY('Full-stack developer', '\s+');
-- {Full-stack,developer}

Repeat and reverse

SELECT REPEAT('ab', 3);   -- 'ababab'
SELECT REPEAT('-', 20);   -- '--------------------'

SELECT REVERSE('Hello');   -- 'olleH'

String aggregation

STRING_AGG

Concatenate values from multiple rows into one string.
-- List all employees per department
SELECT
  department,
  STRING_AGG(first_name, ', ' ORDER BY first_name) AS employees
FROM employees
GROUP BY department;
 department  | employees
-------------+-------------------------------
 Design      | Alice, Mike
 Engineering | Jane, John, Sara, Tom
 Marketing   | Bob, Eve
-- With DISTINCT to remove duplicates
SELECT STRING_AGG(DISTINCT category, ' | ' ORDER BY category) FROM products;
-- 'Electronics | Furniture'

Encoding and formatting

-- ASCII value of a character
SELECT ASCII('A');   -- 65

-- Character from ASCII code
SELECT CHR(65);      -- 'A'

-- Convert to hex
SELECT TO_HEX(255); -- 'ff'

-- MD5 hash
SELECT MD5('password');
-- '5f4dcc3b5aa765d61d8327deb882cf99'

-- Format numbers
SELECT TO_CHAR(1234567.89, '9,999,999.99');
-- ' 1,234,567.89'

-- Quote string for use in SQL
SELECT QUOTE_LITERAL('O''Brien');
-- '''O''Brien'''

Quick reference table

FunctionPurposeExampleResult
||Concatenate'Hi' || ' ' || 'there'Hi there
CONCAT(a, b)Concat (NULL-safe)CONCAT('Hi', NULL, '!')Hi!
CONCAT_WS(sep, ...)Concat with separatorCONCAT_WS('-', 'a', 'b')a-b
UPPER(s)UppercaseUPPER('hello')HELLO
LOWER(s)LowercaseLOWER('HELLO')hello
INITCAP(s)Title caseINITCAP('hello world')Hello World
LENGTH(s)String lengthLENGTH('hello')5
TRIM(s)Remove whitespaceTRIM(' hi ')hi
LPAD(s, n, c)Left padLPAD('5', 3, '0')005
REPLACE(s, from, to)Replace substringREPLACE('abc', 'b', 'x')axc
SPLIT_PART(s, d, n)Split and get partSPLIT_PART('a-b-c', '-', 2)b
SUBSTRING(s FROM n FOR l)Extract partSUBSTRING('hello' FROM 2 FOR 3)ell
LEFT(s, n)First n charsLEFT('hello', 3)hel
RIGHT(s, n)Last n charsRIGHT('hello', 3)llo
STRING_AGG(s, d)Aggregate concat

Common interview questions

SELECT LOWER(email), COUNT(*)
FROM users
GROUP BY LOWER(email)
HAVING COUNT(*) > 1;
Normalizing with LOWER() catches duplicates like John@Email.com and john@email.com.
Multiple approaches:
-- Using SPLIT_PART
SELECT SPLIT_PART(email, '@', 2) AS domain FROM users;

-- Using SUBSTRING + POSITION
SELECT SUBSTRING(email FROM POSITION('@' IN email) + 1) AS domain FROM users;

-- Using REGEXP_MATCH
SELECT (REGEXP_MATCH(email, '@(.+)$'))[1] AS domain FROM users;
SPLIT_PART is the simplest and most readable.
LIKE uses simple wildcards (% for any chars, _ for one char) and is easier to read. ~ uses full POSIX regular expressions (character classes, quantifiers, groups) and is more powerful. Use LIKE for simple patterns and ~ when you need complex matching logic. ILIKE and ~* are their case-insensitive variants.
SELECT word
FROM words
WHERE LOWER(word) = LOWER(REVERSE(word));
This finds palindromes like ‘madam’, ‘racecar’, and ‘level’.