Wednesday, April 8, 2026

pgcollection 2.0: Integer Keys, Range Deletes, and Oracle Parity

In my first post about pgcollection, I introduced the collection type to address the challenge of migrating Oracle associative arrays keyed by strings to PostgreSQL. For integer-keyed associative arrays, I noted that native PostgreSQL arrays work well enough for simple cases. That holds true until the keys are sparse.

Consider this Oracle pattern:

DECLARE
  TYPE cache_t IS TABLE OF VARCHAR2(100) INDEX BY PLS_INTEGER;
  cache  cache_t;
BEGIN
  cache(1)       := 'first';
  cache(1000000) := 'millionth';
  DBMS_OUTPUT.PUT_LINE('Count: ' || cache.COUNT);  -- 2
END;

The equivalent attempt with a PostgreSQL array produces a different result:

DO $$
DECLARE
  a text[];
BEGIN
  a[1]       := 'first';
  a[1000000] := 'millionth';
  RAISE NOTICE 'Length: %', array_length(a, 1);  -- 1000000
END $$;

PostgreSQL fills positions 2 through 999,999 with NULLs. You asked for two entries and got a million-element array. Worse, it is impossible to distinguish between a key that was explicitly set to NULL and one that was never set at all. pgcollection now avoids both problems.

icollection

icollection is a 64-bit integer-keyed associative array that stores only the keys you set. The same pattern from above works as expected:

DO $$
DECLARE
  cache  icollection('text');
BEGIN
  cache[1]       := 'first';
  cache[1000000] := 'millionth';

  RAISE NOTICE 'Count: %', count(cache);  -- 2
  RAISE NOTICE 'Value: %', cache[1000000];
  RAISE NOTICE 'Key 500 exists: %', exist(cache, 500);  -- false
END $$;

icollection supports the same full set of operations as collection — subscript access, forward and reverse iteration, sorting, set-returning functions, and JSON casting — with bigint keys instead of text. It maps directly to Oracle’s TABLE OF ... INDEX BY PLS_INTEGER, with the keys widened to 64-bit so overflow is not a concern during migration.

The exist() function resolves the NULL ambiguity problem directly. With a PostgreSQL array, a[2] returns NULL whether the key was set to NULL or never set. With icollection, exist() distinguishes the two cases:

DO $$
DECLARE
  ic  icollection('text');
BEGIN
  ic[1] := 'value';
  ic[2] := NULL;

  RAISE NOTICE 'Key 2 exists: %', exist(ic, 2);  -- true (explicitly NULL)
  RAISE NOTICE 'Key 3 exists: %', exist(ic, 3);  -- false (never set)
END $$;

Bulk delete operations

Oracle’s .DELETE method has three forms: delete a single key, delete a range of keys, and delete everything. pgcollection 1.x supported only single-key deletion. Version 2.0 adds the other two.

Delete all elements:

capitals := delete(capitals);

The previous workaround was assigning NULL, but that discarded the value type. A collection('date') set to NULL would infer its type from the next add() call rather than preserving date. Calling delete() with no arguments removes every entry while keeping the type intact.

Range delete:

-- String keys: uses the collection's collation for comparison
capitals := delete(capitals, 'B', 'D');

-- Integer keys: numeric comparison
arr := delete(arr, 5, 15);

Only keys that exist within the range are removed. If the low bound is greater than the high bound, the call is a no-op. This matches Oracle’s behavior.

Oracle migration guide

The other significant addition in 2.0 is a comprehensive Oracle migration guide covering common associative array patterns side by side: basic CRUD, FIRST/NEXT iteration, EXISTS checks, bulk loading from queries, FORALL-style bulk DML, record types as values, INOUT parameters, and exception handling.

The FORALL replacement is worth examining. In Oracle, bulk DML with associative arrays looks like this:

FORALL i IN 1..ids.COUNT
  INSERT INTO employees (id, name) VALUES (ids(i), names(i));

With pgcollection, the set-returning functions make this a single statement:

INSERT INTO employees (id, name)
  SELECT key, value FROM to_table(emps);

There is no loop and no per-row context switch between PL/pgSQL and the SQL engine.

The guide also documents the behavioral differences to be aware of during migration. The most consequential is that Oracle associative arrays with string keys are always sorted, while pgcollection preserves insertion order and requires an explicit sort() call. Code that iterates with FIRST/NEXT and expects alphabetical order needs that sort() added.

Upgrading

Existing collection usage is fully backward compatible. If you are running pgcollection 1.1.1, the upgrade is a single statement:

ALTER EXTENSION collection UPDATE TO '2.0.0';

For new installations, pgcollection requires PostgreSQL 14 or later:

git clone https://github.com/aws/pgcollection.git
cd pgcollection
make && make install
CREATE EXTENSION collection;

Summary

pgcollection 2.0 closes two remaining gaps for Oracle migrations: sparse integer-keyed arrays with icollection, and the full set of Oracle .DELETE semantics with the new delete() overloads. The migration guide brings the patterns together in one place. Performance benchmarks comparing icollection to native PostgreSQL arrays are something I plan to cover in a follow-up post.

The pgcollection extension is available on GitHub under the Apache-2.0 license. If you run into issues or patterns the migration guide does not cover, open an issue on GitHub.

pgcollection 2.0: Integer Keys, Range Deletes, and Oracle Parity

In my first post about pgcollection , I introduced the collection type to address the challenge of migrating Oracle associative arrays keye...