How to Upsert Data in Relational Databases (2023)

Relational databases play a crucial role in storing, managing, and analyzing data. One common challenge developers face is inserting data where a matching record might already exist. The “upsert” operation, short for “update or insert,” provides a solution to this problem. While not all database management systems (DBMSs) have a built-in “upsert” command, the concept can still be achieved using their built-in features. In this blog, we’ll explore the upsert operation, how it works, and how to perform upserts in different databases.

What is a Database Upsert?

The upsert operation is a database operation that combines the insert and update operations into a single, atomic operation. It attempts to insert a new record into a table. If the record already exists, it updates the existing record. This operation is particularly useful when you need incrementally update a dataset that contains both new records and updates to existing records.

Upsert in MySQL and MariaDB

While MySQL and MariaDB don’t have a direct “upsert” command, you can achieve the upsert functionality using the INSERT...ON DUPLICATE KEY UPDATE statement. Here’s how it works:

Step 1 – Insert Data:
Given a table named users with columns id, name, and email. Initially, you attempt to insert data into a table.

INSERT INTO users (id, name, email) VALUES (1, 'John Doe', 'john@example.com');

1	INSERT INTO users (id, name, email) VALUES (1, 'John Doe', 'john@example.com');

Step 2 – Update on Duplicate:
If the primary key (in this case, ‘id’) constraint is violated due to duplicate data, the ON DUPLICATE KEY UPDATE clause triggers an update instead. Here is the complete SQL:

INSERT INTO users (id, name, email) VALUES (1, 'Jane Doe', 'jane@example.com')
ON DUPLICATE KEY UPDATE name = VALUES(name), email = VALUES(email);

1 2	INSERT INTO users (id, name, email) VALUES (1, 'Jane Doe', 'jane@example.com') ON DUPLICATE KEY UPDATE name = VALUES(name), email = VALUES(email);

Upsert in PostgreSQL

PostgreSQL provides the INSERT ... ON CONFLICT statement for upsert operations. Here’s how you can achieve upsert functionality in PostgreSQL.

Step 1 – Insert Data:
Attempt to insert data into a table.

INSERT INTO customers (id, name, email) VALUES (1, 'Alice', 'alice@example.com');

1	INSERT INTO customers (id, name, email) VALUES (1, 'Alice', 'alice@example.com');

Step 2 – Handle Conflict:
If a conflict arises due to unique constraints, the ON CONFLICT clause specifies how to handle it (e.g., update the existing record).

INSERT INTO customers (id, name, email) VALUES (1, 'Alice', 'alice@example.com')
ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name, email = EXCLUDED.email;

1 2	INSERT INTO customers (id, name, email) VALUES (1, 'Alice', 'alice@example.com') ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name, email = EXCLUDED.email;

Upsert in Oracle

Oracle, a widely used relational DBMS, also doesn’t have a direct “upsert” command. However, the functionality can be achieved using the MERGE statement. Let’s assume you have an employees table with columns emp_id, emp_name, and emp_salary. You want to update an employee’s salary if they exist or insert a new employee otherwise.

MERGE INTO employees e
USING dual d
ON (e.emp_id = 123)
WHEN MATCHED THEN
    UPDATE SET emp_salary = 60000
WHEN NOT MATCHED THEN
    INSERT (emp_id, emp_salary) VALUES (123, 60000);

MERGE INTO employees e

USING dual d

ON (e.emp_id = 123)

WHEN MATCHED THEN

UPDATE SET emp_salary = 60000

WHEN NOT MATCHED THEN

INSERT (emp_id, emp_salary) VALUES (123, 60000);

Upsert in Microsoft SQL Server (MSSQL)

In Microsoft SQL Server, the upsert operation can also be done using the MERGE statement, similar to Oracle. Consider a table named employees with columns emp_id, full_name, and email. You can perform an upsert operation to update the order total for an existing customer or insert a new customer.

MERGE employees AS target
USING (VALUES (1, 'Alex', 'alex@example.com')) AS source (emp_id, full_name, email)
ON target.emp_id = source.emp_id
WHEN MATCHED THEN UPDATE SET full_name = source.full_name, email = source.email
WHEN NOT MATCHED THEN INSERT (emp_id, full_name, email) VALUES (source.emp_id, source.full_name, source.email);

MERGE employees AS target

USING (VALUES (1, 'Alex', 'alex@example.com')) AS source (emp_id, full_name, email)

ON target.emp_id = source.emp_id

WHEN MATCHED THEN UPDATE SET full_name = source.full_name, email = source.email

WHEN NOT MATCHED THEN INSERT (emp_id, full_name, email) VALUES (source.emp_id, source.full_name, source.email);

Upsert in Sybase

Sybase, another popular relational DBMS, supports the upsert operation through the MERGE statement as well. Here’s how you can achieve it:

MERGE INTO clients USING (SELECT 1 AS client_id, 'Eva' AS name, 'eva@example.com' AS email) AS source
ON clients.client_id = source.client_id
WHEN MATCHED THEN UPDATE SET name = source.name, email = source.email
WHEN NOT MATCHED THEN INSERT (client_id, name, email) VALUES (source.client_id, source.name, source.email);

MERGE INTO clients USING (SELECT 1 AS client_id, 'Eva' AS name, 'eva@example.com' AS email) AS source

ON clients.client_id = source.client_id

WHEN MATCHED THEN UPDATE SET name = source.name, email = source.email

WHEN NOT MATCHED THEN INSERT (client_id, name, email) VALUES (source.client_id, source.name, source.email);

Upsert in DB2

DB2, an enterprise-level DBMS, provides the upsert operation through the MERGE statement, similar to the previously mentioned systems. Here’s how it’s done:

MERGE INTO employees AS target
USING (VALUES (1, 'Sarah', 'sarah@example.com')) AS source (emp_id, full_name, email)
ON target.emp_id = source.emp_id
WHEN MATCHED THEN UPDATE SET full_name = source.full_name, email = source.email
WHEN NOT MATCHED THEN INSERT (emp_id, full_name, email) VALUES (source.emp_id, source.full_name, source.email);

MERGE INTO employees AS target

USING (VALUES (1, 'Sarah', 'sarah@example.com')) AS source (emp_id, full_name, email)

ON target.emp_id = source.emp_id

WHEN MATCHED THEN UPDATE SET full_name = source.full_name, email = source.email

WHEN NOT MATCHED THEN INSERT (emp_id, full_name, email) VALUES (source.emp_id, source.full_name, source.email);

Upsert in H2

H2, a lightweight Java-based DBMS, also supports the upsert operation using the MERGE statement. The following simple query updates existing rows, and insert rows that don’t exist. If no key column is specified, the primary key columns are used to find the row. If more than one row per new row is affected, an exception is thrown.

MERGE INTO customers KEY(id)
VALUES (1, 'Oliver', 'oliver@example.com');

1 2	MERGE INTO customers KEY(id) VALUES (1, 'Oliver', 'oliver@example.com');

Conclusion

Upsert operations provide an efficient way to manage data insertion and updates in databases. While DBMSs don’t generally offer a direct “upsert” command, they do provide their own syntax to accomplish this function. By leveraging the INSERT...ON DUPLICATE KEY UPDATE statement in MySQL/MariaDB and the INSERT...ON CONFLICT statement in PostgreSQL and MERGE statements in other databases, developers can incrementally load batches of data while ensuring data integrity, avoid duplicates, and maintaining the latest data. Understanding and mastering these upsert techniques can greatly enhance your database management skills and application performance.

How to Upsert Data in Relational Databases (2023)

What is a Database Upsert?

Upsert in MySQL and MariaDB

Upsert in PostgreSQL

Upsert in Oracle

Upsert in Microsoft SQL Server (MSSQL)

Upsert in Sybase

Upsert in DB2

Upsert in H2

Conclusion

About The DataPipeline Team

Leave a Reply Cancel reply

Data Pipeline

Docs

Company

Tools