Relational databases play a crucial role in storing, managing, and analyzing data. One common challenge developers face is inserting data where a matching record might already exist. The “upsert” operation, short for “update or insert,” provides a solution to this problem. While not all database management systems (DBMSs) have a built-in “upsert” command, the concept can still be achieved using their built-in features. In this blog, we’ll explore the upsert operation, how it works, and how to perform upserts in different databases.
What is a Database Upsert?
The upsert operation is a database operation that combines the insert and update operations into a single, atomic operation. It attempts to insert a new record into a table. If the record already exists, it updates the existing record. This operation is particularly useful when you need incrementally update a dataset that contains both new records and updates to existing records.
Upsert in MySQL and MariaDB
While MySQL and MariaDB don’t have a direct “upsert” command, you can achieve the upsert functionality using the INSERT...ON DUPLICATE KEY UPDATE
statement. Here’s how it works:
Step 1 – Insert Data:
Given a table named users
with columns id
, name
, and email
. Initially, you attempt to insert data into a table.
1 |
INSERT INTO users (id, name, email) VALUES (1, 'John Doe', 'john@example.com'); |
Step 2 – Update on Duplicate:
If the primary key (in this case, ‘id’) constraint is violated due to duplicate data, the ON DUPLICATE KEY UPDATE
clause triggers an update instead. Here is the complete SQL:
1 2 |
INSERT INTO users (id, name, email) VALUES (1, 'Jane Doe', 'jane@example.com') ON DUPLICATE KEY UPDATE name = VALUES(name), email = VALUES(email); |
Upsert in PostgreSQL
PostgreSQL provides the INSERT ... ON CONFLICT
statement for upsert operations. Here’s how you can achieve upsert functionality in PostgreSQL.
Step 1 – Insert Data:
Attempt to insert data into a table.
1 |
INSERT INTO customers (id, name, email) VALUES (1, 'Alice', 'alice@example.com'); |
Step 2 – Handle Conflict:
If a conflict arises due to unique constraints, the ON CONFLICT
clause specifies how to handle it (e.g., update the existing record).
1 2 |
INSERT INTO customers (id, name, email) VALUES (1, 'Alice', 'alice@example.com') ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name, email = EXCLUDED.email; |
Upsert in Oracle
Oracle, a widely used relational DBMS, also doesn’t have a direct “upsert” command. However, the functionality can be achieved using the MERGE
statement. Let’s assume you have an employees
table with columns emp_id
, emp_name
, and emp_salary
. You want to update an employee’s salary if they exist or insert a new employee otherwise.
1 2 3 4 5 6 7 |
MERGE INTO employees e USING dual d ON (e.emp_id = 123) WHEN MATCHED THEN UPDATE SET emp_salary = 60000 WHEN NOT MATCHED THEN INSERT (emp_id, emp_salary) VALUES (123, 60000); |
Upsert in Microsoft SQL Server (MSSQL)
In Microsoft SQL Server, the upsert operation can also be done using the MERGE
statement, similar to Oracle. Consider a table named employees
with columns emp_id
, full_name
, and email
. You can perform an upsert operation to update the order total for an existing customer or insert a new customer.
1 2 3 4 5 |
MERGE employees AS target USING (VALUES (1, 'Alex', 'alex@example.com')) AS source (emp_id, full_name, email) ON target.emp_id = source.emp_id WHEN MATCHED THEN UPDATE SET full_name = source.full_name, email = source.email WHEN NOT MATCHED THEN INSERT (emp_id, full_name, email) VALUES (source.emp_id, source.full_name, source.email); |
Upsert in Sybase
Sybase, another popular relational DBMS, supports the upsert operation through the MERGE
statement as well. Here’s how you can achieve it:
1 2 3 4 |
MERGE INTO clients USING (SELECT 1 AS client_id, 'Eva' AS name, 'eva@example.com' AS email) AS source ON clients.client_id = source.client_id WHEN MATCHED THEN UPDATE SET name = source.name, email = source.email WHEN NOT MATCHED THEN INSERT (client_id, name, email) VALUES (source.client_id, source.name, source.email); |
Upsert in DB2
DB2, an enterprise-level DBMS, provides the upsert operation through the MERGE
statement, similar to the previously mentioned systems. Here’s how it’s done:
1 2 3 4 5 |
MERGE INTO employees AS target USING (VALUES (1, 'Sarah', 'sarah@example.com')) AS source (emp_id, full_name, email) ON target.emp_id = source.emp_id WHEN MATCHED THEN UPDATE SET full_name = source.full_name, email = source.email WHEN NOT MATCHED THEN INSERT (emp_id, full_name, email) VALUES (source.emp_id, source.full_name, source.email); |
Upsert in H2
H2, a lightweight Java-based DBMS, also supports the upsert operation using the MERGE
statement. The following simple query updates existing rows, and insert rows that don’t exist. If no key column is specified, the primary key columns are used to find the row. If more than one row per new row is affected, an exception is thrown.
1 2 |
MERGE INTO customers KEY(id) VALUES (1, 'Oliver', 'oliver@example.com'); |
Conclusion
Upsert operations provide an efficient way to manage data insertion and updates in databases. While DBMSs don’t generally offer a direct “upsert” command, they do provide their own syntax to accomplish this function. By leveraging the INSERT...ON DUPLICATE KEY UPDATE
statement in MySQL/MariaDB and the INSERT...ON CONFLICT
statement in PostgreSQL and MERGE
statements in other databases, developers can incrementally load batches of data while ensuring data integrity, avoid duplicates, and maintaining the latest data. Understanding and mastering these upsert techniques can greatly enhance your database management skills and application performance.