Generate MySql DDL From CSV

In this example, you will see how to generate a MySQL database from the data in a CSV file.  The code will automate the creation of Data Definition Language (DDL) SQL for MySQL based on the inferred data types in the CSV file.  This can reduce the time to build your data ingestion pipelines by removing the manual database creation and data conversion.

This example can be easily modified to Generate MySql DDL Programmatically where Java Objects are used to create table, relationships, and indexes.


Input File

Region,Country,Item Type,Sales Channel,Order Priority,Order Date,Order ID,Ship Date,Units Sold,Unit Price,Unit Cost,Total Revenue,Total Cost,Total Profit
Middle East and North Africa,Libya,Cosmetics,Offline,M,10/18/2014,686800706,10/31/2014,8446,437.20,263.33,3692591.20,2224085.18,1468506.02
North America,Canada,Vegetables,Online,M,11/7/2011,185941302,12/8/2011,3018,154.06,90.93,464953.08,274426.74,190526.34
Middle East and North Africa,Libya,Baby Food,Offline,C,10/31/2016,246222341,12/9/2016,1517,255.28,159.42,387259.76,241840.14,145419.62


Java Code Listing

package com.northconcepts.datapipeline.foundations.examples.datamapping;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.foundations.schema.EntityDef;
import com.northconcepts.datapipeline.foundations.schema.SchemaDef;
import com.northconcepts.datapipeline.sql.mysql.CreateMySqlDdlFromSchemaDef;


public class GenerateMySqlDdlFromCsv {

    public static void main(String[] args) {
        DataReader reader = new CSVReader(new File("example/data/input/1000_Sales_Records.csv"))

        GenerateEntityFromDataset generator = new GenerateEntityFromDataset();
        EntityDef entity = generator.generateEntity(reader).setName("sales");

        SchemaDef schema = new SchemaDef("marketdb")

        CreateMySqlDdlFromSchemaDef mySqlDdl = new CreateMySqlDdlFromSchemaDef(schema)




Code Walkthrough

  1. CSVReader is created corresponding to the input file 1000_Sales_Records.csv and field names are set to the first row.
  2. GenerateEntityFromDataset is used to generate an entity with the name sales.
  3. SchemaDef with the name marketdb is created and the entity is added to it.
  4. CreateMySqlDdlFromSchemaDef is used to generate MySQL DDL from the schema.
  5. .setPretty(true) is used to enhance the look of the output, it creates new lines where necessary which improves the readability of the output.
  6. .setCheckIfDropTableExists(false) when false does not check if the table exists before dropping.
  7. There are a couple of methods that can help you customize the output according to your preferences. For example:
    .setDropTable(false) removes the DROP TABLE sales; statement
    .setCheckIfTableNotExists(false) removes the IF NOT EXISTS statement in the creation of the table.


Console Output


  Region varchar (100) NOT NULL,
  Country varchar (100) NOT NULL,
  Item Type varchar (100) NOT NULL,
  Sales Channel varchar (100) NOT NULL,
  Order Priority varchar (100) NOT NULL,
  Order Date date NOT NULL,
  Order ID int NOT NULL,
  Ship Date date NOT NULL,
  Units Sold int NOT NULL,
  Unit Price double precision NOT NULL,
  Unit Cost double precision NOT NULL,
  Total Revenue double precision NOT NULL,
  Total Cost double precision NOT NULL,
  Total Profit double precision NOT NULL
Mobile Analytics