Add a Decision Tree to a Pipeline

This example shows how to integrate a decision tree into a data processing pipeline, so that transformations, filtering, or predictions can follow the rules defined in the tree.

A decision tree is a hierarchical structure that represents a series of conditions and outcomes. Input data is evaluated by following the branches of the tree, checking the condition at each node, and reaching a final outcome determined by the conditions that were satisfied.
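
To make the structure concrete, here is a minimal plain-Java sketch of a tree of conditions and outcomes. It is an illustration only: SimpleTreeNode, sampleTree(), the "price" key, and the Category/Band outcomes are hypothetical names, not part of the DataPipeline API, and the sketch assumes that child nodes are tested in order and the first match wins.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical illustration of a decision tree; not the DataPipeline DecisionTree class. */
public class SimpleTreeNode {

    private final Predicate<Map<String, Object>> condition;
    private final Map<String, Object> outcomes = new LinkedHashMap<>();
    private final List<SimpleTreeNode> children = new ArrayList<>();

    public SimpleTreeNode(Predicate<Map<String, Object>> condition) {
        this.condition = condition;
    }

    public SimpleTreeNode addOutcome(String name, Object value) {
        outcomes.put(name, value);
        return this;
    }

    public SimpleTreeNode addNode(SimpleTreeNode child) {
        children.add(child);
        return this;
    }

    /** Collects this node's outcomes, then follows the first child whose condition holds. */
    public Map<String, Object> evaluate(Map<String, Object> input) {
        Map<String, Object> result = new LinkedHashMap<>(outcomes);
        for (SimpleTreeNode child : children) {
            if (child.condition.test(input)) {
                result.putAll(child.evaluate(input));
                break; // assumption: first matching branch wins
            }
        }
        return result;
    }

    private static double price(Map<String, Object> input) {
        return ((Number) input.get("price")).doubleValue();
    }

    /** Two-level example tree: product type first, then a price band. */
    public static SimpleTreeNode sampleTree() {
        return new SimpleTreeNode(in -> true)
            .addNode(new SimpleTreeNode(in -> "Bracelet".equals(in.get("Type")))
                .addOutcome("Category", "Jewelry")
                .addNode(new SimpleTreeNode(in -> price(in) < 50)
                    .addOutcome("Band", "budget"))
                .addNode(new SimpleTreeNode(in -> price(in) >= 50)
                    .addOutcome("Band", "premium")));
    }

    public static void main(String[] args) {
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("Type", "Bracelet");
        input.put("price", 69.99);
        System.out.println(sampleTree().evaluate(input)); // {Category=Jewelry, Band=premium}
    }
}
```

The outcomes gathered along the matching path form the result, which mirrors the idea of reaching a final outcome by following branches.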

In e-commerce, this example can be used to evaluate customer behavior and preferences to determine personalized product recommendations. The decision tree can consider factors such as past purchase history, browsing patterns, and demographic information to generate tailored recommendations for each customer.


Input CSV File

Handle,Title,Body (HTML),Vendor,Type,Tags,Published,Option1 Name,Option1 Value,Option2 Name,Option2 Value,Option3 Name,Option3 Value,Variant SKU,Variant Grams,Variant Inventory Tracker,Variant Inventory Qty,Variant Inventory Policy,Variant Fulfillment Service,Variant Price,Variant Compare At Price,Variant Requires Shipping,Variant Taxable,Variant Barcode,Image Src,Image Position,Image Alt Text,Gift Card,SEO Title,SEO Description,Google Shopping / Google Product Category,Google Shopping / Gender,Google Shopping / Age Group,Google Shopping / MPN,Google Shopping / AdWords Grouping,Google Shopping / AdWords Labels,Google Shopping / Condition,Google Shopping / Custom Product,Google Shopping / Custom Label 0,Google Shopping / Custom Label 1,Google Shopping / Custom Label 2,Google Shopping / Custom Label 3,Google Shopping / Custom Label 4,Variant Image,Variant Weight Unit,Variant Tax Code
chain-bracelet,7 Shakra Bracelet,"7 chakra bracelet, in blue or black.",Company 123,Bracelet,Beads,true,Color,Blue,,,,,,0,,1,deny,manual,42.99,44.99,true,true,,,1,,false,,,,,,,,,,,,,,,,,kg,
leather-anchor,Anchor Bracelet Mens,Black leather bracelet with gold or silver anchor for men.,Company 123,Bracelet,"Anchor, Gold, Leather, Silver",true,Color,Gold,,,,,,0,,1,deny,manual,69.99,85,true,true,,,1,,false,,,,,,,,,,,,,,,,,kg,


Java Code Listing

package com.northconcepts.datapipeline.foundations.examples.decisiontree;


import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.foundations.decisiontree.DecisionTree;
import com.northconcepts.datapipeline.foundations.decisiontree.DecisionTreeNode;
import com.northconcepts.datapipeline.foundations.decisiontree.DecisionTreeReader;
import com.northconcepts.datapipeline.foundations.expression.CalculatedField;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.transform.SelectFields;
import com.northconcepts.datapipeline.transform.TransformingReader;

public class AddADecisionTreeToAPipeline {

    public static void main(String[] args) {

        DecisionTree tree = new DecisionTree()
            .addField(new CalculatedField("Variant Price", "toBigDecimal(${Variant Price})"))

            .setRootNode(new DecisionTreeNode()

                .addNode(new DecisionTreeNode("${Variant Price} == null || ${Variant Price} < 20")
                    .addOutcome("Shipping", "0.00")
                    .addOutcome("Total", "${Variant Price} + Shipping"))

                .addNode(new DecisionTreeNode("${Variant Price} < 50")
                    .addOutcome("Shipping", "5.00")
                    .addOutcome("Total", "${Variant Price} + Shipping"))

                .addNode(new DecisionTreeNode("${Variant Price} < 100")
                    .addOutcome("Shipping", "7.00")
                    .addOutcome("Total", "${Variant Price} + Shipping"))

                .addNode(new DecisionTreeNode("${Variant Price} >= 100")
                    .addOutcome("Shipping", "${Variant Price} * 0.10")
                    .addOutcome("Total", "${Variant Price} + Shipping")));

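        // Take field names from the CSV header row so conditions can reference "Variant Price"
        DataReader reader = new CSVReader(new File("data/input/jewelry.csv"))
            .setFieldNamesInFirstRow(true);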

        reader = new DecisionTreeReader(reader, tree);

        reader = new TransformingReader(reader)
            .add(new SelectFields("Title", "Handle", "Variant Price", "Shipping", "Total"));

        DataWriter writer = StreamWriter.newSystemOutWriter();

        // Run the pipeline: read every record and write it to the console
        Job.run(reader, writer);
    }

}


Code Walkthrough

  1. A DecisionTree instance is created.
  2. A calculated field is added that parses the "Variant Price" field into a BigDecimal, so that it can be compared numerically.
  3. A root node is set, with four child nodes whose conditions test ranges of Variant Price.
  4. The addOutcome() method attaches outcomes to each node, here the Shipping and Total values.
  5. A CSVReader is created for the input file jewelry.csv.
  6. The DecisionTreeReader evaluates each record from the reader against the logic defined in the decision tree.
  7. A TransformingReader with SelectFields specifies the exact fields to keep in the output.
  8. Finally, Job.run() transfers the data from the reader to the writer returned by StreamWriter.newSystemOutWriter().
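
The shipping rules in the tree can be checked by hand with a small plain-Java restatement. This is a sketch rather than the library's evaluation: ShippingRulesSketch and its methods are hypothetical names, the tiers are assumed to be tested in the order they are declared, and treating a missing price as zero when computing the total is an assumption. The number formatting produced by the pipeline itself may differ.

```java
import java.math.BigDecimal;

/** Plain-Java restatement of the shipping tiers, for checking results by hand. */
public class ShippingRulesSketch {

    /** Returns the shipping charge for a price, testing the tiers in declaration order. */
    public static BigDecimal shipping(BigDecimal price) {
        if (price == null || price.compareTo(new BigDecimal("20")) < 0) {
            return new BigDecimal("0.00");
        } else if (price.compareTo(new BigDecimal("50")) < 0) {
            return new BigDecimal("5.00");
        } else if (price.compareTo(new BigDecimal("100")) < 0) {
            return new BigDecimal("7.00");
        }
        return price.multiply(new BigDecimal("0.10")); // 10% of the price at 100 and above
    }

    /** Returns price plus shipping; a null price is treated as zero (assumption). */
    public static BigDecimal total(BigDecimal price) {
        BigDecimal base = (price == null) ? BigDecimal.ZERO : price;
        return base.add(shipping(price));
    }

    public static void main(String[] args) {
        // Prices of the two sample rows in the input CSV
        for (String text : new String[] {"42.99", "69.99"}) {
            BigDecimal price = new BigDecimal(text);
            System.out.println(text + " -> Shipping " + shipping(price) + ", Total " + total(price));
        }
        // 42.99 -> Shipping 5.00, Total 47.99
        // 69.99 -> Shipping 7.00, Total 76.99
    }
}
```

Because each condition is only reached when the earlier ones failed, the tiers behave as ranges: under 20, 20 up to 50, 50 up to 100, and 100 or more.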


Console Output

0 - Record (MODIFIED) {
    0:[Title]:STRING=[7 Shakra Bracelet]:String
    2:[Variant Price]:STRING=[42.99]:String

1 - Record (MODIFIED) {
    2:[Variant Price]:STRING=[42.99]:String

2 - Record (MODIFIED) {
    0:[Title]:STRING=[Anchor Bracelet Mens]:String
    2:[Variant Price]:STRING=[69.99]:String