Sequential Pattern Mining: Example II


This second example covers the last two steps of a sequential pattern mining algorithm, which the previous example skipped. The table below gives a database of customer sequences; the original transaction database is not shown in this example.

The customer sequences are in transformed form: each transaction has been replaced by the set of large itemsets it contains, and each large itemset has been replaced by an integer.
   Customer ID   Customer Sequence
   1             ⟨ {1 5} {2} {3} {4} ⟩
   2             ⟨ {1} {3} {4} {3 5} ⟩
   3             ⟨ {1} {2} {3} {4} ⟩
   4             ⟨ {1} {3} {5} ⟩
   5             ⟨ {4} {5} ⟩
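
As a rough sketch, the transformed database can be held in Python as a mapping from customer ID to a list of transactions, each transaction being a set of integers. The names `DATABASE` and `contains` are illustrative assumptions and do not come from the original text. The containment test assumes the usual definition for transformed sequences: the integers of a candidate sequence must appear in order, each in a different transaction.

```python
# Transformed customer sequences, keyed by customer ID.
# Each transaction is the set of large-itemset integers it contains.
DATABASE = {
    1: [{1, 5}, {2}, {3}, {4}],
    2: [{1}, {3}, {4}, {3, 5}],
    3: [{1}, {2}, {3}, {4}],
    4: [{1}, {3}, {5}],
    5: [{4}, {5}],
}


def contains(customer_sequence, candidate):
    """Return True if the integers in `candidate` occur in order,
    one per transaction, within `customer_sequence`."""
    position = 0
    for itemset_id in candidate:
        # Skip transactions that do not contain the current integer.
        while (position < len(customer_sequence)
               and itemset_id not in customer_sequence[position]):
            position += 1
        if position == len(customer_sequence):
            return False
        # The next integer must be matched in a later transaction.
        position += 1
    return True
```

For instance, `contains(DATABASE[2], [1, 3, 5])` holds, while `contains(DATABASE[1], [1, 5])` does not, because 1 and 5 occur only in the same transaction of customer 1.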

For example, the customer sequence with ID 2 consists of four transactions:
   {1} {3} {4} {3 5}
It might be interpreted as
   {milk} {bread} {butter} {bread, cheese}
Two more terms are required for the following discussions:

A modified Apriori algorithm is used in Step 4, the Sequence Phase, to find all frequent sequences. Assume the minimum support has been specified as 40% = 2/5 (i.e., 2 of the 5 customer sequences). The first pass over the database yields the large 1-sequences shown below.

   1-Sequence   Support
   ⟨1⟩          4
   ⟨2⟩          2
   ⟨3⟩          4
   ⟨4⟩          4
   ⟨5⟩          4

For example, the support of the sequence ⟨1⟩ is 4 because itemset 1 appears in the customer sequences with IDs 1 to 4.
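
The first pass can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions, not the algorithm's reference implementation: the names `DATABASE`, `MIN_SUPPORT`, and `large_1_sequences` are hypothetical, and support is taken to be the number of customers whose sequence contains the itemset at least once.

```python
from collections import Counter

# Transformed customer sequences from the example, keyed by customer ID.
DATABASE = {
    1: [{1, 5}, {2}, {3}, {4}],
    2: [{1}, {3}, {4}, {3, 5}],
    3: [{1}, {2}, {3}, {4}],
    4: [{1}, {3}, {5}],
    5: [{4}, {5}],
}

MIN_SUPPORT = 2  # 40% of 5 customer sequences


def large_1_sequences(database, min_support):
    """Count, for each integer, how many customers have it in any
    transaction, and keep those meeting the minimum support."""
    support = Counter()
    for transactions in database.values():
        # Union the transactions so each customer counts at most once.
        seen = set().union(*transactions)
        support.update(seen)
    return {item: count
            for item, count in sorted(support.items())
            if count >= min_support}


print(large_1_sequences(DATABASE, MIN_SUPPORT))
# prints {1: 4, 2: 2, 3: 4, 4: 4, 5: 4}
```

Taking the union per customer matters: itemset 3 appears in two transactions of customer 2, but that customer still contributes only 1 to its support.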



