Sequential Pattern Mining


A sequence is an ordered list of items, in our case web pages, ordered by time of access.
The problem of discovering sequential patterns consists of finding inter-transaction patterns such that the presence of a set of items is followed by another item in the time-stamp ordered transaction set.
In web server transaction logs, a visit by a client is recorded over a period of time. The time stamp associated with a transaction in this case will be a time interval which is determined and attached to the transaction during the data cleaning or transaction identification processes. The discovery of sequential patterns in web server access logs allows web-based organizations to predict user visit patterns and helps in targeting advertising aimed at groups of users based on these patterns. By analyzing this information, the web mining system can determine temporal relationships among data items such as the following: Another important kind of data dependency that can be discovered, using the temporal characteristics of the data, are similar time sequences. For example, we may be interested in finding common characteristics of all clients that visited a particular file within the time period . Or, conversely, we may be interested in a time interval (within a day, or within a week, etc.) in which a particular file is most accessed