Enabling Multi-Resolution Modeling


Cagri Tekinay, Mamadou D. Seck, Alexander Verbraeck

Department of Systems Engineering, Faculty of Technology, Policy and Management

Delft University of Technology, Delft, Netherlands

{c.tekinay}@tudelft.nl

Introduction

Most real-world systems are large-scale and constituted by a vast number of technical and social components interacting in non-trivial ways. Capturing the underlying complexity of their behavior based on their known structure, investigating alternative designs and anticipating the ramifications of our actions is nearly impossible in their totality, given their scale and complexity. Abstraction is a method of simplification that deals instead with a generalized, idealized model of a system (Lee and Fishwick, 1996). Abstraction allows model refinement while preserving validity in a variety of situations, e.g., the validity of a model representation with respect to a system of interest, the validity of an abstract model with respect to the original model, the validity of a system description with respect to either a lower or higher level description, and the correctness of a simulator with respect to a model; see Zeigler (2000). The need for model abstraction and for traversing models across levels of a hierarchy has been well studied: Frantz (1995) and Lee and Fishwick (1996) published two distinct studies on the taxonomy of model abstraction techniques.

Abstraction is an important concept for both systems analysis and systems design, and it is important not only within a single level of description but also when moving from one level to another. In systems analysis, we try to understand the behavior of an existing or hypothetical system based on its known structure; in systems design, we investigate alternative structures for a completely new system or for the redesign of an existing one. In both cases, the validity of the results obtained from a simulation must be judged with respect to the question that the simulation is being used to address.

Developing multi-level models has been the interest of many researchers. The increasing complexity of real-world systems makes it hard to grasp all of their aspects at once, and a better understanding often requires a detailed view and a broader view taken together.

To learn about a system by using simulations, one must build a model and execute this model by means of experimentation (Kleijnen 2008). The model represents the system of interest with its elements, their relationships and constraints (Robinson 2006). A model can be studied in different forms: it can be a single model used to analyze the system, or it can be a multimodel that contains many submodels (Fishwick 1993). For cases where a single model is insufficient to analyze large-scale multi-actor systems, multimodels provide analysts with a more capable modeling approach (Fishwick 1991; Fishwick and Zeigler 1992; Fishwick 1993). A multimodel can be defined as a modular model in which several submodels together form the total behavior of the model (Ören 1991). Multimodeling has been the main interest of several studies (Zeigler and Ören 1986; Ören 1987; Fishwick 1991; Ören 1991; Fishwick and Zeigler 1992; Fishwick 1993; Reynolds et al. 1997; Zeigler et al. 2000; Yılmaz and Ören 2004).

Real-life systems are often too large and complex to understand thoroughly, and analysis and decision-support tools have almost become a must for dealing with the underlying complexity. Modeling and simulation is one way to analyze such systems: simulations are tools that generate data under the conditions provided by a model (Zeigler 2000). However, increasing complexity makes it hard to grasp all aspects of a system at once, and a better understanding often requires a detailed view and a broader view taken together.

The importance of having simulation models with multiple levels of abstraction when studying a complex phenomenon is well recognized (Zeigler 1976; Fishwick 1986; 1988; 1989), as is the importance of different perspectives and, sometimes, multiple views for better understanding when building multi-level simulation models.

(…this chapter is going to be organized and extended…)

Temporal data mining is an area of emerging interest in both research and professional practice; see Laxman and Sastry (2006). Some of these methods are designed to analyze numerical time series, see Liu et al. (2001) for instance, while other approaches are concerned with time series of symbolic values: methods of sequential pattern analysis try to find hidden patterns or unexpected trends in categorical time series data (unsupervised rule discovery). This type of sequential pattern analysis was first introduced by Agrawal and Srikant (1995). For a review of such methods, like FP-Tree (Han, Pei, and Yin, 2001), PrefixSpan (Pei et al., 2001) and others that analyze a collection of short sequences to detect similarities, refer to Laxman and Sastry (2006). Other approaches (Nevill-Manning and Witten, 1997; Cohen and Adams, 2001; Cohen, Adams and Heeringa, 2007; Gunawardana and Meek, 2011), including ours, assume a single long sequence and try to detect the repeated occurrence of sequential patterns within it. In this context, the sequential patterns are often referred to as episodes, but the terminology is not unique: some authors, for instance, define an episode to be any meaningful pattern, where what counts as meaningful depends on the context and objectives of the individual analysis.

Data is often naturally expressed as a long sequence of distinct states or discrete events over time. Typically, humans find it very difficult to identify interesting patterns or chunks within the long sequence (Gunawardana and Meek, 2011). In sequence analysis, the main concern is to identify these sequential patterns and quantify the (dis-)similarities or distances amongst a set of such patterns. The goal is to use this quantification to efficiently describe the set of patterns observed or use it as a basis to construct a dependent or independent variable for some model of related phenomena. Stated in such general terms, sequence analysis seems to be a useful tool to find answers to interesting and legitimate questions (Elzinga, 2006).

The goal of the proposed multi-step methodology is to 1) recognize frequent patterns in multi-dimensional phase sequence data generated by a high-resolution DEVS model, 2) use the statistical properties (entropy, frequency, etc.) of these frequent patterns to segment the phase sequence data into chunks, 3) generate more abstract phase variables from these chunks, and 4) construct lower-resolution DEVS models using these new phase variables.

The first step of the proposed multi-step methodology is to find the frequent patterns that appear in multi-dimensional phase sequence data. The next section therefore begins by introducing the Dynamic Frame Expansion (DFE) algorithm. Following this, we elaborate on the segmentation step of our multi-step methodology. The segmentation step performs an unsupervised segmentation of the multi-dimensional phase sequence data, simply by using the candidate patterns and their statistical signatures generated by the mining step. The segmentation process is unsupervised because the algorithm has no prior information about the size, the boundary (start and end events) or the position of a segment in the multi-dimensional phase sequence data. The segments obtained from the multi-dimensional phase sequence data will later be used as phase variables for lower-resolution models.

! Maybe store the phases with their durations, e.g. <a,3> or a3, etc. That will reveal that two occurrences of the same phase can actually represent two different behaviors. The more information we add to an atomic symbol, the more detailed the analysis becomes. The same applies to adding an environment variable!

Another important issue is the distinction between subsequence and substring. The problem is that analyzing all possible subsequences is not feasible if no "support threshold" is implemented; it is not only time consuming but also memory consuming. Sequential analysis in databases can work on sequences because it implements minimum and maximum thresholds and deals with relatively shorter sequences compared to sequential analysis of written text. Maybe, after obtaining the size of the longest substring, we can use that as the maximum threshold.

Mention the example of Elzinga with two strings x and y. Explain how Kolmogorov complexity (Kolmogorov, 1965) is actually one of the reasons that modelers often choose to model in the most detailed way: for some phenomena it is hard to grasp a more abstract understanding, so modelers often decide to leave them as they are…

The essence of modeling lies in establishing relations between pairs of system descriptions. A vertical relation is called an association mapping: it takes a system at one level of specification and generates its counterpart at another level of specification. The downward motion in the structure-to-behavior direction formally represents the process by which a simulator generates the behavior of a model (Zeigler, 2000).

At the next level, the data level is a database of measurements and observations made for the source system. When we get to Level 2, we have the ability to recreate this data using a more compact representation, such as a formula. Since there are typically many formulas or other means to generate the same data, the generative level, i.e., the particular means or formula we have settled on, constitutes knowledge we did not have at the data system level. When people talk about models in the context of simulation studies, they are usually referring to the concepts identified at this level; that is, to them a model means a program to generate data.

The central idea is that when we move to a lower level, we do not generate any really new knowledge; we are only making explicit what is implicit in the description we already have. One could argue that making something explicit can lead to insight, or understanding, which is a form of new knowledge, but Klir is not considering this kind of subjective (or modeler-dependent) knowledge.

Frequent Sequence Mining from Multi-Dimensional Categorical Time Series using Dynamic Frame Expansion Algorithm

The Mining of Categorical Time Series

Categorical time series come as strings or sequences of characters, each character denoting one particular state, event or phase (Elzinga, 2006). These characters are determined by the application: for example, when modeling the movement of a car in DEVS, the characters can be symbols or acronyms for the relevant phases such as cruising, accelerating and decelerating. They can also represent a more abstract phenomenon such as parking or overtaking.

Let ï" = {a, b, c, …} = {σ1 ,… , σp} be a finite set of characters, i.e. an alphabet, from which the sequences are constructed. A sequence S arise by concatenation of characters from ï" and S ï"* where ï"* denotes the set of all sequences that can be constructed from ï". 𝞮 denotes the empty sequence. |S| denotes the length of a sequence S, i.e., the number of concatenated characters it contains. A sequence Sa denoted by <a1a2a3…am> is contained in another sequence Sb = <b1b2b3… bn> if ∃k, 0<k<n such that a1a2…am = bk+1bk+2… bk+m. If sequence Sa is contained in sequence Sb, then we call Sa a subsequence of Sb and Sb a supersequence of Sa. The frequency of a subsequence S’, denoted as Æ‘ (S’, S), is the number of the sequence S’ in S such that S’ is a subsequence of S. If any sequence S’ satisfies Æ‘ (S’, S)> 1, then it is called a frequent sequence (FS). Ѱ denotes the set of all FSs for a given S where Ѱ ⊂ ï"* ⊂ï". An end sequence denoted by ES is a subsequence of S where any sequence S’ has the frequency of Æ‘ (S’, S) = 1, S’⊂S such that S’- ai = S’’ where S’’ = <a1a2…ai-1> and S’’ Ѱ. Ñ  denotes the set of ESs for a given S. U denotes the union set U = Ѱ ∪Ѡ.

Problem 1: Given a multi-dimensional phase sequence S as input over the alphabet Σ, generate the union set U and construct the FS-Trie.

Dynamic Frame Expansion Algorithm

Generation of the FS-Trie

The DFE algorithm is responsible for generating the union set U and storing the set elements in a compact trie structure, called the Frequent Sequence Trie (FS-Trie), to be used later in the segmentation step. The FS-Trie is an ordered tree data structure that stores all frequent sequences, end nodes and their frequency information for a given categorical time series. Unlike a binary search tree, no node in the trie stores the key associated with that node; instead, its position in the trie defines the key with which it is associated (Trie, Wikipedia, http://en.wikipedia.org/wiki/Trie, accessed January 9, 2013). An FS-Trie grows dynamically as each symbol of the related FS is appended; therefore, all the descendants of a node share a common prefix. The root node of the FS-Trie is always null. An FS-Trie is similar to the common implementation of a trie in that there is a node for every prefix. Similar to the n-gram trie implementation of Cohen et al. (2007), the node structure is also modified to store not only the prefixes themselves but also an integer value associated with each prefix. Therefore, traversing an FS-Trie not only reveals the sequence information of a given FS prefix but also provides information about the frequency of subsequences at each level of the trie. However, an FS-Trie differs from both the common implementation and the implementation of Cohen et al. in that it only stores the elements of the set U. This is mainly because evaluating all the distinct subsequences of an input sequence is intractable, since an input sequence of length n has 2^(n−1) possible segmentations. Eliminating the non-repeated sequences therefore dramatically reduces the size of the trie. Pseudocode for the FS-Trie generation method is shown in Algorithm 1.
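A minimal Python sketch of such a node structure may help fix ideas; the class and method names are illustrative assumptions, not the authors' implementation:

class FSTrieNode:
    """FS-Trie node: children keyed by symbol, plus the frequency of the
    prefix that the path from the root to this node spells out."""
    def __init__(self):
        self.children = {}   # symbol -> FSTrieNode
        self.frequency = 0   # frequency of the prefix ending at this node

    def add(self, sequence, frequency):
        """Insert a frequent or end sequence together with its frequency;
        all descendants of a node automatically share a common prefix."""
        node = self
        for symbol in sequence:
            node = node.children.setdefault(symbol, FSTrieNode())
        node.frequency = frequency

# root = FSTrieNode(); root.add("AA", 3); root.add("AAB", 2)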

Algorithm 1: GenerateFS-Trie(S)

Initially: dynamic frame size (dfs) = 1, dynamic frame position (dfp) = 0, temporary frame position (tfp) = 0, smallestBehavioralPatternSize = 1, set of frequent sequences Ψ = ∅, set of end sequences Ѡ = ∅, FS-Trie = <empty>.

Input: Multi-Dimensional Phase Sequence Data (S).

while dfp < S.length − 1
    S′ ← S.substring(dfp, dfp + dfs)
    if S′ ∈ (Ψ ∪ Ѡ) then
        ƒ(S′, S) ← S.findFrequency(S′)
    else
        while tfp < S.length
            if S.substring(tfp, tfp + dfs) is equal to S.substring(dfp, dfp + dfs) then
                increment ƒ(S′, S)
                tfp ← tfp + dfs
            else
                tfp ← tfp + 1
            end if
        end while
    end if
    if ƒ(S′, S) is greater than or equal to 1 then
        <<detection of an atomic dfs-sized frequent sequence>>
        Ψ.add(S′)
        FS-Trie.addToFirstLevel(S′, ƒ(S′, S))
    else
        <<the sequence is neither an end node nor a frequent sequence, do nothing>>
    end if
    <<expanding the dynamic frame>>
    while ƒ(S′, S) is greater than or equal to 1 and (dfs + dfp) < S.length
        dfs ← dfs + 1
        tfp ← dfs
        S′′ ← S.substring(dfp, dfp + dfs)
        ƒ(S′′, S) ← S.findFrequency(S′′)
        if ƒ(S′′, S) is equal to 1 then
            Ѡ.add(S′′)
            FS-Trie.add(S′′, ƒ(S′′, S))
            <<detection of an end sequence, break the loop and stop expanding the trie>>
        else if S′′ ∈ (Ψ ∪ Ѡ) and ƒ(S′′, S) > 1 then
            <<detection of an existing sequence in the trie, move one level down and continue>>
            dfs ← dfs + 1
        else if S′′ ∉ (Ψ ∪ Ѡ) then
            while tfp ≤ (S.length − dfs)
                if S.substring(tfp, tfp + dfs) is equal to S.substring(dfp, dfp + dfs) then
                    increment ƒ(S′′, S)
                    tfp ← tfp + dfs
                else
                    tfp ← tfp + 1
                end if
            end while
            <<detection of a frequent sequence>>
            Ψ.add(S′′)
            FS-Trie.add(S′′, ƒ(S′′, S))
        end if
    end while
    dfp ← dfp + 1
    dfs ← 1
end while
<<checking the final candidate>>
Sf ← S.substring(dfp, dfp + dfs)
if Sf ∉ (Ψ ∪ Ѡ) then
    Ѡ.add(Sf)
    FS-Trie.addToFirstLevel(Sf, ƒ(Sf, S))
else
    <<reached the end of S, return the FS-Trie>>
end if

An example

Let us illustrate the execution of our DFE algorithm on the simple string sequence AAABBAABA and compare the generation of the FS-Trie with the n-gram trie of Cohen et al. Given the initial position of the dynamic frame (dfp) = 0, the initial size of the dynamic frame (dfs) = 1 and Ψ = ∅, the union set at the end is U = {{A,6}, {AA,3}, {AAA,1}, {AAB,2}, {AABB,1}, {AB,2}, {ABB,1}, {B,3}, {BB,1}, {BA,2}, {BAA,1}, {AABA,1}, {ABA,1}}, and the FS-Trie generated from the set U is given in Figure 1.

It can be seen from the FS-Trie above that all FSs, ESs and their frequency values can be obtained by traversing the trie from top to bottom. This will further allow us to calculate standardized frequencies and boundary entropies (see Section X) for each level and later segment the input sequence.

As mentioned in the previous section, an FS-Trie differs from an n-gram trie in that it does not need to store all the permutations of a given input sequence S; instead, it stores only the frequent and end sequences. It is therefore memory efficient, especially when the input sequence is large and has a vast number of permutations. An n-gram trie of depth 5 for the same input sequence is given in Figure 2. The nodes with red characters and values are the ones omitted by the DFE algorithm.

[Figure 2: n-gram trie of depth 5 for the input sequence AAABBAABA; nodes omitted by the DFE algorithm are shown in red.]
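As a cross-check on this example, the union set U can also be reconstructed by brute force. The sketch below is ours and is for illustration only; the actual DFE algorithm computes the same set incrementally without enumerating all substrings. It reproduces the set U given above for the input AAABBAABA:

from collections import Counter

def union_set(S):
    """Brute-force U = Psi ∪ Ѡ: Psi holds every substring with frequency > 1,
    Ѡ holds the substrings of frequency 1 whose one-symbol-shorter prefix is
    frequent (the end sequences)."""
    n = len(S)
    freq = Counter(S[i:j] for i in range(n) for j in range(i + 1, n + 1))
    frequent = {w: c for w, c in freq.items() if c > 1}           # Psi
    ends = {w: c for w, c in freq.items()
            if c == 1 and len(w) > 1 and w[:-1] in frequent}      # Ѡ
    return {**frequent, **ends}

print(sorted(union_set("AAABBAABA").items()))
# [('A', 6), ('AA', 3), ('AAA', 1), ('AAB', 2), ('AABA', 1), ('AABB', 1), ...]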

Discovering Phase Sequence Boundaries using Extended Voting Experts Algorithm

The second step of the multi-step methodology after generating the FS-Trie is to segment the given input sequence by using the statistical signature of the candidate sequences.

Voting Experts Algorithm

The Voting Experts algorithm, initially proposed by Cohen et al. (2006), is an unsupervised algorithm for the segmentation of sequences into chunks. Chunks are sequences that have low internal entropy and high boundary entropy, meaning that items within a chunk can predict one another, but not items outside the chunk. A chunk is similar to what we call an FS in the previous sections (mention more).

Simply put, the Voting Experts algorithm includes experts that attend to boundary entropy and frequency, and it is easily extensible with experts that attend to other characteristics of chunks. The algorithm moves a window across a time series and asks, for each location in the window, whether to "cut" the series at that location. Each expert casts a vote. Each location takes n steps to traverse a window of size n, is seen by the experts in n different contexts, and may accrue up to n votes from each expert. Given the results of voting, it is a simple matter to cut the series at locations with high vote counts. The steps of the algorithm are as follows:

Build an n-gram trie of depth n+1. Nodes at level i+1 of the trie represent n-grams of length i. The children of a node are the extensions of the n-gram represented by the node. For example, a b c a b d produces the following trie of depth 3:

null
├─ a (2)
│   └─ b (2)
├─ b (2)
│   ├─ c (1)
│   └─ d (1)
├─ c (1)
│   └─ a (1)
└─ d (1)

Every n-gram of length 2 or less in the sequence a b c a b d is represented by a node in this tree. The numbers in the lower half of the nodes represent the frequencies of the subsequences. For example, the subsequence ab occurs twice, and every occurrence of a is followed by b.

Calculate boundary entropy. The boundary entropy of an n-gram is the entropy of the distribution of tokens that can extend the n-gram. The entropy of the distribution of a discrete random variable X is

H(X) = −∑x∈X p(x) log₂ p(x)

Boundary entropy is easily calculated from the trie. For example, the node a in the tree above has entropy equal to zero because it has only one child, ab, whereas the entropy of node b is 1.0 because it has two equi-probable children, bc and bd. Clearly, only the first n levels of the n-gram tree of depth n + 1 can have node entropy scores.
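As a small illustration, the following Python sketch (our own, with illustrative names; a flat dictionary stands in for the trie) reproduces the two entropy values just mentioned for the sequence a b c a b d:

import math
from collections import Counter

def ngram_counts(seq, max_n):
    """Counts of all n-grams up to length max_n (flat stand-in for the trie)."""
    return Counter(seq[i:i + n] for n in range(1, max_n + 1)
                   for i in range(len(seq) - n + 1))

def boundary_entropy(counts, w):
    """Entropy of the distribution of symbols that can extend n-gram w,
    i.e. of the children of w in the trie: H = -sum(p * log2(p))."""
    children = [c for x, c in counts.items()
                if len(x) == len(w) + 1 and x.startswith(w)]
    total = sum(children)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in children)

counts = ngram_counts("abcabd", 3)
print(boundary_entropy(counts, "a"))  # 0.0: 'a' is always followed by 'b'
print(boundary_entropy(counts, "b"))  # 1.0: 'b' extends to 'c' or 'd' equally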

Standardize frequencies and boundary entropies. In most domains, there is a systematic relationship between the length and frequency of patterns; in general, short patterns are more common than long ones. The algorithm will compare the frequencies and boundary entropies of n-grams of different lengths, but in all cases the algorithm is comparing how unusual these frequencies and entropies are, relative to other n-grams of the same length.

Standardization of values: a standardized variable (sometimes called a z-score or a standard score) is a variable that has been rescaled to have a mean of zero and a standard deviation of one. For a standardized variable, each case's value indicates its difference from the mean of the original variable in numbers of standard deviations (of the original variable). For example, a value of 0.5 indicates that the value for that case is half a standard deviation above the mean, while a value of −2 indicates that a case has a value two standard deviations below the mean. Variables are standardized for a variety of reasons, for example, to make sure all variables contribute evenly to a scale when items are added together, or to make it easier to interpret the results of a regression or other analysis (Stata FAQ: How do I standardize variables in Stata?, http://www.ats.ucla.edu/stat/stata/faq/standardize.htm, accessed January 9, 2013).

Standardizing a variable is a relatively straightforward procedure. First, the mean is subtracted from the value for each case, resulting in a mean of zero. Then, the difference between the individual's score and the mean is divided by the standard deviation, which results in a standard deviation of one. If we start with a variable x, and generate a variable x*, the process is:

x* = (x-m)/sd

where m is the mean of x, and sd is the standard deviation of x.
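In code, this standardization step is a one-liner per value; a minimal sketch, using the population standard deviation:

def standardize(values):
    """z-scores: x* = (x - m) / sd, with m the mean and sd the (population)
    standard deviation of the values."""
    m = sum(values) / len(values)
    sd = (sum((x - m) ** 2 for x in values) / len(values)) ** 0.5
    return [(x - m) / sd for x in values]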

Score potential segment boundaries. In a sequence of length k there are k − 1 places to draw boundaries between segments and thus 2^(k−1) ways to divide the sequence into segments. Our algorithm is greedy in the sense that it considers just k − 1, not 2^(k−1), ways to divide the sequence. It considers each possible boundary in order, starting at the beginning of the sequence. The algorithm passes a window of length n over the sequence, halting at each possible boundary. All of the locations within the window are considered, and each garners zero or one vote from each expert. Because there are two experts, for boundary entropy and frequency respectively, each possible boundary may accrue a maximum of 2n votes.

Segment the sequence. Each potential boundary in a sequence accrues votes, as described above, and at this point the algorithm must evaluate the boundaries in terms of the votes and decide where to segment the sequence. The method is a familiar "zero crossing" rule: if a potential boundary has a locally maximum number of votes, split the sequence at that boundary. One additional assumption is made: the number of votes for a boundary must exceed an absolute threshold, as well as being a local maximum. Cohen et al. concede that the algorithm splits too often without this qualification.

The boundary entropy expert assigns votes to locations where the boundary entropy peaks locally; implementing the idea that entropy increases at episode boundaries. The frequency expert tries to find a “maximum likelihood tiling” of the sequence, a placement of boundaries that makes the n-grams to the left and right of the boundary as likely as possible. When both experts vote for a boundary, and especially when they vote repeatedly for the same boundary, it is likely to get a locally-maximum number of votes, and the algorithm is apt to split the sequence at that location (Cohen, Heeringa, and Adams, 2001).
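Putting these steps together, the following sketch (ours, not Cohen et al.'s code) illustrates the windowed voting and the zero-crossing cut rule. It reuses the ngram_counts and boundary_entropy helpers from the earlier sketch, and it omits the per-length standardization and the absolute vote threshold discussed above:

def vote_boundaries(seq, window=3):
    """Slide a window over seq; in each window the entropy expert and the
    frequency expert each vote for one internal cut position, and cuts are
    then placed at local maxima of the vote counts."""
    counts = ngram_counts(seq, window)
    votes = [0] * (len(seq) + 1)
    for start in range(len(seq) - window + 1):
        # entropy expert: cut after the prefix with the highest boundary entropy
        i = max(range(1, window),
                key=lambda j: boundary_entropy(counts, seq[start:start + j]))
        votes[start + i] += 1
        # frequency expert: cut where left and right chunks are jointly most frequent
        i = max(range(1, window),
                key=lambda j: counts[seq[start:start + j]]
                            + counts[seq[start + j:start + window]])
        votes[start + i] += 1
    # zero-crossing rule: cut where the vote count is a local maximum
    return [i for i in range(1, len(seq))
            if votes[i] >= votes[i - 1] and votes[i] > votes[i + 1]]

# e.g. vote_boundaries("abcabd", window=3) returns candidate cut positions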

Extended Voting Experts Algorithm

(Mention the extensions made so far:

The n-gram tree implementation is different from an FS-Trie: an n-gram trie has a fixed depth, whereas the FS-Trie also stores end nodes.

When segmenting the words in a given English sentence, only one meaningful segmentation is possible. However, several segmentations can be meaningful when finding behavior patterns.

We repeat the segmentation process n times with n different frame sizes, where n is the length of the input sequence, and wait until the segmentation values reach a steady state.

This helps to better locate local maxima and eliminates the need for thresholds! This is not possible when using n-gram trees, where the depth is limited to the maximum path length n…

And so on…)

!!! Example segmentation for the sample input is given in the “output of model analysis.xls” file.

To reduce the complexity of system analysis and control design, simplified models that capture the behavior of interest in the original system can be obtained. These simplified models, called abstractions, can be analyzed more easily than the original complex model (http://scholar.lib.vt.edu/theses/available/etd-05172007-112915/, accessed January 9, 2013).

Discussion

Mention the paper of Elzinga….


