The JUMBL's analysis engines produce several analysis results; this page explains what those results signify and how they may be used. Note that which statistics are important and useful to you depends very much on the problem you are trying to solve. This page is only an overview.
The JUMBL has more than one analytical engine, and different engines may produce different results. Some engines use approximations to obtain their results, and these will differ slightly from the exact ("analytical") solutions of other engines. At the time of writing, the JUMBL contains four analytical engines.
There are four kinds of statistics presented in the analysis report: statistics about the model as a whole, about its nodes, about its stimuli, and about its arcs.
Of these, the statistics which are potentially of the greatest use are those which relate to the model (overall) and to the stimuli. The nodes of a model are, in a sense, arbitrary: though they may be said to refer to "states of use," additional nodes may be inserted, or existing nodes split or merged, for various purposes. In contrast, the stimuli on the arcs represent usage events observed by the system under test and are therefore the primary focus of the modeling effort.
The mathematical expectation E(X) (also called the mean) is the average value of a random variable X across many trials.
Every expectation E(X) has an associated variance Var(X). The variance can be interpreted using Chebyshev's Inequality: the probability that the observed value of the random variable X differs from the expected value E(X) by at least k (for any positive k) is bounded above by Var(X)/k^2. In other words: P[|X - E(X)| >= k] <= Var(X) / k^2
Alternately, you may think of the variance in this way: at least 95% of the time, the observed value will be within SQRT[Var(X) / 0.05] of the expectation. (SQRT[x] is the square root of x.)
Often the square root of the variance is reported; this is called the standard deviation. The Chebyshev Inequality does not assume any distribution for X. If the distribution of a random variable is known, one can often do much better than the Chebyshev Inequality. For example, for a normally-distributed random variable, approximately 68% of the time the observed value is within one standard deviation of the mean, and 95% of the time the observed value is within two standard deviations of the mean.
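As a concrete illustration, here is a small Python sketch of the Chebyshev bound; the expectation and variance are purely illustrative values, not taken from any report.

    import math

    # Illustrative values, as they might appear for (say) test case length.
    expectation = 12.0
    variance = 9.0

    # Chebyshev: P[|X - E(X)| >= k] <= Var(X) / k^2 for any k > 0.
    k = 5.0
    print(variance / k**2)             # upper bound on the tail probability (0.36)

    # Radius that captures at least 95% of observations:
    # choose k so that Var(X)/k^2 = 0.05, i.e. k = SQRT[Var(X)/0.05].
    print(math.sqrt(variance / 0.05))  # about 13.4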
This is the name of the model to which the results correspond.
Every model may have one or more distributions. Distributions are distinguished by keys, and designate a collection of constraints which are to be taken together. For an example of a model with multiple distributions, see the NAS TML model.
This collection of constraints may be viewed as probabilities on arcs, or as costs associated with arcs, etc. The special default distribution has no key. The following example shows two TML constraints; the first is part of the default distribution, while the second is part of the "special" distribution.
    ($ 1/2 $)
    special:($ 2/3 $)
Every analysis must be for one and only one distribution, so the distribution used is indicated in the analysis results page.
The number of nodes n in the model is reported. Many computations are O(n^2) or O(n^3), so this number is significant.
The number of arcs in the model is reported. Note that there may be many arcs between two nodes, so this number is in some sense independent of the number of nodes.
Each arc must be labeled with a stimulus, and a single stimulus may label more than one arc. This is the number of distinct stimuli in the model. Each arc from a given node must have a distinct stimulus, so the number of arcs in the model is bounded above by the number of stimuli multiplied by the number of nodes.
The average per-step uncertainty, in bits. This can be interpreted as the minimum average number of yes / no questions which must be asked to determine which path is taken. For example, if there are eight choices, then one can determine which choice is made by asking three yes / no questions (splitting the number of choices in half each time). For n equally likely choices, one must ask CEIL(log_2(n)) questions to determine which choice is made. (CEIL(x) is the smallest integer greater than or equal to x.)
When choices are not equally likely, the uncertainty is less than for the equally likely case (since some choices must be more likely than others). The uncertainty of a distribution P = {p_1, p_2, p_3, ..., p_n} is given by the following famous equation derived by C. Shannon: H(P) = -SUM(p_i log_2 p_i)
(SUM(x_i) is the sum of all the x_i values; SUM(x_i) = x_1 + x_2 + ... + x_n.)
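As a sketch (the distribution is just the eight-way example from above), the entropy of a probability distribution can be computed directly from this formula in Python:

    import math

    def entropy(probs):
        """Shannon entropy H(P) = -SUM(p_i log_2 p_i), in bits."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    # Eight equally likely choices: 3 bits, i.e. three yes/no questions.
    print(entropy([1/8] * 8))   # 3.0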
There are many different probability distributions on a Markov chain and thus many entropies. One can compute the entropy for each node of the model (using the probability distribution on the outgoing arcs), then take the average of these values, weighted by the long run occupancy for the node. This weighted average is called the source entropy of the model, and it is useful as a measure of complexity of the model.
There are two choices for the outgoing arc probability distribution. Traditionally the next node probability distribution (sometimes called the transition probability distribution) has been used--this gives the node source entropy, or the average per-step uncertainty in choosing the next node. More recently, attention has been focused on the stimuli which label the outgoing arcs. The probability distribution on the outgoing stimuli can be used instead--this gives the arc source entropy, or the average per-step uncertainty in choosing the next node and stimulus pair.
The arc source entropy is bounded below by the node source entropy. The two are equal if and only if each stimulus labels an arc to a different node.
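The weighted-average computation can be sketched as follows for a hypothetical three-node model; the occupancies and outgoing distributions are illustrative, not drawn from any real report.

    import math

    def entropy(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    # Illustrative long run occupancies for a three-node model.
    occupancy = {"Enter": 0.25, "Working": 0.50, "Exit": 0.25}

    # Outgoing next-node distributions (node source entropy) and outgoing
    # (stimulus, next node) distributions (arc source entropy). From "Working",
    # two different stimuli happen to lead back to "Working" itself.
    next_node = {"Enter": [1.0], "Working": [0.5, 0.5],        "Exit": [1.0]}
    next_arc  = {"Enter": [1.0], "Working": [0.25, 0.25, 0.5], "Exit": [1.0]}

    node_source_entropy = sum(occupancy[n] * entropy(next_node[n]) for n in occupancy)
    arc_source_entropy  = sum(occupancy[n] * entropy(next_arc[n])  for n in occupancy)
    print(node_source_entropy, arc_source_entropy)   # 0.5 and 0.75 bits; arc >= node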
One can reverse the entropy computation (see node source entropy and stimulus source entropy) to get a number of choices from an entropy. For example, if one has an entropy of 3 bits, this gives a sample space of 2^3 = 8 choices. A larger entropy represents a greater number of uniformly likely choices. For example, if one has a distribution {1/4, 1/4, 1/2}, then the entropy of this distribution is -[(1/4)(-2) + (1/4)(-2) + (1/2)(-1)] bits = 3/2 bits = 1.5 bits. This is between 1 bit (2 equally likely choices) and 2 bits (4 equally likely choices), and one could think of it as roughly 2^1.5, or about 2.83, "equally likely choices." One could also think of these as 2.83 "statistically typical" choices.
One can "normalize" the source entropy (which is the average per-step uncertainty) to get the average per-sequence uncertainty by dividing the source entropy by the long run occupancy of the model sink. The resulting number was used by G. Walton as the model complexity. Two to the power of this number is the number of statistically typical sequences. This is the number of sequences which the entropy represents, and it gives a sense of how long the "long run" actually is.
The model complexity is always larger than the source entropy (since the long run occupancy of the sink is always less than one). The number of statistically typical sequences is usually very large, and so it is given as a power of 2. For example, a statistically typical sequence number of 2^20 means that the model complexity is 20, and that there are roughly one million statistically typical sequences.
The number presented is based on the stimulus source entropy, not the node source entropy.
This number can be used to compare two models or assess the effect of changes to a single model in terms of the amount of testing required to achieve the long run and a sample representative of operational use. The larger this number, the more tests one can expect to execute before testing is representative of operational use.
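A minimal sketch of the normalization, with assumed values for the stimulus source entropy and the sink occupancy (chosen to reproduce the 2^20 example above):

    # Assumed, illustrative values.
    stimulus_source_entropy = 1.25   # bits per step
    sink_occupancy = 0.0625          # long run occupancy of the model sink

    model_complexity = stimulus_source_entropy / sink_occupancy
    print(model_complexity)          # 20.0
    print(2 ** model_complexity)     # about one million statistically typical sequences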
The average number of events in a single test case (the number of node transitions between the model source and model sink) is reported along with its variance. This number can be used to determine approximately how long a single test will take to execute.
The graph cyclomatic number is a graph-theoretic measure of the complexity of the graph. This number is useful only if you are interested in graph-theoretic properties of the underlying model connectivity graph (which most of you probably aren't).
The density (the number of nonzeros in the transition matrix divided by the total size of the transition matrix) is given. This number is always in the open interval (0,1). The number of nonzeros in the Markov transition matrix is reported in parentheses. If there are n nodes in the model, then the number of nonzeros is in the interval [1, n^2).
The density of the transition matrix is provided for evaluation of the utility of sparse matrix techniques. If you are not on the development team, this result will probably not be of much interest to you.
The fraction of time one spends in this particular node in the long run. This is the fraction (# of occurrences of node x) / (total number of node occurrences). These numbers are all from the open interval (0,1) and the numbers in the column should add to one (since one is always in one of the nodes).
This column of numbers is also called the stationary distribution and the Perron eigenvector.
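As an illustrative sketch (the transition matrix is hypothetical), the long run occupancies are the left eigenvector of the transition matrix for eigenvalue 1, normalized to sum to one:

    import numpy as np

    # Hypothetical recurrent chain: Enter, Working, Exit (the sink returns to the source).
    P = np.array([
        [0.0, 1.0, 0.0],   # Enter   -> Working
        [0.0, 0.5, 0.5],   # Working -> Working or Exit
        [1.0, 0.0, 0.0],   # Exit    -> Enter (restart)
    ])

    # Left eigenvector of P for eigenvalue 1 (the Perron eigenvector).
    eigenvalues, eigenvectors = np.linalg.eig(P.T)
    pi = np.real(eigenvectors[:, np.argmin(np.abs(eigenvalues - 1.0))])
    pi = pi / pi.sum()
    print(pi)   # [0.25, 0.5, 0.25]; entries sum to one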
The probability that the given node occurs in a single test case. This can also be viewed as the fraction (# of test cases containing node x) / (total number of test cases). This is useful for automated testing because it reveals how often the specified node will occur in a test case. Nodes which have a high probability of occurrence should be automated; nodes which have a low probability of occurrence may be handled manually.
This is the average number of times the specified node occurs in a single test case. This is used in conjunction with the probability of occurrence. If a node occurs in about half the test cases, but occurs about ten times whenever it does occur, then the mean occurrence will be about five. Like the probability of occurrence, this number is useful for automated testing because it reveals how frequently a node will be visited in a single test case. Even if the node has a low probability of occurrence, a high mean occurrence indicates that the node should be automated.
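Both per-test-case statistics can be sketched using the fundamental matrix of the absorbing chain (treating the sink as absorbing). The model below is hypothetical, and this is standard absorbing-chain analysis rather than necessarily the computation any particular JUMBL engine performs.

    import numpy as np

    # Hypothetical model: transient nodes Source, A, B; the sink is absorbing.
    # Q holds the transition probabilities among the transient nodes only;
    # the missing probability mass in each row goes to the sink.
    nodes = ["Source", "A", "B"]
    Q = np.array([
        [0.0, 0.7, 0.3],
        [0.0, 0.2, 0.3],
        [0.0, 0.4, 0.1],
    ])

    # Fundamental matrix: N[i][j] = expected visits to node j starting from node i.
    N = np.linalg.inv(np.eye(len(nodes)) - Q)

    source = 0
    for j, name in enumerate(nodes):
        mean_occurrence = N[source, j]
        # Probability of ever visiting node j from the source before absorption.
        prob_occurrence = 1.0 if j == source else N[source, j] / N[j, j]
        print(name, round(mean_occurrence, 3), round(prob_occurrence, 3))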
This is the average number of events or the average number of test cases (see the column heading for the appropriate units) which will occur prior to the first visit to the specified node.
This number may be larger than the average length of a test case, since it may take several test cases before one visits a given node. This number reveals how long it will take, on average, before the specified node occurs in random testing. One can use this to determine whether an important node is likely to be visited during random testing, or whether a non-random test should be crafted to visit the specified node because it is not expected to occur in random testing.
For nodes which represent failure, this can be viewed as the mean time until failure.
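The first passage statistic in terms of events can be sketched by solving a small linear system over the recurrent chain (the chain in which the sink returns to the source); the model below is the same hypothetical three-node chain used earlier.

    import numpy as np

    # Hypothetical recurrent chain: Enter, Working, Exit (the sink restarts at the source).
    P = np.array([
        [0.0, 1.0, 0.0],
        [0.0, 0.5, 0.5],
        [1.0, 0.0, 0.0],
    ])

    def mean_first_passage(P, start, target):
        """Expected number of events from `start` until the first visit to `target`.

        Solves t_i = 1 + SUM over j != target of P[i][j] * t_j, with t_target = 0.
        """
        others = [i for i in range(P.shape[0]) if i != target]
        A = np.eye(len(others)) - P[np.ix_(others, others)]
        t = np.linalg.solve(A, np.ones(len(others)))
        return 0.0 if start == target else t[others.index(start)]

    print(mean_first_passage(P, start=0, target=2))   # 3.0 events before first reaching Exit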
The uncertainty (see source entropy) with which the next stimulus is chosen from the specified node, in bits. This reveals how the complexity of the model is distributed among the nodes.
The fraction of time one spends in this particular stimulus in the long run. This is the fraction (# of occurrences of stimulus x) / (total number of stimulus occurrences). These numbers are all from the interval (0,1] and the numbers in the column should add to one (since one is always in one of the stimuli).
The probability that the given stimulus occurs in a single test case. This can also be viewed as the fraction (# of test cases containing stimulus x) / (total number of test cases). This is useful for automated testing because it reveals how often the specified stimulus will occur in a test case. Stimuli which have a high probability of occurrence should be automated; stimuli which have a low probability of occurrence may be handled manually.
This is the average number of times the specified stimulus occurs in a single test case. This is used in conjunction with the probability of occurrence. If a stimulus occurs in about half the test cases, but occurs about ten times whenever it does occur, then the mean occurrence will be about five. Like the probability of occurrence, this number is useful for automated testing because it reveals how frequently a stimulus will occur in a single test case. Even if the stimulus has a low probability of occurrence, a high mean occurrence indicates that the stimulus should be automated.
Computation of the variance of this metric is an open problem, and is not presently done by any of the analytical engines.
The fraction of time one spends in this particular arc in the long run. This is the fraction (# of occurrences of arc x) / (total number of arc occurrences). These numbers are all from the interval (0,1] and the numbers in the column should add to one (since one is always in one of the arcs).
The probability that the given arc occurs in a single test case. This can also be viewed as the fraction (# of test cases containing arc x) / (total number of test cases). This is useful for automated testing because it reveals how often the specified arc will occur in a test case. Arcs which have a high probability of occurrence should be automated; arcs which have a low probability of occurrence may be handled manually.
This is the average number of times the specified arc occurs in a single test case. This is used in conjunction with the probability of occurrence. If an arc occurs in about half the test cases, but occurs about ten times whenever it does occur, then the mean occurrence will be about five. Like the probability of occurrence, this number is useful for automated testing because it reveals how frequently an arc will occur in a single test case. Even if the arc has a low probability of occurrence, a high mean occurrence indicates that the arc should be automated.