Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Boolean NetworksEdda Klipp
Humboldt University Berlin
SS 2009
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
One Network, Different Models
gene a gene b
gene c gene d
C
A
D
B
AB
+
+
repression
activation
transcription
translation
gene
protein
a b
c d
Directed graphs
V = {a,b,c,d}
E = {(a,c,+),(b,c,+), (c,b,-),(c,d,-),(d,b,+)}
a b
c d
Boolean network
a(t+1) = a(t)
b(t+1) = (not c(t)) and d(t)c(t+1) = a(t) and b(t)
d(t+1) = not c(t)
a b
c d
Bayesian network
p(xa)
p(xb)p(xc|xa,xb),
p(xd|xc),
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Simplification of Gene Expression Regulation
Gene
mRNA
Protein
Gene
mRNA
ProteinTranscription Factor
A B C D E F G
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Boolean NetworkBoolean network is
- a directed graph G(V,E)
characterized by
- the number of nodes („genes“): N
- the number of inputs per node (regulatory interactions): k
A B
C
E
D
F G
N=7,kA=0, kB=1, kC=2,… in-degrees
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Boolean Logic
(George Boole, 1815-1864)Each gene can assume one of two states:
expressed („1“) or not expressed („0“)
Background: Not enough information for more detailed descriptionIncreasing complexity and computational effort for more specific models
Replacement of continuousfunctions (e.g. Hill function)by step function
Boolean models are discrete (in state and time) and deterministic.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Boolean Network
722
Boolean networks have
always a finite number of possible states: 2N
and, therefore, a finite number of state transitions:
A B
C
E
D
F G
N=7, 27 states, theoretically possible state transition
N22
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Dynamics of Boolean NetworkS
The dynamics are described by rules:
„if input value/s at time t is/are...., then output value at t+1 is....“
A B
A(t) B(t+1)
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Boolean Models: Truth functions
in output
0 0 0 1 11 0 1 0 1
p p not p
rule 0 1 2 3
A B
B(t+1) = not (A(t))rule 2
A(t) B(t+1)
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Dynamics of Boolean Networks with k=1
Linear chainA B C D
A fixed (no input). Rules 0 and 3 not considered (since independence of input).
A(t) B(t+1)B(t+1) C(t+2)
C(t+2) D(t+3)
The system reaches a steady state after N-1 time steps.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Dynamics of Boolean Networks with k=1
RingA B
C D
A
B
Again: Rules 0 and 3 not considered (since independence of input).
A(t+1)=B(t)B(t+1)=A(t)Both rule 1
A B A B A B A B0 0 1 0 0 1 1 10 0 0 1 1 0 1 10 0 1 0 0 1 1 1
A(t+1)=not B(t)B(t+1)=A(t)Both rule 1
A B A B A B A B0 0 1 0 0 1 1 11 0 1 1 0 0 0 11 1 0 1 1 0 0 00 1 0 0 1 1 1 00 0 1 0 0 1 1 1
Fixpoint or cycle of length 2 depending on initial conditions
Cycle of length 4 independent ofinitial conditions.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Attractor
The trajectory connects the successive states for increasing time.
An attractor is a region of a dynamical system's state space that the system can enter but not leave, and which contains no smaller such region (a special trajectory).
Fixpoint – cycle of length 1Cycles of length LBasin of attraction: is the surrounding region in state space such that all trajectories starting in that region end up in the attractor.
Bifurcation: appearance of a boarder separating two basins of attraction.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Boolean Models: Truth functions k=2
And Or Nor
0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
0 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
1 0 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
rule 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
input outputp q
AC C(t+1) = not (A(t)) and B(t)
rule 4B
p=A(t), q=B(t)
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Example Network
Three genes X, Y, and Z
X
Y
Z
Rules
X(t+1) = X(t) and Y(t) Y(t+1) = X(t) or Y(t)Z(t+1) = X(t) or (not Y(t) and Z(t))
Current Next state state000 000001 001010 010011 010100 011101 011110 111111 111
000 001 010 011
100101 110 111
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Example Network
X
Y
Z
000 001 010 011
100101 110 111
- The number of accessible states is finite, .
- Cyclic trajectories are possible.
- Not every state must be approachable from every other state.
- The successor state is unique, the predecessor state is not unique.
N2
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Example Network as Boolean Model
gene a gene b
gene c gene d
C
A
D
B
AB
+
+
repression
activation
transcription
translation
gene
protein
a b
c d
Boolean network a(t+1) = a(t)
b(t+1) = (not c(t)) and d(t)
c(t+1) = a(t) and b(t)
d(t+1) = not c(t)
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Example Network as Boolean Model
a b
c d
Boolean network
a(t+1) = a(t)
b(t+1) = (not c(t)) and d(t)
c(t+1) = a(t) and b(t)
d(t+1) = not c(t)
0000 00010001 01010010 00000011 00000100 00010101 01010110 00000111 0000
Steady state: 0101
1000 10011001 11011010 10001011 10001100 10111101 11111110 10101111 1010
Cycle: 1000 1001 1101 1111 1010 1000
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Naïve Reconstruction of Boolean Models
If it is known -the number of vertices, N, and -the number of inputs per vertex, k,-As well as a sufficient set of successive states, one can reconstruct the network
List- List for each vertex all possible input combinations
- List all respective outputs
Experiments:- Delete after every “experiment” all “wrong” entries of the list
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Naïve Reconstruction of Boolean Models
A B
N=2, k=1
Input Output A(A),B(A)A B rule 0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3 2 0 2 1 2 2 2 3 3 0 3 1 3 2 3 3 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0 1 1 1 1 1 0 1 0 1 1 1 10 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0 1 1 1 1 1 0 1 0 1 1 1 11 0 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 11 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 1 In out
0 0 0 1 11 0 1 0 1rule 0 1 2 3
Input Output A(B),B(B)A B rule 0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3 2 0 2 1 2 2 2 3 3 0 3 1 3 2 3 3 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0 1 1 1 1 1 0 1 0 1 1 1 10 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 11 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0 1 1 1 1 1 0 1 0 1 1 1 11 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 1
Input Output A(A),B(B)A B rule 0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3 2 0 2 1 2 2 2 3 3 0 3 1 3 2 3 3 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0 1 1 1 1 1 0 1 0 1 1 1 10 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 11 0 0 0 0 0 0 1 0 1 1 0 1 0 1 1 1 1 0 0 0 0 0 1 0 1 1 0 1 0 1 1 1 11 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 1
Input Output A(B),B(A)A B rule 0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3 2 0 2 1 2 2 2 3 3 0 3 1 3 2 3 3 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0 1 1 1 1 1 0 1 0 1 1 1 10 1 0 0 0 0 0 1 0 1 1 0 1 0 1 1 1 1 0 0 0 0 0 1 0 1 1 0 1 0 1 1 1 11 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 11 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 1 0 0 0 1 0 0 0 1 1 0 1 1 1 0 1 1
A B1
2
“Experimente….”
In OutA B A B0 0 0 10 1 1 11 0 0 01 1 1 0
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Random Boolean Networks
If the rules for updating states are unknown
select rules randomly
N nodes ½ pN (N-1) edges
Rule 2
Rule 0
Rule 1
Rule 2
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Kauffman’s NK Boolean Networks
An NK automaton is an autonomous random network of N
Boolean logic elements. Each element has K inputs and
one output. The signals at inputs and outputs take binary
(0 or 1) values. The Boolean elements of the network and
the connections between elements are chosen in a
random manner. There are no external inputs to the
network. The number of elements N is assumed to be
large. S.A. Kauffman, 1969, J Theor Biol. Metabolic Stability and Epigenesis in Randomly Constructed Genetic NetsS. A. Kauffman. The Origins of Order: Self-Organization and Selection in Evolution, Oxford
University Press, New York, 1993. S.A. Kauffman, 2003, PNAS, Random Boolean Network Models and the Yeast Transcriptional Network
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Kauffman’s NK Boolean Networks
An automaton operates in discrete time. The set of the
output signals of the Boolean elements at a given
moment of time characterizes a current state of an
automaton. During an automaton operation, the
sequence of states converges to a cyclic attractor. The
states of an attractor can be considered as a "program" of
an automaton operation. The number of attractors M and
the typical attractor length L are important characteristics
of NK automata.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Kauffman’s Boolean Network
Fundamental question: require metabolic stability and epigenesis the genetic regulatory circuits to be precisely constructed??
Has fortunate evolutionary history selected only nets of highly ordered circuits which alone insure metabolic stability;Or are stability and epigenesis, even in nets of randomly interconnected regulatory circuits, to be expected as the probable consequence of as yet unknown mathematical laws?
Are living things more akin to precisely programmed automata selected by evolution, or to randomly assembled automata…?
Note: cellular differentiation despite identical sets of genes
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Kauffman’s Boolean Network
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Further Properties
K connections: 22K Boolean input functions
Nets are free of external inputs.
Once, connections and rules are selected, they remain constant and the time evolution is deterministic.
Earlier work by Walker and Ashby (1965): same Boolean functions for all genes:Choice of Boolean function affects length of cycles:
“and” yields short cycles,“exclusive or” yields cycles of immense length
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Further Properties: Cycles
State of the net: Row listing the present value of all N elements (0 or 1)
Finite number of states (2N) as system passes along a sequence of states from an arbitrary initial state, it must eventually re-enter a state previously passed a cycle
Cycle length: number of states on a re-enterant cycle of behavior
Cycle of length 1 – equilibrial state
Transient (or run-in) length: number of state between initial states and entering the cycle
Confluent: set of states leading to or being part of a cycle
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Further Properties: Number of Cycles
Such a net must contain at least one cycle, it may have more.
Their number can be counted just be releasing the net from different initial states
No state can diverge on to two different states, no state can be on two different cycles
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Further Properties: Number of Cycles
(a) A net of three binary elements, each of which receives inputs from the other two. The Boolean function assigned to each element is shown beside the element. (b) All possible states of the 3-element net are shown in the left 3 x 8 matrix below T. The subsequent state of the net at time T+ 1, shown in the matrix on the right, is derived from the inputs and functions shown in (a). (c) A kimatograph showing the sequence of state transitions leading into a state cycle of length 3. All states lie on one confluent. There are three run-ins to the single state cycle.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Example: Net with N=10
Periodic attractor (yellow)and basin of attraction(cyan)
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Example: Net with N=10
The entire state space of an RBN with 10 nodes. Note: Self connections do not appear so a period-1 attractor appears to have no outputs although each network state must have exactly one output.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Further Properties: Distance
Distance compares two states of the net
Can be defined as the number of genes with different values in two states.
For example N=5: state (00000) and state (00111) differ in the value of three elements
This is used as measure of dissimilarity between
- subsequent states on a transient- subsequent states on a cycle- cycles
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Totally Connected Nets, K=N
Is like random mapping of a finite set of numbers into itself.
Expected length of cycle is N2
E.g. net with N=200 states expected cycle length 2100 ~ 1030
Compare to Hubbel’s age of the universe: 1023
If every transition would take only a second….
Such networks are biologically impossible
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
One Connected Nets, K=1
Either one cycle of length N
Or a number of disconnected cycles for the full systems state cycles lengths are lowest
common multiples of the individual loop lenghts
the state cycle length becomes easily very large
Again biologically not feasible
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Two Connected Nets, K=2
Kauffman studied networks of N= 15, 50, 64,…, 400, 1024, .., 8191
Nets of 1000 elements possess 21000~10300 states
16 Boolean functions
Study of cycle length (surprisingly short)
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Two Connected Nets, K=2: Cycle Length
(a) A histogram of the lengths of state cycles in nets of 400 binary elements which used all 16 Boolean functions of two variables equiprobably. The distribution is skewed toward short cycles. (b) A histogram of the lengths of state cycles in nets of 400 binary elements which used neither tautology nor contradiction, but used the remaining 14Boolean functions of 2 variables equiprobably. The distribution is skewed toward short cycles.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Two Connected Nets, K=2: Cycle Length
Log median cycle length as a function of log N, in nets using all 16 Booleanfunctions of two inputs (all Boolean functions used), and in nets disallowing these two functions(tautology and contradiction not used). The asymptotic slopes are about 0.3 and 0.6.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
K=2: Transient Lengths
A scattergram of run-in length and cycle length in nets of 400 binary elementsusing neither tautology nor contradiction. Run-in length appears uncorelated with cyclelength. A log/log plot was used merely to accommodate the data.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
K=2: Number of Cycles
A histogram of the number of cycles per net in nets of 400 elements using neithertautology nor contradiction, but the remaining Boolean functions of two inputs equiprobably.The median is 10 cycles per net. The distribution is skewed toward few cycles.
Expected number of cycles: 2
N
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
K=2: Activity
After release from an arbitrary initial state:
Number of elements changing their state per state transition decreases
Example: net of 100 elements first step: about 0.4 N elements change exponential decay of this number minimum activity 0 to 0.25 N
On a cycle: 0 to 35 of 100 elements change
most genes are constant during a cycle
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
NoiseOne unit of noise may be introduced by arbitrarily changing the value of a single gene for one time moment.
The system may return to the cycle perturbed or run into a different cycle.
In a net of size N there are just N states which differ from any state in the value of just one gene
Consider a net with several cycles: By perturbing all states on each cycle (distance 1) one obtains a matrix listing all cycles and how often they are reached from another one.
By dividing all cells by the rows totals transition probabilities
The matrix is a Markov chain.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Noise: for the Example
a b
c d
Boolean network
a(t+1) = a(t)
b(t+1) = (not c(t)) and d(t)
c(t+1) = a(t) and b(t)
d(t+1) = not c(t)
Cycle 10000 00010001 01010010 00000011 00000100 00010101 01010110 00000111 0000
Steady state: 0101
Cycle 21000 10011001 11011010 10001011 10001100 10111101 11111110 10101111 1010
Cycle: 1000 1001 1101 1111 1010 1000
C1 C2C1 ¾ ¼C2 ¼ ¾
TransitionMatrix
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Noise
(a) A matrix listing the 30 cycles of one net and the total number of times one unit of perturbation shifted the net from each cycle to each cycle. The system generally returns to the cycle perturbed. Division of the value in each cell of the matrix by the total of its row yields the matrix of transition probabilities between modes of behavior which constitute a Markov chain. The transition probabilities between cycles may be asymmetric.(b) Transitions between cycles in the net shown in (a). The solid arrows are the most probable transition to a cycle other than the cycle perturbed, the dotted arrows are the second most probable. The remaining transitions are not shown. Cycles 2, 7, 5 and 15 form an ergodic set into which the remaining cycles flow. If all the transitions between cycles are included, the ergodic set of cycles becomes: 1, 2, 3, 5, 6, 12, 13, 15, 16. The remainder are transient cycles leading into this single ergodic set-.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Noise
The total number of cycles reached from each cycle after it was perturbed inall possible ways by one unit of noise correlated with the number of cycles in the net beingperturbed. The data is from nets using neither tautology nor contradiction, with N = 191,and 400.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Application to Cell Cycle
Logarithm of cell replication time in minutes against logarithm of estimated number of genes for various single cell organisms and cell types. Solid lines: connects medium replication times of bacteria, protozoa, chicken, mouse, dog, and man.
Humboldt-Universität
Zu Berlin
Edda Klipp, Humboldt-Universität zu Berlin
Application to Cellular Differentiation
The logarithm of the number of cell types is plotted against the logarithm of the estimated number of genes per cell, and the logarithm of the median number of state cycles is plotted against logarithm N. The observed and theoretical slopes are about 0.5. Scale: 2 x lo6 genes per cell = 6 x 10-12g DNA per cell.