(AV17698) Algorithmic Thinking in Biology

Good evening. My name is Steve Martin. I’m a faculty member in materials science
and engineering, and tonight it’s my great pleasure to welcome you to our second
Sigma Xi lecture for the fall of the 2010 academic year. We have another two
semester talks next spring. Each fall and spring semester we have a national
honors lecture and a local honors lecture, and so it’s a great privilege for us
as Sigma Xi to offer these seminars and to engage the greater research community
at Iowa State. Sigma Xi is a national and international organization focused on
research, and it cuts across all disciplines of research, everything from, of
course, engineering to biology, but also information and history. So it’s a very,
very broad organization, a wonderful organization, and there are many benefits,
not the least of which is you get to come to great lectures like tonight’s. So
keep aware of the advertisements. Our next seminar will be in February, and the
announcements will be out. It will also be here in the Memorial Union, definitely
on a third, typically at about 8 o’clock. If you’re interested in more
information about Sigma Xi, and I suppose there are a few members of the society
here, Sigma Xi is open to, of course, researchers of all ages. So please look at
Sigma Xi. It’s right on the ISU homepage, and you can look into our activities
and becoming a member of Sigma Xi. Of course, for students there are very low,
reduced rates for membership. I’m the president of the society, and Birt is vice
president of our organization, the ISU chapter of Sigma Xi. Diane, Professor
Birt, is a distinguished professor in the Department of Food Science and Human
Nutrition. She’s also the director of the Center for Research on Botanical
Dietary Supplements, and so she also has a great level of research experience in
food and food supplements and dietary health. And so with that, Diane will
introduce our speaker. So, Diane, please.
Thank you very much, Steve. It’s a great
honor for me to introduce Dr. Vasant Honavar. He has a Bachelor of Engineering in
electronics engineering from Bangalore, India. He has a master’s in electrical
and computer engineering from Drexel and a master’s and PhD in computer science
from the University of Wisconsin-Madison. Here at Iowa State he is a professor of
computer science. He founded, in 1990 at Iowa State, the Artificial Intelligence
Research Laboratory, and he currently directs that, and he also directs the
Center for Computational Intelligence, Learning, and Discovery, which was founded
in 2004. There is a very long list of research interests, and I think that I will
probably not go through the whole list: obviously artificial intelligence,
machine learning, knowledge representation, bioinformatics, computational
biology, and the list goes on. He’s published over 200 research articles in
refereed journals, conferences, and books, and co-edited six books. With that
introduction, we’ll leave it to Professor Honavar to speak to us about
algorithmic thinking in biology.
So most of my work is done with graduate
students, so you know some of the people that I work with. So in today’s talk I
want to argue that computer science is not just about computers, and so the first
part of the talk will try to convince you of that, and then I’ll give a few
examples of the consequences of this way of looking at other sciences, in
particular biology. Okay, so let’s start with something to get us warmed up. So
this is something that appeared in a biology journal a few years ago: can a
biologist fix a radio? So how would you go about fixing a radio? Well, here’s a
possible way to go about fixing the radio: open up the radio, find all the
different things in it, then classify the things that you see, then maybe even
try changing or replacing some component and see what happens. If it’s still
playing the music, then that component, or the color of that component, was
probably not critical. And I would not dare to say this if it wasn’t for the fact
that this was actually written by a biologist. Otherwise it would seem a little
odd coming from a computer scientist.
Rutherford said all science is either stamp collecting or physics, and one might say biology is a bit like
physics before Newton. Physics before Newton was a descriptive science. Newton
invented, or co-invented, calculus: for the first time, a way to talk about
things like rate of change, and that was the language that allowed that
particular thing to happen. So biology is at a stage where physics was at about
that time, because biology has been a descriptive science. You collect, catalog,
and describe the things that you see, biological phenomena, and this goes from
the early days, when taxonomists went and cataloged species and their
relationships, to more recent times, with the sequences that we have. But unlike
physics, biology has mostly been a descriptive science, and advances in biology
are limited in part by instruments of observation. This is critical for any
science: you have to be able to first observe, then describe things, and then try
to build predictive models, and there were limitations in all of these. But this
has been changing in the last several decades. Now we have at least the
instrumentation that allows us to gather lots of data. So if you want to
transform biology from stamp collecting to physics, from a descriptive science
into a predictive science, then you need methods for constructing models from
data, inferring consequences, generating hypotheses that you can test against the
data that you have, and then designing experiments, and so on. And I would argue
that here computation plays a central role.
So let me say a little bit about computation. Computation: we started out with Hilbert presenting
this decision problem, and many of his problems still occupy mathematicians. But
one of the problems that he posed was: is there an effective procedure for
deciding ... a very simple question. And to make a long story short, Turing
invented a little gadget called the Turing machine, and then basically he
suggested that computation, or these effective procedures, are for all practical
purposes what this Turing machine does. And what it does is essentially transform
a string of letters into another string. Other people tried to come up with
alternative formalizations of this, and they were proven to be equivalent, so
there’s something fundamental about computation. This led to the Church-Turing
thesis, which in sort of layman’s terms ... Okay, so if you take this seriously,
then our theories and models ... take this very seriously, and it works under
this assumption that computation is for cognition ... and computational biology
also takes this seriously. So by analogy I can say the same for computational
biology. So that’s sort of the main idea. Computational biology, then, rests on
this premise that computation provides for biology what calculus provides for
physics, as an approximate analogy. So an implication of this is that the
theories are about information processing. So
for example, if you look at the genome ... Organisms in general, you can think of
them, at a certain level of abstraction, as beautiful self-reproducing
information ... They acquire information through learning, adaptation, and
evolution, and they transmit information. So biology is fundamentally an
information science, and from this point of view the grand challenge of biology
is this: given code that’s written in some unknown programming language, our
challenge is to determine the syntax and the semantics of the language. In other
words, what does this program do? Just to give a different example, suppose a
Martian came down and ... So figuring that out is like figuring out the syntax
and semantics of an unknown language, essentially by observing some things
written in that language. So, just some terminology. When we talk
about the genome, basically we refer to the entire sequence, which is ... And
just some basic biology for this audience: each cell has identical DNA ... but if
you think of this as a program, then the program ... So cellular differentiation
is essentially a response to the signals that control and orchestrate this
developmental program. So trying to understand the response of this program to
different conditions is a huge problem. So that’s the program of life. The
transcriptome refers to the full complement of RNA, which then gets translated
into proteins, and the proteome is the full complement of proteins that is
produced. And then there’s this term, interactome, which refers to the full
complement of molecular interactions ... and the techniques that we use for that
are no different from what you would use to analyze a social network or some
other type of network. So one of the challenges is relating these different
levels. For example, the program at this level, how does it get translated into
this? And linking up these different levels of abstraction. So if you take
this approach seriously then we will have a theory so he and it goes through this process
Kali honest linear sequence know what do you
know much about shape today so presumably the information
So we have a theory of protein folding when we have an algorithm, a program, that
takes the linear sequence of amino acids ... And the same sort of story can be
told about other problems in biology. You can take a similar sort of line of
reasoning, and this has affected other disciplines. I’m going to skip this,
because the focus here is on biology. So these models, they’re not necessarily
reality, but they describe certain aspects of it. A model is a cartoon ... So for
any model that we develop, it’s not reasonable to ask if the model is true. This
is not about what’s true; it’s about whether this model ... So in the case of
biology, you can come up with models that describe basic entities like DNA, RNA,
genes, proteins, and so on, the properties of these, and the relationships
between them, so interactions. And depending on exactly what we are interested
in, you can come up with different types of models. These models take the form
of, essentially, programs. So this is sort of the big picture, but one could argue that computational
models are what allow you to get a grasp on reality. So, just to wrap up this
section of the talk, the main idea is: you understand a phenomenon when we can
... So theories take the form of algorithms, and this has led in recent years to
the birth of many different disciplines, and they all have a computational ...
Okay. And this also has implications for research. Again, if you take this
seriously, any literate person has to know something about computing, not merely
the use of computers but the ideas, the concepts. Okay, so let me give you some
examples, starting with something that happened about the same time as ... and
this was brains, neurons. McCulloch was, you know, a physiologist, and Pitts was
a mathematician who heard about computing, and this is a very famous seminal
paper, a logical calculus of neuronal activity. And so here’s the setup. This is
probably familiar if you took a biology course in high school ... and that
becomes the stream of pulses that goes down the ... By the way, the physics of
this is exactly the
same as the physics of cables. So this is obviously a very complicated system.
What’s the cartoon? It’s this one. You can model the different inputs to this
neuron by some variables, and the interactions, the synapses, the contacts here,
by some weights, some numbers, and the fact that it is accepting signals ... So
you can think of it as a switch, a very simple gadget. The output is 1 if the
weighted sum of the inputs is greater than 0, and it’s 0, or minus 1, otherwise.
It’s a two-state device. Okay. For one thing, it has some interesting connection
with geometry. If you have this sort of linear summation, if it’s greater than
zero something happens; otherwise, silence. So what if it is equal to zero? By
setting this equal to zero I get essentially an equation for a line in a plane,
and for any input that falls on one side of the line this device produces an
output of 1, and on the other side ... So if you think of these points in the
space as modeling inputs, so it could be a signal that you receive from your
visual system, this can be used as a pattern classifier, right? It classifies its
inputs into two categories. If the line is rotated, if I change the weights, the
classification changes. So there is some interesting connection with geometry and
also pattern classification: we can use this as a classifier. It also has some
interesting connection with logic. By choosing the weights appropriately, you can
get it to compute some simple logical functions, and it turns out that by
choosing the weights appropriately you can get it to compute ANDs, ORs, and NOTs.
If you have had any exposure to boolean logic, the result is that if you can
compute ANDs, ORs, and NOTs, you can compute any boolean function: for any
boolean function f there is a network of these units that computes it. And it is
a small step from there to building finite state machines. These are machines
that receive inputs and ... and that’s exactly what we have in these boxes here,
computers. So all we need is that, and if you give a finite state machine some
space to write to and read from, then you get Turing machines, general-purpose
computers. So this humble gadget ...
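The two-state unit and the AND/OR/NOT construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the formulation from the original paper: the function names, the explicit `bias` term (which just shifts the threshold away from zero), and the particular weight values are assumptions chosen for the example, and many other weight choices work equally well.

```python
from itertools import product

def threshold_unit(weights, bias, inputs):
    """Two-state unit: 1 if bias + weighted sum of inputs is greater than 0, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

# Illustrative weight choices (assumptions; many other values work).
def and_gate(a, b):
    return threshold_unit([1, 1], -1.5, [a, b])

def or_gate(a, b):
    return threshold_unit([1, 1], -0.5, [a, b])

def not_gate(a):
    return threshold_unit([-1], 0.5, [a])

def xor_gate(a, b):
    # No single unit computes XOR, but a small network of units does.
    return and_gate(or_gate(a, b), not_gate(and_gate(a, b)))

if __name__ == "__main__":
    # Print the truth tables for AND, OR, and XOR.
    for a, b in product([0, 1], repeat=2):
        print(a, b, and_gate(a, b), or_gate(a, b), xor_gate(a, b))
```

The `xor_gate` function is the point of the exercise: once AND, OR, and NOT are available as single units, any boolean function can be assembled as a network of them.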
You know, all of this came from just taking this neuron. So networks of these
units can compute arbitrary boolean functions. They came up with a very simple
learning algorithm that basically modifies the weights, and it turns out that
after a finite number of iterations you get the inputs correctly classified.
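A weight-modifying learning rule of the kind just mentioned can be sketched as follows. This is the classic perceptron update, used here as an assumption about which rule is meant; the function names, the learning rate, and the toy OR data set are illustrative only. The guarantee of finishing in a finite number of iterations holds when the two classes can be separated by a line (or hyperplane).

```python
def train_perceptron(samples, labels, max_epochs=100, rate=1.0):
    """Adjust weights and bias until every sample is classified correctly
    (or max_epochs is reached). Labels are 0 or 1."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(samples, labels):
            output = predict(weights, bias, x)
            delta = target - output  # +1, -1, or 0 when already correct
            if delta != 0:
                # Move the separating line toward (or away from) the sample.
                weights = [w + rate * delta * xi for w, xi in zip(weights, x)]
                bias += rate * delta
                errors += 1
        if errors == 0:
            break
    return weights, bias

def predict(weights, bias, x):
    """Threshold unit: 1 if the weighted sum plus bias is greater than 0, else 0."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0

# Toy, linearly separable data: the OR function.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

After training on this toy set, `predictions` matches `labels`. On data that is not linearly separable, the loop simply stops at `max_epochs` without reaching zero errors.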
Anytime you have a set of data samples ... you can do this. Then again, this was
all fifty years ago, and there are much more sophisticated learning algorithms
now that can work with a more complicated set of learning problems, but the basic
idea is the same ... Okay, so that’s one example, and if the situation gets more
complicated ... this has really led to a theory of learning machines and far more
sophisticated algorithms that can learn from data.
So that’s one example of a computational model
in biology, in the very simple context of looking at neurons, and that leads ... well, you have computational models
answer it all selected and I got you something about vision and
guidance many many start out like this one simply accepted taking yeah yes now let me tell you if you have is
learning machines, where could you apply them? Well, you can apply them to a lot
of different scenarios in biology, and so here are some examples. One of the
basic problems is ... molecules function by interacting ...
So one of the basic problems is: can you predict where the interactions occur? If you could understand how these
interactions work, then you can design better drugs. So one of the questions is:
how can we predict interfaces from sequences or structures? There are many
techniques, but one of the ways in which you can approach this is by using
machine learning. You have a data set of complexes, and from this you can try to
learn the general rules that would allow you to predict these interfaces. So this
is an example of such a study. In this case it was protein-RNA interfaces. You
can go to the protein complexes and from that ... so you have a data set of
protein-RNA interfaces. From this, the task is to learn what makes ... So that’s
the problem.
So here’s again a statement of the problem. You’re given a protein; you don’t
necessarily have a structure of the complex. You want to predict which amino
acids ... So this is a 3D structure, but what you get is the sequence ... which
amino acids participate in some protein-protein interactions. So the guiding
hypothesis is that this is reflected in the local sequence: the signal is there
in the sequence. So if you are a machine learning person, you would approach it
by generating data sets from known complexes, and then you use one of these
machine learning algorithms, maybe something that’s more sophisticated than the
simple one that I told you about, and then build a classifier. If it works, you
can use it to classify interfaces. Now there are many questions that arise in
this. For each one of these amino acids, you take some neighboring region, and
that would be the input. Or you could take the structure and get some structural
neighborhood, try to predict from it, and so on. So suppose you do this. Again,
skipping all this to give you the result ... You have to keep in mind that even
if you don’t quite get all the positions right, if you know approximately where
the interface is, that’s already pretty good for many of the targeting
experiments. You know roughly where the interface is, so you can try to design a
drug molecule that binds to that region. Okay.
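The idea of using a neighboring region around each amino acid as the classifier’s input can be sketched as a sliding window over the sequence. This is only an illustration under stated assumptions: the window half-width, the padding character, the one-hot encoding, and the toy sequence and labels below are all made up for the example and are not taken from the study described in the talk.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes

def sequence_windows(sequence, half_width=2, pad="-"):
    """Return one window per residue, centered on that residue.
    Windows near the ends are padded so all have the same length."""
    padded = pad * half_width + sequence + pad * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(sequence))]

def one_hot(window):
    """Encode a window as a flat 0/1 vector, 20 entries per position.
    Padding positions are encoded as all zeros."""
    features = []
    for residue in window:
        features.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return features

# Made-up sequence and made-up labels (1 = interface residue, 0 = not).
sequence = "MKRRWA"
labels = [0, 1, 1, 1, 0, 0]

windows = sequence_windows(sequence)
training_examples = [(one_hot(w), y) for w, y in zip(windows, labels)]
```

Each `(features, label)` pair in `training_examples` is then one training sample for whatever classifier is used; the target residue is always the middle position of its window.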
You can predict this. So what? Here’s an example of what you can do with this
information. If you build these predictors, do they really give you some
biological insights that tell us something about biology, or is it just an
exercise in ...? Well, here’s actually a study where this was an attempt to
predict the binding sites between an RNA of the horse version of the HIV virus
and the proteins that it binds to. And the predictions match quite reasonably
well, and the predictions have been confirmed by the experiments. It means that
these kinds of techniques can be used, first of all, to focus experiments,
because experiments are expensive: nobody can mutate every possible site, so the
predictions can guide the experiments. So these sorts of methods are quite
useful. In this case the actual experimental work was done here in the veterinary
medicine college, and the predictions were borne out by experiments, and so
that’s a good thing.
So now one of the other problems is protein-protein docking: we take the
molecules and try to figure out the energetically favorable conformation, and
this is a very hard computational problem, because you’ll have to try every
possible conformation, all the combinations, right? So here again, if you can
predict interfaces, you can try to reduce the ... So one of the things that you
can do is, if you predict the interfaces, then for these docking programs ... the
interfaces that exist in ... And it turns out that simply predicting the
interfaces can improve the quality of results that are obtained from the docking
program. If you just took the ranking based on energy considerations and you
asked for the top conformation ... and this is because of many reasons. First of
all, we don’t have an accurate energy function, so we don’t quite understand all
the physics there. And this is what you will see: the top conformation may not
necessarily ... but on the other hand, if you use these predicted interfaces, the
correlation between the rank ... the top-ranking conformation ... There are many
other ways of determining whether ... you can look at the RMS deviation from
actual structures and many other criteria. This simple technique of using
predicted interfaces to rank the conformations improves the quality of the output
which is produced by docking, and docking is a basic task that’s used ... So,
kind of just wrapping this part of
the talk up: I told you a little bit about this cartoon model of the neuron.
Those approaches, and machine learning approaches in general, can be used in
biological applications to predict what makes an interface ... And this is not
just an exercise in playing with computers; you actually get something useful.
So in recent years there are experimental techniques that, for example, measure
in a cell or a tissue the expression of all the genes. You can also measure the
amount of proteins. So you can take these different techniques and you can
actually get measurements about ... and you can also see what happens if you
perturb the system. So you can take the data and you can start building models
from this. So, looking at individual genes or individual proteins, you can start
building models of how they interact, at the level of transcription, at the level
of protein-protein interaction, and so on.
And there are many different types of data. For example, there is data that tells you which proteins interact with which other
proteins. This is noisy data, but nevertheless it’s quite useful. You can use
microarrays; they don’t tell you what interacts with what, but they tell you
something about the activity of the various genes ... And then you can do more
complicated things where you can measure kinetic constants and so on. If you look
at a system at the smallest scale, with very detailed information, kinetic
constants, you can actually build differential equation models, and there is this
full spectrum in between. So here are some examples of network models. You take
this data and try to generate different types of models. For example, in one, the
nodes are genes, and the links represent the fact that they interact, and if you
also put weights on the edges, then they might indicate the strength of
interaction, and you can generate networks like this. So what can you do with
this? Well, it’s a graph, right?
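To make the graph view concrete, here is a minimal sketch of a weighted interaction network and of one simple thing a graph representation allows, namely comparing the edge sets of two networks. Everything specific here is an assumption for illustration: the gene names, the weights, the threshold, and the choice of a plain edge-set comparison (real network comparison methods are considerably more involved).

```python
def build_network(weighted_pairs, threshold=0.0):
    """Build an undirected network as {frozenset({gene_a, gene_b}): weight},
    keeping only edges whose absolute weight exceeds the threshold."""
    network = {}
    for gene_a, gene_b, weight in weighted_pairs:
        if abs(weight) > threshold:
            network[frozenset((gene_a, gene_b))] = weight
    return network

def degree(network, gene):
    """Number of edges that touch the given gene."""
    return sum(1 for edge in network if gene in edge)

def compare_networks(net_a, net_b):
    """Split edges into those shared by both networks and those unique to each."""
    edges_a, edges_b = set(net_a), set(net_b)
    return {
        "shared": edges_a & edges_b,
        "only_a": edges_a - edges_b,
        "only_b": edges_b - edges_a,
    }

# Made-up example: weights stand in for interaction strength.
network_one = build_network(
    [("g1", "g2", 0.9), ("g2", "g3", 0.7), ("g1", "g3", 0.1)], threshold=0.5)
network_two = build_network(
    [("g1", "g2", 0.8), ("g3", "g4", 0.95)], threshold=0.5)

difference = compare_networks(network_one, network_two)
```

Edges that appear in only one of the two networks are the kind of difference that could be turned into a specific, testable hypothesis.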
So if I have several different species from which I could generate these graphs,
then for one thing I can essentially do a graph comparison to figure out what
makes them ... Or take expression networks, where the links represent
correlations between expression levels of these genes: you can have a graph from
a healthy tissue and a graph from a tissue that has cancer, and by comparing these graphs you can ... and that can then be used to generate
specific hypotheses, more focused experiments, to try to figure out
mechanistically what’s going on. And this leads to a bunch of different
computational problems. For example, essentially this is a ... and from a purely
computational perspective, if you have no other constraints, it’s known to be a
hard problem, right? But in practice, in biology, there are
several additional here is another interaction and this gives so this gives
you this step of models and then if you you can ask the question about whether
some chain could possibly so the computational questions that
competition rate surpassing questions have implications in biology they incan
if there’s more than enough then you can figure out that and then if you did
comparative analysis again you can figure out what’s common what’s
different across different conditions itself and you can get some combination
of these models by experiments where let’s say one of those
genes’ expression and see what happens. So what we have here is, starting with
the simplest, boolean networks, and then models which have a little bit more
information, so you have this hierarchy of increasingly sophisticated models. But
anytime you do modeling, the key thing about models is that you want ... it’s as
accurate as the purpose for which it is being constructed requires, and that
holds true for the models as well. Okay, so abstraction is important ... Okay, so
I think I’m going to skip the rest of this, which is essentially details. So let
me just sort of wrap up, since we are out of time. Computer science is the
science of information processing. Computer science has very little to do with
computers ... it is about information processing, as long as it is information
processing, whether it takes place in brains or in cells or in societies. So
computation provides the means for describing information processing, and the
Church-Turing thesis ... anything that is describable can be described by a
computer program. This means that theories in modern science take the form of
computer programs, and
computational biology is an example of that so we use computational models mr. because and this is a team that’s happening for example if you looked at
the New York Times a couple of days ago there’s also that went to Facebook and Sonia now has
social network data service it’s a thinking about top
questions in the sciences, biology or social science.
There’s been a tremendous increase, of course, in the size and the computational
power of computers, maybe not in the power of algorithms or models. What role has
the increased capability and speed of the supercomputer, as we call it today, had
in impacting what you can practically do in the systems that you study?
So I stayed away from supercomputers and high performance computing, but what I
wanted to emphasize here is thinking about problems using ideas from computing.
The complementary side to that: as the problems become bigger, it is very
important ... a bigger computer is helpful because these are computationally hard
problems, but not everything is going to be solved just by having bigger computers.
Yes? Is it certain that we will not have any
analytical description of biological processes so we have to his advanced
research news competition procedures whether I can go to the
music’s that said we got cases study how close form solutions
that’s what you but there are also many so those models can be arbitrarily
I’ve seen this type doesn’t mean that you cannot understand
it so binary protein
the secondary tertiary block marigold I don’t work very well thing that’s
promise being made but it’s not so so this problem we are hosted Humanzee
has there’s seven of them but progress is being made so you can
say that you can predict and often
yeah techniques
structures and in combination with it doesn’t say yeah – already I think so
these predictions are one of the big challenges in web design is you generate
a whole bunch of potential candidates and we know that I’m an investor
so safe in that standpoint for prioritizing potential drug targets they maybe had this one
if you does that engage them but I want missiny what I was preventing
this kind of so many use them as essentially full
screen so speeding provision yes as
