Rod Nibbe (Case Western) – Proteomics-first approach for discovering sub-network targets in cancer
Move beyond single targets; the aim is to identify concerted effects: combinations or subnetworks of proteins characterizing the cancer. Understanding the pathophysiology of a late-stage colon cancer phenotype, not so much as a classifier but as a means to identify novel therapeutic targets.
Quantitative subnetworks based on (legacy) PPI data with additional microarray data. Took tissue biopsies from a large patient cohort, seeded the search in large curated PPI databases with the identified proteomics targets, and scored the resulting subnetworks.
Top-down proteomics approach: paired normal / tumour biopsies from 12 patients. Differential image analysis to identify significantly changing proteins, MS/MS of excised spots results in lists of significant proteins associated with the phenotype. (As few as 6 patients seem to be enough to capture the majority of proteins). Highly significant both due to stringency in image analysis and database search.
Question then is: do the targets reside in interaction networks (using MetaCore)? Scored significant networks, pruned to core network members, expanded by one step for functional inference.
[Looks like a fairly standard MetaCore Meta-analysis to me. Not quite sure how they scored the subnetworks with the array data, need to check the paper.]
Automated network scoring: the initial small set of target proteins was expanded drastically by the MetaCore network analysis, so additional information is required. Binned expression information was added to the graphs and scored using mutual information. Null hypothesis is that random network activity does not discriminate between the normal and disease state.
Significant signatures (four out of 13 subnetworks): none of the original large networks was significant using MI. An exhaustive subnetwork search results in four subnetworks that are significant with regard to two different background models. Subnetworks seem to be making biological / clinical sense; they are exploring the role of some proteins with perturbation experiments in cell lines.
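To make the scoring idea concrete for myself, here is a minimal sketch. It is not the authors' implementation: I am assuming that subnetwork "activity" is the mean expression of its member genes per sample, that binning is equal-width, and that the null model is a simple label permutation (the talk mentioned two background models that I did not capture). All function names and the toy data are made up.

```python
import math
import random
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def bin_values(values, n_bins=3):
    """Equal-width binning of continuous values into integer bin indices."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant series
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def subnetwork_score(expr, genes, labels, n_bins=3):
    """expr maps gene -> per-sample expression; activity is the mean over genes (assumption)."""
    activity = [sum(expr[g][i] for g in genes) / len(genes) for i in range(len(labels))]
    return mutual_information(bin_values(activity, n_bins), labels)

def permutation_p_value(expr, genes, labels, n_perm=1000, seed=0):
    """Fraction of label shufflings scoring at least as high as the real labels."""
    rng = random.Random(seed)
    observed = subnetwork_score(expr, genes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if subnetwork_score(expr, genes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data (made up): two genes, three normal (0) and three tumour (1) samples.
expr = {"A": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9], "B": [2.0, 2.1, 1.9, 4.0, 4.2, 3.8]}
labels = [0, 0, 0, 1, 1, 1]
score = subnetwork_score(expr, ["A", "B"], labels)
p_value = permutation_p_value(expr, ["A", "B"], labels, n_perm=200)
print(score, p_value)
```

With only six samples the permutation p-value cannot get very small, which fits the point that a handful of seed proteins is not enough and the expression data has to carry the statistics.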
Adam Smith (Wisconsin) – Clustered Alignments of Gene-Expression Time Series Data
Gene expression levels change over time in different ways for different treatments. Idea is to perform similarity searches (BLAST for time series profiles). Usually restricted to discrete measurements of a continuous attribute, which requires reconstruction or interpolation. Time warping/alignment to maximize the similarity of the compared points helps.
Warpspace diagrams are an alternative way of showing alignment between two time series. Global alignments mean beginning and end correspond; local or shorted alignments are the focus here and are required if one time series is more ‘evolved’ than the other.
- SCOW: efficient method for sparse time series
- Clustered alignments: subsets of genes sharing time series alignments
Known algorithm (1978) for dynamic time warping (DTW): minimizes the sum of Euclidean distances. Parametric time warping (2005, Eilers, PTW) fits an alignment function from a given family. More limited in expressiveness than DTW.
Segment-based warping splits warps into individually scored segments; it sits in a happy intermediate between DTW and PTW, but is very slow (n^5 complexity). Correlation-optimized warping (COW, Nielsen 1998) looks for good ‘knots’ (one dimensional). SCOW searches in each dimension independently until convergence.
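For reference, the textbook DTW recurrence fits in a few lines. This is a generic sketch of classic DTW on one-dimensional series, not the SCOW method from the talk; the function name and the toy series are mine.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping between two 1-D series."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # Euclidean distance in one dimension
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b point is reused
                cost[i][j - 1],      # b advances, a point is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# A series and a "stretched" copy of it align perfectly under warping.
stretched = dtw_distance([0, 1, 2, 3], [0, 1, 1, 2, 3])
offset = dtw_distance([0, 0], [1, 1])
print(stretched, offset)
```

Note that this is a global alignment: the first and last points of both series are forced to match, which is exactly the restriction the shorted alignments above are meant to lift.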
Evaluation with EDGE toxicology database: 11 treatments, 6-96 hours, 3 observed times for each data point, well-defined zero time point, 216 genes. Take a gene and time series subset (removing time points), distort the time series, then try to find the original entry in the test data set.
Cluster gene time series to get a regularization effect; algorithm based on k-means. Each cluster is defined by an alignment: initial clusters are calculated by a greedy algorithm, then assign genes to clusters, re-calculate the alignment for each cluster, and iterate until convergence.
Does clustering help? Test with Mop3 knockout (circadian cycle gene); five clusters with exemplar genes identify genes with a strong phase shift. Gene activity ‘sped up’ in the knockout.
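The assign / re-fit loop can be sketched as follows. This is a heavy simplification on my part: each cluster's "alignment" is just an integer time shift picked from a candidate list, whereas the talk used proper warps, and I pass in the initial shifts instead of computing them greedily. Names and toy data are hypothetical.

```python
def shift_error(ref, query, shift):
    """Mean squared error between ref[t] and query[t + shift] on the overlapping part."""
    pairs = [(ref[t], query[t + shift]) for t in range(len(ref))
             if 0 <= t + shift < len(query)]
    if not pairs:
        return float("inf")  # shift too large, no overlap left
    return sum((r - q) ** 2 for r, q in pairs) / len(pairs)

def cluster_alignments(genes, init_shifts, candidate_shifts, max_iter=20):
    """genes maps name -> (reference series, query series). Returns (shifts, assignment)."""
    shifts = list(init_shifts)
    assignment = {}
    for _ in range(max_iter):
        # Assignment step: each gene joins the cluster whose alignment fits it best.
        new_assignment = {
            name: min(range(len(shifts)), key=lambda k: shift_error(ref, qry, shifts[k]))
            for name, (ref, qry) in genes.items()
        }
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Update step: re-fit each cluster's alignment to its current members.
        for k in range(len(shifts)):
            members = [genes[n] for n, c in assignment.items() if c == k]
            if members:
                shifts[k] = min(
                    candidate_shifts,
                    key=lambda s: sum(shift_error(r, q, s) for r, q in members),
                )
    return shifts, assignment

# Toy data (made up): g1 and g2 are delayed by one time step, g3 and g4 are not.
genes = {
    "g1": ([0, 1, 4, 9, 4, 1, 0, 0], [0, 0, 1, 4, 9, 4, 1, 0]),
    "g2": ([5, 3, 1, 0, 1, 3, 5, 5], [5, 5, 3, 1, 0, 1, 3, 5]),
    "g3": ([1, 2, 3, 4, 3, 2, 1, 0], [1, 2, 3, 4, 3, 2, 1, 0]),
    "g4": ([0, 0, 1, 3, 6, 3, 1, 0], [0, 0, 1, 3, 6, 3, 1, 0]),
}
shifts, assignment = cluster_alignments(genes, [-1, 2], [-2, -1, 0, 1, 2])
print(shifts, assignment)
```

The regularization effect is visible even here: a single noisy gene could prefer a wrong shift on its own, but the cluster-level alignment is fitted to all members together.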
Duygu Ucar (Ohio State) – Predicting Functionality of Protein-DNA Interactions by Integrating Diverse Evidence
Detecting interaction events between TFs and targets is crucial. Binding can be captured, but does not need to be functional. The semantics of an interaction are important but difficult to characterize (context dependency of the TF binding event); half of the binding events have no effect on gene expression. Binding also changes depending on external conditions / stimuli.
Prediction by integrating complementary datasets. Estimating TF binding and gene expression response based on three data sets (ChIP-chip, PSSM and nucleosome occupancy). Binding is considered functional if it changes gene expression (determined from microarray data).
Binding estimate: yeast ChIP-chip data, p-values indicate binding strength. -800/+200 bp PSSM binding scores and nucleosome depletion around active promoters are combined into a probabilistic Bayesian model. The integrated model outperforms the three individual data sets in a five-fold cross validation.
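I did not catch the exact form of the Bayesian model, so the following is only the simplest version I can think of: a naive-Bayes style combination in which each evidence source contributes an independent likelihood ratio. The independence assumption, the function name and the numbers are all mine, not from the talk.

```python
import math

def posterior_binding(prior, likelihood_ratios):
    """Posterior P(bound | evidence) assuming independent evidence sources.

    posterior odds = prior odds * product of per-source likelihood ratios,
    where each ratio is P(evidence | bound) / P(evidence | not bound).
    """
    log_odds = math.log(prior / (1 - prior))
    log_odds += sum(math.log(lr) for lr in likelihood_ratios)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical likelihood ratios for ChIP-chip, PSSM score and nucleosome depletion.
uninformative = posterior_binding(0.5, [1.0, 1.0, 1.0])
combined = posterior_binding(0.1, [3.0, 2.0, 1.5])
print(uninformative, combined)
```

The point of the sketch is that three individually weak sources can move a low prior a long way once multiplied together, which is one plausible reading of why the integrated model beats each data set alone.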
Used two data sets to correlate changes in binding events with changes in gene expression (YPD regular growth condition and stress). Distinguish between putatively functional and non-functional binding events
Functional binding rate can change under different stress conditions (GCN4 as an example). Function changes based on distance to promoter, orientation, presence/absence of co-factors. Multivariate random forest based feature selection to identify important factors for each condition-TF pair as well as to identify TF-TF pairs. Looks very interesting!
Saharon Rosset (Tel Aviv) – Grouped Graphical Granger Modeling for Gene Expression Regulatory Networks Discovery
Temporal causal modeling to uncover temporal relationships (time lags) between gene expression events and identify the causal relationships (directionality). Granger causality from Clive Granger (economics, Nobel Prize). X ‘Granger causes’ Y if lagged values of X are important to explain Y, usually via a linear regression model.
Combine with graphical models to provide a methodology for causal modeling of temporal data; select the variables significantly affecting Y given time lag d; does one time series cause another as a whole?
Followed by an overview of LASSO, Adaptive LASSO and their contribution (Group LASSO). Got to go back to the paper to really understand the underlying difference; if I got this right it is about how to handle the lag parameter. Different lag settings give significantly different results, but consistency increases with the number of time points.
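The textbook pairwise version of the test is easy to sketch: fit Y on its own lags, then on its own lags plus the lags of X, and compare the residuals with an F statistic. This is the standard Granger test, not the grouped graphical method from the talk; the helper names and the simulated data are mine.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def rss(X, y):
    """Residual sum of squares of the least-squares fit y ~ X (via normal equations)."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2 for row, yi in zip(X, y))

def granger_f(x, y, lag=1):
    """F statistic for 'x Granger-causes y' using the given number of lags."""
    rows = range(lag, len(y))
    target = [y[t] for t in rows]
    # Restricted model: intercept plus y's own past.
    restricted = [[1.0] + [y[t - k] for k in range(1, lag + 1)] for t in rows]
    # Full model: additionally the past of x.
    full = [r + [x[t - k] for k in range(1, lag + 1)] for r, t in zip(restricted, rows)]
    rss_r, rss_f = rss(restricted, target), rss(full, target)
    dof = len(target) - len(full[0])
    return ((rss_r - rss_f) / lag) / (rss_f / dof)

# Simulated data: y follows x with a delay of one time step.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 200)]
forward = granger_f(x, y)   # x -> y, expected to be large
backward = granger_f(y, x)  # y -> x, expected to be small
print(forward, backward)
```

The asymmetry between the two directions is what gives the directionality; with the short time series typical for expression data the test has far less power than in this 200-point simulation.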
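As far as I understand the difference, plain LASSO shrinks every coefficient on its own, while group LASSO shrinks a whole group at once, so all lags of one gene are kept or dropped together. The two shrinkage operators below are the standard ones; treating "all lags of one gene" as the group is my reading of the talk, and the function names are mine.

```python
import math

def soft_threshold(coefs, threshold):
    """LASSO-style shrinkage: each coefficient is shrunk (or zeroed) independently."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coefs]

def group_soft_threshold(coefs, threshold):
    """Group-LASSO-style shrinkage: the group is zeroed or kept as a whole."""
    norm = math.sqrt(sum(c * c for c in coefs))
    if norm <= threshold:
        return [0.0] * len(coefs)
    scale = 1 - threshold / norm
    return [scale * c for c in coefs]

# Coefficients of one candidate regulator at lags 1 and 2 (made-up numbers).
lags = [3.0, 0.5]
print(soft_threshold(lags, 1.0))        # lag 2 is dropped, lag 1 survives
print(group_soft_threshold(lags, 1.0))  # both lags survive, shrunk together
```

That would explain the lag remark above: the question "does one time series cause another as a whole" is answered per group, not per individual lag.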
Bootstrap sampling for small networks (nine genes) results in an average confidence of 80%; five of the top six links are either in BioGRID or in the literature.
Keynote: Tomaso Poggio – Computational Neuroscience: Models of the Visual System
Computational neuroscience with respect to (machine) learning and (computer) vision is starting to provide new ideas and approaches. Better connections between the biology / data and modeling. CBCL face detection research started about 15 years ago, now in most modern cameras. Ten years from person / motion detection system to deployment in expensive cars, but all based on pure engineering, no input from neuroscience.
Animals (and humans) can learn from a small number of examples. Supervised learning algorithms (regularization techniques) from classical learning theory, along with kernel machines (SVMs, radial basis functions), require a higher number of samples. Kernel machines correspond to shallow networks with a small number of layers and are not ideal because of their high sample complexity. Hierarchical organization might be able to address this (along with the poverty of stimulus problem).
Visual recognition is a difficult learning problem, e.g., is there an animal in a photo? Human brain has the equivalent of 1 million fly brains in neurons. Model the ventral stream with quantitative models with feedforward connections (no backprojections, as they would not help with the immediate object detection), including millions of model neurons. Tuning of more and more complex layers is learned in an unsupervised way from thousands of images. Additional training with supervised classifiers (themselves trained on small subsets of labeled images, animal / no animal present). These hierarchical feedforward models are consistent with neural data. High correlation in recognition efficiency between model and human results. Model superior to the purely engineered solution, but it is unclear why the model works, which seems to be a rather general problem of these models.
Deep learning / deep belief networks to preprocess signals to reduce sample complexity for classifiers trained with labeled examples. Layers reduce the number of samples required for a given accuracy.
Additional notes (pretty much a full transcript) in the friendfeed thread.
Other notes
- Excellent keynote, well covered by the bloggers already
- ISA-Tab/Tools/Etc tech demo. Looks fairly complete by now, see the friendfeed discussion