Introduction
DeDaL (Data-Driven network Layout) is a Cytoscape 3.0 [1] app developed by the
Computational Systems Biology of Cancer group in the
Bioinformatics Laboratory of Institut Curie (Paris).
A scientific article about DeDaL has been published in BMC Systems Biology [10].
The source code of DeDaL is available in a GitHub repository.
DeDaL is a general-purpose tool; an example of a non-biological application of DeDaL can be found here.
Knowledge about molecular interactions in living cells is usually represented as network diagrams depicting, for instance, protein-protein interactions, biochemical reactions, or more abstract
influences of one molecule on others. Providing an insightful layout for such diagrams is not a trivial problem. At the same time, high-throughput biotechnologies
produce large amounts of data. There is an urgent need for new methods
that integrate the information provided in biological diagrams with multidimensional -omics datasets. Classically, high-throughput data are mapped on top of network layouts computed from the
network structure.
DeDaL is a Cytoscape 3.0 app that uses several
dimension reduction algorithms to produce data-driven network layouts based on
multidimensional data (typically gene expression). DeDaL also implements
several data pre-processing and layout post-processing steps, such
as continuous morphing between two arbitrary network layouts and
aligning one network layout with respect to another by rotating and
mirroring. Combining these possibilities facilitates creating insightful
network layouts that represent both structural network features and the
correlation patterns in multivariate data.
The app is implemented in Java.
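To illustrate the idea of a PCA-based data-driven layout, here is a minimal sketch in Python (not DeDaL's Java code). It assumes that each node carries a numeric profile, centers the columns, finds the first two principal components by power iteration, and uses the projections as x and y coordinates. The function name `pca_layout` and the input format are hypothetical and chosen only for this example.

```python
import math


def _mat_vec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def pca_layout(profiles, n_iter=500):
    """Map node name -> (x, y) from the first two principal components.

    profiles: dict mapping node name -> list of numeric values
              (hypothetical input format, e.g. expression per sample).
    """
    names = list(profiles)
    rows = [profiles[name] for name in names]
    n_cols = len(rows[0])

    # Center every column so that the components describe variation only.
    col_means = [sum(row[j] for row in rows) / len(rows) for j in range(n_cols)]
    centered = [[row[j] - col_means[j] for j in range(n_cols)] for row in rows]

    # Scatter matrix (unnormalized covariance); same eigenvectors as covariance.
    scatter = [[sum(r[i] * r[j] for r in centered) for j in range(n_cols)]
               for i in range(n_cols)]

    components = []
    for _component in range(2):
        # Deterministic, normalized start vector for the power iteration.
        start = [1.0 / (k + 1) for k in range(n_cols)]
        norm = math.sqrt(sum(a * a for a in start))
        vec = [a / norm for a in start]
        for _step in range(n_iter):
            nxt = _mat_vec(scatter, vec)
            norm = math.sqrt(sum(a * a for a in nxt))
            if norm == 0.0:
                break  # no variance left in the remaining directions
            vec = [a / norm for a in nxt]
        eigenvalue = sum(a * b for a, b in zip(vec, _mat_vec(scatter, vec)))
        components.append(vec)
        # Deflate: remove the component just found before searching the next.
        scatter = [[scatter[i][j] - eigenvalue * vec[i] * vec[j]
                    for j in range(n_cols)] for i in range(n_cols)]

    return {name: tuple(sum(a * b for a, b in zip(row, comp))
                        for comp in components)
            for name, row in zip(names, centered)}


example = {
    "A": [1.0, 1.0, 0.0],
    "A2": [1.0, 1.0, 0.0],
    "B": [2.0, 2.0, 0.1],
    "C": [3.0, 3.0, 0.0],
}
positions = pca_layout(example)
```

In this toy input, nodes with identical profiles (`A` and `A2`) receive identical coordinates, and nodes with similar profiles end up close together, which is the property a data-driven layout relies on. The sign of each axis is arbitrary, as in any PCA.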
The DeDaL app has the following functions:
- Data-Driven Layout:
pre-processing options:
- double center data
- network-smooth data
- Principal Component Analysis (PCA)
- Elastic map (nonlinear PCA)
- Layout aligning
- Layout morphing
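The "double center data" pre-processing option can be sketched as follows, assuming the standard definition of double centering: subtract the row mean and the column mean from each entry and add back the grand mean, so that every row and every column has zero mean. The function name `double_center` is hypothetical, and DeDaL's exact implementation may differ.

```python
def double_center(matrix):
    """Return a copy of `matrix` whose rows and columns all have zero mean.

    matrix: list of equally long rows (e.g. genes x samples); assumed format.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    grand_mean = sum(row_means) / n_rows
    return [[matrix[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n_cols)]
            for i in range(n_rows)]


centered = double_center([[1.0, 2.0, 3.0],
                          [4.0, 6.0, 8.0]])
```

After this step, differences in the overall level of each row (for example, a gene that is highly expressed everywhere) no longer dominate the subsequent dimension reduction.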
We suggest following our Tutorial to see an example of using DeDaL with expression data for two cancer subtypes on the Fanconi DNA repair pathway.
We also applied DeDaL to the network of proteins interacting with the ESR1 protein [Fig 3]. In
this case, the second principal component shows, for example, that the expression
levels of EGFR and CCNE1 are modulated differently, although both are upregulated
in the basal-like subtype. The PCA layout also highlights a particular expression
pattern of some hub genes, such as AR or EGFR, and shows that the genes underexpressed
in the basal-like subtype form a more tightly connected subnetwork. Morphing
the original organic network layout with the PCA-based layout moves the positions of
some of the proteins while preserving the general pattern of the PCA layout. For example,
the underexpressed PIK3R1, IGFR1 and ERBB2 genes are moved to the left because
each of them is connected to several overexpressed genes. Applying network
smoothing drives the hub genes to the center of the layout, because of averaging
over each hub's neighbors. It produces a more regular pattern of network connections
but approximately conserves the neighborhood relations of the PCA layout. DeDaL makes it easy to identify groups of genes with similar expression patterns, even within a group with a similar level of expression. In addition, it is easy to morph between the structure-based layout and the data-driven layout, which lets the user access and visualize both kinds of information without additional tools.
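The averaging over neighbors mentioned above can be sketched as follows. This is a minimal illustration, not DeDaL's implementation: it assumes that smoothing blends each node's own values with the mean of its direct neighbors, and the blending weight `alpha` as well as the function name `network_smooth` are hypothetical.

```python
def network_smooth(values, edges, alpha=0.5):
    """Blend each node's profile with the mean profile of its neighbors.

    values: dict node -> list of numbers (assumed format)
    edges:  iterable of (node_a, node_b) pairs, treated as undirected
    alpha:  hypothetical weight; 0 keeps the data, 1 uses neighbors only
    """
    neighbors = {node: set() for node in values}
    for a, b in edges:
        if a in neighbors and b in neighbors and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)

    smoothed = {}
    for node, profile in values.items():
        linked = neighbors[node]
        if not linked:
            smoothed[node] = list(profile)  # isolated nodes stay unchanged
            continue
        mean = [sum(values[k][j] for k in linked) / len(linked)
                for j in range(len(profile))]
        smoothed[node] = [(1 - alpha) * own + alpha * avg
                          for own, avg in zip(profile, mean)]
    return smoothed


data = {"HUB": [10.0], "A": [0.0], "B": [4.0]}
links = [("HUB", "A"), ("HUB", "B")]
half = network_smooth(data, links, alpha=0.5)
full = network_smooth(data, links, alpha=1.0)
```

A hub with many neighbors is pulled toward the average of a large part of the network, which is consistent with hubs moving to the center of the layout after smoothing.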
DeDaL can also be used to visualize genetic information. We applied DeDaL to create a data-driven layout (DDL) for a group of yeast genes involved
in DNA repair and replication. The genetic interactions between these genes and
the epistatic profiles (computed only with respect to this group of genes) were
taken from [7]. The definitions of DNA repair pathways were taken from the KEGG
database [8]. Figure 4 shows the difference between applying the standard
organic layout to this small network of genetic interactions and the PCA-based DDL
(computed here without double-centering the data matrix, in order to take into account
the tendency of genes to interact with a smaller or larger number of other genes). The PCA-based
DDL in this case groups the genes with respect to their epistatic profiles.
First, the local hub genes RAD27 and POL32 occupy distinct positions in this layout.
Second, the PCA-based DDL roughly groups the genes according to the DNA repair
pathway in which they are involved. For example, it shows that the non-homologous end
joining DNA repair pathway is closer to the homologous recombination (HR) pathway
than to the mismatch repair pathway. It also underlines that some homologous
recombination genes (such as RDH54) are characterized by a different pattern of
genetic interactions than the “core” HR genes RAD51, RAD52, RAD54, RAD55 and RAD57.
In the next example, we apply DeDaL to the Boolean model of cell fate decisions between survival,
apoptosis and non-apoptotic cell death (such as necrosis) published in [9], to group
the nodes of the influence diagram according to their co-activation patterns in the
logical steady states. The table of steady states was taken from [9] (Figure 5, top
right) and used to compute the PCA-based DDL (Figure 5, bottom left). In this
DDL, nodes in close positions have similar patterns of activation in the steady states
(such as RIP1 and RIP1K). We morphed between the PCA-based DDL and the initial
layout of the model (as it was designed in [9]) to visualize several stable states
corresponding to different cell fates (Figure 6). In this layout, co-activated nodes
tend to form compact groups. Therefore, DeDaL can be used to design layouts of
mathematical models of biological networks, using the solutions of the model.
Downloads
Install DeDaL test version
Download File
DeDaL (7 MB), latest release.
Using the Cytoscape app manager
- Launch the Cytoscape app manager (menu "Apps -> App Manager -> Install from File...").
- Select the DeDaL.jar file
- Click "Install".
Download the files for the tutorial:
network (network.sif)
data (data.txt)
style (fanconi_style_file.xml)
Tutorial
In this tutorial you will follow a set of instructions to get acquainted with the standard functions of DeDaL.
We used The Cancer Genome Atlas (TCGA) breast cancer transcriptomic dataset
(548 patients) and the Human Protein Reference Database (HPRD) [6] as a source of the protein-protein interaction
network. As an example of a small subnetwork, we selected the proteins involved
in the Fanconi anemia DNA repair pathway as defined in the Atlas of Cancer Signaling Network (ACSN).
The data set used in this tutorial is public and accessible from the TCGA database. It also contains a column with the t-test values computed for the
gene expression difference between basal-like (one of the molecular
subtypes of breast cancer, significantly contributing to
intertumoral variability) and non-basal-like breast tumours; these values will be used for node coloring.
In this exercise we will identify patterns in the data using PCA with network smoothing and double centering, morph between a purely structure-based layout and the PCA layout, and align networks to make them easier to compare.
- Download the files from the Downloads section
- Open Cytoscape 3.0:
- A new session opens automatically; if not, click File->New->Session
- Import network file:
- File->Import->Network->File . . .
- Select the downloaded file: network.sif from your files
- Click OK.
- Install the app using the Cytoscape app manager
- Launch the Cytoscape app manager (menu "Apps -> App Manager -> Install from File...").
- Select the DeDaL.jar file
- Click "Install".
- If you can see the App Manager window as shown below, it means the app is correctly installed
- Close the dialog window of the App Manager
- Load data:
- File->Import->Table->File . . .(data.txt)
- Leave parameters by default
- Click OK.
- Color nodes according to the t-test values
- File->Import->Network->Style . . .
- Go to Style in Control Panel on the left
- Change the style to default_0
- The nodes should change shape and be filled according to the t-test values (_TTEST column)
- Apply Data-Driven network Layout (PCA):
- Layout morphing: gradual transformation from one layout into another
- File->New->Network->Clone current network
- Layout->yFiles Layouts->Organic - this layout is based purely on the network structure (you can choose any other layout if you wish)
- Make sure the active network is network.sif_2 (should be highlighted in the left panel)
- Go to Tools->DeDaL-> Layout morphing
- In the dialogue window select network.sif and "align"
- Click OK.
- A new network will be opened and you will see the Slider dialogue
- Move the slider to the right and follow the gradual transformation from the organic layout into the PCA layout
Morphing the organic network layout with the PCA-based layout
moves the positions of some of the genes, preserving the general pattern of
the PCA layout while better reflecting the network structure.
- Set the slider to the desired position and close the Slider window
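The gradual transformation driven by the slider can be pictured with the following minimal sketch. It assumes that morphing is a linear interpolation between the coordinates of matched nodes, which is an assumption made for illustration; the function name `morph` and the parameter `t` (slider position between 0 and 1) are hypothetical.

```python
def morph(layout_a, layout_b, t):
    """Interpolate node positions between two layouts.

    layout_a, layout_b: dict node -> (x, y); only shared nodes are returned.
    t: hypothetical slider position, 0.0 gives layout_a, 1.0 gives layout_b.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be between 0 and 1")
    return {
        node: ((1 - t) * xa + t * layout_b[node][0],
               (1 - t) * ya + t * layout_b[node][1])
        for node, (xa, ya) in layout_a.items()
        if node in layout_b
    }


organic = {"FANCA": (0.0, 0.0), "FANCD2": (10.0, 4.0)}
pca = {"FANCA": (2.0, 8.0), "FANCD2": (6.0, 0.0)}
halfway = morph(organic, pca, 0.5)
```

The coordinates and the choice of the two node names are made up for the example. Intermediate values of `t` give layouts that mix the structure-based and the data-driven positions.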
- Layout aligning
Morphing between two network layouts can be meaningless if all nodes in one layout are systematically rotated or flipped with respect to the node positions in the other layout. This is often the case when a purely data-driven layout is compared with the initial structure-driven layout. For this case, DeDaL can minimize the distance between two layouts, defined as the sum of squared Euclidean distances between all matched nodes, over all possible rotations and mirrorings of one of the layouts.
- Go to Tools->DeDaL->Layout aligning
- Select the reference network network.sif and the network to align, network.sif_2 (in the screenshot, 3 extreme nodes are encircled in different colors, and the corresponding nodes in the second network are encircled in the same colors)
- Click OK
- network.sif_1 (organic layout) is now rotated and/or mirrored to minimize its distance to the reference network network.sif (PCA layout), which facilitates visual comparison (the color legend is the same as above: 3 extreme nodes are encircled in different colors, and the corresponding nodes in the second network are encircled in the same colors)
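The alignment criterion described above (the smallest sum of squared distances between matched nodes over rotations and mirroring) can be sketched in a few lines. This is an illustration, not DeDaL's code: it additionally assumes that both layouts are first centered on their centroids, uses the closed-form optimal rotation angle for 2D points, and tries the layout with and without a mirror flip. The function name `align` is hypothetical.

```python
import math


def align(reference, moving):
    """Rotate and/or mirror `moving` to best match `reference`.

    Both arguments are dicts node -> (x, y). Returns the aligned layout
    (placed on the reference centroid) and the remaining squared distance.
    """
    names = [n for n in moving if n in reference]

    def centroid(layout):
        return (sum(layout[n][0] for n in names) / len(names),
                sum(layout[n][1] for n in names) / len(names))

    rcx, rcy = centroid(reference)
    mcx, mcy = centroid(moving)
    ref = {n: (reference[n][0] - rcx, reference[n][1] - rcy) for n in names}

    best_cost, best_layout = None, None
    for mirrored in (False, True):
        sign = -1.0 if mirrored else 1.0  # mirroring flips the x axis
        mov = {n: (sign * (moving[n][0] - mcx), moving[n][1] - mcy)
               for n in names}
        # Closed-form angle that minimizes the summed squared distances.
        dot = sum(mov[n][0] * ref[n][0] + mov[n][1] * ref[n][1] for n in names)
        cross = sum(mov[n][0] * ref[n][1] - mov[n][1] * ref[n][0] for n in names)
        theta = math.atan2(cross, dot)
        c, s = math.cos(theta), math.sin(theta)
        rotated = {n: (x * c - y * s, x * s + y * c) for n, (x, y) in mov.items()}
        cost = sum((rotated[n][0] - ref[n][0]) ** 2 +
                   (rotated[n][1] - ref[n][1]) ** 2 for n in names)
        if best_cost is None or cost < best_cost:
            best_cost, best_layout = cost, rotated

    aligned = {n: (x + rcx, y + rcy) for n, (x, y) in best_layout.items()}
    return aligned, best_cost


reference_layout = {"A": (0.0, 0.0), "B": (4.0, 0.0), "C": (0.0, 1.0)}
# The same triangle, mirrored, rotated by 90 degrees and shifted.
moved_layout = {"A": (10.0, 5.0), "B": (10.0, 1.0), "C": (9.0, 5.0)}
aligned_layout, remaining = align(reference_layout, moved_layout)
```

In this toy case the moved layout differs from the reference only by a mirror flip, a rotation and a shift, so the aligned positions coincide with the reference and the remaining distance is numerically zero.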
- Save your working session
- If you want to work on this session later:
File->Save
THE END
Don't hesitate to contact us if you have questions.
Contacts
References
- Cline MS, Smoot M, Cerami E et al. Integration of biological networks and gene expression data using Cytoscape. Nat Protoc. 2007;2(10):2366-82
- Jolliffe I.T. Principal Component Analysis, Series: Springer Series in Statistics, 2nd ed., Springer, NY, 2002, XXIX, 487 p. 28 illus. ISBN 978-0-387-95442-4
- Gorban A, Kegl B, Wunsch D, Zinovyev A. (eds.) Principal Manifolds for Data Visualisation and Dimension Reduction. 2008. Lecture Notes in Computational Science and Engineering 58, p.340.
- Gorban A.N., Zinovyev A. Principal manifolds and graphs in practice: from molecular biology to dynamical systems. Int J Neural Syst, 2010; 20(3):219-32.
- Sartor MA, Mahavisno V, Keshamouni VG, Cavalcoli J et al. ConceptGen: a gene set enrichment and gene set relation mapping tool. Bioinformatics 2010 Feb 15;26(4):456-63. PMID: 20007254
- Prasad, TS Keshava, et al. Human protein reference database - 2009 update. Nucleic acids research 37.suppl 1 (2009): D767-D772.
- Costanzo, M., Baryshnikova, A., Bellay, J., Kim, Y., Spear, E.D., Sevier, C.S., Ding, H., Koh, J.L.Y., Toufighi,
K., Mostafavi, S., Prinz, J., St Onge, R.P., VanderSluis, B., Makhnevych, T., Vizeacoumar, F.J., Alizadeh, S.,
Bahr, S., Brost, R.L., Chen, Y., Cokol, M., Deshpande, R., Li, Z., Lin, Z.-Y., Liang, W., Marback, M., Paw, J.,
San Luis, B.-J., Shuteriqi, E., Tong, A.H.Y., van Dyk, N., Wallace, I.M., Whitney, J.A., Weirauch, M.T.,
Zhong, G., Zhu, H., Houry, W.A., Brudno, M., Ragibizadeh, S., Papp, B., Pál, C., Roth, F.P., Giaever, G.,
Nislow, C., Troyanskaya, O.G., Bussey, H., Bader, G.D., Gingras, A.-C., Morris, Q.D., Kim, P.M., Kaiser, C.A.,
Myers, C.L., Andrews, B.J., Boone, C.: The genetic landscape of a cell. Science 327(5964), 425–431 (2010).
doi:10.1126/science.1180823
- Kanehisa, M., Goto, S., Sato, Y., Furumichi, M., Tanabe, M.: KEGG for integration and interpretation of
large-scale molecular data sets. Nucleic Acids Res 40(Database issue), 109–114 (2012).
doi:10.1093/nar/gkr988
- Calzone, L., Tournier, L., Fourquet, S., Thieffry, D., Zhivotovsky, B., Barillot, E., Zinovyev, A.: Mathematical
modelling of cell-fate decision in response to death receptor engagement. PLoS Comput Biol 6(3), 1000702
(2010)
- Czerwinska, Urszula, et al. DeDaL: Cytoscape 3 app for producing and morphing data-driven and structure-driven network layouts. (2015) DOI: 10.1186/s12918-015-0189-4
Acknowledgements
Urszula Czerwinska is thankful to the INSERM U900 unit for providing a 5-month internship, to Eric Bonnet and Eric Viara for helping with the code, and to Loredana Martignetti for generously answering all questions.