One use case for a cancer data visualization tool is comparing gene expression across normal and cancerous tissue. The end user would be a scientist looking for genes that are upregulated in cancerous tissue and could therefore be drug targets. A scientist might use the tool in two ways. First, they may already have a gene in mind and want to see the distribution of its expression levels across multiple normal tissues and cancer indications. This could be shown as a set of boxplots for the gene of interest, one per tissue or cancer indication, which would let them judge whether the gene is a good target. For example, if the gene is overexpressed in a specific cancerous tissue and expressed at lower levels in normal tissue, it may be a good target for a therapeutic drug that would then act only on tissue that overexpresses the gene.
Second, the scientist may not have a specific gene in mind and could use the tool for exploration. They would filter genes by their own criteria and then select a cancer indication of interest. The resulting visualization would show, across all genes that match the filter criteria, which genes are most overexpressed in that cancer type compared to normal tissue. This could be represented with a line plot that puts the ratio of average cancer expression to average normal expression on one axis and the prevalence of cancer overexpression relative to normal expression on the other. From this view, the scientist could isolate the genes with both the highest average cancer overexpression and the highest prevalence of cancer overexpression. This would give them a solid foundation for further research into the best gene targets for therapeutic drug synthesis against their cancer type.
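The two quantities used in the exploratory view can be sketched as follows. This is a minimal illustration, not the tool's actual computation: the function names, the input layout, and in particular the definition of "prevalence" (the fraction of cancer samples whose expression exceeds the highest normal value) are assumptions made for the example.

```python
from statistics import mean


def overexpression_summary(cancer, normal, cutoff=None):
    """Return (mean ratio, prevalence) for one gene.

    cancer, normal: non-empty lists of expression values.
    Assumption: a cancer sample counts as overexpressed when its value
    is above `cutoff`, which defaults to the highest normal value.
    """
    normal_mean = mean(normal)
    cancer_mean = mean(cancer)
    if cutoff is None:
        cutoff = max(normal)
    # Fraction of cancer samples above the cutoff.
    prevalence = sum(v > cutoff for v in cancer) / len(cancer)
    # Guard against division by zero when the gene is silent in normal tissue.
    ratio = cancer_mean / normal_mean if normal_mean > 0 else float("inf")
    return ratio, prevalence


def rank_genes(expression_by_gene):
    """Sort genes by mean ratio, then prevalence, highest first.

    expression_by_gene: {gene: {"cancer": [...], "normal": [...]}}
    (a hypothetical layout chosen for this sketch).
    """
    rows = []
    for gene, groups in expression_by_gene.items():
        ratio, prevalence = overexpression_summary(groups["cancer"], groups["normal"])
        rows.append((gene, ratio, prevalence))
    return sorted(rows, key=lambda r: (r[1], r[2]), reverse=True)
```

Genes at the top of the ranked list are the ones that would sit furthest toward the high end of both axes in the plot described above.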
This visualization lets the user select two genes and see how correlated their expression is. The expression values for both genes are plotted against each other on a scatterplot. Point color indicates the data source: red for TCGA, yellow for TARGET, and green for GTEx. Clicking a point shows the ID of the sample it comes from as well as the exact expression counts for both genes.
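As a rough sketch of the data behind such a scatterplot, the snippet below computes a Pearson correlation for two genes and assigns each point a color by data source. The record layout (`sample_id`, `source`, `expression`) and the function names are hypothetical, and Pearson is only one possible correlation measure; the text does not say which one the tool uses.

```python
from math import sqrt

# Color per data source, as described in the text.
SOURCE_COLORS = {"TCGA": "red", "TARGET": "yellow", "GTEx": "green"}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def scatter_points(samples, gene_x, gene_y):
    """Build one point per sample, with the details shown on click."""
    return [
        {
            "sample_id": s["sample_id"],
            "x": s["expression"][gene_x],
            "y": s["expression"][gene_y],
            "color": SOURCE_COLORS[s["source"]],
        }
        for s in samples
    ]
```

The `sample_id`, `x`, and `y` fields of each point are what a click handler would display for that point.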
This visualization lets the user select a gene and the disease whose expression they want to examine. It then shows gene expression counts for all normal tissue types (colored green) alongside the expression in the tumor tissue (highlighted in red). Hovering over a tissue type shows the exact maximum expression for that tissue. Clicking a tissue also highlights, in black, the corresponding points of the same tissue type in the scatterplot.
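The hover and click behavior could be backed by two small helpers like the ones below. This is only a sketch: the sample and point layouts (`tissue`, `expression`, `color`) and the function names are assumptions, not the tool's actual data model.

```python
def max_expression_by_tissue(samples, gene):
    """Map each tissue type to its maximum expression for `gene`.

    samples: list of dicts with "tissue" and "expression" keys
    (a hypothetical layout chosen for this sketch).
    """
    result = {}
    for s in samples:
        tissue = s["tissue"]
        value = s["expression"][gene]
        if tissue not in result or value > result[tissue]:
            result[tissue] = value
    return result


def highlight_colors(points, selected_tissue):
    """Return point colors with the selected tissue's points drawn in black."""
    return [
        "black" if p["tissue"] == selected_tissue else p["color"]
        for p in points
    ]
```

The first helper supplies the value shown on hover; the second recolors the scatterplot points when a tissue is clicked and leaves all other points at their original color.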