miRNA normalization enables joint analysis of several datasets to increase sensitivity and to reveal novel miRNAs differentially expressed in breast cancer.


Ben-Elazar S(1)(2), Aure MR(3)(4), Jonsdottir K(5)(6), Leivonen SK(7), Kristensen VN(3)(4)(8)(9), Janssen EAM(5)(6), Kleivi Sahlberg K(3)(10), Lingjærde OC(3)(11), Yakhini Z(2)(12).
Author information:
(1)School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel.
(2)Department of Computer Science, Interdisciplinary Center, Herzliya, Israel.
(3)Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway.
(4)Department of Medical Genetics, Institute of Clinical Medicine, University of Oslo and Oslo University Hospital, Oslo, Norway.
(5)Department of Pathology, Stavanger University Hospital, Stavanger, Norway.
(6)Department of Chemistry, Bioscience and Environmental Engineering, University of Stavanger, Stavanger, Norway.
(7)Helsinki University Hospital Comprehensive Cancer Centre and University of Helsinki, Helsinki, Finland.
(8)Institute for Clinical Medicine, University of Oslo, Oslo, Norway.
(9)Department of Clinical Molecular Biology and Laboratory Science (EpiGen), Division of Medicine, Akershus University Hospital, Lørenskog, Norway.
(10)Department of Research, Vestre Viken Hospital Trust, Drammen, Norway.
(11)Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway.
(12)Department of Computer Science, Technion-Israel Institute of Technology, Haifa, Israel.


Different miRNA profiling protocols and technologies introduce differences in the resulting quantitative expression profiles, including differences in the presence (and measurability) of certain miRNAs. We present and examine Adjusted Quantile Normalization (AQuN), a method based on quantile normalization, for combining miRNA expression data from multiple breast cancer studies into a single joint dataset for integrative analysis. Pooling multiple datasets increases statistical power and surfaces patterns that do not reach statistical significance when the datasets are analyzed separately. Merging several datasets, as we do here, requires overcoming both technical and batch differences between them. We compare several approaches for merging and jointly analyzing miRNA datasets. We investigate the statistical confidence for known results and highlight potential new findings that resulted from the joint analysis using AQuN. In particular, we detect several miRNAs that are differentially expressed between estrogen receptor (ER) positive and ER negative samples. In addition, we identify new potential biomarkers and therapeutic targets for both clinical groups. As a specific example, using the AQuN-derived dataset we find hsa-miR-193b-5p to be overexpressed in the ER positive group at a statistically significant level, a phenomenon that was not previously reported. Furthermore, functional assays showed that overexpression of hsa-miR-193b-5p in breast cancer cell lines decreased cell viability and induced apoptosis. Together, these observations suggest a novel functional role for this miRNA in breast cancer. Packages implementing AQuN are provided for Python and Matlab: https://github.com/YakhiniGroup/PyAQN.
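
For orientation, the following is a minimal sketch of standard quantile normalization, the technique on which AQuN is based. It is not the AQuN algorithm and does not use the PyAQN package: the adjustments AQuN makes are not described in this abstract. The function name `quantile_normalize` and the toy matrix are illustrative assumptions, and the sketch assumes every miRNA is measured in every sample (no missing values).

```python
import numpy as np


def quantile_normalize(X):
    """Plain quantile normalization of a (miRNAs x samples) matrix.

    Illustrative only: this is the textbook procedure, not AQuN.
    Each column (sample) is forced onto a shared reference distribution,
    defined as the mean of the column-wise sorted values.
    Ties are broken by order of appearance rather than averaged.
    """
    X = np.asarray(X, dtype=float)
    # Rank of each entry within its own column (0 = smallest).
    order = np.argsort(X, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = np.sort(X, axis=0).mean(axis=1)
    # Replace each entry by the reference value at its rank.
    return reference[ranks]


# Hypothetical toy data: 4 miRNAs (rows) measured in 3 samples (columns),
# where the third sample is on a visibly different scale.
toy = np.array([
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 6.0, 8.0],
    [4.0, 2.0, 9.0],
])
normalized = quantile_normalize(toy)
print(normalized)
```

After this step every sample has an identical value distribution while the within-sample ranking of miRNAs is preserved, which is what makes samples from different platforms comparable. Handling miRNAs that are absent or unmeasurable in some datasets, as discussed above, requires more than this basic procedure.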