A survey of transcriptome complexity using PacBio single-molecule real-time analysis combined with Illumina RNA sequencing for a better understanding of ricinoleic acid biosynthesis in Ricinus communis.
BACKGROUND: Ricinus communis is an economically important oil crop from the spurge family, Euphorbiaceae. However, the available reference genomes are incomplete, and studies of ricinoleic acid biosynthesis at the transcriptional level remain limited. RESULTS: In this study, we combined PacBio single-molecule long-read isoform sequencing with Illumina RNA sequencing to identify alternative splicing (AS) events, novel isoforms, fusion genes, long non-coding RNAs (lncRNAs) and alternative polyadenylation (APA) sites, in order to reveal the transcriptomic complexity of castor beans and identify critical genes related to ricinoleic acid biosynthesis. We identified 11,285 AS variants distributed in 21,448 novel genes and detected 520 fusion genes, 320 lncRNAs and 9511 APA sites. Furthermore, 6067, 5983 and 4058 differentially expressed genes (DEGs) between developing beans of the R. communis lines 349 and 1115, which differ markedly in oil content, were identified at 7, 14 and 21 days after flowering, respectively. Specifically, 14, 18 and 11 DEGs were annotated as encoding key enzymes related to ricinoleic acid biosynthesis, reflecting the higher castor oil content of line 1115 compared with line 349. Quantitative real-time RT-PCR further validated fifteen of these DEGs at the three time points. CONCLUSIONS: Our results substantially improve the existing gene models of R. communis, and a putative model of key genes was built to show the differences between lines 349 and 1115, illustrating the molecular mechanism of castor oil biosynthesis. The multi-transcriptome database and candidate genes are provided as resources for further improving ricinoleic acid levels in transgenic crops.