Abstract

Myelodysplastic syndromes (MDSs) are chronic and often progressive myeloid neoplasms associated with remarkable heterogeneity in the histomorphology and clinical course. Various somatic mutations are involved in the pathogenesis of MDS. Recently, mutations in a gene encoding a spliceosomal protein, SF3B1, were discovered in a distinct form of MDS with ring sideroblasts. Whole exome sequencing of 15 patients with myeloid neoplasms was performed, and somatic mutations in spliceosomal genes were identified. Sanger sequencing of 310 patients was performed to assess phenotype/genotype associations. To determine the functional effect of spliceosomal mutations, we evaluated pre-mRNA splicing profiles by RNA deep sequencing. We identified additional somatic mutations in spliceosomal genes, including SF3B1, U2AF1, and SRSF2. These mutations alter pre-mRNA splicing patterns. SF3B1 mutations are prevalent in low-risk MDS with ring sideroblasts, whereas U2AF1 and SRSF2 mutations are frequent in chronic myelomonocytic leukemia and advanced forms of MDS. SF3B1 mutations are associated with a favorable prognosis, whereas U2AF1 and SRSF2 mutations are predictive for shorter survival. Mutations affecting spliceosomal genes that result in defective splicing are a new leukemogenic pathway. Spliceosomal genes are probably tumor suppressors, and their mutations may constitute diagnostic biomarkers that could potentially serve as therapeutic targets.

Introduction

The myelodysplastic syndromes (MDSs) are characterized by clonal hematopoiesis, a variety of chromosomal abnormalities, bone marrow (BM) failure, and a propensity for evolution to acute myeloid leukemia (AML). Because of their often protracted course, MDSs recapitulate the stages of acquisition of a malignant phenotype, thereby offering insights into leukemogenesis. Although histomorphology-based schemes have traditionally been applied to subclassify patients with MDS,1,2  this approach is unlikely to reflect the underlying pathogenesis. Instead, a better molecular characterization of MDS on the genomic, epigenetic, and genetic levels would probably allow more objective diagnosis, better determination of patients' prognosis, and, on the basis of the underlying molecular defects, the application of targeted therapies. The emerging realization of the molecular diversity of MDS parallels the clinical and phenotypic heterogeneity of this disease. Moreover, molecular defects have the potential to serve as biomarkers and are probably more suitable for identifying therapy targets and predicting responsiveness or refractoriness to treatment.

The application of high-throughput molecular technologies, including high-density single nucleotide polymorphism arrays (SNP-As)3  and new sequencing technologies4,5  has led to the improved characterization of genomic lesions such as chromosomal aberrations and of somatic mutations affecting specific classes of genes,6  including signal transducers (eg, CBL),7-10  apoptotic genes (eg, TP53 and RAS),11-13  genes involved in epigenetic regulation of DNA (eg, DNMT3A, IDH1/2, and TET2),14-18  and histone modifiers (eg, EZH2, UTX, and ASXL1).19-24  Although some mutations in these factors are activating, most are loss-of-function or hypomorphic mutations and affect bona fide tumor suppressor genes (TSGs). Of greatest diagnostic effect are recurrent mutations found in specific genes. Most TSG mutations are not canonical, though, making systematic clinical diagnostics more difficult.

We and other groups recently identified a variety of new mutations present at distinct frequencies in subgroups of patients with MDS.25,26  Analogous to previously applied strategies that identified TET2, CBL, EZH2, and other mutations,7,18,21  we used a combination of targeted search approaches and unbiased mass sequencing to identify recurrent mutations affecting genes of the spliceosome machinery. The affected genes probably constitute a new class of TSGs ubiquitously involved in leukemogenesis.

Methods

Patient population

BM aspirates or blood samples were collected from 315 patients with MDS (n = 88), MDS/myeloproliferative neoplasms (MDS/MPNs; n = 66), MPN (n = 52), secondary AML (sAML; n = 54) that evolved from these conditions, and primary AML (pAML; n = 55) seen at Cleveland Clinic or Nagoya University between 2003 and 2010 (Table 1). Informed consent for sample collection was obtained according to protocols approved by the institutional review boards and in accordance with the Declaration of Helsinki. Diagnosis was confirmed and assigned according to World Health Organization classification criteria.27  Low-risk MDS was defined as patients having < 5% myeloblasts. Patients with ≥ 5% myeloblasts constituted those with advanced disease. Serial samples were obtained for 38 patients. To study the germline genotype, immunoselected CD3+ lymphocytes were used. Cytogenetic analysis was performed according to standard banding techniques on the basis of 20 metaphases. Clinical parameters studied included age, sex, overall survival (OS), blood counts, and metaphase cytogenetics. The median follow-up of the cohort was 18 months (range, 1-168 months).

Table 1

Clinical characteristics of patients participating in this study

 No. 
MDS 88 
    Low risk 58 
    RCUD/RCMD/5q−/MDS-U* 38 
    RARS 20 
    High risk  
    RAEB 1/2* 30 
MDS/MPN 66 
    CMML/aCML/JMML 48 
    MDS/MPN-U* (RARS-T) 18 (11) 
MPN 52 
    PV/PMF/ET 16 
    CML 36 
AML 109 
    Primary AML 55 
    Secondary AML* 54 

MDS indicates myelodysplastic syndromes; RCUD, refractory cytopenia with unilineage dysplasia; RCMD, refractory cytopenia with multilineage dysplasia; MDS-U, MDS unclassifiable; RARS, refractory anemia with ring sideroblasts; RAEB, refractory anemia with excess blasts; MDS/MPN, MDS/myeloproliferative neoplasms; CMML, chronic myelomonocytic leukemia; aCML, atypical chronic myeloid leukemia; JMML, juvenile myelomonocytic leukemia; RARS-T, RARS associated with marked thrombocytosis; PV, polycythemia vera; PMF, primary myelofibrosis; ET, essential thrombocythemia; and AML, acute myeloid leukemia.

*Eight cases with therapy-related myeloid malignancies are included.

Cytogenetics and SNP-A analyses

Technical details about sample processing for SNP-As were previously described.28,29  Affymetrix 250K and 6.0 Kit (Affymetrix) were used. A stringent algorithm was applied for the identification of SNP-A lesions. Patients with SNP-A lesions concordant with metaphase cytogenetics or typical lesions known to be recurrent required no further analysis. Changes reported in our internal or publicly available (Database of Genomic Variants30 ) copy number variation databases were considered nonsomatic and excluded. Results were analyzed with CNAG (Version 3.0)31  or Genotyping Console (Affymetrix). All other lesions were confirmed as somatic or germline by analysis of CD3-sorted cells.3 

Whole exome sequencing

Genomic DNA was extracted from BM or peripheral blood with the use of standard methods and subjected to agarose gel and optical density ratio tests to confirm the purity and concentration before Covaris fragmentation. Fragmented genomic DNA (0.5-2.5 μg) was tested for size distribution and concentration with the use of an Agilent 2100 Bioanalyzer (Agilent Technologies). Illumina libraries were made from qualified fragmented gDNA with the use of NEBNext reagents (New England Biolabs), and the resulting libraries were subjected to exome enrichment with the use of NimbleGen SeqCap EZ Human Exome Library Version 2.0 (Roche NimbleGen) according to the manufacturer's instructions. Enriched libraries were tested for enrichment by quantitative PCR and for size distribution and concentration by an Agilent 2100 Bioanalyzer. The samples were then sequenced on an Illumina HiSeq2000 that generated paired end reads of 100 nucleotides. Paired BM mononuclear cells and CD3+ peripheral blood lymphocytes were used as germline controls. DNAnexus software (DNAnexus Inc; https://dnanexus.com) was used to visualize single nucleotide changes, insertions, and/or deletions at the gene, exon, and base pair levels. A rational bioanalytic algorithm was applied to identify candidate nonsynonymous alterations. Multiple steps were performed to reduce the false-positive rate within reported results. First, whole exome assembly was nonredundantly mapped to the reference genome hg19. Next, the analytic algorithm within DNAnexus called all the positions that vary from the reference genome. Each potential mutation was compared against databases of known SNPs, including Entrez Gene32  and the Ensembl Genome Browser.33  Variants also found in CD3+ peripheral blood DNA were subtracted from these candidate alterations, and the remaining candidates were subsequently validated with Sanger sequencing (see “Sanger sequencing analysis”). Moreover, spliceosome-associated gene mutations were screened with whole exome sequencing results available through The Cancer Genome Atlas (National Cancer Institute).
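
The filtering logic described above (call variants against the reference, discard known polymorphisms, subtract variants also seen in the paired CD3+ DNA) can be illustrated with simple set operations. The following is a minimal sketch only and is not the DNAnexus algorithm; the Variant record, the function name, and all coordinates are hypothetical and were chosen purely for illustration.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a DNAnexus data type)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def candidate_somatic(tumor: Set[Variant],
                      germline: Set[Variant],
                      known_snps: Set[Variant]) -> Set[Variant]:
    """Keep tumor calls that are neither known SNPs nor present in germline DNA."""
    return tumor - known_snps - germline


# Toy calls with made-up coordinates.
tumor_calls = {
    Variant("chr21", 1000, "T", "C"),
    Variant("chr2", 2000, "C", "G"),
    Variant("chr7", 3000, "A", "G"),
}
cd3_calls = {Variant("chr7", 3000, "A", "G")}   # also seen in CD3+ DNA: treated as germline
snp_db = {Variant("chr2", 2000, "C", "G")}      # listed as a known polymorphism

candidates = candidate_somatic(tumor_calls, cd3_calls, snp_db)
print(sorted(candidates))
```

A presence/absence subtraction of this kind would discard a true somatic variant whenever the CD3+ fraction is contaminated by the mutant clone, so in practice read fractions in the two samples are compared rather than the bare calls.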

Sanger sequencing analysis

All exons of selected genes were amplified and underwent direct genomic sequencing by standard techniques on the ABI 3730xl DNA analyzer (Applied Biosystems) as previously described.7,18,34  All mutations were detected by bidirectional sequencing and scored as pathogenic if not present in nonclonal paired CD3-derived DNA. Frameshift mutations were validated by cloning and sequencing individual colonies (TOPO TA cloning; Invitrogen). For confirmation of the somatic nature of the mutations, exons containing mutations were tested in nonclonal control DNA.

Whole RNA deep sequencing

Total RNA was extracted from BM mononuclear cells with the use of the Nucleospin RNA II Kit (Macherey-Nagel) with DNase treatment. The integrity and purity of total RNA were assessed with an Agilent Bioanalyzer. cDNA (1-2 μg) was generated with the Clontech SmartPCR cDNA kit (Clontech Laboratories) from 100 ng of total RNA. cDNA was fragmented with Covaris, profiled with an Agilent Bioanalyzer, and subjected to Illumina library preparation with NEBNext reagents (New England Biolabs). The quality, quantity, and size distribution of the Illumina libraries were determined with an Agilent Bioanalyzer. The libraries were then submitted for Illumina HiSeq2000 sequencing according to standard procedures. Paired-end 90-bp reads were generated and subjected to data analysis with the use of the platform provided by DNAnexus. DNAnexus software allowed visualization of reads derived from spliced mRNA and those that completely match the genome, including both sense and antisense.

Statistical analysis of clinical data

The Kaplan-Meier method was used to analyze OS in subgroups defined by mutant versus wild-type (WT) status of specific spliceosome-associated genes, and subgroups were compared with the log-rank test (JMP9; SAS Institute). Significance was determined at a 1-sided α level of 0.05.
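
For readers unfamiliar with the Kaplan-Meier product-limit estimate, the calculation can be sketched in a few lines. This is an illustrative sketch with invented follow-up times, not the JMP9 analysis used in this study; the function name and the toy data are assumptions. Comparison of two such curves with the log-rank test would normally be left to a statistics package.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct time with >= 1 death.

    events[i] is True for a death and False for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored cases both leave the risk set
    return curve


# Invented follow-up times in months; False marks censored patients.
curve = kaplan_meier([6, 12, 12, 18, 24], [True, True, False, True, False])
for month, s in curve:
    print(f"{month:>3} months: S(t) = {s:.2f}")
```

With these toy data the estimate drops at 6, 12, and 18 months; the censored observations at 12 and 24 months only reduce the number at risk.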

Results

Detection of somatic spliceosomal mutations in myeloid malignancies

Initially, we performed whole exome sequencing of 15 index cases with various forms of myeloid malignancies and identified distinct somatic mutations in genes encoding components of the spliceosomal machinery (supplemental Table 1, available on the Blood Web site; see the Supplemental Materials link at the top of the online article). A heterozygous missense mutation in U2AF1 (Q157R) was found in a patient with refractory cytopenia with unilineage dysplasia (RCUD) and uniparental disomy of 2q (UPD2q; Figure 1A). Analysis of DNA from CD3+ cells showed a much lower frequency of the base change (2 of 15 reads), highlighting the somatic nature of this alteration. The 2 reads originated from contamination by a small amount of the mutated clone. The contamination was confirmed by flow cytometric analysis after magnetic bead sorting. Similarly, a heterozygous, somatic SF3B1 mutation (E622D) was detected in a patient with refractory anemia with ring sideroblasts (RARS) associated with marked thrombocytosis (RARS-T; Figure 1B). In a patient with sAML, we identified a somatic mutation (M1307I) in PRPF8 and a heterozygous mutation (R27X) in LUC7L2 (Figure 1C). On the basis of these findings, we screened other rationally selected genes within the spliceosomal machinery in patients with MDS and related disorders (n = 120). With the use of this targeted approach, we identified mutations in ZRSR2 (W153X) and SRSF2 (P95R; Figure 1D-E). All of these mutations were somatic as confirmed by sequencing of the corresponding germline-derived DNA. Moreover, screening of whole exome sequencing results for AML available through the Cancer Genome Atlas showed the presence of mutations in genes encoding other components of the spliceosome, including mutations in HCFC1 (P72L), SAP130 (T247I), SRSF6 (W123S), SON (M1024V), and U2AF26 (C116X; supplemental Figure 1). 
We found that among the initially screened cohort of patients with myeloid malignancies, mutations were highly recurrent in U2AF1, SF3B1, and SRSF2, whereas mutations of other spliceosomal genes were each detected in 1 of 120 cases, indicating a frequency of < 1%. Although we identified 2 mutant copies of U2AF1 (21q22.3) in a sAML case with trisomy 21 (supplemental Figure 2), all other 19 cases of trisomy 21 screened for this mutation were negative. Moreover, no homozygous mutations were found in patients with UPD21q (n = 8). Similarly, all mutations of SF3B1 and SRSF2 were heterozygous, and patients with somatic UPD2q33.1 or UPD17q25 (regions containing SF3B1 and SRSF2, respectively) did not harbor homozygous mutations of the associated genes.
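
The read counts reported for the U2AF1 index case (13 of 18 reads in whole BM versus 2 of 15 reads in the CD3+ fraction) illustrate how a somatic change is distinguished from a germline one by comparing mutant read fractions. The sketch below computes both fractions and a 1-sided Fisher exact probability; this test was not part of the analysis reported here and is shown only as one possible way to quantify the difference. The function names are hypothetical.

```python
from math import comb


def allele_fraction(mutant_reads: int, total_reads: int) -> float:
    """Fraction of reads carrying the variant base."""
    return mutant_reads / total_reads


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] under the hypergeometric null.

    Rows: sample 1 and sample 2; columns: mutant and reference reads.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, row1)
    tail = sum(comb(col1, x) * comb(n - col1, row1 - x)
               for x in range(a, min(row1, col1) + 1))
    return tail / total


bm_fraction = allele_fraction(13, 18)    # whole bone marrow
cd3_fraction = allele_fraction(2, 15)    # CD3+ fraction
p_value = fisher_one_sided(13, 18 - 13, 2, 15 - 2)

print(f"BM {bm_fraction:.2f} vs CD3+ {cd3_fraction:.2f}, 1-sided P = {p_value:.2g}")
```

A residual mutant fraction in the CD3+ sample, as seen here, is compatible with contamination by the mutant clone rather than with a heterozygous germline variant, for which a fraction near 0.5 would be expected.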

Figure 1

Somatic spliceosomal gene (U2AF1, SF3B1, SRSF2, LUC7L2, PRPF8, and ZRSR2) mutations as detected by next-generation sequencing (NGS) and Sanger sequencing technologies. (A) With the use of an NGS-based whole exome sequencing analysis of whole BM DNA from a patient with refractory cytopenia with unilineage dysplasia (left), a mutation of U2AF1 (21q22.3) at position 44 514 777 (T > C) was detected in 13 of 18 reads. Analysis of DNA from CD3+ cells showed a much lower frequency of the base change (2 of 15 reads; right), highlighting the somatic nature of this alteration. The finding was confirmed by Sanger sequencing. Arrows and bars indicate the specific nucleotide and predicted codon, respectively. It should be noted that U2AF1 is expressed from the minus strand; therefore, the NGS presentation (top panels) is reverse complemented in comparison to the Sanger sequencing results (middle panels). This heterozygous somatic mutation results in the predicted nucleotide change 470 A > G in exon 6 of the coding region, which leads to the amino acid change Q157R in the second zinc finger domain. In the entire cohort, 27 mutations were observed in 26 patients, including a whole gene deletion. All 26 missense mutations were located in 1 of the 2 zinc finger domains (ZNFs); 2 residues, S34 or Q157, were frequently affected (bottom panels). RRM indicates RNA recognition motif. (B) With the use of an NGS analysis of a patient with CMML (middle left), a mutation of SF3B1 (2q33.1) at position 198 267 491 (C > G) was detected in 9 of 24 reads. The somatic nature of this alteration was confirmed by an analogous analysis of the CD3+ fraction, with the change being less frequent (2 of 23; middle right). The mutation was confirmed by Sanger sequencing (bottom). This heterozygous somatic mutation results in the nucleotide change 1866 G > T in exon 14 of SF3B1, resulting in the amino acid change E622D in the HSH155 domain. Analysis of the entire cohort identified mutations in 33 patients, including a case with a whole gene deletion. (C) Further screening by NGS led to the detection of a nonsense mutation (R27X) in LUC7L2 (7q34; top), which participates in the recognition of splice donor sites in association with the U1 snRNP spliceosomal subunit, and a missense mutation (M1307I) in PRPF8 (17p13.3; bottom), which is a large U5 snRNP-specific protein essential for pre-mRNA splicing. RS indicates serine/arginine-rich domain; U5 2-snRNA bdg, U5-snRNA binding site 2; and MPN, Mpr1p, Pad1p N-terminal domain. (D) Mutations of SRSF2, an arginine/serine-rich splicing factor, were detected in 29 cases among the entire cohort, including 2 whole gene deletions and a microdeletion within the gene (top). All mutations were heterozygous and affected P95. The somatic nature of the P95R mutation was confirmed with whole BM and T-cell rich fraction DNAs (bottom). (E) A nonsense mutation (W153X) was found in ZRSR2, another arginine/serine-rich splicing regulatory factor, in a case of CMML. ZRSR2 is located at Xp22.2, and the nonsense mutation was hemizygous in this male case (BM).
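
Two small conversions are implicit in the legend above: a coding-sequence position maps to a codon number, and a base change read on the genomic plus strand must be complemented for a gene transcribed from the minus strand, such as U2AF1. The following sketch makes both steps explicit; the function names are hypothetical, and only the U2AF1 values quoted in the legend are used for the strand example.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def codon_number(cds_position: int) -> int:
    """Map a 1-based coding-sequence position to its 1-based codon number."""
    return (cds_position - 1) // 3 + 1


def to_coding_strand(ref: str, alt: str, strand: str) -> tuple:
    """Express a plus-strand base change on the strand the gene is transcribed from."""
    if strand == "-":
        return COMPLEMENT[ref], COMPLEMENT[alt]
    return ref, alt


print(codon_number(470))                   # 157 (U2AF1 Q157R)
print(codon_number(1866))                  # 622 (SF3B1 E622D)
print(to_coding_strand("T", "C", "-"))     # ('A', 'G'): genomic T > C is coding A > G
```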

Clinical associations and frequencies of spliceosomal mutations in myeloid malignancies

We subsequently screened a large cohort of patients (n = 310) with MDS and related disorders (including the initially screened 120 cases) in a stepwise fashion to determine the frequency of spliceosomal mutations discovered in the index cases. On the basis of the initial screen, we noted that mutations in U2AF1, SF3B1, and SRSF2 were the most frequent. Consequently, we sequenced 120 cases for ZRSR2, LUC7L2, and PRPF8 and a total of 310 for SF3B1, U2AF1, and SRSF2. All SF3B1 mutations were located in exon 14 or 15, with the K700 mutation being the most recurrent (Figure 1B). Similarly, all mutations in SRSF2 affected position P95 (Figure 1D). In contrast, mutations in U2AF1 affected exons 2 and 6, corresponding to the 2 zinc finger domains of this protein (Figure 1A; supplemental Figure 3). The extended cohort of patients was used to identify phenotype/genotype associations (supplemental Table 2). Supplemental Table 2 indicates which cases were analyzed by whole exome sequencing, and supplemental Table 1 lists the mutational status of spliceosomal genes. In low-risk MDS, mutations of any 1 of these 3 genes were found in 39% of patients, and further analysis found that mutations in SF3B1 were highly associated with RARS. Among patients with MDS/MPN, SF3B1 mutations were not common in chronic myelomonocytic leukemia (CMML), but they were frequent in patients with RARS-T; thus, the presence of RS was found to correlate highly with SF3B1 mutations, irrespective of other clinical or morphologic features (Figure 2A). In contrast, U2AF1 mutations were most frequent in the high-risk MDS/AML cohort (11%), whereas SRSF2 was most frequently mutated in MDS/MPN (24%), particularly in CMML (28%; Figure 2A).

Figure 2

Frequency and phenotypic association of spliceosomal mutations in myeloid malignancies. (A) In the entire cohort (n = 310), a total of 88 mutations in the spliceosome pathway components U2AF1, SF3B1, and SRSF2 were observed in every subtype of myeloid malignancies, except for MPN. In low-risk MDS, SF3B1 mutations were most frequent among the 3 genes. In particular, SF3B1 was mutated in 15 of 20 cases of RARS (75%). In the high-risk MDS and AML group, U2AF1 mutations were most frequent (15 of 139; 10.8%). In the MDS/MPN group, SRSF2 was most frequently mutated (13 of 46; 28.3%), whereas SF3B1 was mutated at a high frequency in RARS-T (10 of 11; 90.9%). (B) Effect of spliceosomal mutations on clinical outcomes. In the entire cohort, patients with U2AF1 mutations (MT) had worse OS, compared with WT, but SF3B1 mutations made OS significantly longer. In low-risk MDS, mutation of SF3B1 was a good prognostic factor, but SRSF2 mutations were associated with worse prognosis. In MDS/MPN, patients with mutated U2AF1 had a shorter OS, but SF3B1 mutations were associated with significantly better prognosis. In addition, SRSF2 mutations did not affect outcomes.
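
The percentages in the Figure 2 legend can be recomputed from the counts given there, which is a convenient consistency check. The sketch below does only that arithmetic; the dictionary keys are descriptive labels introduced here, and the counts are those quoted in the legend.

```python
# (mutated cases, cases screened) as quoted in the Figure 2 legend.
counts = {
    "SF3B1 in RARS": (15, 20),
    "U2AF1 in high-risk MDS and AML": (15, 139),
    "SRSF2 in the MDS/MPN group": (13, 46),
    "SF3B1 in RARS-T": (10, 11),
}


def percent(mutated: int, screened: int) -> float:
    """Mutation frequency as a percentage rounded to 1 decimal place."""
    return round(100 * mutated / screened, 1)


frequencies = {label: percent(m, n) for label, (m, n) in counts.items()}
for label, value in frequencies.items():
    print(f"{label}: {value}%")
# SF3B1 in RARS: 75.0%
# U2AF1 in high-risk MDS and AML: 10.8%
# SRSF2 in the MDS/MPN group: 28.3%
# SF3B1 in RARS-T: 90.9%
```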

Effect of spliceosomal mutations on clinical outcomes

Subsequently, we studied the effect of the most common spliceosomal mutations on clinical outcomes. We first analyzed the entire cohort of patients (Table 1) and determined the survival of patients in whom the 3 most common spliceosomal mutations were present. We included the SF3B1 mutation status of 39 cases reported in our previous study.35  When 310 patients genotyped for these mutations were analyzed, the presence of SF3B1 mutations was associated with longer survival and U2AF1 mutations with shorter survival, whereas SRSF2 mutations had no effect on survival. We then analyzed the effect of these mutations in more clinically uniform subgroups to more precisely determine their clinical consequences. As expected, in sAML and pAML, because of overall poor prognosis, the presence of spliceosomal mutations did not further affect survival (supplemental Figure 4). However, in low-risk MDS, patients with SF3B1 mutations showed a tendency toward better prognosis (P = .09), whereas those with SRSF2 mutations had worse survival. In MDS/MPN, SRSF2 mutations were more common than U2AF1 mutations; however, U2AF1 mutations were associated with shorter survival (Figure 2B). When serial samples were analyzed, we found that a U2AF1 mutation detected at the sAML stage was present from the initial MDS presentation, suggesting an ancestral origin of this mutation (supplemental Figure 5). Overall, SF3B1 mutations were less prevalent in patients with advanced forms of MDS, suggesting that mutation of this factor does not contribute to progression (Figure 2A).

Effects of spliceosomal mutations on spliceosomal function

Conceptually, mutations of spliceosomal proteins could result in defective splicing, including intron retention, altered splice site recognition, or altered alternative splicing. To determine the functional consequences of spliceosomal mutations on splicing, we performed whole mRNA deep sequencing. In the presence of functional spliceosomal machinery, sequencing reads are expected not to cross the intron/exon boundaries and therefore should not contain any intronic sequences. We analyzed RNA sequencing results in patients with mutations in U2AF1 (n = 3), SRSF2 (n = 2), SF3B1 (n = 2), and U2AF26 (n = 1), as well as in a healthy control and 1 patient with MDS with a WT configuration of these genes. No genome-wide increase in intron retention was observed in the patients with mutations. However, we found several specific genes in which the splicing pattern was altered. For instance, U2AF1 mutations were associated with defective splicing of intron 5 of TET2 at both splice sites (Figure 3A; supplemental Figure 6), whereas splicing of other TET2 introns was less affected. Another gene in which splicing was affected was RUNX1. At both the 3′ and 5′ splice sites of RUNX1 intron 6, unspliced reads were more frequent than spliced reads (Figure 3B). Such a splicing abnormality was more prominent in cases with SRSF2 mutations but not detected in the cases with SF3B1 mutations or WT spliceosomal genes. Similarly, U2AF26 mutations resulted in an alteration of RUNX1 splicing (supplemental Figure 7). Moreover, alternative splicing analysis showed that exon 9 of FECH was skipped in U2AF1 mutant cases but not in U2AF1 WT cases, including those with SF3B1 mutations (supplemental Figure 8).
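
The distinction between spliced and unspliced reads at an intron/exon boundary can be made concrete as follows. This is a minimal sketch and not the DNAnexus implementation: each read is represented as a list of aligned genomic blocks in 0-based, half-open coordinates, and the function name, coordinates, and toy reads are assumptions. The toy counts mirror the 5 unspliced and 4 spliced reads described for the TET2 intron 5/exon 6 boundary in Figure 3A.

```python
from collections import Counter
from typing import List, Tuple

Block = Tuple[int, int]  # aligned genomic interval, 0-based, half-open


def classify_read(blocks: List[Block], boundary: int) -> str:
    """Classify one read relative to an intron/exon boundary.

    'unspliced': a single aligned block covers bases on both sides of the boundary.
    'spliced':   a gap between consecutive blocks starts or ends at the boundary.
    'other':     the read is uninformative for this boundary.
    """
    for start, end in blocks:
        if start < boundary < end:
            return "unspliced"
    for (_, prev_end), (next_start, _) in zip(blocks, blocks[1:]):
        if prev_end < next_start and boundary in (prev_end, next_start):
            return "spliced"
    return "other"


# Hypothetical 3' splice site: the intron ends and the exon begins at position 1000.
BOUNDARY = 1000
reads = (
    [[(950 + i, 1050 + i)] for i in range(5)]                    # run through the boundary
    + [[(700 + i, 740), (1000, 1060 + i)] for i in range(4)]     # junction ends at the boundary
    + [[(1100, 1200)]]                                           # does not reach the boundary
)

tally = Counter(classify_read(r, BOUNDARY) for r in reads)
unspliced_fraction = tally["unspliced"] / (tally["unspliced"] + tally["spliced"])
print(dict(tally), f"unspliced fraction = {unspliced_fraction:.2f}")
```

Comparing this fraction between mutant and WT samples at the same boundary corresponds to the white and black bars summarized in Figure 3.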

Figure 3

Unsplicing of specific genes because of spliceosomal mutations as detected by deep RNA sequencing. Next-generation–based RNA deep sequencing was used to quantitatively study splicing patterns. (A) The top panel shows the intron 5 and exon 6 boundary of TET2 (dotted line). Five reads correspond to transcripts that were not spliced (unspliced) and 4 were spliced at this boundary. The bottom panel shows read counts at the 5′ and 3′ splice sites of each intron (3-10) of TET2. White and black bars indicate the number of spliced and unspliced reads, respectively. In a case of AML with a U2AF1 mutation, more unspliced than spliced reads were observed at the 3′ splice site of intron 5 (left panel), probably because of a loss of spliceosome function. However, unspliced RNAs were less frequent than spliced RNAs in WT RNA sequencing (right panel). (B) At both the 3′ and 5′ splice sites of RUNX1 intron 6, unspliced reads were more frequent than spliced reads in AML cases with U2AF1 and SRSF2 mutations. However, there were fewer unspliced transcripts at the same site in WT and SF3B1 mutant samples. Splicing abnormalities in the selected genes are summarized (bottom right), including the results presented in detail in supplemental Figures 7 and 8.

Discussion

In recent years, a number of somatic mutations have been associated with various myeloid malignancies, including MDS.15-17,19,22  Systematic screening studies found that such mutations do not only occur as sole abnormalities but are often found in combination, probably contributing to the phenotypic heterogeneity found in MDS.5,25,26  During our search for new molecular lesions associated with MDS, we identified somatic mutations affecting various components of the spliceosomal machinery, similar to previous reports.35-39  Until these recent results were obtained, spliceosomal dysfunction was only rarely associated with malignant transformation.

In our studies in myeloid malignancies, the most commonly affected spliceosomal genes included SF3B1, U2AF1, and SRSF2. Although these mutations were common and found in a wide spectrum of myeloid diseases, some are strongly associated with specific phenotypic features, such as SF3B1 mutations with MDS or MDS/MPN with RS and SRSF2 mutations with CMML and advanced forms of MDS such as sAML and RA with excess blasts. Serial studies performed on patients from the initial diagnosis of low-risk MDS through subsequent transformation indicate that U2AF1 mutations may represent early ancestral events. A higher cross-sectional prevalence among advanced cases suggests that both U2AF1 and SRSF2 mutations constitute high-risk defects. Indeed, U2AF1 and SRSF2 mutations were associated with worse survival in CMML and low-risk MDS, respectively. In contrast, SF3B1 mutations are associated with a generally good prognosis, compatible with their association with RARS, which has a protracted clinical course.37,39  That these mutations were not overlapping and also not found in cases with other spliceosomal defects suggests that the effect of a spliceosomal mutation on myeloid malignancies might be inhibited by mutation in another factor or that they might not have a synergistic effect on leukemogenesis.

The prognostic effect of SF3B1 mutations has been investigated in several studies. Initial reports (354 patients) suggested that SF3B1 mutations convey significantly longer OS, leukemia-free survival (LFS), and event-free survival.37 In a subsequent analysis of 323 MDS cases, patients carrying an SF3B1 mutation showed significantly better OS and no effect on LFS in RARS and refractory cytopenia with multilineage dysplasia with ringed sideroblasts (RCMD-RS). The “protective effect” of SF3B1 mutation on OS was lost in multivariate analysis that included clinical variables such as the International Prognostic Scoring System score.39 Our results in the whole cohort of 310 patients with MDS were generally consistent with those studies. However, we could not detect a significant difference in OS when subgroups with RS were analyzed separately, perhaps because of the low number of WT cases. More recently, a large series of patients with MDS (N = 317) showed that SF3B1 mutation status was not associated with time to AML progression or OS, regardless of whether all patients or only the subgroup of patients with RS were included in the analyses.40 Similar results were reported for myelofibrosis.41 When the survival benefit of SF3B1 mutations was examined in patients with MDS with increased RS, the presence of SF3B1 mutations was associated with better OS and LFS in univariate analysis; however, significance was completely accounted for by the World Health Organization categorization for morphologic risk, and no additional prognostic value from the presence or absence of SF3B1 mutations was observed when RARS and RCMD-RS were analyzed separately.42 In sum, most of the available results suggest that SF3B1 mutations convey better survival across whole cohorts of myeloid malignancies, including RARS and RCMD-RS. However, such effects appear weaker within each background-matched disease phenotype in multivariate analysis that includes, for example, the International Prognostic Scoring System score.

Most of the mutations we detected were heterozygous, indicating that homozygous mutations may lead to cell death or that some of the functional consequences are related to dominant negative effects. Other mutations, including those in LUC7L2, ZRSR2, and PRPF8, appear to be less prevalent, and, because of the small number of positive cases within the already large cohort studied, we were unable to establish whether these mutations correlated with characteristic phenotypic features and survival. It is possible that the mutations lead to distinct phenotypes by affecting different stages of splicing and/or by causing defective splicing of specific gene transcripts. It is also possible that the phenotypic features are related to the transcriptional spectrum of cells in which the mutations occur. For instance, SF3B1 mutations may affect splicing of transcripts coding for proteins associated with iron handling in erythroid precursors, leading to RS. Such phenotypes would then be absent in myeloid cells lacking the corresponding transcripts.

The precise pathogenetic mechanisms associated with facilitation of clonal evolution remain unclear. Mechanistically, defective splicing of specific genes may have consequences similar to those of loss-of-function mutations through the retention of introns (supplemental Figure 9). We performed deep sequencing of mRNA in hematopoietic cells derived from mutant cases and compared the results with those obtained in cells with the corresponding WT form of the gene. For U2AF1 and U2AF26, we found that accumulation of unspliced transcripts, rather than abnormal alternative splicing, is the main consequence of the mutations. We did not observe widespread defective splicing, but rather distinct changes in specific introns, as seen in RUNX1, that would functionally have the same effect as RUNX1 mutations. This splicing abnormality of RUNX1 is also observed in cases with SRSF2 mutations but not in SF3B1 mutants. U2AF1 and SRSF2 are frequently mutated in high-risk MDS and CMML, in which RUNX1 mutations are relevant; this observation supports our theory that spliceosomal mutations result in a phenotype similar to that of the corresponding loss-of-function mutations. Such an indirect alteration of key proteins known to be involved in malignant transformation could explain the leukemogenic effects of spliceosomal mutations, in particular how spliceosomal mutations may result in a phenocopy of features associated with known TSG mutations. The most important result of this study of the effect of spliceosomal mutations on the transcriptome is that the associated splicing defects affect a specific subset of mRNAs (Figure 3; supplemental Figures 8-9). Some of these RNAs code for key proteins involved in malignant transformation. Thus, there is a functional convergence of diverse spliceosomal mutations toward effects on specific genes, a phenomenon that may explain the similar phenotypes of some of the different mutations. In contrast, mutations in other genes can lead to unsplicing of distinct genes and thereby result in particular phenotypic features.
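
One common way to distinguish spliced from unspliced transcripts in RNA sequencing data is to inspect how each read aligns across an intron-exon boundary. The Python sketch below is only an illustration under stated assumptions: reads are represented as sorted, half-open aligned blocks, the coordinates are invented toy values, and `classify_read`, `count_reads`, and the `min_overhang` parameter are hypothetical names rather than the procedure actually used in this study.

```python
from collections import Counter
from typing import List, Tuple

# Half-open [start, end) aligned segment of one read (assumed representation).
Block = Tuple[int, int]


def classify_read(blocks: List[Block], exon_start: int, min_overhang: int = 3) -> str:
    """Classify one read at a 3' splice site, where the intron ends at exon_start.

    "unspliced": a single contiguous block covers both intron and exon.
    "spliced": an alignment gap (a removed intron) ends exactly at exon_start.
    "uninformative": the read does not address this boundary.
    Blocks are assumed to be sorted by start coordinate.
    """
    for start, end in blocks:
        if start <= exon_start - min_overhang and end >= exon_start + min_overhang:
            return "unspliced"
    for (_, prev_end), (next_start, _) in zip(blocks, blocks[1:]):
        if next_start == exon_start and prev_end < next_start:
            return "spliced"
    return "uninformative"


def count_reads(reads: List[List[Block]], exon_start: int) -> Counter:
    """Tally read classes at one boundary."""
    return Counter(classify_read(blocks, exon_start) for blocks in reads)


# Toy reads around a hypothetical boundary at coordinate 1000.
reads = [
    [(960, 1040)],               # runs through the boundary: unspliced
    [(700, 750), (1000, 1050)],  # gap ending at the boundary: spliced
    [(900, 980)],                # entirely intronic: uninformative
]
print(count_reads(reads, exon_start=1000))
```

The overhang requirement is a design choice intended to avoid counting reads that touch the boundary by only one or two bases; the appropriate threshold would depend on read length and aligner behavior.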

According to a recent report, knockdown of U2AF1 expression leads to the skipping of a specific exon of FECH during splicing.43 Because the same phenomenon was seen in our alternative splicing analysis of U2AF1 mutants by mRNA deep sequencing, U2AF1 mutations, unlike those in the other 2 most frequently mutated spliceosomal genes, might result in loss of function. Further studies will be needed to clarify the full range of alterations in mRNA splicing profiles induced by spliceosomal gene mutations.

Interestingly, spliceosomal mutations, specifically in SF3B1, were also reported in chronic lymphocytic leukemia.44-46 The discovery of recurrent somatic mutations in various genes encoding spliceosomal proteins indicates that spliceosomal defects constitute an important and ubiquitous pathway in malignant transformation. Further mutational screening in other malignancies might uncover common or unique mechanisms of tumorigenesis in various tissues.

The most relevant question, both biologically and clinically, is which other gene mutations are associated with the mutation status of individual spliceosomal genes. According to our preliminary results, U2AF1 mutations seem to be most commonly associated with ASXL1 and TET2 mutations, whereas SF3B1 mutations can occur in the context of RUNX1 mutations (data not shown). Some presentations by other groups at the 2011 meeting of the American Society of Hematology suggested a correlation between SF3B1 and DNMT3A mutations47,48 and a favorable survival effect of RUNX1 mutations in CMML with SRSF2 mutations.49 Because of the large number of diverse mutations, large studies will be needed to fully evaluate the correlation between the mutation status of spliceosomal genes and that of other genes.

In sum, our studies found the widespread presence of mutations in genes involved in splicing in myeloid neoplasms. Such mutations may lead to distinct phenotypes and, because they affect survival, their detection may have future diagnostic utility.

The online version of this article contains a data supplement.

Acknowledgments

The authors thank The Cancer Genome Atlas for access to the whole genome sequencing results described in the text.

This work was supported by the National Institutes of Health (grants RO1HL-082983, J.P.M.; U54 RR019391, J.P.M.; and K24 HL-077522, J.P.M.); by a grant from the AA & MDS International Foundation; and by the Robert Duggan Charitable Fund (J.P.M.).

Authorship

Contribution: H.M. designed and performed research, collected data, performed statistical analysis, and wrote the manuscript; V.V. performed research, collected data, and wrote the manuscript; H.S., A.M.J., S.A.K., A.J., and B.P. performed research; M.B., K.G., and M.G.A. collected data; M.A.S. and R.V.T. collected data, analyzed and interpreted data, and wrote the manuscript; R.A.P. contributed analytical tools, collected data, analyzed and interpreted data, and wrote the manuscript; and J.P.M. designed research, contributed analytical tools, collected data, analyzed and interpreted data, and wrote the manuscript.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Jaroslaw P. Maciejewski, Taussig Cancer Institute/R40, 9500 Euclid Ave, Cleveland, OH 44195; e-mail: maciejj@ccf.org.

References

1. Greenberg P, Cox C, LeBeau MM, et al. International scoring system for evaluating prognosis in myelodysplastic syndromes. Blood. 1997;89(6):2079-2088.
2. Vardiman JW, Thiele J, Arber DA, et al. The 2008 revision of the World Health Organization (WHO) classification of myeloid neoplasms and acute leukemia: rationale and important changes. Blood. 2009;114(5):937-951.
3. Tiu RV, Gondek LP, O'Keefe CL, et al. New lesions detected by single nucleotide polymorphism array-based chromosomal analysis have important clinical impact in acute myeloid leukemia. J Clin Oncol. 2009;27(31):5219-5226.
4. Ley TJ, Mardis ER, Ding L, et al. DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature. 2008;456(7218):66-72.
5. Bejar R, Stevenson K, Abdel-Wahab O, et al. Clinical effect of point mutations in myelodysplastic syndromes. N Engl J Med. 2011;364(26):2496-2506.
6. Mardis ER, Ding L, Dooling DJ, et al. Recurring mutations found by sequencing an acute myeloid leukemia genome. N Engl J Med. 2009;361(11):1058-1066.
7. Dunbar AJ, Gondek LP, O'Keefe CL, et al. 250K single nucleotide polymorphism array karyotyping identifies acquired uniparental disomy and homozygous mutations, including novel missense substitutions of c-Cbl, in myeloid malignancies. Cancer Res. 2008;68(24):10349-10357.
8. Sanada M, Suzuki T, Shih LY, et al. Gain-of-function of mutated C-CBL tumour suppressor in myeloid neoplasms. Nature. 2009;460(7257):904-908.
9. Bandi SR, Brandts C, Rensinghoff M, et al. E3 ligase-defective Cbl mutants lead to a generalized mastocytosis and myeloproliferative disease. Blood. 2009;114(19):4197-4208.
10. Makishima H, Cazzolli H, Szpurka H, et al. Mutations of E3 ubiquitin ligase Cbl family members constitute a novel common pathogenic lesion in myeloid malignancies. J Clin Oncol. 2009;27(36):6109-6116.
11. Jadersten M, Saft L, Smith A, et al. TP53 mutations in low-risk myelodysplastic syndromes with del(5q) predict disease progression. J Clin Oncol. 2011;29(15):1971-1979.
12. Beer PA, Delhommeau F, LeCouedic JP, et al. Two routes to leukemic transformation after a JAK2 mutation-positive myeloproliferative neoplasm. Blood. 2010;115(14):2891-2900.
13. Jasek M, Gondek LP, Bejanyan N, et al. TP53 mutations in myeloid malignancies are either homozygous or hemizygous due to copy number-neutral loss of heterozygosity or deletion of 17p. Leukemia. 2010;24(1):216-219.
14. Ley TJ, Ding L, Walter MJ, et al. DNMT3A mutations in acute myeloid leukemia. N Engl J Med. 2010;363(25):2424-2433.
15. Shah MY, Licht JD. DNMT3A mutations in acute myeloid leukemia. Nat Genet. 2010;43(4):289-290.
16. Green A, Beer P. Somatic mutations of IDH1 and IDH2 in the leukemic transformation of myeloproliferative neoplasms. N Engl J Med. 2010;362(4):369-370.
17. Delhommeau F, Dupont S, Della Valle V, et al. Mutation in TET2 in myeloid cancers. N Engl J Med. 2009;360(22):2289-2301.
18. Jankowska AM, Szpurka H, Tiu RV, et al. Loss of heterozygosity 4q24 and TET2 mutations associated with myelodysplastic/myeloproliferative neoplasms. Blood. 2009;113(25):6403-6410.
19. Ernst T, Chase AJ, Score J, et al. Inactivating mutations of the histone methyltransferase gene EZH2 in myeloid disorders. Nat Genet. 2010;42(8):722-726.
20. Nikoloski G, Langemeijer SM, Kuiper RP, et al. Somatic mutations of the histone methyltransferase gene EZH2 in myelodysplastic syndromes. Nat Genet. 2010;42(8):665-667.
21. Makishima H, Jankowska AM, Tiu RV, et al. Novel homo- and hemizygous mutations in EZH2 in myeloid malignancies. Leukemia. 2010;24(10):1799-1804.
22. van Haaften G, Dalgliesh GL, Davies H, et al. Somatic mutations of the histone H3K27 demethylase gene UTX in human cancer. Nat Genet. 2009;41(5):521-523.
23. Gelsi-Boyer V, Trouplin V, Adelaide J, et al. Mutations of polycomb-associated gene ASXL1 in myelodysplastic syndromes and chronic myelomonocytic leukaemia. Br J Haematol. 2009;145(6):788-800.
24. Boultwood J, Perry J, Zaman R, et al. High-density single nucleotide polymorphism array analysis and ASXL1 gene mutation screening in chronic myeloid leukemia during disease progression. Leukemia. 2010;24(6):1139-1145.
25. Kohlmann A, Grossmann V, Klein HU, et al. Next-generation sequencing technology reveals a characteristic pattern of molecular mutations in 72.8% of chronic myelomonocytic leukemia by detecting frequent alterations in TET2, CBL, RAS, and RUNX1. J Clin Oncol. 2010;28(24):3858-3865.
26. Jankowska AM, Makishima H, Tiu RV, et al. Mutational spectrum analysis of chronic myelomonocytic leukemia includes genes associated with epigenetic regulation: UTX, EZH2, and DNMT3A. Blood. 2011;118(14):3932-3941.
27. Shaffer LG, Tommerup N. ISCN 2009. An International System for Human Cytogenetics Nomenclature. Basel, Switzerland: Karger; 2009.
28. Maciejewski JP, Tiu RV, O'Keefe C. Application of array-based whole genome scanning technologies as a cytogenetic tool in haematological malignancies. Br J Haematol. 2009;146(5):479-488.
29. Gondek LP, Tiu R, O'Keefe CL, Sekeres MA, Theil KS, Maciejewski JP. Chromosomal lesions and uniparental disomy detected by SNP arrays in MDS, MDS/MPD, and MDS-derived AML. Blood. 2008;111(3):1534-1542.
30. Centre for Applied Genomics. Database of Genomic Variants. Accessed April 1, 2011.
31. Nannya Y, Sanada M, Nakazaki K, et al. A robust algorithm for copy number detection using high-density oligonucleotide single nucleotide polymorphism genotyping arrays. Cancer Res. 2005;65(14):6071-6079.
32. National Center for Biotechnology Information. Entrez Gene. Accessed April 7, 2011.
33. European Molecular Biology Laboratory and European Bioinformatics Institute. Ensembl Genome Browser. Accessed April 1, 2011.
34. Makishima H, Jankowska AM, McDevitt MA, et al. CBL, CBLB, TET2, ASXL1, and IDH1/2 mutations and additional chromosomal aberrations constitute molecular events in chronic myelogenous leukemia. Blood. 2011;117(21):e198-e206.
35. Visconte V, Makishima H, Jankowska A, et al. SF3B1, a splicing factor is frequently mutated in refractory anemia with ring sideroblasts [published online ahead of print September 2, 2011]. Leukemia.
36. Yoshida K, Sanada M, Shiraishi Y, et al. Frequent pathway mutations of splicing machinery in myelodysplasia. Nature. 2011;478(7367):64-69.
37. Papaemmanuil E, Cazzola M, Boultwood J, et al. Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts. N Engl J Med. 2011;365(15):1384-1395.
38. Graubert TA, Shen D, Ding L, et al. Recurrent mutations in the U2AF1 splicing factor in myelodysplastic syndromes. Nat Genet. 2011;44(1):53-57.
39. Malcovati L, Papaemmanuil E, Bowen DT, et al. Clinical significance of SF3B1 mutations in myelodysplastic syndromes and myelodysplastic/myeloproliferative neoplasms. Blood. 2011;118(26):6239-6246.
40. Damm F, Thol F, Kosmider O, et al. SF3B1 mutations in myelodysplastic syndromes: clinical associations and prognostic implications [published online ahead of print November 8, 2011]. Leukemia.
41. Lasho TL, Finke CM, Hanson CA, et al. SF3B1 mutations in primary myelofibrosis: clinical, histopathology and genetic correlates among 155 patients [published online ahead of print November 8, 2011]. Leukemia.
42. Patnaik MM, Lasho TL, Hodnefield JM, et al. SF3B1 mutations are prevalent in myelodysplastic syndromes with ring sideroblasts but do not hold independent prognostic value. Blood. 2012;119(2):569-572.
43. Fu Y, Masuda A, Ito M, Shinmi J, Ohno K. AG-dependent 3′-splice sites are predisposed to aberrant splicing due to a mutation at the first nucleotide of an exon. Nucleic Acids Res. 2011;39(10):4396-4404.
44. Rossi D, Bruscaggin A, Spina V, et al. Mutations of the SF3B1 splicing factor in chronic lymphocytic leukemia: association with progression and fludarabine-refractoriness. Blood. 2011;118(26):6904-6908.
45. Wang L, Lawrence MS, Wan Y, et al. SF3B1 and other novel cancer genes in chronic lymphocytic leukemia. N Engl J Med. 2011;365(26):2497-2506.
46. Quesada V, Conde L, Villamor N, et al. Exome sequencing identifies recurrent mutations of the splicing factor SF3B1 gene in chronic lymphocytic leukemia. Nat Genet. 2011;44(1):47-52.
47. Bejar R, Stevenson K, Caughey B, et al. Validation of a prognostic model and the impact of SF3B1, DNMT3A, and other mutations in 289 genetically characterized lower risk MDS patient samples [abstract]. Blood. 2011;118(21). Abstract 969.
48. Smith AE, Mian SA, Kulasekararaj AG, Mohamedali AM, Mufti GJ. Whole exome sequencing reveals acquired SF3B1 mutations defining patients with acquired idiopathic sideroblastic anaemia [abstract]. Blood. 2011;118(21). Abstract 2793.
49. Schnittger S, Meggendorfer M, Kohlmann A, et al. SRSF2 is mutated in 47.2% (77/163) of chronic myelomonocytic leukemia (CMML) and prognostically favorable in cases with concomitant RUNX1 mutations [abstract]. Blood. 2011;118(21). Abstract 274.