Dear Pantelis,

1) using the strain 'PA14 as a popular reference genome' (quoted from Dr. Apid.)

a) the mapping anomaly vanishes:

YAbR21-OD3-B133-3.bam
8161752 reads; of these:
  8161752 (100.00%) were unpaired; of these:
    317754 (3.89%) aligned 0 times
    1774200 (21.74%) aligned exactly 1 time
    6069798 (74.37%) aligned >1 times
96.11% overall alignment rate

YAbR24-OD3-C3719-3.bam
4777820 reads; of these:
  4777820 (100.00%) were unpaired; of these:
    292570 (6.12%) aligned 0 times
    2012491 (42.12%) aligned exactly 1 time
    2472759 (51.75%) aligned >1 times
93.88% overall alignment rate

b) the rRNA fractions are HUGE, really:

In YAbR21-OD3-B133-3.bam 7839425 reads map to any annotation, 5958688 map to rRNAs
In YAbR24-OD3-C3719-3.bam 4618578 reads map to any annotation, 1980868 map to rRNAs

2) Using the strain-specific mapping and annotation:

In YAbR21-OD3-B133-3.bam 7908761 reads map to any annotation, 5946347 map to rRNAs
In YAbR24-OD3-C3719-3.bam 2739311 reads map to any annotation, 371275 map to rRNAs

BW,
Martin


Dear Pantelis,

a) the Pseudomonas aeruginosa sample IonXpress_003 YAbR21-OD3-B133-3 was aligned to strain B136-33
https://www.ncbi.nlm.nih.gov/genome/187?genome_assembly_id=165506

Mapping statistics:
8086125 reads; of these:
  8086125 (100.00%) were unpaired; of these:
    217419 (2.69%) aligned 0 times
    1812687 (22.42%) aligned exactly 1 time
    6056019 (74.89%) aligned >1 times
97.31% overall alignment rate

Comparison with the annotation gives:
# gffcompare v0.10.1 | Command line was:
#/data/results/tools/rnaseq/stringtie/gffcompare-0.10.1.Linux_x86_64/gffcompare -r GCF_000359505.1_ASM35950v1_genomic.gff -o compare B136-33.gtf
#
#= Summary for dataset: B136-33.gtf
#     Query mRNAs : 6040 in 5105 loci (2 multi-exon transcripts)
#            (2 multi-transcript loci, ~1.2 transcripts per locus)
# Reference mRNAs : 6062 in 5112 loci (0 multi-exon)
# Super-loci w/ reference transcripts: 5084
#-----------------| Sensitivity | Precision |
        Base level:     99.6    |    99.9   |
        Exon level:     99.3    |    99.6   |
  Transcript level:     99.3    |    99.7   |
       Locus level:     99.5    |    99.6   |

   Matching intron chains: 0
     Matching transcripts: 6020
            Matching loci: 5086

      Missed exons: 27/6062 (  0.4%)
       Novel exons: 18/6042 (  0.3%)
     Novel introns: 2/2 (100.0%)
       Missed loci: 26/5112 (  0.5%)
        Novel loci: 18/5105 (  0.4%)

 Total union super-loci across all input datasets: 5102
6040 out of 6040 consensus transcripts written in compare.annotated.gtf (0 discarded as redundant)

b) the Pseudomonas aeruginosa sample IonXpress_005 YAbR24-OD3-C3719-3 was aligned to strain C3719
https://www.ncbi.nlm.nih.gov/genome/187?genome_assembly_id=165527

Mapping statistics:
5050514 reads; of these:
  5050514 (100.00%) were unpaired; of these:
    2129053 (42.16%) aligned 0 times
    2117581 (41.93%) aligned exactly 1 time
    803880 (15.92%) aligned >1 times
57.84% overall alignment rate

Comparison with the annotation gives:
# gffcompare v0.10.1 | Command line was:
#/data/results/tools/rnaseq/stringtie/gffcompare-0.10.1.Linux_x86_64/gffcompare -r GCF_000152525.1_ASM15252v1_genomic.gff -o compare C3719.gtf
#
#= Summary for dataset: C3719.gtf
#     Query mRNAs : 5721 in 4876 loci (0 multi-exon transcripts)
#            (0 multi-transcript loci, ~1.2 transcripts per locus)
# Reference mRNAs : 5692 in 4844 loci (0 multi-exon)
# Super-loci w/ reference transcripts: 4831
#-----------------| Sensitivity | Precision |
        Base level:     99.9    |    99.6   |
        Exon level:     99.7    |    99.2   |
  Transcript level:     99.7    |    99.2   |
       Locus level:     99.7    |    99.1   |

   Matching intron chains: 0
     Matching transcripts: 5676
            Matching loci: 4831

      Missed exons: 13/5692 (  0.2%)
       Novel exons: 45/5721 (  0.8%)
       Missed loci: 13/4844 (  0.3%)
        Novel loci: 45/4876 (  0.9%)

 Total union super-loci across all input datasets: 4876
5721 out of 5721 consensus transcripts written in compare.annotated.gtf (0 discarded as redundant)

Visualizations in IGV are available.

BW,
Martin

http://software.broadinstitute.org/software/igv/download
Launch with 750 MB


Dear Pantelis,

the bacterial 3UTRseq analysis has finished.

R_2017_05_23_12_31_18_user_IONAS-347-DKlab_DrApid_170523_DK3R1-12_YA3R1-6_YAbR21-24.YAbR21-OD3-B133-3.IonXpress_003.fastq
  B136-33
  Unpublished
  https://www.ncbi.nlm.nih.gov/genome/187?genome_assembly_id=165506

R_2017_05_23_12_31_18_user_IONAS-347-DKlab_DrApid_170523_DK3R1-12_YA3R1-6_YAbR21-24.YAbR24-OD3-C3719-3.IonXpress_005.fastq
  C3719
  https://dx.doi.org/10.1073%2Fpnas.0711982105
  https://www.ncbi.nlm.nih.gov/genome/187?genome_assembly_id=165527


---------- Forwarded message ----------
From: Theodoulakis Christofi
Date: 31 May 2017 at 23:17
Subject: Re: Pseudomonas aeruginosa PAO1
To: Yiorgos Apidianakis
Cc: Pantelis Hatzis

Good evening,

Strains are:

UCBPP-PA14
  https://dx.doi.org/10.1186%2Fgb-2006-7-10-r90
  https://www.ncbi.nlm.nih.gov/genome/187?genome_assembly_id=299954

B136-33
  Unpublished
  https://www.ncbi.nlm.nih.gov/genome/187?genome_assembly_id=165506

MTB-1
  https://dx.doi.org/10.1128%2FgenomeA.01130-13
  https://www.ncbi.nlm.nih.gov/genome/187?genome_assembly_id=165510

CF5
  Unpublished
  https://www.ncbi.nlm.nih.gov/genome/187?genome_assembly_id=165612

C3719
  https://dx.doi.org/10.1073%2Fpnas.0711982105
  https://www.ncbi.nlm.nih.gov/genome/187?genome_assembly_id=165527

PACS2
  https://dx.doi.org/10.1016%2Fj.ygeno.2008.02.005
  https://www.ncbi.nlm.nih.gov/genome/187?genome_assembly_id=165500

Lakis


From: "Yiorgos Apidianakis"
To: "Pantelis Hatzis"
Cc: "Theodoulakis Christofi"
Sent: Wednesday, May 31, 2017 6:54:01 PM
Subject: Re: Pseudomonas aeruginosa PAO1

Pantelis,

we sent you RNAs for 6 different strains of Pseudomonas aeruginosa. They are 6 of the following 7: B136, MTB-1, 213BR, PA14, CF5, C3719, PACS2 (with this exact name on their files). Lakis can verify.

I don't know what the best strategy is: using PA14 as a popular reference genome, or the exact genome of each strain? I guess that, since the strains are sequenced, using the exact genome would be best.

Lakis, please send references for all 6 genomes.

Yiorgos

****************************************************************************
Yiorgos Apidianakis, Ph.D., Assistant Professor
1 Panepistimiou Ave., Department of Biological Sciences
University of Cyprus, 2109 Aglatzia, Cyprus
Phone: +357-22893767 (office) or +357-22893968 (lab)
e-mail: apidiana@ucy.ac.cy
P.O.Box 20537, 1678 Nicosia, Cyprus
http://ucy.ac.cy/dir/en/component/comprofiler/userprofile/apidiana
****************************************************************************
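

Note on the rRNA counts quoted at the top of this thread: the fractions behind "HUGE" can be worked out directly from the two numbers given per BAM file. The sketch below is only an illustration, not part of the original analysis. It assumes the fraction of interest is rRNA reads divided by reads mapping to any annotation (not by all sequenced reads), and the names counts, rrna_fraction and fractions are made up for this example.

```python
# Counts copied from the first message of this thread:
# (reads mapping to any annotation, reads mapping to rRNAs).
counts = {
    ("PA14 reference", "YAbR21-OD3-B133-3"): (7839425, 5958688),
    ("PA14 reference", "YAbR24-OD3-C3719-3"): (4618578, 1980868),
    ("strain-specific", "YAbR21-OD3-B133-3"): (7908761, 5946347),
    ("strain-specific", "YAbR24-OD3-C3719-3"): (2739311, 371275),
}


def rrna_fraction(annotated_reads, rrna_reads):
    """Return rRNA reads as a fraction of annotated reads.

    Assumption: the denominator is 'reads mapping to any annotation',
    as reported in the message, not the total number of reads.
    """
    if annotated_reads <= 0:
        raise ValueError("annotated_reads must be positive")
    return rrna_reads / annotated_reads


fractions = {key: rrna_fraction(*pair) for key, pair in counts.items()}

for (reference, sample), fraction in fractions.items():
    print(f"{reference:16s} {sample:20s} {fraction:6.1%}")
```

Printing the four values side by side makes it easy to compare the two samples under the PA14 reference and under the strain-specific references.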