Dear Pantelis,

at
http://genomics-lab.fleming.gr/fleming/PHlab/run341/rpkm/gencode
you'll find the tables

RPKM_PHR20r-bcatPoly_vs_PHR24r-IgG.csv
RPKM_PHR21r-bcatMono_vs_top5pct-PHR24r-IgG.csv
RPKM_PHR22r-TCF4_vs_top5pct-PHR24r-IgG.csv
RPKM_PHR23r-FUBP2_vs_PHR24r-IgG.csv

improved as discussed:
- using the latest (lncRNA) annotation from Gencode
(old ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_26/GRCh37_mapping/gencode.v26lift37.annoation.gtf.gz)

ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_26/GRCh37_mapping/gencode.v26lift37.annotation.gtf.gz
- elimination of duplicates by keeping only the exon with the highest RPKM foldchange (vs IgG) for each gene
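
The duplicate elimination described above can be sketched roughly as follows. This is only an illustration, not the script that produced the tables: the function name and the toy rows are made up, and only the column names "Geneid" and "fc_vs_PHR24r.IgG.bam" are taken from the table headers.

```python
def keep_best_exon_per_gene(rows, gene_key="Geneid", fc_key="fc_vs_PHR24r.IgG.bam"):
    """Return one row per gene: the exon with the highest fold change vs IgG."""
    best = {}
    for row in rows:
        gene = row[gene_key]
        # Keep the first exon seen for a gene, replace it only by a strictly higher fold change.
        if gene not in best or float(row[fc_key]) > float(best[gene][fc_key]):
            best[gene] = row
    return list(best.values())


# Toy rows with invented values, only to show the behaviour.
example_rows = [
    {"Geneid": "geneA", "Start": 100, "fc_vs_PHR24r.IgG.bam": 2.5},
    {"Geneid": "geneA", "Start": 900, "fc_vs_PHR24r.IgG.bam": 4.0},
    {"Geneid": "geneB", "Start": 50, "fc_vs_PHR24r.IgG.bam": 3.1},
]
deduplicated = keep_best_exon_per_gene(example_rows)
```

With ties, this sketch keeps the exon that appears first in the table; the real tables may have been handled differently.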

I also checked correlations in scatterplots using:
- only RPKMs:

RPKM_bcatMono_vs_bcatPoly.png	Pearson's product-moment correlation: 0.9530492
RPKM_bcatMono_vs_TCF4.png	Pearson's product-moment correlation: 0.9376876 
RPKM_bcatPoly_vs_TCF4.png	Pearson's product-moment correlation: 0.9342349
RPKM_FUBP2_vs_TCF4.png		Pearson's product-moment correlation: 0.6638814
RPKM_TCF4_vs_IgG.png		Pearson's product-moment correlation: 0.9251441 (!)

- RPKM foldchange vs IgG or Input:

IgG_foldchange_bcatMono_vs_bcatPoly.png		Pearson's product-moment correlation: 0.6942124 
IgG_foldchange_bcatMono_vs_TCF4.png		Pearson's product-moment correlation: 0.6428579
IgG_foldchange_bcatPoly_vs_TCF4.png		Pearson's product-moment correlation: 0.6625278
IgG_foldchange_FUBP2_vs_TCF4.png		Pearson's product-moment correlation: 0.3768401

Input_foldchange_bcatMono_vs_bcatPoly.png	Pearson's product-moment correlation: 0.7967128
Input_foldchange_bcatMono_vs_TCF4.png		Pearson's product-moment correlation: 0.7605851
Input_foldchange_bcatPoly_vs_TCF4.png		Pearson's product-moment correlation: 0.7418563
Input_foldchange_FUBP2_vs_TCF4.png		Pearson's product-moment correlation: 0.3833719 

These correlations suggest:
- correlation bcatMono_vs_bcatPoly is highest
- correlations bcatMono_vs_TCF4 and bcatPoly_vs_TCF4 are slightly lower than bcatMono_vs_bcatPoly
- using RPKM foldchange against IgG or Input has stronger signal than single RPKM contrasts.
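
For reference, a Pearson product-moment correlation like the ones listed above can be computed as in this minimal sketch. The function name is hypothetical and the code is not the one used to obtain the reported values; it only spells out the standard formula (covariance divided by the product of the standard deviations).

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equally long numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of the same length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

In practice the two inputs would be the matching RPKM (or fold change) columns of two samples, one value per exon.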

Please check the new tables (note that our browser needs an update for the Gencode annotation).
I'll add a GO term enrichment analysis later.

BW,
Martin


Dear Pantelis,
at
http://genomics-lab.fleming.gr/fleming/PHlab/run341/rpkm
RPKM_PHR20r-bcatPoly.bam_vs_PHR24r-IgG.csv
RPKM_PHR21r-bcatMono.bam_vs_PHR24r-IgG.csv
RPKM_PHR22r-TCF4.bam_vs_PHR24r-IgG.csv 
RPKM_PHR23r-FUBP2.bam_vs_PHR24r-IgG.csv

you'll find my custom RPKM analysis of the RIP data.
Coverage on each exon is counted with the featureCounts tool of the Subread package.
The tables contain exons that have at least 10 reads in the condition and
an RPKM more than 2-fold higher than in IgG.
The RPKM fold change vs IgG is in the column "fc_vs_PHR24r.IgG.bam".
The RPKM fold change vs Input is in the column "fc_vs_PHR19r.Input.bam".
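
As a rough sketch of the calculation behind these columns, assuming the usual RPKM definition (reads per kilobase of exon per million reads in the library): the function names, the library size, and the handling of a zero IgG value are assumptions for illustration, not necessarily what was done for the tables.

```python
def rpkm(read_count, exon_length_bp, library_size):
    """Reads per kilobase of exon length per million reads in the library."""
    return read_count * 1e9 / (exon_length_bp * library_size)


def fold_change(rpkm_condition, rpkm_control):
    """Ratio of condition over control; zero control is an assumed special case."""
    if rpkm_control == 0:
        return float("inf") if rpkm_condition > 0 else 0.0
    return rpkm_condition / rpkm_control


def passes_filter(read_count, fc_vs_igg, min_reads=10, min_fc=2.0):
    """At least 10 reads in the condition and more than 2-fold over IgG."""
    return read_count >= min_reads and fc_vs_igg > min_fc


# Toy exon with invented numbers: 500 bp long, 40 reads in the condition, 10 in IgG,
# both libraries assumed to contain 10 million reads.
rpkm_condition = rpkm(40, 500, 10_000_000)
rpkm_igg = rpkm(10, 500, 10_000_000)
fc_vs_igg = fold_change(rpkm_condition, rpkm_igg)
keep_exon = passes_filter(40, fc_vs_igg)
```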

All column headers are:
"Geneid"                 "Chr"                    "Start"                 
"End"                    "Strand"                 "Length"                
"PHR23r.FUBP2.bam"       "PHR22r.TCF4.bam"        "PHR21r.bcatMono.bam"   
"PHR20r.bcatPoly.bam"    "PHR24r.IgG.bam"         "PHR19r.Input.bam"      
"con_b.bam"              "con_a.bam"              "RPKM_PHR22r.TCF4.bam"  
"RPKM_PHR24r.IgG.bam"    "fc_vs_PHR24r.IgG.bam"   "RPKM_PHR19r.Input.bam" 
"fc_vs_PHR19r.Input.bam"

BW,
Martin


Dear Pantelis,

at
http://genomics-lab.fleming.gr/fleming/PHlab/run341/ripseeker/

you will find folders with RIPSeeker 
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3632129/pdf/gkt142.pdf
analysis for the following contrasts (1st set, 2nd set you defined):

RIPSeeker_bcatMono_vs_IgG_bothStrands
RIPSeeker_bcatPoly_vs_IgG_bothStrands
RIPSeeker_FUBP2_vs_IgG_bothStrands
RIPSeeker_TCF4_vs_IgG_bothStrands

RIPSeeker_bcatMono_vs_Input_bothStrands
RIPSeeker_bcatPoly_vs_Input_bothStrands
RIPSeeker_FUBP2_vs_Input_bothStrands
RIPSeeker_TCF4_vs_Input_bothStrands

In each folder, the files are:
RIPregions_annotated.txt
RIPregions_enrichedGO.txt
RIPregions_annotated.gff3
RIPregions.gff3

The file
RIPregions_annotated.txt
to be loaded with Excel, has the list of RIP regions with
summed read count,
FPK (fragments per kilobase of region length), representing a normalized read count,
averaged log-odds score,
p-value and
adjusted p-value

RIPregions_enrichedGO.txt
has a GOterm enrichment analysis
and the gff files can be added as UCSC tracks.

For the 3rd set analysis, I found another tool called ASPeak
https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btt428
that considers matching RNAseq data (to be defined) for normalization.


BW,
Martin


Dear Pantelis,

the tracks for run341 (all samples downsampled to the read count of PHR23r-FUBP2, 7250244 reads) are ready at:
http://genomics-lab.fleming.gr/cgi-bin/hgTracks?db=hg19&hubUrl=http://genomics-lab.fleming.gr/fleming/PHlab/run341/hub.txt

Id nreads
PHR19r-Input 12500491
PHR20r-bcatPoly 9498439
PHR21r-bcatMono 10141601
PHR22r-TCF4 9480555
PHR23r-FUBP2 7250244
PHR24r-IgG 9183788
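
To make the sampling explicit, the fraction of reads kept per sample follows directly from the counts above. This is a minimal sketch with hypothetical variable names; it only does the arithmetic and does not reflect which subsampling tool was used. A fraction of this kind is what subsampling tools typically expect (for example the fractional part of the `samtools view -s` argument).

```python
# Read counts per sample, copied from the table above.
read_counts = {
    "PHR19r-Input": 12500491,
    "PHR20r-bcatPoly": 9498439,
    "PHR21r-bcatMono": 10141601,
    "PHR22r-TCF4": 9480555,
    "PHR23r-FUBP2": 7250244,
    "PHR24r-IgG": 9183788,
}

# The smallest library defines the common target depth.
target_reads = min(read_counts.values())

# Fraction of reads to keep so that every sample ends up near the target depth.
sampling_fractions = {
    sample: target_reads / n_reads for sample, n_reads in read_counts.items()
}

for sample, fraction in sampling_fractions.items():
    print(f"{sample}\t{fraction:.4f}")
```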

BW
Martin