Dear Gordon,

Attached please find Ray's detailed response to your questions. I believe that the response is quite self-contained. For even more details about the meaning of P_cutoff, P_SH, the ranking, and the interpretation of the results, you can also consult our NAR paper from January.

Good luck, and please let us know once you get the first results!

Best regards,
Jiri

________________________________________________
Jiri Vanicek
Tenure-Track Assistant Professor
Laboratory of Theoretical Physical Chemistry
EPFL SB ISIC LCPT, BCH 3110, CH-1015 Lausanne
Phone: +41 21 693 47 36, Fax: +41 21 693 97 55
E-mail: jiri.vanicek@epfl.ch, Web: http://lcpt.epfl.ch
________________________________________________

From: Ray Marin [mailto:ray.marin@epfl.ch]
Sent: Wednesday, 22 June 2011 11:54
To: Vanicek Jiri
Subject: Re: FW: FW: FW: FW: PACMIT - bind_hex error message

I've emailed three questions earlier this afternoon -

1. Which version of ViennaRNA should I use? 1.6.1 or 1.8.X?
- 1.8.X

2. What p-val cutoff value would you recommend?
- We recommend using 0.2. In the paper we show that this value offers the most significant improvement in precision (in both Drosophila and human) with respect to the method without the accessibility filter.

3. Can I adjust the limits to input sequence lengths?
- No. These values are fixed in the code and so far cannot be modified by the user. Due to the assumptions underlying the model (i.e., coevolution of the miRNA and the target), the scoring function in PACMIT, i.e., P_SH, is expected to work properly in 3'UTRs only. In principle you could use any kind of sequence as input, but in the case of coding sequences the overrepresentation captured by P_SH would be mainly due to the conservation required to keep the protein functional, and not to the conservation required to keep the miRNA-target interaction.
As for the 5'UTR, in principle you could use PACMIT; however, we do not yet have any experience predicting targets in this kind of sequence. For these reasons, a length of 20000 is enough to analyze the annotated 3'UTRs of several species (e.g. human, mouse, Drosophila). The maximum number of miRNAs has been chosen taking into account that the species with the highest number of miRNAs (i.e., human) has ~1220 mature sequences. However, I have checked, and the maximum number of miRNA sequences allowed by the compiler is 16356. I am sending a new version with extended limits: maximum number of miRNAs = 16356, maximum 3'UTR length = 50000. (I checked, and the last version I sent has by mistake a maximum number of miRNAs = 1000 instead of 10000.)

4. Any comments that you could send on how to interpret the outputs would also be very helpful. We are very close to being able to start to test Pacmit at higher throughput.
- The output that you obtain from pacmit.pl is a list of miRNA-target interactions ranked according to P_SH. The lower the P_SH, the higher the chance that the interaction is biologically functional, and therefore the higher the pair ranks in the list. This means that if you try to validate the predictions, you will find more true positives among the top predictions than among the middle or bottom predictions.

Best regards,
Ray

On 06/22/2011 12:18 AM, Vanicek Jiri wrote:

Hello Ray,

Please think about these questions and prepare a response to send me.

Thanks,
Jiri

From: Gordon Robertson [mailto:grobertson@bcgsc.ca]
Sent: Wednesday, June 22, 2011 12:09 AM
To: Martin Florez Ray Marcel
Cc: Vanicek Jiri
Subject: Re: FW: FW: FW: PACMIT - bind_hex error message

Ray, Jiri,

I'm sorry, for the question about the ViennaRNA version, I think I was confusing Pacmit with another target prediction tool that I'm testing, which requires v1.6.1. Your README says clearly: v1.8.X.
Gordon

Gordon Robertson wrote:

Ray, Jiri

The README's examples ran on CentOS 5.5 (final), as you suggested. Thank you.

"Example calculations can be run using the sequences provided in Sequences/:
>cd $BIN_PACMIT/Sequences/Example_utrs/
>../../rnaplfold.pl example_utr.fa 80 40"

"Example calculations can be run using the sequences provided in Sequences/:
>cd $BIN_PACMIT
>./pacmit.pl Sequences/Example_utrs/example_utr.fa Sequences/example_mirna.fa 2 8 1 0.2"

I emailed three questions earlier this afternoon -

1. Which version of ViennaRNA should I use? 1.6.1 or 1.8.X?
2. What p-val cutoff value would you recommend?
3. Can I adjust the limits to input sequence lengths?

Any comments that you could send on how to interpret the outputs would also be very helpful. We are very close to being able to start to test Pacmit at higher throughput.

Thanks for your help.

Gordon

Ray Marin wrote:

Dear Gordon,

I have read the output in your last email, and I noticed that you are obtaining only a segmentation fault, caused by the lack of arguments in the execution of each of the C commands (i.e., bind_hex and oligo_mm). This means that most likely the compatibility problem is already solved; now you just need to follow the instructions in the README file. In the output I attach below, you can see that if I call these programs without arguments I also get the error, but you can also see that calling them with the appropriate arguments works fine. In the end, however, you don't need to worry about executing these commands directly (as you are trying now), because that is what the Perl script 'pacmit.pl' does for you (along with some additional processing).

I hope that it works for you now, but please don't hesitate to contact us again if you still have problems. We appreciate that you are taking the time to make it work properly, and we will be happy to provide you with all the support you might require.
Best regards,
Ray

---------------------------------------------------------
ray@lcibpc10:~/Documents/Research/MicroRNA/Code/Pacmit_public_v2$ ll
total 1.8M
-rwxr-xr-x 1 ray users 865K 2011-06-21 14:58 bind_hex
drwxr-xr-x 2 ray users 4.0K 2011-06-21 15:09 example_mirna
-rwxr--r-- 1 ray users 1.8K 2011-01-20 09:56 extract_miRNAs_names.pl
-rwxr--r-- 1 ray users  545 2011-01-20 09:56 fasta2vienna.awk
-rwxr-xr-x 1 ray users 866K 2011-06-21 14:58 oligo_mm
-rwxr--r-- 1 ray users 4.3K 2011-01-20 10:39 pacmit.pl
-rw-r--r-- 1 ray users 3.0K 2011-06-21 15:10 README
-rwxr--r-- 1 ray users 2.5K 2011-01-20 10:33 rnaplfold.pl
drwxr-xr-x 3 ray users 4.0K 2011-01-20 09:56 Sequences
-rwxr-xr-x 1 ray users  176 2011-01-20 09:56 single_pair.awk
ray@lcibpc10:~/Documents/Research/MicroRNA/Code/Pacmit_public_v2$ ./bind_hex
Segmentation fault
ray@lcibpc10:~/Documents/Research/MicroRNA/Code/Pacmit_public_v2$ ./oligo_mm
Segmentation fault
ray@lcibpc10:~/Documents/Research/MicroRNA/Code/Pacmit_public_v2$ ./bind_hex Sequences/example_mirna.fa 2 8
ATTCTTT GACTGTT
ray@lcibpc10:~/Documents/Research/MicroRNA/Code/Pacmit_public_v2$ seeds=`./bind_hex Sequences/example_mirna.fa 2 8`
ray@lcibpc10:~/Documents/Research/MicroRNA/Code/Pacmit_public_v2$ ./oligo_mm Sequences/Example_utrs/example_utr.fa 2 3 0 1 1 4 0.2 0 0 $seeds
ENSG00000127720 109 0 0.037 0.000 1 0.008 -2.080
ENSG00000161057 121 0 0.050 0.000 0 0.005 0.000
ENSG00000051596 300 2 0.071 -2.625 0 0.031 0.000
ray@lcibpc10:~/Documents/Research/MicroRNA/Code/Pacmit_public_v2$ ./pacmit.pl Sequences/Example_utrs/example_utr.fa Sequences/example_mirna.fa 2 8 1 0.2
pacmit.pl started at Tue Jun 21 18:55:31 CEST 2011
3'UTR fasta file: Sequences/Example_utrs/example_utr.fa
miRNA fasta file: Sequences/example_mirna.fa
miRNA seed: 2-8
Accessibility: 1
P_cutoff: 0.2
pacmit.pl finished at Tue Jun 21 18:55:31 CEST 2011
Output file written to: ./example_mirna/predictions_2_8_1_0.2
ray@lcibpc10:~/Documents/Research/MicroRNA/Code/Pacmit_public_v2$ cat example_mirna/predictions_2_8_1_0.2
#Sequence ID      t_access  No. miRNA  c_access  Expected c_access  log(P_SH)  miRNA name
ENSG00000051596   300       1          2         0.071              -2.625     hsa-miR-186
ENSG00000127720   109       2          1         0.008              -2.080     hsa-miR-212
---------------------------------------------------------

Ray Marin
PhD student
Laboratory of Theoretical Physical Chemistry
Ecole Polytechnique Federale de Lausanne (EPFL)
EPFL SB ISIC LCPT
BCH 3121
CH-1015 Lausanne
phone: +41 21 693 9477
Email: ray.marin@epfl.ch

On 06/21/2011 04:04 PM, Vanicek Jiri wrote:

Segmentation fault

--
Gordon Robertson
BC Cancer Agency Genome Sciences Centre
Vancouver BC Canada
604-707-5800
www.bcgsc.ca
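
As a rough illustration of the ranking Ray describes in his answer to question 4 (the lower the P_SH, the higher the pair ranks), a predictions file can be re-sorted on its log(P_SH) column. This is only a minimal Python sketch and not part of PACMIT: the function name rank_predictions is hypothetical, and it assumes whitespace-separated rows in which the miRNA name is the last field and log(P_SH) the second to last, as in the example output from the thread.

```python
def rank_predictions(lines):
    """Return (sequence_id, mirna_name, log_p_sh) tuples, lowest log(P_SH) first.

    Assumption (not confirmed by the PACMIT README): each data row is
    whitespace-separated, with the miRNA name last and log(P_SH) second to last.
    """
    rows = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '#'-prefixed header line.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        rows.append((fields[0], fields[-1], float(fields[-2])))
    # A lower log(P_SH) means a lower P_SH, i.e. a higher-ranked prediction.
    return sorted(rows, key=lambda row: row[2])


# The two prediction rows shown in the thread, deliberately out of order.
example = """\
#Sequence ID t_access No. miRNA c_access Expected c_access log(P_SH) miRNA name
ENSG00000127720 109 2 1 0.008 -2.080 hsa-miR-212
ENSG00000051596 300 1 2 0.071 -2.625 hsa-miR-186
"""

ranked = rank_predictions(example.splitlines())
for rank, (seq_id, mirna, log_p) in enumerate(ranked, start=1):
    print(rank, seq_id, mirna, log_p)
```

Indexing the fields from the end of each row avoids depending on the header, whose column names themselves contain spaces.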