processor ID: 14850
==========================================================================
|------------------------------------------------------------------|
|                                                                  |
|                *** Running a seeded analysis ***                 |
|                                                                  |
|------------------------------------------------------------------|

command line: /home/grobertson/Fi/Code/20101203/GADEM_v1.3/bin/gadem -fseq /projects/remc_bigdata/Karsan/motifs/20101227/FA/DE_133-KD-402-CTL_535-self-union-TSSrgns.hg18.20101227.fa.3rd_stage_dmasker_w4.fa -fout /projects/remc_bigdata/Karsan/motifs/20101227/both/seed_CSL_m1/a/DE_133-KD-402-CTL_535-self-union-TSSrgns.hg18.20101227.fa.3rd_stage_dmasker_w4.fa.pwmCSL-MEME-m1.mx.fEM0.5.minN100.maxgap10.pgf0.pv0.0002.wt0.de_novo -em 40 -ev 1000 -fEM 0.5 -fpwm0 /projects/remc_bigdata/Karsan/motifs/PWMs/CSL-MEME-m1.mx -minN 100 -pv 0.0002 -posWt 0 -extTrim 1 -pgf 0 -fbm /home/grobertson/Fi/Code/20101203/GADEM_v1.3/KmerFreq/hg18_kmer_1to9_freq.txt -bOrder 3 -verbose 1

maximal buffer length: 15000
maximal number of sequences set: 44000
maximal number of bases per seq read: 20000
maximal number of sites in a motif: 150000

input (ChIP) sequence file: /projects/remc_bigdata/Karsan/motifs/20101227/FA/DE_133-KD-402-CTL_535-self-union-TSSrgns.hg18.20101227.fa.3rd_stage_dmasker_w4.fa
number of sequences in input file: 535
average sequence length: 3087
total number of nucleotides: 1652056
max number of generations: 1
population size: 10
use a user-specified pwm as the seed /projects/remc_bigdata/Karsan/motifs/PWMs/CSL-MEME-m1.mx
fraction (number) input sequences subject to EM 1.00 (535)
scale factor for converting (double)pwm to (int)pwm 200
number of EM steps: 40
EM convergence criterion: 1.000000e-04
run EM on the starting pwm /projects/remc_bigdata/Karsan/motifs/PWMs/CSL-MEME-m1.mx 10 times, each with a different maxp:
0.10*numSeq 0.20*numSeq 0.30*numSeq 0.40*numSeq 0.50*numSeq 0.60*numSeq 0.70*numSeq 0.80*numSeq 0.90*numSeq 1.00*numSeq
no spaced dyads are generated and used.
pop=10 gen=1 (no GA).
motif prior probability type (see documentation): 0
pwm score p-value cutoff for declaring binding site: 2.000000e-04
Approximate the null llr log{p(s|M)/p(s|B)} score distribution using the llr scores of random/background sequences, where M is the EM-derived motif model and B is the 3-th order Markov backgroun model. The background sequences are simulated using the [a,c,g,t] frequencies in the input data.
The number sets of background sequences generated: 10
pseudo count: 0.0005
minimal infomation for trimming/extending: 0.40 0.50 0.60
minimal no. sites for each motif: 100
base extension and trimming? yes
sliding window for comparing pwm similarity: 6
PWM similarity cutoff: 0.300
log(E-value) cutoff: 1000.00
number of adjacent bases included in binding site output: 10
job started: Mon Dec 27 21:13:00 2010
=========================================================================

GADEM cycle[ 1] generation[ 1] number of unique motif(s): 1
   spacedDyad: yGTGGGAA motifConsensus: TTTGGGAG 0.10 fitness: 324.17

GADEM cycle[ 2] generation[ 1] number of unique motif(s): 1
   spacedDyad: yGTGGGAA motifConsensus: TTTTwAAA 0.10 fitness: 59.15

GADEM cycle[ 3] generation[ 1] number of unique motif(s): 1
   spacedDyad: yGTGGGAA motifConsensus: TTTGAAAA 0.90 fitness: 656.62

GADEM cycle[ 4] generation[ 1] number of unique motif(s): 1
   spacedDyad: yGTGGGAA motifConsensus: TTTrkGAA 0.50 fitness: 734.41

GADEM cycle[ 5] generation[ 1] number of unique motif(s): 1
   spacedDyad: yGTGGGAA motifConsensus: AAAkAAAA 0.70 fitness: 51.49

GADEM cycle[ 6] generation[ 1] number of unique motif(s): 1
   spacedDyad: yGTGGGAA motifConsensus: AATGAAAA 0.10 fitness: 740.30

GADEM cycle[ 7] generation[ 1] number of unique motif(s): 1
   spacedDyad: yGTGGGAA motifConsensus: AATryAAA 0.80 fitness: 479.27

GADEM cycle[ 8] generation[ 1] number of unique motif(s): 1
   spacedDyad: yGTGGGAA motifConsensus: ArkrGAAA 0.70 fitness: 416.33

GADEM cycle[ 9] generation[ 1] number of unique motif(s): 2
   spacedDyad: yGTGGGAA motifConsensus: GGwGGGAG 0.50 fitness: 357.94
   spacedDyad: yGTGGGAA motifConsensus: ArwGAGAA 1.00 fitness: 814.26

GADEM cycle[ 10] generation[ 1] number of unique motif(s): 2
   spacedDyad: yGTGGGAA motifConsensus: TmTGwAAA 0.90 fitness: 709.88
   spacedDyad: yGTGGGAA motifConsensus: TsAGGAAA 0.10 fitness: 876.74

GADEM cycle[ 11] generation[ 1] number of unique motif(s): 1
   spacedDyad: yGTGGGAA motifConsensus: yCwGGAAA 0.90 fitness: 665.73

GADEM cycle[ 12] generation[ 1] number of unique motif(s): 4
   spacedDyad: yGTGGGAA motifConsensus: yATTTkAA 0.90 fitness: 522.93
   spacedDyad: yGTGGGAA motifConsensus: TGkkGAAA 0.20 fitness: 534.49
   spacedDyad: yGTGGGAA motifConsensus: CAGGAGAA 1.00 fitness: 807.45
   spacedDyad: yGTGGGAA motifConsensus: TrTkkGAA 0.50 fitness: 845.90

GADEM cycle[ 13] generation[ 1] number of unique motif(s): 1
   spacedDyad: yGTGGGAA motifConsensus: mwTrGGAA 0.70 fitness: 724.39

GADEM cycle[ 14] generation[ 1] number of unique motif(s): 1
   spacedDyad: yGTGGGAA motifConsensus: TTTwTAAT 0.30 fitness: 695.66

finished: Mon Dec 27 22:14:15 2010
approximated processor time in seconds: 3675.000000