An ANSI-C program finding sub-cloning strategies, in-frame deletions and frameshifts using restriction enzymes and DNA polymerases.
gcc -oCloneIt CloneIt.c
gcc -DNOT_UNIX -oCloneIt CloneIt.c
-D__GNUC__
to the command line
CloneIt
> DNA sequence pbs rf2 5895 b.p. complete sequence TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGAT GCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCTGGCTTAACTATGCGGCATCAGA GCAGATTGTACTGAGAGTGCACCATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGAAAT TGTAAACGTTAATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAATCG GCAAAATCCCTTATAAATCAAAAGAATAGACCGAGATAGGGTTGAGTGTTGTTCCAGTTTGGAACAAGAGTCCACTATTA AAGAACGTGGACTCCAACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACCATCACCCTAATC AAGTTTTTTGGGGTCGAGGTGCCGTAAAGCACTAAATCGGAACCCTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAA AGCCGGCGAACGTGGCGAGAAAGGAAGGGAAGAAAGCGAAAGGAGCGGGCGCTAGGGCGCTGGCAAGTGTAGCGGTCACG (...) CTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAA GTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACTCACTATAGGGC GAATTCGAGCTCGGTACCCGGGGCTATTAAAGGTTCAATGGCGTACAGGAAACGTGGAGCGCGCCGTGAGGCGAATATAA ATAATAATGACCGAATGCAAGAGAAAGATGACGAGAAACAAGATCAAAACAATAGAATGCAGTTGTCTGATAAAGTACTT TCAAAGAAAGAGGAAGTCGTAACCGACAGTCAAGAAGAAATTAAAATTGCTGATGAAGTGAAGAAATCGACGAAAGAAGA ATCTAAACAATTGCTTGAAGTTTTGAAAACAAAAGAAGAGCACCAAAAAGAGATACAATATGAAATTTTGCAAAAAACGA TACCAACATTTGAACCAAAAGAGTCAATATTGAAAAAATTGGAGGATATCAAACCGGAACAAGCGAAGAAGCAGACTAAG CTATTTAGAATATTTGAACCGAGACAGCTACCAATTTATAGAGCGAATGGTGAAAAAGAGTTGCGTAACAGATGGTATTG GAAGCTGAAGAAAGATACTTTACCAGATGGAGATTATGATGTTAGAGAATACTTTCTAAATTTGTATGATCAGGTTCTTA CTGAAATGCCAGATTATTTACTATTAAAAGATATGGCAGTTGAAAATAAAAATTCGAGAGATGCCGGTAAAGTTGTTGAT TCTGAAACAGCAAGTATCTGTGATGCTATATTTCAAGATGAGGAAACAGAAGGTGCAGTGAGAC |
Degenerate
template are not allowed. Numbers will be
discarded. The DNA sequence is a circular plasmid sequence. The program
uses a short database of classic oligonucleotide sequences
to try to locate
the insert boundaries.
Translated sequences are supposed to be oriented from NH2 to COOH. Input is case sensitive.
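The input rules above can be sketched in a few lines. This is only an illustration under stated assumptions, not CloneIt's own parser: the function name `clean_sequence` is hypothetical, a first line starting with `>` is taken to be the title, and letter case is left untouched.

```python
def clean_sequence(text):
    """Return (title, sequence) from a '>'-titled sequence text.

    Digits and whitespace are discarded; any other character outside
    A, C, G, T (in either case) is treated as degenerate and rejected.
    """
    lines = text.strip().splitlines()
    title = ""
    if lines and lines[0].startswith(">"):
        title = lines[0][1:].strip()
        lines = lines[1:]
    bases = []
    for ch in "".join(lines):
        if ch.isdigit() or ch.isspace():
            continue  # numbers and blanks are discarded
        if ch not in "ACGTacgt":
            raise ValueError(f"degenerate or invalid base: {ch!r}")
        bases.append(ch)
    return title, "".join(bases)


title, seq = clean_sequence("> demo\n1 TCGCGCGTTT 10\n11 cggtgatgac 20\n")
print(title, seq)
```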
--------------------------------------------------------------------------------
Input INSERT sequence name :
Opening INSERT.
4737 bp in 'p5_1'.
Your sequence seems to be a "T7/T3" type plasmid.
The cloning boxes may be localized between 880 and 2476.
The cloning box(es) boundaries and the ATG position have been searched, please check the displayed datas.
--------------------------------------------------------------------------------
SACI SALI ECORI : hincii styi acsi : acci ECO52I NCOI alwi : ECL136II eaei bstdsi xhoii : bsp1286i bsh1285i MSCI xmni BAMHI : banii : NOTI eaei ecorv : vspi alwi : alw21i : HINDIII: : : : : : :: : : : : :: TGGCCATGGATATCGGAATTAATTCGGATCCGAATTCGAGCTCCGTCGACAAGCTTGCGGCCGCA \ 5266 ¥ ¥ ¥ ¥ ¥ ¥ ¥ \ ACCGGTACCTATAGCCTTAATTAAGCCTAGGCTTAAGCTCGAGGCAGCTGTTCGAACGCCGGCGT \ 5330 T P T :I S E :L I R::I R: I R: A P :S T :S L :R P H -> :A :M D: I G :I: N S D: P N S S S V: D K: L A::A A L -> : H: G Y R N: * F G :S E :F E :L R R Q A C G: R T -> : P: V $ L R: L * A :$ A :* A :R P L Q E F A: P T <- :G :T G: I A :K: I L G: L S L S S A: A T: R V::G A H <- $ R Y :R Y G :* N L::R P: K L: E L :C S :N S :R R R <- : : : : : :: : : : : :: 5266 5274 : 5283 5291 : 5303 : 5316 :: 5266 5281 5291 : 5303 : 5322 5269 5291 : 5303 : 5323 5269 5292 : 5303 : 5323 5269 5297 5310 5323 5297 5310 5303 5310 |
CloneIt has found an in-frame deletion:
Digest INSERT with Bcl I [t^gatca] (1427) and PstI [CTGCA^G] (3611).
5' --TG.TAT./GAT.C AG.GTT.CTT.ACT.G-- --CG.ACC. TGC.A/GG.CAT.GCA.AGC.T-- 3'
3' --AC.ATA. CTA.G/TC.CAA.GAA.TGA.C-- --GC.TGG./ACG.T CC.GTA.CGT.TCG.A-- 5'
NH2 Y D Q V L T E -- -- T C R H A S F .COOH
Treat with T4 DNA polymerase.
Cloning boxes boundaries :[880-1155] [3359-3634].
Original: 5' ================================================ 3'
Deletion: 5' =========......................................= 3'
Equivalent to a deletion of 728 amino acids [79 %]
Digestion post-ligation: BamHI Sal I Acc I Nsi I .
The first stop codon detected AFTER the PstI site (3611) is localized at position 3639 on insert.
Translated truncated sequence:...
NSSSVPGAIKGSMAYRKRGARREANINNNDRMQEKDDEKQDQNNRMQLSDKVLSKKEEVV
TDSQEEIKIADEVKKSTKEESKQLEVLKTKEEHQKEIQYEILQKTIPTFEPKESILKKLE
DIKPEQAKKQTKLFRIFEPRQLPIYRANGEKELRNRTYTKLKKDTLPGDYDVREYFLNLY
DR----------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
------------------------------------------------------------
--HASFCS...
(N)ext (P)revious (D)iscard (S)ave sequence (A)bort (P)attern (I)nformations Te(X)t (H)TML
CloneIt has found a Carboxy terminal deletion:
Those sites that do not necesseraly create in frame deletion but they can be used to make Carboxy-terminal deletions.
Digest INSERT with AccI [GT^MKAC] (1445).
5' --.ATC.AGT.//AT A.CAC.ATA.AAT.GAT-- 3'
3' --.TAG.TCA. TA//T.GTG.TAT.TTA.CTA-- 5'
NH2 I S I H I N D .COOH
Polymerase treatment may be needed in function of the second enzyme.
BEWARE: Check if there is a STOP CODON after ligation.
Cloning box boundaries :[641-809] [2161-2329].
Original: 5' ================================================== 3'
Deletion: 5' =======================........................... 3'
Equivalent to a deletion of about 294 amino acids [52 %]
´ Second Enzyme(s) that do not need polymerase modification: AccI (1445) .
´ Second Enzyme(s) that need polymerase modification: AatII (2082) .
(N)ext (P)revious (D)iscard (A)bort (P)attern (I)nformations Te(X)t (H)TML
CloneIt has found an Enzyme that could induce frameshift.
Digest INSERT with AccI [gt^mkac] (1659).
5' --.ATC.AGT.//AT A.CAC.ATA.AAT.GAT-- 3'
3' --.TAG.TCA. TA//T.GTG.TAT.TTA.CTA-- 5'
NH2 I S I H I N D .COOH
Treat with Klenow DNA polymerase.
Beware : AccI [1 partial site] .
After digestion, fill-in and ligation:
5' .ATC.AGT.ATA.TAC.ACA.TAA.ATG 3'
3' .TAG.TCA.TAT.ATG.TGT.ATT.TAC 5'
NH2 I S I Y T * M COOH
• FrameShift (+ 2)
• 2 bases Added.
• Site is NOT reconstitued after ligation.
• [51 %] percentage of Insert.
FIGURE: ========================= | (+2) ===========================
Translated sequence:...EFELGTRGSMATFKDACYHYKRLNKLNSLVLKLGANDETRPAPMTKYKGTCLDCCQY
TNLTYCRGCALYHVCQTCSQYNRCFLDEEPHLLRMRTFKDVVTKEDIEGLLTMYETLFPINEKLVNKFINSVKQRKCRNE
YLLETYNHLLMPITLQALTINLEDNVYYIFGYYDCMEHENQTPFQFINLLEKYDKLLLDDRNFHRMSHLPVILQQEYALR
YFSKSRFLSKGKKRLSRSDFSDNLMEDRHSPTSLMQVVRNCISiyT*...
(N)ext (P)revious (D)iscard (S)ave sequence (A)bort (P)attern (I)nformations Te(X)t (H)TML
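The frameshift reported above can be reproduced by hand. The sketch below is an illustration, not CloneIt's code: it takes the 21 bases shown in the display, cuts the GT^ATAC site, duplicates the 2-base overhang as a fill-in followed by ligation would, and translates both versions with the standard genetic code. The function names `translate` and `fill_in_and_ligate` are hypothetical, and the site is assumed to be palindromic with a 5' overhang.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def translate(dna):
    """Translate the complete codons of an upper-case DNA string."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        a, b, c = (BASES.index(x) for x in dna[i:i + 3])
        protein.append(AMINO[16 * a + 4 * b + c])
    return "".join(protein)


def fill_in_and_ligate(dna, site, cut):
    """Cut a palindromic site leaving a 5' overhang, fill in, then religate.

    `cut` is the top-strand cut offset inside `site` (2 for GT^ATAC).
    The overhang bases end up duplicated in the ligated product.
    """
    pos = dna.index(site)
    bottom = len(site) - cut  # bottom-strand cut of a palindromic site
    return dna[:pos + bottom] + dna[pos + cut:]


original = "ATCAGTATACACATAAATGAT"
mutant = fill_in_and_ligate(original, "GTATAC", 2)
print(translate(original))          # ISIHIND
print(translate(mutant))            # ISIYT*M
print(len(mutant) - len(original))  # 2
```

The two extra bases shift the reading frame by +2, which brings the stop codon TAA into frame two codons after the junction.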
<U>2
<B>75/0/100/50
<T>65
<1>Tth111I
<2>
<3>Thermus thermophilus strain 111
<4>T. Oshima
<5>GACN^NNGTC
<U>1
<B>100/25/25/100
<T>65
<6>
<7>ADIKNQR
<8>313,1132
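A record like the one above can be split into its tagged fields with one regular expression. This is a minimal sketch, not the reader used by CloneIt: `parse_rebase_record` is a hypothetical name, and the extra `<U>`, `<B>` and `<T>` values are kept as plain strings without interpreting them.

```python
import re


def parse_rebase_record(text):
    """Map each <tag> of one record to the (possibly empty) text after it."""
    fields = {}
    for tag, value in re.findall(r"<([^>]+)>([^<]*)", text):
        fields[tag] = value.strip()
    return fields


record = ("<1>Tth111I <2> <3>Thermus thermophilus strain 111 <4>T. Oshima "
          "<5>GACN^NNGTC <U>1 <B>100/25/25/100 <T>65 <6> <7>ADIKNQR <8>313,1132")
fields = parse_rebase_record(record)
print(fields["1"], fields["5"], fields["B"].split("/"))
```

Because the pattern only looks at the tags, the same function accepts a record written on one line or with one field per line.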
XmaIII and Eco52I are identic [C^GGCCG]: Keep Eco52I and discard the other.
XmaCI and XmaI are identic [C^CCGGG]: Keep XmaI and discard the other.
Zsp2I and NsiI are identic [ATGCA^T]: Keep NsiI and discard the other.
(...)
Discard AccII because it is too short [CG^CG].
Discard AciI because it is too short [CCGC(-3/-1)].
Discard AciI because it is too short [CCGC(-3/-1)].
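The kind of filtering shown in this log can be sketched as follows. The details are assumptions made for illustration: the name `filter_enzymes` is hypothetical, the minimum site length of 5 is a guess (the log only shows 4-base sites being discarded), and the sketch simply keeps the first enzyme listed for each site, which may not be how CloneIt chooses.

```python
def filter_enzymes(enzymes, min_length=5):
    """Keep one enzyme per distinct site and drop sites that are too short.

    `enzymes` is a list of (name, site) pairs in order of preference.
    Returns (kept, messages).
    """
    kept, seen, messages = [], {}, []
    for name, site in enzymes:
        bare = site.replace("^", "")
        if len(bare) < min_length:
            messages.append(f"Discard {name} because it is too short [{site}].")
        elif site in seen:
            messages.append(f"{name} and {seen[site]} are identical [{site}]: keep {seen[site]}.")
        else:
            seen[site] = name
            kept.append((name, site))
    return kept, messages


kept, messages = filter_enzymes([
    ("Eco52I", "C^GGCCG"), ("XmaIII", "C^GGCCG"),
    ("NsiI", "ATGCA^T"), ("Zsp2I", "ATGCA^T"),
    ("AccII", "CG^CG"),
])
print(kept)
print("\n".join(messages))
```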
A file is opened by default at startup.
(UNIX only) The path to this file can be declared
in the system shell (ask your UNIX system administrator to set it up for you)
through the CLONEHOME environment variable. This was suggested
by Dr. Gary Williams of the
UK MRC Human Genome Mapping Project Resource Centre.
For example, in your shell you would write:
setenv CLONEHOME /home/virim/lindenb/ANSI/CloneIt/allenz
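For readers scripting around CloneIt, the lookup can be mimicked in a couple of lines. This is a sketch of the idea only: `rebase_path` is a hypothetical helper, and the fallback name `allenz` is an assumption borrowed from the examples, not a documented default.

```python
import os


def rebase_path(default="allenz"):
    """Return the enzyme file path: $CLONEHOME if it is set, else `default`."""
    return os.environ.get("CLONEHOME", default)


print(rebase_path())
```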
- Eco gaattc g^aattc
- co G^A (1/ (
- $6 (site length = 6)
- _4 (all the overhangs of 4 bases)
- _B (all the blunt sites)
- $5_B (site length = 5 and blunt)
- _+ (all the sites with 3' overhangs)
- _- (all the sites with 5' overhangs)
- _-4 (all the 5' overhangs of 4 bases)
- _-4|gaa (all the 5' overhangs of 4 bases containing 'gaa')
- $6_-4|gaa (site length = 6, 5' overhangs of 4 bases, containing 'gaa')
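The `$`, `_` and `|` selectors in this list can be read as a small filter language. The sketch below is one possible interpretation, not CloneIt's matcher: it assumes palindromic sites written with `^`, takes `_+` to select 3' overhangs and `_-` to select 5' overhangs as the list defines them, and does not cover searches by enzyme name. The names `PATTERN`, `overhang` and `matches` are hypothetical.

```python
import re

# $<length>  _[+|-][B|<overhang size>]  |<text contained in the site>
PATTERN = re.compile(r"^(?:\$(\d+))?(?:_([+-])?(B|\d+)?)?(?:\|(\w+))?$")


def overhang(site):
    """Signed overhang of a palindromic site written with '^'.

    Negative: 5' overhang, positive: 3' overhang, zero: blunt.
    """
    bare = site.replace("^", "")
    return 2 * site.index("^") - len(bare)


def matches(pattern, site):
    """Tell whether `site` satisfies every selector present in `pattern`."""
    m = PATTERN.match(pattern)
    if m is None:
        raise ValueError(f"unsupported pattern: {pattern!r}")
    length, sign, size, text = m.groups()
    bare = site.replace("^", "").lower()
    ov = overhang(site)
    if length is not None and len(bare) != int(length):
        return False
    if size == "B" and ov != 0:
        return False
    if size not in (None, "B") and abs(ov) != int(size):
        return False
    if sign == "+" and ov <= 0:
        return False
    if sign == "-" and ov >= 0:
        return False
    if text is not None and text.lower() not in bare:
        return False
    return True


print(matches("$6_-4|gaa", "G^AATTC"))  # True: EcoRI, 6 bases, 5' overhang of 4
print(matches("_B", "CCC^GGG"))         # True: SmaI is blunt
print(matches("_-4", "CTGCA^G"))        # False: PstI leaves a 3' overhang
```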
DNA Strider Relibrary | REBASE File |
---|---|
Aat II, gacgt/c, | <1>AatII <5>GACGT^C <7>? |
REBASE File | DNA Strider Relibrary |
---|---|
<1>AatII <2> <3>Acetobacter aceti <4>IFO 3281 <5>GACGT^C <6> <7>ADEFKLMNOPRS <8>1168,1364 <1>Bsp50I <2>FnuDII <3>Bacillus species RFL50 <4>A.A. Janulaitis <5>CG^CG <6> <7> <8>215,441,463 | Aat II, gacgt/c, not available |
..G -----\ ..GAATT
..CTTAA -----/ ..CTTAA
T4 DNA Polymerase: 5'->3' pol and 3'->5' exo.
..CTGCA -----\ ..C
..C -----/ ..C
..GGGT CGGA.. --\ ..GGGTCGGA --\ ..GGGTCGGA ..CCCAG T.. --/ ..CCCAG T --/ ..CCCAGggT
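The effect of the two polymerase activities on a cut site can be summarised in a short function. This is a sketch under assumptions, not CloneIt's implementation: it only handles palindromic sites written with `^`, and the names `blunt_ends` and `net_change` are hypothetical. A 5' overhang is filled in by the 5'->3' polymerase activity and a 3' overhang is removed by the 3'->5' exonuclease activity.

```python
def blunt_ends(site):
    """Return (left_end, right_end) of a cut palindromic site after blunting.

    Both ends are given as top-strand sequence read 5' to 3'.
    """
    bare = site.replace("^", "")
    cut = site.index("^")     # top-strand cut position
    bottom = len(bare) - cut  # bottom-strand cut, in top-strand coordinates
    return bare[:bottom], bare[cut:]


def net_change(site):
    """Bases gained (+) or lost (-) if the two blunted ends are religated."""
    left, right = blunt_ends(site)
    return len(left) + len(right) - len(site.replace("^", ""))


for site in ("G^AATTC", "CTGCA^G", "CCC^GGG"):
    print(site, blunt_ends(site), net_change(site))
```

A net change that is not a multiple of 3 shifts the reading frame, which is the property used when looking for frameshifts.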
Enzyme | Site | VECTOR IN | VECTOR OUT | INSERT IN | INSERT OUT | Mark |
---|---|---|---|---|---|---|
Van91I | CCANNNN^NTGG | 0 | 0 | 0 | 0 | |
VspI | AT^TAAT | 1 | 0 | 2 | 3 | |
XbaI | T^CTAGA | 0 | 0 | 1 | 0 | |
XcmI | CCANNNNN^NNNNT | 1 | 2 | 0 | 0 | |
XhoI | C^TCGAG | 0 | 0 | 0 | 0 | |
XhoII | R^GATCY | 0 | 0 | 2 | 6 | |
XmaI | C^CCGGG | 1 | 0 | 1 | 0 | |
XmnI | GAANN^NNTTC | 0 | 2 | 1 | 1 | * |

•: Enzyme useful for digestion post-ligation.
*: Enzyme useful to determine the INSERT direction.
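Counts of this kind can be computed by locating every occurrence of a site and comparing its position with the cloning box boundaries. The sketch below rests on an assumption about the table: IN is read as "inside the cloning box" and OUT as "outside it". It also only searches the top strand, which is enough for palindromic sites. The names `site_positions` and `count_in_out` are hypothetical.

```python
import re

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
         "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
         "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]"}


def site_positions(sequence, site):
    """1-based start positions of a recognition site on a circular sequence."""
    bare = site.replace("^", "").upper()
    # The lookahead lets overlapping occurrences be reported as well.
    regex = re.compile("(?=(" + "".join(IUPAC[b] for b in bare) + "))")
    seq = sequence.upper()
    circular = seq + seq[:len(bare) - 1]  # let matches wrap past the origin
    return [m.start() + 1 for m in regex.finditer(circular)]


def count_in_out(sequence, site, left, right):
    """Return (IN, OUT): sites starting inside or outside [left, right]."""
    positions = site_positions(sequence, site)
    inside = sum(left <= p <= right for p in positions)
    return inside, len(positions) - inside


demo = "GAATTCAAAGAATTCTTTT"
print(site_positions(demo, "G^AATTC"))       # [1, 10]
print(count_in_out(demo, "G^AATTC", 5, 15))  # (1, 1)
```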
..G p-AATTC.. --\ ..G AATTC.. ..CAATT-p G.. --/ ..CAATT G..
CloneIt V1.0 has found a solution:
Digest VECTOR with EcoRI [G^AATTC] (878) and Sal I [g^tcgac] (894).
5' --G.CCG.G/AA.TT C.CCG.GGG.ATC.CG/T.CGA. CCT.GCA.GCC.AAG-- 3'
3' --C.GGC.C TT.AA/G.GGC.CCC.TAG.GC A.GCT./GGA.CGT.CGG.TTC-- 5'
NH2 P E F P G I R R P A A K .COOH
Digest first with EcoRI .Then treat with Klenow DNA polymerase.Finally digest with Sal I .
Digest INSERT with Sca I [AGT^ACT] (1034) and Sal I [g^tcgac] (3605).
5' --AT.AAA.GT/ A.CTT.TCA.AAG.AAA.G-- --TA.GAG./TCG.A CC.TGC.AGG.CAT.G-- 3'
3' --TA.TTT.CA/ T.GAA.AGT.TTC.TTT.C-- --AT.CTC. AGC.T/GG.ACG.TCC.GTA.C-- 3'
NH2 K V L S K K E -- -- E S T C R H A .COOH
Treat with T4 DNA polymerase.
Sites wil be in frame ligated in 5'.
The first stop codon detected BEFORE the EcoRI site (878) is localized at position 428 on vector.
The first stop codon detected AFTER the Sca I site (1034) is localized at position 3558 on insert.
Digestion post-ligation: no enzyme was found.
(N)ext (P)revious (D)iscard (N)ext solution (S)ave sequence (A)bort Te(X)t (H)TML (P)attern (I)nformations >>
INSERT Pattern
--------------
SmaI [CCC^GGG] (897) and Bst1107I [GTA^TAC] (1659) p5_1 digestion.
1 3890 pb Bst1107I 1744 - SmaI 897
2 762 pb SmaI 897 - Bst1107I 1659
3 85 pb Bst1107I 1659 - Bst1107I 1744
***************************************************
* Beware this strategy needs partial digestion(s) *
***************************************************
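The fragment sizes in this pattern follow directly from the cut positions on the circular 4737 bp sequence. A minimal sketch of the arithmetic, with the hypothetical helper name `circular_fragments`:

```python
def circular_fragments(length, cuts):
    """Fragment sizes after cutting a circular molecule at the given positions."""
    cuts = sorted(set(cuts))
    if not cuts:
        return [length]  # an uncut circle stays in one piece
    sizes = []
    for i, start in enumerate(cuts):
        end = cuts[(i + 1) % len(cuts)]
        # The modulo handles the fragment that spans the origin; a single
        # cut gives 0 here, which stands for the whole linearised molecule.
        sizes.append((end - start) % length or length)
    return sorted(sizes, reverse=True)


print(circular_fragments(4737, [897, 1659, 1744]))  # [3890, 762, 85]
print(circular_fragments(4737, [1659, 1744]))       # [4652, 85]
```

The second call corresponds to a digestion with Bst1107I alone, which cuts at 1659 and 1744.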
Get information about:
-----------------------
1 SmaI [CCC^GGG]
2 Bst1107I [GTA^TAC]
Your choice ? /*1 < 1 < 2*/:2
--------------------------------------------------------------------------------
Information about Bst1107I [GTA^TAC].
Internet WWW link: Bst1107 I
................................................................................
INSERT Pattern
--------------
Bst1107I [GTA^TAC] (1659) p5_1 digestion.
1 4652 pb Bst1107I 1744 - Bst1107I 1659
2 85 pb Bst1107I 1659 - Bst1107I 1744
................................................................................
Prototype :SnaI
Microorganism :Bacillus stearothermophilus RFL1107
Source :A.A. Janulaitis
Methylation site :
Commercial availability :
Angewandte Gentechnologie Systeme
Fermentas AB
Takara Shuzo Co. Ltd.
Boehringer-Mannheim
New England Biolabs
Refs :457
................................................................................
Looking for Isoschizomers. (*):Commercialy available)
( ) BspM90I GTA^TAC.
(*) BssNAI GTA^TAC. also available at: SibEnzyme Ltd.
( ) BstBSI GTA^TAC.
(*) BstZ17I GTA^TAC. also available at: New England BioLabs
( ) XcaI GTA^TAC.
--------------------------------------------------------------------------------
GCGCOREROOT
defined) CloneIt can use the
Gap and the BestFit programs
from the GCG Wisconsin Package to align the two DNA
(or translated) sequences.
Option | Description | Example or default |
---|---|---|
-HELP | help | |
-U | run a project | -UMy_Project |
-VE | load vector sequence | -VEpGBT9 |
-VL | vector cloning box left boundary | -VL855 |
-VR | vector cloning box right boundary | -VR900 |
-VA | vector ATG position | -VA644 |
-VM | vector restriction map | |
-IN | load insert sequence | -INpbs_myGene |
-IL | insert cloning box left boundary | -IL154 |
-IR | insert cloning box right boundary | -IR2001 |
-ILi | insert cloning box left internal boundary | -ILi160 |
-IRi | insert cloning box right internal boundary | -IRi1980 |
-IA | insert ATG position | -IA155 |
-IM | insert restriction map | |
-R | open a REBASE file | -Rallenz |
-Y | don't use memory optimization | default: FALSE |
-E | allow non-overlapping overhangs | default: FALSE |
-P | don't allow use of Klenow or T4 DNA Pol | default: allow them |
-T | don't allow partial digestions | default: allow it |
-Z | don't allow C.I.P. | default: allow C.I.P. |
-BU | don't allow incompatible buffers | default: allow incompatible buffers |
-TP | don't allow incompatible Temperatures | default: allow incompatible Temperatures |
-N | intersections | |
-GN | Gap (© GCG Wisconsin package) with DNA sequences (UNIX only) | |
-GP | Gap (© GCG Wisconsin package) with translated sequences (UNIX only) | |
-BN | BestFit (© GCG Wisconsin package) with DNA sequences (UNIX only) | |
-BP | BestFit (© GCG Wisconsin package) with translated sequences (UNIX only) | |

Cloning an INSERT into a VECTOR. CloneIt will try to find the best cloning strategy. It will stop when cloning conditions will be found.

Option | Description | Example or default |
---|---|---|
-A | find Cloning strategies | |
-H | clone in frame at 5' of insert (NH2) | default: FALSE |
-O | clone in frame at 3' of insert (COOH) | default: FALSE |

Finding in-frame deletions and frameshifts. CloneIt will try to find a maximum of cloning strategies.

Option | Description | Example or default |
---|---|---|
-F | find frameshifts | |
-D | find in-frame deletions | |
-M | Maximum percentage of insert | -M80 |
-m | minimum percentage of insert | -m20 |
-P | don't allow use of Klenow or T4 DNA Pol | default: allow them |
-T | don't allow partial digestions | default: allow it |
-VEpGBT9 -INpbs_RF2 -Rallenz -A -H
-VEpGBT9 -INpbs_RF2 -Rallenz -A -H -IA430 -VL500 -VR523 -Y -Z -T
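In these command lines the value is glued to its option (`-VEpGBT9`, `-IA430`), so a reader has to match the longest known option prefix first. The sketch below illustrates that idea and is not CloneIt's own argument handling: the split between options that take a value and options that do not is inferred from the examples given in the option list, and the names `VALUE_FLAGS`, `BOOL_FLAGS` and `parse_args` are hypothetical.

```python
# Options assumed to carry a value (they are shown with an example value).
VALUE_FLAGS = sorted(
    ["-U", "-VE", "-VL", "-VR", "-VA", "-IN", "-IL", "-IR",
     "-ILi", "-IRi", "-IA", "-R", "-M", "-m"],
    key=len, reverse=True)  # longest first, so -ILi is tried before -IL

# Options assumed to be simple switches.
BOOL_FLAGS = {"-HELP", "-VM", "-IM", "-Y", "-E", "-P", "-T", "-Z", "-BU",
              "-TP", "-N", "-GN", "-GP", "-BN", "-BP", "-A", "-H", "-O",
              "-F", "-D"}


def parse_args(argv):
    """Split CloneIt-style arguments into a {flag: value or True} dict."""
    options = {}
    for token in argv:
        if token in BOOL_FLAGS:
            options[token] = True
            continue
        for flag in VALUE_FLAGS:
            if token.startswith(flag) and len(token) > len(flag):
                options[flag] = token[len(flag):]
                break
        else:
            raise ValueError(f"unknown option: {token!r}")
    return options


line = "-VEpGBT9 -INpbs_RF2 -Rallenz -A -H -IA430 -VL500 -VR523 -Y -Z -T"
print(parse_args(line.split()))
```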
ask.vector
set.nh2.true
set.partial.false
set.pol.true
set.memory.true
open.rebase allenz
open.insert pbs_RF2
set.vector.atg 483
cloneit
open.insert pGBT9_NSP1
cloneit
open.insert pGBT9_NSP2
cloneit
open.insert pGBT9_NSP3
cloneit
open.insert pGBT9_NSP5
cloneit
I would like to thank Audrey Nepveu-de-Villemarceau, Janine L., C. Caron, Dr Christian Marck, Dr S. Hazout and his team, Philippe Bessieres, Christine Young, Mlle Derat, Dr Suzana Lopez (my manager! ;-), Maria Piron, Emmanuelle FORLOT (I'll get back to drawing tomorrow...), Dr. Gary Williams, and Chris Boyd at MRC Human Genetics Unit for their help and their suggestions.
Laboratoire de Biologie Moleculaire
des Rotavirus.
Virologie et Immunologie Moleculaires.
Institut National de la Recherche Agronomique.
78350
Jouy-en-Josas Cedex FRANCE.