Calculate how frequently a given restriction site would be expected to occur in a stretch of DNA of unknown sequence. The answer will be in the form ‘One site every n base-pairs’.
Assume that each base — A, T, G or C — occurs equally frequently in the DNA.
If the probability of finding a restriction site at any position in DNA is ‘1 in x’, one can expect to find that site once every x base-pairs. This (overall) probability may be calculated by multiplying the probabilities of occurrence of the base at each individual position in the site. In this simple case of a site composed entirely of A, T, G and C, the occurrence is once every 4^n base-pairs (4 raised to the power n), where n is the length of the restriction site. The following is a specific example:
Take the case of the frequency of occurrence of the dinucleotide AT.
If one looks at the first base in an unknown stretch of DNA, there is a 1-in-4 chance it will be A (because it is equally likely to be T, G or C).
If one then looks at the second base, the chance that it will be the required T will again be 1-in-4.
The overall chance of finding AT is obtained by multiplying these two probabilities: 1/4 x 1/4 = 1/16, or 1-in-16.
If you do not see why the individual probabilities are multiplied to give 1-in-16, count the number of different possible dinucleotides: AA, AC, AG, AT, CA, CC, CG, CT, GA, GC, GG, GT, TA, TC, TG, TT. There are 16, all equally likely, and only one of them is AT.
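The counting argument can also be checked with a short Python sketch. It is only an illustration under the same assumption as above (each base equally frequent). The function names `all_sites`, `site_probability` and `expected_spacing` are made up for this example, and the six-base site GAATTC is an illustrative choice that is not part of the original problem.

```python
from fractions import Fraction
from itertools import product

BASES = "ACGT"  # assumption: each base occurs equally frequently


def all_sites(length):
    """List every possible sequence of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]


def expected_spacing(site):
    """Return n, as in 'one site every n base-pairs' (4 to the power of the site length)."""
    return len(BASES) ** len(site)


def site_probability(site):
    """Chance of the site at any one position: 1/4 multiplied once per base."""
    return Fraction(1, expected_spacing(site))


dinucleotides = all_sites(2)
print(dinucleotides)
print(len(dinucleotides), "possible dinucleotides")

for site in ("AT", "GAATTC"):  # GAATTC is a hypothetical longer example
    print(f"{site}: probability {site_probability(site)}, "
          f"one site every {expected_spacing(site)} base-pairs")
```

Listing the sequences and counting them gives the same number as multiplying 4 by itself once per base, which is why the two approaches agree.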