# Topic: Ewens sampling formula

 Lindhard Lecture 15.05.2007   (Site not responding. Last check: 2007-10-21) Ewens has worked in human and evolutionary population genetics and, in recent years, in computational biology. In 1972 he introduced Ewens' Sampling Formula (ESF), a probability distribution on all partitions of the set {1,2,...,n}. The ESF has found wide use in a range of application areas, including, besides biology, physics and mathematics, and it arises naturally in the Chinese Restaurant Problem. www.ctn.au.dk /nyheder/arrangementer/lindhard_15052007   (284 words)

 Ewens's sampling formula - Wikipedia, the free encyclopedia In population genetics, Ewens's sampling formula, introduced by Warren Ewens, states that under certain conditions (specified below), if a random sample of n gametes is taken from a population and classified according to the gene at a particular locus then the probability that there are a ... The assumptions are (1) the sample size n is small by comparison to the size of the whole population, (2) the population is in statistical equilibrium under mutation and genetic drift and the role of selection at the locus in question is negligible, and (3) every mutant allele is novel. This is a probability distribution on the set of all partitions of the integer n. en.wikipedia.org /wiki/Ewens's_sampling_formula   (278 words)
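
 The excerpt above is cut off before the formula itself. As a minimal sketch, assuming the standard statement of the formula, Pr(a_1, ..., a_n; θ) = n! / (θ(θ+1)...(θ+n-1)) × Π_j θ^(a_j) / (j^(a_j) a_j!), where a_j is the number of allele types seen exactly j times in the sample, the probability of a configuration can be computed as follows. The function name `ewens_probability` and the argument layout are illustrative assumptions, not taken from any of the listed sources.

```python
from fractions import Fraction
from math import factorial


def ewens_probability(counts, theta):
    """Probability of an allelic partition under the Ewens sampling formula.

    counts[j-1] is a_j, the number of allele types that appear exactly
    j times in the sample; theta > 0 is the mutation parameter.
    (Hypothetical helper: name and signature are assumptions.)
    """
    # The sample size is implied by the configuration: n = sum_j j * a_j.
    n = sum(j * a for j, a in enumerate(counts, start=1))
    theta = Fraction(theta)

    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1).
    rising = Fraction(1)
    for i in range(n):
        rising *= theta + i

    prob = Fraction(factorial(n)) / rising
    for j, a in enumerate(counts, start=1):
        prob *= theta ** a / (Fraction(j) ** a * factorial(a))
    return prob


# The three partitions of n = 3: three singletons, one singleton plus
# one pair, and a single allele seen three times.
for config in [(3, 0, 0), (1, 1, 0), (0, 0, 1)]:
    print(config, ewens_probability(config, theta=1))
```

 With θ = 1 and n = 3 the three configurations have probabilities 1/6, 1/2 and 1/3, which sum to 1. Exact rational arithmetic is used here only to make that check transparent; floating point would do for larger samples.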

 Nat' Academies Press, Calculating the Secrets of Life: Contributions of the Mathematical Sciences to Molecular Biology ... The Ewens sampling formula is then described, followed by a brief digression into the simulation structure of mutations in the coalescent, both in top-down and bottom-up form. The Ewens sampling formula was derived as a means to analyze allozyme frequency data that became prevalent in the late 1960s. The Coalescent and Mutation The genealogy of a sample of n genes (that is, stretches of DNA sequence) drawn at random from a large population of approximately constant size may be described in terms of independent exponential random variables T_n, T_{n-1}, ... www.nap.edu /books/0309048869/html/114.html   (6585 words)

 JOS Online: Abstract Ewens's sampling formula (Ewens 1972), which is mainly studied in statistical ecology, has been used to assess the microdata disclosure risk. Pitman (1995) considered an extension of the Ewens sampling formula, and in the present article we evaluate the usefulness of the Pitman sampling formula in the disclosure field. These results suggest that the Pitman sampling formula is very promising for the microdata disclosure problem as well as for statistical ecology. www.jos.nu /Articles/abstract.asp?article=174499   (122 words)

 Abstract   (Site not responding. Last check: 2007-10-21) We develop a sampling theory for genes sampled from a population evolving with deterministically varying size. We use a coalescent approach to provide recursions for the probabilities of particular sample configurations, and describe a Monte Carlo method by which the solutions to such recursions can be approximated. The methods are illustrated with data from the mitochondrial control region sampled from a North American Indian tribe. www-hto.usc.edu /papers/abstracts/roysoc5.html   (105 words)

 Wiskundemeisjes » 2007 » februari After that, Ewens worked in mathematical population genetics, of which he was one of the pioneers. He was involved in the development of the TDT (transmission disequilibrium test), something he himself calls very simple but which is in fact very clever. This article, too, became a citation classic. In short: Ewens is a versatile man, and in every field he has ventured into, he has done important work. www.wiskundemeisjes.nl /2007/02   (1530 words)

 [No title]   (Site not responding. Last check: 2007-10-21) Each regenerative composition structure is represented by a process of random sampling of points from an exponential distribution on the positive halfline, and separating the points into clusters by an independent regenerative random set. Examples are composition structures derived from residual allocation models, including one associated with the Ewens sampling formula, and composition structures derived from the zero set of a Brownian motion or Bessel process. We provide characterisation results and formulas relating the distribution of the regenerative composition to the L{\'e}vy parameters of a subordinator whose range is the corresponding regenerative set. math.berkeley.edu /~sheff/abs/pitman.txt   (177 words)

 [No title]   (Site not responding. Last check: 2007-10-21) The molecular variability observed in the sample is due to the effects of mutation. Such sampling distributions can rarely be found explicitly -- the Ewens sampling formula, that appeared in the early 70s as a model for allozyme frequency data, is the lone exception -- but rather they arise as the solution of complicated recursive linear systems. Mutation in the coalescent and sampling formulas 4. www.math.utah.edu /pub/mathbio/fall.lecturers   (994 words)

 Oxford Mathematical Genetics : Importance sampling on coalescent histories. II. Subdivided population ...   (Site not responding. Last check: 2007-10-21) De Iorio and Griffiths (2004) developed a new method of constructing sequential importance-sampling proposal distributions on coalescent histories of a sample of genes for computing the likelihood of a type configuration of genes in the sample by simulation. The method is based on approximating the diffusion-process generator describing the distribution of population gene frequencies, leading to an approximate sample distribution and finally to importance-sampling proposal distributions. An algorithm for computing the likelihood of a sample configuration of genes from a subdivided population in an infinitely-many-alleles model of mutation is derived, extending Ewens's (1972) sampling formula in a single population. www.stats.ox.ac.uk /mathgenbio/b2hd-DeIorioGriffiths2004b.html   (391 words)

 Working Papers LABORatorio R. Revelli The clustering of agents in the market is a typical problem dealt with by the new approaches to macroeconomic modeling, which describe macroscopic variables in terms of the behavior of a large collection of microeconomic entities. This formula can be traced back to Fisher as “species sampling”, and its main use was restricted to genetics for a long time. As the ESF is an equilibrium distribution satisfying detailed balance, some cumbersome properties are derived in a simple way. www.labor-torino.it /english/research/labsim/wild/wp-html/wp-costantini-garibaldi.html   (202 words)

 Probability distribution - Wikipedia, the free encyclopedia For any set of independent random variables the probability density function of the joint distribution is the product of the individual ones. Two or more random variables on the same sample space ... Ewens's sampling formula is a probability distribution on the set of all partitions of an integer n, arising in population genetics. www.wikipedia.org /wiki/Probability_distribution   (1335 words)

 The Poisson-Dirichlet Distribution And Its Relatives Revisited - Holst (ResearchIndex)   (Site not responding. Last check: 2007-10-21) Size-biased sampling and the GEM distribution are considered. The Ewens sampling formula and random permutations, generated by the Chinese restaurant process, are also investigated. The methods used are elementary and based on properties of the finite-dimensional Dirichlet distribution. sherry.ifi.unizh.ch /holst01poissondirichlet.html   (368 words)
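
 The link between the Chinese restaurant process and the Ewens sampling formula mentioned in this entry can be illustrated with a short simulation. This is a minimal sketch under the usual one-parameter description of the process (a new customer opens a new table with probability θ/(θ+i) when i customers are already seated, and otherwise joins an existing table with probability proportional to its size); the function name and the `rng` argument are illustrative assumptions and are not taken from Holst's paper.

```python
import random


def chinese_restaurant_partition(n, theta, rng=None):
    """Seat n customers by the Chinese restaurant process; return table sizes.

    theta > 0 is the concentration (mutation) parameter. The block sizes
    of the resulting partition are distributed according to the Ewens
    sampling formula. (Hypothetical helper: name and signature are
    assumptions.)
    """
    rng = rng or random.Random()
    tables = []  # tables[k] = number of customers at table k
    for i in range(n):  # i customers are already seated
        if rng.random() < theta / (theta + i):
            # Open a new table; this always happens for the first customer.
            tables.append(1)
        else:
            # Join an existing table with probability proportional to its size.
            chosen = rng.choices(range(len(tables)), weights=tables)[0]
            tables[chosen] += 1
    return sorted(tables, reverse=True)


sizes = chinese_restaurant_partition(20, theta=1.5, rng=random.Random(0))
print(sizes, "tables:", len(sizes))
```

 Tallying how many tables have each size over many runs gives an empirical estimate of the configuration probabilities that the formula assigns.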

 Signatures of selection at molecular level in two genes implicated in human familial cancers   (Site not responding. Last check: 2007-10-21) We used a sample of intronic SNPs forming haplotypes, which serve as tools to investigate the genetic diversity and disease association of the target gene. Blood samples were collected from residents of Houston, TX, from four major ethnic groups: Caucasians, African-Americans, Hispanics, and Asians. Ewens Sampling Formula, derived under neutrality and no recombination, provides expected frequencies of haplotypes existing in a given number of copies (Hartl and Clark 1997). www.iscb.org /ismb2004/posters/chrisc1ATrice.edu_494.html   (891 words)

 Tom Kurtz Home Page Stationary solutions and forward equations for controlled and singular martingale problems Gaussian limits associated with the Poisson-Dirichlet distribution and the Ewens sampling formula IEEE Transactions on Signal Processing 49 (2001), 1824 - 1830. www.math.wisc.edu /~kurtz   (348 words)

 Basic Population Genetics [M.Tevfik Dorak] Their formula predicts the expected genotype frequencies using the allele frequencies in a diploid Mendelian population. The haplotype frequency calculated with this formula from the population data compares reasonably well with the estimates obtained directly from counting haplotypes constructed from family segregation data. Genetic distance is a measurement of genetic relatedness of samples of populations (whereas genetic diversity represents diversity within a population). dorakmt.tripod.com /evolution/popgen.html   (5022 words)

 Gaussian Limits Associated with the Poisson-Dirichlet Distribution and the Ewens Sampling Formula (ResearchIndex)   (Site not responding. Last check: 2007-10-21) We prove a variety of Gaussian limit theorems for functions of the population frequencies as the mutation rate $\theta$ goes to infinity. In particular, we show that if a sample of size n is drawn from a population described by the Poisson-Dirichlet distribution, then the conditional probability of a particular... 7 The population structure associated with the Ewens sampling.. citeseer.ist.psu.edu /325064.html   (335 words)

 Publikacijos Random permutations and the Ewens sampling formula in genetics, Proceedings of the 7th Vilnius Conf. Limit processes with independent increments for the Ewens sampling formula, Annals of the Institute of Statistical Mathematics, 2002, 54(3), 607-620 (jointly with G.J.Babu). Infinitely divisible limit processes for the Ewens sampling formula, Lith. www.mif.vu.lt /ttsk/bylos/man/me-publ.htm   (616 words)

 February 25 - Feng   (Site not responding. Last check: 2007-10-21) The Ewens sampling formula describes the distribution of a random sample of size n taken from a selectively neutral haploid population which has evolved toward equilibrium. In this talk a detailed comparison between the Ewens sampling formula and its two-parameter generalization will be made from four different aspects: the urn scheme, the continuous-time construction, the limiting behaviour, and their derivation from different subordinators. [1] Ewens, W. "The sampling theory of selectively neutral alleles," Theoretical Population Biology Vol 3, 87-112. www.math.mcmaster.ca /peter/seminars/seminars9798/sem980225.html   (211 words)

 Gaussian Limits Associated with the Poisson--Dirichlet Distribution and the Ewens Sampling Formula   (Site not responding. Last check: 2007-10-21) We prove a variety of Gaussian limit theorems for functions of the population frequencies as the mutation rate $\theta$ goes to infinity. In particular, we show that if a sample of size $n$ is drawn from a population described by the Poisson--Dirichlet distribution, then the conditional probability of a particular sample configuration is asymptotically normal with mean and variance determined by the Ewens sampling formula. The asymptotic normality of the conditional sampling distribution is somewhat surprising since it is a fairly complicated function of the population frequencies. www.webpages.uidaho.edu /~krone/pd.html   (171 words)

 IngentaConnect Sampling from Finite Random Partitions   (Site not responding. Last check: 2007-10-21) Also, the problem of counting the number of fragments in the k-sample with i representatives (the fragments’ vector count) is addressed, leading to an Ewens sampling formula for finite random partitions. To this end, some connections of the Ewens problem with the birthday and coupon collector’s problems are exploited. Finally, simple illustrative examples are supplied which highlight the main differences, from the sampling point of view, between the symmetric deterministic and random uniform partitions. www.ingentaconnect.com /content/klu/mcap/2003/00000005/00000004/05145502   (286 words)

 Origins of the Coalescent: 1974-1982 -- Kingman 156 (4): 1461 -- Genetics ... the formulae were simplified when d was allowed to tend to infinity. Thus, by the end of 1978, the nature of the Ewens sampling formula ... Ewens sampling formula in terms of a certain "random paintbox." www.genetics.org /cgi/content/full/156/4/1461   (1841 words)

 Citations: The stationary distribution of the infinitely-many neutral alleles diffusion model - Watterson ...   (Site not responding. Last check: 2007-10-21) Though the finite dimensional distributions of the pd distribution are difficult to describe explicitly, there are some remarkably simple formulae involving this distribution, most notably the Ewens sampling formula [23, 25] Antoniak [3] derived the Ewens sampling formula from the.... The construction (5) of (P i) with pd (distribution is related to a derivation of Dirichlet distributions from Fisher's [29] model for species sampling, which is now described. 1 a formula of Perman [43] is applied to obtain an expression for the P ff; joint density.... citeseer.ist.psu.edu /context/432773/0   (905 words)

 Microsurveys in Discrete Probability: Abstracts A typical application of probability to algorithms is to sample a randomly chosen subset of the data and make an inference about all the data with high probability. For instance, if we want to sample 3-colorings on some finite three-dimensional lattice region L, we let the state space of the Markov chain be the set of proper three colorings. Details can be found in the article How to Get a Perfectly Random Sample From a Generic Markov Chain and Generate a Random Spanning Tree of a Directed Graph by the speaker and James Propp. dimacs.rutgers.edu /Workshops/Microsurveys/abstracts.html   (3657 words)

 Modeling Linkage Disequilibrium and Identifying Recombination Hotspots Using Single-Nucleotide Polymorphism Data -- Li ... The effective sample size (ESS) for infs at the MLE is given for each data set above the graph and is a measure of the confidence infs has in its estimated likelihood curve (the larger the better). Results for infs for all data sets except 15 and 16 were kindly provided to us by P. Fearnhead and were obtained using between 50,000 and 5,000,000 iterations. sample, it is actually an estimate of the magnitude of the hotspot www.genetics.org /cgi/content/full/165/4/2213   (7503 words)

