Factbites
 Where results make sense
About us   |   Why use us?   |   Reviews   |   PR   |   Contact us  

Topic: Zipfian distribution


In the News (Tue 17 Nov 09)

  
  Zipf's law - Wikipedia, the free encyclopedia
In the tail of the Yule-Simon distribution the frequencies are approximately
The log-normal distribution is the distribution of a random variable whose logarithm is normally distributed, useful when small fluctuations multiply a quantity rather than add to it.
In the parabolic fractal distribution, the logarithm of the frequency is a quadratic polynomial of the logarithm of the rank.
en.wikipedia.org /wiki/Zipf's_law   (1015 words)

  
 Zipf's law
Zipf's law is the observation made by Harvard linguist George Kingsley Zipf that for many frequency distributions, the n-th largest frequency is proportional to a negative power of the rank order n.
A distribution that is observed to obey Zipf's law is sometimes referred to as a Zipfian distribution.
The phrase "Zipf's law" is also sometimes used to refer to the corresponding probability distribution, the zeta distribution.
www.ebroadcast.com.au /lookup/encyclopedia/zi/Zipf_distribution.html   (351 words)
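The rank-frequency relation quoted in this entry can be sketched numerically. The following is a minimal illustration, assuming a finite number of ranks and an exponent parameter named s (both the function name and the parameter are hypothetical, not taken from the linked page):

```python
def zipf_probabilities(n_ranks, s=1.0):
    """Return p(k) proportional to k**(-s) for ranks k = 1..n_ranks.

    The normalising constant is the finite sum of the weights, so the
    probabilities add up to 1 (an assumption made for this sketch).
    """
    weights = [k ** (-s) for k in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    for rank, p in enumerate(zipf_probabilities(5), start=1):
        print(rank, round(p, 4))
```

With s = 1 the second rank is half as likely as the first, the third a third as likely, and so on.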

  
 Zipf's Law, Benford's Law
With the view to the eerie but uniform distribution of digits of randomly selected numbers, it comes as a great surprise that, if the numbers under investigation are not entirely random but somehow socially or naturally related, the distribution of the first digit is not uniform.
As expected, the third digit is randomly distributed; the numbers of 0s, 1s, 2s, 3s, etc. in the third place are all roughly equal.
It is not true that frequency distributions are always hyperbolic in the social sciences, and always Gaussian in the natural sciences.
www.cut-the-knot.org /do_you_know/zipfLaw.shtml   (2320 words)
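The non-uniform first-digit distribution described in this entry is usually stated as Benford's law, P(d) = log10(1 + 1/d). A short sketch of that formula follows; the function name is hypothetical and the code is not taken from the linked page:

```python
import math


def benford_first_digit_probabilities():
    """Return Benford's law probabilities for leading digits 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


if __name__ == "__main__":
    for digit, p in benford_first_digit_probabilities().items():
        print(digit, round(p, 4))
```

The nine probabilities sum to 1 because the logarithms telescope to log10(10).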

  
 What is to be done
The discussion list obviously achieved a remarkable success, as is evidenced by the similarity of the posting distribution to a Zipfian distribution.
The proposed distribution is shown in Figure 2 by the blue line.
The proposed distribution is shown in Figure 2 by the green line.
www.ee.ucla.edu /~simkin/what_is_to_be_done.htm   (299 words)

  
 Comments on 3370 | MetaTalk   (Site not responding. Last check: 2007-10-22)
Talking off the top of my head, you would be unlikely to get a distributed electronic communication system in which group size/activity follows a normal distribution; rather, you'd keep getting a 'Zipfian' distribution, with a few very large groups, and many many small groups, which is in fact what you find.
Whereas, assuming a Zipfian distribution, we would assume that a few people are having conversations with a lot of people, whereas most are just having one or two with one or two other people.
Scaling a Zipfian distributed rather than a homogeneous graph means scaling a graph that is relatively empty (I think).
metatalk.metafilter.com /mefi/3370   (2486 words)

  
 Unbiased selection of sample alignments
This bio-recipe shows how to select random alignments to study certain properties in a way that the sample is not biased by the peculiar distribution of sequences in the database.
The distribution of gap lengths is found to be Zipfian and this leads to a new deletion cost formula based on a logarithmic approximation.
To show the agreement between the data and the Zipfian distribution we show the expected values and cumulatives for the first 20 gap lengths.
www.biorecipes.com /SampleAlignments/code.html   (1017 words)

  
 T-106.290 Ohjelmoinnin laboratoriotyöt, k2004
If it is zero, the distribution degenerates to a uniform distribution.
Here is one implementation for generating Zipf and some other distributions.
Must be zero for uniform distributions, or positive for increasing skewedness.
www.cs.hut.fi /Opinnot/T-106.290/K2004/Ohjeet/Zipf.html   (198 words)
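The role of the skew parameter described in this entry can be sketched as follows. This is not the implementation the course page links to; it is a minimal illustration that assumes weights of the form 1 / k**skew, with hypothetical function names:

```python
import random


def zipf_weights(n_items, skew):
    """Unnormalised weights 1 / k**skew; skew = 0 makes every weight equal."""
    return [1.0 / (k ** skew) for k in range(1, n_items + 1)]


def zipf_sample(n_items, skew, size, seed=None):
    """Draw `size` items from 1..n_items with Zipf-like weights."""
    rng = random.Random(seed)
    items = list(range(1, n_items + 1))
    return rng.choices(items, weights=zipf_weights(n_items, skew), k=size)


if __name__ == "__main__":
    print(zipf_sample(10, 0.0, 10, seed=1))   # uniform case
    print(zipf_sample(10, 1.5, 10, seed=1))   # skewed towards small ranks
```

Setting the skew to zero reproduces the uniform case mentioned in the snippet, and larger positive values concentrate the draws on the first few items.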

  
 FOA: 3.3.2 Word occurrence as a Poisson process
When the words contained in a corpus are ranked and shown to be distributed according to a Zipfian distribution, an obvious but important observation can be made: The most frequently occurring words are not really about anything.
As $L \rightarrow \infty$ and $p \rightarrow 0$ (and the mean value $\lambda \equiv p \cdot L \rightarrow 1$), the Poisson distribution: $\Pr(f_{kd} = n) = \frac{e^{-\lambda} \lambda^{n}}{n!}$ converges to this same distribution.
If we assume that a potential keyword effectively describes some documents in a corpus but occurs at the level of chance throughout the rest of the corpus, the distribution of this keyword across the corpus can be described as the mixture of a Poisson process with some other distribution.
www-cse.ucsd.edu /~rik/foa/l2h/foa-3-3-2.html   (675 words)
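The Poisson formula quoted in this entry, and the standard fact that a binomial count with many trials L and small per-trial probability p approaches it, can be checked numerically. The sketch below uses hypothetical function names and assumes Python 3.8 or later for math.comb:

```python
import math


def poisson_pmf(n, lam):
    """Pr(f = n) = exp(-lam) * lam**n / n! for a Poisson variable."""
    return math.exp(-lam) * lam ** n / math.factorial(n)


def binomial_pmf(n, trials, p):
    """Probability of exactly n successes in `trials` independent tries."""
    return math.comb(trials, n) * p ** n * (1 - p) ** (trials - n)


if __name__ == "__main__":
    lam = 1.0
    trials = 100_000
    for n in range(4):
        print(n, poisson_pmf(n, lam), binomial_pmf(n, trials, lam / trials))
```

The two columns printed for each n should agree to several decimal places, which is the sense in which the binomial model of word occurrence is approximated by a Poisson process.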

  
 Online Spin » Blog Archive » The Big Hump
Twenty percent of the distribution embodies 80 percent of the substance.
Originally, the term Zipf’s law meant the observation of Harvard linguist George Kingsley Zipf that the frequency of use of the nth-most-frequently-used word in any natural language is approximately inversely proportional to n.
The classic case of Zipf’s law is a “1/f function.” What that means is, given a set of Zipfian distributed frequencies, sorted from the most common to the least common, the second most common frequency will occur half as often as the first.
blogs.mediapost.com /spin/?p=569   (1751 words)

  
 The RDF Benchmark (RBench)   (Site not responding. Last check: 2007-10-22)
Distribution mode under nodes at various hierarchy levels (the Zipfian distribution was used to simulate the classification of resources in SW applications).
The value 0 for the level of favoured subtree means a Zipfian distribution favouring leaves.
A value different from 0 means a Zipfian distribution favouring the left-most subtree rooted at this level.
athena.ics.forth.gr:9090 /RDF/RBench   (312 words)

  
 number spaces from reals follow newcomb-benford and zipf
The coefficient for Zipfian correlation from linear regression was always around 0.9992 and converged with an increasing number of mantissa bits.
From reals (with the S bit always zero, an 11-bit E and an M of 4, 8, 12, 16, 20 and 24 bits, in the range of 1e-0307 to 1e+0308), the cumulative frequency of the most significant digits was measured for the complete number space.
From Table 1 it is clear that the first digit distribution converges to a small extent with Zipf by increasing the mantissa resolution.
home.zonnet.nl /galien8/zipf/zipf.html   (2097 words)

  
 3quarksdaily: From the Tail: Big Fat Regret
This is what captured the imagination of a Harvard linguistics lecturer by the name of George Zipf, who discovered that if the most frequently mentioned word in a book was used N times, the Kth most frequent word in that book would be used about N/K times.
The distribution of population in a city: Remarkably this turns out to be Zipfian as well.
The number of people who have scratched their heads about why this happens reads like a who's who of economics: Herb Simon, Paul Krugman, Benoit Mandelbrot (who isn't regarded by economists as an economist but actually is), and most recently Xavier Gabaix among lots of others.
3quarksdaily.blogs.com /3quarksdaily/2005/11/big_fat_regret.html   (1305 words)

  
 Zipfian distribution   (Site not responding. Last check: 2007-10-22)
Definition: A distribution of probabilities of occurrence that follows Zipf's law.
Note: The distribution of words is often proportional to a function like 1/n
Paul E. Black, "Zipfian distribution", in Dictionary of Algorithms and Data Structures [online], Paul E. Black, ed., U.S. National Institute of Standards and Technology.
www.nist.gov /dads/HTML/zipfian.html   (82 words)

  
 How to explain zipfian distribution over multidimensional sp
The Zipfian distribution was originally defined over a one-dimensional space.
The probability of a point's existence follows a Zipfian distribution.
A uniform distribution over the interval [0, 1] can be generated by the computer.
www.database-forum.com /ftopic256.html   (281 words)

  
 Citations: A Benchmark Comparison of DB2 and the DBC - Steindel, Madison (ResearchIndex)
A significant contribution of this paper is to demonstrate the importance of using statistics on data distributions.
Further, the distribution of the query values (the actual constants in the predicates of the queries) is often quite different from the....
The data distribution in each column is a generalized Zipfian distribution [5]; the Zipfian distribution models a significant amount of real skewed data.
citeseer.ist.psu.edu /context/454942/0   (283 words)

  
 unmediated: The Big Hump
Now, one of the few places it does not hold up is in the area of Video On Demand or Video Rentals and of course, the business of hits.
The nth most common frequency will occur 1/n as often as the first. Zipf's law is an experimental law, not a theoretical one.
Zipfian distributions are commonly observed in many kinds of phenomena. The curve that illustrates Zipf’s law looks like a long tail (hence the title of Mr.
www.unmediated.org /archives/2005/07/the_big_hump_1.php   (909 words)

  
 Alex Barnett blog : Long Tails and Zipfian distribution
"It's quite likely that your pageviews follow a Zipf distribution with classic long tail usage, since most websites have worked this way since at least 1996 (the first time I analyzed such data).
However, it would be easier to evaluate your data if you plotted the data on log-log diagrams (i.e., logarithmic scales for both x and y axes).
Basically, if the data shows as a straight line on log-log plots, then you have the expected distribution.
blogs.msdn.com /alexbarn/archive/2005/12/31/508369.aspx   (284 words)
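The log-log check described in this entry can also be done numerically rather than by eye: take logarithms of rank and frequency and fit a straight line. The sketch below is one way to do that with ordinary least squares; the function name is hypothetical and the input is assumed to be positive frequencies sorted from most to least common:

```python
import math


def loglog_slope(frequencies):
    """Fit log(frequency) = slope * log(rank) + intercept by least squares.

    `frequencies` must be positive and sorted in descending order, so that
    position 1 is the most common item.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    pageviews = [1000.0 / rank for rank in range(1, 51)]  # synthetic data
    print(loglog_slope(pageviews))
```

A slope close to -1 on data like this corresponds to the straight line on the log-log diagram; real pageview data would normally scatter around the line rather than sit on it.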

  
 SIMS 202 Fall '99 Assignment 8
Viewing and Analyzing the Zipfian distribution: 60-90 minutes.
We now want to convert these inverted files into a form that can be viewed according to its Zipfian distribution.
Recall that the Zipfian distribution is an effect seen when the data is ordered by its rank.
www2.sims.berkeley.edu /courses/is202/f99/assignments/assign8.html   (1328 words)

  
 The Long Tail: Microsoft and the Long Tail of search
Edward Jay Epstein writes in Slate on the economics of shifting TV and movies from broadcast and theatrical distribution to downloading: "The real issue for the studios is how they can dig into this potential gold mine without undermining their existing revenue streams.
It may also be as close to a no-risk deal as filmmakers are likely to find: all they need provide is proof that the rights to their film have been cleared, and a master to be copied.
For those who are really into the nuts and bolts of Long Tailed distributions, an invitation from Art Zaifman: "I'm working on an R&D project at AT&T Labs that employs an in-house sampling algorithm that has been proven optimal in its ability to reduce yet retain accuracy of data that follows a heavy/long tailed distribution.
www.thelongtail.com /the_long_tail/2005/12/microsoft_and_t.html   (10525 words)

  
 ENGR 659 -- Dr. Schoenly -- exercise 7
It is possible, and in fact quite easy, to calculate an ideal or perfect Zipfian distribution, given T total word occurrences and N different ranks (i.e.
Write a program that allows the user to input T and N, then computes and displays the associated Zipfian distribution.
You may calculate your frequency values as floating point numbers, even though in a realistic situation they would, of course, necessarily be integers.
john.cs.olemiss.edu /~sbs/659spring2004/x07.html   (121 words)
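One possible reading of this exercise is sketched below. It assumes the ideal frequency at rank r is T / (r * H_N), where H_N is the N-th harmonic number, so that the N frequencies sum to T; the exercise page may define the constant differently, and the function name is hypothetical:

```python
def ideal_zipf_frequencies(total_occurrences, n_ranks):
    """Return floating point frequencies proportional to 1/rank.

    The frequencies are scaled by the harmonic number H_N so that they
    sum to `total_occurrences` (an assumption made for this sketch).
    """
    harmonic = sum(1.0 / r for r in range(1, n_ranks + 1))
    return [total_occurrences / (r * harmonic) for r in range(1, n_ranks + 1)]


if __name__ == "__main__":
    for rank, freq in enumerate(ideal_zipf_frequencies(10000, 10), start=1):
        print(rank, round(freq, 2))
```

The values are left as floating point numbers, as the exercise allows, rather than rounded to integer counts.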

  
 CS 660: Leda Assignment
3) Write a program to produce a list containing the integers 1 through N randomly distributed with Zipfian (Lotka's, 80% - 20%) probability distribution.
Recall that the Zipfian probability distribution is given by
How do the results of the same algorithm vary between different files of the same size and distribution?
www.eli.sdsu.edu /courses/fall95/cs660/assignments/LedaAsss1.html   (326 words)

  
 SOFTWARE DOWNLOAD FROM USC DATABASE LAB   (Site not responding. Last check: 2007-10-22)
The terms of using our published software are simple. If you download our software for non-profit activities then please acknowledge the “USC Database Laboratory” in any resulting software prototypes and publications. If you are a for-profit organization, please contact Dr. Shahram Ghandeharizadeh prior to using our software.
Software to generate either a Zipfian or a Uniform distribution of access to a fixed number of objects (
Software to generate requests based on a Poisson distribution. This program implements an analytical model to compute the response time of a M/M/D/1 queuing model. The analytical model is used to verify the correctness of the software that generates the different Poisson arrival rates. Published April 9, 2005.
perspolis.usc.edu /Users/Shahram/DownloadPage.htm   (160 words)

  
 Term: Zipfian distribution
Once we make an assumption about how keywords occur within separate documents, we can derive the distribution of keywords across documents.
But the distribution of keywords assigned to documents can be expected to be much more uniform - documents are about a nearly uniform or constant number of topics.
Figure 3.5 represents the index as a graph, where edges connect keyword nodes on the left with document nodes on the right.
www.dei.unipd.it /~melo/htbw/foa/terms/823.htm   (969 words)

  
 ETH - icos - Zipfian distribution - Exercise Week 5   (Site not responding. Last check: 2007-10-22)
Read and understand the biorecipe "Unbiased selection of sample alignments". Find a piece of text on the web, count the words in the text and see if they fit a Zipfian distribution. The darwin commands "SearchDelim", "ReadRawFile", and "table" will be useful as well as the data structure "Counter".
www.icos.ethz.ch /education/courses/computational_biology/week5   (166 words)
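The exercise in this entry is set in Darwin; the sketch below swaps in Python instead and does not use the Darwin commands named above. It counts words in a text and lists rank, word, count and rank times count, on the assumption that a roughly constant product indicates a Zipf-like fit. The function name and the simple word pattern are hypothetical choices:

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count, rank * count) rows, most common first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return [
        (rank, word, count, rank * count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


if __name__ == "__main__":
    sample_text = "the cat and the dog and the bird"
    for row in rank_frequency(sample_text):
        print(row)
```

A text this short only demonstrates the mechanics; a piece of text several thousand words long is needed before the rank times count column says anything about the fit.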

  
 Zipf's law Information Center - Zipf's law
So, the most frequent word will occur approximately twice as often as the second most frequent word, which occurs twice as often as the fourth most frequent word, etc. The term has come to be used to refer to any of a family of related probability distributions.
income distribution amongst the top earning 3% of individuals (see External Links, below)
Seeing Around Corners (Artificial societies turn up Zipf's law)
www.scipeeps.com /Sci-Linguistic_Topics_U_-_Z/Zipf's_law.html   (948 words)

  
 Optimizations for Dynamic Inverted Index Maintenance - Cutting, Pedersen (ResearchIndex)   (Site not responding. Last check: 2007-10-22)
B-trees are an effective tool in implementing such indices.
The Zipfian distribution of postings suggests space and time optimizations unique to this task.
In particular, we present two novel optimizations, merge update, which performs better than straight forward block update, and pulsing which significantly reduces space requirements without sacrificing performance.
sherry.ifi.unizh.ch /cutting90optimizations.html   (458 words)


Copyright © 2005-2007 www.factbites.com Usage implies agreement with terms.