列出 N 以下所有素数的最快方法

Fastest way to list all primes below N

提问人:jbochi 提问时间:1/15/2010 最后编辑:bjb568jbochi 更新时间:5/13/2023 访问量:278128

问:

这是我能想到的最好的算法。

def get_primes(n):
    numbers = set(range(n, 1, -1))
    primes = []
    while numbers:
        p = numbers.pop()
        primes.append(p)
        numbers.difference_update(set(range(p*2, n+1, p)))
    return primes

>>> timeit.Timer(stmt='get_primes.get_primes(1000000)', setup='import   get_primes').timeit(1)
1.1499958793645562

它能做得更快吗?

此代码有一个缺陷:由于是一个无序集合,因此无法保证会从集合中删除最小的数字。尽管如此,它对某些输入数字有效(至少对我而言):numbersnumbers.pop()

>>> sum(get_primes(2000000))
142913828922L
#That's the correct sum of all numbers below 2 million
>>> 529 in get_primes(1000)
False
>>> 529 in get_primes(530)
True
Python 数学 优化 素数

评论

0赞 Shekhar 1/23/2010
如果像 numbers = set(range(n, 2, -2)) 这样的声明,则有问题的代码片段要快得多。但不能打败 sundaram3。谢谢你的问题。
4赞 Michael Foukarakis 9/10/2011
如果答案中可以有 Python 3 版本的函数,那就太好了。
2赞 Colonel Panic 7/4/2015
我怀疑围绕 C++ 库 primesieve 的 Python 绑定会比所有这些快几个数量级
4赞 Jared 7/8/2015
@ColonelPanic 碰巧的是,我更新了 Py3 的 github.com/jaredks/pyprimesieve 并添加到 PyPi 中。它肯定比这些快,但不是几个数量级 - 比最好的 numpy 版本快 ~5 倍。
4赞 Peter Cordes 7/19/2015
@ColonelPanic:我认为编辑旧答案以注明它们已经老化是合适的,因为这使它成为更有用的资源。如果“接受”的答案不再是最好的答案,也许可以在问题中编辑一个注释,并在 2015 年更新,为人们指出当前的最佳方法。

答:

43赞 Alex Martelli 1/15/2010 #1

这里有一个非常简洁的 Python Cookbook 示例——该 URL 上建议的最快版本是:

import itertools
def erat2( ):
    D = {  }
    yield 2
    for q in itertools.islice(itertools.count(3), 0, None, 2):
        p = D.pop(q, None)
        if p is None:
            D[q*q] = q
            yield q
        else:
            x = p + q
            while x in D or not (x&1):
                x += p
            D[x] = p

这样就给

def get_primes_erat(n):
  return list(itertools.takewhile(lambda p: p<n, erat2()))

在 shell 提示符下(我更喜欢这样做)在 pri.py 中使用此代码进行测量,我观察到:

$ python2.5 -mtimeit -s'import pri' 'pri.get_primes(1000000)'
10 loops, best of 3: 1.69 sec per loop
$ python2.5 -mtimeit -s'import pri' 'pri.get_primes_erat(1000000)'
10 loops, best of 3: 673 msec per loop

因此,看起来 Cookbook 解决方案的速度是原来的两倍多。

评论

2赞 Alex Martelli 1/15/2010
@jbochi,不客气——但请看一下那个 URL,包括致谢:我们花了十个人的时间才共同完善代码到这一点,包括 Tim Peters 和 Raymond Hettinger 等 Python 性能杰出人物(自从我编辑印刷的 Cookbook 以来,我写了食谱的最终文本,但在编码方面,我的贡献与其他人不相上下)——最后, 这是非常微妙和微调的代码,这并不奇怪!
0赞 jbochi 1/15/2010
@Alex:知道你的代码“只有”我的两倍,这让我感到非常自豪。:)URL读起来也很有趣。再次感谢。
0赞 tzot 9/26/2010
只需稍作改动,它就可以变得更快:参见 stackoverflow.com/questions/2211990/...
1赞 Will Ness 8/3/2012
...此外,还可以通过额外的 ~1.2x-1.3 倍加速、将内存占用从 O(n) 大幅减少到 O(sqrt(n)) 以及提高经验时间复杂度来加快速度,方法是将素数添加到字典中,直到在输入中看到它们的平方在这里测试一下
22赞 sth 1/15/2010 #2

该算法速度很快,但它有一个严重的缺陷:

>>> sorted(get_primes(530))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73,
79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163,
167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251,
257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349,
353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443,
449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 527, 529]
>>> 17*31
527
>>> 23*23
529

您假设这将返回集合中的最小数字,但这根本无法保证。集合是无序的,会删除并返回任意元素,因此它不能用于从剩余数字中选择下一个素数。numbers.pop()pop()

414赞 unutbu 1/15/2010 #3

警告:结果可能因硬件或 Python 版本。timeit

下面是一个脚本,它比较了多种实现:

非常感谢斯蒂芬让我注意到sieve_wheel_30。 功劳归于罗伯特·威廉·汉克斯(Robert William Hanks)的primesfrom2to、primesfrom3to、rwh_primes、rwh_primes1和rwh_primes2。

在使用 psyco 测试的普通 Python 方法中,对于 n=1000000,rwh_primes1 是测试最快的。

+---------------------+-------+
| Method              | ms    |
+---------------------+-------+
| rwh_primes1         | 43.0  |
| sieveOfAtkin        | 46.4  |
| rwh_primes          | 57.4  |
| sieve_wheel_30      | 63.0  |
| rwh_primes2         | 67.8  |    
| sieveOfEratosthenes | 147.0 |
| ambi_sieve_plain    | 152.0 |
| sundaram3           | 194.0 |
+---------------------+-------+

在没有psyco的情况下,在n=1000000的情况下测试的普通Python方法中,rwh_primes2是最快的。

+---------------------+-------+
| Method              | ms    |
+---------------------+-------+
| rwh_primes2         | 68.1  |
| rwh_primes1         | 93.7  |
| rwh_primes          | 94.6  |
| sieve_wheel_30      | 97.4  |
| sieveOfEratosthenes | 178.0 |
| ambi_sieve_plain    | 286.0 |
| sieveOfAtkin        | 314.0 |
| sundaram3           | 416.0 |
+---------------------+-------+

在所有测试的方法中,允许numpy,对于n=1000000,primesfrom2to是测试最快的。

+---------------------+-------+
| Method              | ms    |
+---------------------+-------+
| primesfrom2to       | 15.9  |
| primesfrom3to       | 18.4  |
| ambi_sieve          | 29.3  |
+---------------------+-------+

使用以下命令测量时序:

python -mtimeit -s"import primes" "primes.{method}(1000000)"

替换为每个方法名称。{method}

primes.py:

#!/usr/bin/env python
import psyco; psyco.full()
from math import sqrt, ceil
import numpy as np

def rwh_primes(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * n
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i]:
            sieve[i*i::2*i]=[False]*((n-i*i-1)/(2*i)+1)
    return [2] + [i for i in xrange(3,n,2) if sieve[i]]

def rwh_primes1(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * (n/2)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i/2]:
            sieve[i*i/2::i] = [False] * ((n-i*i-1)/(2*i)+1)
    return [2] + [2*i+1 for i in xrange(1,n/2) if sieve[i]]

def rwh_primes2(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    correction = (n%6>1)
    n = {0:n,1:n-1,2:n+4,3:n+3,4:n+2,5:n+1}[n%6]
    sieve = [True] * (n/3)
    sieve[0] = False
    for i in xrange(int(n**0.5)/3+1):
      if sieve[i]:
        k=3*i+1|1
        sieve[      ((k*k)/3)      ::2*k]=[False]*((n/6-(k*k)/6-1)/k+1)
        sieve[(k*k+4*k-2*k*(i&1))/3::2*k]=[False]*((n/6-(k*k+4*k-2*k*(i&1))/6-1)/k+1)
    return [2,3] + [3*i+1|1 for i in xrange(1,n/3-correction) if sieve[i]]

def sieve_wheel_30(N):
    # http://zerovolt.com/?p=88
    ''' Returns a list of primes <= N using wheel criterion 2*3*5 = 30

Copyright 2009 by zerovolt.com
This code is free for non-commercial purposes, in which case you can just leave this comment as a credit for my work.
If you need this code for commercial purposes, please contact me by sending an email to: info [at] zerovolt [dot] com.'''
    __smallp = ( 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
    61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139,
    149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227,
    229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311,
    313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401,
    409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491,
    499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599,
    601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683,
    691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797,
    809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887,
    907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997)

    wheel = (2, 3, 5)
    const = 30
    if N < 2:
        return []
    if N <= const:
        pos = 0
        while __smallp[pos] <= N:
            pos += 1
        return list(__smallp[:pos])
    # make the offsets list
    offsets = (7, 11, 13, 17, 19, 23, 29, 1)
    # prepare the list
    p = [2, 3, 5]
    dim = 2 + N // const
    tk1  = [True] * dim
    tk7  = [True] * dim
    tk11 = [True] * dim
    tk13 = [True] * dim
    tk17 = [True] * dim
    tk19 = [True] * dim
    tk23 = [True] * dim
    tk29 = [True] * dim
    tk1[0] = False
    # help dictionary d
    # d[a , b] = c  ==> if I want to find the smallest useful multiple of (30*pos)+a
    # on tkc, then I need the index given by the product of [(30*pos)+a][(30*pos)+b]
    # in general. If b < a, I need [(30*pos)+a][(30*(pos+1))+b]
    d = {}
    for x in offsets:
        for y in offsets:
            res = (x*y) % const
            if res in offsets:
                d[(x, res)] = y
    # another help dictionary: gives tkx calling tmptk[x]
    tmptk = {1:tk1, 7:tk7, 11:tk11, 13:tk13, 17:tk17, 19:tk19, 23:tk23, 29:tk29}
    pos, prime, lastadded, stop = 0, 0, 0, int(ceil(sqrt(N)))
    # inner functions definition
    def del_mult(tk, start, step):
        for k in xrange(start, len(tk), step):
            tk[k] = False
    # end of inner functions definition
    cpos = const * pos
    while prime < stop:
        # 30k + 7
        if tk7[pos]:
            prime = cpos + 7
            p.append(prime)
            lastadded = 7
            for off in offsets:
                tmp = d[(7, off)]
                start = (pos + prime) if off == 7 else (prime * (const * (pos + 1 if tmp < 7 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 11
        if tk11[pos]:
            prime = cpos + 11
            p.append(prime)
            lastadded = 11
            for off in offsets:
                tmp = d[(11, off)]
                start = (pos + prime) if off == 11 else (prime * (const * (pos + 1 if tmp < 11 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 13
        if tk13[pos]:
            prime = cpos + 13
            p.append(prime)
            lastadded = 13
            for off in offsets:
                tmp = d[(13, off)]
                start = (pos + prime) if off == 13 else (prime * (const * (pos + 1 if tmp < 13 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 17
        if tk17[pos]:
            prime = cpos + 17
            p.append(prime)
            lastadded = 17
            for off in offsets:
                tmp = d[(17, off)]
                start = (pos + prime) if off == 17 else (prime * (const * (pos + 1 if tmp < 17 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 19
        if tk19[pos]:
            prime = cpos + 19
            p.append(prime)
            lastadded = 19
            for off in offsets:
                tmp = d[(19, off)]
                start = (pos + prime) if off == 19 else (prime * (const * (pos + 1 if tmp < 19 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 23
        if tk23[pos]:
            prime = cpos + 23
            p.append(prime)
            lastadded = 23
            for off in offsets:
                tmp = d[(23, off)]
                start = (pos + prime) if off == 23 else (prime * (const * (pos + 1 if tmp < 23 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 29
        if tk29[pos]:
            prime = cpos + 29
            p.append(prime)
            lastadded = 29
            for off in offsets:
                tmp = d[(29, off)]
                start = (pos + prime) if off == 29 else (prime * (const * (pos + 1 if tmp < 29 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # now we go back to top tk1, so we need to increase pos by 1
        pos += 1
        cpos = const * pos
        # 30k + 1
        if tk1[pos]:
            prime = cpos + 1
            p.append(prime)
            lastadded = 1
            for off in offsets:
                tmp = d[(1, off)]
                start = (pos + prime) if off == 1 else (prime * (const * pos + tmp) )//const
                del_mult(tmptk[off], start, prime)
    # time to add remaining primes
    # if lastadded == 1, remove last element and start adding them from tk1
    # this way we don't need an "if" within the last while
    if lastadded == 1:
        p.pop()
    # now complete for every other possible prime
    while pos < len(tk1):
        cpos = const * pos
        if tk1[pos]: p.append(cpos + 1)
        if tk7[pos]: p.append(cpos + 7)
        if tk11[pos]: p.append(cpos + 11)
        if tk13[pos]: p.append(cpos + 13)
        if tk17[pos]: p.append(cpos + 17)
        if tk19[pos]: p.append(cpos + 19)
        if tk23[pos]: p.append(cpos + 23)
        if tk29[pos]: p.append(cpos + 29)
        pos += 1
    # remove exceeding if present
    pos = len(p) - 1
    while p[pos] > N:
        pos -= 1
    if pos < len(p) - 1:
        del p[pos+1:]
    # return p list
    return p

def sieveOfEratosthenes(n):
    """sieveOfEratosthenes(n): return the list of the primes < n."""
    # Code from: <[email protected]>, Nov 30 2006
    # http://groups.google.com/group/comp.lang.python/msg/f1f10ced88c68c2d
    if n <= 2:
        return []
    sieve = range(3, n, 2)
    top = len(sieve)
    for si in sieve:
        if si:
            bottom = (si*si - 3) // 2
            if bottom >= top:
                break
            sieve[bottom::si] = [0] * -((bottom - top) // si)
    return [2] + [el for el in sieve if el]

def sieveOfAtkin(end):
    """sieveOfAtkin(end): return a list of all the prime numbers <end
    using the Sieve of Atkin."""
    # Code by Steve Krenzel, <[email protected]>, improved
    # Code: https://web.archive.org/web/20080324064651/http://krenzel.info/?p=83
    # Info: http://en.wikipedia.org/wiki/Sieve_of_Atkin
    assert end > 0
    lng = ((end-1) // 2)
    sieve = [False] * (lng + 1)

    x_max, x2, xd = int(sqrt((end-1)/4.0)), 0, 4
    for xd in xrange(4, 8*x_max + 2, 8):
        x2 += xd
        y_max = int(sqrt(end-x2))
        n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1
        if not (n & 1):
            n -= n_diff
            n_diff -= 2
        for d in xrange((n_diff - 1) << 1, -1, -8):
            m = n % 12
            if m == 1 or m == 5:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, x2, xd = int(sqrt((end-1) / 3.0)), 0, 3
    for xd in xrange(3, 6 * x_max + 2, 6):
        x2 += xd
        y_max = int(sqrt(end-x2))
        n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1
        if not(n & 1):
            n -= n_diff
            n_diff -= 2
        for d in xrange((n_diff - 1) << 1, -1, -8):
            if n % 12 == 7:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, y_min, x2, xd = int((2 + sqrt(4-8*(1-end)))/4), -1, 0, 3
    for x in xrange(1, x_max + 1):
        x2 += xd
        xd += 6
        if x2 >= end: y_min = (((int(ceil(sqrt(x2 - end))) - 1) << 1) - 2) << 1
        n, n_diff = ((x*x + x) << 1) - 1, (((x-1) << 1) - 2) << 1
        for d in xrange(n_diff, y_min, -8):
            if n % 12 == 11:
                m = n >> 1
                sieve[m] = not sieve[m]
            n += d

    primes = [2, 3]
    if end <= 3:
        return primes[:max(0,end-2)]

    for n in xrange(5 >> 1, (int(sqrt(end))+1) >> 1):
        if sieve[n]:
            primes.append((n << 1) + 1)
            aux = (n << 1) + 1
            aux *= aux
            for k in xrange(aux, end, 2 * aux):
                sieve[k >> 1] = False

    s  = int(sqrt(end)) + 1
    if s  % 2 == 0:
        s += 1
    primes.extend([i for i in xrange(s, end, 2) if sieve[i >> 1]])

    return primes

def ambi_sieve_plain(n):
    s = range(3, n, 2)
    for m in xrange(3, int(n**0.5)+1, 2): 
        if s[(m-3)/2]: 
            for t in xrange((m*m-3)/2,(n>>1)-1,m):
                s[t]=0
    return [2]+[t for t in s if t>0]

def sundaram3(max_n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/2073279#2073279
    numbers = range(3, max_n+1, 2)
    half = (max_n)//2
    initial = 4

    for step in xrange(3, max_n+1, 2):
        for i in xrange(initial, half, step):
            numbers[i-1] = 0
        initial += 2*(step+1)

        if initial > half:
            return [2] + filter(None, numbers)

################################################################################
# Using Numpy:
def ambi_sieve(n):
    # http://tommih.blogspot.com/2009/04/fast-prime-number-generator.html
    s = np.arange(3, n, 2)
    for m in xrange(3, int(n ** 0.5)+1, 2): 
        if s[(m-3)/2]: 
            s[(m*m-3)/2::m]=0
    return np.r_[2, s[s>0]]

def primesfrom3to(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns a array of primes, p < n """
    assert n>=2
    sieve = np.ones(n/2, dtype=np.bool)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i/2]:
            sieve[i*i/2::i] = False
    return np.r_[2, 2*np.nonzero(sieve)[0][1::]+1]    

def primesfrom2to(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns a array of primes, 2 <= p < n """
    sieve = np.ones(n/3 + (n%6==2), dtype=np.bool)
    sieve[0] = False
    for i in xrange(int(n**0.5)/3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[      ((k*k)/3)      ::2*k] = False
            sieve[(k*k+4*k-2*k*(i&1))/3::2*k] = False
    return np.r_[2,3,((3*np.nonzero(sieve)[0]+1)|1)]

if __name__=='__main__':
    import itertools
    import sys

    def test(f1,f2,num):
        print('Testing {f1} and {f2} return same results'.format(
            f1=f1.func_name,
            f2=f2.func_name))
        if not all([a==b for a,b in itertools.izip_longest(f1(num),f2(num))]):
            sys.exit("Error: %s(%s) != %s(%s)"%(f1.func_name,num,f2.func_name,num))

    n=1000000
    test(sieveOfAtkin,sieveOfEratosthenes,n)
    test(sieveOfAtkin,ambi_sieve,n)
    test(sieveOfAtkin,ambi_sieve_plain,n) 
    test(sieveOfAtkin,sundaram3,n)
    test(sieveOfAtkin,sieve_wheel_30,n)
    test(sieveOfAtkin,primesfrom3to,n)
    test(sieveOfAtkin,primesfrom2to,n)
    test(sieveOfAtkin,rwh_primes,n)
    test(sieveOfAtkin,rwh_primes1,n)         
    test(sieveOfAtkin,rwh_primes2,n)

运行脚本会测试所有实现是否给出相同的结果。

评论

7赞 Alex Martelli 1/15/2010
如果你对非纯 Python 代码感兴趣,那么你应该看看——它通过其类型的方法对素数有很好的支持。gmpynext_primempz
1赞 Ehsan Kia 4/28/2014
如果你使用的是pypy,这些基准测试(psyco基准测试)似乎相当不对劲。令人惊讶的是,我发现 sieveOfEratosthenes 和 ambi_sieve_plain 是最快的 pypy。这就是我为非 numpy 发现的 gist.github.com/5bf466bb1ee9e5726a52
1赞 gaborous 7/9/2015
如果有人想知道这里的函数如何与维基教科书的 PG7.8 相比,对于没有 psyco 或 pypy 的纯 python:对于 n = 1000000:PG7.8:每个循环 4.93 秒;rwh_primes1:每个环路 69 ms;rwh_primes2:每个环路 57.1 ms
10赞 noɥʇʎԀʎzɐɹƆ 12/24/2016
你能用 PyPy 更新它吗,现在 psyco 已经死了,PyPy 已经取代了它?
10赞 cs95 11/12/2018
如果这些函数和计时可以针对 python3 进行更新,那就太好了。
3赞 Daniel G 1/15/2010 #4

对于最快的代码,numpy 解决方案是最好的。不过,出于纯粹的学术原因,我发布了我的纯 python 版本,它比上面发布的食谱版本快不到 50%。由于我在内存中制作了整个列表,因此您需要足够的空间来容纳所有内容,但它似乎可以很好地扩展。

def daniel_sieve_2(maxNumber):
    """
    Given a number, returns all numbers less than or equal to
    that number which are prime.
    """
    allNumbers = range(3, maxNumber+1, 2)
    for mIndex, number in enumerate(xrange(3, maxNumber+1, 2)):
        if allNumbers[mIndex] == 0:
            continue
        # now set all multiples to 0
        for index in xrange(mIndex+number, (maxNumber-3)/2+1, number):
            allNumbers[index] = 0
    return [2] + filter(lambda n: n!=0, allNumbers)

结果如下:

>>>mine = timeit.Timer("daniel_sieve_2(1000000)",
...                    "from sieves import daniel_sieve_2")
>>>prev = timeit.Timer("get_primes_erat(1000000)",
...                    "from sieves import get_primes_erat")
>>>print "Mine: {0:0.4f} ms".format(min(mine.repeat(3, 1))*1000)
Mine: 428.9446 ms
>>>print "Previous Best {0:0.4f} ms".format(min(prev.repeat(3, 1))*1000)
Previous Best 621.3581 ms
19赞 Kimvais 1/15/2010 #5

对于具有足够大 N 的真正最快的解决方案,下载一个预先计算的素数列表,将其存储为元组,然后执行类似操作:

for pos,i in enumerate(primes):
    if i > N:
        print primes[:pos]

如果只有这样,计算更多的素数并将新列表保存在代码中,那么下次它同样快。N > primes[-1]

始终跳出框框思考。

评论

11赞 Daniel G 1/15/2010
不过,为了公平起见,您必须计算下载、解压缩和格式化素数的时间,并将其与使用算法生成素数的时间进行比较 - 这些算法中的任何一种都可以轻松地将结果写入文件以供以后使用。我认为在这种情况下,如果有足够的内存来实际计算所有小于 982,451,653 的素数,numpy 解决方案仍然会更快。
3赞 Kimvais 1/15/2010
@Daniel正确。然而,商店,你所拥有的,并在需要时继续存在......
0赞 Ross 10/23/2014
@Daniel G 我认为下载时间无关紧要。这真的不是关于生成数字吗,所以你要考虑用于创建你正在下载的列表的算法。任何时候的复杂性都会忽略给定它 O(n) 的文件传输一次。
1赞 Batman 5/23/2017
UTM prime 页面的 FAQ 建议计算小素数比从磁盘上读取它们更快(问题是小意味着什么)。
28赞 jbochi 1/16/2010 #6

使用 Sundaram 的 Sieve,我想我打破了纯 Python 的记录:

def sundaram3(max_n):
    numbers = range(3, max_n+1, 2)
    half = (max_n)//2
    initial = 4

    for step in xrange(3, max_n+1, 2):
        for i in xrange(initial, half, step):
            numbers[i-1] = 0
        initial += 2*(step+1)

        if initial > half:
            return [2] + filter(None, numbers)

比较:

C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.get_primes_erat(1000000)"
10 loops, best of 3: 710 msec per loop

C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.daniel_sieve_2(1000000)"
10 loops, best of 3: 435 msec per loop

C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.sundaram3(1000000)"
10 loops, best of 3: 327 msec per loop

评论

1赞 mthurlin 1/20/2010
我设法通过在函数顶部添加“zero = 0”,然后将过滤器中的 lambda 替换为“zero.__sub__”,将您的函数速度提高了大约 20%。不是世界上最漂亮的代码,但速度更快:)
1赞 jbochi 1/20/2010
@truppo:感谢您的评论!我刚刚意识到传递而不是原始函数有效,它甚至比Nonezero.__sub__
10赞 wrhall 9/22/2013
你知道吗,如果你通过它就会回来?它似乎用无数的——也许是所有的——奇数来做到这一点(即使它们不是质数)sundaram3(9)[2, 3, 5, 7, 9]
1赞 Assem 1/7/2016
它有一个问题:Sundaram3(7071) 包括 7071,而它不是 Prime
4赞 Ruggiero Spearman 1/21/2010 #7

假设 N < 9,080,191 的 Miller-Rabin 素数检验的确定性实现

import sys

def miller_rabin_pass(a, n):
    d = n - 1
    s = 0
    while d % 2 == 0:
        d >>= 1
        s += 1

    a_to_power = pow(a, d, n)
    if a_to_power == 1:
        return True
    for i in range(s-1):
        if a_to_power == n - 1:
            return True
        a_to_power = (a_to_power * a_to_power) % n
    return a_to_power == n - 1


def miller_rabin(n):
    if n <= 2:
        return n == 2

    if n < 2_047:
        return miller_rabin_pass(2, n)

    return all(miller_rabin_pass(a, n) for a in (31, 73))


n = int(sys.argv[1])
primes = [2]
for p in range(3,n,2):
  if miller_rabin(p):
    primes.append(p)
print len(primes)

根据维基百科上的文章(http://en.wikipedia.org/wiki/Miller-Rabin_primality_test),在 a = 31 和 73 的情况下测试 N < 9,080,191 足以决定 N 是否是复合的。

我从原始 Miller-Rabin 测试的概率实现中改编了源代码,如下所示:https://www.literateprograms.org/miller-rabin_primality_test__python_.html

评论

1赞 jbochi 1/21/2010
感谢 Miller-Rabin 的素数测试,但这段代码实际上速度较慢,并且没有提供正确的结果。37 是素数,未通过测试。
0赞 Ruggiero Spearman 1/21/2010
我想 37 是特例之一,我的错。尽管我对确定性版本充满希望:)
0赞 Misguided 11/27/2013
拉宾·米勒(Rabin Miller)没有任何特殊情况。
3赞 Logan 12/25/2013
你误读了这篇文章。它是 31 岁,而不是 37 岁。这就是您的实现失败的原因。
4赞 Dave W. Smith 1/21/2010 #8

如果你可以控制 N,列出所有素数的最快方法是预先计算它们。认真地。预计算是一种被忽视的优化方式。

评论

3赞 jbochi 1/21/2010
或者从这里下载它们 primes.utm.edu/lists/small/millions,但这个想法是测试 python 的极限,看看优化是否能产生漂亮的代码。
4赞 MAK 1/22/2010 #9

这是我通常用来在 Python 中生成素数的代码:

$ python -mtimeit -s'import sieve' 'sieve.sieve(1000000)' 
10 loops, best of 3: 445 msec per loop
$ cat sieve.py
from math import sqrt

def sieve(size):
 prime=[True]*size
 rng=xrange
 limit=int(sqrt(size))

 for i in rng(3,limit+1,+2):
  if prime[i]:
   prime[i*i::+i]=[False]*len(prime[i*i::+i])

 return [2]+[i for i in rng(3,size,+2) if prime[i]]

if __name__=='__main__':
 print sieve(100)

它无法与此处发布的更快解决方案竞争,但至少它是纯 python。

感谢您发布这个问题。我今天真的学到了很多东西。

1赞 akuhn 1/25/2010 #10

我的猜测是,所有方法中最快的方法是在代码中对素数进行硬编码。

那么,为什么不直接编写一个缓慢的脚本来生成另一个包含所有数字的源文件,然后在运行实际程序时导入该源文件。

当然,这只有在编译时知道 N 的上限时才有效,但(几乎)所有项目 Euler 问题都是如此。

 

PS:我可能是错的,虽然用硬连线素数解析源代码比首先计算它们慢,但据我所知,Python 从编译的 .pyc 文件运行,所以在这种情况下,读取所有素数最多为 N 的二进制数组应该是血腥的快。

160赞 19 revs, 4 users 76%Robert William Hanks #11

更快、更内存的纯 Python 代码:

def primes(n):
    """ Returns  a list of primes < n """
    sieve = [True] * n
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i]:
            sieve[i*i::2*i]=[False]*((n-i*i-1)//(2*i)+1)
    return [2] + [i for i in range(3,n,2) if sieve[i]]

或从半筛开始

def primes1(n):
    """ Returns  a list of primes < n """
    sieve = [True] * (n//2)
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = [False] * ((n-i*i-1)//(2*i)+1)
    return [2] + [2*i+1 for i in range(1,n//2) if sieve[i]]

更快、更内存的 numpy 代码:

import numpy
def primesfrom3to(n):
    """ Returns a array of primes, 3 <= p < n """
    sieve = numpy.ones(n//2, dtype=bool)
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = False
    return 2*numpy.nonzero(sieve)[0][1::]+1

从三分之一的筛子开始的更快变化:

import numpy
def primesfrom2to(n):
    """ Input n>=6, Returns a array of primes, 2 <= p < n """
    sieve = numpy.ones(n//3 + (n%6==2), dtype=bool)
    for i in range(1,int(n**0.5)//3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[       k*k//3     ::2*k] = False
            sieve[k*(k-2*(i&1)+4)//3::2*k] = False
    return numpy.r_[2,3,((3*numpy.nonzero(sieve)[0][1:]+1)|1)]

上述代码的(难以编码的)纯 python 版本是:

def primes2(n):
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    n, correction = n-n%6+6, 2-(n%6>1)
    sieve = [True] * (n//3)
    for i in range(1,int(n**0.5)//3+1):
      if sieve[i]:
        k=3*i+1|1
        sieve[      k*k//3      ::2*k] = [False] * ((n//6-k*k//6-1)//k+1)
        sieve[k*(k-2*(i&1)+4)//3::2*k] = [False] * ((n//6-k*(k-2*(i&1)+4)//6-1)//k+1)
    return [2,3] + [3*i+1|1 for i in range(1,n//3-correction) if sieve[i]]

不幸的是,纯 python 没有采用更简单、更快速的 numpy 赋值方式,并且在循环中调用 in 太慢了。因此,我不得不即兴发挥来纠正输入(并避免更多的数学运算),并做一些极端(和痛苦的)数学魔术。len()[False]*len(sieve[((k*k)//3)::2*k])

就我个人而言,我认为 numpy(被广泛使用)不是 Python 标准库的一部分是一种耻辱,而且语法和速度的改进似乎被 Python 开发人员完全忽视了。

评论

10赞 Adam 3/25/2013
Numpy 现在与 Python 3 兼容。它不在标准库中这一事实是件好事,这样他们就可以有自己的发布周期。
1赞 hiro protagonist 7/18/2015
我建议只将二进制值存储在数组中 - 就像这里使用的那样(对于最简单的素数筛;不是这里比赛的竞争者!stackoverflow.com/questions/31120986/......bitarray
0赞 355durch113 10/17/2015
在方法中强制转换时,分割是否应该在括号内?primesfrom2to()
3赞 Moebius 4/9/2016
对于与 python 3 兼容的纯 python 版本,请点击以下链接: stackoverflow.com/a/33356284/2482582
4赞 Him 8/2/2019
神圣的屁股零食这个傻瓜很快。
1赞 user1016274 8/13/2010 #12

很抱歉打扰,但 erat2() 在算法中存在严重缺陷。

在搜索下一个复合时,我们只需要测试奇数。 q,p 两者都是奇数;那么 Q+P 是偶数,不需要测试,但 Q+2*P 总是奇数。这消除了 while 循环条件下的“if even”测试,并节省了大约 30% 的运行时间。

当我们在做的时候:不要使用优雅的“D.pop(q,None)”获取和删除方法,而是使用“if q in D: p=D[q],del D[q]”,这是两倍的速度!至少在我的机器 (P3-1Ghz) 上。 因此,我建议使用这种巧妙算法的实现:

def erat3( ):
    from itertools import islice, count

    # q is the running integer that's checked for primeness.
    # yield 2 and no other even number thereafter
    yield 2
    D = {}
    # no need to mark D[4] as we will test odd numbers only
    for q in islice(count(3),0,None,2):
        if q in D:                  #  is composite
            p = D[q]
            del D[q]
            # q is composite. p=D[q] is the first prime that
            # divides it. Since we've reached q, we no longer
            # need it in the map, but we'll mark the next
            # multiple of its witnesses to prepare for larger
            # numbers.
            x = q + p+p        # next odd(!) multiple
            while x in D:      # skip composites
                x += p+p
            D[x] = p
        else:                  # is prime
            # q is a new prime.
            # Yield it and mark its first multiple that isn't
            # already marked in previous iterations.
            D[q*q] = q
            yield q

评论

0赞 Will Ness 11/11/2012
有关延迟将素数添加到字典中(直到在输入中看到素数的平方),请参阅 stackoverflow.com/a/10733621/849891
4赞 nolfonzo 9/4/2010 #13

使用 Numpy 的半筛子的实现略有不同:

http://rebrained.com/?p=458

import math
import numpy
def prime6(upto):
    primes=numpy.arange(3,upto+1,2)
    isprime=numpy.ones((upto-1)/2,dtype=bool)
    for factor in primes[:int(math.sqrt(upto))]:
        if isprime[(factor-2)/2]: isprime[(factor*3-2)/2:(upto-1)/2:factor]=0
    return numpy.insert(primes[isprime],0,2)

有人可以将其与其他时间进行比较吗?在我的机器上,它似乎与其他 Numpy 半筛相当。

评论

0赞 jfs 9/4/2010
upto=10**6: - 7 毫秒; - 12 ms ideone.com/oDg2Yprimesfrom2to()prime6()
0赞 tzot 9/26/2010 #14

到目前为止,我尝试过的最快的方法是基于 Python 说明书 erat2 函数:

import itertools as it
def erat2a( ):
    D = {  }
    yield 2
    for q in it.islice(it.count(3), 0, None, 2):
        p = D.pop(q, None)
        if p is None:
            D[q*q] = q
            yield q
        else:
            x = q + 2*p
            while x in D:
                x += 2*p
            D[x] = p

有关加速的说明,请参阅答案。

2赞 smac89 2/20/2013 #15

第一次使用 python,所以我在这里使用的一些方法可能看起来有点麻烦。我只是直接将我的 c++ 代码转换为 python,这就是我所拥有的(尽管在 python 中有点 slowww)

#!/usr/bin/env python
import time

def GetPrimes(n):

    Sieve = [1 for x in xrange(n)]

    Done = False
    w = 3

    while not Done:

        for q in xrange (3, n, 2):
            Prod = w*q
            if Prod < n:
                Sieve[Prod] = 0
            else:
                break

        if w > (n/2):
            Done = True
        w += 2

    return Sieve



start = time.clock()

d = 10000000
Primes = GetPrimes(d)

count = 1 #This is for 2

for x in xrange (3, d, 2):
    if Primes[x]:
        count+=1

elapsed = (time.clock() - start)
print "\nFound", count, "primes in", elapsed, "seconds!\n"

pythonw Primes.py

在 12.799119 秒内找到 664579 个素数!

#!/usr/bin/env python
import time

def GetPrimes2(n):

    Sieve = [1 for x in xrange(n)]

    for q in xrange (3, n, 2):
        k = q
        for y in xrange(k*3, n, k*2):
            Sieve[y] = 0

    return Sieve



start = time.clock()

d = 10000000
Primes = GetPrimes2(d)

count = 1 #This is for 2

for x in xrange (3, d, 2):
    if Primes[x]:
        count+=1

elapsed = (time.clock() - start)
print "\nFound", count, "primes in", elapsed, "seconds!\n"

pythonw Primes2.py

在 10.230172 秒内找到 664579 个素数!

#!/usr/bin/env python
import time

def GetPrimes3(n):

    Sieve = [1 for x in xrange(n)]

    for q in xrange (3, n, 2):
        k = q
        for y in xrange(k*k, n, k << 1):
            Sieve[y] = 0

    return Sieve



start = time.clock()

d = 10000000
Primes = GetPrimes3(d)

count = 1 #This is for 2

for x in xrange (3, d, 2):
    if Primes[x]:
        count+=1

elapsed = (time.clock() - start)
print "\nFound", count, "primes in", elapsed, "seconds!\n"

Python Primes2.py

在 7.113776 秒内找到 664579 个素数!

1赞 cobie 4/15/2013 #16

我可能会迟到,但必须为此添加我自己的代码。它在空间中使用大约 n/2,因为我们不需要存储偶数,而且我还使用了 bitarray python 模块,进一步大幅减少了内存消耗,并能够计算高达 1,000,000,000 的所有素数

from bitarray import bitarray
def primes_to(n):
    size = n//2
    sieve = bitarray(size)
    sieve.setall(1)
    limit = int(n**0.5)
    for i in range(1,limit):
        if sieve[i]:
            val = 2*i+1
            sieve[(i+i*val)::val] = 0
    return [2] + [2*i+1 for i, v in enumerate(sieve) if v and i > 0]

python -m timeit -n10 -s "import euler" "euler.primes_to(1000000000)"
10 loops, best of 3: 46.5 sec per loop

这是在 64 位 2.4GHZ MAC OSX 10.8.3 上运行的

评论

2赞 Will Ness 4/17/2013
为未知机器发布一个计时什么也没说。这里公认的答案是“没有 psyco,因为 n=1000000,rwh_primes2 是最快的”。因此,如果您在同一台机器上提供该代码和您的代码的计时,并且还以 2、4、10 百万计,那么它将提供更多信息。
1赞 GordonBGood 8/20/2013
-1,此代码依赖于用 C 语言实现的 bitarray 的特殊功能,这就是为什么代码速度很快的原因,因为大部分工作都是在切片分配中的本机代码中完成的。bitarray 包打破了可变序列的适当切片(在一定范围内建立索引)的标准定义,因为它允许将单个布尔值 0/1 或 True/False 分配给切片的所有元素,而纯 Python 的标准行为似乎是不允许这样做,只允许赋值 0,在这种情况下,它被视为序列中所有切片元素的 del/数组。
1赞 GordonBGood 8/20/2013
续:如果要比较调用非标准的原生代码,我们不妨编写一个基于 C 代码的“fastprimes”序列生成器包,例如 Kim Walisch 的 primesieve 代码,只需调用序列生成器,即可在几秒钟内生成 40 亿加 32 位数字范围内的所有素数。这也几乎不使用内存,因为链接的代码基于埃拉托色尼的分段筛子,因此只使用几十千字节的RAM,如果生成序列,则不需要列表存储。
3赞 ABri 9/22/2013 #17

我知道比赛已经关闭了好几年。...

尽管如此,这是我对纯 python 素数筛的建议,基于在向前处理筛子时使用适当的步骤省略 2、3 和 5 的倍数。尽管如此,N<10^9 实际上比威廉·汉克斯 (William Hanks) rwh_primes2和rwh_primes1@Robert优越的解决方案要慢。通过使用高于 1.5* 10^8 的ctypes.c_ushort筛阵列,它在某种程度上适应了内存限制。

10^6

$ python -mtimeit -s“导入 primeSieveSpeedComp” “primeSieveSpeedComp.primeSieveSeq(1000000)” 10 个循环,三局两胜:每个循环 46.7 毫秒

比较:$ python -mtimeit -s“import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes1(1000000)” 10 循环,三局两胜:43.2 每环 msec 比较:$ python -m timeit -s“import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes2(1000000)” 10 循环,三局两胜制:34.5 每环 msec

10^7

$ python -mtimeit -s“导入 primeSieveSpeedComp” “primeSieveSpeedComp.primeSieveSeq(10000000)” 10 个循环,三局两胜:每个循环 530 毫秒

比较:$ python -mtimeit -s“import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes1(10000000)” 10 循环,三局两胜制:494 每环 msec 比较:$ python -m timeit -s“import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes2(10000000)” 10 循环,三局两胜制:375 每环 msec

10^8

$ python -mtimeit -s“导入 primeSieveSpeedComp” “primeSieveSpeedComp.primeSieveSeq(100000000)” 10 个循环,三局两胜制:每个循环 5.55 秒

比较:$ python -mtimeit -s“import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes1(100000000)” 10 个循环,三局两胜:5.33 每循环秒数 比较:$ python -m timeit -s“import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes2(100000000)” 10 个循环,三局两胜:3.95 每循环秒数

10^9

$ python -mtimeit -s“导入primeSieveSpeedComp”“primeSieveSpeedComp.primeSieveSeq(1000000000)” 10 个循环,三局两胜:每个循环 61.2

比较:$ python -mtimeit -n 3 -s“import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes1(1000000000)” 3 个循环,三局两胜制:每个循环 97.8

比较:$ python -m timeit -s“import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes2(1000000000)” 10 个循环,三局两胜: 每个循环 41.9 秒

您可以将下面的代码复制到 ubuntus primeSieveSpeedComp 中来查看此测试。

def primeSieveSeq(MAX_Int):
    if MAX_Int > 5*10**8:
        import ctypes
        int16Array = ctypes.c_ushort * (MAX_Int >> 1)
        sieve = int16Array()
        #print 'uses ctypes "unsigned short int Array"'
    else:
        sieve = (MAX_Int >> 1) * [False]
        #print 'uses python list() of long long int'
    if MAX_Int < 10**8:
        sieve[4::3] = [True]*((MAX_Int - 8)/6+1)
        sieve[12::5] = [True]*((MAX_Int - 24)/10+1)
    r = [2, 3, 5]
    n = 0
    for i in xrange(int(MAX_Int**0.5)/30+1):
        n += 3
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 2
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 1
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 2
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 1
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 2
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 3
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 1
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
    if MAX_Int < 10**8:
        return [2, 3, 5]+[(p << 1) + 1 for p in [n for n in xrange(3, MAX_Int >> 1) if not sieve[n]]]
    n = n >> 1
    try:
        for i in xrange((MAX_Int-2*n)/30 + 1):
            n += 3
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 2
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 1
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 2
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 1
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 2
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 3
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 1
            if not sieve[n]:
                r.append((n << 1) + 1)
    except:
        pass
    return r

评论

0赞 Will Ness 3/3/2016
要可视化测试结果,请将它们绘制在对数-对数尺度上,以查看和比较增长的经验顺序
0赞 ABri 3/13/2016
@感谢您的输入,下次我需要这样的比较时,我会牢记这一点
-1赞 user3261711 2/2/2014 #18

这是一个优雅而简单的解决方案,用于使用存储列表查找素数。从 4 个变量开始,你只需要测试除数的奇数素数,你只需要测试你测试的素数的一半(测试 9、11、13 是否除以 17 没有意义)。它测试以前存储的素数作为除数。

    # Program to calculate Primes
 primes = [1,3,5,7]
for n in range(9,100000,2):
    for x in range(1,(len(primes)/2)):
        if n % primes[x] == 0:
            break
    else:
        primes.append(n)
print primes
0赞 Stefan Gruenwald 11/3/2014 #19

随着时间的流逝,我收集了几个质数筛。我的电脑上最快的是这个:

from time import time
# 175 ms for all the primes up to the value 10**6
def primes_sieve(limit):
    a = [True] * limit
    a[0] = a[1] = False
    #a[2] = True
    for n in xrange(4, limit, 2):
        a[n] = False
    root_limit = int(limit**.5)+1
    for i in xrange(3,root_limit):
        if a[i]:
            for n in xrange(i*i, limit, 2*i):
                a[n] = False
    return a

LIMIT = 10**6
s=time()
primes = primes_sieve(LIMIT)
print time()-s
-4赞 lavee_singh 2/14/2015 #20

这是您可以与他人比较的方式。

# You have to list primes upto n
nums = xrange(2, n)
for i in range(2, 10):
    nums = filter(lambda s: s==i or s%i, nums)
print nums

就是这么简单...

0赞 James Salmoni 2/14/2015 #21

我对这个问题的回答很慢,但这似乎是一个有趣的练习。我正在使用可能在作弊的numpy,我怀疑这种方法是最快的,但应该很清楚。它筛选一个仅引用其索引的布尔数组,并从所有 True 值的索引中引出质数。无需模数。

import numpy as np
def ajs_primes3a(upto):
    mat = np.ones((upto), dtype=bool)
    mat[0] = False
    mat[1] = False
    mat[4::2] = False
    for idx in range(3, int(upto ** 0.5)+1, 2):
        mat[idx*2::idx] = False
    return np.where(mat == True)[0]

评论

0赞 jfs 2/15/2015
它是不正确的,例如 -> 。 不是素数ajs_primes3a(10)array([2, 3, 5, 7, 9])9
0赞 James Salmoni 2/16/2015
你发现了一个我没有发现的边缘情况——干得好!问题出在“for idx in range(3, int(up to ** 0.5), 2):”中,它应该是“for idx in range(3, int(upto ** 0.5) + 1, 2):”。谢谢,但它现在可以工作了。
0赞 James Salmoni 2/16/2015
原因是 idx 循环上升到“最多 ** 05”,对于最多 15 个(包括 15 个)的情况。从 16 岁开始,它工作正常。这是我没有测试过的一组边缘情况。添加 1 意味着它应该适用于所有数字。
0赞 jfs 2/16/2015
现在似乎有效了。它是返回数组的基于解决方案中最慢的。注意:没有真正的埃拉托色尼筛实现使用模 - 无需提及。您可以使用代替 .而不是.numpymat[idx*idx::idx]mat[idx*2::idx]np.nonzero(mat)[0]np.where(mat == True)[0]
0赞 James Salmoni 2/16/2015
谢谢JF。我针对 prime6() 进行了测试,并在 prime6() 接管时更快地获得了高达 (IIRC) 大约 250k 的结果。primesfrom2to() 更快。在长达 20m 时,ajs_primes3a() 需要 0.034744977951ms,prime6() 需要 0.0222899913788ms,primesfrom2to() 需要 0.0104751586914ms(同一台机器,相同的负载,10 个时间中最好的)。老实说,这比我想象的要好!
15赞 Colonel Panic 7/4/2015 #22

如果你不想重新发明轮子,你可以安装符号数学库sympy(是的,它与Python 3兼容)

pip install sympy

并使用 primerange 函数

from sympy import sieve
primes = list(sieve.primerange(1, 10**6))

评论

1赞 lukejanicke 8/5/2020
我注意到这打印了整个列表,而从社区 wiki 答案返回.这是缩短 NumPy 的事情吗?primesfrom2to(10000)[ 2 3 5 ... 9949 9967 9973]nd.array
9赞 Colonel Panic 7/9/2015 #23

编写自己的素数查找代码很有启发性,但手头有一个快速可靠的库也很有用。我围绕 C++ 库 primesieve 写了一个包装器,命名为 primesieve-python

尝试一下pip install primesieve

import primesieve
primes = primesieve.generate_primes(10**8)

我很想看看比较的速度。

评论

0赞 ljetibo 7/14/2015
这并不完全是 OP 的命令,但我不明白为什么投反对票。这是一个 2.8 秒的解决方案,不像其他一些外部模块。我在源代码中注意到它是线程的,对它的扩展能力进行了任何测试?
0赞 Colonel Panic 7/14/2015
@ljetibo欢呼声。瓶颈似乎是将 C++ 向量复制到 Python 列表,因此该函数比count_primesgenerate_primes
0赞 Colonel Panic 7/14/2015
在我的计算机上,它可以轻松生成高达 1e8 的素数(它为 1e9 提供 MemoryError),并计算高达 1e10 的素数。上面@HappyLeapSecond比较了 1e6 的算法
11赞 Jason 10/27/2015 #24

如果您接受 itertools 但不接受 numpy,这里是 Python 3 的 rwh_primes2 的改编版,它在我的机器上运行速度大约是原来的两倍。唯一实质性的变化是使用字节数组而不是布尔值列表,并使用压缩而不是列表推导来构建最终列表。(如果可以的话,我会把它添加为像 moarningsun 这样的评论。

import itertools
izip = itertools.zip_longest
chain = itertools.chain.from_iterable
compress = itertools.compress
def rwh_primes2_python3(n):
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    zero = bytearray([False])
    size = n//3 + (n % 6 == 2)
    sieve = bytearray([True]) * size
    sieve[0] = False
    for i in range(int(n**0.5)//3+1):
      if sieve[i]:
        k=3*i+1|1
        start = (k*k+4*k-2*k*(i&1))//3
        sieve[(k*k)//3::2*k]=zero*((size - (k*k)//3 - 1) // (2 * k) + 1)
        sieve[  start ::2*k]=zero*((size -   start  - 1) // (2 * k) + 1)
    ans = [2,3]
    poss = chain(izip(*[range(i, n, 6) for i in (1,5)]))
    ans.extend(compress(poss, sieve))
    return ans

比较:

>>> timeit.timeit('primes.rwh_primes2(10**6)', setup='import primes', number=1)
0.0652179726976101
>>> timeit.timeit('primes.rwh_primes2_python3(10**6)', setup='import primes', number=1)
0.03267321276325674

>>> timeit.timeit('primes.rwh_primes2(10**8)', setup='import primes', number=1)
6.394284538007014
>>> timeit.timeit('primes.rwh_primes2_python3(10**8)', setup='import primes', number=1)
3.833829450302801
4赞 lifolofi 10/6/2016 #25

这一切都是编写和测试的。因此,没有必要重新发明轮子。

python -m timeit -r10 -s"from sympy import sieve" "primes = list(sieve.primerange(1, 10**6))"

让我们破纪录地跑了 12.2 毫秒

10 loops, best of 10: 12.2 msec per loop

如果这还不够快,你可以试试 PyPy:

pypy -m timeit -r10 -s"from sympy import sieve" "primes = list(sieve.primerange(1, 10**6))"

其结果是:

10 loops, best of 10: 2.03 msec per loop

有 247 张赞成票的答案列出了 15.9 毫秒的最佳解决方案。 比较一下!!

3赞 Bruno Adelé 10/18/2016 #26

我测试了一些 unutbu 的函数,我用饥饿的百万数计算了它

获胜者是使用 numpy 库的函数,

注意:进行内存利用率测试也很有趣:)

Computation time result

示例代码

我的 github 存储库上的完整代码

#!/usr/bin/env python

import lib
import timeit
import sys
import math
import datetime

import prettyplotlib as ppl
import numpy as np

import matplotlib.pyplot as plt
from prettyplotlib import brewer2mpl

primenumbers_gen = [
    'sieveOfEratosthenes',
    'ambi_sieve',
    'ambi_sieve_plain',
    'sundaram3',
    'sieve_wheel_30',
    'primesfrom3to',
    'primesfrom2to',
    'rwh_primes',
    'rwh_primes1',
    'rwh_primes2',
]

def human_format(num):
    # https://stackoverflow.com/questions/579310/formatting-long-numbers-as-strings-in-python?answertab=active#tab-top
    magnitude = 0
    while abs(num) >= 1000:
        magnitude += 1
        num /= 1000.0
    # add more suffixes if you need them
    return '%.2f%s' % (num, ['', 'K', 'M', 'G', 'T', 'P'][magnitude])


if __name__=='__main__':

    # Vars
    n = 10000000 # number itereration generator
    nbcol = 5 # For decompose prime number generator
    nb_benchloop = 3 # Eliminate false positive value during the test (bench average time)
    datetimeformat = '%Y-%m-%d %H:%M:%S.%f'
    config = 'from __main__ import n; import lib'
    primenumbers_gen = {
        'sieveOfEratosthenes': {'color': 'b'},
        'ambi_sieve': {'color': 'b'},
        'ambi_sieve_plain': {'color': 'b'},
         'sundaram3': {'color': 'b'},
        'sieve_wheel_30': {'color': 'b'},
# # #        'primesfrom2to': {'color': 'b'},
        'primesfrom3to': {'color': 'b'},
        # 'rwh_primes': {'color': 'b'},
        # 'rwh_primes1': {'color': 'b'},
        'rwh_primes2': {'color': 'b'},
    }


    # Get n in command line
    if len(sys.argv)>1:
        n = int(sys.argv[1])

    step = int(math.ceil(n / float(nbcol)))
    nbs = np.array([i * step for i in range(1, int(nbcol) + 1)])
    set2 = brewer2mpl.get_map('Paired', 'qualitative', 12).mpl_colors

    print datetime.datetime.now().strftime(datetimeformat)
    print("Compute prime number to %(n)s" % locals())
    print("")

    results = dict()
    for pgen in primenumbers_gen:
        results[pgen] = dict()
        benchtimes = list()
        for n in nbs:
            t = timeit.Timer("lib.%(pgen)s(n)" % locals(), setup=config)
            execute_times = t.repeat(repeat=nb_benchloop,number=1)
            benchtime = np.mean(execute_times)
            benchtimes.append(benchtime)
        results[pgen] = {'benchtimes':np.array(benchtimes)}

fig, ax = plt.subplots(1)
plt.ylabel('Computation time (in second)')
plt.xlabel('Numbers computed')
i = 0
for pgen in primenumbers_gen:

    bench = results[pgen]['benchtimes']
    avgs = np.divide(bench,nbs)
    avg = np.average(bench, weights=nbs)

    # Compute linear regression
    A = np.vstack([nbs, np.ones(len(nbs))]).T
    a, b = np.linalg.lstsq(A, nbs*avgs)[0]

    # Plot
    i += 1
    #label="%(pgen)s" % locals()
    #ppl.plot(nbs, nbs*avgs, label=label, lw=1, linestyle='--', color=set2[i % 12])
    label="%(pgen)s avg" % locals()
    ppl.plot(nbs, a * nbs + b, label=label, lw=2, color=set2[i % 12])
print datetime.datetime.now().strftime(datetimeformat)

ppl.legend(ax, loc='upper left', ncol=4)

# Change x axis label
ax.get_xaxis().get_major_formatter().set_scientific(False)
fig.canvas.draw()
labels = [human_format(int(item.get_text())) for item in ax.get_xticklabels()]

ax.set_xticklabels(labels)
ax = plt.gca()

plt.show()

评论

2赞 Will Ness 8/26/2019
为了比较算法性能,最好以对数-对数尺度绘制。
3赞 Smart Manoj 7/19/2017 #27

对于 Python 3

def rwh_primes2(n):
    correction = (n%6>1)
    n = {0:n,1:n-1,2:n+4,3:n+3,4:n+2,5:n+1}[n%6]
    sieve = [True] * (n//3)
    sieve[0] = False
    for i in range(int(n**0.5)//3+1):
      if sieve[i]:
        k=3*i+1|1
        sieve[      ((k*k)//3)      ::2*k]=[False]*((n//6-(k*k)//6-1)//k+1)
        sieve[(k*k+4*k-2*k*(i&1))//3::2*k]=[False]*((n//6-(k*k+4*k-2*k*(i&1))//6-1)//k+1)
    return [2,3] + [3*i+1|1 for i in range(1,n//3-correction) if sieve[i]]
9赞 Bruno Astrolino 10/9/2017 #28

这是最快的函数之一的两个更新(纯 Python 3.6)版本,

from itertools import compress

def rwh_primes1v1(n):
    """ Returns  a list of primes < n for n > 2 """
    sieve = bytearray([True]) * (n//2)
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = bytearray((n-i*i-1)//(2*i)+1)
    return [2,*compress(range(3,n,2), sieve[1:])]

def rwh_primes1v2(n):
    """ Returns a list of primes < n for n > 2 """
    sieve = bytearray([True]) * (n//2+1)
    for i in range(1,int(n**0.5)//2+1):
        if sieve[i]:
            sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1)
    return [2,*compress(range(3,n,2), sieve[1:])]

评论

2赞 it's a hire car baby 9/12/2018
在 Python 3 中,我使用了这个函数 stackoverflow.com/a/3035188/7799269 但用 // 替换了 // 和 xrange 和 range,它们似乎比这快得多。
1赞 Alexander 2/6/2018 #29

这里有一个有趣的技术,可以使用python的列表推导式生成素数(但不是最有效的):

noprimes = [j for i in range(2, 8) for j in range(i*2, 50, i)]
primes = [x for x in range(2, 50) if x not in noprimes]

评论

0赞 AcK 6/11/2021
示例链接已死。
0赞 Alexander 6/11/2021
谢谢,我已经删除了链接。
1赞 Peter Mølgaard Pallesen 5/10/2019 #30

这是 Sieve of Eratosthenes 的 numpy 版本,具有良好的复杂度(低于对长度为 n 的数组进行排序)和矢量化。相比之下,这@unutbu倍的速度与具有 46 个微分子的封装一样快,可以找到 100 万以下的所有素数。

import numpy as np 
def generate_primes(n):
    is_prime = np.ones(n+1,dtype=bool)
    is_prime[0:2] = False
    for i in range(int(n**0.5)+1):
        if is_prime[i]:
            is_prime[i**2::i]=False
    return np.where(is_prime)[0]

计时:

import time    
for i in range(2,10):
    timer =time.time()
    generate_primes(10**i)
    print('n = 10^',i,' time =', round(time.time()-timer,6))

>> n = 10^ 2  time = 5.6e-05
>> n = 10^ 3  time = 6.4e-05
>> n = 10^ 4  time = 0.000114
>> n = 10^ 5  time = 0.000593
>> n = 10^ 6  time = 0.00467
>> n = 10^ 7  time = 0.177758
>> n = 10^ 8  time = 1.701312
>> n = 10^ 9  time = 19.322478
4赞 MrSeeker 5/24/2019 #31

Pure Python 中最快的素筛

from itertools import compress

def half_sieve(n):
    """
    Returns a list of prime numbers less than `n`.
    """
    if n <= 2:
        return []
    sieve = bytearray([True]) * (n // 2)
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i // 2]:
            sieve[i * i // 2::i] = bytearray((n - i * i - 1) // (2 * i) + 1)
    primes = list(compress(range(1, n, 2), sieve))
    primes[0] = 2
    return primes

我优化了埃拉托色尼的筛子的速度和内存。

基准

from time import clock
import platform

def benchmark(iterations, limit):
    start = clock()
    for x in range(iterations):
        half_sieve(limit)
    end = clock() - start
    print(f'{end/iterations:.4f} seconds for primes < {limit}')

if __name__ == '__main__':
    print(platform.python_version())
    print(platform.platform())
    print(platform.processor())
    it = 10
    for pw in range(4, 9):
        benchmark(it, 10**pw)

输出

>>> 3.6.7
>>> Windows-10-10.0.17763-SP0
>>> Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
>>> 0.0003 seconds for primes < 10000
>>> 0.0021 seconds for primes < 100000
>>> 0.0204 seconds for primes < 1000000
>>> 0.2389 seconds for primes < 10000000
>>> 2.6702 seconds for primes < 100000000
0赞 Greg Ames 9/11/2019 #32

在这里找到了一个纯 Python 2 素数生成器,在 Willy Good 的评论中,它比 rwh2_primes 快。

def primes235(limit):
yield 2; yield 3; yield 5
if limit < 7: return
modPrms = [7,11,13,17,19,23,29,31]
gaps = [4,2,4,2,4,6,2,6,4,2,4,2,4,6,2,6] # 2 loops for overflow
ndxs = [0,0,0,0,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5,5,5,6,6,7,7,7,7,7,7]
lmtbf = (limit + 23) // 30 * 8 - 1 # integral number of wheels rounded up
lmtsqrt = (int(limit ** 0.5) - 7)
lmtsqrt = lmtsqrt // 30 * 8 + ndxs[lmtsqrt % 30] # round down on the wheel
buf = [True] * (lmtbf + 1)
for i in xrange(lmtsqrt + 1):
    if buf[i]:
        ci = i & 7; p = 30 * (i >> 3) + modPrms[ci]
        s = p * p - 7; p8 = p << 3
        for j in range(8):
            c = s // 30 * 8 + ndxs[s % 30]
            buf[c::p8] = [False] * ((lmtbf - c) // p8 + 1)
            s += p * gaps[ci]; ci += 1
for i in xrange(lmtbf - 6 + (ndxs[(limit - 7) % 30])): # adjust for extras
    if buf[i]: yield (30 * (i >> 3) + modPrms[i & 7])

我的结果:

$ time ./prime_rwh2.py 1e8
5761455 primes found < 1e8

real    0m3.201s
user    0m2.609s
sys     0m0.578s
$ time ./prime_wheel.py 1e8
5761455 primes found < 1e8

real    0m2.710s
user    0m2.469s
sys     0m0.219s

...在我最近的中端笔记本电脑 (i5 8265U 1.6GHz) 上,在 Win 10 上运行 Ubuntu。

这是一个 mod 30 轮筛,可跳过 2、3 和 5 的倍数。当我的笔记本电脑开始用完 2.5G RAM 并交换大量时,它对我来说非常有效,最高可达 9e8。

我喜欢 mod 30,因为它只有 8 个余数,不是 2、3 或 5 的倍数。这样就可以使用移位和“&”进行乘法、除法和模数,并且应该允许将一个模组 30 轮的结果打包到一个字节中。我已将 Willy 的代码转换为分段的 mod 30 轮筛,以消除大 N 的抖动并将其发布在这里

还有一个更快的 Javascript 版本,它被分段并使用 mod 210 轮(没有 2、3、5 或 7 的倍数),由 @GordonBGood 提供对我有用的深入解释。

2赞 Zaz 9/28/2019 #33

我发现这样做的最简单方法是:

primes = []
for n in range(low, high + 1):
    if all(n % i for i in primes):
        primes.append(n)
9赞 Nico Schlömer 11/22/2019 #34

我已经更新了 Python 3 的大部分代码,并将其扔到 perfplot(我的一个项目)上,看看哪个实际上是最快的。事实证明,对于大,拿走了蛋糕:nprimesfrom{2,3}to

enter image description here


重现绘图的代码:

import perfplot
from math import sqrt, ceil
import numpy as np
import sympy


def rwh_primes(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * n
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i]:
            sieve[i * i::2 * i] = [False] * ((n - i * i - 1) // (2 * i) + 1)
    return [2] + [i for i in range(3, n, 2) if sieve[i]]


def rwh_primes1(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * (n // 2)
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i // 2]:
            sieve[i * i // 2::i] = [False] * ((n - i * i - 1) // (2 * i) + 1)
    return [2] + [2 * i + 1 for i in range(1, n // 2) if sieve[i]]


def rwh_primes2(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """Input n>=6, Returns a list of primes, 2 <= p < n"""
    assert n >= 6
    correction = n % 6 > 1
    n = {0: n, 1: n - 1, 2: n + 4, 3: n + 3, 4: n + 2, 5: n + 1}[n % 6]
    sieve = [True] * (n // 3)
    sieve[0] = False
    for i in range(int(n ** 0.5) // 3 + 1):
        if sieve[i]:
            k = 3 * i + 1 | 1
            sieve[((k * k) // 3)::2 * k] = [False] * (
                (n // 6 - (k * k) // 6 - 1) // k + 1
            )
            sieve[(k * k + 4 * k - 2 * k * (i & 1)) // 3::2 * k] = [False] * (
                (n // 6 - (k * k + 4 * k - 2 * k * (i & 1)) // 6 - 1) // k + 1
            )
    return [2, 3] + [3 * i + 1 | 1 for i in range(1, n // 3 - correction) if sieve[i]]


def sieve_wheel_30(N):
    # http://zerovolt.com/?p=88
    """ Returns a list of primes <= N using wheel criterion 2*3*5 = 30

Copyright 2009 by zerovolt.com
This code is free for non-commercial purposes, in which case you can just leave this comment as a credit for my work.
If you need this code for commercial purposes, please contact me by sending an email to: info [at] zerovolt [dot] com."""
    __smallp = (
        2,
        3,
        5,
        7,
        11,
        13,
        17,
        19,
        23,
        29,
        31,
        37,
        41,
        43,
        47,
        53,
        59,
        61,
        67,
        71,
        73,
        79,
        83,
        89,
        97,
        101,
        103,
        107,
        109,
        113,
        127,
        131,
        137,
        139,
        149,
        151,
        157,
        163,
        167,
        173,
        179,
        181,
        191,
        193,
        197,
        199,
        211,
        223,
        227,
        229,
        233,
        239,
        241,
        251,
        257,
        263,
        269,
        271,
        277,
        281,
        283,
        293,
        307,
        311,
        313,
        317,
        331,
        337,
        347,
        349,
        353,
        359,
        367,
        373,
        379,
        383,
        389,
        397,
        401,
        409,
        419,
        421,
        431,
        433,
        439,
        443,
        449,
        457,
        461,
        463,
        467,
        479,
        487,
        491,
        499,
        503,
        509,
        521,
        523,
        541,
        547,
        557,
        563,
        569,
        571,
        577,
        587,
        593,
        599,
        601,
        607,
        613,
        617,
        619,
        631,
        641,
        643,
        647,
        653,
        659,
        661,
        673,
        677,
        683,
        691,
        701,
        709,
        719,
        727,
        733,
        739,
        743,
        751,
        757,
        761,
        769,
        773,
        787,
        797,
        809,
        811,
        821,
        823,
        827,
        829,
        839,
        853,
        857,
        859,
        863,
        877,
        881,
        883,
        887,
        907,
        911,
        919,
        929,
        937,
        941,
        947,
        953,
        967,
        971,
        977,
        983,
        991,
        997,
    )
    # wheel = (2, 3, 5)
    const = 30
    if N < 2:
        return []
    if N <= const:
        pos = 0
        while __smallp[pos] <= N:
            pos += 1
        return list(__smallp[:pos])
    # make the offsets list
    offsets = (7, 11, 13, 17, 19, 23, 29, 1)
    # prepare the list
    p = [2, 3, 5]
    dim = 2 + N // const
    tk1 = [True] * dim
    tk7 = [True] * dim
    tk11 = [True] * dim
    tk13 = [True] * dim
    tk17 = [True] * dim
    tk19 = [True] * dim
    tk23 = [True] * dim
    tk29 = [True] * dim
    tk1[0] = False
    # help dictionary d
    # d[a , b] = c  ==> if I want to find the smallest useful multiple of (30*pos)+a
    # on tkc, then I need the index given by the product of [(30*pos)+a][(30*pos)+b]
    # in general. If b < a, I need [(30*pos)+a][(30*(pos+1))+b]
    d = {}
    for x in offsets:
        for y in offsets:
            res = (x * y) % const
            if res in offsets:
                d[(x, res)] = y
    # another help dictionary: gives tkx calling tmptk[x]
    tmptk = {1: tk1, 7: tk7, 11: tk11, 13: tk13, 17: tk17, 19: tk19, 23: tk23, 29: tk29}
    pos, prime, lastadded, stop = 0, 0, 0, int(ceil(sqrt(N)))

    # inner functions definition
    def del_mult(tk, start, step):
        for k in range(start, len(tk), step):
            tk[k] = False

    # end of inner functions definition
    cpos = const * pos
    while prime < stop:
        # 30k + 7
        if tk7[pos]:
            prime = cpos + 7
            p.append(prime)
            lastadded = 7
            for off in offsets:
                tmp = d[(7, off)]
                start = (
                    (pos + prime)
                    if off == 7
                    else (prime * (const * (pos + 1 if tmp < 7 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 11
        if tk11[pos]:
            prime = cpos + 11
            p.append(prime)
            lastadded = 11
            for off in offsets:
                tmp = d[(11, off)]
                start = (
                    (pos + prime)
                    if off == 11
                    else (prime * (const * (pos + 1 if tmp < 11 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 13
        if tk13[pos]:
            prime = cpos + 13
            p.append(prime)
            lastadded = 13
            for off in offsets:
                tmp = d[(13, off)]
                start = (
                    (pos + prime)
                    if off == 13
                    else (prime * (const * (pos + 1 if tmp < 13 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 17
        if tk17[pos]:
            prime = cpos + 17
            p.append(prime)
            lastadded = 17
            for off in offsets:
                tmp = d[(17, off)]
                start = (
                    (pos + prime)
                    if off == 17
                    else (prime * (const * (pos + 1 if tmp < 17 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 19
        if tk19[pos]:
            prime = cpos + 19
            p.append(prime)
            lastadded = 19
            for off in offsets:
                tmp = d[(19, off)]
                start = (
                    (pos + prime)
                    if off == 19
                    else (prime * (const * (pos + 1 if tmp < 19 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 23
        if tk23[pos]:
            prime = cpos + 23
            p.append(prime)
            lastadded = 23
            for off in offsets:
                tmp = d[(23, off)]
                start = (
                    (pos + prime)
                    if off == 23
                    else (prime * (const * (pos + 1 if tmp < 23 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 29
        if tk29[pos]:
            prime = cpos + 29
            p.append(prime)
            lastadded = 29
            for off in offsets:
                tmp = d[(29, off)]
                start = (
                    (pos + prime)
                    if off == 29
                    else (prime * (const * (pos + 1 if tmp < 29 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # now we go back to top tk1, so we need to increase pos by 1
        pos += 1
        cpos = const * pos
        # 30k + 1
        if tk1[pos]:
            prime = cpos + 1
            p.append(prime)
            lastadded = 1
            for off in offsets:
                tmp = d[(1, off)]
                start = (
                    (pos + prime)
                    if off == 1
                    else (prime * (const * pos + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
    # time to add remaining primes
    # if lastadded == 1, remove last element and start adding them from tk1
    # this way we don't need an "if" within the last while
    if lastadded == 1:
        p.pop()
    # now complete for every other possible prime
    while pos < len(tk1):
        cpos = const * pos
        if tk1[pos]:
            p.append(cpos + 1)
        if tk7[pos]:
            p.append(cpos + 7)
        if tk11[pos]:
            p.append(cpos + 11)
        if tk13[pos]:
            p.append(cpos + 13)
        if tk17[pos]:
            p.append(cpos + 17)
        if tk19[pos]:
            p.append(cpos + 19)
        if tk23[pos]:
            p.append(cpos + 23)
        if tk29[pos]:
            p.append(cpos + 29)
        pos += 1
    # remove exceeding if present
    pos = len(p) - 1
    while p[pos] > N:
        pos -= 1
    if pos < len(p) - 1:
        del p[pos + 1 :]
    # return p list
    return p


def sieve_of_eratosthenes(n):
    """sieveOfEratosthenes(n): return the list of the primes < n."""
    # Code from: <[email protected]>, Nov 30 2006
    # http://groups.google.com/group/comp.lang.python/msg/f1f10ced88c68c2d
    if n <= 2:
        return []
    sieve = list(range(3, n, 2))
    top = len(sieve)
    for si in sieve:
        if si:
            bottom = (si * si - 3) // 2
            if bottom >= top:
                break
            sieve[bottom::si] = [0] * -((bottom - top) // si)
    return [2] + [el for el in sieve if el]


def sieve_of_atkin(end):
    """return a list of all the prime numbers <end using the Sieve of Atkin."""
    # Code by Steve Krenzel, <[email protected]>, improved
    # Code: https://web.archive.org/web/20080324064651/http://krenzel.info/?p=83
    # Info: http://en.wikipedia.org/wiki/Sieve_of_Atkin
    assert end > 0
    lng = (end - 1) // 2
    sieve = [False] * (lng + 1)

    x_max, x2, xd = int(sqrt((end - 1) / 4.0)), 0, 4
    for xd in range(4, 8 * x_max + 2, 8):
        x2 += xd
        y_max = int(sqrt(end - x2))
        n, n_diff = x2 + y_max * y_max, (y_max << 1) - 1
        if not (n & 1):
            n -= n_diff
            n_diff -= 2
        for d in range((n_diff - 1) << 1, -1, -8):
            m = n % 12
            if m == 1 or m == 5:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, x2, xd = int(sqrt((end - 1) / 3.0)), 0, 3
    for xd in range(3, 6 * x_max + 2, 6):
        x2 += xd
        y_max = int(sqrt(end - x2))
        n, n_diff = x2 + y_max * y_max, (y_max << 1) - 1
        if not (n & 1):
            n -= n_diff
            n_diff -= 2
        for d in range((n_diff - 1) << 1, -1, -8):
            if n % 12 == 7:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, y_min, x2, xd = int((2 + sqrt(4 - 8 * (1 - end))) / 4), -1, 0, 3
    for x in range(1, x_max + 1):
        x2 += xd
        xd += 6
        if x2 >= end:
            y_min = (((int(ceil(sqrt(x2 - end))) - 1) << 1) - 2) << 1
        n, n_diff = ((x * x + x) << 1) - 1, (((x - 1) << 1) - 2) << 1
        for d in range(n_diff, y_min, -8):
            if n % 12 == 11:
                m = n >> 1
                sieve[m] = not sieve[m]
            n += d

    primes = [2, 3]
    if end <= 3:
        return primes[: max(0, end - 2)]

    for n in range(5 >> 1, (int(sqrt(end)) + 1) >> 1):
        if sieve[n]:
            primes.append((n << 1) + 1)
            aux = (n << 1) + 1
            aux *= aux
            for k in range(aux, end, 2 * aux):
                sieve[k >> 1] = False

    s = int(sqrt(end)) + 1
    if s % 2 == 0:
        s += 1
    primes.extend([i for i in range(s, end, 2) if sieve[i >> 1]])

    return primes


def ambi_sieve_plain(n):
    s = list(range(3, n, 2))
    for m in range(3, int(n ** 0.5) + 1, 2):
        if s[(m - 3) // 2]:
            for t in range((m * m - 3) // 2, (n >> 1) - 1, m):
                s[t] = 0
    return [2] + [t for t in s if t > 0]


def sundaram3(max_n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/2073279#2073279
    numbers = range(3, max_n + 1, 2)
    half = (max_n) // 2
    initial = 4

    for step in range(3, max_n + 1, 2):
        for i in range(initial, half, step):
            numbers[i - 1] = 0
        initial += 2 * (step + 1)

        if initial > half:
            return [2] + filter(None, numbers)


# Using Numpy:
def ambi_sieve(n):
    # http://tommih.blogspot.com/2009/04/fast-prime-number-generator.html
    s = np.arange(3, n, 2)
    for m in range(3, int(n ** 0.5) + 1, 2):
        if s[(m - 3) // 2]:
            s[(m * m - 3) // 2::m] = 0
    return np.r_[2, s[s > 0]]


def primesfrom3to(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns an array of primes, p < n """
    assert n >= 2
    sieve = np.ones(n // 2, dtype=bool)
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i // 2]:
            sieve[i * i // 2::i] = False
    return np.r_[2, 2 * np.nonzero(sieve)[0][1::] + 1]


def primesfrom2to(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns an array of primes, 2 <= p < n """
    assert n >= 6
    sieve = np.ones(n // 3 + (n % 6 == 2), dtype=bool)
    sieve[0] = False
    for i in range(int(n ** 0.5) // 3 + 1):
        if sieve[i]:
            k = 3 * i + 1 | 1
            sieve[((k * k) // 3)::2 * k] = False
            sieve[(k * k + 4 * k - 2 * k * (i & 1)) // 3::2 * k] = False
    return np.r_[2, 3, ((3 * np.nonzero(sieve)[0] + 1) | 1)]


def sympy_sieve(n):
    return list(sympy.sieve.primerange(1, n))


b = perfplot.bench(
    setup=lambda n: n,
    kernels=[
        rwh_primes,
        rwh_primes1,
        rwh_primes2,
        sieve_wheel_30,
        sieve_of_eratosthenes,
        sieve_of_atkin,
        # ambi_sieve_plain,
        # sundaram3,
        ambi_sieve,
        primesfrom3to,
        primesfrom2to,
        sympy_sieve,
    ],
    n_range=[2 ** k for k in range(3, 25)],
    xlabel="n",
)
b.save("out.png")
b.show()

评论

0赞 Will Ness 1/4/2022
嗯,对数-对数图...... :)
0赞 qwr 2/19/2023
现在用Pypy试试吧!
0赞 Asclepius 8/16/2020 #35

这是问题中解决方案的变体,应该比问题中的内容更快。它使用埃拉托色尼的静态筛子,没有其他优化。

from typing import List

def list_primes(limit: int) -> List[int]:
    primes = set(range(2, limit + 1))
    for i in range(2, limit + 1):
        if i in primes:
            primes.difference_update(set(list(range(i, limit + 1, i))[1:]))
    return sorted(primes)

>>> list_primes(100)
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
3赞 Daniel Giger 7/8/2021 #36

我加入派对的速度可能很慢,但我用最快的代码弥补了它(至少在我的机器上,在撰写本文时)。这种方法同时使用 numpy 和 bitarray,并受到这个答案的启发。primesfrom2to

import numpy as np
from bitarray import bitarray


def bit_primes(n):
    bit_sieve = bitarray(n // 3 + (n % 6 == 2))
    bit_sieve.setall(1)
    bit_sieve[0] = False

    for i in range(int(n ** 0.5) // 3 + 1):
        if bit_sieve[i]:
            k = 3 * i + 1 | 1
            bit_sieve[k * k // 3::2 * k] = False
            bit_sieve[(k * k + 4 * k - 2 * k * (i & 1)) // 3::2 * k] = False

    np_sieve = np.unpackbits(np.frombuffer(bit_sieve.tobytes(), dtype=np.uint8)).view(bool)
    return np.concatenate(((2, 3), ((3 * np.flatnonzero(np_sieve) + 1) | 1)))

下面是与 的比较,之前发现 是 unutbu 比较中最快的解决方案:primesfrom2to

python3 -m timeit -s "import fast_primes" "fast_primes.bit_primes(1000000)"
200 loops, best of 5: 1.19 msec per loop

python3 -m timeit -s "import fast_primes" "fast_primes.primesfrom2to(1000000)"
200 loops, best of 5: 1.23 msec per loop

对于找到 100 万以下的素数,速度略快。 对于较大的值,差异可能更显著。在某些情况下,速度是原来的两倍以上:bit_primesnbit_primes

python3 -m timeit -s "import fast_primes" "fast_primes.bit_primes(500_000_000)"
1 loop, best of 5: 540 msec per loop

python3 -m timeit -s "import fast_primes" "fast_primes.primesfrom2to(500_000_000)"
1 loop, best of 5: 1.15 sec per loop

作为参考,以下是 I 的最小修改(在 Python 3 中工作)版本:primesfrom2to

def primesfrom2to(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns a array of primes, 2 <= p < n"""
    sieve = np.ones(n // 3 + (n % 6 == 2), dtype=np.bool)
    sieve[0] = False
    for i in range(int(n ** 0.5) // 3 + 1):
        if sieve[i]:
            k = 3 * i + 1 | 1
            sieve[((k * k) // 3)::2 * k] = False
            sieve[(k * k + 4 * k - 2 * k * (i & 1)) // 3::2 * k] = False
    return np.r_[2, 3, ((3 * np.nonzero(sieve)[0] + 1) | 1)]
0赞 user16657590 9/19/2021 #37

你有一个更快的代码,也有最简单的代码生成素数。 但是对于更高的数字,它不起作用,因为它失败了,也许是方法n=10000, 10000000.pop()

考虑一下:N 是素数吗?

  • 案例一:

    你得到了一些 N 的因子,

    for i in range(2, N):
    

    如果 N 是素数,则循环执行 ~(N-2) 次。否则次数更少

  • 案例二:

    for i in range(2, int(math.sqrt(N)): 
    

    如果 N 是素数,则循环执行几乎 ~(sqrt(N)-2) 次,否则将在某处中断

  • 案例3:

    最好我们只用素数数除以 N<=sqrt(N)

    其中循环仅执行 π(sqrt(N)) 次

    π(sqrt(N)) << sqrt(N) 随着 N 的增加而增加

    from math import sqrt
    from time import *
    prime_list = [2]
    n = int(input())
    s = time()
    for n0 in range(2,n+1):
        for i0 in prime_list:
            if n0%i0==0:
                break
            elif i0>=int(sqrt(n0)):
                prime_list.append(n0)
                break
    e = time()
    print(e-s)
    #print(prime_list); print(f'pi({n})={len(prime_list)}')
    print(f'{n}: {len(prime_list)}, time: {e-s}')
    
  • 输出

    100: 25, time: 0.00010275840759277344
    1000: 168, time: 0.0008606910705566406
    10000: 1229, time: 0.015588521957397461
    100000: 9592, time: 0.023436546325683594
    1000000: 78498, time: 4.1965954303741455
    10000000: 664579, time: 109.24591708183289
    100000000: 5761455, time: 2289.130858898163
    

对于小于 1000 似乎很慢,但对于 <10^6,我认为它更快。

不过,我无法理解时间的复杂性。

3赞 Pierre D 1/22/2022 #38

我很惊讶还没有人提到 numba

该版本在 2.47 ms ± 36.5 μs 内达到 1M 标记。

几年前,阿特金筛子的一个版本的伪代码在维基百科页面的素数上给出。这已经不存在了,对阿特金筛子的引用似乎是一种不同的算法。维基百科页面的2007/03/01版本,截至2007-03-01的引物编号,显示了我用作参考的伪代码。

import numpy as np
from numba import njit

@njit
def nb_primes(n):
    # Generates prime numbers 2 <= p <= n
    # Atkin's sieve -- see https://en.wikipedia.org/w/index.php?title=Prime_number&oldid=111775466
    sqrt_n = int(np.sqrt(n)) + 1

    # initialize the sieve
    s = np.full(n + 1, -1, dtype=np.int8)
    s[2] = 1
    s[3] = 1

    # put in candidate primes:
    # integers which have an odd number of
    # representations by certain quadratic forms
    for x in range(1, sqrt_n):
        x2 = x * x
        for y in range(1, sqrt_n):
            y2 = y * y
            k = 4 * x2 + y2
            if k <= n and (k % 12 == 1 or k % 12 == 5): s[k] *= -1
            k = 3 * x2 + y2
            if k <= n and (k % 12 == 7): s[k] *= -1
            k = 3 * x2 - y2
            if k <= n and x > y and k % 12 == 11: s[k] *= -1

    # eliminate composites by sieving
    for k in range(5, sqrt_n):
        if s[k]:
            k2 = k*k
            # k is prime, omit multiples of its square; this is sufficient because
            # composites which managed to get on the list cannot be square-free
            for i in range(1, n // k2 + 1):
                j = i * k2 # j ∈ {k², 2k², 3k², ..., n}
                s[j] = -1
    return np.nonzero(s>0)[0]

# initial run for "compilation" 
nb_primes(10)

定时

In[10]:
%timeit nb_primes(1_000_000)

Out[10]:
2.47 ms ± 36.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In[11]:
%timeit nb_primes(10_000_000)

Out[11]:
33.4 ms ± 373 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In[12]:
%timeit nb_primes(100_000_000)

Out[12]:
828 ms ± 5.64 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1赞 danbst 5/20/2022 #39

2021 年的这个答案开始,我还没有发现这种方法对 10 亿以下的素数有益。bitarray

但是我能够通过几个技巧将速度提高到几乎 x2:primesfrom2to

  • 使用库将 numpy 表达式转换为分配较少的紧密循环numexpr
  • 替换为更快的替代方案np.ones
  • 以某种方式操纵筛子的前 9 个元素,因此以后无需更改阵列形状

总而言之,在我的机器上,素数<10亿的时间从25秒增加到14.5秒

import numexpr as ne
import numpy as np

def primesfrom2to_numexpr(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=24, Returns a array of primes, 2 <= p < n + a few over"""
    sieve = np.zeros((n // 3 + (n % 6 == 2))//4+1, dtype=np.int32)
    ne.evaluate('sieve + 0x01010101', out=sieve)
    sieve = sieve.view('int8')
    #sieve = np.ones(n // 3 + (n % 6 == 2), dtype=np.bool_)
    sieve[0] = 0
    for i in np.arange(int(n ** 0.5) // 3 + 1):
        if sieve[i]:
            k = 3 * i + 1 | 1
            sieve[((k * k) // 3)::2 * k] = 0
            sieve[(k * k + 4 * k - 2 * k * (i & 1)) // 3::2 * k] = 0
    sieve[[0,8]] = 1
    result = np.flatnonzero(sieve)
    ne.evaluate('result * 3 + 1 + result%2', out=result)
    result[:9] = [2,3,5,7,11,13,17,19,23]
    return result
0赞 Rohit Sohlot 4/13/2023 #40
def find_primes(n):
    # Create a boolean list to track prime numbers
    primes = [True] * (n+1)
    primes[0] = primes[1] = False

    # Mark all multiples of 2 as composite (except 2 itself)
    for i in range(4, n+1, 2):
        primes[i] = False

    # Iterate over all odd numbers and mark their multiples as composite
    for i in range(3, int(n**0.5)+1, 2):
        if primes[i]:
            for j in range(i*i, n+1, 2*i):
                primes[j] = False

    # Return a list of prime numbers
    return [i for i in range(n+1) if primes[I]]
primes = find_primes(1000000)
print(primes)

find_primes 函数将整数 n 作为输入,并返回从 1 到 n 的所有质数的列表。该代码首先创建一个长度为 n+1 的布尔列表素数来跟踪素数。然后,它将 2 的所有倍数标记为复合数(2 本身除外),因为所有其他偶数都不是素数。接下来,它遍历从 3 到 n 的平方根的所有奇数,并将它们的倍数标记为复合数。最后,它通过从布尔列表中过滤掉复合数来返回所有素数的列表。

请注意,Sieve of Eratosthenes 算法的时间复杂度为 O(n log log n),这使得它比单独测试每个数字的素数要高效得多。