R's default weighted sampling without replacement, base::sample.int() with a prob argument, seems to require quadratic run time, e.g., when the weights are drawn from a uniform distribution. This is too slow for large sample sizes. This package contains several alternative implementations.
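A rough way to observe the run-time growth is to time base::sample.int() at a few doubling problem sizes. The following is only a sketch: the helper name time_sample, the chosen sizes, and the choice of drawing half of the items are assumptions made for illustration, and the measured timings depend on the machine and R version.

```r
# Hypothetical helper: time one weighted draw without replacement of
# n %/% 2 items out of n, with weights drawn from a uniform distribution.
time_sample <- function(n) {
  prob <- runif(n)
  system.time(sample.int(n, n %/% 2, prob = prob))[["elapsed"]]
}

# Doubling sizes; if the run time is quadratic, the elapsed time should
# roughly quadruple from one row to the next.
sizes <- c(5000L, 10000L, 20000L)
timings <- vapply(sizes, time_sample, numeric(1))
print(data.frame(n = sizes, elapsed = timings))
```

For very small sizes the elapsed time may be dominated by timer resolution, so larger values of n give a clearer picture.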

Details

Implementations are adapted from https://stackoverflow.com/q/15113650/946850.
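The idea behind the paper by Efraimidis and Spirakis listed under References can be sketched in a few lines of base R: give each item i the key u_i^(1 / w_i), where u_i is uniform on (0, 1) and w_i is the item's weight, and keep the items with the largest keys. The function below is a minimal illustration under that assumption, not the package's implementation; the name sample_int_keys_sketch is hypothetical.

```r
# Hypothetical helper, not part of the package: draws `size` indices from
# 1..n without replacement, using the positive weights in `prob`.
sample_int_keys_sketch <- function(n, size, prob) {
  stopifnot(length(prob) == n, size <= n, all(prob > 0))
  # The key of item i is u_i^(1 / w_i). Its logarithm, log(u_i) / w_i,
  # gives the same ordering and avoids underflow for small weights.
  keys <- log(runif(n)) / prob
  # The `size` items with the largest keys form the sample.
  order(keys, decreasing = TRUE)[seq_len(size)]
}

set.seed(42)
s <- sample_int_keys_sketch(100, 50, 1:100)
print(s)
```

Because the work is dominated by a single sort, this sketch needs O(n log n) time rather than quadratic time.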

References

Efraimidis, Pavlos S., and Paul G. Spirakis. "Weighted random sampling with a reservoir." Information Processing Letters 97, no. 5 (2006): 181-185.

Wong, Chak-Kuen, and Malcolm C. Easton. "An efficient method for weighted sampling without replacement." SIAM Journal on Computing 9, no. 1 (1980): 111-113.

Author

Kirill Müller

Examples

sample_int_rej(100, 50, 1:100)
#>  [1]  88  83  38  33  59  46  29  51  76  32 100  71  77  85  68  63  34  74  94
#> [20]  53  78  26  93  98  69  35  97  45  55  99  87  62  86  24   3  31  70  72
#> [39]  95  91  60  96  22  43  58  89  50   9  92   5