Add the random.distrib module #63100

Closed
serhiy-storchaka opened this issue Sep 1, 2013 · 5 comments
Labels: stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement)

Comments

@serhiy-storchaka
Member

BPO 18900
Nosy @tim-one, @rhettinger, @mdickinson, @serhiy-storchaka
Files
  • distrib.py: Sample implementation
  • distrib_bench.py: Benchmark script
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2013-09-07.07:24:09.617>
    created_at = <Date 2013-09-01.18:49:28.546>
    labels = ['type-feature', 'library']
    title = 'Add the random.distrib module'
    updated_at = <Date 2013-09-07.07:24:09.616>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2013-09-07.07:24:09.616>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2013-09-07.07:24:09.617>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2013-09-01.18:49:28.546>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['31549', '31550']
    hgrepos = []
    issue_num = 18900
    keywords = []
    message_count = 5.0
    messages = ['196727', '196732', '196744', '196752', '196778']
    nosy_count = 5.0
    nosy_names = ['tim.peters', 'rhettinger', 'mark.dickinson', 'serhiy.storchaka', 'madison.may']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue18900'
    versions = ['Python 3.4']

    @serhiy-storchaka
    Member Author

    In some functions in the random module, checking input arguments and precomputation take a considerable portion of the time. Here is a sample implementation of a new random.distrib module that provides an alternative, faster interface for generating randomly distributed values. It contains generators that produce values with the same distributions as the functions of the same name in the random module.
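A minimal sketch of the generator style described above, not the attached distrib.py. The function name, its signature, and the `random` parameter (any object with a `random()` method) are assumptions made for illustration. The point is that argument conversion and precomputation run once, and each later value costs only the loop body.

```python
import random as _random
from itertools import islice


def uniform(a, b, random=_random):
    """Yield an endless stream of floats uniformly distributed between a and b.

    Hypothetical sketch; `random` is assumed to be any object that has a
    random() method, such as the random module or a random.Random instance.
    """
    # This setup runs once, when the first value is requested.
    rand = random.random
    a = float(a)
    width = float(b) - a
    while True:
        # Per-value work is a single multiply-add.
        yield a + width * rand()


# Usage: take a few values from the endless stream.
values = list(islice(uniform(-10.0, 10.0), 3))
print(values)
```

The caller pays for validation once per generator rather than once per value, which only helps when many values are drawn with the same parameters.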

    Benchmark results:

                                     random  distrib  random/distrib

    random()                          0.061    0.055    1.12
    randrange(0, 100, 5)              1.494    0.620    2.41
    randint(1, 100)                   1.283    0.551    2.33
    uniform(-10.0, 10.0)              0.332    0.121    2.73
    triangular(0.0, 10.0, 6.0)        0.661    0.317    2.09
    gauss(5.0, 2.0)                   0.707    0.280    2.53
    normalvariate(5.0, 2.0)           0.867    0.553    1.57
    lognormvariate(5.0, 2.0)          1.078    0.640    1.68
    expovariate(0.1,)                 0.508    0.293    1.73
    vonmisesvariate(1.0, 1.0)         1.201    0.671    1.79
    gammavariate(0.35, 1.45)          1.117    0.508    2.20
    betavariate(2.71828, 3.14159)     2.868    1.776    1.61
    paretovariate(5.0,)               0.493    0.238    2.07
    weibullvariate(1.0, 3.0)          0.670    0.402    1.67
    choice([0, 1, 2, 3, 4, 5, 6...    0.887    0.594    1.49

    Except for random() itself (1.12), the distrib functions are 1.5-2.8 times faster than the corresponding random functions. A weighted choice() function (see bpo-18844) can be dozens of times faster, depending on the size of the input.
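A comparison of this kind can be sketched with timeit. This is not the attached distrib_bench.py: the generator below and the `bench` helper are hypothetical, and absolute timings will differ by machine and Python version.

```python
import random as _random
import timeit
from math import log


def expovariate(lambd, random=_random):
    """Hypothetical generator counterpart of random.expovariate()."""
    rand = random.random
    scale = -1.0 / lambd  # precomputed once instead of dividing per value
    while True:
        yield scale * log(1.0 - rand())


def bench(number=100_000):
    """Time `number` draws each way; return (function_s, generator_s)."""
    t_func = timeit.timeit(
        "expovariate(0.1)",
        setup="from random import expovariate",
        number=number,
    )
    # Bind the generator's __next__ so each timed call yields one value.
    next_value = expovariate(0.1).__next__
    t_gen = timeit.timeit(
        "next_value()",
        globals={"next_value": next_value},
        number=number,
    )
    return t_func, t_gen


if __name__ == "__main__":
    t_func, t_gen = bench()
    print(f"function {t_func:.3f}s  generator {t_gen:.3f}s  "
          f"ratio {t_func / t_gen:.2f}")
```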

    In addition, some random generators (e.g. gauss()) look simpler when implemented as generators. distrib.gauss() is twice as fast as distrib.normalvariate() (both generate numbers with the same distribution), and I think some other generators can be implemented more efficiently in generator style.

    @serhiy-storchaka added the stdlib (Python modules in the Lib dir) and type-feature (A feature request or enhancement) labels Sep 1, 2013
    @MadisonMay
    Mannequin

    MadisonMay mannequin commented Sep 1, 2013

    I like the core idea of a family of random generators, but it feels like a new module that's nearly identical to random introduces a lot of repeated code.

    Perhaps adding an additional optional arg ('generator=False', for example) to these functions in the random module would be a bit simpler.

    @serhiy-storchaka
    Member Author

    Of course, if this idea is accepted, we can turn the current functions in the random module into wrappers around generators from the distrib module.

    E.g.:

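        def triangular(self, *args, **kwargs):
            # Method of random.Random: build the distrib generator with this
            # instance as the random source and return its first value.
            return next(distrib.triangular(*args, random=self, **kwargs))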

    @MadisonMay
    Mannequin

    MadisonMay mannequin commented Sep 1, 2013

    ...we can turn current functions in the random module into wrappers
    around generators from the distrib module.

    Makes sense.

    In light of Raymond's comments on code bloat in bpo-18844, perhaps this module could be added to PyPI to see whether or not there's interest in this kind of functionality?

    @serhiy-storchaka
    Member Author

    In light of Raymond's comments on code bloat in bpo-18844, perhaps this module could be added to PyPI to see whether or not there's interest in this kind of functionality?

    Agreed. At first glance, no module on PyPI provides such features. On the other hand, NumPy provides efficient C-implemented functions that are 2-10 times faster than the proposed pure-Python iterators. For this reason I withdraw my proposal. Anyone who needs performance when generating random values with a specific distribution can use NumPy.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022