Empirical distribution of sorted Gaussian numbers












I wrote a small program that does the following:

  1. Pick $N$ independent standard Gaussian numbers (expected value: 0, standard deviation: 1). Call that list $L=\{y_1, \ldots, y_N\}$.

  2. Sort that list in increasing order: $\tilde{L}=\mathrm{sort}(L)$.

  3. Plot that list on the $[-1,1]$ interval using a regularly spaced grid $x_i=-1+\frac{2i}{N-1}$, with $i=0,\ldots,N-1$.


I found that the plot was similar to that of the inverse error function, only differing by a multiplicative factor $a>0$.
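For readers who want to reproduce the experiment, here is a minimal Python sketch of the three steps. The original program is not shown, so the function name `sorted_gaussian_curve`, the fixed seed, and the use of the standard-library `random` module are all assumptions made for illustration.

```python
import random


def sorted_gaussian_curve(n, seed=0):
    """Return the grid x_i = -1 + 2i/(n-1) and n sorted standard Gaussian draws."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    ys = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return xs, ys


xs, ys = sorted_gaussian_curve(100_000)
# Plotting ys against xs gives the curve described above;
# here we only print a few points along it.
for i in (0, 25_000, 50_000, 75_000, 99_999):
    print(f"x = {xs[i]:+.5f}   sorted value = {ys[i]:+.5f}")
```

Any plotting library can be used to draw `ys` against `xs`; it is left out here to keep the sketch dependency-free.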



[Figure: inverse error function and sorted Gaussian numbers, with $N=10^5$]



I used a linear regression to find an approximate value of $1.42104$ for $a$. The two functions are very close for $N=10^5$:
[Figure: fitted inverse error function and sorted Gaussian numbers for $N=10^5$]
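The exact regression used is not given, so the following is only one plausible way to estimate $a$: a least-squares fit through the origin of the sorted values against the inverse error function on the grid. Two assumptions are worth flagging. First, the endpoints $x=\pm 1$ are dropped, because the inverse error function is infinite there. Second, the helper `erfinv` is built from the standard library's normal quantile function (Python has no built-in inverse error function); `scipy.special.erfinv` could be used instead if SciPy is available.

```python
import math
import random
from statistics import NormalDist


def erfinv(x):
    """Inverse error function for -1 < x < 1, via the standard normal quantile."""
    return NormalDist().inv_cdf((1.0 + x) / 2.0) / math.sqrt(2.0)


def fit_scale(n, seed=0):
    """Least-squares estimate of a in: sorted_sample[i] ~ a * erfinv(x_i)."""
    rng = random.Random(seed)
    ys = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
    num = 0.0
    den = 0.0
    for i in range(1, n - 1):  # skip i = 0 and i = n - 1, where x_i = -1 and +1
        x = -1.0 + 2.0 * i / (n - 1)
        e = erfinv(x)
        num += ys[i] * e
        den += e * e
    return num / den  # minimiser of sum_i (ys[i] - a * e_i)^2


a_hat = fit_scale(20_000)
print("fitted a:", a_hat)
```

The fitted value depends on the random draw and on how the tails are handled, so it will not match $1.42104$ digit for digit.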



I have two questions:

  1. What is the exact value of $a$?

  2. How can one prove that the limit function is indeed $a\cdot\mathrm{inverf}$ as $N\to\infty$?










  • What about $\sqrt 2$?
    – Claude Leibovici, 13 hours ago
















probability normal-distribution probability-limit-theorems sorting






edited yesterday

























asked yesterday









Florian Omnès

1 Answer
What you observe is the convergence of the empirical distribution function to the cumulative distribution function of the distribution the sample is drawn from; more precisely, the convergence of the empirical quantiles to the theoretical quantiles (the values of the inverse cumulative distribution function).



Specifically, for a continuous increasing cdf, theoretical quantiles are given by
$$
x_q = F^{-1}(q) = \sup\{x\in\mathbb R: F(x)<q\}, \quad q\in(0,1).
$$

The definition of empirical quantiles varies. For a sample $X_1,\dots,X_n$ of iid variables they can e.g. be defined by
$$
\hat x_q = X_{(\lfloor nq\rfloor +1)}, \quad q\in (0,1),
$$

where $X_{(1)}\le \dots\le X_{(n)}$ is the sorted sample. It is known that whenever the cdf $F$ is strictly increasing, $\hat x_q \to x_q$ for all $q\in (0,1)$ with probability $1$ as the sample size $n\to\infty$.
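As an illustration of this convergence (a numerical sketch, not a proof), the snippet below computes $\hat x_q = X_{(\lfloor nq\rfloor+1)}$ for standard normal samples of growing size and compares it with the theoretical quantile. The function name `empirical_quantile`, the chosen levels $q$, the sample sizes, and the seed are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def empirical_quantile(sample_sorted, q):
    """Order statistic X_(floor(n q) + 1) of an already sorted sample, 0 < q < 1."""
    n = len(sample_sorted)
    rank = math.floor(n * q) + 1   # 1-based rank, as in the formula above
    return sample_sorted[rank - 1]  # convert to a 0-based list index


levels = (0.1, 0.25, 0.5, 0.75, 0.9)
std_normal = NormalDist()
rng = random.Random(1)

errors = {}
for n in (100, 10_000, 200_000):
    sample = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
    errors[n] = max(
        abs(empirical_quantile(sample, q) - std_normal.inv_cdf(q)) for q in levels
    )
    print(f"n = {n:>7}: largest quantile error over the levels = {errors[n]:.5f}")
```

The errors fluctuate from run to run, but they should tend to shrink as $n$ grows.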



In your case, $F(x) = \Phi(x)$ is the standard normal cdf, which is related to the error function by $$\Phi(x) = \frac{1+\operatorname{Erf}(x/\sqrt{2})}{2},$$
so
$$
\Phi^{-1}(y) = \sqrt{2}\operatorname{Erf}^{-1}(2y-1), \quad y\in (0,1).
$$
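This relation can be sanity-checked numerically with the Python standard library alone. Applying $\operatorname{Erf}(\cdot/\sqrt 2)$ to both sides of the inverse formula gives the equivalent statement $\operatorname{Erf}\big(\Phi^{-1}(y)/\sqrt{2}\big) = 2y-1$, which is what the sketch tests on an arbitrary grid of $y$ values (the grid and the variable names are assumptions).

```python
import math
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Check Erf(Phi^{-1}(y) / sqrt(2)) == 2y - 1 on a grid of y in (0, 1).
ys = [k / 1000 for k in range(1, 1000)]
max_gap = max(
    abs(math.erf(std_normal.inv_cdf(y) / math.sqrt(2.0)) - (2.0 * y - 1.0))
    for y in ys
)
print("largest deviation from the identity:", max_gap)
```

The deviation should be at the level of floating-point rounding error.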

What you are doing is applying a similar transformation to the empirical quantiles, so the convergence to $\sqrt{2}\operatorname{Erf}^{-1} \approx 1.4142 \operatorname{Erf}^{-1}$ is not surprising.
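To spell out that transformation for the grid used in the question (a sketch: it identifies the grid point $x_i$ with the quantile level $q_i = i/(N-1)$, which is only approximately the level of the $i$-th sorted value $\tilde y_i$ for finite $N$):
$$
q_i = \frac{i}{N-1} = \frac{x_i+1}{2}, \qquad \tilde y_i \approx \Phi^{-1}(q_i) = \sqrt{2}\operatorname{Erf}^{-1}(2q_i-1) = \sqrt{2}\operatorname{Erf}^{-1}(x_i).
$$

So the factor in the question is $a=\sqrt{2}\approx 1.41421$, which is consistent with the fitted value $1.42104$ obtained for $N=10^5$.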






        answered 13 hours ago









        zhoraster
