Empirical distribution of sorted Gaussian numbers
I wrote a small program that does the following:
- Pick $N$ independent standard Gaussian numbers (expected value: 0, standard deviation: 1). Call that list $L=\{y_1, \ldots, y_N\}$.
- Sort that list in increasing order: $\tilde{L}=\mathrm{sort}(L)$.
- Plot that list on the $[-1,1]$ interval using a regularly spaced grid $x_i=-1+\frac{2i}{N-1}$, with $i=0,\ldots,N-1$.
I found that the plot was similar to that of the inverse error function, differing only by a multiplicative factor $a>0$.
A linear regression gave me the approximate value $a \approx 1.42104$. The two functions are very close for $N=10^5$.
I have two questions:
- What is the exact value of $a$?
- How can one prove that the limit function is indeed $a\cdot\mathrm{inverf}$ as $N\to \infty$?
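For readers who want to reproduce the experiment, here is a minimal sketch using only Python's standard library. It is not the original program: the helper `erfinv` (a bisection on `math.erf`), the function name `fit_scale`, the seed, and the choice of a least-squares fit through the origin are all assumptions made for illustration. The two grid endpoints $x=\pm 1$ are skipped because the inverse error function is infinite there.

```python
import math
import random


def erfinv(x, tol=1e-12):
    """Invert math.erf on (-1, 1) by bisection (illustrative helper)."""
    lo, hi = -6.0, 6.0  # erf(+-6) equals +-1 in double precision
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fit_scale(n, seed=0):
    """Least-squares estimate of a in sorted_sample[i] ~ a * erfinv(x_i)."""
    rng = random.Random(seed)
    ys = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
    num = den = 0.0
    # Skip i = 0 and i = n - 1, where x_i = -1 or 1 and erfinv diverges.
    for i in range(1, n - 1):
        x = -1.0 + 2.0 * i / (n - 1)
        e = erfinv(x)
        num += e * ys[i]
        den += e * e
    return num / den


a = fit_scale(20_000)
print("fitted a:", a, " sqrt(2):", math.sqrt(2))
```

The fitted value fluctuates from seed to seed, so it should only be read as an estimate of the scale factor.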
probability normal-distribution probability-limit-theorems sorting
What about $\sqrt 2$?
– Claude Leibovici
13 hours ago
edited yesterday
asked yesterday
Florian Omnès
1 Answer
What you observe is the convergence of the empirical distribution function to the cumulative distribution function of the sampled distribution; more precisely, the convergence of the empirical quantiles to the theoretical quantiles (that is, the values of the inverse cumulative distribution function).
Specifically, for a continuous increasing cdf, theoretical quantiles are given by
$$
x_q = F^{-1}(q) = \sup\{x\in\mathbb R: F(x)<q\}, \quad q\in(0,1).
$$
The definition of empirical quantiles varies. For a sample $X_1,\dots,X_n$ of iid variables they can, e.g., be defined by
$$
\hat x_q = X_{(\lfloor nq\rfloor +1)}, \quad q\in (0,1),
$$
where $X_{(1)}\le \dots\le X_{(n)}$ is the sorted sample. It is known that whenever the cdf $F$ is strictly increasing, $\hat x_q \to x_q$ for all $q\in (0,1)$ with probability $1$ as the sample size $n\to\infty$.
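As an illustration of this definition, the following sketch computes $\hat x_q = X_{(\lfloor nq\rfloor+1)}$ for a simulated standard normal sample and prints it next to the theoretical quantile. The function name `empirical_quantile`, the sample size and the seed are choices made here for the example; the theoretical quantile comes from `statistics.NormalDist().inv_cdf`.

```python
import math
import random
from statistics import NormalDist


def empirical_quantile(sorted_sample, q):
    """Return the order statistic X_(floor(n*q) + 1) for q in (0, 1)."""
    n = len(sorted_sample)
    k = math.floor(n * q) + 1  # 1-based index of the order statistic
    return sorted_sample[k - 1]  # Python lists are 0-based


rng = random.Random(1)
sample = sorted(rng.gauss(0.0, 1.0) for _ in range(50_000))

for q in (0.1, 0.5, 0.9):
    # Empirical quantile next to the theoretical one, Phi^{-1}(q).
    print(q, empirical_quantile(sample, q), NormalDist().inv_cdf(q))
```

The two printed columns should get closer as the sample size grows, which is the almost-sure convergence stated above.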
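In your case, $F(x) = \Phi(x)$ is the standard normal cdf, which is related to the error function by $$\Phi(x) = \frac{1+\operatorname{Erf}(x/\sqrt{2})}{2},$$
so
$$
\Phi^{-1}(y) = \sqrt{2}\operatorname{Erf}^{-1}(2y-1), \quad y\in (0,1).
$$
Your grid applies the same change of variables to the empirical quantiles: the point $x_i=-1+\frac{2i}{N-1}$ corresponds to the level $q_i=\frac{x_i+1}{2}=\frac{i}{N-1}$, so the sorted value plotted at $x_i$ approximates $\Phi^{-1}(q_i)=\sqrt{2}\operatorname{Erf}^{-1}(x_i)$. The convergence to $\sqrt{2}\operatorname{Erf}^{-1} \approx 1.4142 \operatorname{Erf}^{-1}$ is therefore not surprising.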
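The two identities can be checked numerically with a short sketch that uses only Python's standard library. Since the standard library has no inverse error function, the inverse relation is tested in the equivalent form $\operatorname{Erf}\big(\Phi^{-1}(y)/\sqrt{2}\big) = 2y-1$; the test points and variable names are arbitrary choices for this example.

```python
import math
from statistics import NormalDist

phi = NormalDist()  # standard normal: mean 0, standard deviation 1

# Largest gap between Phi(x) and (1 + erf(x / sqrt(2))) / 2 on a few points.
max_cdf_gap = max(
    abs(phi.cdf(x) - (1 + math.erf(x / math.sqrt(2))) / 2)
    for x in (-2.0, -0.5, 0.0, 1.0, 2.5)
)

# z = Phi^{-1}(y) = sqrt(2) * erfinv(2y - 1)  <=>  erf(z / sqrt(2)) = 2y - 1.
max_inv_gap = max(
    abs(math.erf(phi.inv_cdf(y) / math.sqrt(2)) - (2 * y - 1))
    for y in (0.05, 0.3, 0.5, 0.8, 0.99)
)

print(max_cdf_gap, max_inv_gap)
```

Both gaps should be at the level of floating-point rounding.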
answered 13 hours ago
zhoraster