Simple Statistics/Probability Problem
























$begingroup$


I have used a Python script to identify target sequences in a DNA sequence file.



There are two classes of sequence: coding and non-coding. I have identified $728$ sequences of interest. $597$ of these fall into the coding regions and $131$ fall into the non-coding regions. This is equivalent to $18\%$ non-coding, but the non-coding regions make up only $13\%$ of the sequence file.




Is there a statistical tool to demonstrate that the Python script identified target sequences in a non-random fashion?




If the script identified sequences that were randomly distributed, then $13\%$ of the $728$ sequences would have been found in the non-coding region. With a total of $728$ sequences, this seems like it should be reliable.
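The expectation described above can be written out as a quick calculation. This is only a sketch: the variable names are made up for illustration, and it assumes that under "random" placement each hit lands in a non-coding region with probability $0.13$.

```python
# Counts reported in the question.
n_total = 728
n_noncoding = 131
n_coding = 597

# Assumed null proportion: non-coding DNA makes up 13% of the sequence file.
p_noncoding = 0.13

# Expected counts if hits were spread at random over the genome.
expected_noncoding = n_total * p_noncoding
expected_coding = n_total * (1 - p_noncoding)
observed_fraction = n_noncoding / n_total

print(f"expected non-coding: {expected_noncoding:.1f}, observed: {n_noncoding}")
print(f"expected coding:     {expected_coding:.1f}, observed: {n_coding}")
print(f"observed non-coding fraction: {observed_fraction:.3f}")
```

The gap between the expected and observed non-coding counts is what a formal test has to weigh against chance variation.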



I hope my question is clear.

















$endgroup$












  • $begingroup$
    I tried to improve the formatting of your question; is that okay? Also, can you clarify a few things: Did we already know, before the experiment, that the non-coding regions make up $13\%$ of the sequence file? And does that mean you expected only $94$ non-coding sequences (equivalent to $13\%$) instead of $131$?
    $endgroup$
    – Zacky
    Jan 7 at 22:39












  • $begingroup$
    Are the coding and non-coding sequences the same length?
    $endgroup$
    – N. F. Taussig
    Jan 7 at 22:41










  • $begingroup$
    Yes, the calculation was conducted by another person and is referenced from the scientific literature. Although this figure is important, an error in that calculation could account for my discrepancy. Nevertheless, I am trying to identify specific sequences and I want to be sure that the sequences are not just background noise. If the sequences were noise, then I would expect them to be evenly distributed across the whole genomic sequence file, and I would find 13% of my target sequences in the non-coding region and 87% in the coding region.
    $endgroup$
    – Ryan_J_Hope
    Jan 7 at 22:42












  • $begingroup$
    The coding sequences and non-coding sequences are not the same length. However, I don't see why this would affect the result. The entire genome is made up of 13% non-coding and 87% coding sequence. There are 3871 coding sections separated by intergenic non-coding sections.
    $endgroup$
    – Ryan_J_Hope
    Jan 7 at 22:49












  • $begingroup$
    This is more a statistics question than a mathematical one. You might get better answers by posting to Cross Validated (stats.stackexchange.com).
    $endgroup$
    – awkward
    Jan 8 at 15:19
















probability statistics biology






edited Jan 7 at 22:36

Zacky

asked Jan 7 at 21:30

Ryan_J_Hope

1 Answer












$begingroup$

Your null hypothesis is $H_0: p = 0.13$ against the alternative
$H_a: p \ne 0.13,$ where $p = P(\text{Non Coding}).$
You observe $X = 131$ non-coding sequences among $n = 728$ observed,
which gives you $\hat p = 131/728 \approx 0.180$ as the observed frequency.
(The software output and the R computation below use $N = 723$ instead of $728,$ so they report $\hat p = 0.1812;$ the difference is too small to affect the conclusion.)
Because the observed frequency is substantially different from $p = 0.13,$
you wonder whether this might have been an 'unlucky' draw, or whether
you have statistically significant evidence that the method of sampling is unfair.



This is called a "one-sample binomial test". Often this test is done by
using a normal approximation to the binomial distribution. You can find
that method in elementary statistics textbooks. The output below from
Minitab statistical software uses the binomial distribution to give an
exact P-value. [It seems that SciPy also implements a version of this test, but I have not tried it.]
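For readers working in Python, here is a minimal sketch of an exact two-sided binomial test that uses only the standard library. It is not the Minitab procedure: the helper names are hypothetical, and the two-sided P-value is defined here as the total null probability of all outcomes no more likely than the observed one, which is one common convention and may differ slightly from what a given package reports. It uses the question's $n = 728$ and assumes $0 < p_0 < 1$.

```python
from math import exp, lgamma, log, log1p

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed on the log scale to avoid underflow."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log1p(-p))

def exact_binom_test(x, n, p0):
    """Two-sided exact P-value: null probability of outcomes no more likely than x."""
    observed = binom_pmf(x, n, p0)
    # Small relative tolerance so outcomes tied with x are not lost to rounding.
    cutoff = observed * (1 + 1e-7)
    total = 0.0
    for k in range(n + 1):
        pk = binom_pmf(k, n, p0)
        if pk <= cutoff:
            total += pk
    return min(1.0, total)

p_value = exact_binom_test(131, 728, 0.13)
print(f"two-sided exact P-value: {p_value:.2e}")
```

A library routine, if available and trusted, would normally be preferable to hand-rolled code like this; the sketch is only meant to show what the test computes.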



If the P-value is less than 5%, one says that the null
hypothesis is rejected at the 5% level of significance. Here the P-value
is printed as 0.000, which means that the P-value is smaller than 0.0005.
So it is extremely unlikely that an unbiased draw would give an observed
proportion of non-coding sequences so far from $p = 0.13.$
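The textbook normal-approximation version of the test mentioned earlier, together with the 5% decision rule, might look like the following sketch. The function name is made up for illustration, no continuity correction is applied, and the counts are the ones from the question ($n = 728$).

```python
from math import erfc, sqrt

def one_proportion_z_test(x, n, p0):
    """Two-sided z-test of H0: p = p0 using the normal approximation to the binomial."""
    p_hat = x / n
    standard_error = sqrt(p0 * (1 - p0) / n)   # standard error of p_hat under H0
    z = (p_hat - p0) / standard_error
    p_value = erfc(abs(z) / sqrt(2))           # equals 2 * P(Z > |z|) for standard normal Z
    return z, p_value

z, p_value = one_proportion_z_test(131, 728, 0.13)
reject_at_5_percent = p_value < 0.05
print(f"z = {z:.2f}, P-value = {p_value:.1e}, reject H0 at 5%: {reject_at_5_percent}")
```

With a sample this large the approximate and exact tests should lead to the same decision, although their P-values will not match digit for digit.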



Test and CI for One Proportion

Test of p = 0.13 vs p ≠ 0.13

Sample    X    N  Sample p  95% CI                Exact P-Value
1       131  723  0.181189  (0.153769, 0.211239)  0.000


Another way to interpret the output is that a 95% confidence interval
for $p$ is $(0.154, 0.211),$ which contains $\hat p = 0.1812$ but
does not contain $p = 0.13.$ Thus it is difficult to believe that
the true proportion of non-coding sequences picked out by the procedure is $p = 0.13.$
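An interval of this kind can be sketched in plain Python with the Clopper-Pearson construction, found here by bisection on the binomial tail probabilities. Whether the quoted interval was computed this way is an assumption, and all function names are hypothetical. The sketch uses the question's $n = 728$ and assumes $0 < x < n$.

```python
from math import exp, lgamma, log, log1p

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with 0 < p < 1."""
    total = 0.0
    for j in range(k + 1):
        log_coeff = lgamma(n + 1) - lgamma(j + 1) - lgamma(n - j + 1)
        total += exp(log_coeff + j * log(p) + (n - j) * log1p(-p))
    return min(1.0, total)

def bisect_root(f, lo, hi, iterations=60):
    """Find p in (lo, hi) with f(p) = 0, assuming f(lo) > 0 > f(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, conf_level=0.95):
    """Exact (Clopper-Pearson) interval for a binomial proportion, assuming 0 < x < n."""
    alpha = 1 - conf_level
    eps = 1e-12
    # Lower limit: the p at which P(X >= x) rises to alpha / 2.
    lower = bisect_root(lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p)), eps, x / n)
    # Upper limit: the p at which P(X <= x) falls to alpha / 2.
    upper = bisect_root(lambda p: binom_cdf(x, n, p) - alpha / 2, x / n, 1 - eps)
    return lower, upper

lower, upper = clopper_pearson(131, 728)
print(f"95% CI for p: ({lower:.4f}, {upper:.4f}); contains 0.13: {lower <= 0.13 <= upper}")
```

The point of interest is simply whether the hypothesized value $0.13$ falls inside the interval.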





Note: Yet another approach is to note that quantiles .025 and .975 of
the 'null distribution' $\mathsf{Binom}(n = 723, p = 0.13)$ are 77 and 112, respectively. Thus the observed value $X = 131$ falls considerably
above the upper 'critical value' of the null distribution for a two-sided test at the 5% level. (Computation in R.)



 qbinom(c(.025,.975), 723, .13)
[1] 77 112
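A rough Python counterpart of the R call above, using only the standard library, could look like this. The helper names are made up, and the quantile is taken as the smallest $k$ with $P(X \le k) \ge q$, which is the definition R documents for `qbinom`. The arguments are the same as in the R call.

```python
from math import exp, lgamma, log, log1p

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed on the log scale to avoid underflow."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log1p(-p))

def binom_quantile(q, n, p):
    """Smallest k with P(X <= k) >= q."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += binom_pmf(k, n, p)
        if cumulative >= q:
            return k
    return n

n, p0 = 723, 0.13          # same arguments as the R call above
lower_crit = binom_quantile(0.025, n, p0)
upper_crit = binom_quantile(0.975, n, p0)
observed = 131

print(lower_crit, upper_crit)
print("observed count beyond upper critical value:", observed > upper_crit)
```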
















$endgroup$













edited Jan 11 at 8:58

answered Jan 11 at 8:38

BruceET
