How to sort by odd lines then remove repeated values?












6















I have the following type of file:



transcr_25793 +
YAL039C -
transcr_25793 +
YAL037C-B -
transcr_20649 +
YBL100C -
transcr_7135 +
YBL029C-A -
transcr_11317 +
YBL067C -
transcr_25793 +
YAL038W +
transcr_7135 +
YBL029W +


I was trying to get something like this:



transcr_7135 +
YBL029C-A -
transcr_7135 +
YBL029W +
transcr_11317 +
YBL067C -
transcr_20649 +
YBL100C -
transcr_25793 +
YAL039C -
transcr_25793 +
YAL037C-B -
transcr_25793 +
YAL038W +


Then, afterward, I was looking for something like this:



transcr_7135 +
YBL029C-A -
YBL029W +
transcr_11317 +
YBL067C -
transcr_20649 +
YBL100C -
transcr_25793 +
YAL039C -
YAL037C-B -
YAL038W +


I've looked through the sort manual and some posts, but couldn't find anything close to this, only sorting by numerical values to get the odd lines...

















      text-processing sort














      asked Jan 18 at 11:42 by Lucas Farinazzo Marques
      edited Jan 18 at 13:38 by Rui F Ribeiro






















          5 Answers


















          3














          Not exactly the sort order you've shown, but maybe it's good enough?



          $ cat input.txt|paste - -| sort -k1,1V -k2,2| tr "\t" "\n" | awk '{if(($0 in line) == 0) {line[$0]; print}}'
          transcr_7135 +
          YBL029C-A -
          YBL029W +
          transcr_11317 +
          YBL067C -
          transcr_20649 +
          YBL100C -
          transcr_25793 +
          YAL037C-B -
          YAL038W +
          YAL039C -


          EDIT:



          Inserting the line number and using it as a secondary sort key should produce exactly the output you want:



          $ cat input.txt | paste - - | nl | sort -k2,2V -k1,1g | cut -f2- | tr "\t" "\n" | awk '{if(($0 in line) == 0) {line[$0]; print}}'
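For reference, here is a minimal, self-contained sketch of the line-number idea, run against the sample from the question. It is an illustration under a few assumptions, not the exact command above: the temporary file and the variable names (`input`, `tab`, `result`) are made up for the example, TAB is set explicitly as the field separator, and a numeric key on the digits after `transcr_` stands in for `-V` so the sketch does not depend on GNU sort.

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file (hypothetical path).
input=$(mktemp)
cat > "$input" <<'EOF'
transcr_25793 +
YAL039C -
transcr_25793 +
YAL037C-B -
transcr_20649 +
YBL100C -
transcr_7135 +
YBL029C-A -
transcr_11317 +
YBL067C -
transcr_25793 +
YAL038W +
transcr_7135 +
YBL029W +
EOF

tab=$(printf '\t')

# paste - -  : join each odd line with the even line after it (TAB-separated)
# nl         : prefix every pair with its original position
# sort       : order by the number after "transcr_", then by original position
# cut -f2-   : drop the position again
# tr         : split each pair back into two lines
# awk        : print a line only the first time it is seen
result=$(paste - - < "$input" |
    nl |
    sort -t "$tab" -k2.9,2n -k1,1n |
    cut -f2- |
    tr '\t' '\n' |
    awk '!seen[$0]++')

printf '%s\n' "$result"
rm -f "$input"
```

One caveat with deduplicating on the whole line: if the same even line ever appeared under two different transcr entries, only its first occurrence would survive.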































          • It doesn't show exactly the way I posted, but for my case it doesn't matter, thanks a lot!

            – Lucas Farinazzo Marques
            Jan 18 at 12:32



















          7














          Pure gawk solution:



          awk -F_ 'NR%2{i=$2;next}{a[i]=a[i]"\n"$0}
          END{PROCINFO["sorted_in"]="@ind_num_asc";
          for(i in a) printf "%s","transcr_"i""a[i]"\n"}' file


          The trick is to sort the indices of array a numerically, with a little help from gawk's special PROCINFO array.



          transcr_7135
          YBL029C-A -
          YBL029W +
          transcr_11317
          YBL067C -
          transcr_20649
          YBL100C -
          transcr_25793
          YAL039C -
          YAL037C-B -
          YAL038W +


          BTW, it's a pity awk doesn't offer a natural sort, a.k.a. version sort (ordering text with embedded numbers).
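If gawk is not available, the same grouping can be sketched in plain awk by sorting the ids by hand. This is a swapped-in technique (an insertion sort in the END block), not the PROCINFO approach above, and the function name `group_sorted`, the array names and the temporary file are made up for the illustration. Unlike the gawk one-liner, it prints each header line exactly as it first appeared in the input.

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file (hypothetical path).
input=$(mktemp)
cat > "$input" <<'EOF'
transcr_25793 +
YAL039C -
transcr_25793 +
YAL037C-B -
transcr_20649 +
YBL100C -
transcr_7135 +
YBL029C-A -
transcr_11317 +
YBL067C -
transcr_25793 +
YAL038W +
transcr_7135 +
YBL029W +
EOF

# group_sorted: hypothetical helper, groups even lines under their odd line
# and prints the groups in ascending numeric order of the transcr id.
group_sorted() {
    awk -F_ '
        NR % 2 {                          # odd line, e.g. "transcr_7135 +"
            id = $2 + 0                   # numeric id; "+ 0" ignores the strand sign
            if (!(id in head)) { head[id] = $0; ids[++n] = id }
            next
        }
        { body[id] = body[id] "\n" $0 }   # even line: append to the current group
        END {
            # Insertion sort of the ids, since POSIX awk has no array sort.
            for (i = 2; i <= n; i++) {
                v = ids[i]
                for (j = i - 1; j >= 1 && ids[j] > v; j--)
                    ids[j + 1] = ids[j]
                ids[j + 1] = v
            }
            for (i = 1; i <= n; i++)
                print head[ids[i]] body[ids[i]]
        }
    ' "$@"
}

result=$(group_sorted "$input")
printf '%s\n' "$result"
rm -f "$input"
```

An insertion sort is quadratic in the number of distinct ids, so for very large inputs piping through an external sort is probably the better choice.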







































            3














            With GNU sort and assuming the lines don't contain TAB characters:



            paste - - < file | sort -V | tr '\t' '\n' | awk '!seen[$0]++'


            Or sort -t$'\t' -sk1,1V to preserve the original order for entries with identical odd lines, as in your expected output.



            If you don't have GNU sort, and assuming the odd lines always follow that pattern, you can replace sort -V with sort -k1.9n.
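A runnable sketch of the order-preserving variant, combining the stable sort with the numeric key mentioned above. The temporary file and the variable names are assumptions made for the example; `-s` is not part of POSIX sort, though GNU sort supports it.

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file (hypothetical path).
input=$(mktemp)
cat > "$input" <<'EOF'
transcr_25793 +
YAL039C -
transcr_25793 +
YAL037C-B -
transcr_20649 +
YBL100C -
transcr_7135 +
YBL029C-A -
transcr_11317 +
YBL067C -
transcr_25793 +
YAL038W +
transcr_7135 +
YBL029W +
EOF

tab=$(printf '\t')

# Join pairs with TAB, sort on the digits that start at character 9 of the
# first TAB-separated field, keep the input order for equal keys (-s),
# split the pairs again and drop repeated lines.
result=$(paste - - < "$input" |
    sort -t "$tab" -s -k1.9,1n |
    tr '\t' '\n' |
    awk '!seen[$0]++')

printf '%s\n' "$result"
rm -f "$input"
```

Without `-s`, sort falls back to comparing whole lines when the keys are equal, which is why the plain `sort -V` pipeline lists the even lines of one group in alphabetical order instead.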







































              2














              for i in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
              do
              echo $i; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
              done


              Where a.txt is your input. Tested:



              [root@megatron ~]# cat a.txt
              transcr_25793 +
              YAL039C -
              transcr_25793 +
              YAL037C-B -
              transcr_20649 +
              YBL100C -
              transcr_7135 +
              YBL029C-A -
              transcr_11317 +
              YBL067C -
              transcr_25793 +
              YAL038W +
              transcr_7135 +
              YBL029W +
              [root@megatron ~]# for i in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
              do
              echo $i; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
              done
              transcr_7135
              YBL029C-A -
              YBL029W +
              transcr_11317
              YBL067C -
              transcr_20649
              YBL100C -
              transcr_25793
              YAL039C -
              YAL037C-B -
              YAL038W +
              [root@megatron ~]#
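A slightly more defensive sketch of the same loop, in case it helps. It swaps `grep -A1` for an exact comparison in awk, so that `transcr_7135` cannot also match a longer id such as `transcr_71350`, and no `--` group separators need filtering. The function name `list_groups`, the variable names and the temporary file are made up for the example.

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file (hypothetical path).
input=$(mktemp)
cat > "$input" <<'EOF'
transcr_25793 +
YAL039C -
transcr_25793 +
YAL037C-B -
transcr_20649 +
YBL100C -
transcr_7135 +
YBL029C-A -
transcr_11317 +
YBL067C -
transcr_25793 +
YAL038W +
transcr_7135 +
YBL029W +
EOF

# list_groups: hypothetical helper; prints each distinct id once, in numeric
# order, followed by every line that comes right after that id in the file.
list_groups() {
    file=$1
    for id in $(sed -n 'p;n' "$file" | awk '{print $1}' | sort -t_ -k2,2n -u); do
        printf '%s\n' "$id"
        # Only odd lines whose first field is exactly the id count as a match.
        awk -v id="$id" 'NR % 2 && $1 == id { if ((getline nxt) > 0) print nxt }' "$file"
    done
}

result=$(list_groups "$input")
printf '%s\n' "$result"
rm -f "$input"
```

Like the original loop, this rereads the file once per id, which is fine for small inputs but slow for large ones.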































              • This is what I got: transcr_45 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_193 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_231 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_282 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. and so on... Do you know what I can do?

                – Lucas Farinazzo Marques
                Jan 18 at 12:17











              • I corrected it. You have to use |grep -v \-- . Or just try the updated code.

                – Zatarra
                Jan 18 at 12:18













              • Even when I copy your solution exactly and use a file with the same name, it doesn't work... It keeps showing the same error

                – Lucas Farinazzo Marques
                Jan 18 at 12:24











              • You can download it from here: wget 54.38.222.163/sort.sh and run it with sh sort.sh . The text must be in a.txt

                – Zatarra
                Jan 18 at 12:28













              • Now it worked, thanks!

                – Lucas Farinazzo Marques
                Jan 18 at 12:45



















              0














              Pre- and postprocessing with awk; this does not assume that a transcr line is followed by just one Y* line; it's also idempotent -- its output could be piped back as input and it will give the same result.



              awk '{print $0~/^transcr/ ? t=$0 : t" "$0}' /tmp/foo | sort -t_ -k2n -k2 -u | awk '{print (NF > 2) ? $3" "$4 : $0}'
              transcr_7135 +
              YBL029C-A -
              YBL029W +
              transcr_11317 +
              YBL067C -
              transcr_20649 +
              YBL100C -
              transcr_25793 +
              YAL037C-B -
              YAL038W +
              YAL039C -
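To illustrate the idempotence claim, here is a small sketch that wraps the pipeline in a function and runs it twice. The function name `regroup`, the variables and the temporary file are assumptions for the example; the first awk step is spelled out with explicit rules instead of the ternary, and `LC_ALL=C` is set only to make the sort order predictable.

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file (hypothetical path).
input=$(mktemp)
cat > "$input" <<'EOF'
transcr_25793 +
YAL039C -
transcr_25793 +
YAL037C-B -
transcr_20649 +
YBL100C -
transcr_7135 +
YBL029C-A -
transcr_11317 +
YBL067C -
transcr_25793 +
YAL038W +
transcr_7135 +
YBL029W +
EOF

# regroup: hypothetical wrapper around the decorate / sort / undecorate steps.
regroup() {
    # Decorate: prefix every non-header line with the header it belongs to.
    awk '/^transcr/ { t = $0; print; next } { print t " " $0 }' "$@" |
        # Sort numerically on the id, then as text; -u drops repeated lines.
        LC_ALL=C sort -t_ -k2n -k2 -u |
        # Undecorate: strip the header prefix from the decorated lines.
        awk '{ if (NF > 2) print $3 " " $4; else print }'
}

once=$(regroup "$input")
twice=$(printf '%s\n' "$once" | regroup)

printf '%s\n' "$once"
[ "$once" = "$twice" ] && echo "second pass left the output unchanged"
rm -f "$input"
```

The second pass works because the first awk step keys on the `transcr` prefix, not on line parity, so a header followed by several lines is handled the same way as a header followed by one.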





























































                edited Jan 18 at 19:59, answered Jan 18 at 12:25 by finswimmer


































                    edited Jan 18 at 13:51, answered Jan 18 at 12:40 by jimmij































                            edited Jan 18 at 13:09, answered Jan 18 at 12:27 by Stéphane Chazelas























                                2














                                for element in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
                                do
                                echo $element; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
                                done


                                Where a.txt is your input. Tested:



                                [root@megatron ~]# cat a.txt
                                transcr_25793 +
                                YAL039C -
                                transcr_25793 +
                                YAL037C-B -
                                transcr_20649 +
                                YBL100C -
                                transcr_7135 +
                                YBL029C-A -
                                transcr_11317 +
                                YBL067C -
                                transcr_25793 +
                                YAL038W +
                                transcr_7135 +
                                YBL029W +
                                [root@megatron ~]# for i in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
                                do
                                echo $i; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
                                done
                                transcr_7135
                                YBL029C-A -
                                YBL029W +
                                transcr_11317
                                YBL067C -
                                transcr_20649
                                YBL100C -
                                transcr_25793
                                YAL039C -
                                YAL037C-B -
                                YAL038W +
                                [root@megatron ~]#





                                share|improve this answer


























                                • It appeared this transcr_45 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_193 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_231 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_282 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. and so on... Do you know what I can do?

                                  – Lucas Farinazzo Marques
                                  Jan 18 at 12:17











                                • I corrected. You have to use |grep -v \-- . Or just try to re-use the updated code.

                                  – Zatarra
                                  Jan 18 at 12:18













                                • Even when exactly copying your solution and even the file with the same name, it doesn't work out... It keeps showing the same error

                                  – Lucas Farinazzo Marques
                                  Jan 18 at 12:24











                                • You can download it from here: wget 54.38.222.163/sort.sh and run it with sh sort.sh . The text must be in a.txt

                                  – Zatarra
                                  Jan 18 at 12:28













                                • Now it worked, thanks!

                                  – Lucas Farinazzo Marques
                                  Jan 18 at 12:45
















                                2














                                for element in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
                                do
                                echo $element; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
                                done


                                Where a.txt is your input. Tested:



                                [root@megatron ~]# cat a.txt
                                transcr_25793 +
                                YAL039C -
                                transcr_25793 +
                                YAL037C-B -
                                transcr_20649 +
                                YBL100C -
                                transcr_7135 +
                                YBL029C-A -
                                transcr_11317 +
                                YBL067C -
                                transcr_25793 +
                                YAL038W +
                                transcr_7135 +
                                YBL029W +
                                [root@megatron ~]# for i in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
                                do
                                echo $i; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
                                done
                                transcr_7135
                                YBL029C-A -
                                YBL029W +
                                transcr_11317
                                YBL067C -
                                transcr_20649
                                YBL100C -
                                transcr_25793
                                YAL039C -
                                YAL037C-B -
                                YAL038W +
                                [root@megatron ~]#





                                share|improve this answer


























                                • It appeared this transcr_45 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_193 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_231 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_282 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. and so on... Do you know what I can do?

                                  – Lucas Farinazzo Marques
                                  Jan 18 at 12:17











                                • I corrected. You have to use |grep -v \-- . Or just try to re-use the updated code.

                                  – Zatarra
                                  Jan 18 at 12:18













                                • Even when exactly copying your solution and even the file with the same name, it doesn't work out... It keeps showing the same error

                                  – Lucas Farinazzo Marques
                                  Jan 18 at 12:24











                                • You can download it from here: wget 54.38.222.163/sort.sh and run it with sh sort.sh . The text must be in a.txt

                                  – Zatarra
                                  Jan 18 at 12:28













                                • Now it worked, thanks!

                                  – Lucas Farinazzo Marques
                                  Jan 18 at 12:45






















                                edited Jan 18 at 13:03 by andcoz

                                answered Jan 18 at 12:00 by Zatarra






































                                Pre- and post-processing with awk. This does not assume that each transcr line is followed by exactly one Y* line. It is also idempotent: its output can be piped back in as input and it gives the same result.



                                awk '{print $0~/^transcr/ ? t=$0 : t" "$0}' /tmp/foo | sort -t_ -k2n -k2 -u | awk '{print (NF > 2) ? $3" "$4 : $0}'
                                transcr_7135 +
                                YBL029C-A -
                                YBL029W +
                                transcr_11317 +
                                YBL067C -
                                transcr_20649 +
                                YBL100C -
                                transcr_25793 +
                                YAL037C-B -
                                YAL038W +
                                YAL039C -
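To make the three stages easier to follow, here is the same pipeline wrapped in a function with one comment per stage. This is a sketch, not a replacement for the one-liner: the name `group_by_transcript` is hypothetical, the awk ternaries are written out as `if`/`else`, and `LC_ALL=C` is an added assumption so that the tie-breaking order does not depend on the locale.

```shell
#!/bin/sh
# group_by_transcript is a hypothetical name for the pipeline above.
#
# Stage 1 (awk):  remember the latest transcr line in t and print it as-is;
#                 prefix every other line with it, giving lines such as
#                 "transcr_7135 + YBL029W +".
# Stage 2 (sort): split on "_", order by the number after it (-k2n), break
#                 ties on the rest of the line (-k2), and drop duplicate
#                 lines (-u). A bare transcr line sorts before its own
#                 prefixed lines because it is a prefix of them.
# Stage 3 (awk):  lines with more than two fields are prefixed Y* lines, so
#                 print only fields 3 and 4; two-field lines are headers.
group_by_transcript() {
    awk '{ if ($0 ~ /^transcr/) { t = $0; print } else print t " " $0 }' "$1" |
        LC_ALL=C sort -t_ -k2n -k2 -u |
        awk '{ if (NF > 2) print $3 " " $4; else print }'
}

# Example call, assuming the data is in /tmp/foo:
# group_by_transcript /tmp/foo
```

Because the Y* lines pass through `sort` together with their transcr prefix, they come out in sorted order within each group rather than in input order, which matches the output shown above.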















                                answered Jan 19 at 8:07 by mosvy





























