How to sort by odd lines then remove repeated values?
I have the following type of file:
transcr_25793 +
YAL039C -
transcr_25793 +
YAL037C-B -
transcr_20649 +
YBL100C -
transcr_7135 +
YBL029C-A -
transcr_11317 +
YBL067C -
transcr_25793 +
YAL038W +
transcr_7135 +
YBL029W +
I was trying to get something like this:
transcr_7135 +
YBL029C-A -
transcr_7135 +
YBL029W +
transcr_11317 +
YBL067C -
transcr_20649 +
YBL100C -
transcr_25793 +
YAL039C -
transcr_25793 +
YAL037C-B -
transcr_25793 +
YAL038W +
Then, afterward, I was looking for something like this:
transcr_7135 +
YBL029C-A -
YBL029W +
transcr_11317 +
YBL067C -
transcr_20649 +
YBL100C -
transcr_25793 +
YAL039C -
YAL037C-B -
YAL038W +
I've looked through the sort manual and some posts, but couldn't find anything close to this, only examples that sort by numerical values to get odd lines...
text-processing sort
asked Jan 18 at 11:42 by Lucas Farinazzo Marques
edited Jan 18 at 13:38 by Rui F Ribeiro
5 Answers
Not exactly the sort order you showed, but maybe it's acceptable as well?
$ cat input.txt | paste - - | sort -k1,1V -k2,2 | tr "\t" "\n" | awk '{if (!($0 in line)) {line[$0]; print}}'
transcr_7135 +
YBL029C-A -
YBL029W +
transcr_11317 +
YBL067C -
transcr_20649 +
YBL100C -
transcr_25793 +
YAL037C-B -
YAL038W +
YAL039C -
EDIT:
Inserting a line number and using it as a secondary sort key should produce exactly the output you want:
$ cat input.txt | paste - - | nl | sort -k2,2V -k1,1g | cut -f2- | tr "\t" "\n" | awk '{if (!($0 in line)) {line[$0]; print}}'
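To make the individual stages easier to follow, here is the same idea as a commented sketch. It assumes GNU sort (for the V key flag) and strictly alternating header/gene lines; the function name sort_pairs_keep_order is only an illustration. Unlike the one-liner above, the last stage drops only repeated header lines, so a gene line that shows up under two different transcripts is not lost.

```shell
# Hypothetical helper; reads two-line records (header, gene) on stdin.
sort_pairs_keep_order() {
    paste - - |              # join each header/gene pair with a TAB
    nl |                     # prefix each pair with its input position
    sort -k2,2V -k1,1n |     # transcript in version order, then input position
    cut -f2- |               # drop the position again
    tr '\t' '\n' |           # split every pair back into two lines
    # odd lines are headers: print one only when it differs from the previous header
    awk 'NR % 2 { if ($0 != last) print; last = $0; next } { print }'
}
```

It would be called as sort_pairs_keep_order < input.txt.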
answered Jan 18 at 12:25 by finswimmer
edited Jan 18 at 19:59
It doesn't show exactly the way I posted, but for my case it doesn't matter, thanks a lot!
– Lucas Farinazzo Marques
Jan 18 at 12:32
Pure gawk solution:
awk -F_ 'NR%2{i=$2;next}{a[i]=a[i]"\n"$0}
END{PROCINFO["sorted_in"]="@ind_num_asc";
for(i in a) printf "%s","transcr_"i""a[i]"\n"}' file
The trick is to sort the indexes of array a numerically with a little help from gawk's PROCINFO special array.
transcr_7135
YBL029C-A -
YBL029W +
transcr_11317
YBL067C -
transcr_20649
YBL100C -
transcr_25793
YAL039C -
YAL037C-B -
YAL038W +
BTW, it's a pity awk doesn't offer an option to sort naturally, a.k.a. version sort (ordering text that contains numbers).
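Where only a POSIX awk (mawk, for example) is installed, PROCINFO["sorted_in"] is not available. Below is a minimal sketch of the same grouping idea under that constraint. It assumes the "transcr_NUMBER SIGN" header format from the question; the function name group_by_transcript and the hand-written insertion sort are my own illustration, not part of the answer above. It prints each header exactly as it appeared in the input.

```shell
# Hypothetical helper; reads two-line records (header, gene) on stdin.
group_by_transcript() {
    awk '
    NR % 2 {                          # odd line: transcript header
        header = $0
        if (!(header in seen)) {      # remember each distinct header once
            seen[header] = 1
            n++
            order[n] = header
            split(header, parts, /[_ ]/)
            num[n] = parts[2] + 0     # numeric id after "transcr_"
        }
        next
    }
    { genes[header] = genes[header] "\n" $0 }   # even line: append to its group
    END {
        # stable insertion sort of the headers by numeric id
        for (i = 2; i <= n; i++) {
            h = order[i]; k = num[i]
            for (j = i - 1; j >= 1 && num[j] > k; j--) {
                order[j + 1] = order[j]
                num[j + 1] = num[j]
            }
            order[j + 1] = h
            num[j + 1] = k
        }
        for (i = 1; i <= n; i++)
            printf "%s%s\n", order[i], genes[order[i]]
    }'
}
```

Insertion sort is quadratic in the number of distinct transcripts, so for very large inputs piping through sort is probably the better choice.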
answered Jan 18 at 12:40 by jimmij
edited Jan 18 at 13:51
With GNU sort and assuming the lines don't contain TAB characters:
paste - - < file | sort -V | tr '\t' '\n' | awk '!seen[$0]++'
Or sort -t$'\t' -sk1,1V to preserve the original order for entries with identical odd lines like in your expected output.
If you don't have GNU sort, and assuming the odd lines always follow that pattern, you can replace sort -V with sort -k1.9n.
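As a sketch of the stable variant spelled out in full: the function name dedupe_sorted_pairs is hypothetical, GNU sort is assumed, and the TAB is produced with printf so the snippet does not depend on the $'\t' quoting, which plain sh may not support.

```shell
# Hypothetical helper; reads two-line records (header, gene) on stdin.
dedupe_sorted_pairs() {
    tab=$(printf '\t')                # literal TAB, works in plain sh too
    paste - - |                       # one header/gene pair per line
    sort -t "$tab" -s -k1,1V |        # version-sort on the header only; -s keeps input order for ties
    tr '\t' '\n' |                    # back to one item per line
    awk '!seen[$0]++'                 # print each distinct line only once
}
```

Because the sort key stops at the first TAB and -s disables the last-resort whole-line comparison, the gene lines under one transcript stay in their original order.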
answered Jan 18 at 12:27 by Stéphane Chazelas
edited Jan 18 at 13:09
for i in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
do
echo "$i"; cat a.txt |grep -A1 "$i" |grep -v trans |grep -v \\--
done
Where a.txt is your input. Tested:
[root@megatron ~]# cat a.txt
transcr_25793 +
YAL039C -
transcr_25793 +
YAL037C-B -
transcr_20649 +
YBL100C -
transcr_7135 +
YBL029C-A -
transcr_11317 +
YBL067C -
transcr_25793 +
YAL038W +
transcr_7135 +
YBL029W +
[root@megatron ~]# for i in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
do
echo $i; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
done
transcr_7135
YBL029C-A -
YBL029W +
transcr_11317
YBL067C -
transcr_20649
YBL100C -
transcr_25793
YAL039C -
YAL037C-B -
YAL038W +
[root@megatron ~]#
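One thing to keep in mind with the loop above: grep -A1 matches substrings, so transcr_7135 would also pick up a hypothetical transcr_71350. Below is a sketch of a stricter variant, assuming a grep that supports -A (GNU and BSD grep do); the function name list_genes_per_transcript is only an illustration. It matches the whole header line as a fixed string and prints the header with its sign.

```shell
# Hypothetical helper; takes the input file name as its only argument.
list_genes_per_transcript() {
    file=$1
    # odd lines only, sorted numerically on the id after "transcr_", duplicates removed
    sed -n 'p;n' "$file" | sort -k1.9n | uniq |
    while IFS= read -r header; do
        printf '%s\n' "$header"
        # -x: match the whole line, -F: fixed string, -A1: also print the following line
        grep -x -F -A1 -e "$header" "$file" |
            grep -v -e '^transcr_' -e '^--$'   # drop headers and grep group separators
    done
}
```

Usage would be list_genes_per_transcript a.txt. Like the original loop, it rereads the file once per transcript, so it is best suited to small inputs.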
answered Jan 18 at 12:00 by Zatarra
edited Jan 18 at 13:03 by andcoz
This is what appeared:
transcr_45 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_193 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_231 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_282 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information.
and so on... Do you know what I can do?
– Lucas Farinazzo Marques
Jan 18 at 12:17
I corrected it. You have to use |grep -v \-- . Or just try re-using the updated code.
– Zatarra
Jan 18 at 12:18
Even when copying your solution exactly and using the same file name, it doesn't work... It keeps showing the same error.
– Lucas Farinazzo Marques
Jan 18 at 12:24
You can download it from here: wget 54.38.222.163/sort.sh and run it with sh sort.sh . The text must be in a.txt
– Zatarra
Jan 18 at 12:28
Now it worked, thanks!
– Lucas Farinazzo Marques
Jan 18 at 12:45
Pre- and postprocessing with awk; this does not assume that a transcr line is followed by just one Y* line. It's also idempotent: its output can be piped back in as input and will give the same result.
awk '{print $0~/^transcr/ ? t=$0 : t" "$0}' /tmp/foo | sort -t_ -k2n -k2 -u | awk '{print (NF > 2) ? $3" "$4 : $0}'
transcr_7135 +
YBL029C-A -
YBL029W +
transcr_11317 +
YBL067C -
transcr_20649 +
YBL100C -
transcr_25793 +
YAL037C-B -
YAL038W +
YAL039C -
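The idempotency claim is easy to check by wrapping the pipeline in a function and feeding its output back in. This is a sketch that reads stdin instead of /tmp/foo; the function name group_sorted and the LC_ALL=C setting (added so the ordering does not depend on the locale) are my own assumptions.

```shell
# Hypothetical helper; reads header and gene lines on stdin.
group_sorted() {
    # prefix every gene line with the most recent header; headers pass through unchanged
    awk '{ print (($0 ~ /^transcr/) ? (t = $0) : (t " " $0)) }' |
    # numeric sort on the id after "_", then plain text to order the genes; -u drops repeats
    LC_ALL=C sort -t_ -k2n -k2 -u |
    # strip the header prefix from the gene lines again
    awk '{ print ((NF > 2) ? $3 " " $4 : $0) }'
}

# Hypothetical check: succeeds when a second pass leaves the output unchanged.
is_idempotent() {
    once=$(group_sorted < "$1")
    twice=$(printf '%s\n' "$once" | group_sorted)
    [ "$once" = "$twice" ]
}
```

is_idempotent /tmp/foo returns success when running the pipeline a second time changes nothing.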
answered Jan 19 at 8:07 by mosvy