I'm trying to find out whether each range in range1 (columns a and b) is a subset of, i.e. lies entirely within, a range in range2 (columns b and c).
range1
a b
15 20
8 10
37 44
32 37
range2
a b c
chr1 6 12
chr2 13 21
chr3 31 35
chr4 36 45
Expected output:
a b c
chr1 6 12 8 10
chr2 13 21 15 20
chr4 36 45 37 44
I tried to learn from this code, which works when checking whether a single number lies within a specific range, and modified it to handle both numbers. It did not work; I suspect I'm not reading the second file properly.
I want to compare range1[a] with range2[b] and range1[b] with range2[c], as a one-to-all comparison.
For example, in the first run the first row of range1 is compared with every row of range2. However, range1[a] should be compared only with range2[b], and range1[b] only with range2[c]. Based on this, I wrote the following criterion:
lbsf1[j] >= lbs[i] && lbsf1[j] <= ubs[i] && ubsf1[j] >= lbs[i] && ubsf1[j] <= ubs[i]
r1[a]   r2[b]   r1[b]   r2[c]
15   >  6       20   <  12      False
15   >  13      20   <  21      True
15   >  31      20   <  35      False
15   >  36      20   <  45      False
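The table above can be reproduced with a few lines of awk. This is only a minimal sketch: `check_row` is a hypothetical helper name, the range2 rows are inlined rather than read from the file, and the comparisons are inclusive (`>=`, `<=`) as in the criterion. It also assumes every range has its start no larger than its end, so two comparisons are enough.

```shell
#!/bin/bash
# Hypothetical helper: test one range1 row (start, end) against each
# range2 row from the question and report True/False per row.
# The range2 data is inlined here purely for illustration.
check_row() {
  printf 'chr1 6 12\nchr2 13 21\nchr3 31 35\nchr4 36 45\n' |
  awk -v a="$1" -v b="$2" '
    {
      # contained when start >= lower bound and end <= upper bound
      inside = (a >= $2 && b <= $3) ? "True" : "False"
      print a " >= " $2, b " <= " $3, inside
    }
  '
}

check_row 15 20
```

For the row (15, 20) only the chr2 line comes out as True, which matches the table.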
Code (based on the reference, slightly modified):
#!/bin/bash
awk -F'\t' '
  # 1st pass (fileB): read the lower and upper range bounds
  FNR==NR { lbs[++count] = $2+0; ubs[count] = $3+0; next }
  # 2nd pass (fileA): check each line against all ranges.
  { lbsf1[++countf1] = $1+0; ubsf1[countf1] = $2+0; next }
  {
    for (i = 1; i <= count; ++i)
    {
      for (j = 1; j <= countf1; ++j)
        if (lbsf1[j] >= lbs[i] && lbsf1[j] <= ubs[i] && ubsf1[j] >= lbs[i] && ubsf1[j] <= ubs[i])
          { print lbs[i]"\t"ubs[i]"\t"lbsf1[j]"\t"ubsf1[j] ; next }
    }
  }
' range2 range1
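One likely reason the script prints nothing: the second block ends in `next` for every range1 line, so the third block with the loops is never reached. A possible restructuring is sketched below, under several assumptions that may not hold for the real data: both files are tab-separated, each starts with one header line, and every range has start <= end (so two comparisons suffice). The function name `contained_ranges` and the array names `lo`/`hi` are made up for this illustration. Note that the file order is swapped relative to the script above: range1 is stored first and range2 is streamed, so the output follows range2's order.

```shell
#!/bin/bash
# Hypothetical helper: print each range2 row followed by every range1
# interval that lies entirely inside it.
# Assumes tab-separated input with one header line per file.
contained_ranges() {
  awk -F'\t' -v OFS='\t' '
    FNR == 1 { next }                     # skip the header line of each file
    # 1st file (range1): remember every interval
    FNR == NR { lo[++n] = $1 + 0; hi[n] = $2 + 0; next }
    # 2nd file (range2): test all stored intervals against this row
    {
      for (j = 1; j <= n; ++j)
        if (lo[j] >= $2 + 0 && hi[j] <= $3 + 0)
          print $1, $2, $3, lo[j], hi[j]
    }
  ' "$1" "$2"
}

# Demo with the sample data from the question, written to a temp directory
tmp=$(mktemp -d)
printf 'a\tb\n15\t20\n8\t10\n37\t44\n32\t37\n' > "$tmp/range1"
printf 'a\tb\tc\nchr1\t6\t12\nchr2\t13\t21\nchr3\t31\t35\nchr4\t36\t45\n' > "$tmp/range2"
contained_ranges "$tmp/range1" "$tmp/range2"
```

With the sample data this prints the chr1, chr2 and chr4 rows from the expected output (tab-separated, without the header line).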
Thank you.