Read a file line by line; if a condition is met, keep reading until the next condition

Question

I have a file, foo.txt:

test
qwe
asd
xca
asdfarrf
sxcad
asdfa
sdca
dac
dacqa
ea
sdcv
asgfa
sdcv
ewq
qwe
a
df
fa
vas
fg
fasdf
eqw
qwe
aefawasd
adfae
asdfwe
asdf
era
fbn
tsgnjd
nuydid
hyhnydf
gby
asfga
dsg
eqw
qwe
rtargt
raga
adfgasgaa
asgarhsdtj
shyjuysy
sdgh
jstht
ewq
sdtjstsa
sdghysdmks
aadfbgns,
asfhytewat
bafg
q4t
qwe
asfdg5ab
fgshtsadtyh
wafbvg
nasfga
ghafg
ewq
qwe
afghta
asg56ang
adfg643
5aasdfgr5
asdfg
fdagh5t
ewq

I want to print all the lines between qwe and ewq to a separate file. This is what I have so far:

#!/bin/bash

filename="foo.txt"

#While loop to read line by line
while read -r line
do
    readLine=$line
    #If the line starts with ST then echo the line
    if [[ $readLine = qwe* ]] ; then
        echo "$readLine"
        read line
        readLine=$line
        if [[ $readLine = ewq* ]] ; then
            echo "$readLine"
        fi
    fi
done < "$filename"

Question edited by Gilles 'SO- stop being evil' on 18 Jan 2016 at 11:56

Unix & Linux

Tags: text-processing, bash, shell-script, read

Answers

Wildcard

18 Jan 2016 at 10:33

As @Costas pointed out, the right tool for this job is sed:

sed '/qwe/,/ewq/ w other.file' foo.txt

The lines to be written may also need other processing. That's fine; just do it like this:

sed -e '/qwe/,/ewq/{w other.file' -e 'other processing;}' foo.txt

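(Of course, "other processing" is not a real sed command.) Use the pattern above if the processing should happen after the line is written. If you want to do some other processing and then write the modified version of the line (which seems more likely), you would use: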

sed -e '/qwe/,/ewq/{processing;w other.file' -e '}' foo.txt

(Note that the "}" has to go in its own argument; otherwise it would be interpreted as part of the "other.file" name.)
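
As a concrete sketch of that last pattern, the placeholder "processing" could be a simple substitution. The helper name extract_range, the "> " prefix, the sample input, and the -n flag (which keeps sed from also echoing the whole input to stdout) are assumptions for illustration, not part of the command above:

```shell
#!/bin/sh
# extract_range INPUT OUTPUT  (hypothetical helper name)
# Writes the qwe..ewq lines of INPUT to OUTPUT, each prefixed with "> ".
extract_range() {
    # -n suppresses sed's default printing of every input line.
    # The closing brace is passed as a separate -e argument so that it
    # is not swallowed into the file name given to the w command.
    sed -n -e '/qwe/,/ewq/{s/^/> /;w '"$2" -e '}' "$1"
}

# Demo on a small made-up input rather than the real foo.txt.
tmpdir=$(mktemp -d)
printf '%s\n' test qwe asd ewq skipped qwe a ewq > "$tmpdir/foo.txt"
extract_range "$tmpdir/foo.txt" "$tmpdir/other.file"
cat "$tmpdir/other.file"
```

Because w takes the rest of its line as the file name, the output path must not contain a newline.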

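You (the OP) haven't said what "other processing" you want to do on these lines, or I could be more specific. Whatever it is, you can almost certainly do it in sed; if that gets too cumbersome, you can do it in awk with very little change to the code above: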

awk '/qwe/,/ewq/ { print > "other.file" }' foo.txt

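Then you have the full power of the awk programming language available to process the lines before the print statement runs. And of course awk (like sed) is designed for text processing, unlike bash.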
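
For instance, a minimal sketch of processing each line in awk before writing it might look like this. The upper-casing step, the helper name upper_range, and the sample input are assumptions chosen only to have something visible to check:

```shell
#!/bin/sh
# upper_range INPUT OUTPUT  (hypothetical helper name)
# Writes the qwe..ewq lines of INPUT to OUTPUT, converted to upper case.
upper_range() {
    # The output path is passed in as the awk variable "out".
    awk -v out="$2" '/qwe/,/ewq/ { print toupper($0) > out }' "$1"
}

# Demo on a small made-up input rather than the real foo.txt.
tmpdir=$(mktemp -d)
printf '%s\n' test qwe asd ewq skipped > "$tmpdir/foo.txt"
upper_range "$tmpdir/foo.txt" "$tmpdir/other.file"
cat "$tmpdir/other.file"
```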

mikeserv

18 Jan 2016 at 11:57

qwe(){ printf %s\\n "$1"; }
ewq(){ :; }
IFS=   ### prep  the  loop, only IFS= once
while  read -r  in
do     case $in in
       (qwe|ewq)
           set "$in"
       ;;
       ("$processing"?)
           "$process"
       esac
       "$1" "$in"
done

That is a very slow way to do it. With a GNU grep and a regular infile:

IFS=
while grep  -xm1 qwe
do    while read  -r  in  &&
            [ ewq != "$in" ]
      do    printf %s\\n "$in"
            : some processing
      done
done <infile

...which optimizes away at least half of the inefficient reads...
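
The loop above depends on documented GNU grep behavior: with -m1 it stops at the first match and, when standard input is a regular file, leaves the file offset just after the matching line, so the following read continues from there. A small sketch that wraps the same loop in a function (the name between_grep and the sample input are assumptions) makes this easy to try:

```shell
#!/bin/sh
# between_grep FILE  (hypothetical helper name)
# Requires GNU grep and a regular (seekable) FILE.
between_grep() {
    # grep prints the first whole-line "qwe" it finds and stops there;
    # the inner loop then prints lines until it reads "ewq".
    while grep -xm1 qwe; do
        while IFS= read -r line && [ ewq != "$line" ]; do
            printf '%s\n' "$line"
        done
    done < "$1"
}

# Demo on a small made-up input.
tmpdir=$(mktemp -d)
printf '%s\n' x qwe 1 ewq y qwe 2 ewq > "$tmpdir/infile"
between_grep "$tmpdir/infile"
```

As in the original loop, the qwe lines are printed (by grep) and the ewq lines are not.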

sed  -ne '/^qwe$/,/^ewq$/H;$!{/^qwe$/!d;}' \
      -e "x;s/'"'/&\\&&/g;s/\n/'"' '/g"    \
      -e "s/\(.*\) .e.*/p '\1/p" <input    |
sh    -c 'p(){  printf %s\\n "$@"
                for l do : process "$l"
                done
          }; . /dev/fd/0'

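For most sh implementations this avoids the inefficiency of read altogether, although it has to print its output twice: once quoted, for sh, and once unquoted, to stdout. It works differently because in most implementations the . command tends to read its input by block rather than by byte. Even so, it drops the ewq-to-qwe stretches entirely and works with streamed input, such as a FIFO.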

qwe
asd
xca
asdfarrf
sxcad
asdfa
sdca
dac
dacqa
ea
sdcv
asgfa
sdcv
qwe
a
df
fa
vas
fg
fasdf
qwe
aefawasd
adfae
asdfwe
asdf
era
fbn
tsgnjd
nuydid
hyhnydf
gby
asfga
dsg
qwe
rtargt
raga
adfgasgaa
asgarhsdtj
shyjuysy
sdgh
jstht
qwe
asfdg5ab
fgshtsadtyh
wafbvg
nasfga
ghafg
qwe
afghta
asg56ang
adfg643
5aasdfgr5
asdfg
fdagh5t

Answer edited by mikeserv on 19 Jan 2016 at 8:47

Anonymous user · Accepted Answer · 2016-01-18T19:55:54+00:00

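You need to make a few changes to your script (in no particular order):

Use IFS= before read to avoid stripping leading and trailing whitespace.
Since $line is not changed anywhere, the variable readLine is not needed.
Do not use read in the middle of the loop!
Use a boolean variable to control printing.
Make the start and end of printing explicit.

With these changes, the script becomes: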

#!/bin/bash

filename="foo.txt"

#While loop to read line by line
while IFS= read -r line; do
    # If the line starts with qwe, set the flag to yes.
    if [[ $line == qwe* ]] ; then
        printline="yes"
        # Just to make the start of each block very clear; remove in real use.
        echo "----------------------->>"
    fi
    # If variable is yes, print the line.
    if [[ $printline == "yes" ]] ; then
        echo "$line"
    fi
    # If the line starts with ewq, set the flag to no.
    if [[ $line == ewq* ]] ; then
        printline="no"
        # Just to make each line end very clear, remove in use.
        echo "----------------------------<<"
    fi
done < "$filename"

That can be condensed to:

#!/bin/bash
filename="foo.txt"
while IFS= read -r line; do
    [[ $line == qwe* ]]       && printline="yes"
    [[ $printline == "yes" ]] && echo "$line"
    [[ $line == ewq* ]]       && printline="no"
done < "$filename"

This prints the start and end lines as well (inclusive).
If you don't need them printed, swap the start and end tests:

#!/bin/bash
filename="foo.txt"
while IFS= read -r line; do
    [[ $line == ewq* ]]       && printline="no"
    [[ $printline == "yes" ]] && echo "$line"
    [[ $line == qwe* ]]       && printline="yes"
done < "$filename"
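
The same flag technique can also be sketched without bash-only syntax. The version below is an assumption-laden variant, not the answer's own script: it uses case instead of [[ ]] so it runs under plain sh, initializes the flag explicitly, and prints with printf so that lines such as "-n" or ones containing backslashes are not altered by echo. The function name print_between and the sample input are hypothetical:

```shell
#!/bin/sh
# print_between FILE  (hypothetical helper name)
# Prints the lines strictly between a qwe* line and the next ewq* line.
print_between() {
    printline=no
    while IFS= read -r line; do
        # Same order as the exclusive script: end test, print, start test.
        case $line in ewq*) printline=no ;; esac
        if [ "$printline" = yes ]; then
            printf '%s\n' "$line"
        fi
        case $line in qwe*) printline=yes ;; esac
    done < "$1"
}

# Demo on a small made-up input.
tmpdir=$(mktemp -d)
printf '%s\n' test qwe asd xca ewq skipped qwe a ewq > "$tmpdir/foo.txt"
print_between "$tmpdir/foo.txt"
```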

However, it would be better to use readarray and loop over the array elements (if you have bash 4.0 or later):

#!/bin/bash
filename="infile"

readarray -t lines < "$filename"

for line in "${lines[@]}"; do
    [[ $line == ewq* ]]       && printline="no"
    [[ $printline == "yes" ]] && echo "$line"
    [[ $line == qwe* ]]       && printline="yes"
done

This avoids most of the problems of using read.

Of course, you can use the recommended sed line (from the comments; thanks, @costas) to get only the lines that need processing:

#!/bin/bash
filename="foo.txt"

readarray -t lines <<< "$(sed -n '/^qwe.*/,/^ewq.*/p' "$filename")"

for line in "${lines[@]}"; do

     : # Do all your additional processing here, with a clean input.

done
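
One detail worth knowing about the here-string form: <<< always appends a newline, so when sed prints nothing the array ends up with a single empty element and the loop runs once. A possible alternative, sketched here under the assumption of bash 4 or later, feeds sed to readarray through process substitution instead. The function name process_range and the "index: line" output are placeholders for the real processing:

```shell
#!/bin/bash
# process_range FILE  (hypothetical helper name; needs bash 4+)
process_range() {
    local lines i
    # With process substitution, empty sed output yields an empty array,
    # so the loop body is skipped when there is no qwe..ewq block.
    readarray -t lines < <(sed -n '/^qwe/,/^ewq/p' "$1")
    for i in "${!lines[@]}"; do
        # Placeholder processing: print each line with its array index.
        printf '%d: %s\n' "$i" "${lines[i]}"
    done
}

# Demo on a small made-up input.
tmpdir=$(mktemp -d)
printf '%s\n' test qwe a ewq skipped > "$tmpdir/foo.txt"
process_range "$tmpdir/foo.txt"
```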