While running a PHP script under Hadoop today, I kept hitting a very strange problem:

10/05/20 15:33:41 INFO streaming.StreamJob: map 100% reduce 100%
10/05/20 15:33:41 INFO streaming.StreamJob: To kill this job, run:
10/05/20 15:33:41 INFO streaming.StreamJob: /usr/local/hadoop-0.20.2/bin/../bin/hadoop job -Dmapred.job.tracker=tj1cschsvc0001:9001 -kill job_201004301826_0595
10/05/20 15:33:41 INFO streaming.StreamJob: Tracking URL: http://tj1cschsvc0001:50030/jobdetails.jsp?jobid=job_201004301826_0595
10/05/20 15:33:41 ERROR streaming.StreamJob: Job not Successful!
10/05/20 15:33:41 INFO streaming.StreamJob: killJob…
Streaming Job Failed!

The map and reduce phases had clearly reached 100%, yet the job was reported as "Job not Successful".
Checking the logs, I found the following error in syslog:
2010-05-20 15:32:38,494 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
2010-05-20 15:32:38,495 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=10/0/0 in:NA [rec/s] out:NA [rec/s]
2010-05-20 15:32:38,496 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=100/0/0 in:NA [rec/s] out:NA [rec/s]
2010-05-20 15:32:38,508 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=1000/0/0 in:NA [rec/s] out:NA [rec/s]
2010-05-20 15:32:38,558 INFO org.apache.hadoop.streaming.PipeMapRed: MROutputThread done
2010-05-20 15:32:38,558 INFO org.apache.hadoop.streaming.PipeMapRed: MRErrorThread done
2010-05-20 15:32:38,562 INFO org.apache.hadoop.streaming.PipeMapRed: mapRedFinished
2010-05-20 15:32:38,567 INFO org.apache.hadoop.streaming.PipeMapRed: mapRedFinished
2010-05-20 15:32:38,581 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.IOException: subprocess still running
R/W/S=4851/0/0 in:NA [rec/s] out:NA [rec/s]
minRecWrittenToEnableSkip_=9223372036854775807 LOGNAME=null
HOST=null
USER=ms
HADOOP_USER=null
last Hadoop input: |null|
last tool output: |null|
Date: Thu May 20 15:32:38 CST 2010
Broken pipe
at org.apache.hadoop.streaming.PipeReducer.reduce(PipeReducer.java:131)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
2010-05-20 15:32:38,586 INFO org.apache.hadoop.mapred.TaskRunner: Runnning cleanup for the task

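Moreover, not every task failed: after several runs I found that only one of the 4 reducer processes had this problem.
I first checked the reducer script, and the code was identical on every server. The task retries show the same thing: the failing task was attempted on two different servers, and other tasks on one of those servers completed successfully.
I then changed the reducer so that it only read its input and echoed it back out, and the symptom was the same.
So my preliminary conclusion was that the reducer script was not the problem. Could it be related to the input, then? I changed the reducer to /bin/cat and found that, of the 4 part files, one began with an empty line and the others did not.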
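
The check described above, looking for a part file that begins with an empty line, can also be scripted. The following is a minimal sketch in Python (not the PHP used for the reducer), assuming the part files have already been copied to a local directory. The function names and the "part-*" pattern are illustrative choices, not something taken from the original job.

```python
import glob
import os


def starts_with_blank_line(path):
    """Return True if the file's first line is empty or whitespace only."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        first = f.readline()
    # readline() returns "" for an empty file and "\n" for a blank first line,
    # so an empty file is deliberately NOT reported as suspect.
    return first != "" and first.strip() == ""


def find_suspect_files(directory, pattern="part-*"):
    """List files matching the pattern whose first line is blank."""
    paths = glob.glob(os.path.join(directory, pattern))
    return sorted(
        p for p in paths if os.path.isfile(p) and starts_with_blank_line(p)
    )
```

Calling find_suspect_files("output_dir") on the local copy would return the paths of any part files whose first line is blank, which saves opening each file by hand.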

My guess was that this empty line was the culprit! After deleting it, I ran the MapReduce job again, and it succeeded!

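But what is the actual cause? With my reducer script, an empty line makes the job fail, while with cat as the reducer it runs normally. So the reducer script must be handling empty lines inadequately after all! The strange part is that when I run the reducer script directly, with a few empty lines from echo as its input, there is no error either!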
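
The post does not pin down the cause, so what follows is only one possible explanation, not a confirmed diagnosis. If the reducer's read loop treats an empty line as the end of input, the script exits cleanly at the blank first line, before it has consumed the rest of its records. Run by hand, such a script shows no error. Under streaming, the framework is still writing records into a pipe whose reader has gone away, which would be consistent with the "Broken pipe" message and with the zero written records in R/W/S=4851/0/0 in the log above. The sketch below illustrates the pattern in Python rather than PHP. Both function names are hypothetical, and the real reducer's loop is not shown in the post.

```python
import io


def fragile_reducer(stream):
    """Hypothetical buggy loop: a blank line is mistaken for end of input."""
    seen = []
    while True:
        line = stream.readline().strip()
        if not line:  # true at EOF, but also true for a blank line
            break
        seen.append(line)
    return seen


def robust_reducer(stream):
    """Reads until real EOF and simply skips blank lines."""
    seen = []
    for raw in stream:  # iteration stops only when the stream is exhausted
        line = raw.rstrip("\n")
        if line == "":
            continue
        seen.append(line)
    return seen


# Input shaped like the problematic part file: a blank line, then records.
sample = "\nk1\t1\nk2\t2\n"
print(fragile_reducer(io.StringIO(sample)))  # stops immediately: []
print(robust_reducer(io.StringIO(sample)))   # keeps both records
```

Under this assumption, cat works as a reducer because it keeps reading until end of file regardless of what the lines contain.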

P.S. The empty line was there because I wanted to empty a file and used echo > filename. Next time I will just use rm!
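
The reason echo > filename leaves a blank line behind is that echo with no arguments still prints a newline, so the file ends up one byte long instead of empty. In the shell, a bare redirection such as : > filename truncates the file without writing anything. The same difference is sketched below in Python. The two helper names are illustrative only.

```python
import os


def clear_like_echo(path):
    """Mimic `echo > path`: the file ends up holding a single newline."""
    with open(path, "wb") as f:
        f.write(b"\n")


def truncate(path):
    """Really empty the file: opening for writing truncates it to 0 bytes."""
    open(path, "wb").close()


def size_after(action, path):
    """Apply one of the helpers above and report the resulting file size."""
    action(path)
    return os.path.getsize(path)
```

Here size_after(clear_like_echo, path) gives 1 and size_after(truncate, path) gives 0. That one leftover byte is the empty line that later turned up at the start of the input.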
