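This article explains how to use distributedshell, a very simple example application that ships with YARN. It can be thought of as the "hello world" of YARN programming; its main function is to run a user-supplied shell command or shell script in parallel.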

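  • Run parameters

The basic run parameters of DistributedShell are as follows:

  • How to run

DistributedShell is run as follows:

From the YARN installation directory, execute the following command: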

bin/hadoop jar \
  share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.0.0-cdh4.1.1.jar \
  org.apache.hadoop.yarn.applications.distributedshell.Client \
  --jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.0.0-cdh4.1.1.jar \
  --shell_command ls \
  --shell_script ignore.sh \
  --num_containers 10 \
  --container_memory 350 \
  --master_memory 350 \
  --priority 10
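
The command above passes a file named ignore.sh through --shell_script, so that file has to exist before the job is submitted. A minimal sketch for creating it, assuming it holds only the ls command as in this example; the helper name write_shell_script is hypothetical and not part of YARN:

```shell
# Hypothetical helper: write a one-line shell script that runs "ls".
# The target path defaults to ignore.sh, the name used in the command above.
write_shell_script() {
    target="${1:-ignore.sh}"
    # Two lines: a shebang, then the command each container should run.
    printf '%s\n' '#!/bin/sh' 'ls' > "$target"
    chmod +x "$target"
}
```

Calling `write_shell_script ignore.sh` from the YARN installation directory would place the script where the command above expects to find it.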

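Note that DistributedShell has bugs in Apache Hadoop releases earlier than hadoop-2.0.3-alpha (not including that release) and in CDH releases up to and including CDH 4.1.2. Specifically:

  1. The --shell_command parameter must be specified.
  2. When only --shell_command is given, without --shell_script, the job fails in fully distributed mode (it works in pseudo-distributed mode). For details and the fix, see https://issues.apache.org/jira/browse/YARN-253. In this example, ignore.sh contains just "ls".
  3. The memory settings must be correct; otherwise an error like the following is reported: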
Container [pid=4424,containerID=container_1359629844156_0004_01_000001] is running beyond virtual memory limits. Current usage: 90.1mb of 128.0mb physical memory used; 593.0mb of 268.8mb virtual memory used. Killing container.
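
The numbers in this error message can be checked by hand. The sketch below assumes the NodeManager derives the virtual memory ceiling by multiplying the container's physical memory allocation by the yarn.nodemanager.vmem-pmem-ratio setting, and that the ratio is 2.1 on this cluster (an assumption, since the cluster configuration is not shown here). The function name vmem_limit_mb is hypothetical.

```shell
# Hypothetical helper: virtual memory ceiling (MB) for a container,
# computed as physical allocation (MB) x vmem/pmem ratio.
# The ratio defaults to 2.1, assumed here to match
# yarn.nodemanager.vmem-pmem-ratio on the cluster.
vmem_limit_mb() {
    awk -v pmem="$1" -v ratio="${2:-2.1}" \
        'BEGIN { printf "%.1f\n", pmem * ratio }'
}

vmem_limit_mb 128   # prints 268.8, the ceiling quoted in the error above
```

Under that assumption, the container was allocated 128 MB of physical memory, so its ceiling was 268.8 MB of virtual memory; the process used 593.0 MB and was killed. Requesting more container memory raises the ceiling proportionally.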

[Appendix] DistributedShell execution log:

13/02/01 13:43:11 INFO distributedshell.Client: Initializing Client
13/02/01 13:43:11 INFO distributedshell.Client: Starting Client
13/02/01 13:43:11 INFO distributedshell.Client: Connecting to ResourceManager at c2-23/10.1.1.98:8032
13/02/01 13:43:12 INFO distributedshell.Client: Got Cluster metric info from ASM, numNodeManagers=3
13/02/01 13:43:12 INFO distributedshell.Client: Got Cluster node info from ASM
13/02/01 13:43:12 INFO distributedshell.Client: Got node report from ASM for, nodeId=c2-23:36594, nodeAddressc2-23:8042, nodeRackName/default-rack, nodeNumContainers0, nodeHealthStatusis_node_healthy: true, health_report: "", last_health_report_time: 1359697377337, 
13/02/01 13:43:12 INFO distributedshell.Client: Got node report from ASM for, nodeId=c2-25:41070, nodeAddressc2-25:8042, nodeRackName/default-rack, nodeNumContainers0, nodeHealthStatusis_node_healthy: true, health_report: "", last_health_report_time: 1359697367180, 
13/02/01 13:43:12 INFO distributedshell.Client: Got node report from ASM for, nodeId=c2-24:48383, nodeAddressc2-24:8042, nodeRackName/default-rack, nodeNumContainers0, nodeHealthStatusis_node_healthy: true, health_report: "", last_health_report_time: 1359699033102, 
13/02/01 13:43:12 INFO distributedshell.Client: Queue info, queueName=default, queueCurrentCapacity=0.0, queueMaxCapacity=1.0, queueApplicationCount=0, queueChildQueueCount=0
13/02/01 13:43:12 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=SUBMIT_APPLICATIONS
13/02/01 13:43:12 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=ADMINISTER_QUEUE
13/02/01 13:43:12 INFO distributedshell.Client: Got new application id=application_1359695803957_0003
13/02/01 13:43:12 INFO distributedshell.Client: Min mem capabililty of resources in this cluster 128
13/02/01 13:43:12 INFO distributedshell.Client: Max mem capabililty of resources in this cluster 10240
13/02/01 13:43:12 INFO distributedshell.Client: Setting up application submission context for ASM
13/02/01 13:43:12 INFO distributedshell.Client: Copy App Master jar from local filesystem and add to local environment
13/02/01 13:43:13 INFO distributedshell.Client: Set the environment for the application master
13/02/01 13:43:13 INFO distributedshell.Client: Trying to generate classpath for app master from current thread's classpath
13/02/01 13:43:13 INFO distributedshell.Client: Readable bytes from stream=9006
13/02/01 13:43:13 INFO distributedshell.Client: Setting up app master command
13/02/01 13:43:13 INFO distributedshell.Client: Completed setting up app master command ${JAVA_HOME}/bin/java -Xmx350m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 350 --num_containers 10 --priority 0 --shell_command ls 1><LOG_DIR>/AppMaster.stdout 2><LOG_DIR>/AppMaster.stderr 
13/02/01 13:43:13 INFO distributedshell.Client: Submitting application to ASM
13/02/01 13:43:14 INFO distributedshell.Client: Got application report from ASM for, appId=3, clientToken=null, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=0, appStartTime=1359697393467, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=c2-23:8088/proxy/application_1359695803957_0003/, appUser=rmss
13/02/01 13:43:15 INFO distributedshell.Client: Got application report from ASM for, appId=3, clientToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1359697393467, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=, appUser=rmss
13/02/01 13:43:16 INFO distributedshell.Client: Got application report from ASM for, appId=3, clientToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1359697393467, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=, appUser=rmss
13/02/01 13:43:17 INFO distributedshell.Client: Got application report from ASM for, appId=3, clientToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1359697393467, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=, appUser=rmss
13/02/01 13:43:18 INFO distributedshell.Client: Got application report from ASM for, appId=3, clientToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1359697393467, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=, appUser=rmss
13/02/01 13:43:19 INFO distributedshell.Client: Got application report from ASM for, appId=3, clientToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1359697393467, yarnAppState=FINISHED, distributedFinalState=SUCCEEDED, appTrackingUrl=, appUser=rmss
13/02/01 13:43:19 INFO distributedshell.Client: Application has completed successfully. Breaking monitoring loop
13/02/01 13:43:19 INFO distributedshell.Client: Application completed successfully
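
Most of the log above consists of repeated application reports, and the fields of interest are yarnAppState and distributedFinalState. A minimal sketch for pulling just those two fields out of a saved copy of the client output; the function name app_states and the file name client.log are hypothetical, and the pattern assumes the report format shown above:

```shell
# Hypothetical helper: read client log lines on stdin and print
# "<yarnAppState> <distributedFinalState>" for each application report.
# Lines without both fields are skipped.
app_states() {
    sed -n 's/.*yarnAppState=\([A-Z_]*\), distributedFinalState=\([A-Z_]*\).*/\1 \2/p'
}
```

For example, `app_states < client.log | uniq` would collapse the repeated RUNNING reports and leave one line per state transition.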

Original article. When reposting, please credit: reposted from Dong's Blog (董的博客)

Link to this article: How to Run the DistributedShell Program in YARN
