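My habit is to understand the basics of a system first and only then move on to the advanced parts; skipping that step-by-step order is more than I can absorb. As the ancients put it: learning has its roots and its branches, affairs have their ends and their beginnings, and knowing what comes first and what comes later brings one close to the Way. So let's start from the basics (the image server I mentioned in the previous post will have to wait until later).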
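The WordCount program described in the first post of this series ran in a standalone, single-machine environment. Here we adapt it to run in a single-node pseudo-distributed environment.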
Create a new WordCount class that extends Configured and implements the Tool interface:
public class WordCount extends Configured implements Tool{ public static class Map extends Mapper
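The listing as it appears here stops partway through the Mapper declaration, so what follows is not the author's code. As a rough illustration of what the map and reduce steps of a word count compute, here is a plain-Java sketch that uses no Hadoop API at all; the class name WordCountSketch and its methods map, reduce and count are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical, Hadoop-free sketch of the word count logic.
public class WordCountSketch {

    // "Map" step: split one input line into words, one list element per occurrence.
    public static List<String> map(String line) {
        List<String> words = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            words.add(tokens.nextToken());
        }
        return words;
    }

    // "Reduce" step: sum the occurrences of each word.
    public static Map<String, Integer> reduce(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Runs the map step on every line, then reduces all emitted words together.
    public static Map<String, Integer> count(String... lines) {
        List<String> all = new ArrayList<>();
        for (String line : lines) {
            all.addAll(map(line));
        }
        return reduce(all);
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, world=1}
        System.out.println(count("hello hadoop", "hello world"));
    }
}
```

In a real job the framework would run the map step on each input split in parallel and group the emitted words by key before the reduce step; the sketch simply does both in one process.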
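Because I am testing the word count program above in a pseudo-distributed environment, the class has to be packaged into a jar file. Here I generate a temporary jar file from within the program itself: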
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class EJob {

    // Extra classpath entries exposed through getClassLoader()
    private static List<URL> classPath = new ArrayList<URL>();

    // Packs everything under root into a temporary jar; returns null if root does not exist
    public static File createTempJar(String root) throws IOException {
        if (!new File(root).exists()) {
            return null;
        }
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().putValue("Manifest-Version", "1.0");
        final File jarFile = File.createTempFile("EJob-", ".jar",
                new File(System.getProperty("java.io.tmpdir")));

        // Delete the temporary jar when the JVM exits
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                jarFile.delete();
            }
        });

        JarOutputStream out = new JarOutputStream(new FileOutputStream(jarFile), manifest);
        createTempJarInner(out, new File(root), "");
        out.flush();
        out.close();
        return jarFile;
    }

    // Recursively adds f to the jar; base is the entry name relative to the root
    private static void createTempJarInner(JarOutputStream out, File f, String base)
            throws IOException {
        if (f.isDirectory()) {
            File[] fl = f.listFiles();
            if (base.length() > 0) {
                base = base + "/";
            }
            for (int i = 0; i < fl.length; i++) {
                createTempJarInner(out, fl[i], base + fl[i].getName());
            }
        } else {
            out.putNextEntry(new JarEntry(base));
            FileInputStream in = new FileInputStream(f);
            byte[] buffer = new byte[1024];
            int n = in.read(buffer);
            while (n != -1) {
                out.write(buffer, 0, n);
                n = in.read(buffer);
            }
            in.close();
            out.closeEntry();
        }
    }

    // Returns a class loader that also sees the entries registered via addClasspath
    public static ClassLoader getClassLoader() {
        ClassLoader parent = Thread.currentThread().getContextClassLoader();
        if (parent == null) {
            parent = EJob.class.getClassLoader();
        }
        if (parent == null) {
            parent = ClassLoader.getSystemClassLoader();
        }
        return new URLClassLoader(classPath.toArray(new URL[0]), parent);
    }

    // Registers an existing file or directory as an extra classpath entry
    public static void addClasspath(String component) {
        if ((component != null) && (component.length() > 0)) {
            try {
                File f = new File(component);
                if (f.exists()) {
                    URL key = f.getCanonicalFile().toURI().toURL();
                    if (!classPath.contains(key)) {
                        classPath.add(key);
                    }
                }
            } catch (IOException e) {
                // Entries that cannot be resolved are skipped
            }
        }
    }
}
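To see what a jar produced this way contains, the sketch below packs a small temporary directory using the same JarOutputStream approach as EJob.createTempJar and then lists the resulting entries. TempJarDemo, pack, entryNames, demo and the file names are all hypothetical; the temporary directory merely stands in for a directory of compiled classes.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical demo: pack a directory into a temporary jar and list its entries.
public class TempJarDemo {

    // Packs every file under root into a temporary jar and returns the jar file.
    public static File pack(File root) throws IOException {
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().putValue("Manifest-Version", "1.0");
        File jar = File.createTempFile("Demo-", ".jar");
        jar.deleteOnExit();
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar), manifest)) {
            add(out, root, "");
        }
        return jar;
    }

    // Recursively adds f under the given entry name (empty for the root itself).
    private static void add(JarOutputStream out, File f, String name) throws IOException {
        if (f.isDirectory()) {
            File[] children = f.listFiles();
            if (children == null) {
                return;
            }
            String prefix = name.isEmpty() ? "" : name + "/";
            for (File child : children) {
                add(out, child, prefix + child.getName());
            }
        } else {
            out.putNextEntry(new JarEntry(name));
            Files.copy(f.toPath(), out);
            out.closeEntry();
        }
    }

    // Returns the sorted entry names stored in the jar.
    public static List<String> entryNames(File jar) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    // Builds a tiny stand-in for a compiled-classes directory and packs it.
    public static List<String> demo() throws IOException {
        File root = Files.createTempDirectory("classes").toFile();
        File pkg = new File(root, "demo");
        pkg.mkdirs();
        Files.write(new File(pkg, "WordCount.class").toPath(),
                "stub".getBytes(StandardCharsets.UTF_8));
        Files.write(new File(root, "log4j.properties").toPath(),
                "x=1".getBytes(StandardCharsets.UTF_8));
        return entryNames(pack(root));
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

Entry names are relative to the root directory and use forward slashes, which is what lets classes in the jar be found by their package path. With Hadoop 1.x the path of such a jar would typically be handed to the job before submission (for example through JobConf.setJar); that step is left out here because it needs a Hadoop installation.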
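Finally, run the main method of the WordCount class above. Remember to first upload the files to be counted to the /test/input directory in HDFS (either programmatically, as in my previous post, or through the Eclipse UI).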
—————————————————————————
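This series of Hadoop 1.2.0 development notes is my original work.
When reposting, please credit the source: 博客园 (cnblogs), 刺猬的温驯
Link to this post: http://www.cnblogs.com/chenying99/archive/2013/06/02/3113474.html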
