Hadoop MapReduce Source Code Walkthrough (Part 2): Map Task Execution - The Map Output Path

This article dives into the output path of the map task in Hadoop MapReduce and, in particular, how it differs depending on whether a reduce phase exists. With no reduce phase, the map output is written directly to HDFS; with a reduce phase, the output goes through partitioning, sorting, spilling and the combiner. The relevant source code for each step is walked through in detail and can serve as a reference for MR tuning.


Picking up from the previous article: on the input side, the map task calls the reader's nextKeyValue() to check whether another record exists, reads it, advances the offset, and assigns the key (pos, the byte offset of the current line within the file) and the value (the contents of the current line). With the map input covered, the next topic is the map output phase, which is more involved and splits into two branches. The first branch is the case with no reduce phase. In everyday MR work the simplest optimization is to avoid a reduce phase whenever possible: as soon as a reduce phase exists, a shuffle is unavoidable, and the shuffle is the most expensive part of the whole MR pipeline, so most MR tuning aims at reducing or even eliminating its impact. I will not go into tuning here; a later article will cover it in detail. The second branch is the case where a reduce phase does exist, which is the main and by far the more complex path.

Now to the main topic, the source-level walk through the map output phase. The entry point is the context.write() call inside the user-defined map() method.
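
For orientation, here is a minimal, purely illustrative WordCount-style mapper (class and field names are my own); every pair it emits goes through the context.write() call analyzed in the rest of this article.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// A WordCount-style mapper used only to show where context.write() is called.
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            word.set(token);
            // Entry point into the output path discussed in this article:
            // with reducers this ends up in MapOutputBuffer.collect(),
            // without reducers it goes through NewDirectOutputCollector.
            context.write(word, ONE);
        }
    }
}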

Stepping into the code, context.write() turns out to simply delegate to mapContext.write().

And that write method is the one wired up in the previous article: when the MapContext was initialized, the output implementation was chosen according to whether a reduce phase exists and then attached to the mapContext.

So the question becomes: what exactly is output at this point? When there is no reduce phase it is a NewDirectOutputCollector, meaning the map output is written straight out through the HDFS output path with no sorting involved. The source for this direct-output case follows;

 /**
         * The direct-output (no reduce) path:
         * the constructor initializes the collector.
         */
        NewDirectOutputCollector(MRJobConfig jobContext,
                                 JobConf job, TaskUmbilicalProtocol umbilical, TaskReporter reporter)
                throws IOException, ClassNotFoundException, InterruptedException {
            /***************** counter / statistics setup for the cluster, not the focus here *******************/
            this.reporter = reporter;
            mapOutputRecordCounter = reporter
                    .getCounter(TaskCounter.MAP_OUTPUT_RECORDS);
            fileOutputByteCounter = reporter
                    .getCounter(FileOutputFormatCounter.BYTES_WRITTEN);
            List<Statistics> matchedStats = null;
            if (outputFormat instanceof org.apache.hadoop.mapreduce.lib.output.FileOutputFormat) {
                matchedStats = getFsStatistics(org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
                        .getOutputPath(taskContext), taskContext.getConfiguration());
            }
            fsStats = matchedStats;
            /*****************end*******************/

            long bytesOutPrev = getOutputBytes(fsStats);
            /**
             * Get the record writer from the current output format.
             * Just as an input format has its own default record reader, an output format has its own default record writer;
             * for the default TextOutputFormat it is: out = new LineRecordWriter<K, V>(fileOut, keyValueSeparator);
             */
            out = outputFormat.getRecordWriter(taskContext);
            long bytesOutCurr = getOutputBytes(fsStats);
            fileOutputByteCounter.increment(bytesOutCurr - bytesOutPrev);
        }

        /**
         * The actual write path
         * @param key
         * @param value
         * @throws IOException
         * @throws InterruptedException
         */
        @Override
        @SuppressWarnings("unchecked")
        public void write(K key, V value)
                throws IOException, InterruptedException {
            reporter.progress();
            long bytesOutPrev = getOutputBytes(fsStats);
            /**
             * This calls LineRecordWriter.write (the default TextOutputFormat's writer)
             */
            out.write(key, value);
            long bytesOutCurr = getOutputBytes(fsStats);
            fileOutputByteCounter.increment(bytesOutCurr - bytesOutPrev);
            mapOutputRecordCounter.increment(1);
        }
//Next, the write implementation inside LineRecordWriter. Note that on this path the key and value
//are written out directly, without passing through the map-side sort buffer.
public synchronized void write(K key, V value)
      throws IOException {

      boolean nullKey = key == null || key instanceof NullWritable;
      boolean nullValue = value == null || value instanceof NullWritable;
      if (nullKey && nullValue) {
        return;
      }
      if (!nullKey) {
        writeObject(key);
      }
      /**
       * The separator is written only when both key and value are present;
       * a NullWritable key (or value) is common when a job emits only one of the two.
       */
      if (!(nullKey || nullValue)) {
//here `out` is a DataOutputStream (which extends FilterOutputStream)
        out.write(keyValueSeparator);
      }
      if (!nullValue) {
        writeObject(value);
      }
      out.write(newline);
    }

Going any deeper is of little value here, since below this level it is just the underlying HDFS output stream doing the writing, so we stop at this point and leave it at that.
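
For completeness, a hedged driver sketch of the map-only case (paths and class names are illustrative): with job.setNumReduceTasks(0) the job takes exactly the NewDirectOutputCollector path above, and the mapper output lands on HDFS through the default TextOutputFormat / LineRecordWriter.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapOnlyJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "map-only example");
        job.setJarByClass(MapOnlyJob.class);
        job.setMapperClass(WordCountMapper.class);   // the sketch mapper from above
        // No reduce phase: map output is written straight to HDFS,
        // so no partitioning, sorting, spilling or shuffle happens.
        job.setNumReduceTasks(0);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}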

Now for the main event: what does the output path look like when a reduce phase does exist?

First of all, be clear that output has changed at this point; it is now:

output = new NewOutputCollector(taskContext, job, umbilical, reporter);

The next piece of code shows what actually determines the number of reducers, and what the partitioner is for;

         /**
         * Initialize a sortable collector: it supports sorting because the map output must be emitted in order.
         * Also initialize the partitioner; if no custom partitioner is configured, HashPartitioner is used by default.
         * If the number of reducers is not set it defaults to 1, and partition 0 is always returned.
         */
        NewOutputCollector(org.apache.hadoop.mapreduce.JobContext jobContext,
                           JobConf job,
                           TaskUmbilicalProtocol umbilical,
                           TaskReporter reporter
        ) throws IOException, ClassNotFoundException {
            //the container that stores (key, value, partition) triples --> MapOutputCollector --> MapOutputBuffer
            collector = createSortingCollector(job, reporter);
            /**
             * The number of partitions equals the configured number of reduce tasks,
             * so the number of reducers, i.e. reduce parallelism, is ultimately under the user's control.
             * Overriding the Partitioner lets you influence how keys are grouped into partitions,
             * which helps avoid data skew.
             */
            partitions = jobContext.getNumReduceTasks();
            if (partitions > 1) {
                //a custom partitioner extends org.apache.hadoop.mapreduce.Partitioner<K,V>; building one from a data sample
                //can effectively prevent data skew during the MR job
                //the default is the hash partitioner
                partitioner = (org.apache.hadoop.mapreduce.Partitioner<K, V>)
                        ReflectionUtils.newInstance(jobContext.getPartitionerClass(), job);
            } else {
                //single reducer: every record gets the same partition number (partitions - 1, i.e. 0)
                partitioner = new org.apache.hadoop.mapreduce.Partitioner<K, V>() {
                    @Override
                    public int getPartition(K key, V value, int numPartitions) {
                        return partitions - 1;
                    }
                };
            }
        }

Remember this line: partitions = jobContext.getNumReduceTasks();

The number of partitions is determined by the number of reducers, and the number of reducers is whatever we set ourselves when writing the job. In other words, reducer parallelism is under our control; but before picking a value it is worth sampling the data first, so that you neither configure far more partitions than are useful nor set so few that the reduce side cannot parallelize and the job runs for a very long time. A minimal partitioner sketch follows.
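As a concrete illustration (the routing rule is invented; the point is only the getPartition contract and the driver wiring), a skew-aware partitioner might look like this:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Illustrative partitioner: route a known hot key to its own reducer,
// hash everything else over the remaining reducers to limit skew.
public class SkewAwarePartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        if (numPartitions <= 1) {
            return 0;
        }
        if ("hot-key".equals(key.toString())) {   // hypothetical hot key found by sampling
            return 0;
        }
        return 1 + (key.hashCode() & Integer.MAX_VALUE) % (numPartitions - 1);
    }
}

// In the driver:
//   job.setNumReduceTasks(4);                          // partitions == number of reducers
//   job.setPartitionerClass(SkewAwarePartitioner.class);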

The next key piece is the creation of that sortable collector, so on to the code;

private <KEY, VALUE> MapOutputCollector<KEY, VALUE>
    createSortingCollector(JobConf job, TaskReporter reporter)
            throws IOException, ClassNotFoundException {
        MapOutputCollector.Context context =
                new MapOutputCollector.Context(this, job, reporter);
        /**
         * The underlying collector implementation is MapOutputBuffer
         */
        Class<?>[] collectorClasses = job.getClasses(
                JobContext.MAP_OUTPUT_COLLECTOR_CLASS_ATTR, MapOutputBuffer.class);
        int remainingCollectors = collectorClasses.length;
        for (Class clazz : collectorClasses) {
            try {
                if (!MapOutputCollector.class.isAssignableFrom(clazz)) {
                    throw new IOException("Invalid output collector class: " + clazz.getName() +
                            " (does not implement MapOutputCollector)");
                }
                Class<? extends MapOutputCollector> subclazz =
                        clazz.asSubclass(MapOutputCollector.class);
                LOG.debug("Trying map output collector class: " + subclazz.getName());
                MapOutputCollector<KEY, VALUE> collector =
                        ReflectionUtils.newInstance(subclazz, job);
                //collector initialization
                /**
                 * Several things happen here:
                 *  1. The collector (ring buffer) size is determined: by default the buffer is 100 MB and the spill threshold is 0.8.
                 *  2. The sort algorithm is chosen: unless the user specifies one, quick sort is used to sort by key.
                 *  3. The sort order is resolved: lexicographic, numeric, or a custom comparator.
                 *     If no custom comparator is configured, the comparator registered for the key's type is used,
                 *     which is why a custom (non built-in) key type must provide its own comparator.
                 */
                collector.init(context);
                LOG.info("Map output collector class = " + collector.getClass().getName());
                return collector;
            } catch (Exception e) {
                String msg = "Unable to initialize MapOutputCollector " + clazz.getName();
                if (--remainingCollectors > 0) {
                    msg += " (" + remainingCollectors + " more collector(s) to try)";
                }
                LOG.warn(msg, e);
            }
        }
        throw new IOException("Unable to initialize any output collector");
    }

Next comes the collector initialization, i.e. MapOutputBuffer.init();

There are a few highlights here: the size of the ring buffer, the spill threshold, the sort comparator, and the combiner.

The ring buffer size is one direction for later MR tuning: the larger the buffer, the larger (and fewer) the spill files, which cuts down on the number of small files.

The second highlight is the combiner, essentially a map-side reduce: it aggregates the current map's output before the shuffle pulls the data away, and map-side aggregation is another standard MR optimization. A configuration sketch is shown below, followed by the init() source;
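A hedged driver-side sketch of how these knobs are typically set. The property names are the ones read by the init() code below in Hadoop 2.x (JobContext.IO_SORT_MB is "mapreduce.task.io.sort.mb", JobContext.MAP_SORT_SPILL_PERCENT is "mapreduce.map.sort.spill.percent"); the combiner class is a hypothetical reducer and the values are only examples.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class TunedJobDriver {
    public static Job configure() throws Exception {
        Configuration conf = new Configuration();
        // Ring buffer size in MB (JobContext.IO_SORT_MB, default 100):
        // a larger buffer means fewer, larger spill files.
        conf.setInt("mapreduce.task.io.sort.mb", 256);
        // Spill threshold (JobContext.MAP_SORT_SPILL_PERCENT, default 0.8).
        conf.setFloat("mapreduce.map.sort.spill.percent", 0.8f);

        Job job = Job.getInstance(conf, "tuned-map-output");
        // Map-side pre-aggregation before the shuffle; the combiner must be a Reducer
        // whose input/output key-value types equal the map output types.
        job.setCombinerClass(WordCountReducer.class);   // hypothetical reducer class
        return job;
    }
}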

        public void init(MapOutputCollector.Context context
        ) throws IOException, ClassNotFoundException {
            job = context.getJobConf();
            reporter = context.getReporter();
            mapTask = context.getMapTask();
            mapOutputFile = mapTask.getMapOutputFile();
            sortPhase = mapTask.getSortPhase();
            spilledRecordsCounter = reporter.getCounter(TaskCounter.SPILLED_RECORDS);
            partitions = job.getNumReduceTasks();
            rfs = ((LocalFileSystem) FileSystem.getLocal(job)).getRaw();

            //sanity checks
            /**
             * Determine the ring buffer size and the spill threshold
             */
            final float spillper =
                    job.getFloat(JobContext.MAP_SORT_SPILL_PERCENT, (float) 0.8);//spill threshold
            final int sortmb = job.getInt(JobContext.IO_SORT_MB, 100);//in-memory buffer size in MB
            indexCacheMemoryLimit = job.getInt(JobContext.INDEX_CACHE_MEMORY_LIMIT,
                    INDEX_CACHE_MEMORY_LIMIT_DEFAULT);
            if (spillper > (float) 1.0 || spillper <= (float) 0.0) {
                throw new IOException("Invalid \"" + JobContext.MAP_SORT_SPILL_PERCENT +
                        "\": " + spillper);
            }
            if ((sortmb & 0x7FF) != sortmb) {
                throw new IOException(
                        "Invalid \"" + JobContext.IO_SORT_MB + "\": " + sortmb);
            }
            /**
             * In-memory sort: this is the only point in the whole MR pipeline where data goes from unordered to ordered;
             * everything afterwards merges runs that are already sorted here.
             * The algorithm can be chosen via map.sort.class;
             * if none is specified, quick sort is the default.
             */
            sorter = ReflectionUtils.newInstance(job.getClass("map.sort.class",
                    QuickSort.class, IndexedSorter.class), job);
            // buffers and accounting
            int maxMemUsage = sortmb << 20;
            maxMemUsage -= maxMemUsage % METASIZE;
            kvbuffer = new byte[maxMemUsage];
            bufvoid = kvbuffer.length;
            kvmeta = ByteBuffer.wrap(kvbuffer)
                    .order(ByteOrder.nativeOrder())
                    .asIntBuffer();
            setEquator(0);
            bufstart = bufend = bufindex = equator;
            kvstart = kvend = kvindex;

            maxRec = kvmeta.capacity() / NMETA;
            softLimit = (int) (kvbuffer.length * spillper);
            bufferRemaining = softLimit;
            LOG.info(JobContext.IO_SORT_MB + ": " + sortmb);
            LOG.info("soft limit at " + softLimit);
            LOG.info("bufstart = " + bufstart + "; bufvoid = " + bufvoid);
            LOG.info("kvstart = " + kvstart + "; length = " + maxRec);

            // k/v serialization
            /**
             * Resolve the key comparator.
             * If none has been configured explicitly,
             *          WritableComparator.get(getMapOutputKeyClass().asSubclass(WritableComparable.class), this)
             *               getClass(JobContext.MAP_OUTPUT_KEY_CLASS, null, Object.class);
             * i.e. the comparator registered for the map output key's type is used as the sort comparator.
             * This is why, if the key is a custom object rather than a built-in type, the comparator (or the key's
             * ordering) must be defined by the user.
             */
            comparator = job.getOutputKeyComparator();
            keyClass = (Class<K>) job.getMapOutputKeyClass();
            valClass = (Class<V>) job.getMapOutputValueClass();
            serializationFactory = new SerializationFactory(job);
            keySerializer = serializationFactory.getSerializer(keyClass);
            keySerializer.open(bb);
            valSerializer = serializationFactory.getSerializer(valClass);
            valSerializer.open(bb);

            // output counters
            mapOutputByteCounter = reporter.getCounter(TaskCounter.MAP_OUTPUT_BYTES);
            mapOutputRecordCounter =
                    reporter.getCounter(TaskCounter.MAP_OUTPUT_RECORDS);
            fileOutputByteCounter = reporter
                    .getCounter(TaskCounter.MAP_OUTPUT_MATERIALIZED_BYTES);

            // compression
            if (job.getCompressMapOutput()) {
                Class<? extends CompressionCodec> codecClass =
                        job.getMapOutputCompressorClass(DefaultCodec.class);
                codec = ReflectionUtils.newInstance(codecClass, job);
            } else {
                codec = null;
            }

            /**
             * Check whether a combiner is configured and, if so, instantiate it
             */
            // combiner
            final Counters.Counter combineInputCounter =
                    reporter.getCounter(TaskCounter.COMBINE_INPUT_RECORDS);
            combinerRunner = CombinerRunner.create(job, getTaskID(),
                    combineInputCounter,
                    reporter, null);
            if (combinerRunner != null) {
                final Counters.Counter combineOutputCounter =
                        reporter.getCounter(TaskCounter.COMBINE_OUTPUT_RECORDS);
                combineCollector = new CombineOutputCollector<K, V>(combineOutputCounter, reporter, job);
            } else {
                combineCollector = null;
            }
            spillInProgress = false;
            minSpillsForCombine = job.getInt(JobContext.MAP_COMBINE_MIN_SPILLS, 3);
            /**
             * Start the spill thread as a daemon.
             * Once started it keeps running, waiting for a spill to be triggered and then performing it.
             * During a spill, the portion of the ring buffer being spilled is locked; the lock is released when the spill finishes.
             */
            spillThread.setDaemon(true);
            spillThread.setName("SpillThread");
            spillLock.lock();
            try {
                spillThread.start();
                while (!spillThreadRunning) {
                    spillDone.await();
                }
            } catch (InterruptedException e) {
                throw new IOException("Spill thread failed to initialize", e);
            } finally {
                spillLock.unlock();
            }
            if (sortSpillException != null) {
                throw new IOException("Spill thread failed to initialize",
                        sortSpillException);
            }
        }
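
Since the comparator resolution above falls back to the comparator registered for the map output key class, a custom key type must define its ordering. A minimal sketch (field names are invented) of such a key, implemented as a WritableComparable so that the default WritableComparator fallback can use its compareTo(); alternatively, an explicit comparator can be registered with job.setSortComparatorClass().

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

// Minimal custom key: without compareTo() (or a registered comparator),
// job.getOutputKeyComparator() above would have nothing to fall back on.
public class YearTempKey implements WritableComparable<YearTempKey> {
    private int year;
    private int temperature;

    public YearTempKey() { }                          // required no-arg constructor

    public YearTempKey(int year, int temperature) {
        this.year = year;
        this.temperature = temperature;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(year);
        out.writeInt(temperature);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        year = in.readInt();
        temperature = in.readInt();
    }

    @Override
    public int compareTo(YearTempKey o) {
        int c = Integer.compare(year, o.year);        // sort by year first
        return c != 0 ? c : Integer.compare(temperature, o.temperature);
    }
    // hashCode()/equals() should also be overridden for use with HashPartitioner; omitted for brevity.
}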

That covers the collector initialization; the next stage is the spill. I have not annotated this part in as much detail, so mostly the raw source is shown;

The entry point. The key point is that what gets stored in the buffer is a triple: key, value and partition.

 /**
         * context.write() is really a call to output.write(); the collector stores the key, value and partition.
         *
         * When the number of reducers is not zero, output.write() is NewOutputCollector.write(),
         * which writes three things into the collector:
         *          1. the key
         *          2. the value
         *          3. the partition the key belongs to, computed by the partitioner set up during initialization
         *  The "collector" here is the in-memory ring buffer:
         *          collector = MapOutputBuffer
         *
         *
         * @param key
         * @param value
         * @throws IOException
         * @throws InterruptedException
         */
        @Override
        public void write(K key, V value) throws IOException, InterruptedException {
            collector.collect(key, value,
                    partitioner.getPartition(key, value, partitions));
        }

The next piece of code is the path that writes a record into the ring buffer;

        /**
         * Serialize the key, value to intermediate storage.
         * When this method returns, kvindex must refer to sufficient unused
         * storage to store one METADATA.
         */
        public synchronized void collect(K key, V value, final int partition
        ) throws IOException {
            reporter.progress();
            if (key.getClass() != keyClass) {
                throw new IOException("Type mismatch in key from map: expected "
                        + keyClass.getName() + ", received "
                        + key.getClass().getName());
            }
            if (value.getClass() != valClass) {
                throw new IOException("Type mismatch in value from map: expected "
                        + valClass.getName() + ", received "
                        + value.getClass().getName());
            }
            if (partition < 0 || partition >= partitions) {
                throw new IOException("Illegal partition for " + key + " (" +
                        partition + ")");
            }
            checkSpillException();
            bufferRemaining -= METASIZE;
            if (bufferRemaining <= 0) {
                // start spill if the thread is not running and the soft limit has been
                // reached
                spillLock.lock();
                try {
                    do {
                        if (!spillInProgress) {
                            final int kvbidx = 4 * kvindex;
                            final int kvbend = 4 * kvend;
                            // serialized, unspilled bytes always lie between kvindex and
                            // bufindex, crossing the equator. Note that any void space
                            // created by a reset must be included in "used" bytes
                            final int bUsed = distanceTo(kvbidx, bufindex);
                            final boolean bufsoftlimit = bUsed >= softLimit;
                            if ((kvbend + METASIZE) % kvbuffer.length !=
                                    equator - (equator % METASIZE)) {
                                // spill finished, reclaim space
                                resetSpill();
                                bufferRemaining = Math.min(
                                        distanceTo(bufindex, kvbidx) - 2 * METASIZE,
                                        softLimit - bUsed) - METASIZE;
                                continue;
                            } else if (bufsoftlimit && kvindex != kvend) {
                                // spill records, if any collected; check latter, as it may
                                // be possible for metadata alignment to hit spill pcnt
                                startSpill();
                                final int avgRec = (int)
                                        (mapOutputByteCounter.getCounter() /
                                                mapOutputRecordCounter.getCounter());
                                // leave at least half the split buffer for serialization data
                                // ensure that kvindex >= bufindex
                                final int distkvi = distanceTo(bufindex, kvbidx);
                                final int newPos = (bufindex +
                                        Math.max(2 * METASIZE - 1,
                                                Math.min(distkvi / 2,
                                                        distkvi / (METASIZE + avgRec) * METASIZE)))
                                        % kvbuffer.length;
                                setEquator(newPos);
                                bufmark = bufindex = newPos;
                                final int serBound = 4 * kvend;
                                // bytes remaining before the lock must be held and limits
                                // checked is the minimum of three arcs: the metadata space, the
                                // serialization space, and the soft limit
                                bufferRemaining = Math.min(
                                        // metadata max
                                        distanceTo(bufend, newPos),
                                        Math.min(
                                                // serialization max
                                                distanceTo(newPos, serBound),
                                                // soft limit
                                                softLimit)) - 2 * METASIZE;
                            }
                        }
                    } while (false);
                } finally {
                    spillLock.unlock();
                }
            }

            try {
                // serialize key bytes into buffer
                int keystart = bufindex;
                keySerializer.serialize(key);
                if (bufindex < keystart) {
                    // wrapped the key; must make contiguous
                    bb.shiftBufferedKey();
                    keystart = 0;
                }
                // serialize value bytes into buffer
                final int valstart = bufindex;
                valSerializer.serialize(value);
                // It's possible for records to have zero length, i.e. the serializer
                // will perform no writes. To ensure that the boundary conditions are
                // checked and that the kvindex invariant is maintained, perform a
                // zero-length write into the buffer. The logic monitoring this could be
                // moved into collect, but this is cleaner and inexpensive. For now, it
                // is acceptable.
                bb.write(b0, 0, 0);

                // the record must be marked after the preceding write, as the metadata
                // for this record are not yet written
                int valend = bb.markRecord();

                mapOutputRecordCounter.increment(1);
                mapOutputByteCounter.increment(
                        distanceTo(keystart, valend, bufvoid));

                // write accounting info
                kvmeta.put(kvindex + PARTITION, partition);
                kvmeta.put(kvindex + KEYSTART, keystart);
                kvmeta.put(kvindex + VALSTART, valstart);
                kvmeta.put(kvindex + VALLEN, distanceTo(valstart, valend));
                // advance kvindex
                kvindex = (kvindex - NMETA + kvmeta.capacity()) % kvmeta.capacity();
            } catch (MapBufferTooSmallException e) {
                LOG.info("Record too large for in-memory buffer: " + e.getMessage());
                spillSingleRecord(key, value, partition);
                mapOutputRecordCounter.increment(1);
                return;
            }
        }
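
To make the buffer accounting concrete, here is a rough back-of-the-envelope sketch. It assumes the defaults shown in init() (100 MB buffer, 0.8 spill threshold) and 16 bytes of per-record metadata, matching the four ints (PARTITION, KEYSTART, VALSTART, VALLEN) written at the end of collect(); the average record size is an invented number.

// Rough accounting for the map-side circular buffer, assuming the defaults from init().
public class KvBufferMath {
    public static void main(String[] args) {
        final int sortMb = 100;                 // mapreduce.task.io.sort.mb default
        final float spillPer = 0.8f;            // mapreduce.map.sort.spill.percent default
        final int metaSize = 16;                // 4 ints of metadata per record (partition, keystart, valstart, vallen)

        int maxMemUsage = sortMb << 20;         // 100 MB in bytes
        maxMemUsage -= maxMemUsage % metaSize;  // align the buffer to the metadata size, as init() does
        int softLimit = (int) (maxMemUsage * spillPer);

        System.out.println("buffer bytes = " + maxMemUsage);  // 104857600
        System.out.println("soft limit   = " + softLimit);    // 83886080: a spill is triggered around here
        // Serialized key/value bytes and the 16-byte metadata entries are carved out of the same byte[]
        // from opposite sides of the "equator", so each record costs roughly (metaSize + its serialized size).
        int avgRecordBytes = 100;               // invented average record size
        System.out.println("records before first spill ~= " + softLimit / (avgRecordBytes + metaSize));
    }
}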

After that comes the sortAndSpill stage;

    protected class SpillThread extends Thread {

            @Override
            public void run() {
                spillLock.lock();
                spillThreadRunning = true;
                try {
                    while (true) {
                        spillDone.signal();
                        while (!spillInProgress) {
                            spillReady.await();
                        }
                        try {
                            spillLock.unlock();
                            sortAndSpill();
                        } catch (Throwable t) {
                            sortSpillException = t;
                        } finally {
                            spillLock.lock();
                            if (bufend < bufstart) {
                                bufvoid = kvbuffer.length;
                            }
                            kvstart = kvend;
                            bufstart = bufend;
                            spillInProgress = false;
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    spillLock.unlock();
                    spillThreadRunning = false;
                }
            }
        }

In this code the only part we need to pay attention to is sortAndSpill();

    /**
         * The sort-and-spill phase:
         *      in-memory sort
         *      turns unordered data into ordered data
         *      quick sort here, merge sort later
         *      records are ordered by partition first, then by key within each partition
         * @throws IOException
         * @throws ClassNotFoundException
         * @throws InterruptedException
         */
        private void sortAndSpill() throws IOException, ClassNotFoundException,
                InterruptedException {
            //approximate the length of the output file to be the length of the
            //buffer + header lengths for the partitions
            final long size = distanceTo(bufstart, bufend, bufvoid) +
                    partitions * APPROX_HEADER_LENGTH;
            FSDataOutputStream out = null;
            try {
                // create spill file
                final SpillRecord spillRec = new SpillRecord(partitions);
                /**
                 *  String.format(SPILL_FILE_PATTERN, conf.get(JobContext.TASK_ATTEMPT_ID), spillNumber), size, conf
                 *  SPILL_FILE_PATTERN = "%s_spill_%d.out"
                 */
                final Path filename =
                        mapOutputFile.getSpillFileForWrite(numSpills, size);
                /**
                 * The spill file name follows the pattern
                 * "%s_spill_%d.out"
                 */
                out = rfs.create(filename);

                final int mstart = kvend / NMETA;
                final int mend = 1 + // kvend is a valid record
                        (kvstart >= kvend
                                ? kvstart
                                : kvmeta.capacity() + kvstart) / NMETA;
                /**
                 * In-memory sort,
                 * using the configured key ordering
                 */
                sorter.sort(MapOutputBuffer.this, mstart, mend, reporter);
                int spindex = mstart;
                final IndexRecord rec = new IndexRecord();
                final InMemValBytes value = new InMemValBytes();
                for (int i = 0; i < partitions; ++i) {
                    IFile.Writer<K, V> writer = null;
                    try {
                        long segmentStart = out.getPos();
                        FSDataOutputStream partitionOut = CryptoUtils.wrapIfNecessary(job, out);
                        /**
                         * After the sort by (partition, key), the output is written one partition at a time
                         */
                        writer = new Writer<K, V>(job, partitionOut, keyClass, valClass, codec,
                                spilledRecordsCounter);
                        if (combinerRunner == null) {
                            // spill directly
                            DataInputBuffer key = new DataInputBuffer();
                            while (spindex < mend &&
                                    kvmeta.get(offsetFor(spindex % maxRec) + PARTITION) == i) {
                                final int kvoff = offsetFor(spindex % maxRec);
                                int keystart = kvmeta.get(kvoff + KEYSTART);
                                int valstart = kvmeta.get(kvoff + VALSTART);
                                key.reset(kvbuffer, keystart, valstart - keystart);
                                getVBytesForOffset(kvoff, value);
                                writer.append(key, value);
                                ++spindex;
                            }
                        } else {
                            int spstart = spindex;
                            while (spindex < mend &&
                                    kvmeta.get(offsetFor(spindex % maxRec)
                                            + PARTITION) == i) {
                                ++spindex;
                            }
                            // Note: we would like to avoid the combiner if we've fewer
                            // than some threshold of records for a partition
                            if (spstart != spindex) {
                                /**
                                 * Run the combiner for this partition's records.
                                 * (The related min-spills setting, default 3, controls whether the combiner
                                 *  runs again during the final merge in mergeParts(), not here.)
                                 */
                                combineCollector.setWriter(writer);
                                RawKeyValueIterator kvIter =
                                        new MRResultIterator(spstart, spindex);
                                combinerRunner.combine(kvIter, combineCollector);
                            }
                        }

                        // close the writer
                        writer.close();

                        // record offsets
                        rec.startOffset = segmentStart;
                        rec.rawLength = writer.getRawLength() + CryptoUtils.cryptoPadding(job);
                        rec.partLength = writer.getCompressedLength() + CryptoUtils.cryptoPadding(job);
                        spillRec.putIndex(rec, i);

                        writer = null;
                    } finally {
                        if (null != writer) writer.close();
                    }
                }
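
Since combinerRunner.combine() is invoked here once per partition, it helps to see what a typical combiner looks like. This is a WordCount-style sketch (the class name is my own): a combiner is simply a Reducer whose output types match its input types, and because it may run zero, one or several times it must be commutative and associative (sums and counts qualify, averages do not).

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// A sum combiner: pre-aggregates (word, 1) pairs inside each spill / merge,
// so the shuffle moves (word, partialCount) instead of every single record.
public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}

// Registered in the driver with: job.setCombinerClass(WordCountReducer.class);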

The excerpt above ends after the per-partition loop. Once all records have been processed, output.close() is called, and underneath it runs the collector's flush():

  public void flush() throws IOException, ClassNotFoundException,
                InterruptedException {
            LOG.info("Starting flush of map output");
            spillLock.lock();
            try {
                while (spillInProgress) {
                    reporter.progress();
                    spillDone.await();
                }
                checkSpillException();

                final int kvbend = 4 * kvend;
                if ((kvbend + METASIZE) % kvbuffer.length !=
                        equator - (equator % METASIZE)) {
                    // spill finished
                    resetSpill();
                }
                if (kvindex != kvend) {
                    kvend = (kvindex + NMETA) % kvmeta.capacity();
                    bufend = bufmark;
                    LOG.info("Spilling map output");
                    LOG.info("bufstart = " + bufstart + "; bufend = " + bufmark +
                            "; bufvoid = " + bufvoid);
                    LOG.info("kvstart = " + kvstart + "(" + (kvstart * 4) +
                            "); kvend = " + kvend + "(" + (kvend * 4) +
                            "); length = " + (distanceTo(kvend, kvstart,
                            kvmeta.capacity()) + 1) + "/" + maxRec);
                    sortAndSpill();
                }
            } catch (InterruptedException e) {
                throw new IOException("Interrupted while waiting for the writer", e);
            } finally {
                spillLock.unlock();
            }
            assert !spillLock.isHeldByCurrentThread();
            // shut down spill thread and wait for it to exit. Since the preceding
            // ensures that it is finished with its work (and sortAndSpill did not
            // throw), we elect to use an interrupt instead of setting a flag.
            // Spilling simultaneously from this thread while the spill thread
            // finishes its work might be both a useful way to extend this and also
            // sufficient motivation for the latter approach.
            try {
                spillThread.interrupt();
                spillThread.join();
            } catch (InterruptedException e) {
                throw new IOException("Spill failed", e);
            }
            // release sort buffer before the merge
            kvbuffer = null;
            //merge the spill files
            mergeParts();
            Path outputPath = mapOutputFile.getOutputFile();
            fileOutputByteCounter.increment(rfs.getFileStatus(outputPath).getLen());
        }

In other words, right before the collector finally shuts down, the merge is kicked off via mergeParts();

//merge the spilled data into the final intermediate output file, which the reduce-side shuffle will later fetch
        private void mergeParts() throws IOException, InterruptedException,
                ClassNotFoundException {
            // get the approximate size of the final output/index files
            long finalOutFileSize = 0;
            long finalIndexFileSize = 0;
            final Path[] filename = new Path[numSpills];
            final TaskAttemptID mapId = getTaskID();

            for (int i = 0; i < numSpills; i++) {
                filename[i] = mapOutputFile.getSpillFile(i);
                finalOutFileSize += rfs.getFileStatus(filename[i]).getLen();
            }
            if (numSpills == 1) { //the spill is the final output
                sameVolRename(filename[0],
                        mapOutputFile.getOutputFileForWriteInVolume(filename[0]));
                if (indexCacheList.size() == 0) {
                    sameVolRename(mapOutputFile.getSpillIndexFile(0),
                            mapOutputFile.getOutputIndexFileForWriteInVolume(filename[0]));
                } else {
                    indexCacheList.get(0).writeToFile(
                            mapOutputFile.getOutputIndexFileForWriteInVolume(filename[0]), job);
                }
                sortPhase.complete();
                return;
            }

            // read in paged indices
            for (int i = indexCacheList.size(); i < numSpills; ++i) {
                Path indexFileName = mapOutputFile.getSpillIndexFile(i);
                indexCacheList.add(new SpillRecord(indexFileName, job));
            }

            //make correction in the length to include the sequence file header
            //lengths for each partition
            finalOutFileSize += partitions * APPROX_HEADER_LENGTH;
            finalIndexFileSize = partitions * MAP_OUTPUT_INDEX_RECORD_LENGTH;
            Path finalOutputFile =
                    mapOutputFile.getOutputFileForWrite(finalOutFileSize);
            Path finalIndexFile =
                    mapOutputFile.getOutputIndexFileForWrite(finalIndexFileSize);

            //The output stream for the final single output file
            FSDataOutputStream finalOut = rfs.create(finalOutputFile, true, 4096);

            if (numSpills == 0) {
                //create dummy files
                IndexRecord rec = new IndexRecord();
                SpillRecord sr = new SpillRecord(partitions);
                try {
                    for (int i = 0; i < partitions; i++) {
                        long segmentStart = finalOut.getPos();
                        FSDataOutputStream finalPartitionOut = CryptoUtils.wrapIfNecessary(job, finalOut);
                        Writer<K, V> writer =
                                new Writer<K, V>(job, finalPartitionOut, keyClass, valClass, codec, null);
                        writer.close();
                        rec.startOffset = segmentStart;
                        rec.rawLength = writer.getRawLength() + CryptoUtils.cryptoPadding(job);
                        rec.partLength = writer.getCompressedLength() + CryptoUtils.cryptoPadding(job);
                        sr.putIndex(rec, i);
                    }
                    sr.writeToFile(finalIndexFile, job);
                } finally {
                    finalOut.close();
                }
                sortPhase.complete();
                return;
            }
            {
                sortPhase.addPhases(partitions); // Divide sort phase into sub-phases

                IndexRecord rec = new IndexRecord();
                final SpillRecord spillRec = new SpillRecord(partitions);
                for (int parts = 0; parts < partitions; parts++) {
                    //create the segments to be merged
                    List<Segment<K, V>> segmentList =
                            new ArrayList<Segment<K, V>>(numSpills);
                    for (int i = 0; i < numSpills; i++) {
                        IndexRecord indexRecord = indexCacheList.get(i).getIndex(parts);

                        Segment<K, V> s =
                                new Segment<K, V>(job, rfs, filename[i], indexRecord.startOffset,
                                        indexRecord.partLength, codec, true);
                        segmentList.add(i, s);

                        if (LOG.isDebugEnabled()) {
                            LOG.debug("MapId=" + mapId + " Reducer=" + parts +
                                    "Spill =" + i + "(" + indexRecord.startOffset + "," +
                                    indexRecord.rawLength + ", " + indexRecord.partLength + ")");
                        }
                    }

                    int mergeFactor = job.getInt(JobContext.IO_SORT_FACTOR, 100);
                    // sort the segments only if there are intermediate merges
                    boolean sortSegments = segmentList.size() > mergeFactor;
                    //merge
                    @SuppressWarnings("unchecked")
                    RawKeyValueIterator kvIter = Merger.merge(job, rfs,
                            keyClass, valClass, codec,
                            segmentList, mergeFactor,
                            new Path(mapId.toString()),
                            job.getOutputKeyComparator(), reporter, sortSegments,
                            null, spilledRecordsCounter, sortPhase.phase(),
                            TaskType.MAP);

                    //write merged output to disk
                    long segmentStart = finalOut.getPos();
                    FSDataOutputStream finalPartitionOut = CryptoUtils.wrapIfNecessary(job, finalOut);
                    Writer<K, V> writer =
                            new Writer<K, V>(job, finalPartitionOut, keyClass, valClass, codec,
                                    spilledRecordsCounter);
                    if (combinerRunner == null || numSpills < minSpillsForCombine) {
                        Merger.writeFile(kvIter, writer, reporter, job);
                    } else {
                        combineCollector.setWriter(writer);
                        combinerRunner.combine(kvIter, combineCollector);
                    }

                    //close
                    writer.close();

                    sortPhase.startNextPhase();

                    // record offsets
                    rec.startOffset = segmentStart;
                    rec.rawLength = writer.getRawLength() + CryptoUtils.cryptoPadding(job);
                    rec.partLength = writer.getCompressedLength() + CryptoUtils.cryptoPadding(job);
                    spillRec.putIndex(rec, parts);
                }
                spillRec.writeToFile(finalIndexFile, job);
                finalOut.close();
                for (int i = 0; i < numSpills; i++) {
                    rfs.delete(filename[i], true);
                }
            }
        }
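
Two knobs visible in this method are worth remembering: the merge fan-in read via JobContext.IO_SORT_FACTOR, and minSpillsForCombine (read in init() as JobContext.MAP_COMBINE_MIN_SPILLS, default 3), which decides whether the combiner runs again during this final merge. Below is a hedged sketch of setting both; the property names are the usual Hadoop 2.x ones, so verify them against your version.

import org.apache.hadoop.conf.Configuration;

public class MergeTuning {
    // Merge-phase knobs matching the code paths shown in mergeParts() above.
    public static Configuration tune(Configuration conf) {
        // How many spill segments are merged at once (JobContext.IO_SORT_FACTOR).
        conf.setInt("mapreduce.task.io.sort.factor", 50);
        // The combiner runs again during the final merge only when there are at least
        // this many spill files (JobContext.MAP_COMBINE_MIN_SPILLS, default 3).
        conf.setInt("mapreduce.map.combine.minspills", 3);
        return conf;
    }
}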

And with that, the map output phase is finished, which means the map side as a whole is essentially done;
