zhan8610189

HBase Reading Notes 2

Category: HBase

1. QoS

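Every HBase request carries a priority level (priorityLevel). The RPC layer keeps a separate handler thread pool for each class of request and places each request in the pool that matches its priority. The sizes of the two pools are set by hbase.regionserver.handler.count and hbase.regionserver.metahandler.count respectively.

On a regionserver, a request with priority <= 10 is treated as a normal request and is queued for the IPC Server handler threads; a request with priority > 10 is treated as a priority request and is queued for the PRI IPC Server handler threads. A request goes to the priority queue if it meets either of these conditions:

The invoked method is annotated with @QosPriority and the annotation's priority value is greater than 10. In HRegionServer, for example, these methods carry a higher priority: openRegion, closeRegion, flushRegion, splitRegion, compactRegion, getProtocolSignature, getRegionInfo, unlockRow, and others.
The request operates on a catalog region, that is, on the .META. or -ROOT- table.

The priority values are computed by org.apache.hadoop.hbase.regionserver.HRegionServer.QosFunction.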
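The routing rule described above can be sketched in plain Java. This is only an illustration, not HBase's QosFunction: the Priority annotation, the Rpc class, and the constant values other than the threshold of 10 are assumptions made up for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

class QosDispatchSketch {
  /** Hypothetical stand-in for HBase's @QosPriority annotation. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface Priority {
    int value() default 0;
  }

  static final int NORMAL_QOS = 0;      // assumed value, for illustration only
  static final int HIGH_QOS = 100;      // assumed value, for illustration only
  static final int QOS_THRESHOLD = 10;  // the threshold named in the text

  /** Hypothetical RPC surface: one annotated method, one plain method. */
  static class Rpc {
    @Priority(HIGH_QOS) public void openRegion() {}
    public void get() {}
  }

  /** An annotated method wins; otherwise catalog-region access raises the priority. */
  static int priorityOf(String methodName, boolean touchesCatalogRegion) {
    Method method;
    try {
      method = Rpc.class.getMethod(methodName);
    } catch (NoSuchMethodException e) {
      throw new IllegalArgumentException("unknown RPC method: " + methodName, e);
    }
    Priority annotation = method.getAnnotation(Priority.class);
    if (annotation != null) {
      return annotation.value();
    }
    return touchesCatalogRegion ? HIGH_QOS : NORMAL_QOS;
  }

  /** Priority > 10 goes to the priority handlers, everything else to the normal ones. */
  static String queueFor(int priority) {
    return priority > QOS_THRESHOLD ? "PRI IPC Server handler" : "IPC Server handler";
  }

  public static void main(String[] args) {
    System.out.println("openRegion      -> " + queueFor(priorityOf("openRegion", false)));
    System.out.println("get on .META.   -> " + queueFor(priorityOf("get", true)));
    System.out.println("get on user tbl -> " + queueFor(priorityOf("get", false)));
  }
}
```

Note that a priority of exactly 10 still counts as a normal request, because the comparison is strictly greater than the threshold.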

2. ZooKeeperWatcher

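ZooKeeperWatcher is the only implementation of the ZooKeeper Watcher interface in HBase. All znode state goes through it: creation, deletion, updates, event callbacks, and so on. HMaster, HRegionServer, and the client each hold exactly one instance for connecting to the ZooKeeper ensemble.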

3. XxxTracker

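HBase contains many Tracker classes, each with its own role.

ClusterStatusTracker corresponds to /hbase/shutdown. The HMaster uses it to record cluster status, for example the time the cluster came up.
DrainingServerTracker corresponds to /hbase/draining. The HMaster uses it to record the regionservers that must not be assigned any new regions.
MetaNodeTracker is used by CatalogTracker.
CatalogTracker monitors the availability of the .META. and -ROOT- tables and manages RootRegionTracker and MetaNodeTracker. It records the state of the regions that host .META. and -ROOT-. In practice, the ZooKeeper node /hbase/root-region-server records the location of -ROOT-, the -ROOT- table records the information for .META., and the .META. table records the information for user tables.
RegionServerTracker corresponds to /hbase/rs and maintains the list of live regionservers.
RootRegionTracker corresponds to /hbase/root-region-server.
ZooKeeperNodeTracker is an abstract class, the base tracker for a single ZooKeeper node.

In addition, the regions listed under /hbase/unassigned are still waiting to be assigned.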
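To make the relationship between the single watcher and the per-node trackers concrete, here is a minimal in-memory sketch. It does not use the ZooKeeper client library; the Watcher, Listener, and NodeTracker types and the server address are hypothetical stand-ins, and the real classes expose a richer API.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class TrackerSketch {
  /** Hypothetical event callback, loosely modeled on a ZooKeeper listener. */
  interface Listener {
    void nodeChanged(String path, String data);
  }

  /** One shared watcher per process fans every event out to all registered listeners. */
  static class Watcher {
    private final List<Listener> listeners = new CopyOnWriteArrayList<>();

    void register(Listener listener) {
      listeners.add(listener);
    }

    void fire(String path, String data) {
      for (Listener listener : listeners) {
        listener.nodeChanged(path, data);
      }
    }
  }

  /** Tracks exactly one znode and ignores events for every other path. */
  static class NodeTracker implements Listener {
    private final Watcher watcher;
    private final String node;
    private volatile String data;

    NodeTracker(Watcher watcher, String node) {
      this.watcher = watcher;
      this.node = node;
    }

    void start() {
      watcher.register(this);
    }

    @Override
    public void nodeChanged(String path, String newData) {
      if (node.equals(path)) {
        data = newData;
      }
    }

    String getData() {
      return data;
    }
  }

  public static void main(String[] args) {
    Watcher watcher = new Watcher();
    NodeTracker rootTracker = new NodeTracker(watcher, "/hbase/root-region-server");
    rootTracker.start();
    watcher.fire("/hbase/rs", "ignored by this tracker");
    watcher.fire("/hbase/root-region-server", "rs1.example.com:60020"); // made-up address
    System.out.println("-ROOT- is hosted on " + rootTracker.getData());
  }
}
```

The design point is that every tracker shares the same connection: only the watcher talks to ZooKeeper, and each tracker filters the events down to the znode it cares about.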

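4. Inconsistent HBase table state

An inconsistent HBase table state usually means that the metadata in the .META. table disagrees with the data actually stored on HDFS. There are many possible causes. In most cases the inconsistency arises during a region split, for example when a regionserver dies abruptly or when a failed operation forces HBase to roll back. Run hbase hbck to check whether the cluster state is consistent and to see which data is inconsistent, and run hbase hbck -repair to fix the inconsistent data.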

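5. The .META. table cannot be split

It is natural to assume that, because the -ROOT- table maintains the information for the .META. table, .META. could be split into two or more regions. In fact it cannot: checkSplit filters this case out. See the JIRA: Disable META splitting in 0.20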

  public byte[] checkSplit() {
    // Can't split META
    if (getRegionInfo().isMetaRegion()) {
      if (shouldForceSplit()) {
        LOG.warn("Cannot split meta regions in HBase 0.20 and above");
      }
      return null;
    }

    if (!splitPolicy.shouldSplit()) {
      return null;
    }

    byte[] ret = splitPolicy.getSplitPoint();

    if (ret != null) {
      try {
        checkRow(ret, "calculated split");
      } catch (IOException e) {
        LOG.error("Ignoring invalid split", e);
        return null;
      }
    }
    return ret;
  }

6. DrainingServer

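A regionserver on the draining server list is not assigned any new regions. Even if you explicitly move a region to such a node, the region is automatically reassigned at random to another node. For details, see the JIRA: Support to drain RS nodes through ZK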

  /**
   * @param state
   * @param serverToExclude Server to exclude (we know its bad). Pass null if
   * all servers are thought to be assignable.
   * @param forceNewPlan If true, then if an existing plan exists, a new plan
   * will be generated.
   * @return Plan for passed <code>state</code> (If none currently, it creates one or
   * if no servers to assign, it returns null).
   */
  RegionPlan getRegionPlan(final RegionState state,
      final ServerName serverToExclude, final boolean forceNewPlan) {
    // Pickup existing plan or make a new one
    final String encodedName = state.getRegion().getEncodedName();
    final List<ServerName> servers = this.serverManager.getOnlineServersList();
    final List<ServerName> drainingServers = this.serverManager.getDrainingServersList();  // list of draining servers


    if (serverToExclude != null) servers.remove(serverToExclude);

    // Loop through the draining server list and remove them from the server
    // list.
    if (!drainingServers.isEmpty()) {
      for (final ServerName server: drainingServers) {  // drop each draining server from the online server list
        LOG.debug("Removing draining server: " + server +
            " from eligible server pool.");
        servers.remove(server);
      }
    }

    // Remove the deadNotExpired servers from the server list.
    removeDeadNotExpiredServers(servers);



    if (servers.isEmpty()) return null;

    RegionPlan randomPlan = null;
    boolean newPlan = false;
    RegionPlan existingPlan = null;

    synchronized (this.regionPlans) {
      existingPlan = this.regionPlans.get(encodedName);

      if (existingPlan != null && existingPlan.getDestination() != null) {
        LOG.debug("Found an existing plan for " +
            state.getRegion().getRegionNameAsString() +
       " destination server is " + existingPlan.getDestination().toString());
      }

      if (forceNewPlan
          || existingPlan == null
          || existingPlan.getDestination() == null
          || drainingServers.contains(existingPlan.getDestination())) {  // the existing plan targets a draining server, so pick a random destination instead
        newPlan = true;
        randomPlan = new RegionPlan(state.getRegion(), null, balancer
            .randomAssignment(servers));
        this.regionPlans.put(encodedName, randomPlan);
      }
    }

    if (newPlan) {
      LOG.debug("No previous transition plan was found (or we are ignoring " +
        "an existing plan) for " + state.getRegion().getRegionNameAsString() +
        " so generated a random one; " + randomPlan + "; " +
        serverManager.countOfRegionServers() +
        " (online=" + serverManager.getOnlineServers().size() +
        ", available=" + servers.size() + ") available servers");
      return randomPlan;
    }
    LOG.debug("Using pre-existing plan for region " +
      state.getRegion().getRegionNameAsString() + "; plan=" + existingPlan);
    return existingPlan;
  }
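The core of getRegionPlan can be reduced to a few lines. The sketch below is a simplification under stated assumptions: servers are plain strings rather than ServerName objects, the class and method names are hypothetical, and the cached-plan bookkeeping and dead-server handling are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class DrainingPlanSketch {
  /** Online servers minus the excluded server (may be null) and all draining servers. */
  static List<String> eligibleServers(List<String> online, List<String> draining,
      String serverToExclude) {
    List<String> servers = new ArrayList<>(online);
    if (serverToExclude != null) {
      servers.remove(serverToExclude);
    }
    servers.removeAll(draining);
    return servers;
  }

  /**
   * Keeps the existing destination unless a new plan is forced, no destination exists,
   * or the destination is draining. Returns null when no server is eligible.
   */
  static String chooseDestination(String existingDestination, List<String> online,
      List<String> draining, boolean forceNewPlan, Random random) {
    List<String> servers = eligibleServers(online, draining, null);
    if (servers.isEmpty()) {
      return null;
    }
    if (forceNewPlan || existingDestination == null
        || draining.contains(existingDestination)) {
      return servers.get(random.nextInt(servers.size()));
    }
    return existingDestination;
  }

  public static void main(String[] args) {
    List<String> online = Arrays.asList("rs1", "rs2", "rs3"); // made-up server names
    List<String> draining = Arrays.asList("rs2");
    // A move that targets the draining server rs2 is redirected to rs1 or rs3.
    System.out.println(chooseDestination("rs2", online, draining, false, new Random()));
  }
}
```

This is why moving a region onto a draining server has no lasting effect: the requested destination is treated exactly like a missing plan and replaced by a random eligible server.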

7. How openRegion works

openRegion is essentially the initialization of an HRegion. The code below is what actually initializes the region.

  private long initializeRegionInternals(final CancelableProgressable reporter,
      MonitoredTask status) throws IOException, UnsupportedEncodingException {
    if (coprocessorHost != null) {
      status.setStatus("Running coprocessor pre-open hook");
      coprocessorHost.preOpen();
    }

    // Write HRI to a file in case we need to recover .META.
    status.setStatus("Writing region info on filesystem");
    checkRegioninfoOnFilesystem();

    // Remove temporary data left over from old regions
    status.setStatus("Cleaning up temporary data from old regions");
    cleanupTmpDir();

    // Load in all the HStores.
    // Get minimum of the maxSeqId across all the store.
    //
    // Context: During replay we want to ensure that we do not lose any data. So, we
    // have to be conservative in how we replay logs. For each store, we calculate
    // the maxSeqId up to which the store was flushed. But, since different stores
    // could have a different maxSeqId, we choose the
    // minimum across all the stores.
    // This could potentially result in duplication of data for stores that are ahead
    // of others. ColumnTrackers in the ScanQueryMatchers do the de-duplication, so we
    // do not have to worry.
    // TODO: If there is a store that was never flushed in a long time, we could replay
    // a lot of data. Currently, this is not a problem because we flush all the stores at
    // the same time. If we move to per-cf flushing, we might want to revisit this and send
    // in a vector of maxSeqIds instead of sending in a single number, which has to be the
    // min across all the max.
    long minSeqId = -1;
    long maxSeqId = -1;
    // initialized to -1 so that we pick up MemstoreTS from column families
    long maxMemstoreTS = -1;

    if (this.htableDescriptor != null &&
        !htableDescriptor.getFamilies().isEmpty()) {
      // initialize the thread pool for opening stores in parallel.
      ThreadPoolExecutor storeOpenerThreadPool =
        getStoreOpenAndCloseThreadPool(
          "StoreOpenerThread-" + this.regionInfo.getRegionNameAsString());
      CompletionService<Store> completionService =
        new ExecutorCompletionService<Store>(storeOpenerThreadPool);

      // initialize each store in parallel
      for (final HColumnDescriptor family : htableDescriptor.getFamilies()) {
        status.setStatus("Instantiating store for column family " + family);
        completionService.submit(new Callable<Store>() {
          public Store call() throws IOException {
            return instantiateHStore(tableDir, family);
          }
        });
      }
      try {
        for (int i = 0; i < htableDescriptor.getFamilies().size(); i++) {
          Future<Store> future = completionService.take();
          Store store = future.get();

          this.stores.put(store.getColumnFamilyName().getBytes(), store);
          long storeSeqId = store.getMaxSequenceId();
          if (minSeqId == -1 || storeSeqId < minSeqId) {
            minSeqId = storeSeqId;
          }
          if (maxSeqId == -1 || storeSeqId > maxSeqId) {
            maxSeqId = storeSeqId;
          }
          long maxStoreMemstoreTS = store.getMaxMemstoreTS();
          if (maxStoreMemstoreTS > maxMemstoreTS) {
            maxMemstoreTS = maxStoreMemstoreTS;
          }
        }
      } catch (InterruptedException e) {
        throw new IOException(e);
      } catch (ExecutionException e) {
        throw new IOException(e.getCause());
      } finally {
        storeOpenerThreadPool.shutdownNow();
      }
    }
    mvcc.initialize(maxMemstoreTS + 1);
    // Recover any edits if available.
    maxSeqId = Math.max(maxSeqId, replayRecoveredEditsIfAny(
        this.regiondir, minSeqId, reporter, status));

    status.setStatus("Cleaning up detritus from prior splits");
    // Get rid of any splits or merges that were lost in-progress.  Clean out
    // these directories here on open.  We may be opening a region that was
    // being split but we crashed in the middle of it all.
    SplitTransaction.cleanupAnySplitDetritus(this);
    FSUtils.deleteDirectory(this.fs, new Path(regiondir, MERGEDIR));

    this.writestate.setReadOnly(this.htableDescriptor.isReadOnly());

    this.writestate.flushRequested = false;
    this.writestate.compacting = 0;

    // Initialize split policy
    this.splitPolicy = RegionSplitPolicy.create(this, conf);

    this.lastFlushTime = EnvironmentEdgeManager.currentTimeMillis();
    // Use maximum of log sequenceid or that which was found in stores
    // (particularly if no recovered edits, seqid will be -1).
    long nextSeqid = maxSeqId + 1;
    LOG.info("Onlined " + this.toString() + "; next sequenceid=" + nextSeqid);

    // A region can be reopened if failed a split; reset flags
    this.closing.set(false);
    this.closed.set(false);

    if (coprocessorHost != null) {
      status.setStatus("Running coprocessor post-open hooks");
      coprocessorHost.postOpen();
    }

    status.markComplete("Region opened successfully");
    return nextSeqid;
  }
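The sequence id bookkeeping in this method is easier to see in isolation. The following sketch is a simplified model, not HBase code: each store is represented only by its maximum flushed sequence id, and the class and method names are hypothetical. It opens the "stores" in parallel with an ExecutorCompletionService, takes the minimum across stores as the conservative point from which edits would be replayed, and derives the next sequence id from the maximum.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class StoreOpenSketch {
  /** Returns {minSeqId, maxSeqId}, or {-1, -1} when there are no stores. */
  static long[] openStores(List<Long> perStoreMaxSeqIds) {
    long minSeqId = -1;
    long maxSeqId = -1;
    if (perStoreMaxSeqIds.isEmpty()) {
      return new long[] {minSeqId, maxSeqId};
    }
    ExecutorService pool =
        Executors.newFixedThreadPool(Math.min(4, perStoreMaxSeqIds.size()));
    CompletionService<Long> completionService = new ExecutorCompletionService<>(pool);
    try {
      for (final Long seqId : perStoreMaxSeqIds) {
        // Stands in for opening one store and reading its max sequence id.
        completionService.submit(() -> seqId);
      }
      // Collect results in completion order, not submission order.
      for (int i = 0; i < perStoreMaxSeqIds.size(); i++) {
        long storeSeqId = completionService.take().get();
        if (minSeqId == -1 || storeSeqId < minSeqId) {
          minSeqId = storeSeqId;
        }
        if (maxSeqId == -1 || storeSeqId > maxSeqId) {
          maxSeqId = storeSeqId;
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      pool.shutdownNow();
    }
    return new long[] {minSeqId, maxSeqId};
  }

  /** The next id is one past the larger of the store maximum and the replayed edits. */
  static long nextSeqId(long maxSeqId, long replayedUpTo) {
    return Math.max(maxSeqId, replayedUpTo) + 1;
  }

  public static void main(String[] args) {
    long[] ids = openStores(Arrays.asList(30L, 12L, 25L)); // made-up sequence ids
    System.out.println("replay edits after seqId " + ids[0]);
    System.out.println("next sequence id " + nextSeqId(ids[1], -1));
  }
}
```

Replaying from the minimum means a store that is ahead of the others may see some edits twice, which matches the comment in the original code about de-duplication being handled at read time.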

2013-05-30 14:17

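#1 leibnitz 2013-07-29

Quote

it could be split into two or more regions.

I noticed this as well. At first I thought that -ROOT- would be unnecessary if .META. is never split; after reading the JIRA, I suppose it is kept for future use :-)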
