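Switching NameNode HA from NFS-based to QJM-based

The rough procedure is:

  1. Disable the existing HA.
  2. The Standby NameNode role is removed, but its name directories are not deleted. Empty those directories.
  3. Enable QJM-based HA.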

Configuring HDFS High Availability

Converting from NFS-mounted shared edits directory to Quorum-based Storage

Converting your High Availability configuration from an NFS-mounted shared edits directory to Quorum-based Storage involves only two operations: disabling your current High Availability configuration, then enabling High Availability with Quorum-based Storage.

  1. Disable High Availability (see Disabling High Availability).
  2. Although the Standby NameNode role is removed, its name directories are not deleted. Empty these directories.
  3. Enable High Availability with Quorum-based Storage (see Enabling High Availability with Quorum-based Storage).
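
Before starting the conversion, it can help to confirm which kind of shared edits storage a NameNode is currently configured with. The following is a minimal sketch, assuming you have the value of the HDFS property `dfs.namenode.shared.edits.dir` at hand: Quorum-based Storage uses a `qjournal://` URI, while an NFS-mounted directory appears as a `file://` URI. The function names are hypothetical, and a `file://` value only shows that a local path is used, not that the path is actually an NFS mount.

```python
from urllib.parse import urlparse


def classify_shared_edits(uri):
    """Classify a dfs.namenode.shared.edits.dir value.

    Returns "qjm" for qjournal:// URIs, "nfs" for file:// URIs
    (assumed to be an NFS mount), and "unknown" otherwise.
    """
    scheme = urlparse(uri.strip()).scheme.lower()
    if scheme == "qjournal":
        return "qjm"
    if scheme == "file":
        return "nfs"
    return "unknown"


def journalnode_hosts(uri):
    """Return the host:port entries of a qjournal:// URI, or [] for other URIs.

    JournalNode addresses in a qjournal URI are separated by semicolons.
    """
    parsed = urlparse(uri.strip())
    if parsed.scheme.lower() != "qjournal":
        return []
    return [entry for entry in parsed.netloc.split(";") if entry]
```

For example, `classify_shared_edits("file:///mnt/shared-edits")` returns `"nfs"`, where `/mnt/shared-edits` is a made-up path.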

Disabling High Availability

Note:
If you have enabled Automatic Failover, you must disable it before you can disable High Availability.
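
One way to double-check the Automatic Failover setting outside the Cloudera Manager UI is to look at a deployed `hdfs-site.xml`. The sketch below assumes you have the file contents as a string and that the standard HDFS property `dfs.ha.automatic-failover.enabled` reflects the setting; the helper names are hypothetical, and Cloudera Manager itself remains the authoritative place to check and change it.

```python
import xml.etree.ElementTree as ET


def read_property(hdfs_site_xml, name, default=None):
    """Return the value of a Hadoop configuration property, or default if absent."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return default


def automatic_failover_enabled(hdfs_site_xml):
    """True if dfs.ha.automatic-failover.enabled is set to true in the given XML."""
    value = read_property(
        hdfs_site_xml, "dfs.ha.automatic-failover.enabled", "false"
    )
    return value.lower() == "true"
```

If this returns `True`, disable Automatic Failover first, as the note above requires.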

To disable High Availability

  1. From the Services tab, select your HDFS service.
  2. Click the Instances tab.
  3. Click Disable High Availability…
  4. Confirm that you want to take this action.
    • If you are using Quorum-based Storage, you will have the option of disabling the Quorum-based Storage, or leaving it enabled. If you are using NameNode Federation, you should consider leaving it enabled.
    • Cloudera Manager ensures that one NameNode is active, and saves the namespace. Then it stops the Standby NameNode, creates a SecondaryNameNode, removes the Standby NameNode role, and restarts all the HDFS services.
    • Note that although the Standby NameNode role is removed, its name directories are not deleted. Empty these directories after making a backup of their contents.
    • As when you enabled High Availability, you have the choice to have your dependent services restarted, and your client configuration redeployed as part of the Disable High Availability workflow. If you choose not to do this, you must do this manually.
  5. Update the Hive Metastore NameNode.
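
The "back up, then empty" handling of the old Standby NameNode name directories could be scripted roughly as follows. This is a minimal sketch, not something Cloudera Manager provides: the function name `backup_and_empty` and both paths are hypothetical, and it assumes the Standby NameNode role has already been stopped and removed and that the script runs as a user with access to the directories.

```python
import shutil
import time
from pathlib import Path


def backup_and_empty(name_dir, backup_root):
    """Copy name_dir into a timestamped folder under backup_root, then empty name_dir.

    The directory itself is kept; only its contents are removed.
    Returns the path of the backup copy.
    """
    src = Path(name_dir).resolve()
    root = Path(backup_root).resolve()
    if not src.is_dir():
        raise NotADirectoryError(str(src))
    if root == src or src in root.parents:
        raise ValueError("backup_root must be outside the name directory")

    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = root / f"{src.name}-{stamp}"
    # copytree raises an error if dest already exists, so an earlier
    # backup is never overwritten.
    shutil.copytree(src, dest)

    # Remove the contents only after the copy has completed without error.
    for entry in list(src.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    return dest
```

A call might look like `backup_and_empty("/data/dfs/nn", "/backup/nn")`, with both paths replaced by the name directory and backup location used on your host. Run it once per configured name directory.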

Enabling High Availability with Quorum-based Storage

After you have installed HDFS on your CDH4 cluster, the Enable High Availability workflow leads you through adding a second (Standby) NameNode and configuring JournalNodes.

  1. From the Services tab, select your HDFS service.
  2. Click the Instances tab.
  3. Click Enable High Availability. (This button does not appear if this is a CDH3 version of the HDFS service.)
  4. The next screen shows the hosts that are eligible to run a Standby NameNode and the JournalNodes.
    1. Select Enable High Availability with Quorum-based Storage as the High Availability Type.
    2. Select the host where you want the Standby NameNode to be set up. The Standby NameNode cannot be on the same host as the Active NameNode, and the host that is chosen should have the same hardware configuration (RAM, disk space, number of cores, etc.) as the Active NameNode.
    3. Select an odd number of hosts (a minimum of three) to act as JournalNodes. JournalNodes should be hosted on machines with hardware specifications similar to those of the NameNodes. It is recommended that you place one JournalNode on each of the hosts running the Active and Standby NameNodes, and the third JournalNode on similar hardware, such as the JobTracker host.
    4. Click Continue.
  5. Enter a directory location for the JournalNode edits directory into the fields for each JournalNode host.
    • You may enter only one directory for each JournalNode. The names/paths do not need to be the same on every JournalNode.
    • The directories you specify should be empty, and must have the appropriate permissions.
    • If the directories are not empty, Cloudera Manager will not delete the contents; however, in that case the data should be in sync across the edits directories of the JournalNodes and should have the same version data as the NameNodes.
  6. You can choose whether the workflow will restart the dependent services and redeploy the client configuration for HDFS. To do this manually rather than have it done as part of the workflow, uncheck these extra options.
  7. Click Continue. Cloudera Manager proceeds to execute the set of commands that will stop the dependent services, delete, create, and configure roles and directories as appropriate, and will restart the dependent services and deploy the new client configuration if those options were selected.
  8. There are some additional steps you must perform if you want to use Hive, Impala, or Hue in a cluster with High Availability configured. Follow the Post Setup Steps described below.
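
The host and directory rules in the steps above can be checked before running the workflow. The following is a minimal sketch under stated assumptions: the function names and host names are hypothetical, the checks cover only the rules spelled out in the text (Standby on a different host from the Active NameNode, an odd number of at least three JournalNodes, an empty edits directory), and "appropriate permissions" is approximated as "writable by the user running the script", which may not match what your cluster requires.

```python
import os


def check_ha_selection(active_nn, standby_nn, journalnodes):
    """Return a list of problems with the chosen hosts (empty list means OK)."""
    problems = []
    if standby_nn == active_nn:
        problems.append(
            "Standby NameNode cannot be on the same host as the Active NameNode"
        )
    count = len(set(journalnodes))  # count distinct hosts only
    if count < 3:
        problems.append("at least three JournalNode hosts are required")
    if count % 2 == 0:
        problems.append("the number of JournalNode hosts must be odd")
    return problems


def check_edits_dir(path):
    """Return a list of problems with a JournalNode edits directory on this host."""
    if not os.path.isdir(path):
        return [f"{path} is not an existing directory"]
    problems = []
    if os.listdir(path):
        problems.append(f"{path} is not empty")
    # Assumption: "appropriate permissions" means the current user can
    # write to and enter the directory.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append(f"{path} is not writable by the current user")
    return problems
```

With the recommended layout, `check_ha_selection("nn1", "nn2", ["nn1", "nn2", "jt1"])` returns an empty list. `check_edits_dir` has to be run on each JournalNode host against the directory entered for that host.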