Monday 1 April 2019

Oracle BDA - BDR Immediate one-time Replication Schedule Fails on HDFS Replication

Issue Description 

While setting up BDR (Backup and Disaster Recovery) between two remote BDA (Big Data Appliance) racks, we finished the prerequisites, added the source BDA as a peer, and made the HDFS directory to be replicated snapshottable on both sides. However, the initial load failed with an "HDFS replication command failed." error. We tested snapshot creation for the same HDFS directory on both BDAs, and it completed successfully.

- Main error: "User performing the MapReduce job must be a superuser to preserve user and group permissions: admin"
- Diff-based replication was not used ("Used diff: false"). Error: "No snapshottable directories have found. Reason: either run-as-user does not have permissions to get snapshottable directories or source path is not snapshottable"
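The second error can be narrowed down from the command line by listing the snapshottable directories that a given user is allowed to see. The following is only a minimal sketch: the function name is_snapshottable and the path /data/replicate are hypothetical, and it assumes that the path is the last column of the hdfs lsSnapshottableDir output and contains no spaces.

```shell
# Return 0 if the given HDFS path appears in the list of snapshottable
# directories visible to the current user, non-zero otherwise.
# Assumption: the path is the last whitespace-separated field of each line.
is_snapshottable() {
    hdfs lsSnapshottableDir 2>/dev/null |
        awk -v p="$1" '$NF == p { found = 1 } END { exit !found }'
}

# Example (hypothetical path), comparing the hdfs superuser with the
# run-as user of the replication schedule:
#   sudo -u hdfs  hdfs lsSnapshottableDir
#   sudo -u admin hdfs lsSnapshottableDir
```

If the directory shows up for hdfs but not for the run-as user, the problem is most likely the user's permissions rather than the snapshottable flag on the directory.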

Issue Detail

The default HDFS superuser group is “supergroup”, and by default that group does not exist on the BDA cluster nodes. The configured group name can be verified in Cloudera Manager.
From Cloudera Manager, navigate to HDFS > Configuration > Superuser Group
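The same check can be sketched from the shell. The helper names superuser_group and group_exists below are hypothetical. The property dfs.permissions.superusergroup is the standard HDFS setting behind the Superuser Group field, and the value printed is the one in the client configuration on the node where the command runs.

```shell
# Print the HDFS superuser group from the local client configuration.
superuser_group() {
    hdfs getconf -confKey dfs.permissions.superusergroup
}

# Return 0 if the named OS group exists on the local node, non-zero otherwise.
group_exists() {
    getent group "$1" > /dev/null
}

# Example (run on the first node; dcli -C repeats the lookup on every node):
#   grp=$(superuser_group)
#   group_exists "$grp" || echo "group $grp is missing on this node"
#   dcli -C "getent group $grp"
```

An empty result from the getent lookup on a node means the group has not been created there.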

 

Solution

We need to create the group “supergroup” on all cluster nodes, then create a user, say “admin”, and add this user to supergroup.

I used the set of commands below to perform the required tasks. The commands need to be executed on the first node of the cluster, and this must be done on both the source and target clusters.

# dcli -C groupadd -g 20020 supergroup
# dcli -C useradd -u 20020 -g hadoop -G hadoop,hive,supergroup -m admin
# dcli -C id admin
# dcli -C passwd -S admin
# hash=$(echo '<hidden>' | openssl passwd -1 -stdin);
# dcli -C "usermod --pass='$hash' admin"
# dcli -C passwd -S admin
# sudo -u hdfs hadoop fs -mkdir /user/admin
# sudo -u hdfs hadoop fs -chown admin:hadoop /user/admin
# sudo -u hdfs hadoop fs -ls -d /user/admin
# dcli -C chmod 755 /home/admin
# dcli -C mkdir -p /home/admin/DATA
# dcli -C ls -ld /home/admin 
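Before retrying the replication, it may be worth confirming that the new user really is a member of the superuser group. The helper below is a small sketch, not part of the original procedure. The name user_in_group is hypothetical, and it only inspects the local node.

```shell
# Return 0 if user $1 belongs to group $2 on the local node, non-zero otherwise.
user_in_group() {
    local user=$1 group=$2 g
    # id -nG prints all group names of the user on one line.
    for g in $(id -nG "$user" 2>/dev/null); do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}

# Example:
#   user_in_group admin supergroup && echo "admin is in supergroup"
#   dcli -C "id -nG admin"        # same lookup on every node
#   hdfs groups admin             # groups as resolved by the NameNode
```

The hdfs groups command is useful here because HDFS decides superuser status from the groups the NameNode resolves for the user, which can differ from the node where the check is run.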

Creating Replication Schedule

When creating the replication schedule, use the user “admin” for both Run as Username and Run on Peer as Username.

 

References

[1] https://support.oracle.com/epmos/main/hadoop/BDR-doAs-Fails-Error-User-performing-the-MapReduce-job-must-be-a-superuser-to-preserve-user-and-group-permissions
[2] https://support.oracle.com/epmos/main/hadoop/Create-HDFS-Superuser-Group