Contents
  1. Motivation
  2. Configuration approach
  3. Kerberos installation and configuration
  4. Hadoop configuration
  5. WebHDFS configuration reference
  6. Conclusion

Motivation

The goal is to prevent Hadoop components from being broken into, controlled, or invoked by unauthorized parties. For example, the DataNode performs no authentication on block reads and writes, so any client that knows a block ID can freely access that block's data on the DataNode. Likewise, the JobTracker performs no authentication, so job state can be changed at will, and an unauthorized user can impersonate a NodeManager to claim tasks.

Configuration approach

Hadoop natively uses Kerberos for authentication: a mechanism by which A proves to B that it really is who it claims to be. By strengthening ticket verification, components are protected against impersonation and intrusion. Kerberos is widely supported; on Linux, curl ships with ticket (SPNEGO) integration, and Windows has comparable support, although the author did not manage to get the Windows setup working. The Kerberos authorization flow is roughly as shown in the figure below.
(Figure: Kerberos authorization flow)

Kerberos installation and configuration

Installing Kerberos itself is outside the scope of this article, so it is only sketched here.
The following krb5.conf is needed on the KDC and on every server:

[kdc]
 profile = /usr/local/var/krb5kdc/kdc.conf
[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log
[libdefaults]
 default_realm = DOMAIN.COM
 dns_lookup_realm = true
 dns_lookup_kdc = true
 ticket_lifetime = 24h
 forwardable = true
 ccache_type = 4
 proxiable = true
 renew_lifetime = 7d
[realms]
 DOMAIN.COM = {
  kdc = kerberos.DOMAIN.com
  admin_server = kerberos.DOMAIN.com
 }
[domain_realm]
 .DOMAIN.com = DOMAIN.COM
 DOMAIN.com = DOMAIN.COM
[login]
 krb4_convert = true
 krb4_get_tickets = false

On the KDC itself, /usr/local/var/krb5kdc/kdc.conf must also be configured, with content like the following:

[kdcdefaults]
 kdc_ports = 750,88
 kdc_tcp_ports = 88
[realms]
 DOMAIN.COM = {
  acl_file = /usr/local/var/krb5kdc/kadm5.acl
  dict_file = /usr/share/dict/words
  admin_keytab = /usr/local/var/krb5kdc/kadm5.keytab
  kdc_ports = 750,88
  max_life = 1d 0h 0m 0s
  max_renewable_life = 7d 0h 0m 0s
  supported_enctypes = des3-hmac-sha1:normal des-cbc-crc:normal des:normal des:v4 des:norealm des:onlyrealm
  default_principal_flags = +preauth
 }

After configuring the administrator account admin/admin@DOMAIN.COM on the KDC, generate the service principals. Each host generally needs two or more principals, one of which must begin with HTTP/ (required for SPNEGO).
Export them into hadoop.keytab and distribute that file to every machine running a Hadoop component.
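As a sketch of this step (assuming MIT Kerberos and the host he02 from the test session below), the principals and the keytab can be created on the KDC with kadmin.local:

```shell
# Hypothetical commands, run as root on the KDC; repeat for each Hadoop host.
# -randkey generates a random key so no password is ever typed for service principals.
kadmin.local -q "addprinc -randkey hadoop/he02@DOMAIN.COM"
kadmin.local -q "addprinc -randkey HTTP/he02@DOMAIN.COM"

# Export both principals into a single keytab file for distribution:
kadmin.local -q "ktadd -k hadoop.keytab hadoop/he02@DOMAIN.COM HTTP/he02@DOMAIN.COM"

# Sanity-check the keytab contents before copying it to the cluster:
klist -k -t hadoop.keytab
```

Note that ktadd rotates the key each time it is run, so re-exporting a principal invalidates previously distributed keytabs.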
To test, start the KDC daemons, /usr/local/sbin/krb5kdc and /usr/local/sbin/kadmind, and then authenticate with kinit:

[hadoop@he02 hadoop]$ kinit -k -t hadoop.keytab hadoop/he02@DOMAIN.COM -V
Using default cache: /tmp/krb5cc_500
Using principal: hadoop/he02@DOMAIN.COM
Using keytab: hadoop.keytab
Authenticated to Kerberos v5
[hadoop@he02 hadoop]$ klist
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: hadoop/he02@DOMAIN.COM
Valid starting       Expires              Service principal
date                 date                 krbtgt/DOMAIN.COM@DOMAIN.COM
        renew until date

If you see output like this, authentication succeeded.
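Once the Hadoop side is configured for Kerberos (next section), client commands should only work with a valid ticket. A quick sanity check, sketched under the assumption that the cluster is already running with the configuration below:

```shell
# Without a ticket, RPC calls should be rejected with a GSS initiate failure
# (typically "GSSException: No valid credentials provided").
kdestroy
hdfs dfs -ls /

# Re-authenticate from the keytab and the same command should succeed:
kinit -k -t hadoop.keytab hadoop/he02@DOMAIN.COM
hdfs dfs -ls /
```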

Hadoop configuration

Add the following to core-site.xml:

    <property>
      <name>hadoop.proxyuser.hduser.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.hduser.groups</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.security.authentication</name>
      <value>kerberos</value>
    </property>
    <property>
      <name>hadoop.security.authorization</name>
      <value>true</value>
    </property>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://cs10:9000</value>
    </property>
    <property>
      <name>hadoop.proxyuser.hadoop.groups</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.hadoop.hosts</name>
      <value>*</value>
    </property>

Add the following to mapred-site.xml:

  <property>
    <name>mapreduce.jobhistory.keytab</name>
    <value>/usr/local/hadoop/etc/hadoop/hadoop.keytab</value>
  </property>

  <property>
    <name>mapreduce.jobhistory.principal</name>
    <value>hadoop/_HOST@DOMAIN.COM</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.kerberos.principal</name>
    <value>hadoop/_HOST@DOMAIN.COM</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.keytab.file</name>
    <value>/usr/local/hadoop/etc/hadoop/hadoop.keytab</value>
  </property>

  <property>
    <name>mapreduce.tasktracker.kerberos.principal</name>
    <value>hadoop/_HOST@DOMAIN.COM</value>
  </property>
  <property>
    <name>mapreduce.tasktracker.keytab.file</name>
    <value>/usr/local/hadoop/etc/hadoop/hadoop.keytab</value>
  </property>
  <property>
    <name>mapred.task.tracker.task-controller</name>
    <value>org.apache.hadoop.mapred.LinuxTaskController</value>
  </property>
  <property>
    <name>mapreduce.tasktracker.group</name>
    <value>hadoop</value>
  </property>

Add the following to yarn-site.xml:

  <property>
    <name>yarn.resourcemanager.keytab</name>
    <value>/usr/local/hadoop/etc/hadoop/hadoop.keytab</value>
  </property>
  <property>
    <name>yarn.resourcemanager.principal</name>
    <value>hadoop/_HOST@DOMAIN.COM</value>
  </property>
  <property>
    <name>yarn.nodemanager.keytab</name>
    <value>/usr/local/hadoop/etc/hadoop/hadoop.keytab</value>
  </property>
  <property>
    <name>yarn.nodemanager.principal</name>
    <value>hadoop/_HOST@DOMAIN.COM</value>
  </property>

Add the following to hdfs-site.xml. Properties whose names end in kerberos.internal.spnego.principal must use a principal beginning with HTTP/, otherwise the NameNode fails with a 'No Key to Store' error:

  <property>
    <name>dfs.namenode.keytab.file</name>
    <value>/usr/local/hadoop/etc/hadoop/hadoop.keytab</value>
  </property>
  <property>
    <name>dfs.namenode.kerberos.principal</name>
    <value>hadoop/_HOST@DOMAIN.COM</value>
  </property>
  <property>
    <name>dfs.namenode.kerberos.internal.spnego.principal</name>
    <value>HTTP/_HOST@DOMAIN.COM</value>
  </property>
  <property>
    <name>dfs.block.access.token.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.secondary.namenode.keytab.file</name>
    <value>/usr/local/hadoop/etc/hadoop/hadoop.keytab</value>
  </property>
  <property>
    <name>dfs.secondary.namenode.kerberos.principal</name>
    <value>hadoop/_HOST@DOMAIN.COM</value>
  </property>
  <property>
    <name>dfs.secondary.namenode.kerberos.internal.spnego.principal</name>
    <value>HTTP/_HOST@DOMAIN.COM</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir.perm</name>
    <value>700</value>
  </property>
  <property>
    <name>dfs.datanode.keytab.file</name>
    <value>/usr/local/hadoop/etc/hadoop/hadoop.keytab</value>
  </property>
  <property>
    <name>dfs.datanode.kerberos.principal</name>
    <value>hadoop/_HOST@DOMAIN.COM</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>

  <property>
    <name>dfs.web.authentication.kerberos.principal</name>
    <value>HTTP/_HOST@DOMAIN.COM</value>
  </property>

  <property>
    <name>dfs.web.authentication.kerberos.keytab</name>
    <value>/usr/local/hadoop/etc/hadoop/hadoop.keytab</value>
  </property>
    <!-- When the DataNode starts, checkSecureConfig(dnConf, conf, resources) validates the
    secure setup. Without the setting below, startup fails with:
    Cannot start secure DataNode without configuring either privileged resources or SASL RPC data transfer protection and SSL for HTTP.
    Using privileged resources in combination with SASL RPC data transfer protection is not supported. -->
  <property>
    <name>ignore.secure.ports.for.testing</name>
    <value>true</value>
  </property>
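ignore.secure.ports.for.testing is, as its name says, a test-only bypass. For production, the usual alternative (a sketch, not verified on this cluster) is to enable SASL data transfer protection together with HTTPS, which satisfies checkSecureConfig without requiring privileged ports:

```xml
  <!-- Sketch only: requires a working SSL keystore configured in ssl-server.xml. -->
  <property>
    <name>dfs.data.transfer.protection</name>
    <value>authentication</value>
  </property>
  <property>
    <name>dfs.http.policy</name>
    <value>HTTPS_ONLY</value>
  </property>
```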

WebHDFS configuration reference

The following is excerpted from the Apache Hadoop WebHDFS documentation:

dfs.webhdfs.enabled
    Enable/disable WebHDFS in Namenodes and Datanodes.
dfs.web.authentication.kerberos.principal
    The HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint. The HTTP Kerberos principal MUST start with 'HTTP/' per the Kerberos HTTP SPNEGO specification.
dfs.web.authentication.kerberos.keytab
    The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint.
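Since curl integrates with the Kerberos ticket cache, WebHDFS can be exercised over SPNEGO after kinit. A sketch, where the NameNode host cs10 is taken from fs.default.name above and port 50070 is assumed to be the Hadoop 2.x default HTTP port:

```shell
# --negotiate enables SPNEGO authentication; "-u :" tells curl to take the
# credentials from the current ticket cache rather than from a password.
curl --negotiate -u : "http://cs10:50070/webhdfs/v1/tmp?op=LISTSTATUS"
```

Without a valid ticket the request should come back as 401 Unauthorized, which is a convenient way to confirm that the HTTP endpoint is actually enforcing authentication.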

Conclusion
