I participated in DC 27 QUALS as a member of Team Enu (30th place) and solved speedrun-008, speedrun-010, know_your_mem, and vitor. Here are my impressions.

# speedrun-008

Leak the stack canary, then ROP. Nothing more to it.

# speedrun-010

UAF + one-gadget RCE. Nothing more to it.

# know_your_mem

Binary search for mmap-ed regions.
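
The idea can be sketched as follows. This is a minimal simulation, not the shellcode I used: `any_mapped` is a hypothetical oracle that answers whether at least one page in a range is mapped (in the real task such an answer would have to come from a syscall's error code), and the address range and secret address are made up.

```python
PAGE = 0x1000

def find_mapped_page(any_mapped, lo, hi):
    """Return the lowest mapped page address in [lo, hi), or None.

    any_mapped(start, length) is a hypothetical oracle reporting whether
    at least one page in [start, start + length) is mapped.
    lo and hi are assumed to be page-aligned.
    """
    if not any_mapped(lo, hi - lo):
        return None
    while hi - lo > PAGE:
        # Split the range on a page boundary.
        mid = lo + ((hi - lo) // PAGE // 2) * PAGE
        if any_mapped(lo, mid - lo):
            hi = mid   # something is mapped in the lower half
        else:
            lo = mid   # otherwise it must be in the upper half
    return lo

def make_oracle(mapped_pages):
    """Simulated oracle backed by a set of mapped page addresses."""
    def any_mapped(start, length):
        return any(start <= p < start + length for p in mapped_pages)
    return any_mapped

oracle = make_oracle({0x7f1234567000})  # made-up secret page
secret = find_mapped_page(oracle, 0x100000000000, 0x800000000000)
print(hex(secret))
```

Each probe halves the candidate range, so even a 47-bit address space needs only a few dozen probes.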

# vitor

A long journey through Android/Crypto/shellcode/ROP/JS/Z3. The JS part was analyzed by my teammate. Setting DWORD PTR constraints with a left shift was a bit confusing:

# gloryhost

I couldn’t find a way to get the flag in time. I wish I had realized that it was a side-channel (specifically, Spectre) task when I found the reference to rdtsc.
While the nc command on Ubuntu 16.04 implicitly behaves as if -N were given, the one on Ubuntu 18.04 does not, so I wasted time finding a way to get a message from the exported function this_is_what_ive_got(). The concept was interesting, but I am not fond of such environment-dependent behavior.

# RTOoOS

Hypervisor.framework-based VM escape task. The first attack surface is export. I was exhausted when I got a dump of honcho … and couldn’t solve it within the time.

# Random Thoughts

No task stood out as very good, but the Speedruns are good practice for beginners and retirees (me) because they can be solved in a straightforward way. The top-tier teams solved the Speedruns in about 5 minutes each, but I took one and a half hours to solve ‘em. Way too weak, right?

Last week, I gave a talk at 第51回 情報科学若手の会.

I was pleased to be able to talk about the relationship between the artisanship of reverse engineering and the principles of computer science. The questions asked on site were as follows:

| Q | A |
| --- | --- |
| Does the “operation” here mean mathematical operation or anything else? | In this presentation, the word “operation” corresponds to assembly or BitVector operations. |
| How widespread is advanced obfuscation? | Although it is not widely used in mass-market malware, it is common in government-grade malware, e.g. APT28. |

Also, I got the feedback:

It sounds great.

At the meeting, I enjoyed discussing various topics related to information science:

• Polyhedral optimization, a method to optimize a certain kind of loop called SCoP using a linear algebra model.
• A paper on the principles of adversarial examples, focusing on Fourier basis functions.
• Ragel, the parser generator used in the Ruby community. It was news to me.
• The hypothesis that the cerebellum may be using reinforcement learning.
• The weakness of the random number generator in Dragon Quest 4 (PS version).
• A quantitative indicator of the beauty of exploit code.
• Some binary analysis platforms written in Rust, e.g. wasabi, falcon and finch.
• The fact that Pokemon Go players are knowledgeable about the Hilbert curve.

If these topics capture your interest, why don’t you join the next meeting?

Thank you for telling me about this problem, @kumagi!

# Introduction

Dog or Frog 2018shell2.picoctf.com:18318 is a classic task about adversarial examples.

The following files are given:

• model
• solution template
• notes
• source

In short, we need to add noise to the given image so that the classifier misrecognizes it.

# Solution

I wrote a patch to the solution template with reference to the tutorial and the paper with the well-known panda figure.

The implementation corresponds to the formula below, adjusted to the given model:

$\tilde{\boldsymbol{x}} = \boldsymbol{x} + \epsilon \, \operatorname{sign}\left(\nabla_{\boldsymbol{x}} \text{Loss}(\boldsymbol{x}, y)\right)$ where $y$ is a label.
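
To make the formula concrete, here is a minimal sketch on a toy logistic-regression "model" instead of the task's network. The weights, input and $\epsilon$ are made-up values, and the gradient is written by hand rather than obtained from a framework, so this is an illustration of the update rule only, not my actual patch.

```python
import math

def sign(v):
    """Return -1, 0 or 1."""
    return (v > 0) - (v < 0)

def fgsm(x, y, w, b, eps):
    """x_adv = x + eps * sign(dLoss/dx) for a logistic-regression model.

    Loss is binary cross-entropy, whose gradient with respect to x is
    (sigmoid(w.x + b) - y) * w.
    """
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

def predict(x, w, b):
    """Class 1 if the linear score is positive, else class 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

w, b = [2.0, -1.0, 0.5], 0.0   # toy weights (assumption)
x, y = [0.2, 0.1, 0.1], 1      # toy input, correctly classified as 1
x_adv = fgsm(x, y, w, b, eps=0.3)
print(predict(x, w, b), predict(x_adv, w, b))
```

Every component moves by at most $\epsilon$, yet the prediction flips. For a targeted attack, the step would instead go against the gradient of the loss computed for the target label.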

Then I got:

The perturbed image:

The flag:

# Final words

The attack method used here is called the fast gradient sign method. It is like the “Hello, world!” of adversarial examples research. At this rate, machine learning might also become an essential skill in CTF.

Further CTF tasks related to neural networks are below:

• foolme (BCTF 2017)
• adamtune (DEF CON CTF Qualifier 2018)
• Astral Mind (SwampCTF 2018)
• Pilgrim (SwampCTF 2018)

Let me know if there is anything else missing!

This is too brief to be called a write-up. But I’m tired …

# Introduction

I participated in DEF CON CTF Qualifier 2018 as a member of a certain team, which finished in an ignominious 40th place. But somehow I solved 3 tasks:

• ELF Crumble
• babypwn1805
• elastic cloud compute (memory) corruption

Here are my impressions.

# ELF Crumble

This is a task to combine and execute 8 binary fragments correctly. I wrote a damn brute-force solver for this, because my brain was dead.
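
A brute-force solver of this kind can be sketched as below. With 8 fragments there are only 8! = 40,320 orderings to try. The `is_valid` callback is a hypothetical stand-in: in the real task it would patch the candidate bytes into the broken ELF and run it, while here it merely compares a hash so that the sketch is self-contained.

```python
import hashlib
from itertools import permutations

def solve(fragments, is_valid):
    """Try every ordering of the fragments; return the first accepted blob."""
    for order in permutations(fragments):
        candidate = b"".join(order)
        if is_valid(candidate):
            return candidate
    return None

# Toy data: 8 made-up fragments and the digest of their correct ordering.
pieces = [bytes([i]) * 3 for i in range(8)]
digest = hashlib.sha256(b"".join(pieces)).hexdigest()
shuffled = pieces[3:] + pieces[:3]

found = solve(shuffled, lambda blob: hashlib.sha256(blob).hexdigest() == digest)
print(found == b"".join(pieces))
```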

# babypwn1805

A blind pwn task. I accidentally found offset -0x38 to the GOT entry of read. Then I wrote a probabilistic solver.

# elastic cloud compute (memory) corruption

A VM escape task.

We were given a qemu-system-x86_64 binary with a vulnerable PCI device named ooo. Notable functions are as follows:

| function | description |
| --- | --- |
| sub_6E61F4 | corresponds to ooo_mmio_write |
| sub_6E613C | corresponds to ooo_mmio_read |
| sub_6E64A5 | invokes system("cat ./flag") |

What matters is the use-after-free vulnerability in:

Using the chunk offset at 0x1317940 as a clue, we can now overwrite malloc@GOT with sub_6E64A5 via a fastbin attack, in particular using devmem.

I stayed up all night for this. I was tired but it was fun. I used these past write-ups as a reference when solving this task:

Thanks authors!

# Final Words

Other tasks I had wanted to solve are:

• flagsifier
• TechSupport
• smcauth

This year, the organizer of DEF CON CTF changed from LegitBS to OOO (Order of the Overflow). OOO seems to aim at connecting academic research and CTF. I support this philosophy, but this competition was not perfect. My impressions are summarized as follows:

| Pros | Cons |
| --- | --- |
| Meritocratic rev/pwn. Brand-new topics, e.g. blockchain, neural network, reversing of Rust binary. | Many guessing tasks. Some incredible, old-fashioned tasks. In particular, sbva and ghettohackers: Throwback are quite bad. |

Anyway, I’m looking forward to next year’s.

HAI DOMO. This post is for 武蔵野 Advent Calendar 2017.

# Introduction

In May this year, I started my career as an apprentice security researcher somewhere in Musashino. One of my job responsibilities is to write a “good” paper, good enough to be accepted at top-tier (non-crypto) security conferences like the following:

• IEEE S&P (Oakland)
• ACM CCS
• NDSS
• USENIX Security

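However, I am profoundly ignorant of the cardinal rules of “good” security research and technical writing. I don’t understand it at all. We are doing research by feel. I thought I had to do something.

The joke paper entitled Paper Gestalt, distributed at CVPR’10, gave me a hint.

The key idea of the paper is that a “good” paper might be distinguished by image recognition. I am getting tired and short on time, so the rest of this post was originally written in Japanese. This joke paper proposes a classifier that converts a paper into an image, extracts local features, and judges whether the paper will be accepted at a top conference. See Paper Gestalt - n_hidekeyの日記 for details. The point is that papers with cool-looking equations and figures end up looking the part.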

# Dataset

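Four years’ worth of papers accepted at the conferences above serve as positive examples. Four years’ worth of papers accepted at their co-located workshops are regarded as papers rejected from the top conferences and serve as negative examples. The negative examples include papers by my seniors. Sorry, but I think they will understand.

After scraping everything, I removed items shorter than 4 pages, such as posters and short papers. I also removed keynotes and slides, because what matters is full papers. The resulting numbers of papers are as follows: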

| accepted | rejected |
| --- | --- |
| 1,266 | 794 |

Word cloud of the positive examples:

Negative examples:

I can’t tell anything from this.

# Pre-processing

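Convert the paper PDFs into images.

I laid out the pages of each paper PDF side by side and padded papers shorter than 20 pages with blank pages. Example:

As you can see, papers accepted to the main track of USENIX conferences come with a cool cover page. I removed it to match the format of the other papers:

It took about 3 hours to convert all the PDFs into images with ImageMagick. I then split them half and half into training and validation sets.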

# Training

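Since this is a benchmark, I use LeNet-5 almost as is. That is the one that is always subjected to cruel torture, such as being forced to recognize handwritten characters over and over.

The framework is keras. Considering that the dataset is small and class imbalanced, I followed Building powerful image classification models using very little data and trained with augmentation, specifically zoom and horizontal flip. The other parameters are a typical configuration chosen by feel:

ReLU cross-entropy RMSProp 50% 64 100 validation accuracy

Ideally I should tune everything in detail, including the network architecture, but the computation did not look like it would finish by the posting date on my crummy machine. So I did not use cool techniques such as hyperopt/hyperopt or its keras integration, maxpumperla/hyperas. Sorry (2).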

# Results

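Early stopping kicked in and training stopped at epoch 16. Learning curve:

Meh. However, when I fed in a paper that I had submitted to a domestic workshop in my student days,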

| accepted | rejected | predict |
| --- | --- | --- |
| 7.3411% | 92.6589% | rejected |

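well, it seems to be classified correctly, so let’s call it good. What do you mean, classified correctly? Are you making fun of me? Right now, I harbor ill will toward various things. I will do things like saliency map visualization when I feel like it.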

# Final Words

The Welcome Slides of ACM CCS’17 contain some precious words:

In other words, that is how it is. Those who escape into cheap tricks will fail at whatever they do. Let’s do this.

HAI DOMO. This post is for 武蔵野 Advent Calendar 2017 and also for CTF Advent Calendar 2017.

# Introduction

In May this year, I participated in DEF CON CTF Qualifier 2017 as a member of a certain 武蔵野-related team. Actually, I’m not a top-tier CTF player, but I did my best and solved 4 challenges:

• crackme1
• beatmeonthedl
• enlightenment
• Pepperidge Farm

Write-ups already exist except for Pepperidge Farm. So I decided to write about it. FYI: The binaries are available at legitbs/quals-2017.

# Pepperidge Farm

Pepperidge Farm is categorized as Reverse Engineering. The problem statement is below:

Remember when the first CTF was run with a custom architecture? Pepperidge Farm remembers:
https://github.com/JonathanSalwan/VMNDH-2k12

It seems like a keygenning challenge on the custom virtual machine VMNDH-2k12.

# VMNDH-2k12

VMNDH-2k12 is the VM built for Nuit du Hack CTF Quals 2012, as its name suggests. The architecture is described in shell-storm | Useless emulator for fun (VMNDH-2k12). This VM parses a given serialized binary and repeats fetch, decode and execute.

Writing an IDA loader/processor module is a common way to analyze a VM-based obfuscated binary. Modules for VMNDH-2k12 and for the modified version VMNDH-2k13 already exist:

Note that if you try to solve this challenge with the above-mentioned processor modules on IDA Pro 7.0, backward-compatibility issues occur. For example, you have to change self.regFirstSreg to self.reg_first_sreg in a module.

Also, a Binary Ninja plugin was released after quals:

But in this post, I describe a solution that uses neither IDA Pro nor Binary Ninja, because VMNDH-2k12 is open source and easy to modify.

# Surface Analysis

Yes, the VM has its own debugger and disassembler.

# Modifying The Disassembler

However, there are pitfalls here.

Because it is a unique architecture, the destinations of control flow instructions differ from those shown in the disassembly dump. For example, take a look at src_vm/op_call.c:

According to this, I modified the disassembler in src_vm/disass.c:

This makes it possible to correctly display the address of the call destination in the disassembly dump. In addition, jump instructions need to be modified. In the case of jnz:

Herewith,

becomes:

Good.
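
The kind of fix described above boils down to printing an absolute destination instead of the raw operand. The following is only a sketch under assumptions: it supposes that the operand is a signed 16-bit displacement relative to the address of the next instruction and that addresses wrap at 16 bits. The actual rule is whatever src_vm/op_call.c implements, which is not reproduced here, and the example addresses and instruction length are made up.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc, operand, insn_len):
    """Absolute destination of a PC-relative branch (assumed encoding).

    pc       -- address of the branch instruction itself
    operand  -- raw 16-bit operand as it appears in the instruction
    insn_len -- length of the branch instruction in bytes
    """
    return (pc + insn_len + to_signed16(operand)) & 0xFFFF

# Forward and backward branches with made-up addresses.
print(hex(branch_target(0x8000, 0x0010, 4)))
print(hex(branch_target(0x8100, 0xFFF0, 4)))
```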

Also, after 0x8759 it looks like a data section.

However, parts such as 0x87e8 are misinterpreted as code.

The data section was not obfuscated.

So I gave first aid to src_vm/disass.c:

Awful… who cares?

# First Attempt with KLEE

This is a failure case.

As we have seen so far, VMNDH-2k12 is open source. So I tried to solve the challenge with KLEE, a symbolic execution tool that works on source code.

I modified src_vm/syscall_write.c for assertion:

Here is a modified Makefile:

I left it all to KLEE and went to bed…

… It’s not going to be easy.

I also wrote a solver with angr, which symbolizes stdin, but… let’s not talk about it.

An example of insufficient SMTLIB2 representations is:

# Solution with Z3

Since there was no choice, I read all the disassembly. After some twists and turns, I realized that:

• Pepperidge Farm checks character codes against transformed 0x20 bytes values.
• 0x8247(x, y) returns x * 100 + y.

Now Z3 time. For example, subroutine 0x8269:

becomes:

This is satisfiable. But it is not enough. We need to add the rest of the constraints.

Even with partial constraints, the process proceeds further. So inscount with Pin or other dynamic binary instrumentation tools might be helpful.

Finally I got:

The final SMTLIB2 representation is:

# Final Words

I enjoyed this challenge. Its difficulty seems to be easy or medium-easy. If only I could have solved more difficult Reverse Engineering challenges during quals, such as liberty and godzilla.

Recently I read T. Blazytko et al. USENIX Security’17. The paper says that the system named Syntia automatically deobfuscates binaries with program synthesis. Program synthesis is a method to synthesize a piece of program from given I/O samples and possible operators, like this:
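
As a toy illustration of the idea (not Syntia's algorithm, which is based on Monte Carlo Tree Search, and not an SMT-based approach), the sketch below enumerates small expressions over a fixed operator set until one agrees with every I/O sample. The operator set, the 8-bit width and the obfuscated function are all made up for this example.

```python
from itertools import product

# Candidate operators on 8-bit values (assumed grammar).
OPS = {
    "+": lambda a, b: (a + b) & 0xFF,
    "-": lambda a, b: (a - b) & 0xFF,
    "^": lambda a, b: a ^ b,
    "&": lambda a, b: a & b,
    "|": lambda a, b: a | b,
}

def expressions(depth):
    """Yield (text, function) pairs for all expressions up to `depth`."""
    yield "x", lambda x, y: x
    yield "y", lambda x, y: y
    if depth == 0:
        return
    subs = list(expressions(depth - 1))
    for (name, op), (lt, lf), (rt, rf) in product(OPS.items(), subs, subs):
        yield (f"({lt} {name} {rt})",
               lambda x, y, op=op, lf=lf, rf=rf: op(lf(x, y), rf(x, y)))

def synthesize(samples, max_depth=2):
    """Return the first expression consistent with all samples, or None."""
    for depth in range(max_depth + 1):
        for text, fn in expressions(depth):
            if all(fn(x, y) == out for (x, y), out in samples):
                return text
    return None

def obfuscated(x, y):
    """A mixed boolean-arithmetic expression that is really x + y."""
    return ((x ^ y) + 2 * (x & y)) & 0xFF

inputs = [(1, 2), (3, 5), (200, 100), (255, 1)]
samples = [((x, y), obfuscated(x, y)) for x, y in inputs]
print(synthesize(samples))
```

The result is only guaranteed to agree with the samples, so more samples (or a solver-based equivalence check) are needed before trusting it.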

This is just a simple example. In practice, program slicing and path pruning will be needed. Both symbolic execution and program synthesis depend on SMT solvers, but according to the paper, program synthesis is more suitable for deobfuscation tasks… really? I’ll investigate further.

This post is for Honeypot Advent Calendar 2017.

# Introduction

In May this year, Trend Micro researchers announced interesting research results in the article titled Red on Red: The Attack Landscape of the Dark Web - TrendLabs Security Intelligence Blog. They had deployed a honeypot on the dark web and monitored attack activities. They’ve done great work, indeed.

Well then, with the aid of the screenshot in that article, I have probably found the honeypot. Let me explain how and why.

# Dark Web OSINT

In March this year, around when they would have been preparing their presentation slides, I was running a crawler on the dark web. My purpose was to create the pictorial book of Tor hidden services below:

The crawler is simple, just like saying “Hello, world!” to PhantomJS. Its only capability is taking screenshots.

I have discovered 40,208 onion domains and confirmed 1,797 domains were active.

# Image Processing

Since I happened to have collected screenshots, I was able to find a site similar to the screenshot in their article by comparing histograms:
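
The comparison can be sketched in plain Python as follows. This is a simplified illustration, not the script in my repository: images are represented as flat lists of 8-bit grayscale values, histogram intersection is picked as the similarity measure for the sake of the example, and the site names are placeholders.

```python
def histogram(pixels, bins=16):
    """Normalized histogram of 8-bit grayscale pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // 256] += 1
    total = float(len(pixels))
    return [c / total for c in counts]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def most_similar(query, candidates):
    """Return the name of the candidate whose histogram best matches `query`."""
    q = histogram(query)
    return max(candidates,
               key=lambda name: intersection(q, histogram(candidates[name])))

# Toy "screenshots": the query is half dark and half bright.
query = [10] * 50 + [200] * 50
candidates = {
    "site-a": [12] * 48 + [205] * 52,   # nearly the same distribution
    "site-b": [128] * 100,              # uniformly gray
}
print(most_similar(query, candidates))
```

A histogram ignores where pixels are located, so it tolerates small layout shifts, at the cost of false positives between pages with similar color schemes.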

Although domain names are not posted in their article, I believe this is it.

The screenshot:

There are many clone sites on the dark web, for backup or even for spying. s5**********jlp2.onion might be a clone maintained by others. Even in that case, the site is likely to be a honeypot.

Interestingly, I also found some posts on r/onions that seem intended to lure people to the site. I believe these were made by the researchers.

# Last Words

We got a glimpse deep into the abyss. This is just an accidental case study. Needless to say, no insult intended.

My crawler and image processing scripts are available at ntddk/onionstack. Crawling of the dark web is accompanied by risk. After all, with ethical considerations, I’ve deleted screenshots I’d captured except for the honeypot.

If you are interested in dark web OSINT, Dark Web | Automating OSINT Blog will be a good starting point.

Anyway, stay safe.

# Introduction

As you know, IDAPython is quite useful. And the Triton concolic execution engine has Python bindings. Then… why not integrate them? I tried to stand on the shoulders of giants.

# Backward Program Slicing

Roughly speaking, program slicing is a method to extract the subset of a program that is relevant to a given statement. Here is an excerpt from M. Weiser. ICSE’81:

Starting from a subset of a program’s behavior, slicing reduces that program to a minimal form which still produces that behavior. The reduced program, called a “slice”, is an independent program guaranteed to faithfully represent the original program within the domain of the specified subset of behavior.
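
To illustrate the definition, here is a minimal sketch of backward slicing over straight-line code. It is independent of Triton and IDA Pro: each instruction is hand-annotated with the registers or flags it defines and uses, and memory, control flow and aliasing are ignored. The instruction list is made up.

```python
def backward_slice(instructions, criterion_index, wanted):
    """Return the indices of instructions that affect `wanted` at the criterion.

    instructions    -- list of (text, defs, uses) for straight-line code
    criterion_index -- index of the instruction to slice from
    wanted          -- names (registers/flags) of interest at that point
    """
    live = set(wanted)
    kept = []
    for i in range(criterion_index, -1, -1):
        text, defs, uses = instructions[i]
        if i == criterion_index or live & set(defs):
            kept.append(i)
            live -= set(defs)   # these values are now explained
            live |= set(uses)   # but their inputs become relevant
    return sorted(kept)

program = [
    ("mov eax, [input]", {"eax"}, {"input"}),
    ("mov ebx, 0x1234",  {"ebx"}, set()),
    ("xor eax, 0x55",    {"eax"}, {"eax"}),
    ("add ebx, 1",       {"ebx"}, {"ebx"}),
    ("cmp eax, 0x10",    {"zf"},  {"eax"}),
]
for i in backward_slice(program, 4, {"zf"}):
    print(program[i][0])
```

The two ebx instructions drop out because nothing on the path to the comparison depends on them.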

Thanks to Jonathan Salwan, we can easily apply backward program slicing to the binary analysis process with minor modifications to backward_slicing.py and proving_opaque_predicates.py. I wrote a simple, tiny piece of glue between Triton and IDA Pro:

# Showcase

The snippet extracts the subset of a program that is relevant to a branch condition. We can run it from File -> Script file in the IDA Pro menu.

becomes:

becomes:

Looks nice.

# Last Words

Triton’s emulation loop fits well with the IDAPython style. Therefore, the combination of IDA Pro and Triton is pretty good.

Cheers,

# Introduction

IDAPython is a powerful feature of IDA Pro, and there are many open-source IDAPython projects. However, not every GUI-based IDAPython script works, due to some Qt-related breaking changes between IDA Pro 6.8 and 6.9 or later. The main problem is migrating PySide code, which is no longer supported, to PyQt5.

Recently I ported the PySide code within idasec to PyQt5. idasec is one of the most sophisticated deobfuscation frameworks. It tackles opaque predicates and call stack tampering in terms of infeasibility questions, by utilizing the Backward-Bounded Dynamic Symbolic Execution proposed in the remarkably well-written paper S. Bardin et al. IEEE S&P’17.

That’s why I decided to write this blog post as a note to self and for anyone trying to do a similar thing.

# Related Work

There are 2 guides for migrating PySide code to PyQt5:

Please read them first. I only give information that supplements these predecessors.

# How to Migrate

Now let’s get started.

## Change QtGui classes to QtWidgets

Most classes in QtGui have moved to QtWidgets. Therefore,

becomes:

One example is QTextEdit, described in Hex Blog. In addition, the classes to be rewritten are as follows:

• QtWidgets.QLayout
• QtWidgets.QVBoxLayout
• QtWidgets.QHBoxLayout
• QtWidgets.QWidget
• QtWidgets.QTableWidget
• QtWidgets.QListWidget
• QtWidgets.QTabWidget
• QtWidgets.QDockWidget
• QtWidgets.QTreeWidget
• QtWidgets.QTreeWidgetItem
• QtWidgets.QPushButton
• QtWidgets.QRadioButton
• QtWidgets.QToolButton
• QtWidgets.QButtonGroup
• QtWidgets.QGroupBox
• QtWidgets.QSpinBox
• QtWidgets.QCheckBox
• QtWidgets.QComboBox
• QtWidgets.QTextEdit
• QtWidgets.QLineEdit
• QtWidgets.QApplication
• QtWidgets.QLabel
• QtWidgets.QSizePolicy
• QtWidgets.QMenu
• QtWidgets.QFrame
• QtWidgets.QProgressBar
• QtWidgets.QStyle
• QtWidgets.QSpacerItem
• QtWidgets.QScrollArea
• QtWidgets.QSplitter
• There might be more…

In my experience, everything other than the following 3 classes may be rewritten:

• QtGui.QPixmap
• QtGui.QIcon
• QtGui.QFont

idacute may overwrite all QtGui references, so I think some manual work is still needed.
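
The rewrite above can be mechanized with a small script. The following is a sketch rather than the procedure I actually followed: it moves only the classes listed in this post (the list is not exhaustive) and leaves every other `QtGui.` reference untouched, so QPixmap, QIcon and QFont stay where they are. The function name `migrate` is made up.

```python
import re

# Classes listed in this post as having moved to QtWidgets (not exhaustive).
MOVED = {
    "QLayout", "QVBoxLayout", "QHBoxLayout", "QWidget", "QTableWidget",
    "QListWidget", "QTabWidget", "QDockWidget", "QTreeWidget",
    "QTreeWidgetItem", "QPushButton", "QRadioButton", "QToolButton",
    "QButtonGroup", "QGroupBox", "QSpinBox", "QCheckBox", "QComboBox",
    "QTextEdit", "QLineEdit", "QApplication", "QLabel", "QSizePolicy",
    "QMenu", "QFrame", "QProgressBar", "QStyle", "QSpacerItem",
    "QScrollArea", "QSplitter",
}

def migrate(source):
    """Rewrite `QtGui.QFoo` to `QtWidgets.QFoo` for the classes in MOVED."""
    def swap(match):
        name = match.group(1)
        return ("QtWidgets." if name in MOVED else "QtGui.") + name
    return re.sub(r"\bQtGui\.(Q\w+)", swap, source)

old = "btn = QtGui.QPushButton(); icon = QtGui.QIcon(path)"
print(migrate(old))
```

The import lines still have to be adjusted by hand so that QtWidgets is imported alongside QtGui.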

## Overwrite _fromUtf8

We also need to overwrite _fromUtf8.

## Others

These issues are described by predecessors:

• Handling SIGNAL
• Change FormToPySideWidget to FormToPyQtWidget
• Change setResizeMode to setSectionResizeMode

# Conclusion

This time, I was able to run idasec on IDA Pro 7.0 with some bug fixes and dirty patches, like in this cool video:

If you are an IDA Pro 7.0 user, note that the other backward-compatibility issues described in IDA: IDAPython backward-compatibility with 6.95 APIs will also occur.

Enjoy!

HAI DOMO VIRTUAL YOUTUBER KIZUNA AI DESU. I’m still working on my English.

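I gave a talk under the above title at a study meeting called Security meets Machine Learning. The slides are here:

Due to some mysterious force, it was presented as a talk from my company, but I am not doing machine learning research. When I try to reimplement existing research, I start to feel that this is just the Chinese room.
Anyway, until now I just uploaded my materials to SpeakerDeck, but from now on I will make them all accessible from this blog in one place.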