2024

ndvec

Compile time N-dimensional Euclidean vector.

C++23, make, clang

2023

Advent of Code in C++23

Solving all Advent of Code challenges using C++23.

C++23, make, clang

2022

Advent of Code 2022 - Python C API

Solving Advent of Code 2022 challenges using Python as a C library.

Python (C API), make, clang, curl

C stuff

Reading Modern C by Jens Gustedt and writing some small C apps.

C17, make, clang

git2json

Small command-line app for converting Git log data to CSV and JSON. Mainly used for filling a PostgreSQL database with data for local experiments with different index combinations.

Cloning the most recent 1000 commits of the PostgreSQL git repository and extracting git log data of the 15 most recent commits as a list of JSON objects.
Rust, Git

core-econ

Reading The Economy and creating visualizations based on economics data with Vega-Lite.

Line chart comparing the GDP (measured in 1990 US$) of seven countries. The mouse cursor is positioned at year 1970 with a tooltip showing West Germany at the top with $11,930 and East Germany at the bottom with $5,254.
Python, GNU make, pandas, Altair, Jinja, CSS, JavaScript

2021

Advent of Code 2021 - Rust

Solutions to Advent of Code 2021, written in Rust. Implements automated downloading of problem inputs and uploading of solutions.

The screenshot shows part of a memoized, recursive solution to problem number 21 of Advent of Code 2021.

Rust code implementing a memoized, recursive solution to problem number 21 of Advent of Code 2021.
Rust

bitmatch

Tiny C library for detecting bit patterns.

Terminal output of piping the first 10 bytes of the bitmatch binary through both bitmatch and xxd to extract all bits, then grepping for the hexadecimal pattern cffaedfe, which is found. Pattern cafebabe is also searched for, but it is not found since it is not present in the file.
C

Deep reinforcement learning for the Robotini racing simulator

Self-driving bots for the Robotini racing simulator with deep reinforcement learning (RL). The deep RL algorithm used to train these bots is deep (recurrent) deterministic policy gradients (DDPG/RDPG) from the TensorFlow Agents framework.

The recording shows the RL algorithm exploring the simulator environment in real time, driving multiple cars simultaneously by streaming actions to the Robotini racing simulator. The DDPG TensorFlow agent is running on a Linux machine with an RTX 2080 Ti, while the racing simulator is running on a MacBook Pro. Both machines are on the same local area network, communicating over TCP.

Python, TensorFlow, TensorFlow Agents

2020

Advent of Code 2020 - Go

Golang solutions to Advent of Code 2020. Screenshot of a solution to the problem of day 6.

Go code solving problem 6 of Advent of Code 2020 with bitwise arithmetic
Go

GitHub Heatmap Text

Write text on the GitHub contribution heatmap by autogenerating commits.

Each character of the input text is mapped to (x, y) coordinates, defined by the corresponding "Tiny" font glyph, and then to timestamps where x is weeks and y is days. Git commits are then generated at each timestamp to "render" the text.
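
The coordinate-to-timestamp step can be sketched roughly as below. This is only an illustration, not the project's code: the 3x5 glyph, the `glyph_to_dates` helper, the offsets and the start date are all hypothetical, and the sketch assumes each heatmap column is one week starting on a Sunday.

```python
from datetime import date, timedelta

# Hypothetical 3x5 glyph for "H"; each string is one row, '#' marks a filled cell.
GLYPH_H = [
    "#.#",
    "#.#",
    "###",
    "#.#",
    "#.#",
]

def glyph_to_dates(glyph, start_sunday, x_offset=0, y_offset=1):
    """Map filled glyph cells to calendar dates: x counts weeks, y counts days."""
    dates = []
    for y, row in enumerate(glyph):
        for x, cell in enumerate(row):
            if cell == "#":
                days = 7 * (x + x_offset) + (y + y_offset)
                dates.append(start_sunday + timedelta(days=days))
    return sorted(dates)

# Assumed top-left heatmap cell; 2020-01-05 is a Sunday.
start = date(2020, 1, 5)
commit_dates = glyph_to_dates(GLYPH_H, start)
```

Each date in `commit_dates` would then receive one or more generated commits, for example by setting the author and committer dates when committing.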

GitHub contribution heatmap with a commit pattern that spells out 'HELLO GITHUB' in capital letters
Python

Debian package info viewer

Simple, stateless web backend for parsing Debian package info status files (e.g. /var/lib/dpkg/status) and rendering them as HTML. Written in Python using only standard library packages.
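
A minimal sketch of the parsing and rendering idea, using only the standard library. The function names and the sample data are made up for illustration; the sketch assumes the usual control-file layout where packages are separated by blank lines, fields are `Key: value`, and continuation lines start with whitespace.

```python
from html import escape

def parse_status(text):
    """Parse dpkg status-file text into a list of {field: value} dicts."""
    packages = []
    for paragraph in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in paragraph.splitlines():
            if line[:1] in (" ", "\t") and key is not None:
                # Continuation line: belongs to the previous field.
                fields[key] += "\n" + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if fields:
            packages.append(fields)
    return packages

def render_list(packages):
    """Render package names as an unstyled, alphabetically sorted HTML list."""
    names = sorted(p["Package"] for p in packages if "Package" in p)
    items = "".join(f"<li>{escape(name)}</li>" for name in names)
    return f"<ul>{items}</ul>"

# Made-up sample data in the status-file layout.
SAMPLE = """\
Package: libfoo
Status: install ok installed
Depends: libc6 (>= 2.17), libbar
Description: Example library
 A made-up package used only for this sketch.

Package: libbar
Status: install ok installed
Description: Another example
"""

packages = parse_status(SAMPLE)
page = render_list(packages)
```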

Screenshot of a simple, unstyled HTML list of Debian package names.
Python, HTML

High frequency trading risk server

Risk management server and client for automated high-frequency trading. Written in C++17 without third-party dependencies.

C++ code for a TCP server handling incoming trade order messages
C++17, TCP

gRPC URL fetcher

Exploring the C++ API of gRPC.

This project simulates a scenario where multiple clients submit high-latency tasks to a server over a gRPC streaming connection. In this example, tasks are simulated by clients sending URLs, which the server fetches asynchronously with HTTP GET requests. The server returns unique keys for each URL, which the client can use to request the actual response contents. The server runs its own thread pool to perform URL requests using libcurl. All components are implemented as Docker containers for easier dependency management and compilation.
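
The submit-then-redeem pattern described above can be illustrated with a small sketch. It is not the project's code: the project uses C++, gRPC streaming and libcurl, while this sketch swaps in a Python thread pool and a plain callable, and the `FetchServer` class, its methods and the stand-in fetch function are hypothetical names.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

class FetchServer:
    """Accepts URLs, fetches them on a worker pool, and hands back keys."""

    def __init__(self, fetch, workers=4):
        self._fetch = fetch  # callable(url) -> response body
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = {}   # key -> Future holding the response

    def submit(self, url):
        """Queue a fetch and immediately return a unique key for it."""
        key = uuid.uuid4().hex
        self._pending[key] = self._pool.submit(self._fetch, url)
        return key

    def result(self, key, timeout=None):
        """Redeem a key for the response contents, waiting if necessary."""
        return self._pending.pop(key).result(timeout=timeout)

    def close(self):
        self._pool.shutdown(wait=True)

# Stand-in for a real HTTP GET so the sketch runs without network access.
def fake_fetch(url):
    return f"body of {url}"

server = FetchServer(fake_fetch)
key = server.submit("http://example.com/a")
body = server.result(key, timeout=5)
server.close()
```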

C++17, gRPC, libcurl

Spoken language identification toolbox: lidbox

Python toolbox built on top of TensorFlow for end-to-end spoken language identification.

I created this toolbox for automating my MSc thesis experiments at Aalto University. The toolbox relies on TensorFlow as much as possible, which enables high-performance feature extraction and model training pipelines for huge datasets. The largest speech dataset I used for training a convolutional neural network consisted of 3.2 terabytes of 16 kHz mono waveforms.

Three screenshots, showing a web application UI of an online language identification demo, a confusion matrix of 4 languages, and a two-dimensional PCA projection of language embedding vectors in 4 languages.
Python, TensorFlow

2019

Comparing parallel Rust and C++

High-performance computing experiment, studying the feasibility of using Rust as a replacement for C++ when doing parallel computing.

The algorithm used in this example has a cubic time complexity and all performed optimizations aim to reduce the constant factors. Most optimizations focus on reducing memory access latency to increase data throughput in a small arithmetic-heavy loop. The results show that it is possible to write Rust that generates 64-bit x86 Intel code that matches the performance of C++ compiler output.

I also wrote a tutorial website which explains this experiment in more detail.

Bar chart depicting the performance gain in gigaflops for two C++ compilers and one Rust compiler, from each incremental program improvement from version 0 to version 7.
C++17, Rust

CUDA memory access recorder

Small CUDA library for recording GPU memory access timestamps and indexes during CUDA program execution.

The recorded data is saved as JSON, which allows the user to replay the memory access pattern in the browser. The animations displayed here show two memory access patterns, where the slower one accesses memory mostly in column-major order while the faster one reads memory more linearly to consume full cache lines.

C++17, CUDA, JavaScript, HTML

CUDA memory access simulation

Simple GPU memory simulation written in pure JavaScript.

This simulator can be used for demonstrating simple GPU performance aspects such as memory access latency, caching, concurrent execution by multiple streaming multiprocessors, and memory access coalescing.

C++14, CUDA, JavaScript, HTML

2018

Celery on Kubernetes example

Exploring how to deploy a stateful web service with stateless, asynchronous backend workers on Kubernetes.

Architecture sketch of relations between all Kubernetes components in the application
Python, Minikube, Docker, Flask, Celery

Greedy string tiling

C++14, Python

CPython extension written in C++ for finding matching patterns in two texts.

This library implements the Running-Karp-Rabin (RKR) greedy string tiling (GST) algorithm. The worst case time complexity of GST is cubic, but the RKR modification utilizes hashing of subpatterns to reduce the amortized time complexity to linear.
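
The hashing idea can be sketched as follows. This shows only a Karp-Rabin rolling hash used to find equal fixed-length windows in two texts, not the full tiling algorithm and not the library's implementation; `BASE`, `MOD` and both function names are choices made for this example.

```python
BASE = 257
MOD = (1 << 61) - 1  # large prime modulus, chosen for this sketch

def rolling_hashes(text, k):
    """Yield (start, hash) for every length-k window of text in O(len(text))."""
    if k <= 0 or len(text) < k:
        return
    high = pow(BASE, k - 1, MOD)  # weight of the character leaving the window
    h = 0
    for ch in text[:k]:
        h = (h * BASE + ord(ch)) % MOD
    yield 0, h
    for i in range(k, len(text)):
        h = (h - ord(text[i - k]) * high) % MOD  # drop the oldest character
        h = (h * BASE + ord(text[i])) % MOD      # append the newest character
        yield i - k + 1, h

def common_windows(a, b, k):
    """Return (i, j) pairs where a[i:i+k] == b[j:j+k], found via hash lookup."""
    index = {}
    for i, h in rolling_hashes(a, k):
        index.setdefault(h, []).append(i)
    matches = []
    for j, h in rolling_hashes(b, k):
        for i in index.get(h, []):
            if a[i:i + k] == b[j:j + k]:  # guard against hash collisions
                matches.append((i, j))
    return matches

matches = common_windows("abcdef", "xxcdexx", 3)
```

Updating the hash in constant time per window is what avoids comparing every pair of substrings character by character.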

Screenshot of an HTML page showing two columns of text side by side, with matching substrings highlighted

2017

JuiceSimulator

C++11, LiquidFun, SFML

Interactive 2D game with fluid dynamics written in C++.

Final project for an object oriented programming course. Physics provided by the LiquidFun library (based on Box2D) and graphics with the SFML framework.

In-game screenshot of a simple 2D game with fluid dynamics, showing two dispensers emitting large amounts of red and green liquid.

Python code search engine: spaghetti-search

Python, Flask, Scrapy, Whoosh

Python code search engine.

Final project for a web information retrieval course. My idea is to index Python code by the string representation of its abstract syntax tree (AST). This makes all queries ASTs, which are subtrees of indexed code, allowing the usage of well-known information retrieval algorithms designed for regular text documents. This also means that variable names will not affect search results, as can be seen from the screenshot.
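
One way to get a name-independent AST string is sketched below with Python's standard `ast` module. This is an assumption about how such a key could be built, not the project's actual indexing code: the `_AnonymizeNames` class, the `ast_key` function and the `"_"` placeholder are hypothetical.

```python
import ast

class _AnonymizeNames(ast.NodeTransformer):
    """Replace every variable and argument name with a fixed placeholder."""

    def visit_Name(self, node):
        node.id = "_"
        return node

    def visit_arg(self, node):
        node.arg = "_"
        return node

def ast_key(source):
    """Return a string form of the AST that ignores variable names."""
    tree = _AnonymizeNames().visit(ast.parse(source))
    return ast.dump(tree)

a = ast_key("total = 0\nfor item in values:\n    total += item\n")
b = ast_key("s = 0\nfor x in xs:\n    s += x\n")
c = ast_key("s = 0\nfor x in xs:\n    s -= x\n")
same_structure = (a == b)       # snippets differ only in variable names
different_structure = (a != c)  # one operator differs
```

Strings like these can be tokenized and indexed like ordinary text. Note that this sketch also anonymizes names such as built-in functions, which a real index might prefer to keep.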

Short Python code snippet with its abstract syntax tree represented as an ASCII-string

code-rain

Python, curses

A result from avoiding real work for a few hours.

2016

satellite-routes

Clojure, ThreeJS

Exploring how to use Clojure as a backend web service.

The web service is stateless and computes shortest paths between graph nodes. In the client code, the graph is a constellation of connected satellites, orbiting a planet. Each edge denotes an unobstructed view between two satellites, which can be used for communications. The shortest path is computed from a point A on the planet's surface through these communication links to point B and highlighted in the UI.
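
The path computation can be illustrated with a short sketch. The actual service is written in Clojure and its algorithm is not stated here, so this example assumes Dijkstra's algorithm over distance-weighted edges; the graph, node names and weights are made up.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra's algorithm; graph maps node -> {neighbour: distance}."""
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                prev[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    if target not in dist:
        return None  # no chain of visible satellites connects the two points
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical constellation: A and B are surface points, sat1..sat3 satellites.
GRAPH = {
    "A": {"sat1": 2.0, "sat2": 5.0},
    "sat1": {"A": 2.0, "sat2": 1.0, "sat3": 4.0},
    "sat2": {"A": 5.0, "sat1": 1.0, "B": 6.0},
    "sat3": {"sat1": 4.0, "B": 1.0},
    "B": {"sat2": 6.0, "sat3": 1.0},
}
route = shortest_path(GRAPH, "A", "B")
```

Because the function takes the whole graph as an argument and keeps nothing between calls, it fits a stateless service: each request carries the constellation and the two endpoints.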

Screenshot of a 3D web application displaying the Earth from space, enclosed inside a constellation of orbiting satellites arranged into an icosahedron. The shortest path between two locations on Earth is shown as a green path, connected through the orbiting satellites.

HTML formatter for programming tasks: graderutils

Python, HTML

Python library for converting programming assignment results into HTML.

Designed for Python assignments that are graded using a unit-test based test suite. This was part of a larger effort where I rewrote the grading pipeline of an algorithms course at Aalto University using Python's unittest module.

Screenshot of programming task grading results in HTML. The image shows that one test passed and another failed, followed by the full terminal output from Python's unittest package.