hi

this page contains selected code projects written by matias.lindgren () iki.fi

2020

Debian package info viewer

Simple, stateless web backend for parsing Debian package info status files (e.g. /var/lib/dpkg/status) and rendering them as HTML. Written in Python using only standard library packages.
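
As a rough illustration of the parsing step (a sketch using only the standard library, not the project's actual code), a status file can be split into blank-line-separated stanzas of `Field: value` lines, where lines starting with whitespace continue the previous field. The function names and the sample data below are assumptions made for this example.

```python
from html import escape


def parse_status(text):
    """Parse dpkg status text into a list of {field: value} dicts."""
    packages = []
    # Stanzas (one per package) are separated by blank lines.
    for stanza in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") and key is not None:
                # Continuation line: belongs to the previous field.
                fields[key] += "\n" + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if "Package" in fields:
            packages.append(fields)
    return packages


def render_package_list(packages):
    """Render package names as an unstyled HTML list, sorted by name."""
    names = sorted(p["Package"] for p in packages)
    items = "\n".join(f"<li>{escape(name)}</li>" for name in names)
    return f"<ul>\n{items}\n</ul>"


# Made-up sample in the status file format.
SAMPLE = """\
Package: zlib1g
Status: install ok installed
Description: compression library - runtime
 zlib is a library implementing the deflate compression method.

Package: bash
Status: install ok installed
Depends: base-files (>= 2.1.12), debianutils (>= 2.15)
"""

if __name__ == "__main__":
    print(render_package_list(parse_status(SAMPLE)))
```

Keeping the parser and the renderer as two pure functions fits a stateless backend: each request can re-read the file and build the page from scratch.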

Screenshot of a simple, unstyled HTML list of Debian package names.
Python, HTML

Spoken language identification toolbox: lidbox

Python toolbox built on top of TensorFlow for end-to-end spoken language identification. I created this toolbox for automating my MSc thesis experiments at Aalto University. The toolbox relies on TensorFlow as much as possible, which gives it high-performance feature extraction and model training pipelines for huge datasets. The largest speech dataset I used for training a convolutional neural network consisted of 3.2 terabytes of 16 kHz mono waveforms.

Three screenshots, showing a web application UI of an online language identification demo, a confusion matrix of 4 languages, and a two-dimensional PCA projection of language embedding vectors in 4 languages.
Python, TensorFlow

gRPC URL fetcher

This project simulates a scenario where multiple clients submit high-latency tasks to a server over a gRPC streaming connection. In this example, tasks are simulated by clients sending URLs, which the server fetches asynchronously with HTTP GET requests. The server returns a unique key for each URL, which the client can use to request the actual response contents. The server runs its own thread pool to perform URL requests using libcurl. All components are implemented as Docker containers for easier dependency management and compilation.
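
The submit-then-retrieve pattern described above can be sketched in a few lines of Python. This is only an in-process stand-in: it uses neither gRPC nor libcurl, and the class name, the method names, and the `fake_fetch` placeholder are assumptions made for this example.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class FetchServer:
    """In-process stand-in for the server: accepts URLs, hands back keys."""

    def __init__(self, fetch, workers=4):
        self._fetch = fetch  # callable: url -> response body
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = {}   # key -> Future

    def submit(self, url):
        """Queue a fetch and immediately return a unique key for it."""
        key = uuid.uuid4().hex
        self._pending[key] = self._pool.submit(self._fetch, url)
        return key

    def result(self, key, timeout=None):
        """Block until the fetch behind `key` finishes, then return its body."""
        return self._pending.pop(key).result(timeout=timeout)

    def close(self):
        self._pool.shutdown(wait=True)


def fake_fetch(url):
    # Hypothetical placeholder for an HTTP GET; no network access here.
    return f"contents of {url}"


if __name__ == "__main__":
    server = FetchServer(fake_fetch)
    keys = [server.submit(u) for u in ("http://a.example", "http://b.example")]
    for key in keys:
        print(key, server.result(key))
    server.close()
```

Returning the key before the fetch completes is what lets a client keep submitting URLs without waiting for slow responses.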

Animated architecture sketch showing the different stages of the bi-directional stream API from the point of view of the server and clients.
C++17, gRPC, libcurl

2019

Comparing parallel Rust and C++

High-performance computing experiment, studying the feasibility of using Rust as a replacement for C++ when doing parallel computing. The algorithm used in this example has a cubic time complexity and all performed optimizations aim to reduce the constant factors. Most optimizations focus on reducing memory access latency to increase data throughput in a small arithmetic-heavy loop. The results show that it is possible to write Rust that generates 64-bit x86 Intel code that matches the performance of C++ compiler output.

I also wrote a tutorial website which explains this experiment in more detail.

Bar chart depicting the performance gain in gigaflops for two C++ compilers and one Rust compiler, from each incremental program improvement from version 0 to version 7.
C++17, Rust

CUDA memory access recorder

Small CUDA library for recording GPU memory access timestamps and indexes during CUDA program execution. The recorded data is saved as JSON, which allows the user to replay the memory access pattern in the browser.

The animations displayed here show two memory access patterns, where the slower one accesses memory mostly in column-major order while the faster one uses more linear reading to read full cache-lines.
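
Why the linear pattern is faster can be shown with a small counting exercise. The sketch below assumes 4-byte elements, 128-byte cache lines, a row-major 32 x 32 matrix, and 32 threads reading at once. These numbers are illustrative assumptions, not values taken from the project.

```python
def cache_lines_touched(indexes, element_size=4, line_size=128):
    """Count distinct cache lines covered by the given element indexes."""
    return len({(i * element_size) // line_size for i in indexes})


WIDTH = 32    # matrix is WIDTH x WIDTH, stored in row-major order
THREADS = 32  # threads that read at the same time

# Thread t reads element (row=0, col=t): consecutive addresses.
row_wise = [0 * WIDTH + t for t in range(THREADS)]
# Thread t reads element (row=t, col=0): a stride of WIDTH elements.
column_wise = [t * WIDTH + 0 for t in range(THREADS)]

print(cache_lines_touched(row_wise))     # 1
print(cache_lines_touched(column_wise))  # 32
```

Under these assumptions the strided pattern pulls in 32 cache lines to deliver the same 32 values that the linear pattern gets from a single line.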

Two-dimensional animation of a slow GPU memory access pattern into a small matrix.
Two-dimensional animation of a faster GPU memory access pattern into a small matrix.
C++17, CUDA, JavaScript, HTML5

CUDA memory access simulation

This project implements a simple GPU memory simulator in JavaScript. It can be used for demonstrating simple GPU performance aspects such as memory access latency, caching, concurrent execution by multiple streaming multiprocessors, and memory access coalescing.

Two-dimensional animation of a simulated GPU memory access pattern with highlighted CUDA code that shows the line of code being executed
C++14, CUDA, JavaScript, HTML5

2018

Celery on Kubernetes example

Exploring how to deploy a stateful web service with stateless, asynchronous backend workers on Kubernetes.

Architecture sketch of relations between all Kubernetes components in the application
Python, Minikube, Docker, Flask, Celery

Greedy string tiling

C++14, Python

Python library that implements the Running Karp-Rabin (RKR) greedy string tiling (GST) algorithm. The worst case time complexity of GST is cubic, but the RKR modification utilizes hashing of subpatterns to reduce the amortized time complexity to linear.

The library is a CPython extension written in C++.
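
The hashing idea can be sketched in pure Python. This is not the library's code and not the full tiling algorithm: it only shows how a Karp-Rabin rolling hash finds equal substrings of a fixed length `k` without comparing every pair of positions character by character. The function names and hash parameters are assumptions made for this example.

```python
def rolling_hashes(text, k, base=257, mod=(1 << 61) - 1):
    """Yield (start, hash) for every length-k substring of text."""
    if k <= 0 or len(text) < k:
        return
    h = 0
    for ch in text[:k]:
        h = (h * base + ord(ch)) % mod
    yield 0, h
    top = pow(base, k - 1, mod)  # weight of the character leaving the window
    for i in range(1, len(text) - k + 1):
        h = (h - ord(text[i - 1]) * top) % mod
        h = (h * base + ord(text[i + k - 1])) % mod
        yield i, h


def common_substrings(pattern, text, k):
    """Return sorted (pattern_start, text_start) pairs of equal length-k substrings."""
    table = {}
    for start, h in rolling_hashes(text, k):
        table.setdefault(h, []).append(start)
    matches = []
    for p_start, h in rolling_hashes(pattern, k):
        for t_start in table.get(h, []):
            # Hashes can collide, so confirm with a direct comparison.
            if pattern[p_start:p_start + k] == text[t_start:t_start + k]:
                matches.append((p_start, t_start))
    return sorted(matches)


if __name__ == "__main__":
    print(common_substrings("xabcdy", "qqabcdzz", 4))  # [(1, 2)]
```

Each window hash is derived from the previous one in constant time, which is where the saving over repeated substring comparison comes from.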

Screenshot of an HTML page showing two columns of text side by side, with matching substrings highlighted

2017

code-rain

Python, curses

A result from avoiding real work for a few hours.

Black screen with vertical columns of green characters moving downwards at different speeds. Mimics the code rain animation seen in the Matrix movies.

Python code search engine: spaghetti-search

Python, Flask, Scrapy, Whoosh

Final project for a web information retrieval course. My idea is to index Python code by the string representation of its abstract syntax tree (AST). Queries are parsed into ASTs as well and matched as subtrees of the indexed code, which allows the use of well-known information retrieval algorithms designed for regular text documents. This also means that variable names do not affect search results, as can be seen in the screenshot.
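
A minimal sketch of the name-independent AST string, using Python's standard `ast` module. The class and function names are assumptions made for this example, and the project's own AST representation may differ. Every identifier is replaced with the same placeholder, which is a deliberately coarse choice.

```python
import ast


class _NormalizeNames(ast.NodeTransformer):
    """Replace every identifier with a placeholder so names do not matter."""

    def visit_Name(self, node):
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)

    def visit_arg(self, node):
        node.arg = "_"
        return node


def ast_key(source):
    """Return a name-independent string form of the AST of `source`."""
    tree = _NormalizeNames().visit(ast.parse(source))
    return ast.dump(tree)


if __name__ == "__main__":
    # Same structure, different variable names: the keys are equal.
    print(ast_key("total = price * count") == ast_key("x = a * b"))
    # Different operator: the keys differ.
    print(ast_key("x = a * b") == ast_key("x = a + b"))
```

Strings like these can then be fed to an ordinary text index, since structurally equal code produces equal text.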

Short Python code snippet with its abstract syntax tree represented as an ASCII-string

JuiceSimulator

C++11, LiquidFun, SFML

Final project for an object-oriented programming course. Simple 2D game with fluid dynamics written in C++. Physics provided by the LiquidFun library (based on Box2D) and graphics with the SFML framework.

In-game screenshot of a simple 2D game with fluid dynamics, showing two dispensers emitting large amounts of red and green liquid.

2016

satellite-routes

Clojure, ThreeJS

Exploring how to use Clojure as a backend web service. The web service is stateless and computes shortest paths between graph nodes. In the client code, the graph is a constellation of connected satellites, orbiting a planet. Each edge denotes an unobstructed view between two satellites, which can be used for communications. The shortest path is computed from a point A on the planet's surface through these communication links to point B and highlighted in the UI.
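
The shortest path computation can be sketched with Dijkstra's algorithm. The example is in Python rather than Clojure, and both the choice of algorithm and the small graph with its link lengths are assumptions made for illustration. The source does not say which algorithm or edge weights the service uses.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm. graph: {node: {neighbour: distance}}.

    Returns (total_distance, [nodes on the path]) or (inf, []) if unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                prev[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    if target not in dist:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


# Hypothetical links: surface points A and B plus three satellites.
LINKS = {
    "A": {"sat1": 2.0, "sat2": 5.0},
    "sat1": {"A": 2.0, "sat2": 1.0, "sat3": 6.0},
    "sat2": {"A": 5.0, "sat1": 1.0, "sat3": 2.0, "B": 7.0},
    "sat3": {"sat1": 6.0, "sat2": 2.0, "B": 3.0},
    "B": {"sat2": 7.0, "sat3": 3.0},
}

if __name__ == "__main__":
    print(shortest_path(LINKS, "A", "B"))
    # (8.0, ['A', 'sat1', 'sat2', 'sat3', 'B'])
```

Because the function takes the whole graph as an argument and keeps nothing between calls, it matches the stateless design described above.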

Screenshot of a 3D web application displaying the Earth from space, enclosed inside a constellation of orbiting satellites, arranged into an icosahedron. The shortest path between two locations on Earth is shown as a green path, connected through the orbiting satellites.

HTML formatter for programming tasks: graderutils

Python, HTML

Python library for converting programming assignment results into HTML. Designed for Python assignments that are graded using a unit-test based test suite.
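
As a rough sketch of the idea (this is not the graderutils API), results from Python's `unittest` module can be turned into an HTML fragment with the standard library alone. The function names and the two example tests are assumptions made for this illustration.

```python
import io
import unittest
from html import escape


def results_to_html(test_case_class):
    """Run a unittest.TestCase class and return the results as an HTML fragment."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case_class)
    # Collect ids first: a suite drops its test references once it has run.
    test_ids = [test.id() for test in suite]
    stream = io.StringIO()
    result = unittest.TextTestRunner(stream=stream, verbosity=2).run(suite)
    failed = {test.id() for test, _ in result.failures + result.errors}
    items = []
    for test_id in test_ids:
        name = test_id.rsplit(".", 1)[-1]  # keep only the method name
        status = "failed" if test_id in failed else "passed"
        items.append(f"<li>{escape(name)}: {status}</li>")
    listing = "\n".join(items)
    return f"<ul>\n{listing}\n</ul>\n<pre>{escape(stream.getvalue())}</pre>"


def example_case():
    """Build a hypothetical two-test case; nested so test runners ignore it."""

    class ExampleTests(unittest.TestCase):
        def test_adds(self):
            self.assertEqual(1 + 1, 2)

        def test_fails(self):
            self.assertEqual(1 + 1, 3)

    return ExampleTests


if __name__ == "__main__":
    print(results_to_html(example_case()))
```

The fragment lists one passed and one failed test, followed by the escaped runner output inside a `<pre>` element.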

This was part of a larger effort where I rewrote the grading pipeline of an algorithms course at Aalto University using Python's unittest module.

Screenshot of programming task grading results in HTML. The image shows that one test passed and another failed, followed by the full terminal output from Python's unittest package.