pnx.guide
Yuttapichai "Guide" Kerdcharoen
LinkedIn GitHub CV

I am a software developer who is obsessed in database systems and high-performance computing. I am conducting research on designing high-performance graph algorithms.

I am ambitious to create a frontier data system group in Thailand.

Ph.D. Candidate (4th Year) @ Carnegie Mellon University and CMKL University
(Dual Degree - Electrical and Computer Engineering)
Teaching Assistant
CMU 18-746 Storage Systems (F22)
CMKL 18-613 Foundation of Computer Systems (S24)
Selected Coursework
CMU 15-721 Advanced Database Systems (S23)
CMU 18-746 Storage Systems (F21)
CMU 15-745 Optimizing Compilers for Modern Architectures (S23)
CMU 18-740 Modern Computer Architecture and Design (F22)
CMU 18-660 Optimization (S22)
CMU 18-447 Introduction to Computer Architecture (S22)
CMKL 18-613 Foundation of Computer Systems (S21)
CMKL 18-652 Foundations of Software Engineering (S21)

Previously be an Undergraduate Student @ KMITL
(Computer Science)
I founded CS-INFINITE, a group of students who initiates activities for CS-KMITL students. Examples of those activities are Code Arcade and Programming Bootcamp.

Living in Bangkok, Thailand
Previously, lived in Pittsburgh, PA, USA
Originally, lived in Nakhon Pathom, Thailand


Research

Linear Algebra Approaches for Directed Triad Counting and Enumeration Workshop Paper
The paper has been accepted for IA3 2024.

Triangle counting and enumeration are commonly used in real-world applications on directed graphs. However, the performance of triangle counting algorithms is usually benchmarked on undirected graphs. As such, many of these algorithms and formulations are not suitable for identifying the types of directed triangles in directed graphs. In this work, we show how algorithms for counting each type of directed triad (directed triangle) can be formulated using linear algebra. Leveraging the FLAME methodology, we show that provably correct counting and enumeration algorithms for directed triads can be derived from the linear algebraic formulation. These algorithms can be used to either count individual triads or together to count all possible triads. We show that despite being designed for individual use, the combined use of these algorithms yields a speedup of 16.77x to 1122.2x over the implementation in NetworkX, and 0.37x to 33.49x over GraphBLAS implementations using SuiteSparse 7.2 on various workloads from real-world directed graphs.

Exploiting Fusion Opportunities in Linear Algebraic Graph Query Engines Conference Paper
The paper has been accepted for IEEE HPEC 2023.

Queries in a graph database are often converted into a sequence of graph operations by a graph query engine. In recent years, it has been recognized that the query engine benefits from using high-performance graph libraries via the GraphBLAS interface to implement time-consuming operations such as graph traversal. However, using GraphBLAS requires explicitly casting data into linear algebra objects and decomposing the query into multiple operations, some of which are expressible by the GraphBLAS. The combination of these two requirements translates into increased memory footprints and additional execution times. In this paper, we show that fusing different stages of the query engines into GraphBLAS calls can reduce the size of the intermediate data generated during the query. Furthermore, by relaxing the semi-ring constraints imposed by GraphBLAS, more aggressive fusions of the stages can be performed. We show a speedup of up to 1235.89x (8.82x on geometric average) relative to an open-source graph query engine using GraphBLAS (i.e. RedisGraph) for processing undirected subgraph enumeration queries.

Opportunities for Linear Algebraic Graph Databases Extended Abstract
The abstract has been accepted for ARRAY 2023.

In recent years, there has been renewed interest in casting graph algorithms in the language of linear algebra. By replacing the computations with appropriate operations over different semi-rings, different graph algorithms can be cast as a sequence of linear algebra operations. In this work, we study the use of the linear algebraic approach to graph algorithms within the context of graph database systems. Specifically, we identify the issues with using existing linear algebraic graph libraries, such as SuiteSparse, which conform to the GraphBLAS specifications. We also highlight gaps between the GraphBLAS specification and computations that are required by the graph query algorithms utilized in graph databases. We show that overcoming these challenges in using a linear algebraic approach within a graph database system can lead to significant performance improvements to an open-source graph database system.

Fast Fixed-Point Decimals Ongoing


Teaching

Storage and File System Fundamentals @ CMKL University (Fall 2024)
Distributed Data Storage @ CMKL University (Fall 2024)
Parallel Computing @ CMKL University (Fall 2024)
Database Management @ SUIC (Fall 2024) Webpage
Special Topic in Computer Science 2: Computer Systems @ Computer Science, KMITL (Fall 2024) Webpage
Foundation of Programming @ K-DAI, KMITL (Summer 2024) Webpage
Database Programming in Practice @ K-DAI, KMITL (Spring 2024) Webpage

Selected Project

pgfixeypointy - Fast Fixed-Point Decimals
A PostgreSQL extension for libfixeypointy, a fast fixed-point decimal library.

Article on RedisGraph @ dbdb.io

Foreign Data Wrapper for Columnar File Formats
A PostgreSQL foreign data wrapper for accessing columnar file formats (similar to Apache Parquet). The foreign data wrapper includes query optimizations such as projection and predicate pushdown.

Compiling Neural Network with Dynamic Resource Adaptation
An approach to improve elasticity of FlexGen for supporting multi-tenancy environments.
Webpage

Hybrid Cloud/Local File System
I implemented a hybrid cloud/local (Amazon S3 and local SSDs) file system to minimize cloud storage costs. The file system can deduplicate identical file contents to reduce storage capacity. The file system can take a snapshot and recovery using taken snapshots. The file system was optimized through software-level write-back cache. The file system was implemented in C++ via FUSE.

Flash Translation Layer (FTL) for NAND Flash SSDs
I implemented a flash translation layer for NAND flash SSDs. The flash translation layer was optimized to improve wear leveling by implementing hot/cold data swapping. The flash translation layer was implemented in C++ via 746FlashSim.

ounglang
ounglang is a interpretative, turing-incomplete, and esoteric programming language for seals written in C.

PEARLS (Programming Evaluation and Rapid Learning Systems)
PEARLS is a learning management system for CS-KMITL. I designed and implemented a relational database schema for PostgreSQL.

PEARLS Lab
PEARLS Lab is a programming competition platform for Code Arcade. I designed and implemented web services and a relational database schema for PostgreSQL.


Competition

ASEAN Data Science Explorer 2020 (2nd Place, National Finals)
Thailand's Agoda Programming Competition (Top 50)
ACM-ICPC 2019 Asia Bangkok Regional Contest
ACM-ICPC 2018 Asia Nakhon Pathom Regional Contest