CV

Name: Shinwoo Park

Position: Ph.D. Candidate in Artificial Intelligence

Affiliation: Yonsei University, Seoul, South Korea

Expected Graduation: February 2026

Research Interests

My research centers on enhancing the safety and transparency of large language models (LLMs) by developing robust methods for detecting LLM-generated content. I have explored two complementary approaches to attributing LLM outputs: linguistic feature-based detection and watermarking, the latter embedding signals that are imperceptible to readers. I have applied these methods to natural language (English and Korean) and programming code (Python, C, C++, and Java), demonstrating generalization across both natural and programming languages. Through this work, I have developed a deep interest in AI safety and AI ethics, particularly in building trustworthy and accountable systems for generative AI.

Research Keywords

Core: LLM-Generated Content Detection, Linguistic Feature-Based Detection, LLM Watermarking
Related: AI Safety, AI Ethics, LLM Guardrails, Responsible AI

Research Summary

My research aims to ensure the safe and responsible use of LLMs by developing reliable methods for detecting LLM-generated content. I pursue two complementary directions: (1) linguistic feature-based detection, which analyzes statistical differences in text patterns between human- and LLM-authored content, and (2) LLM watermarking, which embeds imperceptible but detectable signals during the generation process for post hoc attribution.

I apply both methods to natural language (English and Korean) and source code (Python, C, C++, and Java). Covering both natural and programming languages is intended to strengthen the robustness and real-world practicality of these methods while keeping them interpretable.

Research Statement

My research is driven by the goal of ensuring the safe and responsible use of LLMs through interpretable techniques for content detection. As LLMs spread across domains, risks such as misinformation, academic misconduct, and plagiarism must be addressed.

Linguistic feature-based detection:
From text, I extract features such as word spacing, part-of-speech (POS) n-grams, and comma usage; from code, I extract features such as naming-convention consistency and indentation style. These features are then used as inputs to interpretable classifiers.
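
As a rough illustration of the kind of features described above, the following minimal Python sketch computes a few of them using only the standard library. The function names, feature names, and regular expressions are hypothetical and chosen for illustration; POS n-grams are omitted because they require an external tagger, and the identifier scan is deliberately naive (it also picks up keywords and words inside strings).

```python
import re

# Hypothetical patterns for two common naming conventions.
SNAKE = re.compile(r"^[a-z]+(_[a-z0-9]+)+$")      # e.g. my_var
CAMEL = re.compile(r"^[a-z]+([A-Z][a-z0-9]*)+$")  # e.g. myVar


def text_features(text: str) -> dict:
    """Simple surface features of natural-language text."""
    tokens = text.split()
    n_tokens = max(len(tokens), 1)  # avoid division by zero on empty input
    return {
        "commas_per_token": text.count(",") / n_tokens,
        "spaces_per_char": text.count(" ") / max(len(text), 1),
    }


def code_features(source: str) -> dict:
    """Simple style features of source code."""
    identifiers = set(re.findall(r"\b[A-Za-z_][A-Za-z0-9_]*\b", source))
    snake = sum(1 for name in identifiers if SNAKE.match(name))
    camel = sum(1 for name in identifiers if CAMEL.match(name))
    styled = snake + camel
    # Share of styled identifiers that follow the dominant convention.
    consistency = max(snake, camel) / styled if styled else 1.0

    indented = [ln for ln in source.splitlines() if ln[:1] in (" ", "\t")]
    tab_lines = sum(1 for ln in indented if ln.startswith("\t"))
    tab_ratio = tab_lines / len(indented) if indented else 0.0
    return {"naming_consistency": consistency, "tab_indent_ratio": tab_ratio}
```

Feature dictionaries of this form could then be passed to any interpretable classifier, such as a linear model or a decision tree; which classifier is appropriate is left open here.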

LLM watermarking:
I design both zero-bit and multi-bit watermarking schemes that encode signals into generated outputs, supporting origin identification (zero-bit) and metadata recovery (multi-bit) while maintaining content fluency.
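
For readers unfamiliar with zero-bit watermark detection, the sketch below shows one generic approach: a keyed hash marks each (previous token, token) pair as "green" or not, and the detector tests whether a text contains more green pairs than chance would predict. This is a toy stand-in for illustration, not the specific schemes referred to above; the key, the GAMMA value, and all function names are assumptions.

```python
import hashlib
import math
from typing import Sequence

GAMMA = 0.5  # assumed fraction of token pairs that count as "green"


def is_green(prev_token: str, token: str, key: str = "demo-key") -> bool:
    """Keyed pseudo-random decision: is `token` green after `prev_token`?"""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] / 256 < GAMMA


def detection_z_score(tokens: Sequence[str], key: str = "demo-key") -> float:
    """z-score of the green-pair count against the no-watermark expectation."""
    pairs = list(zip(tokens, tokens[1:]))
    n = len(pairs)
    if n == 0:
        return 0.0
    green = sum(is_green(prev, tok, key) for prev, tok in pairs)
    # Without a watermark, green ~ Binomial(n, GAMMA).
    return (green - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))
```

A generator that favours green tokens would push the z-score well above zero, while unwatermarked text should stay near zero on average; a detector would compare the score against a chosen threshold. Multi-bit schemes, which additionally carry a payload, are not covered by this sketch.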

My contributions span natural language and source code, validated on English and Korean text and on Python, C, C++, and Java code, supporting the broader vision of trustworthy and safe generative AI.

Research Projects

Entrepreneurial Experience

Skills