Resume

Lijie Li · Data Scientist & Photographer

Aalto University · KTH Royal Institute of Technology

I build multilingual AI systems for knowledge-intensive teams, and I keep a freelance-but-playful photo practice for friends, travelers, and anyone chasing better portraits. Whether I'm shaping code or light, I keep the process open, co-created, and guided by a restrained aesthetic.

AI & Machine Learning

Deep Learning · NLP · Speech Recognition · Generative Models · RAG Systems · Data Mining

Engineering & Cloud

Python · PyTorch · LangChain · Triton · Spark · Azure AI Factory · Git · Linux · Async IO

Data & Analytics

SQL · Tableau · MongoDB · SPSS · Visualization · Statistics

Photography & Art

Sony Alpha · Capture One · Color Grading · Studio Lighting · Composition · Visual Storytelling

Dual Practice

Analytical rigor meets cinematic intuition

Research notebooks, lighting studies, and retrieval diagrams live in the same workspace. Data modeling informs how I choreograph light; field recordings inspire interaction flows.

Practice #1

Data Systems Practice

Day job energy goes into multilingual RAG stacks, speech models, and measurable retrieval governance. I prefer shipping explainable systems over publishing papers.

  • Agentic RAG orchestration
  • Knowledge graphs & KG ops
  • QLoRA + TPE fine-tuning
  • Triton + GPU tooling

Practice #2

Freelance Photography Journal

Photography is a lifelong hobby and dialogue space. I freelance selectively, document rituals, and use this site to talk with friends about taste and visual research.

  • Gallery conversations & residencies
  • Slow-fashion capsule stories
  • Experimental lighting notebooks
  • Community photo salons

Hybrid Case Studies

Where data products and visuals converge

Selected systems pairing measurable rigor with sensory storytelling.

AI Systems

VTT · Finland · 3rd place · 2025 AaltoAI Hackathon

Knowledge Graph Challenge on Heterogeneous Sources

Automated ingestion + semantic entity resolution with 100% traceability.

  • Hybrid search (Qdrant ANN + BM25) fused with RRF and Cross-Encoder rerankers.
  • HDBSCAN-powered entity resolution using `text-embedding-small` vectors.
  • Evaluation suite covering Hit Rate, MRR, and innovation lineage tracking.
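
For readers less familiar with the terms above, here is a minimal sketch of Reciprocal Rank Fusion (RRF) over two ranked lists, plus Hit Rate and MRR. It is an illustration, not the hackathon code: the function names, the toy document ids, and the smoothing constant `k=60` (a commonly used default) are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    k=60 is an assumed default, not a value taken from the project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hit_rate_at_k(results, relevant, k):
    """Fraction of queries with at least one relevant doc in the top k."""
    hits = sum(
        1 for res, rel in zip(results, relevant)
        if any(doc in rel for doc in res[:k])
    )
    return hits / len(results)


def mrr(results, relevant):
    """Mean reciprocal rank of the first relevant doc (0 if none found)."""
    total = 0.0
    for res, rel in zip(results, relevant):
        for rank, doc in enumerate(res, start=1):
            if doc in rel:
                total += 1.0 / rank
                break
    return total / len(results)


# Hypothetical rankings standing in for a dense (ANN) and a sparse (BM25) retriever.
dense_ranking = ["a", "b", "c"]
sparse_ranking = ["b", "d", "a"]
fused = rrf_fuse([dense_ranking, sparse_ranking])
```

In this toy case `b` ranks first after fusion because it sits near the top of both lists, which is the behavior RRF is chosen for: it rewards agreement between retrievers without needing their raw scores to be comparable.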

AI Research

Aalto University · 2nd place

SNLP Challenge: Multilingual Speech + Toxicity

WER 0.0664 / CER 0.0123 with Wav2Vec2-BERT + SpecAugment.

  • Fine-tuned multilingual BERT with Triton acceleration and WandB tracking.
  • Benchmarked four multilingual toxicity models across English / German / Finnish.
  • Blended character-level noise defenses with balanced sampling strategies.
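
As a reference for the WER / CER figures above, this is a minimal sketch of how the two metrics are usually defined: edit distance between reference and hypothesis, divided by reference length, at word and character level. The function names are illustrative assumptions, and the challenge's own scoring script may normalize text differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate; assumes a non-empty reference transcript."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate; assumes a non-empty reference transcript."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

Under this definition a WER of 0.0664 corresponds to roughly 6.6 word errors per 100 reference words.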

Data Products

Kunshan Yuanpai Trading · China

Recommendation & Uni-cloud Platform

Reduced query latency and improved personalization for merchandising teams.

  • DBSCAN clustering + MAB exploration to surface high-value customer cohorts.
  • Optimized MongoDB schema and SQL interfaces for order + inventory ops.
  • Built Tableau dashboards to translate raw telemetry into decisions.

AI Engineering

P&G · Finland · 3rd place · 2025 Junction

Agent Challenge on Automated Personalized Marketing

Multimodal agentic workflow in n8n for localized multi-channel assets.

  • Engineered a multimodal agentic workflow in n8n that adapts visual elements and fits copy to each channel's text constraints, localizing assets for the target culture.
  • Implemented a self-reflective, adaptive design that uses chain-of-thought (CoT) reasoning to critique outputs iteratively and enforce safety guardrails.

Studio Notes & Dialogue

Build logs, moodboards, and open conversations

My background spans research-heavy programs, yet the work I share here stays grounded in shipped systems, experiments, and visual notebooks. I publish working notes, lighting studies, and retrieval diagrams so friends can drop by, swap tactics, or plan a casual photo walk.


Experience

Parallel tracks

AI / Data Roles

Aug 2025 — Present

Data Scientist

Lexembed · Stockholm, Sweden

Designing multilingual knowledge engines that blend Agentic RAG, case-based reasoning, and knowledge graphs for legal intelligence teams.

  • Built a multi-hop QA flow that fuses entity extraction with graph traversals for rapid compliance research.
  • Introduced quantitative retrieval guardrails using RAGAS and automated regression suites for every release.

Aug 2023 — Mar 2024

Data Specialist (Intern)

International Digital Economy Academy · Shenzhen, China

Owned the end-to-end lifecycle for policy moderation models, from generative data augmentation to adversarial hardening and deployment.

  • Fine-tuned DeBERTaV3 with QLoRA + TPE, cutting VRAM usage by 80% and improving F1 by 5 points.
  • Used TextAttack adversarial suites to harden classifiers and validated robustness with macro-F1 and MCC dashboards.

Creative Commissions

2024 — Present

Portrait & Travel Sessions Photographer

Freelance Studio · Stockholm · On-location

Think of me as the friend who carries cameras, chats through nerves, and helps you leave with portraits you actually like—whether it’s passport refreshes or playful travel diaries.

  • Deliver same-day biometric-friendly headshots for visas and IDs, plus natural retouching (skin tones, stray hairs) without the heavy filter look.
  • Join shoots as a travel buddy—mapping quiet alleys, cafés, or ferries—so the day feels like hanging out rather than a formal booking.
  • Help prep outfits and pacing, but note I can’t stamp or certify official documents—everything stays casual and personal.

2023 — Present

Street & Candid Sessions

Self-initiated · Stockholm

Lead relaxed portrait walks through Gamla Stan, Södermalm backstreets, and lakeside trails—no stylists, just a friend with a camera and plenty of time.

  • Guide you in prepping outfits and playlists, then stroll together so the shoot feels like catching up rather than performing.
  • Capture both candid street frames and clean portraits, retouching lightly while keeping your features and mood intact.
  • Share albums plus editing notes so you can re-export or print with the same color story later.

Capabilities

Core skills

Data Science / AI

  • Agentic RAG orchestration with LangChain + custom tools
  • Python, PyTorch, Triton inference, Qdrant/BM25 hybrid retrieval
  • ASR & NLP fine-tuning (DeBERTaV3, Wav2Vec2, multilingual BERT)

Photography Practice

  • Portrait direction & candid street sessions
  • On-location natural light planning (Stockholm / EU / Shenzhen)
  • Color-proofing, light post-processing, and proof print prep

Commissions & Engagements

Ways we can collaborate

Portrait Sessions

Book me as a portrait/travel friend—passport renewals, casual street walks, light retouching included (official stamped docs not supported).

Data Systems Engagements

Open to full-time roles or embedded sprints for retrieval, multilingual QA, ASR research, or evaluation pipelines.

Availability

Accepting shoots & data engagements

Currently booking portrait sessions across Europe and Shenzhen, and taking on remote/onsite data science engagements that run from retrieval architecture to ASR research sprints.

Based in Stockholm · Shenzhen-friendly · English / 中文