Zhinan Liu

(206) 714-7829 · zliu24601@gmail.com

MSEE candidate at the University of Washington specializing in embedded systems. Focused on MCU/SoC bring-up, peripheral driver development, and firmware optimization on ESP32 and STM32 platforms. Proficient in C/C++, Python, and assembly (RISC-V, ARM, Xtensa), with working knowledge of FreeRTOS and Linux. Experienced with communication protocols (I²C, SPI, UART, Bluetooth, BLE). Developed a SIMD-optimized math library for the ESP32-S3 that achieves up to 10x integer and 3-5x floating-point speedups over scalar code.

Skills

Languages

C, C++, Python, RISC-V ASM, Xtensa ASM, SystemVerilog

MCUs / SoCs & Peripherals

ESP32-S3, STM32, Xtensa LX7, ARM Cortex-M7
Interfaces: I²C, SPI, UART, USB 3.0, Bluetooth, BLE

Systems

Bare-metal, FreeRTOS/RTOS, device drivers, DMA, Embedded Linux/Yocto

Tooling

Git, CMake, esp-idf, OpenOCD/JTAG, Docker, Jenkins

Projects

esp_simd

High-level C library wrapping Xtensa SIMD intrinsics for vectorized math on ESP32-S3. Provides safe alignment, saturation handling, and drop-in APIs for esp-idf.

Hand-tuned, branchless assembly with zero-overhead loops
Vector ops (add, sub, sum, dotp, etc.) for int8/int16/int32 and float32; benchmarks show roughly 5-10x speedup on integer types and 3-5x on float32 versus scalar code
Reproducible benchmarks and unit tests; CMake integration
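
To illustrate what "saturation handling" means for the integer ops above, here is a minimal scalar sketch of a saturating int8 add. The names `sat_i8` and `sat_add_i8_ref` are hypothetical and are not taken from esp_simd; the sketch only shows the clamping behavior that a reference implementation in a unit test might check against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not esp_simd API): clamp a wide value to int8 range. */
static int8_t sat_i8(int32_t v)
{
    if (v > INT8_MAX) return INT8_MAX;
    if (v < INT8_MIN) return INT8_MIN;
    return (int8_t)v;
}

/* Hypothetical scalar reference: element-wise int8 add that clamps to
   [-128, 127] instead of wrapping around on overflow. */
void sat_add_i8_ref(const int8_t *a, const int8_t *b, int8_t *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Widen before adding so the intermediate sum cannot overflow. */
        out[i] = sat_i8((int32_t)a[i] + (int32_t)b[i]);
    }
}
```

With wrapping arithmetic, 100 + 100 in int8 would come out negative; with saturation it clamps to 127.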

Scalar vs SIMD ASM (dot product example)

Scalar (baseline)


/*
    C Code:
    int32_t output = 0;
    int8_t *vec1_data = (int8_t*)vec1->data;
    int8_t *vec2_data = (int8_t*)vec2->data;
    for (int i = 0; i < vec1->size; i++){
        int a = (int)vec1_data[i];
        int b = (int)vec2_data[i];
        output +=  a * b;
    }
    *result = output;
    return VECTOR_SUCCESS;
*/

420169d4:   08d8        l32i.n  a13, a8, 0
420169d6:   03e8        l32i.n  a14, a3, 0
420169d8:   0a0c        movi.n  a10, 0
420169da:   0acd        mov.n   a12, a10
420169dc:   0005c6      j   420169f7 
420169df:   00          .byte   00
420169e0:   8daa        add.n   a8, a13, a10
420169e2:   000882      l8ui    a8, a8, 0
420169e5:   238800      sext    a8, a8, 7
420169e8:   beaa        add.n   a11, a14, a10
420169ea:   000bb2      l8ui    a11, a11, 0
420169ed:   23bb00      sext    a11, a11, 7
420169f0:   8288b0      mull    a8, a8, a11
420169f3:   cc8a        add.n   a12, a12, a8
420169f5:   aa1b        addi.n  a10, a10, 1
420169f7:   e53a97      bltu    a10, a9, 420169e0 
420169fa:   04c9        s32i.n  a12, a4, 0

SIMD (esp_simd)

 
simd_dotp_i8:
    entry a1, 16                                    // reserve 16 bytes for the stack frame
    extui a6, a5, 0, 4                              // extracts the lowest 4 bits of a5 into a6 (a5 % 16), for tail processing
    srli a5, a5, 4                                  // shift a5 right by 4 to get the number of 16-byte blocks (a5 / 16)
    movi.n a7, 0                                    // zeros a7
    beqz a5, .Ltail_start                           // if no full blocks (a5 == 0), skip SIMD and go to scalar tail

    // SIMD mul-accumulate loop for 16-byte blocks 
    ee.zero.accx                                    // clears the ACCX accumulator
    ee.vld.128.ip     q0, a2, 16                    // loads 16 bytes from a2 into q0, then increment a2 by 16
    loopnez a5, .Lsimd_loop                         // loop until a5 == 0
        ee.vld.128.ip     q1, a3, 16                // loads 16 bytes from a3 into q1, then increments a3 by 16 
        ee.vmulas.s8.accx.ld.ip q0, a2, 16, q0, q1  // multiply-accumulates q0 and q1 into ACCX, then loads the next 16 bytes from a2 into q0 and increments a2 by 16
    .Lsimd_loop:

    rur.accx_0 a7                                   // read the lower 32 bits of ACCX into a7
    addi a2, a2, -16                                // adjust a2 pointer back to the last processed element (it goes too far due to the last increment in the loop)

    .Ltail_start:                                   // Handle remaining elements that were not part of a full 16-byte block  
    loopnez a6, .Ltail_loop 
        l8ui a8, a2, 0
        l8ui a9, a3, 0
        sext a8, a8, 7
        sext a9, a9, 7
        mull a8, a8, a9
        add a7, a7, a8 
        addi a2, a2, 1
        addi a3, a3, 1
    .Ltail_loop:  
        
    s32i.n a7, a4, 0
    movi.n a2,  0                                   //return exit code 0 (success)
    retw.n

Notes: the SIMD loop consumes 16 int8 elements per iteration using 128-bit vector loads; the remaining (length mod 16) elements are handled by the scalar tail loop.
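
The control flow of the SIMD routine above (full 16-element blocks first, then a scalar tail) can be modeled in portable C. This is only a sketch for checking results on a host machine: the name `dotp_i8_blocked` and its signature are assumptions for illustration, and the inner lane loop stands in for the single 16-lane multiply-accumulate instruction used on the ESP32-S3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C model of the block/tail split in simd_dotp_i8;
   not the library's actual API. */
int32_t dotp_i8_blocked(const int8_t *a, const int8_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    size_t blocks = n >> 4;   /* full 16-element blocks (srli a5, a5, 4)  */
    size_t tail   = n & 15u;  /* leftover elements (extui a6, a5, 0, 4)   */
    int32_t acc = 0;

    /* Block loop: each outer pass corresponds to one vector multiply-accumulate. */
    for (size_t blk = 0; blk < blocks; blk++) {
        for (size_t lane = 0; lane < 16; lane++) {
            acc += (int32_t)a[lane] * (int32_t)b[lane];
        }
        a += 16;
        b += 16;
    }

    /* Scalar tail: same work as the loop at .Ltail_start. */
    for (size_t i = 0; i < tail; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```

A model like this is useful as a host-side oracle: run it and the assembly routine on the same inputs, including lengths that are not multiples of 16, and compare the results.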

Benchmarks: esp_simd vs scalar

Figure 1: Benchmark results for esp_simd showing speedups over scalar across operations and vector sizes.
Figure 2: Benchmark results for esp_simd showing operation runtime for 32 vectors of random length 1–256.

Tech: C, C++, Xtensa ASM, esp-idf, CMake

Experience

Researcher

Harborview Medical Center

Research engineer at the Harborview Injury Prevention and Research Center (HIPRC). Developed automated data pipelines for clinical research; built and maintained a trauma-transfusion database linking trauma admissions to patient blood use; built and deployed analytical and predictive models on large trauma datasets.

Emphasis on robust, reproducible pipelines and production deployment (version control, CI/CD).

Feb 2021 – May 2024

Student Assistant

University of Washington, Lieber Lab

Researched adenovirus-based gene therapy (Hemophilia A, β-Thalassemia). Analyzed off-target CRISPR mutagenesis from large genomic datasets.

Built analysis tooling and workflows; collaborated across engineering/research teams.

Aug 2016 – Aug 2020

Education

University of Washington

Master of Science, Electrical Engineering

Relevant Coursework: Computer Architecture, Embedded Software Design, Data Structures & Algorithms

Sep 2023 – Dec 2025 (est.)

University of Washington

Bachelor of Science, Biochemistry

Sep 2016 – Jun 2020

Publications

Selected publications

Age, admission platelet count, and mortality in severe isolated traumatic brain injury: A retrospective cohort study. Anesth Analg. 2023 May 1;136(5):927-933. doi: 10.1213/ANE.0000000000006388. Epub 2023 Apr 14. PMID: 37058729. Kunapaisal T, Phuong J, Liu Z, et al.
Ultramassive Transfusion for Trauma in the Age of Hemostatic Resuscitation: A Retrospective Single-Center Cohort From a Large US Level-1 Trauma Center, 2011-2021. Anesth Analg. 2023 May 1;136(5):927-933. doi: 10.1213/ANE.0000000000006388. Epub 2023 Apr 14. PMID: 37058729. Muldowney M, Liu Z, Stansbury LG, Vavilala MS, Hess JR.
Drivers of blood use in paediatric trauma: A retrospective cohort study. Transfus Med. 2022 Oct;32(5):383-393. doi: 10.1111/tme.12901. Epub 2022 Aug 14. PMID: 36205390. Gebregiorgis HT, Hasan RA, Liu Z, et al.
Blood product availability in the Washington state trauma system. Transfusion. 2022 Jun;62(6):1218-1229. doi: 10.1111/trf.16888. Epub 2022 Apr 26. PMID: 35470898. Ali M, Liu Z, et al.
Safe and efficient in vivo hematopoietic stem cell transduction in nonhuman primates using HDAd5/35++ vectors. Mol Ther Methods Clin Dev. 2021 Dec 6;24:127-141. doi: 10.1016/j.omtm.2021.12.003. Erratum: 2022 May 22;25:533. Li C, Wang H, Gil S, … Liu Z, et al.
Concussion symptoms and temporary accommodations using a student-centered return to learn care plan. NeuroRehabilitation. 2021;49(4):655-662. doi: 10.3233/NRE-210182. Philipson EB, … Liu Z, et al.
Cold-stored whole blood and platelet counts in severe acute injury: A comparison of four retrospective cohorts. Transfusion. 2021 Dec;61(12):3321-3327. doi: 10.1111/trf.16699. Asif M, Hasan RA, Liu Z, et al.
Blood component use and injury characteristics of acute trauma patients… Transfusion. 2021 Nov;61(11):3139-3149. doi: 10.1111/trf.16679. Liu Z, et al.
Curative in vivo hematopoietic stem cell gene therapy of murine thalassemia using large regulatory elements. JCI Insight. 2020;5(16):e139538. Wang H, … Liu Z, et al.
High-level protein production in erythroid cells derived from in vivo transduced hematopoietic stem cells. Blood Adv. 2019 Oct 8;3(19):2883-2894. Wang H, Liu Z, et al.
Single-dose MGTA-145/plerixafor leads to efficient mobilization and in vivo transduction of HSCs… Blood Adv. 2021 Mar 9;5(5):1239-1249. Li C, … Liu Z, et al.