
CZ Invests $11M in Seed Round for a Chinese Junior Student's Education Agent Startup
College students in Silicon Valley are reshaping how people think about AI entrepreneurship, with one AI product after another setting new fundraising records.
Author: Founder Park
A Chinese undergraduate student has closed an $11 million seed round, making this the highest-funded student startup in Silicon Valley to date.
VideoTutor, an education agent product targeting K12 students that generates personalized teaching/explanatory videos from a single sentence, today announced the completion of its $11 million seed funding round. The round was led by YZi Labs, with participation from Baidu Ventures, ChinChow Fund, Amino Capital, BridgeOne Capital, and several prominent individual investors.
This is also the first AI product company invested in by YZi Labs.
Founder Kai Zhao stated that VideoTutor received recognition and support from CZ (Changpeng Zhao) and the YZi Labs investment team, ultimately securing YZi Labs as the lead investor. The team received more than 10 term sheets and carefully selected this small group of partners.
The first version of the product launched on May 14 (initially released on Founder Park’s Product Marketplace), gaining market validation and confirming PMF (Product-Market Fit). In under five months, they closed this $11 million seed round.
In Kai's view, the core reason the round came together is that the direction was right: a "young genius team" used visual learning to address the pain points of U.S. college entrance exam preparation for K12 students.
"This field is particularly suitable for young people, especially those with strong engineering and hands-on capabilities, combined with founders who possess deep insight and experience, enabling extremely fast execution."
They're not alone—Cursor, Mercor, Pika, GPTZero, and others show how university students in Silicon Valley are redefining perceptions of AI entrepreneurship with one high-funding AI product after another.
Entrepreneurship in the AI era really is different now.
We spoke with these young founders behind VideoTutor to understand why they secured this seed round, what changes are happening in Silicon Valley startups today, and why they’re so eager to hire talent from major Chinese tech companies.
Guests: CEO Kai Zhao, CTO James Zhan.
Interview & Editing | Wan Hu
Below is the interview, edited and organized by Founder Park.
01 In the K12 Space, Visual Learning Is the Real Direction
Founder Park: So many institutions are bullish on you. What do you think resonated most with them?
Kai: First, I believe it’s because we’re heading in the right direction. The AI education space holds tremendous potential, and we’ve focused specifically on U.S. college entrance exams like SAT and AP. Our target users are high schoolers in the K12 system, and our gap with this user group is minimal—practically no generational divide. We’ve personally gone through the entire test prep cycle and know exactly where the pain points lie, allowing us to build a product that truly solves real problems for this audience.
Second, the team is exceptionally strong. James comes from Gemini, where he was a core AI and algorithms engineer at Google. I have worked on three education startups, counting this one, and have been building educational software since my freshman year. In my sophomore year, I co-founded MathGPTPro, which was selected into Merit Circle (a well-known accelerator). I have a proven track record of successfully launching education products.
Third, our work in AI education centers around an animation engine—and we are the core developers of VideoTutor, making us the team that understands the core technology best. This allows us to achieve highly accurate rendering with our animation engine.
The team also has strong marketing DNA—we know how to spread the word effectively.
VideoTutor perfectly embodies what mainstream U.S. VCs call the “young genius team” thesis: a domain well-suited for young people, paired with excellent engineering capability, and founders with strong insight, experience, and rapid execution. I think this is a shared conviction among all the investors who backed us.

VideoTutor at YZi Labs EASY Residency Demo Day, ringing the NYSE bell
Founder Park: What core problem in education does your product aim to solve?
Kai: Current learning products on the market fall into two categories: active learning and passive learning. Passive learning products, such as ByteDance’s Gauth, Chegg, and AnswersAi, serve what we call the “homework help” scenario. The learning loop is very short—students pay primarily to get homework answers.
VideoTutor, however, targets active learning scenarios. We don’t need to worry about student motivation because they *have* to study and take exams—like the SAT or AP. In this context, there’s a significant demand for visualization. About 80% of U.S. college entrance exam content involves functions, calculus, and other topics requiring complex graphical rendering. Our animation engine excels precisely in this area.

Moreover, this market has very high customer willingness to pay. On average, 2.6 million U.S. students take the SAT annually, creating massive paid demand. Offline SAT courses are expensive—not sold in packages, but priced hourly, starting at $150 per hour and often reaching $230. Many students and parents are willing to pay. But VideoTutor can effectively replace or even surpass human tutoring, as AI-generated videos today are nearly indistinguishable from those taught by humans. This way, students can have their own personalized AI tutor at the lowest possible cost.
Founder Park: What inspired you to build this product?
Kai: Before us, a team from Stanford attempted something similar—Gatekeep Ai. They also aimed for visual learning. At the time, I already sensed the impact this direction could have. In previous ventures, most education products were essentially GPT API wrappers—ChatGPT-based tools. But we realized text-only Q&A products hit a ceiling. You can see Chegg and Gauth’s businesses declining, as many use cases are now replaced by ChatGPT—students pay $20 to solve homework via ChatGPT instead.
API wrapper products optimized at the surface level have reached their limits.
But multimodal visual generation has huge potential, especially in U.S. college exams where visual learning is critical. Unfortunately, Gatekeep started strong but didn’t continue. They launched too early, when foundation models were not yet mature at coding and GPT-4 hadn’t been released. Plus, math animation engines require advanced rendering and algorithms, which they failed to crack. Our team has mastered all of the core development of the animation engine, solving the accuracy issue in video rendering.
02 PMF: Strong User Willingness to Pay
Founder Park: After launch, you partnered with several schools. When did you feel, 'We got it right—the pain point is validated,' indicating PMF?
Kai: I’d break it down into three dimensions.
First, revenue metrics: To date, VideoTutor has received API requests from 1,000 organizations—including all major U.S. education institutions and even some in China. Many schools want to purchase our service. On the consumer side, feedback is even more direct. One parent, who is also an investor, tried our product and shared it with friends and family—all of whom were willing to pay. He somehow got my number and texted me wanting to invest. There’s very strong willingness to pay from end users.
Second, user demand. Why is 1-on-1 private tutoring so essential in the U.S.? Because parents believe it works and are willing to pay. Now, multimodal AI can simulate the same 1-on-1 teaching effect—your question gets an immediate, tailored response. And honestly, pre-recorded videos from top online tutors are no different from AI-generated ones. That’s what I call “demand migration”—if students pay thousands for recorded courses that are no better than what AI produces, why not switch to AI? It’s cheaper and offers better teaching quality.
We’ve received overwhelmingly positive feedback from students, and many teachers are actively promoting our product. Early completion rates and usage duration are excellent. The 200 seed users we’ve screened were accumulated organically.
Third, there’s a certain product taste and intuition. As you keep iterating—considering progress in the education industry, core payment drivers for parents and students, and the product’s own evolution—you realize everything forms a closed loop. These three dimensions together confirm PMF. Most importantly, willingness to pay is extremely strong.

Partnership with FIZZ
Founder Park: Many users proactively want to pay, and some even reach out to invest.
Kai: Yes. In the SAT/AP space, willingness to pay is inherently high. Average pricing starts at $100–200 per session, with in-person classes costing up to $800. With 2.6 million U.S. students taking the SAT annually, and 37% paying for prep, this is a market with intense demand. Our product enables seamless demand migration.
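As a rough back-of-the-envelope check on the figures Kai cites, the sketch below multiplies the 2.6 million annual SAT takers by the 37% who pay for prep. The function name and the choice of integer percentages are illustrative assumptions, not anything published by VideoTutor.

```python
def paying_students(test_takers: int, paying_percent: int) -> int:
    """Return how many test takers pay for prep, using integer math to avoid float rounding."""
    return test_takers * paying_percent // 100


SAT_TAKERS_PER_YEAR = 2_600_000  # figure quoted in the interview
PAYING_PERCENT = 37              # share paying for prep, as quoted in the interview

paying = paying_students(SAT_TAKERS_PER_YEAR, PAYING_PERCENT)
print(f"{paying:,} students pay for SAT prep each year")
```

On those two quoted numbers alone, the paid SAT prep segment comes to a little under one million students a year.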
Founder Park: For SAT test-takers, would they trust an AI over a human tutor?
Kai: Today, AI rarely makes factual errors on SAT/AP-level questions. Why is it better than a human tutor? First, it’s cheaper. Second, students can ask anything without fear of judgment or impatience. They can learn anytime, 24/7.
And this market is transferable—after the U.S., we can expand to Canada, the UK’s A-Level exams, etc., where paid demand is equally strong.
Founder Park: How are you thinking about monetization now?
Kai: We offer monthly subscriptions and outcome-based pricing. I believe AI is now capable of charging based on results. We might launch a package: pay $799, and we guarantee your child achieves a perfect SAT Math score.
Founder Park: But outcome-based pricing depends on the student’s effort, doesn’t it?
Kai: That wouldn’t work for China’s Gaokao, which covers thousands of knowledge points. But the SAT has only 62 key topics—50 standard ones that most students master easily, and 12 additional ones mostly within reach. Unless a student has serious logical issues, failure is unlikely. And AI significantly boosts efficiency.
Many U.S. online tutors already offer this—pay $1,800, and they guarantee success because SAT topics are fixed. As long as the student has normal intelligence, it’s virtually guaranteed. But Gaokao prep can’t be accelerated this way—it requires widening score gaps with difficult questions. The SAT doesn’t have “impossible” questions; it simply tests whether you’ve mastered the concepts.
Outcome-based pricing is already a proven model used by tutors—we’re just building on existing conditions.
Founder Park: Does model cost affect your pricing? Is it a significant portion?
Kai: Our customer price point is very high—starting at $69/month—while model costs are now negligible. Unlike coding tools, where everyone competes on price due to long-context needs, education doesn’t face that pressure.
03 For a High Schooler-Focused Product, Web Is King
Founder Park: If I recall, your first prototype took just over two months. How did you approach development cycles, task division, and deciding which features to build or skip?
Kai: Our entire team agreed: iterate fast, because speed brings early user feedback.
After launching the first version on Twitter, we saw massive traction and user influx. But many early users were programmers, investors, or tech enthusiasts—what we call “tech early adopters.” Feedback from them was scattered and less valuable. We needed to filter out truly core seed users: high-quality high schoolers—and gather meaningful insights through direct consultation.
The key feedback we received was clear: video rendering accuracy must be 100%. That became our top optimization priority. Features like UI aesthetics or multiple TTS voice options were cut. We returned to the product’s essence: we’re solving STEM learning, so graphical rendering accuracy is paramount.
Founder Park: How did you decide on video length initially?
Kai: Initially, peak length was around 6 minutes. Our main consideration: explanations for standard problems shouldn’t exceed 6 minutes. But later feedback showed some lower-performing students wanted slower, deeper explanations. We realized length shouldn’t be capped—it should adapt to the user’s learning pace.
Founder Park: What’s the maximum length now?
Kai: Up to an hour, with continuous Q&A. The system generates content in real-time during interaction—a feature recently added, absent in the initial version.
Founder Park: Were there any features you initially wanted but later deprioritized?
Kai: Like a mobile app. We considered building one quickly, but discovered most U.S. students use laptops or iPads. Most K12 schools provide Chromebooks, and homework is done on computers. Nearly every high schooler has a computer; phones account for less than 5% of learning time.
Founder Park: So for an education or student-focused product, web should come first—app is secondary.
Kai: Exactly. We already knew this data from years living in the U.S. Later, we surveyed 100 early users from tens of thousands—over 90 had computers. That confirmed it further.
Founder Park: Was your first version also targeted at K12 from day one?
Kai: Yes, and we remain focused on this group. We’re not direct competitors with Gauth—we focus on exam prep. Many U.S. high schoolers already use offline or online platforms, and VideoTutor seamlessly migrates that demand.
Founder Park: Will K12 remain your core user base for at least a year?
Kai: Likely for the next two years; K12 will remain our key focus.
04 Using Large Models, But Not Relying Solely on Them
Founder Park: Could you briefly explain your current technical architecture? VideoTutor excels at generating courses and charts, outperforming other video models, many of which struggle even with basic text accuracy. Your tech is impressive.
James: Our generated videos include both text and graphics. The general workflow: a large language model generates text and corresponding animation instructions, which are then rendered by our proprietary animation engine and composited into the final video.
Text rendering is relatively straightforward—we generate text via LLM and overlay it directly. But the animation component is powered entirely by our custom-built math animation rendering engine. Its strength lies in ultra-precise rendering of coordinate axes, geometric shapes, and other elements—that’s our core technical edge.
Current LLMs only output text. Our agent setup gives the LLM a “pen and paper,” letting it draw the ideal teaching animation it imagines. The drawing part—that’s all our technology.
Founder Park: How do you handle final video synthesis, including audio and video?
James: Initially, the user inputs a prompt, e.g., “What is the Pythagorean theorem?” Step one: we let the LLM reason through all scenes—typically 3 to 5, depending on question complexity—and draft a rough script for each. Then, a second round of reasoning generates scene-specific text, diagrams, and voiceover scripts. Voiceovers are synthesized via TTS.
Finally, we stitch all scenes together into a complete video.
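The workflow James describes (plan the scenes, detail each scene, synthesize the voiceover, render, then stitch) can be sketched roughly as follows. This is a minimal illustration and not VideoTutor's actual code: the `Scene` fields, the function names, and the `llm`, `render`, and `tts` callables are all hypothetical stand-ins, and "stitching" is reduced to keeping the clips in order.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    outline: str         # rough script from the first reasoning pass
    text: str = ""       # on-screen text from the second pass
    diagram: str = ""    # animation instructions for the rendering engine
    voiceover: str = ""  # narration script, later sent to TTS


def plan_scenes(prompt: str, llm: Callable[[str], str]) -> List[Scene]:
    """First pass: the model returns one scene outline per line."""
    reply = llm(f"Plan scenes for: {prompt}")
    return [Scene(outline=line.strip()) for line in reply.splitlines() if line.strip()]


def detail_scene(scene: Scene, llm: Callable[[str], str]) -> Scene:
    """Second pass: fill in text, diagram instructions, and voiceover for one scene."""
    scene.text = llm(f"On-screen text for: {scene.outline}")
    scene.diagram = llm(f"Animation instructions for: {scene.outline}")
    scene.voiceover = llm(f"Voiceover script for: {scene.outline}")
    return scene


def build_video(prompt, llm, render, tts) -> List[str]:
    """Run both passes, synthesize audio, render each scene, and return clips in order."""
    clips = []
    for scene in plan_scenes(prompt, llm):
        scene = detail_scene(scene, llm)
        audio = tts(scene.voiceover)
        clips.append(render(scene.text, scene.diagram, audio))
    return clips  # a real system would concatenate these into one video file


def fake_llm(request: str) -> str:
    """Stand-in model so the sketch runs without any external service."""
    if request.startswith("Plan scenes"):
        return "State the theorem\nDraw a right triangle\nWork an example"
    return f"[{request}]"


clips = build_video(
    "What is the Pythagorean theorem?",
    fake_llm,
    render=lambda text, diagram, audio: f"clip({text}|{diagram}|{audio})",
    tts=lambda script: f"audio({script})",
)
print(len(clips), "scenes rendered")
```

Passing the model, renderer, and TTS in as plain callables keeps the sketch independent of any particular vendor, which matters here because the interview does not name the APIs involved at this step.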
Founder Park: I understand that was your initial approach. Now that real-time interaction is added, has the generation process changed?
James: Yes, it has. To minimize user wait time, we now render the first scene immediately while background processes handle subsequent scenes. When a user asks a question, we convert their speech to text, then feed that text along with all prior scene content back to the LLM for reasoning, which plans the next teaching segment. The rendering pipeline remains the same afterward.
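One way to picture the change James describes is below: the first scene is rendered right away while later scenes are queued in the background, and a follow-up question is combined with everything shown so far before going back to the model. This is only a sketch under stated assumptions. The function names, the thread pool, and the prompt wording are hypothetical, and speech-to-text is assumed to have already produced `question_text`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List


def stream_scenes(outlines: List[str], render: Callable[[str], str]) -> Iterator[str]:
    """Yield rendered scenes in order while later scenes render in the background."""
    if not outlines:
        return
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Queue the later scenes first so they start rendering right away.
        pending = [pool.submit(render, outline) for outline in outlines[1:]]
        # Render the first scene on the calling thread so it can be shown at once.
        yield render(outlines[0])
        for future in pending:
            yield future.result()


def follow_up_prompt(question_text: str, prior_scenes: List[str]) -> str:
    """Combine the transcribed question with all content shown so far."""
    history = "\n".join(prior_scenes)
    return (
        f"Previously shown:\n{history}\n\n"
        f"Student asks: {question_text}\n"
        "Plan the next teaching segment."
    )


shown = list(stream_scenes(["intro", "proof", "example"], lambda o: f"rendered:{o}"))
prompt = follow_up_prompt("Why does this only work for right triangles?", shown)
print(prompt)
```

The generator returns scenes in their original order even though later ones may finish rendering earlier, so playback order never depends on timing.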
Founder Park: Suppose a user has a question at the one-minute mark and speaks up. You receive the query, combine it with prior content, and process it. During this, does the animation continue or pause?
James: Our latency has dropped from 20–30 seconds initially to under 5 seconds. During interaction, we use transitional cues so users barely notice the delay—making the experience feel smooth. Within 4–5 seconds, they see new content dynamically generated based on their question.
Currently, the AI teacher says, “Hmm, let me think,” then wipes the board—just like a real teacher. If you find the explanation flawed, they erase and rewrite it. This feels natural.
We’re not purely reactive either—we proactively insert quizzes. Based on quiz results and user questions, we adjust teaching flow. Also, we don’t use open mics; users must manually activate the microphone, adding intentionality to interactions.
Founder Park: So with this mechanism, you can generate up to an hour of continuous lecture.
James: Precisely, there’s no hard limit—if the user keeps asking, it keeps going.
Kai: Right, no preset limits. Building VideoTutor aligns with advancements in multimodal AI. We’re not inventing demand—we’re fulfilling existing needs better. Think about in-person teaching: why do U.S. parents pay premium prices? Because tutoring is mostly 1-on-1, starting at $100/hour. The value lies in guided questioning—teachers observe your weaknesses and probe accordingly. VideoTutor mimics that real teaching effect, enabling real-time interactive learning for every student.
Founder Park: Do students need to turn on their cameras during class?
Kai: Not really. Camera usage depends on U.S. privacy laws. We won’t force activation—users decide. Primary interaction remains voice-based Q&A.
Founder Park: Technically, do you use a mix of small on-device models and large cloud models?
Kai: A hybrid approach. Internally, we’ve built a dataset with over 100,000 video entries. High-quality samples are manually annotated and used to fine-tune smaller models. For example, we have over 8,000 SAT training samples. These fine-tuned small models work alongside commercial cloud models like Claude and Gemini.
Founder Park: Does using Claude, Gemini, or GPT affect core performance?
Kai: For K12 content, foundational model capability is already sufficient. But to ensure 100% accuracy, we run dual-model verification—if both agree, error risk is near zero. For code generation, we favor Claude due to its superior coding ability.
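The dual-model check Kai mentions could look something like the sketch below: ask two independent models the same question and accept the answer only when they agree. The normalization rule, the function names, and the behavior on disagreement (returning `None` so the caller can escalate) are assumptions for illustration; the interview does not say how VideoTutor compares answers or handles mismatches.

```python
from typing import Callable, Optional


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(answer.lower().split())


def verified_answer(
    question: str,
    model_a: Callable[[str], str],
    model_b: Callable[[str], str],
) -> Optional[str]:
    """Return the answer only if both models agree; otherwise return None."""
    answer_a = model_a(question)
    answer_b = model_b(question)
    if normalize(answer_a) == normalize(answer_b):
        return answer_a
    return None  # assumption: the caller retries or flags the item for review


# Stand-in models so the sketch runs without any external API.
agree = verified_answer("What is 3^2 + 4^2?", lambda q: "25", lambda q: " 25 ")
disagree = verified_answer("What is 3^2 + 4^2?", lambda q: "25", lambda q: "24")
print(agree, disagree)
```

Exact string matching after normalization is the simplest possible agreement test; comparing free-form math answers in practice would likely need something stricter, such as symbolic or numeric equivalence.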
Founder Park: Where are your current technical bottlenecks—model capability or code generation?
Kai: Model capability is one piece. Rendering speed is another—we’ve reduced it to under 5 seconds, and further GPU deployment will accelerate it. The third challenge is long-term memory. We need to accumulate longitudinal learning data—knowing which concepts a student struggles with, even if forgotten weeks ago, so we can re-teach them.
James: We’ve invested heavily in reducing rendering time—from 2 minutes to 1, then to under 10 seconds. Our ultimate goal is near-zero latency: the moment inference finishes, output appears. It’s a key challenge we’re tackling, though we’ve identified promising new directions.
05 Don’t Measure Completion Rates—Measure Final Exam Scores
Founder Park: How do you measure core metrics now? How do you judge if a video is useful?
Kai: The most important metric is exam performance. In the new version, after watching a video, a quiz appears at the end. Getting it right means you understood; failing means the explanation wasn’t clear.
You can’t judge learning effectiveness solely by completion rate—some students grasp concepts halfway through. If they pass a mid-video quiz, they don’t need to watch the rest. Our core metric is how many students improve their scores using VideoTutor.
Founder Park: But final exams happen externally—how do you collect results?
Kai: This ties into U.S. product culture: users naturally share positive outcomes. Many students who used VideoTutor for SAT prep voluntarily share their scores and experiences. We invite them to become campus ambassadors for secondary promotion.
We have 20 high school campus ambassadors. Mercor succeeded early using the classic “user success story” model—helping Indian engineers land U.S. jobs, then filming their stories. That created powerful word-of-mouth. VideoTutor follows the same logic: more students achieving great results, sharing their journeys—turning real outcomes into compelling narratives.
Founder Park: Which platforms do students mainly use to share?
Kai: Students use TikTok; parents use Facebook groups.
Founder Park: Looking ahead six to twelve months, what’s your growth strategy?
Kai: Fundamentally, VideoTutor is a consumer (C-side) product, so word of mouth is critical. Many successful AI apps grew through word of mouth among seed users, like designers loving a tool and spreading it. For us, the key metric is how many SAT takers achieve high scores using our product and tell others. Parents use Facebook and Instagram; students use TikTok, and we’ll leverage these channels. Once consensus builds, school teachers will naturally notice. Many schools learned about us because teachers tried it, liked it, and recommended it to procurement officers. Ultimately, consumer word of mouth, meaning how many kids improved their scores, is the true north star.
Founder Park: What’s the timeline and status for the new version?
Kai: We aim for public release within two months. By then, students will experience near-instant Q&A with minimal latency, and STEM graphics rendering will be 100% accurate. For now, we won’t cover competition-level content or advanced college topics like linear algebra—focus remains on K12.
Founder Park: What’s VideoTutor’s current moat or defensibility?
Kai: Several aspects. First, a data flywheel. Videos are backed by code; high-quality user-generated videos, once re-annotated, train and refine our models. More data = better videos. Plus, behavioral data reveals individual knowledge gaps—building a stronger flywheel: the more users, the smarter the product becomes.
Second, technological lead—especially our animation engine algorithms. While algorithms alone aren’t the ultimate edge, rapid iteration and growing data amplify our advantage.
Third, brand. VideoTutor has become a leading AI education brand among North American parents. Parental trust itself is an invisible barrier.
Founder Park: Three to five years from now, what do you envision VideoTutor becoming?
Kai: We hope VideoTutor becomes everyone’s AI tutor for STEM subjects. We focus only on STEM. I believe it will surpass Duolingo. Duolingo is a global leader in language learning, but no world-class product has emerged for STEM—because STEM demands heavy graphical rendering. Now that foundational models are ready, I believe the next “Duolingo” will emerge in STEM.
06 Hiring—Especially Talent from Chinese Tech Giants
Founder Park: You’ve had previous startups—what were they about?
Kai: I’m currently a junior. In my freshman year, I co-founded an edtech startup with James, raising $200K in angel funding. Though it failed, I gained crucial insight: avoid commoditized competition. Our app faced many similar products, forcing us into costly user acquisition early on, making monetization hard.
My second venture was joining MathGPTPro as a co-founder for several months. There, I learned to analyze product metrics, build products, and scale users. That’s when I concluded that text-based Q&A education products are a dead end. They’re no different from ChatGPT, and the structured knowledge databases painstakingly built by companies like Zuoyebang are now being replaced by LLM capabilities. So for my third venture, I knew visualization was inevitable.

Photo of Kai Zhao pitching to Sam Altman at Harvard University
Founder Park: Beyond recognizing text-product limitations, how did those past experiences help with VideoTutor in terms of team or other areas?
Kai: Immensely.
First, better judgment of market direction and product viability—I now assess competitor traffic and revenue to gauge product evolution.
Second, product development: better sense of pacing, design, front-end/back-end integration, and key metrics.
Third, team management and organizational culture. I’ve established clearer systems: role definitions, rewards, equity distribution. I’ve also learned fundraising. This $11M round was completed in under 20 days.
Founder Park: How big is your team now?
Kai: Six people, living together.
Founder Park: How did you originally form the team?
Kai: James and I have co-founded twice. We went to the same school and built an app together as freshmen. In my sophomore year, I worked with two others, so we all knew each other. When we saw the tech’s potential, we regrouped. We’re schoolmates; Nick, our other co-founder, was my college roommate.
Founder Park: You’re expanding now—what kind of people are you hiring?
Kai: We’re hiring for backend, frontend, large language models, and UI/UX roles—experienced candidates preferred. We’ve passed the trial-and-error phase and entered rapid product-building mode—experienced hires will accelerate growth.
Founder Park: So you need experienced engineers, PMs, and growth leads to scale from 1 to 10, or even 10 to 100.
Kai: Exactly. We plan to grow to 9–10 people, prioritizing engineers.
Hiring may occur in China, so we’ll support hybrid in-person and remote work.
Founder Park: What’s your ideal candidate profile?
Kai: We prefer candidates from major tech firms like ByteDance or Meituan. ByteDance has a fast-paced, competitive culture that values youth. Engineers trained there bring strong methodologies and skills—valuable assets we can integrate.
We want people who’ve fought tough battles in Chinese tech giants, with experience in rapid iteration. We’ve moved beyond the student-startup stage, so we don’t need juniors, nor “industry veterans” who may prioritize family over hustle. Mid-career, young, and driven individuals are ideal.
We offer generous equity to top talent. Though we raised $11M, why not hire in the U.S.? Because we believe Chinese product and engineering excellence is unmatched. This wave will absolutely produce globally successful products led by Chinese teams. Many AI applications today are built by Chinese engineers—China’s engineering prowess is exceptional. This is our advantage—leveraging strengths from both China and the U.S.
VideoTutor’s full job postings: https://videotutor.io/
07 Silicon Valley Students Are All Starting AI Companies
Founder Park: Especially in Silicon Valley, student entrepreneurship is booming. What’s the landscape look like?
Kai: Consider this: companies now valued at billions—Mercor (AI recruiting) recently raised over $300M, hitting $10B valuation; Cursor is firmly on track for $1B+ valuation. Others include GPTZero, Pika, etc. These are all student-led ventures—Cursor and Mercor’s founders were juniors who dropped out.
This wave of young founders shares a trait: extreme differentiation. They focus intensely on narrow domains, avoiding generic solutions. Mercor started exclusively on recruiting Indian programmers.
Second, environment. Silicon Valley’s capital ecosystem and foundational innovation—Stanford, YC, Peter Thiel’s fund—support student startups from day one, offering mentorship and networks regardless of idea maturity.
Third, founder qualities. Whether us or other SV students, we share bold risk-taking and exceptional learning ability. This spirit is less common among Chinese students. In SV, peer success stories inspire you, and capital trusts youth.
For me, I weighed costs and benefits. Graduating and getting a job might not repay my family’s tuition investment or yield high returns. But entrepreneurship lets me learn aggressively at my youngest, opening infinite possibilities. I’ve always dreamed of building a great company.
Founder Park: Why can today’s student founders build billion-dollar companies, whereas before selling for $10–20M was considered great? Is there an AI hype or bubble factor?
Kai: Not entirely a bubble. Cursor has $450M in real revenue—that’s solid. The real driver is the methodology and insight of these young teams. These founders are sharp, with exceptional learning capacity.
Cursor grew early through fellow student programmers: high AI adoption, strong feedback. The founders themselves were young genius engineers who deeply understood users and had elite iteration skills; four people launched the product. Great iteration bred word of mouth, revenue, and investor FOMO (fear of missing out), fueling further funding.
The fundamental shift: AI introduces new technologies. Young people learn faster, stay practical, and execute relentlessly—achieving unmatched user understanding and lightning-fast iteration, defeating traditional products. Before Cursor, GitHub Copilot was good—but lost due to inferior UX and slower execution.
Founder Park: Could we say AI being new means product thinking must evolve too?
Kai: Yes. This generation has deeper insight and closer user proximity. Mainstream AI users are Gen Z. They iterate faster, give quicker feedback, and are more open-minded than previous founders.
Thus, speed of cognitive iteration is key. In the mobile internet era, tech evolved yearly or quarterly. In AI, it evolves daily. Founders must learn fast—and young people can pull all-nighters with relentless drive.
Founder Park: Some media report SV founders now work 996—what’s your take?
Kai: Some white founder friends of mine, despite raising big rounds, also work 996. Like us, they rent a big house and live/work together. I think 996 is largely environmental—SV feels like a gold rush. No one wants to fall behind, so you race on product iteration—requiring late nights and rapid cycles. It’s the environment pushing you.
Founder Park: Any trends in domain selection among SV student founders?
Kai: Whether us in education or others, there’s a trend: founding within your comfort zone—where you deeply understand the domain and users. Cursor’s founder knows coding inside out; we know education well. Today’s youth avoid jumping into unfamiliar fields. This ensures faster, more accurate user feedback.
Cognitive stacking matters too. My three education ventures built layered insight. These students don’t randomly try new things—they ask, “How can I do it better?” They apply next-gen thinking, iteratively improving within their expertise, boldly creating opportunities.
Plus, fearless pioneering spirit—unbothered by criticism, embodying “I don’t care what you think about me.” Underlying this is a “rapid experimentation” culture—I know my product isn’t perfect, but I launch fast, iterate fast, get feedback fast.
Founder Park: When did this trend start?
Kai: It’s a consensus-driven success. When people see projects like GPTZero grow from dorm rooms, iterate rapidly, gain capital and user validation, these quick-test, quick-scale stories build collective belief.
One motto: “Better done than perfect.” Perfection is secondary to shipping. And competition isn’t feared—many SV founders openly share ideas, unafraid of copying, confident in their iteration speed. This generation also excels at storytelling—not empty hype, but grounded in truth, layered with future vision.
Founder Park: Market yourself first.
Kai: Exactly. The underlying mindset is boldness and extreme confidence. Driven by this, they fearlessly experiment, unafraid of mistakes. Speak boldly about your vision, execute boldly—wrong? Just fix it. This culture of fearless iteration fuels the current wave of student startup success.
U.S. VCs actively seek student projects—YC dedicates slots to them each batch.
08 Fundraising Is the Last Thing VideoTutor Needs to Worry About
Founder Park: If you could go back to when you first started VideoTutor, what advice would you give yourself? Any improvements?
Kai: Probably move even faster. And team composition. Our team evolved through multiple iterations. Had I known earlier, I’d have assembled the team sooner based on required skill sets. Ultimately, entrepreneurship comes down to organizational ability. I’d spend more time on hiring, identifying, and empowering the right people.
Our current team excels at 0-to-1 growth. But to scale VideoTutor bigger, we need experienced professionals bringing proven skills to elevate the entire team.
Founder Park: In the next six months, what product or technical challenges might VideoTutor face?
Kai: One is rendering—achieving true zero latency requires further engineering breakthroughs. Second is growth—product taste. This encompasses many things: flawless UI, seamless interactions, bug-free functionality, beautiful visual layout. These are ongoing challenges.
James: Initially, we envisioned VideoTutor as a visual tutor for all subjects, but later went hyper-vertical, focusing solely on math, our strongest area. Our math rendering engine is the most professional. Next, we’ll explore horizontal expansion: how do we extend our visual advantages to the humanities? For example, explaining a line of classical Chinese poetry such as “锄禾日当午,汗滴禾下土” (“Hoeing grain under the midday sun, sweat drips onto the soil beneath”). That’s our next technical frontier.
Founder Park: Could founder background limit future expansion?
Kai: Unlikely. Many top VCs, like a16z, have approached us. They typically don’t invest too early—they wait until a team shows signs of success, ensuring safer bets. We maintain strong relationships with major VCs.
Fundraising is the last thing VideoTutor worries about. Our real focus remains user ecosystem and product.