By Huang Shiliang
"Data is the new oil" — this phrase is almost worn out in AI circles. But in mainstream narratives, it feels like it has nothing to do with ordinary people like us. It’s portrayed as a capital game for tech giants, all about GPUs and trillion-parameter models.
But after thinking about it, I realized this metaphor actually serves as an excellent compass for navigating the AI world.
1. A Seriously Misunderstood Metaphor
"Data is the new oil" has practically become gospel in the AI era.
But honestly, most people's first reaction is probably: "This is big corporate stuff — what the hell does it have to do with me?"
In mainstream discourse, "data" means petabyte-scale internet dumps or Wikipedia; "refining technology" means tens of thousands of H100 GPUs plus scientists earning seven-figure salaries; and the "final product" is something like GPT-5, an omniscient god-model.
That logic works fine commercially, but here's the problem — it basically says: don't even bother showing up. You're not invited to the table.
Ordinary individuals are simply kicked out of the game.
And there's an even darker version that pisses me off more every time I think about it:
Data is the new oil, and consumer data from people like us is Venezuela’s oil field; while companies like Meituan, Alibaba, and TikTok are Trump-era America.
They accidentally (but really, intentionally) show up at our doorstep, stick a pipe into our lives, take our data for free, refine it into "premium gasoline" (precise algorithms, big-data price discrimination), then sell it right back to us at gunpoint.
The result? We become suckers — not only giving away raw materials for free, but also helping these platforms count the money after they've sold us out.
In this version of the story, only giants play. We have neither massive datasets nor capital, and certainly can't train large models ourselves. So "data is the new oil" becomes a slogan that sounds impressive but is utterly useless — even disgusting — for individuals.
2. Change Your Perspective, and There’s Hope
I believe this consensus is flawed. We need to flip the script.
If we insist on applying the "data is the new oil" concept to ordinary people, the question isn’t whether the analogy holds — it’s: how the hell does this guide my actions?
The oil industry is powerful because it follows a clear, unavoidable logic chain:
Find oil fields (exploration) → Build refineries (processing) → Standardize products (gasoline) → Establish distribution channels (gas stations) → Sell to users.
For us regular folks, the AI age’s “data oil” must follow this exact sequence. Miss one step, and your AI anxiety will never turn into productivity — it’ll just become mental fatigue: refreshing feeds, bookmarking links, watching others get rich.
So let me break down how ordinary people can actually act on this.
3. Step One: Where Are the Oil Fields? — Hunt for Your Nearby “Micro-Rich Deposits”
In traditional industries, you go to Saudi Arabia or Russia to find oil. For us, though, the oil fields are right under our noses. I see at least two major categories.
1. Personal Private Data: Your Own Backyard
This is the most overlooked yet most reliable type of data. It doesn’t need to be huge — but its purity is extremely high.
Examples include your work processes, decision-making logic, past failures (post-mortems), and unwritten rules you’ve learned over years in your field.
Also included are your digital footprints: notes written over the past decade, code repositories, drafts, emails — all of it counts.
The value here is clear: this data belongs entirely to you. Any “personal digital twin” or “domain expert agent” trained on it cannot be replaced by any general-purpose large model.
If you haven’t used a computer much in the last five years and relied solely on your phone, you’re unlikely to evolve into an AI producer — you’re destined to remain an AI consumer.
If you truly want to make money using AI, I suggest buying a computer. Why?
Without a computer, you likely lack systematic data accumulation — you’re essentially a full-blown “oil-poor nation.” Don’t expect the photos in your phone gallery or dozens of GBs of voice messages and chatty WeChat logs to accomplish anything meaningful — too many impurities, too little structure. At best, you might refine some 29-octane fuel, but forget about producing proper 92-octane gasoline.
2. Public Data Rich Deposits: Assemble Your “Exploration Team”
The second category consists of data visible to everyone, yet 99% of people merely “consume” rather than “explore”: X.com, public accounts, arXiv, YouTube — these are the “high seas” of the data age.
Today’s internet, especially social media, is deteriorating rapidly. I’d dare say more than 50%, possibly over 90%, of content is AGRC (AI Generated Rubbish Content).
The people behind it mass-produce nonsense using AI, directly polluting the data strata. If you go prospecting without awareness, all you’ll dig up is garbage.
Worse still: feeding garbage to your brain or to AI results in nothing but garbage output — and may even clog your refinery.
To ensure what you extract isn’t AGRC, I recommend building a strictly curated **“inspiration source portfolio.”** But note: passive consumption isn’t enough; that’s just hoarding crude oil. You also need **crude oil pre-processing**, using AI to process each source so it becomes machine-readable fuel:
Deep sedimentary rock (books): The ballast. Set a yearly reading list covering both professional classics and literature.
AI integration: Don’t just read mindlessly. Use Gemini or ChatGPT to assist — discuss chapters, generate reflection questions. After finishing, create digital reading notes and feed them into your AI — that’s your knowledge base.
Frontier exploration zones (papers & reports): Regularly browse arXiv or Google Scholar. Hold weekly “paper lunch meetings” to force yourself through one paper.
AI integration: Can’t digest raw papers? Upload PDFs directly to NotebookLM or ChatGPT to summarize key arguments and data — turning “tough bones” into “nutritious broth” for storage.
Surface runoff (news & updates): Use RSS or customized feeds. I scan headlines only and save just the truly exceptional pieces for a deeper read.
AI integration: Don’t just save links. Copy content and use AI to tag, extract keywords, and categorize into your note-taking app — otherwise, saved items just gather dust.
Associated gas fields (podcasts & lectures): Listen during commutes — e.g., TED Radio Hour. Force yourself to attend one or two offline salons monthly.
AI integration: When you hear great ideas, don’t just nod. Use Whisper to transcribe audio, then have AI convert it into structured notes. Sound can’t be searched — text can.
High-yield wells (social media): Follow real experts on Twitter/X. Regularly clean your following list — unfollow those spewing emotional junk.
AI integration: For brilliant threads, copy them into AI to analyze logical flaws or integrate insights into your own knowledge system.
Field expeditions (life observation, field research): Practice “question-driven living.” This qualitative data is inaccessible to AI crawlers.
AI integration: When inspiration strikes, speak instead of typing — record via voice, then let AI organize it into journal entries. Turn rambling thoughts into logical insights.
We must cultivate the habit of pulling out our phones and verbally dumping ideas into apps like Doubao.
These six sources form your “mixed oil field.” Only when your inputs are wild, diverse, and pre-processed by AI will your outputs avoid clichés.
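To make “pre-processing” concrete, here is a minimal sketch of turning one saved item into a structured, searchable note. Everything in it is my own illustration: the `Note` fields, the `preprocess` helper, and the tiny stopword list are hypothetical, and the word-frequency keyword picker is only a stand-in for the step where you would actually ask an AI to tag and summarize.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "and", "for", "that", "this", "with", "are", "you", "but", "not"}


@dataclass
class Note:
    source_type: str      # e.g. "book", "paper", "news", "podcast", "social", "field"
    title: str
    keywords: list[str]
    body: str


def extract_keywords(text: str, top_n: int = 5) -> list[str]:
    """Stand-in for an AI tagging step: pick the most frequent non-stopwords."""
    words = re.findall(r"[a-z]{3,}", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def preprocess(source_type: str, title: str, raw_text: str) -> Note:
    """Normalize whitespace and attach keywords so the note is searchable later."""
    body = " ".join(raw_text.split())
    return Note(source_type, title, extract_keywords(body), body)


note = preprocess("news", "Refinery basics", "Refinery  refinery\n pipeline the the the")
print(note.keywords)  # ['refinery', 'pipeline']
```

The point is the shape of the output, not the keyword logic: every source, whatever its type, ends up as the same kind of record with a type, a title, tags, and clean text.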
4. Step Two: Where’s the Refinery? — Don’t Just Focus on Large Models
Once you’ve found oil, you need to refine it. Mainstream media keeps pushing GPU purchases, but for individuals, the real refinery is your personal software stack + thinking methodology.
1. Large Models Are Just “Boilers”
Paying for a ChatGPT Plus subscription won’t magically make you smarter. It’s like buying a boiler and standing next to it, admiring how bright it glows, without ever operating it.
Large models like ChatGPT and DeepSeek are merely basic power units — foundations. They can ignite, but that doesn’t mean you’re producing oil.
2. The Real Refinery Is Your “Personal Tool System”
An efficient personal refinery requires these components:
Pipelines (toolchain): VS Code, Python, Skills, etc.
Processes (methodology): This is the core moat: how you write prompts, build RAG (retrieval-augmented generation) knowledge bases, and orchestrate multiple agents (skills) to collaborate.
The focus should never be “how powerful the model is,” but rather: how you interact with AI — how you translate your implicit experience into instructions AI can understand.
This “personal engineering system” is your real refinery — not the model itself.
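For a rough idea of what a “RAG knowledge base” does, here is a toy sketch: find the personal notes most relevant to a question, then paste them into the prompt ahead of the question. It is an assumption-heavy simplification. Real setups usually rank notes with embeddings, while this one just counts shared words, and the `retrieve` and `build_prompt` names and the sample notes are made up for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, notes: list[str], k: int = 2) -> list[str]:
    """Return up to k notes sharing the most words with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(n)), i, n) for i, n in enumerate(notes)]
    scored = [s for s in scored if s[0] > 0]      # drop notes with no overlap
    scored.sort(key=lambda s: (-s[0], s[1]))      # best overlap first, stable on ties
    return [n for _, _, n in scored[:k]]


def build_prompt(query: str, notes: list[str], k: int = 2) -> str:
    """Assemble the text you would send to whichever model you use."""
    context = "\n".join(f"- {n}" for n in retrieve(query, notes, k))
    return f"Use my notes to answer.\nNotes:\n{context}\nQuestion: {query}"


my_notes = [
    "Post-mortem: the launch failed because pricing was unclear.",
    "Reading note on refinery design.",
    "Grocery list: eggs, milk.",
]
print(build_prompt("why did the launch fail pricing", my_notes, k=1))
```

Notice that the model never appears in this code. The retrieval and the prompt template are the parts you own, which is the sense in which the process, not the boiler, is the refinery.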
5. Step Three: Product Isn’t the End — Distribution Is the Real Battle
This is the cruelest part of the entire chain. PetroChina only needs to deliver oil to gas stations — drivers line up automatically. But in the AI era, productization and sales are damn hard.
1. The “Gasoline” Refined by AI Is Highly Non-Standard
What you produce using “personal data” + “large models” likely isn’t universal gasoline, but:
- A Python script only useful to you
- An article with a unique style
- An AI-assisted interpretation of your medical checkup report
- A personalized legal consultation package
These outputs are neither universal nor standardized; they are highly context-specific.
2. The Real Problem: Who Do You Sell To?
Before starting, ask backward: who exactly am I making this for? This reverse-engineers what kind of oil you should refine.
Selling to yourself (self-use): Saving time equals earning money — the easiest closed loop.
Selling to businesses (B2B): Package your prompt or workflow into a solution. Requires strong pre-sales skills (i.e., persuasion).
Selling to the public (B2C): Turn it into an app or a content column. Success depends entirely on your ability to distribute traffic.
Truth is: refining oil (generating content) in the AI era is getting easier, but building gas stations (distribution and sales) is harder than ever.
6. Don’t Forget Environmental Protection: Don’t Let Waste Bury You
Traditional refining produces waste sludge, wastewater, and exhaust. If unmanaged, your refinery won’t make money — you’ll die from fumes first.
Data refining is no different — **“cyber pollution”** is severe and demands a dedicated “environmental department” for regular cleanup.
1. Clean Outdated “Tool Waste”
AI evolves absurdly fast — terrifyingly so.
That “Top 10 AI Tools to Use in 2025” list you bookmarked last month? Half may already be dead. That AI image-generation parameter you’re obsessing over today? Might be obsolete tomorrow thanks to “one-click generation.”
Don’t be a “cyber scavenger” hoarding outdated tools. Uninstall, unfollow. Tools are meant to be used — not worshipped.
Hoarding obsolete tools is like filling your home with rusty scrap metal — it only slows you down.
2. Discard Drained “Data Shells”
Many suffer from “squirrel syndrome”: downloading every PDF, saving every video, filling multi-terabyte drives, and feeling like they own the world.
That’s not knowledge — it’s a landfill.
Proper environmental practice: use AI to extract the “oil” from PDFs, videos, long articles — generate summaries, pull quotes, convert into personal notes.
Once extracted, delete the original files (or archive to cold storage). Your attention is an incredibly expensive, limited resource — don’t let raw files consume your bandwidth.
Keep only “refined fuel,” discard “empty crude shells” — that’s what makes an efficient refinery.
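Here is one way the “extract, then archive” habit could be automated, as a sketch under my own assumptions: raw files live in one folder, each extracted note is a Markdown file with the same base name in another folder, and anything that already has a note gets moved to a cold-storage folder. The folder layout and the `archive_extracted` name are hypothetical.

```python
import shutil
from pathlib import Path


def archive_extracted(raw_dir, notes_dir, cold_dir) -> list[str]:
    """Move every raw file that already has a matching note into cold storage.

    A raw file "report.pdf" counts as extracted if "report.md" exists in notes_dir.
    Returns the names of the files that were moved.
    """
    raw_dir, notes_dir, cold_dir = Path(raw_dir), Path(notes_dir), Path(cold_dir)
    cold_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() snapshots the listing first, so moving files mid-loop is safe.
    for raw in sorted(raw_dir.iterdir()):
        if not raw.is_file():
            continue
        if (notes_dir / f"{raw.stem}.md").exists():
            shutil.move(str(raw), str(cold_dir / raw.name))
            moved.append(raw.name)
    return moved
```

It moves rather than deletes on purpose: cold storage keeps the original recoverable while still getting it out of your working folders.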
3. Cut Off “Zombie Bills” That Suck Your Blood
AI anxiety drives us to do stupid things — none stupider than spending money to buy false security.
Enrolling in courses, buying subscriptions, attending conferences, purchasing Plus memberships… costs add up. Worse, once you subscribe (monthly billing), you often forget to cancel.
I once rented a server for testing — over three years ago — and it silently charged me monthly, buried in bills I never noticed. I only used it once.
I impulsively bought subscriptions to ChatGPT, Gemini, Claude, Perplexity... endless auto-renewals, plus API charges. The result? Most sat idle, gathering digital dust.
Damn, what a waste.
These are exactly the things “environmental protection” must eliminate. Otherwise, before you refine anything sellable, your entire budget gets drained by invisible pollution.
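A zombie-bill audit can be as simple as a list and a date comparison. The sketch below is purely illustrative: the `Subscription` fields, the 60-day idle threshold, and the prices are assumptions I invented for the example, not real billing data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Subscription:
    name: str
    monthly_cost: float
    last_used: date


def find_zombies(subs, today: date, idle_days: int = 60):
    """Return subscriptions not used within the last idle_days days."""
    return [s for s in subs if (today - s.last_used).days > idle_days]


def yearly_waste(zombies) -> float:
    """Project what the idle subscriptions cost over twelve months."""
    return round(sum(s.monthly_cost for s in zombies) * 12, 2)


# Hypothetical entries with made-up prices.
subs = [
    Subscription("test server", 5.0, date(2022, 1, 10)),
    Subscription("chat assistant", 20.0, date(2025, 5, 30)),
]
zombies = find_zombies(subs, today=date(2025, 6, 1))
print([s.name for s in zombies], yearly_waste(zombies))  # ['test server'] 60.0
```

Running something like this once a quarter against your actual card statement is the “environmental department” doing its job.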
7. Final Words: An Action Map
When we strip away the grand narrative of “data is the new oil,” it stops being a distant capital tale and becomes a stark roadmap for ordinary people.
If you want to win in this era, check your “balance sheet” now:
- Reserves: Are you still scrolling TikTok? Or are you actively accumulating high-quality data through “inspiration sources” + AI assistance? (Remember to avoid AGRC garbage)
- Production capacity: Do you have your own toolkit and methodology (your refinery)? And do you know what kind of oil you’re refining?
- Distribution: Have you figured out who you’ll sell your non-standard products to? This should inform your production — deciding whether to refine 92-octane or 98-octane fuel.
- Environmental compliance: Are you hoarding digital trash? Have you reviewed credit card bills and cut those zombie subscriptions?
Final advice: Forget those news headlines about billion-parameter models. Start today: buy a computer, set up your “inspiration data sources,” and drill your first micro oil well. Sell to yourself first, refining tools that automate your work, with AI as co-pilot and you as commander.
Honestly, I’m still confused too. I’ve been tinkering with AI for over three years and haven’t produced much. All I’ve managed is an AI managing my to-do list and another handling my reading notes. I’m still wondering — what else can I refine?