Conversation with Amazon CTO: Stories, Insights, and Secrets from Inside Amazon

2023.06.30

If you want to become a rapidly growing company, you cannot operate like a traditional business.

2023.06.30 - 13:06:47

Compiled by: TechFlow

Note: This article is part of the TechFlow special series "YC Startup Course Chinese Notes" (updated daily; this is the final installment). The series aims to collect and organize Chinese translations of YC courses. This 25th episode features the online course "Experiences and Insights on Technology and Startups" by Werner Vogels, Chief Technology Officer of Amazon.

Before Joining Amazon

Before joining Amazon, I was an academic researcher who spent ten years at Cornell University building large-scale distributed systems. I wasn’t a typical computer scientist, though: I didn’t decide to return to school until I was 28. Before that, I worked in hospitals, delivering radiation therapy to cancer patients at the Netherlands Cancer Institute. One day, I realized I couldn’t bear watching people die one after another, so I decided to do something completely different. Computer science seemed like a good choice.

It was the mid-1980s, and computer science wasn’t as widespread as it is today. But it turned out I had a natural talent for it—though I didn’t know it at the time. So I dove in deeply because it was genuinely what interested me. I earned my Ph.D., worked for several years at a research institute in Portugal, and was then invited to join Cornell University.

During my time at Cornell, in addition to research, I frequently consulted for big companies like HP and others whose names I can no longer recall, and participated in various conferences. At one point, Amazon invited me to present some of my ongoing research. Initially, I was surprised and puzzled: Really? Do I need to do this?

At the time, I thought: how challenging could a web server and a database really be? But when I started looking into it, I realized it was actually a massive technical challenge. Amazon wasn’t just a retailer. It was a technology company operating at a scale I’d never seen before, far beyond any other company I had consulted for. From the perspective of distributed systems research, the challenges they faced were astonishing.

When Amazon offered me a job, I accepted without hesitation.

Amazon’s Scale of Operations and Technological Leadership

I believe most distributed systems researchers now recognize the scale at which large companies must operate, and not only large ones. For any internet or digital company, success requires operating at enormous scale.

Looking back to 2004, when I joined Amazon, many might assume that operating at Amazon’s scale was relatively manageable. But that did not mean we could simply rely on what already existed, whether roles or basic infrastructure. A great deal of work was needed, in cloud technologies and other areas, to operate well at that scale.

In 2004, Amazon achieved its scale purely through practice, with no books or guides explaining how to build scalable organizations or companies. Therefore, I believe Amazon was ahead of the curve—leading by five to ten years—in technology application, development, and operational scale. This is especially critical for fast-growing companies.

If you want to become a rapidly growing company, you cannot operate like a traditional enterprise. Traditional enterprises often face the innovator’s dilemma—once something succeeds, they become very slow.

Building a company that grows continuously and quickly is a completely different story—you must carefully weigh business trade-offs. For example, whether to create technical debt or tolerate some redundancy, which would be unacceptable in traditional enterprises where efficiency is the primary goal.

At Amazon, speed, rapid innovation, and a long-term pipeline of experimentation are paramount. So we’re willing to tolerate some redundancy and allow some technical debt—as long as we know we’ll eventually pay it off.

The trade-offs Amazon makes are therefore rarely found in traditional MBA textbooks. Most of the time, Amazon had to develop its own technologies, processes, and business workflows. Of course, having a visionary leader like Jeff Bezos, who truly understands what the future looks like and how the modern world should function, was crucial.

Keys to Achieving Growth

Despite Amazon's massive success in scale, we still face challenges in achieving even greater growth. To reach the next stage of growth, we need to think and act more rigorously.

Take performance, for example. How do you measure performance? What infrastructure do you need for measurement? To become a truly data-driven company making data-driven decisions, you first need data—and a culture centered around evaluating and interpreting these metrics.

Even a delay of 1.2 seconds in page loading can negatively impact customer experience. If you only track a typical value such as the median, half your customers have a worse experience than that number, and you need to understand how much worse. From an engineering standpoint, measuring and controlling latency at the 99th or 99.9th percentile becomes increasingly important. You then need mechanisms that enforce this percentile-based engineering discipline and link it directly to business decisions.
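
As a rough illustration of why the tail matters, the sketch below compares the mean and median with the 99th and 99.9th percentiles. The latency numbers are made up for the example, and `percentile` is a simple nearest-rank helper written for this note, not a description of Amazon's tooling.

```python
import math
import statistics


def percentile(samples, pct):
    """Return the nearest-rank percentile of samples (0 < pct <= 100)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based nearest rank
    return ordered[max(rank, 1) - 1]


# Hypothetical page-load times in milliseconds: most requests are fast,
# but a small tail of requests is very slow.
latencies_ms = [100] * 980 + [1200] * 15 + [4000] * 5

summary = {
    "mean": statistics.mean(latencies_ms),
    "p50": percentile(latencies_ms, 50),
    "p99": percentile(latencies_ms, 99),
    "p99.9": percentile(latencies_ms, 99.9),
}
print(summary)
```

With this data the mean and median look healthy, while the p99 and p99.9 values expose the slow requests that a fraction of customers actually experience.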

I believe in 2004, our reliability was already quite high. We had rules—for example, we had to use data centers in specific regions (SEA). Whatever operations you performed on SEA data centers needed to be replicated across others, so customers wouldn’t be affected if one center failed.

Customers might experience latency, but functionality remained intact. We were very good at handling all these rules—until one day we decided to disconnect one data center to see what happened. With just a small network adjustment, we could isolate one data center from the rest.

However, in practice, all those things that looked great on paper didn’t go as smoothly as expected. The first few attempts involved many manual processes, such as manual database failover, and we repeated the exercise throughout the year. By the third or fourth run, we reached a point where nearly everything ran automatically without human intervention. This capability is critical for high availability and fault tolerance.
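
The replication rule and the disconnect exercise described above can be sketched as a toy simulation. Everything here (`DataCenter`, `ReplicatedStore`, `run_drill`, the site names) is hypothetical and greatly simplified. The point is only that a drill surfaces data that was never replicated.

```python
class DataCenter:
    """One site holding a full copy of the data (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.reachable = True
        self.store = {}


class ReplicatedStore:
    """Writes go to every reachable site; reads use any reachable site."""

    def __init__(self, sites):
        self.sites = sites

    def put(self, key, value):
        targets = [s for s in self.sites if s.reachable]
        if not targets:
            raise RuntimeError("no reachable data center")
        for site in targets:
            site.store[key] = value

    def get(self, key):
        for site in self.sites:
            if site.reachable and key in site.store:
                return site.store[key]
        raise KeyError(key)


def run_drill(store, site_name, keys):
    """Isolate one site, report keys that became unreadable, then reconnect."""
    site = next(s for s in store.sites if s.name == site_name)
    site.reachable = False
    try:
        missing = []
        for key in keys:
            try:
                store.get(key)
            except KeyError:
                missing.append(key)
        return missing
    finally:
        site.reachable = True


sites = [DataCenter("dc-a"), DataCenter("dc-b"), DataCenter("dc-c")]
store = ReplicatedStore(sites)
store.put("order-1", "shipped")
store.put("order-2", "pending")

# A write that broke the replication rule: it landed in only one site.
sites[0].store["order-3"] = "unreplicated"

print(run_drill(store, "dc-a", ["order-1", "order-2", "order-3"]))
```

Isolating `dc-a` reports `order-3` as unreadable, which is the kind of gap that looks fine on paper and only shows up when a site is actually cut off.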

In our efforts, we also emphasized developing data analytics and insights. Amazon has vast amounts of data, but extracting valuable insights remains a challenge. We’ve focused on building powerful data analysis tools and models to help us understand customer behavior, market trends, and business opportunities. This data-driven approach enables better decision-making and allows us to deliver personalized products and services.

Additionally, we’ve worked hard to improve user experience. We studied users’ needs and preferences during shopping, enhancing UX through design optimization, interface improvements, and personalized recommendations. We strive for simple, intuitive, and seamless shopping experiences to meet customer expectations and earn their loyalty.

The Evolving Role of the CTO

Once a company becomes a technology provider, the role of the CTO changes.

I discussed this in a blog post. In my view, there are four distinct types of CTO responsibilities:

First is the enterprise-level CTO, typically responsible for infrastructure management, reporting to the CIO, and overseeing large-scale infrastructure.
Second is the co-founder-type CTO common in young startups—visionary in technology. But I think this role carries risks, as many other responsibilities, such as engineering team management, often fall under it. A CTO may not necessarily excel at these tasks—we’ll discuss this further later.
Third is the CTO as a big-picture thinker who drives innovation forward. Companies like AT&T and Lucent had CTOs or CTO offices dedicated to researching and experimenting with next-generation technologies.
Finally, there’s the externally facing expert CTO—the kind seen when a company acts as a technology provider. They engage deeply with customers technically, understanding how customers use their products and identifying deeper patterns and pain points. This role focuses not just on internal technology, but on the broader picture.

Importantly, this last role is more customer-centric than purely technology-focused. It’s crucial to bring customer feedback back into the company and reflect on what new features or products need to be developed, or what processes need to change, to better serve customers.

Therefore, the CTO role at a technology provider becomes more customer-oriented, going beyond pure technology focus.

Amazon’s Unique Culture

Since Amazon was my first real job, I long assumed other workplaces were similar—but they aren’t.

Amazon has a unique culture that works exceptionally well for fast-growing companies. They encourage teams to be as independent as possible, minimizing organizational layers and structures. Hierarchy feels unnatural to them.

They aim for self-organizing teams, hiring people who genuinely want independence and ownership over products. Young companies especially need such people, not followers or people who only write code.

Amazon has a set of 14 leadership principles, including Customer Obsession, Ownership, and Dive Deep, that drive its culture.

At Amazon, interviews focus heavily on cultural fit, because an employee who doesn’t adapt can severely disrupt a small team. Amazon strongly favors small teams, typically 10 to 12 people, where everyone clearly knows their role.

As companies grow, the CTO’s role evolves—from initially handling all technical matters toward focusing more on team management and ensuring engineers deliver required technologies and products.

Compared to a VP of Engineering, the CTO focuses more on technical aspects—building the right tech and using the right tools.

How Did Amazon Grow?

Internally, Amazon went through a series of transformations. They created independent teams that resembled startups, each owning their goals and innovation agendas. However, earlier on, Amazon violated architectural principles in pursuit of rapid growth, resulting in fragile backend database infrastructure that could no longer scale.

To address declining efficiency, they shifted to a service-oriented architecture, breaking the system into independent functional blocks—microservices.

But as team count grew, each service had to manage its own database, increasing communication overhead while reducing innovation.

To improve, they built shared platform services using virtualization and APIs to manage servers. These were first developed internally, then launched externally as products like Amazon S3 and EC2—making storage and computing programmable and scalable, highly cost-effective for businesses.

Amazon’s goal was to achieve internet-scale storage and compute capabilities and provide them to all kinds of enterprises.

The Path of Innovation

Amazon’s innovation operates on two levels:

First, team-level innovation—each team sets its own innovation agenda for the coming year and executes independently. For example, improving recommendation engines to reduce returns. They own roadmaps, acquiring new data sources and interacting with customers differently.
Second, capital-intensive innovation—projects like Kindle and Amazon Prime requiring major funding. Amazon established a rule: only pursue large capital investments if an innovation has strong potential to succeed and significantly impact the company’s balance sheet.

Amazon found that although its early technical decisions were sound at the time, growing scale forced it to reevaluate its architectures and develop adaptable software, going through multiple architectures and versions to solve technical challenges such as storage engines.

Like others, Amazon also discovered that beyond technical scaling, non-technical factors such as sales, solution architecture, technical account managers, and customer support were equally vital to building a successful company.

How to Launch New Services?

We want all teams closely connected to customers, since about 95% of our features and services respond directly to customer needs. Our earliest services met nearly all customer expectations—from basic IT infrastructure to storage, compute, databases, networking, and security.

Over time, customers expressed additional needs—analytics, cloud tech, mobile development, blockchain—they wanted to use these technologies without managing them. Helping customers build the right features and tools became essential.

When launching new products and services, we follow a strong cultural norm: release with a minimal feature set (MVP). But this is only the starting point for building the necessary business technology. We can’t just release something flimsy—we must ensure stability and reliability. Then, we collaborate with customers to identify further feature needs.

Early on, we don’t always know what else customers need. For instance, when we launched DynamoDB, we didn’t know customers wanted secondary indexes. We didn’t include it initially, but it quickly became clear this was a key demand. By launching with minimal features, we observe how customers use the product and gradually iterate and add new features and services.
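
To make the secondary index request concrete, here is a conceptual sketch in plain Python. It is not the DynamoDB API. The `Table` class and its methods are invented for illustration, and they only show the difference between scanning every item and looking items up through an index on a non-key attribute.

```python
from collections import defaultdict


class Table:
    """Toy key-value table with one optional secondary index."""

    def __init__(self, index_attr=None):
        self.items = {}                # primary key -> item
        self.index_attr = index_attr   # attribute covered by the index
        self.index = defaultdict(set)  # attribute value -> primary keys

    def put(self, key, item):
        # Remove the stale index entry if the item is being overwritten.
        old = self.items.get(key)
        if old is not None and self.index_attr in old:
            self.index[old[self.index_attr]].discard(key)
        self.items[key] = item
        if self.index_attr in item:
            self.index[item[self.index_attr]].add(key)

    def scan_by(self, attr, value):
        """Without an index: examine every item in the table."""
        return sorted(k for k, v in self.items.items() if v.get(attr) == value)

    def query_index(self, value):
        """With the index: go straight to the matching primary keys."""
        return sorted(self.index.get(value, ()))


orders = Table(index_attr="customer")
orders.put("o1", {"customer": "alice", "total": 30})
orders.put("o2", {"customer": "bob", "total": 12})
orders.put("o3", {"customer": "alice", "total": 7})

print(orders.query_index("alice"))
```

Both `scan_by` and `query_index` return the same keys. The index simply avoids touching every item, which matters once a table is large.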

For example, when we launched Lambda—an environment for serverless computing—it simplified development: just write code and deploy it to S3, without worrying about servers. You only pay for what you use, with no idle costs.
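
For a sense of how little code this involves, here is a minimal sketch of a Python function in the `handler(event, context)` shape that Lambda uses. The `name` field in the event and the `statusCode`/`body` response layout are assumptions made for this example. Real events and responses depend on what invokes the function.

```python
import json


def handler(event, context):
    """Entry point in the shape Lambda uses for Python functions.

    `event` carries the request payload; `context` carries runtime
    metadata and is unused here. The "name" field is a made-up input.
    """
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local call for illustration; in Lambda the service invokes the handler.
response = handler({"name": "Amazon"}, None)
print(response["body"])
```

There is no server setup anywhere in the function, which is the development change the paragraph above describes.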

This changed development, and we observed how customers used the product. They quickly began debugging iteratively in environments like X-Ray and building complex applications with Step Functions. Observing usage revealed priorities: for example, we learned that secondary indexes mattered more to DynamoDB customers than secondary data centers.

Essentially, customers redefine our roadmap. We start delivering what matters most to them. This is critically important. Even if it looks like an MVP, we can’t treat it as temporary—people build businesses on top of it and depend on it. A different cultural structure forms around the product.

Last year, we released 1,400 new features and services—a number that will keep growing as team count increases. We use the same structure in AWS: each team partners with specific customer segments and builds roadmaps based on their needs. More services mean more roadmaps.

Yet this is a fast-moving environment where software development has transformed. If we dictated how customers should develop software, we’d still be coding the way we did five or ten years ago. Instead, we must work closely with customers—they drive our innovation engine, shaping how software is developed from 2020 or 2025 onward.

So we don’t make decisions for customers—we collaborate closely, letting them power our innovation. We must closely observe how they use our products and continuously iterate and improve based on feedback.

Overall, AWS pursues innovation at both team level and through capital investment. By working closely with customers—understanding their needs and observing usage patterns—AWS delivers features and services that meet real demands. Simultaneously, AWS invests heavily in R&D to launch new products, services, and capabilities, adapting to evolving market needs.

This innovation approach keeps AWS tightly aligned with customers, ensuring stable, reliable solutions that meet expectations. Through minimal-feature launches and continuous iteration, AWS rapidly responds to customer needs and steadily delivers advanced functionalities.

Amazon’s innovation journey is an evolving process, always customer-centered. By deeply understanding customer needs, observing usage behaviors, and consistently investing in R&D, AWS continues pushing the boundaries of technology and business, delivering exceptional cloud computing solutions.

Building Customer-Driven Products

This focus applies everywhere, whether an idea starts with customers or inside Amazon. As a technology company, we focus deeply on building things that truly matter to customers. Although we are a technology-heavy company, product designers and engineers also take on risk.

We care about products, not just technology. We ask: what can we do for our customers? We aim to build amazing technology, but that’s not our sole motivation. We care about solving customer problems.

To stay focused on customers, we use a process called “working backwards.” First, we write a clear and concise press release describing exactly what we’ll build. Then, we prepare a document of 20 frequently asked questions, answered in simple language. In complex cases, we revise both documents repeatedly until we’re crystal clear about what we’re building.

Next, we write a user experience (UX) document detailing how customers will interact with the product. We also create user manuals, glossaries, and other supporting materials.

Finally, we have four precise documents that fully describe what we’re doing.

At Amazon, we stick to one principle: we never deliver more than we committed to. We don’t casually add v2 features into v1. We focus solely on delivering what we promised. This method provides a powerful framework for thinking through customer needs, product experience, and technology.

In Amazon meetings, we don’t use slides or presentations. Instead, we use a six-page memo—everyone reads it silently for 30 minutes before the meeting starts. This memo is crucial—it ensures everyone has a shared, clear understanding of the topic.

Writing a compelling story is hard, so we encourage collaboration and feedback. We revise the memo multiple times until the feature, product, or business area is clearly described. After 30 minutes of reading, everyone in the room is on the same page—enabling high-quality discussion.

In short, we have a unique culture and processes that keep us focused on solving customer problems and delivering outstanding products and services.

Container Technology

More and more companies are skipping container technology, especially when moving toward finer-grained microservices environments. One reason containers became popular is their ease of scaling components up and down—aligning well with microservices philosophy. Many now decompose monoliths into containers, particularly around serverless development.

However, before Fargate launched, using container technology posed challenges. You had to manage multiple containers running across multiple availability zones, mapping them onto virtual machines. So while containers were a great development option, running and managing them required significant effort. To simplify, we introduced Fargate—a solution that eliminates all underlying VM management. Just drop your container in, and it runs.

In the future, I expect more tools, support platforms, and infrastructure to emerge, enabling the construction of more sophisticated serverless environments. Better integration with other services will be a key direction.

*TechFlow Note: Fargate is a computing service provided by Amazon Web Services (AWS), serving as a serverless compute engine. Fargate allows developers to easily manage and deploy containerized applications without worrying about underlying infrastructure or servers.

Container technology is a form of virtualization that packages applications and their dependencies into isolated, portable containers, enabling rapid deployment and portability. Container technology uses container engines (like Docker) to create, manage, and run containers, allowing applications to run consistently across different computing environments regardless of underlying infrastructure differences. Containerization is widely adopted in modern application development and deployment, offering greater flexibility, scalability, and portability.

Protecting Customers

I believe security will become a central focus. Over the next five years, everyone, from CEOs to CTOs to engineers, must treat security as a top priority. We all need security awareness and must act as security engineers. Major data breaches occur almost weekly, and this should embarrass and anger us as technologists and digital business leaders. Protecting customer data is our responsibility, because without protecting customers there is no business.

We must begin thinking about how to protect the data we collect from customers, whether the business is car rental or any other consumer service. Security must be built in by default: for example, triggering security checks in CI/CD pipelines, and reviewing and assessing every new open-source library that is added.

Development pipelines themselves must be secure, equipped with automated tools for vulnerability testing. Especially in healthcare and finance, compliance with regulations and oversight requirements is essential.
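
One way to picture a pipeline check for newly added libraries is the sketch below. It assumes a requirements-style dependency listing and a hand-maintained set of reviewed packages. The function `find_unreviewed`, the package names, and the reviewed set are all hypothetical, and a real pipeline would pair this with a proper vulnerability scanner.

```python
def find_unreviewed(requirements_text, reviewed):
    """Return dependency names that are not in the reviewed set.

    Parsing is deliberately simple: it handles `name==version` lines,
    comments, and blank lines only.
    """
    unreviewed = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name = line.split("==", 1)[0].strip().lower()
        if name not in reviewed:
            unreviewed.append(name)
    return unreviewed


# Hypothetical inputs: the package names and reviewed set are made up.
REVIEWED = {"requests", "flask"}
requirements = """
# web stack
requests==2.31.0
flask==3.0.0
leftpadlib==0.1.0  # added in this change
"""

problems = find_unreviewed(requirements, REVIEWED)
exit_code = 1 if problems else 0  # a CI job would fail on a non-zero code
print(problems, exit_code)
```

Failing the build on an unreviewed dependency forces the review to happen before the library ships, instead of relying on someone remembering to do it.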

Within five years, I hope we all maintain heightened security awareness and place customer protection first. At Amazon, protecting customers will always be our top area of investment, in both intellectual and financial capital.

Common Mistakes Startups Make When Using AWS

First, those with traditional data center experience may lack confidence when first using AWS. AWS offers elasticity and ease of use, but its full potential is not realized unless higher-level services, such as security, data analytics, and mobile services, are also used, especially for large-scale, high-reliability development.

Second, it’s crucial to define your company type and goals. There are two distinct styles: one seeks rapid growth and massive customer bases, prioritizing expansion over revenue, investing heavily to scale quickly and possibly get acquired; the other pursues sustainable growth, aiming to build lasting businesses rather than targeting acquisition.

These two types use AWS very differently. Fast-growth companies can confidently leverage AWS capacity and services, less concerned about cost. Sustainable companies must build different architectures, focus on cost control, and ensure clear alignment between cost and customer acquisition.

Jeff Bezos often categorizes startup founders as mercenaries or missionaries. Mercenaries pursue startups for money; missionaries are driven by passion for their product. Both approaches are valid, but the technical support and architecture built will differ accordingly.

Therefore, clearly identify your company type and choose appropriate technical support and architecture accordingly.


Author

深潮 TechFlow
