it always pains me to say ‘big data’ – especially as someone who started companies in the space before the label and worked with companies who have long dealt with massive amounts of data – but i also know that i am not a marketer 🙂
big data keeps getting bigger. last year, VC firms invested $3.6B in the space – 75 percent of what they invested in the previous five years combined. the pace has continued this year, with several firms announcing new funding rounds in the tens and even hundreds of millions of dollars.
for aspiring big data entrepreneurs, it’s exciting – and intimidating. i meet a lot of smart, talented engineers who want to work in big data but don’t know where to start.
i tell them to focus on an area where they can have a big impact: feature engineering, mining email for B2B, applications for CRM, data governance, vertical integration, health care solutions (big data can drive health care savings of $300B, according to a recent study), and tying into existing consumer properties such as facebook or linkedin to drive sales leads.
other areas, like data visualization or databases, are important but saturated, though there may be an opportunity to build next-generation databases using time series data. still others, like personalization technology, are better for established companies like google and facebook that have the data to train their image and voice recognition models.
once you focus and develop your big data idea, how do you turn that idea into a company?
turning your big data idea into a company
my advice in brief: be a painkiller rather than a vitamin, build and sell for enterprise customers, and remember that even with big data, less can be more.
be a painkiller, not a vitamin
like so many entrepreneurs, i love the technical challenge of programming. i started coding in fourth grade and have never stopped. so i understand how founders can be enchanted by the technical wizardry behind their products, especially in fields like data and machine learning.
but the corporate customers who are deciding whether to buy the product will be asking a very different set of questions: what’s the ROI here? will your proposed solution integrate well with our business culture? will it help move our production workloads?
one way to stay focused is to remind yourself to be a painkiller, not a vitamin. vitamins are great, but painkillers are vital. use technology to build a product that customers need – now.
i always ask founders in our first meeting why they made certain technical decisions. if you don’t know why you selected a particular technology and how your decision helps the customer, i would be hard-pressed to back your company.
build and sell for the enterprise
startups need to sell. in big data and machine learning, most customers will be enterprise customers. and most startups greatly underestimate what it means to be enterprise-ready.
my two bits of advice: first, if you’re an engineer, be sure to work closely with a product person, business person, or CIO so that you understand what it really means to sell to the enterprise. as a venture investor, i often introduce people to one another for precisely this purpose.
second, manage the gap between perception and reality. there are so many possibilities for big data, but there is also a lot of hype. manage the expectations of CMOs and CIOs so that you do not under-deliver at the start of what may otherwise be a lucrative long-term relationship.
understand the “why” of data storage
we all know how easy and efficient it is to store data today. in three decades, the cost of storing a gigabyte has gone from thousands of dollars to a few pennies. but now people have a tendency to store data without knowing how they want to use it. at some point, you enter a “data obesity” state where data storage, maintenance, and upkeep cost too much and slow you down.
even in a data-driven world, you shouldn’t default to storing every bit of data. instead, stop and ask yourself: do i have an idea of how i or somebody else in my company wants to use this data in the future? data storage still consumes energy and resources. before you store data, consider whether it will ever help you make a decision or deliver a product or service.
whether it is the onrush of data from sensors, advances in machine learning and deep-belief nets, or new modes of virtual reality, we are swimming in new information and need to imagine what will drive the next wave of tools for extracting knowledge and insights from all of it.
as i learned at twitter/cloudera/composite software/microsoft, building tools that allow more people to access and ask questions of the data enables everyone to make better decisions more quickly. as an investor, i often wonder: what are the new opportunities that will be created that we haven’t even thought of?