Capabilities Focus: Dr Yiliang Zhao on Data in Openspace

Freddie Luchterhand-Dare
May 18, 2023

At Openspace, we are driven by Active Intelligence, where real-time data drives the investing decisions we make. We prioritise the use of data to help both the Openspace office and our portfolio companies. As such, we'd like to shine a spotlight on the use of data here at Openspace, with help from our Director for Data Science, Dr Yiliang Zhao.

As Director for Data Science at Openspace, what do you do here at the office?

My primary responsibilities revolve around leading our data science team and providing guidance to our portfolio companies on enhancing their AI and machine learning strategies, which is integral to driving their growth. I also have a significant role in evaluating potential new opportunities through the lens of data science and AI. This entails assessing the data science aspects of prospective ventures and determining their potential for success or growth.

How do you think the use of data gives Openspace a competitive advantage over other firms?

The use of data provides Openspace with a competitive advantage in several ways:

  1. Informed Decision-making: By using data from a variety of sources, Openspace can make more informed and strategic decisions about potential investments. This includes understanding market trends, founders' backgrounds and connections, as well as app usage.
  2. Predictive Modelling: Openspace leverages machine learning technologies to create predictive models to assess a company's potential for success. This ability to estimate the investability of a startup and potential exit values can significantly improve investment decisions.
  3. Deal Sourcing: Automated systems can identify trending companies, products, or apps based on data feeds, flagging them to the investment team for further analysis. This supports the human-led efforts, making the deal sourcing process more efficient and effective.
  4. Increased Efficiency: Data-driven approaches can help Openspace streamline its processes and identify promising investment opportunities faster, giving it a head start over firms that rely solely on traditional methods.
  5. Understanding External Factors: Data on external factors such as sector readiness can help Openspace identify causal relationships that impact investment outcomes, enhancing its strategic planning and investment decisions.
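To make the deal-sourcing point (item 3) a little more concrete, here is a minimal sketch of how an automated system might flag trending apps from a data feed. It is purely illustrative and does not describe Openspace's actual systems: the function name `flag_trending`, the `min_growth` and `min_base` parameters, and the sample weekly figures are all hypothetical.

```python
def flag_trending(weekly_counts, min_growth=0.5, min_base=100):
    """Return names whose latest week grew by at least `min_growth`
    (0.5 means 50%) over the average of the preceding weeks.

    `min_base` skips very small apps, where large percentage swings
    are mostly noise. Both thresholds are illustrative assumptions.
    """
    flagged = []
    for name, counts in weekly_counts.items():
        *history, latest = counts
        if not history:
            continue  # need at least one earlier week to compare against
        baseline = sum(history) / len(history)
        if baseline >= min_base and (latest - baseline) / baseline >= min_growth:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical feed: weekly download counts per app, oldest week first.
feed = {
    "app_a": [1000, 1100, 1050, 2100],  # latest week doubles the baseline
    "app_b": [5000, 5100, 4900, 5200],  # large but flat
    "app_c": [10, 12, 40, 90],          # fast growth, but below min_base
}
trending = flag_trending(feed)
print(trending)
```

In a setup like this, the flagged names would be passed to the investment team for further analysis rather than acted on automatically, in line with the human-led process described above.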

What are some notable developments in Data Science that you’ve seen over the last few years? And has that impacted your work in any way?

In recent years, we've observed significant advancements in Data Science, particularly in the areas of generative AI and large language models (LLMs). Generative AI, which is instrumental in creating various types of content and augmenting data, has paved the way for innovative solutions across numerous sectors. LLMs, on the other hand, have revolutionized areas such as automated customer support and efficient information extraction, among others.

At Openspace, these developments have significantly influenced our operations. We've incorporated large language models into our workflow to enhance various processes. For instance, LLMs assist us in conducting comprehensive searches for companies that align with our investment focus, thereby increasing the precision and efficiency of our prospecting efforts. Moreover, we've employed LLMs to improve our industry and sector tagging capabilities, ensuring that activities at the industry/sector levels are accurately tracked.

We will also embrace the potential of chatbots powered by LLMs to streamline information extraction and search processes.

In essence, the advancements in generative AI and large language models have indeed had a transformative impact on our work, enabling us to enhance our decision-making processes, improve our operational efficiency, and provide better support to our portfolio companies.

What are some projects that you’ve worked on in the past that you think are particularly interesting?

When I worked at Google, I helped a large gaming company build an abuser-detection pipeline to identify and curtail abusive in-game behavior and so maintain a fair and enjoyable gaming environment. The core of the project was an AutoEncoder, a type of artificial neural network that learns efficient codings of its input data. This allowed the model to learn normal player behavior patterns and, consequently, to identify anomalies representing potential abusers. The AutoEncoder was trained on a vast dataset of in-game activities, capturing the typical actions of regular players. Once trained, the model could analyze new player data and flag activities that deviated significantly from the learned norm. The project thus showcased an effective use of machine learning in enhancing the gaming experience by maintaining a fair play environment.
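The idea of flagging players by reconstruction error can be illustrated with a toy example. The sketch below is not the pipeline from the project; it is a deliberately tiny stand-in, assuming a linear autoencoder with tied weights and a one-unit bottleneck, three made-up standardised behaviour features, synthetic data, and a "mean plus three standard deviations" threshold. All names and numbers in it are hypothetical.

```python
import random
import statistics


def make_normal_players(n, seed=42):
    """Synthetic stand-in for regular players: three standardised,
    strongly correlated behaviour features plus a little noise."""
    rng = random.Random(seed)
    direction = (2 / 3, 2 / 3, 1 / 3)  # unit vector, arbitrary choice
    players = []
    for _ in range(n):
        t = rng.gauss(0.0, 1.0)
        players.append([t * d + rng.gauss(0.0, 0.05) for d in direction])
    return players


def reconstruction_error(w, x):
    """Squared error between x and its reconstruction (w . x) * w."""
    z = sum(wi * xi for wi, xi in zip(w, x))  # encode to one number
    return sum((xi - z * wi) ** 2 for xi, wi in zip(x, w))


def train_autoencoder(samples, epochs=300, lr=0.05, seed=0):
    """Fit the tied-weight linear autoencoder by full-batch gradient
    descent on the mean squared reconstruction error."""
    rng = random.Random(seed)
    dim = len(samples[0])
    n = len(samples)
    w = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    for _ in range(epochs):
        grad = [0.0] * dim
        for x in samples:
            z = sum(wi * xi for wi, xi in zip(w, x))
            residual = [xi - z * wi for xi, wi in zip(x, w)]
            rw = sum(ri * wi for ri, wi in zip(residual, w))
            # d/dw ||x - (w.x) w||^2 = -2 * (z * residual + (residual.w) * x)
            for i in range(dim):
                grad[i] += -2.0 * (z * residual[i] + rw * x[i]) / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


# Train on "normal" behaviour only, then derive a threshold from the
# reconstruction errors seen during training.
normal_players = make_normal_players(200)
weights = train_autoencoder(normal_players)
train_errors = [reconstruction_error(weights, x) for x in normal_players]
threshold = statistics.mean(train_errors) + 3 * statistics.pstdev(train_errors)


def is_suspicious(player):
    """Flag a player whose behaviour the model reconstructs poorly."""
    return reconstruction_error(weights, player) > threshold


typical = [1 / 3, 1 / 3, 1 / 6]  # follows the usual correlations
outlier = [1.0, -1.0, 2.0]       # breaks them
print(is_suspicious(typical), is_suspicious(outlier))
```

A production model would typically be a deeper, non-linear network trained on far richer features, but the principle is the same: the model only ever sees normal behaviour, so anything it reconstructs poorly is worth a closer look.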

Do you foresee any developments in this field in the near future?

The surge in interest and R&D around large language models (LLMs) and generative AI signifies a transformative shift in technology and data science. As we continue to build more complex and efficient applications on top of these foundational models, the scope for automation across various sectors will grow exponentially. This will likely reduce manual effort significantly, streamline workflows, and drive productivity. One of the most promising prospects is the potential to construct software applications through simple natural language prompts. This could fundamentally reshape the software development landscape, making it more accessible and efficient.