AI has advanced and grown tremendously in the past decade. I remember when AI was a niche research topic and top academic conferences were small enough that we had the luxury of stopping by every single poster and talking to all the authors. It’s a very different experience compared to today’s conferences, which may accept over a thousand papers.

But we’ve come a long way and AI has made it to the mainstream. I believe I first heard this framing from Fei-Fei Li (circa 2016, I think), who attributed the tremendous advances in AI to the convergence of three forces:

  • Data: Availability of large datasets to enable large-scale learning
  • Algorithms: Improvements in neural network training and architectures
  • Compute: Cheaper parallel computation in the form of GPUs

An often unspoken quality of this data and these algorithms is that many of them are open source. In a way, the speed of these advances may only have been possible because many of the datasets and benchmarks, as well as the algorithmic implementations, were open sourced. This made it a lot easier to build on previous work, make improvements, and push the technology forward.

However, there has been a shift in recent years: closed-source AI has entered the scene and is becoming the new status quo.

Many of the best-performing AI models today are privately owned by entities that choose not to disclose their data, training procedures and recipes, or code. These entities are prioritizing closed models, trained on closed data with undisclosed algorithms, to power AI as a service.

So one has to wonder: What does this shift mean for the future of AI, say over the next decade? Will open-source and closed-source approaches co-exist, and how will they influence the trajectory of AI research and development?

Not easy questions. But we can break the discussion down by taking three points of view:

  • Research: How will the pace of advancement change? Some academic AI conferences and venues have vigorously encouraged open implementations for reproducibility, which has been great. But once key breakthroughs start happening in the closed-source AI world, will everyone else be left in the dark? This is particularly problematic for the advancement of AI research and science, which rely on open scientific discourse.
  • Business: While it’s hard to quantify, I think we can agree that both startups and big tech products have benefited tremendously from open data and open-source algorithms. If that’s the case, the transition towards closed AI implementations may be a double-edged sword. On the one hand, the shift will likely concentrate know-how in a few hands and make it harder to bootstrap from open implementations and data. On the other hand, some may argue that closed-source AI-as-a-service offerings will be stronger, more stable, and more robust platforms on top of which other businesses can build their products. I won’t fall into the trap of predicting which way this will go, but I want to watch this space closely.
  • Policymaking: This is a very complex space that I’m trying to learn more about. In short, there are implications around the ethical and safe use of AI, the transparency of AI-made decisions, trust in AI systems, privacy, and bias, just to name some key dimensions. So the question here is: who will get to make decisions about each of these? Will it be open communities? Whose values will be reflected in these AI systems? I think these questions are tightly connected to the business point above, since these systems will be powering products that people use.

I don’t have answers, but I think these are important questions to ask. Let me know your thoughts.