Every few months, a new wave of AI tools captures the industry’s attention. Chatbots get smarter. Image generators get faster. Video models produce clips that look increasingly polished. The cycle moves quickly, and each new release is treated as a revolution.
Erhan Ciris, founder and CEO of 4D Sight, has watched these cycles with interest but without distraction. He has spent the last several years focused on a problem that does not get solved by faster text generation or better image synthesis. He has been teaching AI to understand physical space in real time.
“There is a lot of energy around AI right now, and much of it is focused on content generation,” Ciris said. “But generating content and understanding the world are two very different things. The harder problem, and the one with deeper long-term value, is spatial intelligence.”
The Difference Between Generating and Understanding
Most of the AI tools dominating headlines today work in two dimensions. They take inputs like text, images, or flat video and produce equally flat outputs. The results can be impressive, but they operate without any real understanding of the physical environments they depict.
Spatial intelligence is a fundamentally different challenge. It requires AI to interpret depth, geometry, motion, and lighting inside a live scene and to do so continuously, in real time, with no margin for error.
That is the problem Erhan Ciris and the 4D Sight team have been solving since 2020. Their platform, which Ciris calls a Perception Layer, processes live broadcast feeds and builds a three-dimensional understanding of the environment as it unfolds. The system then inserts photorealistic virtual content that behaves as though it physically exists within the scene.
“If an AI system cannot reason about a scene in three dimensions, everything it produces will feel artificial,” Ciris said. “That is true whether you are talking about advertising, broadcast production, or any other application that touches live video.”
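The Perception Layer itself is proprietary and patented, but the basic geometric idea behind placing flat content into a live frame can be illustrated with a planar homography: once the system knows where the four corners of an asset should land in the scene, a single 3x3 matrix maps every pixel of that asset into the frame. The sketch below is a generic illustration of that mapping, not 4D Sight's method, and the pixel coordinates are hypothetical.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 homography mapping src -> dst from four point
    pairs using the direct linear transform (DLT)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # The homography is the null vector of A: the last right-singular vector.
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def project(H, pt):
    """Apply a homography to a 2D point, with perspective division."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)

# Corners of a flat ad asset (a hypothetical 100x50 banner image)...
banner = [(0, 0), (100, 0), (100, 50), (0, 50)]
# ...and where those corners should appear in the broadcast frame,
# e.g. on a tilted arena wall (hypothetical pixel coordinates).
frame_quad = [(310, 120), (520, 140), (515, 260), (305, 230)]

H = homography_from_points(banner, frame_quad)
```

In a real pipeline the destination quad would come from the scene understanding step, would be re-estimated every frame as the camera moves, and the warped asset would also need lighting and occlusion handling — which is where the hard problems Ciris describes actually live.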
Why the Hype Cycle Misses the Point
The current AI conversation tends to reward speed and novelty. A model that generates a video clip in seconds attracts more attention than a system that can interpret the spatial structure of a live broadcast with millisecond precision. But Ciris argues that the latter is far more difficult and far more consequential.
“Where most people get it wrong is believing the future is about better 2D overlays or faster rendering,” Ciris said. “The real future is spatial. That is a harder problem, and it takes longer to solve, which is exactly why most companies avoid it.”
4D Sight’s technology was not assembled from general-purpose AI tools or built on top of the latest open-source model. It grew out of years of deep learning research conducted by engineers with backgrounds in autonomous robotics, UAV navigation, and synthetic aperture radar imaging. These are disciplines where understanding the physical world in real time is not optional. It is the entire point.
That foundation is protected by U.S. Patent 11,270,517, which covers the core method for dynamically inserting content into live video streams.
Spatial Intelligence in Practice
For Erhan Ciris, spatial intelligence is not a theoretical concept. 4D Sight applies it every day across Tier 1 sports and esports broadcasts.
The company’s platform has been validated in partnerships with Riot Games and TKO, the parent company of UFC and WWE. In these environments, the AI has to perform under conditions that would break most systems: rapidly shifting camera angles, real-world lighting changes, fast physical movement, and zero tolerance for visual errors.
4D Sight handles all of this from the cloud, with no sensors or hardware required at the venue. The system reads a single live feed, understands its spatial structure, and inserts virtual content that is indistinguishable from the physical environment.
Ciris chose to prove the technology first in esports, where the speed and visual complexity are extreme. The logic was simple: if the AI could survive competitive gaming broadcasts, it could work anywhere. The subsequent expansion into live sports confirmed that thesis.
What Comes After Virtual Advertising
Ciris sees virtual advertising as the first application of spatial intelligence in live media, not the last. Once a system can reason about three-dimensional environments in real time, entirely new categories of interaction become possible.
He describes the next phase as multimodal creative infrastructure. Instead of working from a single live feed, AI systems will draw from multiple reference inputs at once: spatial context, brand assets, historical footage, and live video. The result will be broadcast environments that can be modified, extended, and personalized on the fly.
“We are moving toward a world where AI collaborates with creators to shape live environments in real time,” Ciris said. “That requires spatial understanding at a level most current AI tools are not designed to provide.”
For Erhan Ciris, that is exactly the point. The AI tools generating the most excitement today are solving visible, accessible problems. Spatial intelligence solves a deeper one. It is harder to build, harder to demonstrate in a tweet, and harder to explain in a headline. But Ciris believes it is the layer that everything else will eventually be built on.
“Trends come and go,” Ciris said. “Spatial intelligence is infrastructure. 4D Sight was built on that belief, and everything we have accomplished so far has confirmed it.”