Tesla recently showed off a demo of Optimus, its humanoid robot, walking over moderately challenging terrain: not a flat surface, but dirt and slopes. Terrain like this can be difficult for a humanoid robot, especially during the training cycle.
Most interestingly, Milan Kovac, VP of Engineering for Optimus, clarified what it takes to get Optimus to this stage. Let’s break down what he said.
Optimus is Blind
Optimus is getting seriously good at walking now - it can keep its balance over uneven ground - even while walking blind. Tesla is currently relying only on the robot’s onboard, non-vision sensors, with a neural net running on the embedded computer turning their readings into movement.
Essentially, Tesla is building Optimus from the ground up, relying on as much additional data as possible while it trains vision. This is similar to how they train FSD on vehicles, using LiDAR rigs to validate the vision system’s accuracy. While Optimus doesn’t have LiDAR, it relies on all those other sensors on board, many of which will likely become simplified as vision takes over as the primary sensor.
Today, Optimus is walking blind, but it’s able to react almost instantly to changes in the terrain underneath it, even if it falls or slips.
What’s Next?
Next up, Tesla AI will be adding vision to Optimus - helping complete the neural net. Remember, Optimus runs on the same overall AI stack as FSD - in fact, Optimus uses an FSD computer and an offshoot of the FSD stack for vision-based tasks.
Milan mentions they’re planning on adding vision to help the robot plan ahead and improve its walking gait. While the zombie shuffle is iconic and a little bit amusing, getting humanoid robots to walk like humans is actually difficult.
There’s plenty more, too - including better responsiveness to velocity and direction commands and learning to fall and stand back up. Falling in a way that protects the body and minimizes damage comes naturally to humans - but not to a robot. Training it to do so is essential in keeping the robot, the environment around it, and the people it is interacting with safe.
We’re excited to see what’s coming with Optimus next because it is already getting started in some fashion in Tesla’s factories.
With Tesla’s first major expansion of the Robotaxi Geofence now complete and operational, the company has been hard at work validating new locations - and some are quite the drive from the current Austin Geofence.
Validation fleet vehicles have been spotted operating in a wider perimeter around the city, from rural roads in the west end to the more complex area closer to the airport. Tesla mentioned during their earnings call that the Robotaxi has already completed 7,000 miles in Austin, and that it will expand its area of operation to roughly 10 times what it is now. This lines up with the validation vehicles we’ve been tracking around Austin.
Based on the spread of the new sightings, the potential next geofence could cover a staggering 450 square miles - a roughly tenfold increase over the current service area of about 42 square miles. You can check this out in our map below with the sightings we’re tracking.
If Tesla decides to expand into these new areas, the growth would match the roughly tenfold expansion the company described. The new area would cover approximately 10% of the 4,500-square-mile Austin metropolitan area. If Tesla can offer Robotaxi services in that entire area, it would prove they can tackle just about any city in the United States.
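As a quick sanity check on those figures, here is a minimal sketch that recomputes the two ratios from the numbers quoted above. The constant and function names are ours, chosen purely for illustration, and the square-mile values are the article’s approximations rather than official Tesla figures.

```python
# Figures quoted in the article, in square miles (approximate).
CURRENT_GEOFENCE_SQ_MI = 42
POTENTIAL_GEOFENCE_SQ_MI = 450
AUSTIN_METRO_SQ_MI = 4500


def expansion_factor(new_area: float, old_area: float) -> float:
    """Return how many times larger the new area is than the old one."""
    return new_area / old_area


def coverage_share(area: float, total_area: float) -> float:
    """Return the fraction of the total area that the given area covers."""
    return area / total_area


growth = expansion_factor(POTENTIAL_GEOFENCE_SQ_MI, CURRENT_GEOFENCE_SQ_MI)
share = coverage_share(POTENTIAL_GEOFENCE_SQ_MI, AUSTIN_METRO_SQ_MI)

print(f"Growth over current geofence: {growth:.1f}x")  # 10.7x
print(f"Share of Austin metro area: {share:.0%}")      # 10%
```

The growth works out to slightly more than tenfold, which is consistent with Tesla’s "roughly 10 times" wording.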
From Urban Core to Rural Roads
The locations of the validation vehicles show a clear intent to move beyond the initial urban and suburban core and prepare the Robotaxi service for a much wider range of uses.
In the west, validation fleet vehicles have been spotted as far as Marble Falls - a much more rural environment that features different road types, higher speed limits, and potentially different challenges.
In the south, Tesla has been expanding towards Kyle, which is part of the growing Austin-San Antonio suburban corridor along Interstate 35. San Antonio is only 80 miles (roughly a 90-minute drive) away, and could easily become part of the existing Robotaxi area if Tesla obtains regulatory approval there.
In the east, we haven’t spotted any new validation vehicles. This is likely because Tesla’s validation vehicles originate from Giga Texas, which is located east of Austin. We won’t really know if Tesla is expanding in this direction until they start pushing past Giga Texas and toward Houston.
Finally, some validation vehicles have been spotted just north of the newly expanded boundaries, meaning that Tesla isn’t done in that direction either. This direction includes the largest suburban areas of Austin, which have so far not been serviced by any form of autonomous vehicle.
Rapid Scaling
This new, widespread validation effort confirms what we already know. Tesla runs an intensive period of public data gathering and system testing in a new area right before conducting geofence expansions. The sheer scale of this new validation zone tells us that Tesla isn’t taking this slowly - the next step is going to be a great leap instead, and they essentially confirmed this during the Q&A session on the recent call. The goal is clearly to bring the entire Austin metropolitan area into the Robotaxi Network.
While the previous expansion showed off just how Tesla can scale the network, this new phase of validation testing is a demonstration of just how fast they can validate and expand their network. The move to validate across rural, suburban, and urban areas simultaneously shows their confidence in these new Robotaxi FSD builds.
Eventually, all these improvements from Robotaxi will make their way to customer FSD builds sometime in Q3 2025, so there is a lot to look forward to.
For years, the progress of Tesla’s FSD has been measured by smoother turns, better lane centering, and more confident unprotected left turns. But as the system matures, a new, more subtle form of intelligence is emerging - one that shifts its attention to the human nuances of navigating roads. A new video posted to X shows the most recent FSD build, V13.2.9, demonstrating this in a remarkable real-world scenario.
Toll Booth Magic
In the video, a Model Y running FSD pulls up to a toll booth and smoothly comes to a stop, allowing the driver to handle payment. The car waits patiently as the driver interacts with the attendant. Then, at the precise moment the toll booth operator finishes the transaction and says “Have a great day”, the vehicle starts moving, proceeding through the booth - all without any input from the driver.
Notice that there’s no gate at this toll booth. The entire interaction happened naturally with FSD.
While the timing was perfect, FSD wasn’t listening to the conversation for clues (maybe one day, with Grok?). The reality, as explained by Ashok Elluswamy, Tesla’s VP of AI, is even more impressive.
“It can see the transaction happening using the repeater & pillar cameras. Hence FSD proceeds on its own when the transaction is complete 😎”
FSD is simply using the cameras on the side of the vehicle to watch the exchange between the driver and attendant. The neural network has been trained on enough data that it can visually recognize the conclusion of a transaction - the exchange of money or a card and the hands pulling away - and understands that this is the trigger to proceed.
The Bigger Picture
This capability is far more significant than just a simple party trick. FSD is gaining the ability to perceive and navigate a world built for humans in the most human-like fashion possible.
If FSD can learn what a completed toll transaction looks like, it’s an example of the countless other complex scenarios it’ll be able to handle in the future. This same visual understanding could be applied to navigating a fast-food drive-thru, interacting with a parking garage attendant, passing through a security checkpoint, or boarding a ferry or vehicle train - all things we thought would come much later.
These human-focused interactions will eventually become even more useful, as FSD becomes ever more confident in responding to humans on the road, like when a police officer tells a vehicle to go a certain direction, or a construction worker flags you through a site. These are real-world events that happen every day, and it isn’t surprising to see FSD picking up on the subtleties and nuances of human interaction.
This isn’t a pre-programmed feature for a specific toll booth. It is an emergent capability of the end-to-end AI neural nets. By learning from millions of videos across billions of miles, FSD is beginning to build a true contextual understanding of the world. The best part - with a 10x context increase on its way, this understanding will grow rapidly and become far more powerful.
These small, subtle moments of intelligence are the necessary steps to a truly robust autonomous system that can handle the messy, unpredictable nature of human society.