Elon tweeted that v9 of the FSD beta would remove its reliance on radar completely and instead make decisions based purely on vision. Humans don’t have radar, after all, so it seems like a logical step, and it tells us Tesla is feeling much more confident in their vision AI.
Radar and vision each have their advantages, but radar has thus far been much more reliable in detecting objects and determining speed. If you’ve ever noticed your Tesla being able to detect two vehicles in front of you when you can only see the one directly ahead of you, that’s radar at work.
In this situation, the radar’s radio waves bounce underneath the car in front of you, keep traveling, and reflect off the vehicle ahead of it - detecting an object the cameras could never “see.”
It really is one of those wow moments where you can feel the future - the potential for AI-powered cars to one day drive better than humans. It’s baby steps for now, but we’ll see more and more situations where the vehicle simply sees or does something we never could.
There’s no doubt that more sensors could provide a more reliable and accurate interpretation of the real world, since each sensor has its own advantages. In an ideal world, a vehicle with radar, lidar, vision, ultrasonic sensors, and even audio processing would provide the best solution. However, more sensors and systems come at a price, resulting in increased vehicle cost and system complexity.
After all, humans are relatively safe drivers with just two “cameras” and vision alone. If Tesla can completely solve vision, they’ll easily be able to achieve superhuman driving capabilities. Teslas have eight cameras facing in all directions, and they’re able to analyze all of them concurrently and make much more accurate interpretations than we ever could in the same amount of time.
Tristan on Twitter recently shared some great insight into Tesla’s vision AI and how it’s going to replace radar. Here’s what Tristan had to say, followed by a few rough code sketches of our own to illustrate the ideas:
"We recently got some insight into how Tesla is going to replace radar in the recent firmware updates + some nifty ML model techniques
From the binaries we can see that they've added velocity and acceleration outputs. These predictions, in addition to the existing xyz outputs, give much of the same information that radar traditionally provides (distance + velocity + acceleration).
For autosteer on city streets, you need to know the velocity and acceleration of cars in all directions but radar is only pointing forward. If it's accurate enough to make a left turn, radar is probably unnecessary for the most part.
How can a neural network figure out velocity and acceleration from static images you ask?
They can't!
They've recently switched to something that appears to be styled on a Recurrent Neural Network.
Net structure is unknown (LSTM?) but they're providing the net with a queue of the 15 most recent hidden states. Seems quite a bit easier to train than normal RNNs which need to learn to encode historical data and can have issues like vanishing gradients for longer time windows.
The velocity and acceleration predictions are new; by giving the last 15 frames (~1s) of data, I'd expect you can train a highly accurate net to predict velocity + acceleration based off of the learned time series.
They've already been using these queue based RNNs with the normal position nets for a few months presumably to improve stability of the predictions.
This matches with the recent public statements from Tesla about new models training on video instead of static images.
To evaluate the performance compared to radar, I bet Tesla has run some feature importance techniques on the models and radar importance has probably dropped quite a bit with the new nets. See tools like https://captum.ai for more info.
I still think that radar is going to stick around for quite a while for highway usage since the current camera performance in rain and snow isn't great.
NoA often disables in mild rain. City streets might behave better since the relative rain speed is lower.
One other nifty trick they've recently added is a task to rectify the images before feeding them into the neural nets.
This is common in classical CV applications, so I'm surprised it only popped up in the last couple of months.
This makes a lot of sense since it means that the nets don't need to learn the lens distortion. It also likely makes it a lot easier for the nets to correlate objects across multiple cameras since the movement is now much more linear.
For more background on LSTMs (Long Short-Term Memory) see https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
They're tricky to train because they need to encode history which is fed into future runs. The more times you pass the state along, the more the earlier frames get diluted - hence the "vanishing gradients" problem."
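To make the queue-of-hidden-states idea Tristan describes a little more concrete, here’s a minimal sketch in PyTorch of what such a setup could look like: a small head that takes a rolling buffer of the last 15 per-frame feature vectors and regresses position, velocity, and acceleration together. Everything here - the QueueTemporalHead name, the feature dimension, the layer sizes - is invented for illustration; Tesla’s actual network structure isn’t public.

```python
import torch
import torch.nn as nn

class QueueTemporalHead(nn.Module):
    """Hypothetical head that predicts xyz, velocity, and acceleration
    from a queue of the last N per-frame feature vectors."""

    def __init__(self, feat_dim=256, queue_len=15):
        super().__init__()
        # Fuse the whole stacked queue at once instead of threading a hidden
        # state through time the way a classic RNN/LSTM would.
        self.fuse = nn.Sequential(
            nn.Linear(feat_dim * queue_len, 512),
            nn.ReLU(),
        )
        self.position = nn.Linear(512, 3)      # xyz, like the existing outputs
        self.velocity = nn.Linear(512, 3)      # new velocity output
        self.acceleration = nn.Linear(512, 3)  # new acceleration output

    def forward(self, feature_queue):
        # feature_queue: (batch, queue_len, feat_dim) -- roughly the last ~1s of frames
        fused = self.fuse(feature_queue.flatten(start_dim=1))
        return self.position(fused), self.velocity(fused), self.acceleration(fused)

head = QueueTemporalHead()
frames = torch.randn(1, 15, 256)  # placeholder for 15 frames of backbone features
xyz, vel, acc = head(frames)
print(xyz.shape, vel.shape, acc.shape)  # each is (1, 3)
```

The appeal of this style is exactly what Tristan points out: the history is handed to the model explicitly as a window of recent frames, so there’s no hidden state that has to survive being passed forward through time.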
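On the feature-importance point, Captum (the library Tristan links) is built for exactly this kind of question. Below is a rough, hypothetical illustration: a toy model that fuses camera and radar features, with Integrated Gradients used to compare how much each input contributes to a prediction. This is a sketch of the technique, not Tesla’s tooling - the model and feature sizes are made up.

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients  # https://captum.ai

class ToyFusionNet(nn.Module):
    """Stand-in model that consumes camera and radar features and predicts velocity."""

    def __init__(self):
        super().__init__()
        self.head = nn.Linear(64 + 8, 3)  # 64 camera features + 8 radar features -> velocity xyz

    def forward(self, camera_feats, radar_feats):
        return self.head(torch.cat([camera_feats, radar_feats], dim=1))

model = ToyFusionNet().eval()
camera = torch.randn(1, 64)
radar = torch.randn(1, 8)

# Attribute the first velocity component back to each input tensor.
ig = IntegratedGradients(model)
cam_attr, radar_attr = ig.attribute((camera, radar), target=0)

# If radar's share keeps shrinking as the vision nets improve, that's a sign
# radar is no longer carrying much of the prediction.
total = cam_attr.abs().sum() + radar_attr.abs().sum()
print("camera share:", (cam_attr.abs().sum() / total).item())
print("radar share: ", (radar_attr.abs().sum() / total).item())
```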
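As for the rectification step, it’s standard classical computer vision. Here’s a small OpenCV example of undistorting a frame before it would ever reach a neural net - the camera matrix and distortion coefficients below are invented placeholders, since the real values come from calibrating each physical camera.

```python
import cv2
import numpy as np

# Hypothetical intrinsics and distortion coefficients for one camera;
# in practice these come from a per-camera calibration.
camera_matrix = np.array([[900.0,   0.0, 640.0],
                          [  0.0, 900.0, 360.0],
                          [  0.0,   0.0,   1.0]])
dist_coeffs = np.array([-0.30, 0.10, 0.0, 0.0, 0.0])  # k1, k2, p1, p2, k3

raw = cv2.imread("frame.jpg")  # placeholder path to a raw camera frame
h, w = raw.shape[:2]

# Keep the useful field of view, then undistort so straight lines stay straight.
new_matrix, _ = cv2.getOptimalNewCameraMatrix(camera_matrix, dist_coeffs, (w, h), 0)
rectified = cv2.undistort(raw, camera_matrix, dist_coeffs, None, new_matrix)

cv2.imwrite("frame_rectified.jpg", rectified)
```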
Tesla’s FSD beta v9 will be a big step forward from what FSD beta users have been using, where the system still relied on radar. And it’ll be an even bigger leap from what non-beta testers currently have access to. We can’t wait. Now where’s that button?
Almost ready with FSD Beta V9.0. Step change improvement is massive, especially for weird corner cases & bad weather. Pure vision, no radar.
With Tesla’s first major expansion of the Robotaxi geofence now complete and operational, they’ve been hard at work validating new locations - and some are quite a drive from the current Austin geofence.
Validation fleet vehicles have been spotted operating in a wider perimeter around the city, from rural roads in the west end to the more complex area closer to the airport. Tesla mentioned during their earnings call that the Robotaxi fleet has already completed 7,000 miles in Austin and that it will expand its area of operation to roughly 10 times its current size. This lines up with the validation vehicles we’ve been tracking around Austin.
Based on the spread of the new sightings, the potential next geofence could cover a staggering 450 square miles - a tenfold increase from the current service area of roughly 42 square miles. You can check this out in our map below with the sightings we’re tracking.
Expanding into these new areas would deliver the tenfold increase Tesla described and cover approximately 10% of the 4,500-square-mile Austin metropolitan area. If Tesla can offer Robotaxi service across that entire area, it would prove they can tackle just about any city in the United States.
From Urban Core to Rural Roads
The locations of the validation vehicles show a clear intent to move beyond the initial urban and suburban core and prepare the Robotaxi service for a much wider range of uses.
In the west, validation fleet vehicles have been spotted as far as Marble Falls - a much more rural environment that features different road types, higher speed limits, and potentially different challenges.
In the south, Tesla has been expanding toward Kyle, which is part of the growing Austin-San Antonio suburban corridor along Interstate 35. San Antonio is only 80 miles away (roughly a 90-minute drive) and could easily become part of the existing Robotaxi area if Tesla obtains regulatory approval there.
In the east, we haven’t spotted any new validation vehicles. This is likely because Tesla’s validation vehicles originate from Giga Texas, which sits east of Austin, so vehicles in that area aren’t a reliable signal of expansion. We won’t really know whether Tesla is expanding in this direction until vehicles start pushing past Giga Texas and toward Houston.
Finally, some validation vehicles have been spotted just north of the newly expanded boundaries, meaning Tesla isn’t done in that direction either. This direction covers some of Austin’s largest suburbs, which have so far not been served by any form of autonomous vehicle.
Rapid Scaling
This new, widespread validation effort confirms what we already know: Tesla runs an intensive period of public data gathering and system testing in a new area right before expanding the geofence. The sheer scale of this new validation zone tells us that Tesla isn’t taking this slowly - the next step is going to be a great leap, and they essentially confirmed as much during the Q&A session on the recent earnings call. The goal is clearly to bring the entire Austin metropolitan area into the Robotaxi Network.
While the previous expansion showed that Tesla can scale the network, this new phase of validation testing demonstrates just how quickly they can validate and expand it. Validating across rural, suburban, and urban areas simultaneously shows their confidence in these new Robotaxi FSD builds.
Eventually, all these improvements from Robotaxi will make their way to customer FSD builds sometime in Q3 2025, so there is a lot to look forward to.
For years, the progress of Tesla’s FSD has been measured by smoother turns, better lane centering, and more confident unprotected left turns. But as the system matures, a new, more subtle form of intelligence is emerging - one that shifts its attention to the human nuances of navigating roads. A new video posted to X shows the most recent FSD build, V13.2.9, demonstrating this in a remarkable real-world scenario.
Toll Booth Magic
In the video, a Model Y running FSD pulls up to a toll booth and smoothly comes to a stop, allowing the driver to handle payment. The car waits patiently as the driver interacts with the attendant. Then, at the precise moment the toll booth operator finishes the transaction and says “Have a great day”, the vehicle starts moving, proceeding through the booth - all without any input from the driver.
Notice that there’s no gate at this toll booth - the entire interaction happened naturally under FSD.
While the timing was perfect, FSD wasn’t listening to the conversation for clues (maybe one day, with Grok?). The reality, as explained by Ashok Elluswamy, Tesla’s VP of AI, is even more impressive.
It can see the transaction happening using the repeater & pillar cameras. Hence FSD proceeds on its own when the transaction is complete 😎
FSD is simply using the cameras on the side of the vehicle to watch the exchange between the driver and attendant. The neural network has been trained on enough data that it can visually recognize the conclusion of a transaction - the exchange of money or a card and the hands pulling away - and understands that this is the trigger to proceed.
The Bigger Picture
This capability is far more significant than just a simple party trick. FSD is gaining the ability to perceive and navigate a world built for humans in the most human-like fashion possible.
If FSD can learn what a completed toll transaction looks like, it can learn to handle countless other complex scenarios. The same visual understanding could be applied to navigating a fast-food drive-thru, interacting with a parking garage attendant, passing through a security checkpoint, or boarding a ferry or vehicle train - all things we thought would come much later.
These human-focused interactions will eventually become even more useful, as FSD becomes ever more confident in responding to humans on the road, like when a police officer tells a vehicle to go a certain direction, or a construction worker flags you through a site. These are real-world events that happen every day, and it isn’t surprising to see FSD picking up on the subtleties and nuances of human interaction.
This isn’t a pre-programmed feature for a specific toll booth. It is an emergent capability of the end-to-end AI neural nets. By learning from millions of videos across billions of miles, FSD is beginning to build a true contextual understanding of the world. The best part - with a 10x context increase on its way, this understanding will grow rapidly and become far more powerful.
These small, subtle moments of intelligence are the necessary steps to a truly robust autonomous system that can handle the messy, unpredictable nature of human society.