AI·2 September 2025·9 min read

Building an AI In-Car Co-Pilot: What We Learned

Voice-activated queries, OBD2 Bluetooth data, and a Tesla-compatible PWA. What the context window problem looks like in a vehicle context.

By Jay

A few months ago we started building something we had not seen done properly: an AI assistant that lives in your car, reads your vehicle's live data, and answers questions hands-free while you drive. Not a glorified Siri shortcut. An actual context-aware co-pilot that knows what your engine is doing while you talk to it.

Here is what we built, what broke, and what we would do differently.

The Concept: Ambient AI for Drivers

The idea started with a simple frustration. Modern cars generate enormous amounts of data, and most drivers never see any of it beyond the basic dashboard. Meanwhile, large language models are now good enough to answer nuanced questions in natural language. The gap between those two things felt like an obvious product.

The goal was not navigation. Google Maps handles that. The goal was a layer of ambient intelligence: "Why is my fuel consumption up this week?", "What does that warning light mean for a 2022 model specifically?", "How far can I realistically go on this charge?" Questions that require context, not just a lookup.

We wanted it to work hands-free, respond quickly, and feel natural rather than robotic. We also wanted it to run on a progressive web app so it worked across device types without an App Store approval process.

The Hardware Layer: OBD2 ELM327 Bluetooth Dongle

Every car sold in Australia after 2006 has an OBD2 port, usually below the steering column. Plug in an ELM327 Bluetooth dongle and you have a data stream: engine RPM, coolant temperature, throttle position, vehicle speed, fuel trim, error codes, and more. The dongle costs between $15 and $80 depending on quality. We tested 4 different models.

The cheap ones work. Mostly. The issue with sub-$20 dongles is connection stability over longer drives. They drop the Bluetooth connection, reconnect, and you lose data continuity. We settled on the Vgate iCar Pro as our reference hardware after it maintained a clean connection across 6 hours of testing.

The dongle speaks the ELM327 AT command protocol. You send commands, it returns raw hex data, you decode it using the OBD2 PID specification. For the web app, we used the Web Bluetooth API to connect directly from the browser without requiring a native app. This works in Chrome on Android. It does not work in Safari at all, which rules iPhones out of the browser-based approach entirely.

Parsing the hex responses into meaningful values is tedious but well-documented. Engine RPM, for example, comes back as two hex bytes that you convert using the formula (A * 256 + B) / 4. The OBD2 PID list covers most of what you need. We pulled around 14 PIDs on a 500ms polling loop.
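As a concrete illustration, here is a minimal sketch of that decoding step. The function names and the small PID table are ours for the example, not the production code; the formulas themselves come straight from the OBD2 PID specification.

```javascript
// Decoders keyed by OBD2 PID (mode 01). Each takes the data bytes (A, B, ...)
// from the ELM327 response and returns an engineering value.
const PID_DECODERS = {
  0x0c: (a, b) => (a * 256 + b) / 4, // engine RPM
  0x0d: (a) => a,                    // vehicle speed, km/h
  0x05: (a) => a - 40,               // coolant temperature, deg C
  0x11: (a) => (a * 100) / 255,      // throttle position, %
};

// An ELM327 reply to the request "01 0C" looks like "41 0C 1A F8":
// "41" marks a mode-01 response, "0C" echoes the PID, the rest is data.
function decodeResponse(raw) {
  const bytes = raw.trim().split(/\s+/).map((h) => parseInt(h, 16));
  if (bytes[0] !== 0x41) throw new Error(`unexpected mode byte in: ${raw}`);
  const pid = bytes[1];
  const decoder = PID_DECODERS[pid];
  if (!decoder) throw new Error(`no decoder for PID 0x${pid.toString(16)}`);
  return { pid, value: decoder(...bytes.slice(2)) };
}

// "41 0C 1A F8" decodes as (0x1A * 256 + 0xF8) / 4 = 1726 RPM.
```

In the real app, a loop like this sits behind the Web Bluetooth characteristic notifications, pushing each decoded reading into the telemetry buffer.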

Building the Tesla-Compatible PWA

Tesla's browser is Chromium-based, which means Web Bluetooth API support is theoretically present. In practice, Tesla's browser environment has restrictions that made some standard browser APIs behave unexpectedly. We had to test every feature assumption directly on the car's touchscreen.

The PWA needed to work in three environments: the driver's Android phone mounted on a holder, the Tesla touchscreen, and a tablet. Each had different screen ratios, different touch targets, and different audio behaviour. We used a single responsive layout with breakpoints that addressed all three.

Voice input was handled through the Web Speech API. On Android this worked well once we worked around the permission timing issues that occur when the page loads. On Tesla's browser, the Speech API was present but microphone access required user gesture initiation, meaning we could not auto-start listening. We solved this with a large tap-to-talk button that auto-restarts after each response.

The audio output used the Web Speech Synthesis API for short confirmations and pre-built responses. For longer, more complex responses generated by the LLM, we used ElevenLabs TTS via API call, which produced noticeably better voice quality. The latency trade-off was acceptable: around 800ms additional delay for the API round trip.
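The routing between the two voice engines reduces to a small decision function. This is a sketch under assumptions: the character threshold and the canned-phrase list are illustrative, and the real logic also looked at response type, not just length.

```javascript
// Short canned confirmations go to the browser's built-in speechSynthesis
// (zero extra latency); longer LLM answers go to ElevenLabs for voice quality
// at the cost of roughly 800ms of API round trip.
const CANNED = new Set(["Okay.", "Done.", "Listening."]); // illustrative
const MAX_LOCAL_CHARS = 80;                               // illustrative

function chooseVoiceEngine(text) {
  if (CANNED.has(text) || text.length <= MAX_LOCAL_CHARS) {
    // speechSynthesis.speak(new SpeechSynthesisUtterance(text))
    return "web-speech";
  }
  // POST the text to the ElevenLabs TTS endpoint and play the returned audio
  return "elevenlabs";
}
```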

The LLM integration was Claude via the Anthropic API. We chose Claude for its longer context window and its tendency to give direct, practical answers rather than hedged non-answers.

The Context Window Problem in a Vehicle

This is where things got genuinely interesting and genuinely difficult. The context window problem in an AI application is usually about document length or conversation history. In a vehicle context, it is about data volume and relevance.

We were feeding the model a real-time snapshot of 14 OBD2 parameters every time a query came in. We also wanted to include recent history: the last 10 minutes of readings so the model could answer questions about trends. That data adds up fast. 14 parameters at 2 readings per second over 10 minutes is 16,800 data points. You cannot send all of that.

The solution was a summarisation layer. Before each query, we ran a lightweight aggregation across the recent telemetry: average speed, average RPM, peak coolant temp, any error codes present, fuel consumption rate. That summary went into the system prompt as a structured block. The driver's question sat in the user message. The model had context without drowning in raw numbers.
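A minimal sketch of that aggregation, assuming readings are stored as plain objects; the field names here are illustrative, not our production schema:

```javascript
// Collapse a window of raw telemetry into the handful of numbers the
// model actually needs. Input: an array of readings shaped like
// { speedKmh, rpm, coolantC, errorCodes }.
function summariseTelemetry(readings) {
  const avg = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
  const codes = new Set(readings.flatMap((r) => r.errorCodes));
  return {
    samples: readings.length,
    avgSpeedKmh: Math.round(avg(readings.map((r) => r.speedKmh))),
    avgRpm: Math.round(avg(readings.map((r) => r.rpm))),
    peakCoolantC: Math.max(...readings.map((r) => r.coolantC)),
    errorCodes: [...codes],
  };
}
```

The returned object is serialised into the system prompt as a structured block, so the model sees a handful of aggregates instead of 16,800 raw data points.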

The harder problem was knowing what the driver actually needed. Someone asking "is everything okay?" when the check engine light is on needs a different answer than the same question during a normal drive. We used a pre-query classification step: scan the live data for anomalies before constructing the prompt, and if anomalies exist, surface them explicitly in the system context. This made responses noticeably more relevant.
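The pre-query scan is simple threshold logic. This sketch uses illustrative thresholds, not the calibrated values we shipped:

```javascript
// Scan the latest snapshot for anything the driver should hear about,
// regardless of how they phrased the question.
function detectAnomalies(snapshot) {
  const anomalies = [];
  if (snapshot.errorCodes.length > 0)
    anomalies.push(`active trouble codes: ${snapshot.errorCodes.join(", ")}`);
  if (snapshot.coolantC > 105)
    anomalies.push(`coolant temperature high (${snapshot.coolantC} deg C)`);
  if (snapshot.fuelTrimPct && Math.abs(snapshot.fuelTrimPct) > 10)
    anomalies.push(`fuel trim outside normal range (${snapshot.fuelTrimPct}%)`);
  return anomalies;
}
```

If this returns anything, the list is surfaced explicitly in the system context, so "is everything okay?" gets answered against the actual fault rather than the happy path.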

What We Would Do Differently

4 things we would change if we rebuilt this from scratch.

First, we would invest more time in hardware qualification upfront. The dongle compatibility issues cost us days of debugging that turned out to be hardware problems, not code problems. A proper hardware test matrix before writing any AI integration code would have saved that time.

Second, we would not use the Web Speech API as the primary voice input. It is inconsistent across environments and handles background noise poorly. A dedicated voice activity detection library paired with a server-side speech-to-text API gives more reliable results. The latency difference is minimal for conversational queries.

Third, we would design the data summarisation layer first, not as an afterthought. We bolted it on after realising the raw data approach was not working. Designing it as a core component from the start would have produced a cleaner architecture.

Fourth, we would pick a lane on the Tesla integration earlier. Supporting the Tesla browser added significant complexity for a small portion of the user base. A dedicated Android app would have been faster to build and more reliable. The PWA approach was the right call for cross-device support, but we spent too long trying to make the Tesla browser behave like a standard Chromium environment.

Where This Fits

We are sharing this because vehicle-context AI is an area where the tooling is ahead of the implementations. The OBD2 data layer is accessible, LLMs are capable, and voice interfaces work well enough for most use cases. What is missing is the product thinking: what questions does a driver actually want to ask, how do you handle the hands-free safety requirement seriously, and how do you make it feel like a natural part of driving rather than a novelty you stop using after a week.

If you are building in this space, or if you want to explore what AI automation looks like in product contexts beyond marketing, talk to us at Adelaide Socials. This is exactly the kind of build we take on.

AI · PWA · OBD2 · product development · Tesla